Consider this:
I have a database with 10 rows.
Each row has a unique id (int) paired with some value e.g. name (varchar).
These ids are incremented from 1 to 10.
I delete 2 of the records - 2 and 8.
I add 2 more records 11 and 12.
Questions:
Is there a good way to redistribute unique ids in this database so it would go from 1 to 10 again ?
Would this be considered bad practice ?
I ask this question, because after some use of this database: adding and deleting values the ids would differ significantly.
One way to approach this would be to just generate the row numbers you want at the time you actually query, something like this:
SET #rn = 0;
SELECT
(#rn:=#rn + 1) AS rn, name
FROM yourTable;
ORDER BY id;
Generally speaking, you should not be worrying about the auto increment values which MySQL is assigning. MySQL will make sure that the values are unique without your intervention.
If you set the ID column to be primary key and an auto-increment as well, "resetting" is not really necessary because it will keep assigning unique IDs anyways.
If the thing that bothers you are the "gaps" among the existing values, then you might resort to "sort deletion", by employing the is_deleted column with bit/boolean values. Default value would be 0 (or b0), of course. In fact, soft-deleting is advised if there are some really important data that might be useful later on, especially if it involves possibility for payment-related entries where user can delete one of such entries either by omission or deliberately.
There is no simple way to employ the deletion where you simply remove one value and re-arrange the remaining IDs to retain the sequence. A workaround might be to do the following steps:
DELETE entry first. i.e. delete from <table> where ID = _value
INSERT INTO SELECT (without id column). please note that the table need to be identical in terms of columns and types in order for this query to work properly, so to speak... and you can also utilize temporary as the backup_table. i.e. insert into <backup_table> select <coluum1, column2, ...> from <table>
TRUNCATE your table, i.e. truncate table <table>
copy the values from the temp table back into the existing table. You can utilize the INSERT INTO SELECT once again, but make sure to drop the temp table in the end
Please note that I would NOT advise you to do this, mainly because most people utilize some sort of caching in their applications and they also utilize the specific ways to evaluate whether a specific object is the same.
I.e. in Java, the equals() and hashCode() methods for POJOs are overriden and programmers generally rely on IDs to be permanent way of identifying a specific object. By utilizing the above method, you essentially break the whole concept and I would not advise you to change the object's autoincrement ID value for this reason, before anything else.
Essentially, what you want to do is simply an anti-pattern and will generally make common patterns and practices employed by experienced programmers into solutions that are prone to unexpected issues and/or failures... and this especially applies if/when advanced features are involved, such as employing this such anti-pattern into an application that utilizes galera cluster and/or application caching.
Related
I am trying to have a table with an "order" column to allow rearranging the order of data. Is this possible using jpa? Maybe something similar to #OrderColumn but on the table itself.
Basically I want to add a new column called "order" that saves the order the records. If a record is added, it would automatically get a "order" value. If a record was deleted, the "order" of the remaining would be automatically updated. Additionally if possible, to rearrange the orders by moving one record to an lower "order" and it would push the others
There is no way to do this out of the box, but you can implement this yourself if you want. Just query for the count of objects right before persisting and set the count + 1 as value for that order column. Make sure that the order column is declared as being unique i.e. with a unique constraint.
Note that your requirement is pretty exotic and will likely require some kind of table lock or retry mechanism if you have high concurrency.
IMO you should ask whoever gave you this requirement what the goal is that should be achieved. I bet that you will find out you don't need this after all.
sorry, if the question title is misleading or not accurate enough, but i didn't see how to ask it in one sentence.
Let's say we have a table where the PK is a String (numbers from '100,000' to '999,999', comma is for readability only).
Let's also say, the PK is not sequentially used.
Now i want to insert a new row into the table using java.sql and show the PK of the inserted row to the User. Since the PK is not generated by default (e.g. insert values without the PK didn't work, something like generated_keys is not available in the given environment) i've seen two different approaches:
in two different statements, first find a possible next key, then try to insert (and expect that another transaction used the same key in the time between the two statements) - is it valid to retry until success or could any sql trick with transaction-settings/locks help here? how can i realize that in java.sql?
for me, that's a disappointing solution, because of the non-deterministic behaviour (perhaps you could convince me of the contrary), so i searched for another one:
insert with a nested select statement that looks up the next possible PK. looking up other answers on generating the PK myself I came close to a working solution with that statement (left out the casts from string to int):
INSERT INTO mytable (pk,othercolumns)
VALUES(
(SELECT MIN(empty_numbers.empty_number)
FROM (SELECT t1.pk + 1 as empty_number
FROM mytable t1
LEFT OUTER JOIN mytable t2
ON t1.pk + 1 = t2.pk
WHERE t2.pk IS NULL
AND t1.pk > 100000)
as empty_numbers),
othervalues);
that works like a charm and has (afaik) a more predictable and stable solution than my first approach, but: how can i possibly retrieve the generated PK from that statement? I've read that there is no way to return the inserted row (or any columns) directly and most of the google results i've found, point to returning generated keys - even though my key is generated, it's not generated by the DBMS directly, but by my statement.
Note, that the DBMS used in development is MSSQL 2008 and the productive system is currently a DB2 on AS/400 (don't know which version) so i have to stick close to SQL standards. i can't change the db-structure in any way (e.g. use generated keys, i'm not sure about stored procedures).
DB2 for i allows generated keys, stored procedures, user defined functions - pretty much all of the things SQL Server can do. The exact implementation is different, but that's what manuals are for :-) Ask your admin what version of IBM i they're running, then hit up the Infocenter for specifics.
The constraining factor is that you can't alter the database design; you are stuck with apparently multiple processes trying to INSERT while backfilling 'holes' in the existing keyspace. That's a very tough nut to crack. Because you can't change the DB design, there's nothing to be done except to allow for and handle PK collisions. There's no SQL trick that'll help - the SQL way is to have the DB generate the PK, not the application.
There are several alternatives to suggest, in the event that some change is allowed. All have issues needing a workaround, but that is unavoidable at this point due to the application design.
Create a UDF that all INSERT clients use to retrieve the next available PK. Use a table of 'available numbers' and delete them as they are issued.
Pre-INSERT all the available numbers. Force clients to do an UPDATE. Make them FETCH...FOR UPDATE where (rest of data = not populated). This will lock the row, avoiding collisions as well as make the PK immediately available.
Leave the DB and the other application programs using this table as-is, but have your INSERT process draw from a block of keys that's been set aside for your use. Keep the next available number in an SQL SEQUENCE or an IBM i data area. This only works if there's a very large hole in the keyspace that's not yet used.
I suggest this should be one of common cases but probably I use wrong keywords when googling around.
I just need to create new table record with completely random key. Assume I obtained key with good randomness (almost random). However I can't be 100% sure no row yet exists. So what I need to do atomically:
Having row key check no row exists yet.
Reject operation if row exists.
Create row if it does not exit.
Most useful piece of information I found on this topic is article about HBase row locks.
I see HBase row locks as suitable solution but I'd like to do it better way without explicit row locking.
ICV looks not suitable because I really do want key to be random.
CAS would be great if they could work on 'row does not exists' condition but it looks they can't.
Explicit row locks have disadvantages like issues on region split.
Could somebody please add useful advice?
Preferable API is Java based but actually it is more about concept rather than implementation.
'Good enough' solution for this case happened to be based on checkAndPut() method. What I intended to do is new row insertion with key duplication check and for individual inserts solution is perfect:
HTable checkAndPut() method can check certain column is not set (check it for null value).
As rows anyway contain some 'ID' field which is mandatory for all
objects (you can use any other field that you always set for your
object) it is possible to check if row exists.
Put object passed to checkAndPut() is to contain initial
object state with mandatory field set.
Well, for bulk insertion (what I really needed) it happened to be too slow so I moved to UUID used as row keys without any checks on new row insertion. For me it is much better. The only consideration in this case is really good random generator. Standard Java java.util.UUID class contains everything I need including it is based on somewhat slow but pretty strong java.security.SecureRandom generator.
Just note: it looks like HBase user row locking feature is going to be dropped due to security / other risks related to its usage.
Im programming a program in java and i have a database in a JTable just like the ones below. I wanted to know if it is possible to refresh the primaryID location from 1 on the GUI interface form one when a row is deleted? for example below the LoactionID is deleted for London and added again with an id 4. Is this possible?
Im using SQL in java
To answer your question, yes it is possible.
There is no good reason for you to do this though, and I highly recommend you don't do this.
The only reason to do this would be for cosmetic ones - the database doesn't care if records are sequential, only that they relate to one another consistently. There's no need to "correct" the values for the database's sake.
If you use these Id's for some kind of numbering on the UI (cosmetic reason):
Do not use your identity for this. Separate the visual row number, order or anything else from the internal database key.
If you REALLY want to do this,
Google "reseeding or resetting auto increment primary ID's" for your sql product.
Be aware for some solutions if you reset the identity seed below values that you currently have in the table, that you will violate the indentity column's uniqueness constraint as soon as the values start to overlap
Thanks Andriy for mentioning my blindly pasting a mysql solution :)
Some examples:
ALTER TABLE table_name ALTER COLUMN auto_increment_column_name RESTART WITH 8 Java DB
DBCC CHECKIDENT (mytable, RESEED, 0)
Altering the sequence
I need to generate encoding String for each item I inserted into the database. for example:
x00001 for the first item
x00002 for the sencond item
x00003 for the third item
The way I chose to do this is counting the rows. Before I insert the third item, I count against the database, I know there're already 2 rows, so the next encoding is ended with 3.
But there is a problem. If I delete the second item, the forth item will not be the x00004,but x00003.
I can add additional columns to table, to store the next encoding, I don't know if there's other better solutions ?
Most databases support some sort of auto incrementing identity field. This field is normally also setup to be unique, so duplicate ids do not occur.
Consult your database documentation to see how it is done in your database and use that - don't reinvent the wheel when you have a good mechanism in place already.
What you want is SELECT MAX(id) or SELECT MAX(some_function(id)) inside the transaction.
As suggested in Oded's answer a lot of databases have their own methods of providing sequences which are more efficient and depending on the DBMS might support non numeric ids.
Also you could have id broken down into Y and 00001 as separate columns and having both columns make up primary key; then most databases would be able to provide the sequence.
However this leads to the question if your primary key should have a meaning or not; Y suggest that there is some meaning in the part of the key (otherwise you would be content with a plain integer id).