I have a database that I don't manage and can't change. The id field is an unsigned smallint (2 bytes), which allows ids up to 65535.
Because of that limit I need to reuse ids. How can I get the next usable id in order, or how would you manage inserts under these constraints?
Check if 1 is free. If not:
SELECT MIN(a.id) + 1 AS smallestAvailableId
FROM your_table AS a
LEFT JOIN your_table AS a2
ON a2.id = a.id + 1
WHERE a2.id IS NULL
From the tags I deduce that you need the id in Java.
I personally would avoid joining the table with itself. Since you have at most 64K rows, I would select id from table into Java and search for id in Java. One way to search for gaps is by sorting the array first (either in SQL or in Java); finding gaps then becomes trivial.
If you do this repeatedly, you can cache the array and avoid having to run an SQL statement every time you need an id.
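The sort-and-scan idea can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class and method names are made up, and it assumes ids start at 1 and that all ids have already been read from the table into an int array.

```java
import java.util.Arrays;

/** Finds the smallest unused id in [1, 65535] from the ids currently in the table. */
final class FreeIdFinder {
    static final int MIN_ID = 1;       // assumption: ids start at 1
    static final int MAX_ID = 65_535;  // upper bound of an unsigned 2-byte column

    /** Returns the smallest free id, or -1 if every id in the range is taken. */
    static int smallestFreeId(int[] usedIds) {
        int[] sorted = usedIds.clone();
        Arrays.sort(sorted);
        int candidate = MIN_ID;
        for (int id : sorted) {
            if (id < candidate) continue; // duplicate or below the range
            if (id > candidate) break;    // gap found before this id
            candidate++;                  // id == candidate, so it is taken
        }
        return candidate <= MAX_ID ? candidate : -1;
    }
}
```

For example, `FreeIdFinder.smallestFreeId(new int[]{1, 2, 4})` returns 3. With at most 64K ids, sorting on every call is cheap, but a cached copy must be refreshed whenever another client may have written to the table.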
Regardless of what you do, if there are multiple clients writing to the database you have to be prepared to deal with race conditions, where multiple clients attempt to use the same id. Your code would need to either use locking or recover gracefully by retrying the failed insert with a different id (I assume there is a uniqueness constraint on the id column).
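A retry loop of that kind might look like the sketch below. The `Inserter` callback and the class name are hypothetical; the real INSERT would go inside the callback. It also assumes the JDBC driver reports a uniqueness violation as `SQLIntegrityConstraintViolationException`, which not every driver does (some only set an SQLState in class 23 on a plain `SQLException`).

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import java.util.function.IntSupplier;

final class RetryingInsert {
    /** Hypothetical callback that performs the real INSERT for a given id. */
    interface Inserter {
        void insert(int id) throws SQLException;
    }

    /** Tries candidate ids until one insert succeeds, and returns the id that was used. */
    static int insertWithRetry(IntSupplier nextCandidate, Inserter inserter, int maxAttempts)
            throws SQLException {
        SQLException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int id = nextCandidate.getAsInt();
            try {
                inserter.insert(id);
                return id;
            } catch (SQLIntegrityConstraintViolationException e) {
                last = e; // another client took this id first, so try another one
            }
        }
        throw new SQLException("no free id after " + maxAttempts + " attempts", last);
    }
}
```

Bounding the number of attempts keeps a full table or a persistent failure from turning into an endless loop.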
Whichever approach you take is likely to cause you problems because of race conditions unless you know you will have exactly one client accessing the db at any single moment.
To answer your question: what do you consider a "usable" id? Please shed some light on that. Until all ids have been used, a simple
SELECT MAX(id) + 1 FROM table;
should do. If you establish a criterion for "usable" ids, for example reusing all ids that have been flagged as old, then you can do:
SELECT MIN(id) FROM table WHERE is_old = 1;
Then just unflag the selected id.
I am trying to have a table with an "order" column to allow rearranging the order of data. Is this possible using JPA? Maybe something similar to @OrderColumn, but on the table itself.
Basically I want to add a new column called "order" that saves the order of the records. If a record is added, it would automatically get an "order" value. If a record is deleted, the "order" of the remaining records would be updated automatically. Additionally, if possible, I want to rearrange the order by moving one record to a lower "order", which would push the others down.
There is no way to do this out of the box, but you can implement this yourself if you want. Just query for the count of objects right before persisting and set the count + 1 as value for that order column. Make sure that the order column is declared as being unique i.e. with a unique constraint.
Note that your requirement is pretty exotic and will likely require some kind of table lock or retry mechanism if you have high concurrency.
IMO you should ask whoever gave you this requirement what the goal is that should be achieved. I bet that you will find out you don't need this after all.
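If you do implement it yourself, the bookkeeping for moves and deletes can be sketched in plain Java as below. This is only an in-memory illustration with made-up names: it computes the "order" value each record should end up with, and leaves writing those values back (and satisfying the unique constraint while doing so) to your persistence code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderColumn {
    /** Moves the element at index from to index to, shifting the elements in between. */
    static <T> void move(List<T> ordered, int from, int to) {
        T item = ordered.remove(from);
        ordered.add(to, item);
    }

    /** Returns the 1-based "order" value each record should have after an add, delete or move. */
    static <T> Map<T, Integer> renumber(List<T> ordered) {
        Map<T, Integer> positions = new LinkedHashMap<>();
        for (int i = 0; i < ordered.size(); i++) {
            positions.put(ordered.get(i), i + 1);
        }
        return positions;
    }
}
```

Moving the last of four records to the second position, for instance, changes the "order" value of three rows, which is why concurrent writers need a lock or a retry.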
I'm dealing with up to a billion records in Oracle and I really need efficiency.
The first table is notification. I need to obtain the following data.
src_data_id | match_data_id
The second table is person_info. id is same as src_data_id and match_data_id from the notification table.
id | name
The third table is sample_info, in which self_object_id is the foreign key for person_info.
id | self_object_id
The fourth table is sample_dna_gene, where sample_id is the same as id in sample_info.
sample_id | gene_info
I am writing a program in Java and I want to encapsulate a list of objects. Each object contains the name (from person_info) and gene_info (from sample_dna_gene).
Originally, I did it in 2 steps. I joined notification and person_info to obtain the ids. Then I joined person_info, sample_info and sample_dna_gene to obtain the names and their corresponding gene_info.
This would be fine for a smaller database, but with up to a billion records I need to worry about speed. I should not join the three tables like I did, but use a simple SQL statement for each table and join the pieces in Java instead.
It was easy to get ids from person_info with separate SQL statements, but I'm having trouble obtaining their corresponding gene_info. I can get sample_info.id with a simple statement using in(id1,id2,id3...). I can then find gene_info with another simple statement using in(id1,id2,id3...).
I can obtain all these lists in Java, but how do I put them together? I'm using Spring and MyBatis. Originally I could write one big messy SQL statement and encapsulate all elements in the mapper. I'm not sure what to do now.
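For illustration, one way to put such per-table result lists together in Java is to index them by key and look each gene row up in turn. This is a minimal sketch with hypothetical names; it assumes the ids are available as strings and that each list has already been fetched with its own simple query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class GeneAssembler {
    /** Result object: one person name paired with one gene_info value. */
    static final class PersonGene {
        final String name;
        final String geneInfo;

        PersonGene(String name, String geneInfo) {
            this.name = name;
            this.geneInfo = geneInfo;
        }
    }

    /**
     * @param nameByPersonId     person_info: id -> name
     * @param personIdBySampleId sample_info: id -> self_object_id
     * @param geneRows           sample_dna_gene rows as {sample_id, gene_info}
     */
    static List<PersonGene> assemble(Map<String, String> nameByPersonId,
                                     Map<String, String> personIdBySampleId,
                                     List<String[]> geneRows) {
        List<PersonGene> result = new ArrayList<>();
        for (String[] row : geneRows) {
            String personId = personIdBySampleId.get(row[0]);
            String name = personId == null ? null : nameByPersonId.get(personId);
            if (name != null) { // skip gene rows with no matching person
                result.add(new PersonGene(name, row[1]));
            }
        }
        return result;
    }
}
```

Whether this is actually faster than letting the database join is something only measurement on the real data can tell.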
edit: The messy sql I have right now is
select to_char(sdg.gene_info), max(aa.pid), max(aa.sid), max(aa.id_card_no)
from (select max(pi.person_name) person_name,
max(pi.id) pid,
si.id sid,
max(pi.id_card_no) id_card_no,
max(pi.race) race
from person_info pi
join sample_info si
on pi.id = si.self_object_id
group by si.id) aa
join sample_dna_gene sdg
on sdg.sample_id = aa.sid
where aa.pid in ('...')
group by to_char(sdg.gene_info)
It's a little more complicated than the original question. I need to group by id in sample_info first, then group by gene_info in sample_dna_gene. I had to use a lot of max() calls so that group by would work, and even then I still could not get the gene_info group by to work properly. I'm not sure how inefficient the max() calls are or how much they slow down the query, but you can see why I wanted to avoid such a messy SQL statement.
I had a similar case. It was dealt with using 4 separate readers, one for each table, and the merging was done on the Java side. Unfortunately, a prerequisite for that was sorting the incoming streams on the database side.
You read a single record from stream one, then you read records from stream 2 until the key changes (since you sorted by that key and the key is common to all tables), then do the same for the following streams. In my case that made sense because the first table was very wide and the next 3 had many rows for a single key in table 1. If in your case there are no 1:n relations (where n is big), I don't see why such an approach would be better than a join.
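The read-until-the-key-changes loop can be sketched for two streams as follows. The names are illustrative, the keys are assumed to be integers, and both inputs are assumed to arrive sorted ascending by key (for example via ORDER BY on the database side).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

final class SortedMergeJoin {
    /**
     * Joins two inputs that are both sorted ascending by key.
     * parents has one entry per key; children may have many entries per key.
     * Returns "parentValue:childValue" strings in key order.
     */
    static List<String> join(Iterator<Map.Entry<Integer, String>> parents,
                             Iterator<Map.Entry<Integer, String>> children) {
        List<String> out = new ArrayList<>();
        Map.Entry<Integer, String> child = children.hasNext() ? children.next() : null;
        while (parents.hasNext()) {
            Map.Entry<Integer, String> parent = parents.next();
            // Skip children whose key is smaller than the current parent key.
            while (child != null && child.getKey() < parent.getKey()) {
                child = children.hasNext() ? children.next() : null;
            }
            // Consume children until the key changes.
            while (child != null && child.getKey().equals(parent.getKey())) {
                out.add(parent.getValue() + ":" + child.getValue());
                child = children.hasNext() ? children.next() : null;
            }
        }
        return out;
    }
}
```

Each stream is read exactly once and only one record per stream is held in memory, which is what makes this attractive when one parent row has many child rows.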
Consider this:
I have a database with 10 rows.
Each row has a unique id (int) paired with some value e.g. name (varchar).
These ids are incremented from 1 to 10.
I delete 2 of the records - 2 and 8.
I add 2 more records 11 and 12.
Questions:
Is there a good way to redistribute unique ids in this database so it would go from 1 to 10 again ?
Would this be considered bad practice ?
I ask this question, because after some use of this database: adding and deleting values the ids would differ significantly.
One way to approach this would be to just generate the row numbers you want at the time you actually query, something like this:
SET @rn = 0;
SELECT
(@rn:=@rn + 1) AS rn, name
FROM yourTable
ORDER BY id;
Generally speaking, you should not be worrying about the auto increment values which MySQL is assigning. MySQL will make sure that the values are unique without your intervention.
If you set the ID column to be primary key and an auto-increment as well, "resetting" is not really necessary because it will keep assigning unique IDs anyways.
If what bothers you are the "gaps" among the existing values, then you might resort to "soft deletion" by adding an is_deleted column with bit/boolean values. The default value would be 0 (or b'0'), of course. In fact, soft-deleting is advisable when the data is important and might be useful later on, especially for payment-related entries that a user could delete either by mistake or deliberately.
There is no simple way to employ the deletion where you simply remove one value and re-arrange the remaining IDs to retain the sequence. A workaround might be to do the following steps:
1) DELETE the entry first, i.e. delete from <table> where ID = _value
2) Copy the remaining rows into a backup table with INSERT INTO ... SELECT, leaving out the id column. The backup table must match the source table's columns and types for this to work, and a temporary table can serve as the backup, i.e. insert into <backup_table> select <column1, column2, ...> from <table>
3) TRUNCATE your table, i.e. truncate table <table>
4) Copy the values from the temp table back into the original table. You can use INSERT INTO ... SELECT once again, but make sure to drop the temp table at the end.
Please note that I would NOT advise you to do this, mainly because most people utilize some sort of caching in their applications and they also utilize the specific ways to evaluate whether a specific object is the same.
For example, in Java the equals() and hashCode() methods of POJOs are often overridden, and programmers generally rely on IDs as a permanent way of identifying a specific object. By using the above method you essentially break that whole concept, and for this reason above all I would not advise you to change an object's autoincrement ID value.
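A small sketch of what goes wrong, using a hypothetical Person class whose identity is its database id. The id field is left mutable only so the example can simulate a renumbering; the HashSet stands in for an application cache.

```java
import java.util.HashSet;
import java.util.Set;

final class Person {
    long id;           // mutable here only to demonstrate the problem
    final String name;

    Person(long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Person && ((Person) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

final class RenumberDemo {
    /** Returns whether the cache still finds the person after its id is rewritten. */
    static boolean foundAfterRenumbering() {
        Set<Person> cache = new HashSet<>();
        Person p = new Person(11, "Ann");
        cache.add(p);             // stored under the hash of id 11
        p.id = 2;                 // simulate "compacting" ids in the database
        return cache.contains(p); // looked up under the hash of id 2
    }
}
```

`RenumberDemo.foundAfterRenumbering()` returns false: the entry is still in the set, but it can no longer be found under its new id.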
Essentially, what you want to do is an anti-pattern: it turns common patterns and practices used by experienced programmers into solutions that are prone to unexpected issues and failures. This applies especially when advanced features are involved, such as using this anti-pattern in an application that relies on a Galera cluster and/or application caching.
I need something which I don't know is possible to achieve. Basically I'll be adding new rows of information from Java to a database with the following table structure, e.g.:
Number | Color | Size
0 | Red | Big
1 | Green | Small
2 | Yellow| Medium
I'm using Java and I'll only input the Color and Size, and I would like to know if it's possible to create a trigger that will store the variable Number(id) on the database side, and increment it each time I do an Insert of a new row into the db. I was thinking of doing something like "INSERT INTO table VALUES (null, 'Red', 'Big')", and then the database would update the value with the proper Number.
It should also cope with the fact that some rows can be deleted, which shouldn't affect anything. Example: if I have IDs 0, 1, 2 and I delete 1, the next row should still be 3.
Another thing, I'm using Sybase SQL Anywhere 12 to do this.
You should use an autoincrement column in your database.
See this.
http://www.sqlines.com/sybase-asa/autoincrement_identity
As @Gordon Linoff said ...
Identity columns do that, for example ...
create table T1 (ID int identity(1,1), Name nvarchar(100))
In this case you would go ...
insert into T1 (Name) values ('John')
So, you would insert the Name 'John' and the DB itself would give it ID 1.
The next time you insert, ID 2 would be assigned ... and so on.
Identity(1,1) means: start from 1 and increment by 1 on each new insert.
The thing about this is that once a number is taken, there is no going back. If you have IDs 1, 2, 3 and delete ID 3, the next insert will get ID 4; it will not fill in the "missing number".
There are several solutions that satisfy your requirements, but they differ in several aspects and you should decide which one is best for you.
Some solutions live in the DB context (for example @Gregory's answer),
but other solutions are independent of the DB type and its specific features. That means you implement your solution independently of your DB type, so you could change your DB (Oracle, SQL Server, MySQL, ...) without needing to change your Java code.
In JPA there are three sequence strategies for solving this problem with @GeneratedValue.
1) Table sequencing: you use a separate table in your DB for this purpose. This table holds the next ids for the other tables that have auto-increment columns with this strategy.
2) Sequence objects: you use a sequence object in your DB and JPA handles it. Sequence objects are only supported in some databases, such as Oracle, DB2, and Postgres.
3) Identity sequencing: uses special IDENTITY columns in the database to allow the database to automatically assign an id to the object when its row is inserted. Identity columns are supported in many databases, such as MySQL, DB2, SQL Server, Sybase, and PostgreSQL. Oracle added IDENTITY columns only in version 12c; in earlier versions they can be simulated using sequence objects and triggers.
If you want to be independent of your DB type, I recommend using the "table strategy" in JPA.
See Java Persistence/Identity and Sequencing for details.
You asked:
I would like to know if it's possible to create a trigger that will
store the variable Number(id) on the database side, and increment it
each time I do an Insert of a new row into the db.
Yes, you could use a trigger, but as I mentioned there are simpler solutions.
It should also cope with the fact that some rows can be
deleted, which shouldn't affect anything
In the JPA solutions, deleted ids are not reused; if you implement your own solution, you could reuse them.
I hope this answer helps.
Sorry if the question title is misleading or not accurate enough, but I didn't see how to ask it in one sentence.
Let's say we have a table where the PK is a String (numbers from '100,000' to '999,999', comma is for readability only).
Let's also say the PK values are not used sequentially.
Now I want to insert a new row into the table using java.sql and show the PK of the inserted row to the user. Since the PK is not generated by default (e.g. inserting values without the PK didn't work, and something like generated_keys is not available in the given environment), I've seen two different approaches:
In two different statements, first find a possible next key, then try to insert (and expect that another transaction may have used the same key in the time between the two statements). Is it valid to retry until success, or could any SQL trick with transaction settings or locks help here? How can I realize that in java.sql?
For me that's a disappointing solution because of the non-deterministic behaviour (perhaps you could convince me of the contrary), so I searched for another one:
Insert with a nested select statement that looks up the next possible PK. Looking up other answers on generating the PK myself, I came close to a working solution with this statement (I left out the casts from string to int):
INSERT INTO mytable (pk,othercolumns)
VALUES(
(SELECT MIN(empty_numbers.empty_number)
FROM (SELECT t1.pk + 1 as empty_number
FROM mytable t1
LEFT OUTER JOIN mytable t2
ON t1.pk + 1 = t2.pk
WHERE t2.pk IS NULL
AND t1.pk > 100000)
as empty_numbers),
othervalues);
That works like a charm and is (afaik) more predictable and stable than my first approach, but how can I retrieve the generated PK from that statement? I've read that there is no way to return the inserted row (or any columns) directly, and most of the Google results I've found point to returning generated keys. Even though my key is generated, it's not generated by the DBMS directly, but by my statement.
Note that the DBMS used in development is MSSQL 2008 and the production system is currently a DB2 on AS/400 (I don't know which version), so I have to stick close to SQL standards. I can't change the DB structure in any way (e.g. to use generated keys; I'm not sure about stored procedures).
DB2 for i allows generated keys, stored procedures, user defined functions - pretty much all of the things SQL Server can do. The exact implementation is different, but that's what manuals are for :-) Ask your admin what version of IBM i they're running, then hit up the Infocenter for specifics.
The constraining factor is that you can't alter the database design; you are stuck with apparently multiple processes trying to INSERT while backfilling 'holes' in the existing keyspace. That's a very tough nut to crack. Because you can't change the DB design, there's nothing to be done except to allow for and handle PK collisions. There's no SQL trick that'll help - the SQL way is to have the DB generate the PK, not the application.
There are several alternatives to suggest, in the event that some change is allowed. All have issues needing a workaround, but that is unavoidable at this point due to the application design.
Create a UDF that all INSERT clients use to retrieve the next available PK. Use a table of 'available numbers' and delete them as they are issued.
Pre-INSERT all the available numbers. Force clients to do an UPDATE. Make them FETCH...FOR UPDATE where (rest of data = not populated). This will lock the row, avoiding collisions as well as make the PK immediately available.
Leave the DB and the other application programs using this table as-is, but have your INSERT process draw from a block of keys that's been set aside for your use. Keep the next available number in an SQL SEQUENCE or an IBM i data area. This only works if there's a very large hole in the keyspace that's not yet used.
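The reserved-block alternative could look roughly like this on the Java side. It is only an in-memory stand-in for the SQL SEQUENCE or data area: the class name and the key range are made up, and it is safe only if a single JVM draws from the block and the range really is unused by every other writer.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ReservedKeyBlock {
    private final int lastKey;
    private final AtomicInteger next;

    /** firstKey..lastKey (inclusive) is a range assumed to be unused by every other writer. */
    ReservedKeyBlock(int firstKey, int lastKey) {
        this.next = new AtomicInteger(firstKey);
        this.lastKey = lastKey;
    }

    /** Returns the next key as a String PK, or throws once the block is exhausted. */
    String nextKey() {
        int key = next.getAndIncrement();
        if (key > lastKey) {
            throw new IllegalStateException("reserved key block exhausted");
        }
        return Integer.toString(key);
    }
}
```

Because the application picks the key before the INSERT, the PK is known immediately and can be shown to the user without asking the database for it.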