This is a common scenario, but I wanted to find out which approach is best for performance and follows best practice.
I have a table with four columns: id, name, and two other fields. id is the primary key and name is a unique key. I'm reading data from an Excel file, populating the values of each row into a domain object and then saving it. When saving, I want to check whether a record already exists for the same name; if one exists, I want to update it, otherwise save it as a new record.
I could do it with a normal select query on the name and a null check, then insert or update based on the result, but I have thousands of rows to read from the Excel files, and performance is a stated non-functional requirement.
So please advise me: what is the best way to handle this scenario? I haven't started coding my persistence layer yet, so I can switch to an ORM or plain JDBC according to your suggestion.
Edited:
If I use name as the primary key, then I think I can use saveOrUpdate or merge from an ORM to fulfill my need. Is that a good idea?
Thanks & regards,
Prasath.
I think the fastest way would be to carry out all the inserts/updates in the database itself, rather than connecting to it and issuing a large number of statements.
Note, this is Oracle specific, but other databases may have similar concepts.
I would use the following approach: First save the Excel data as a CSV file on the database server (/mydatadir/mydata.csv), then in Oracle I would be using an external table:
create or replace directory data_dir as '/mydatadir/';

create table external_table (
  id          number(18),
  name        varchar2(30),
  otherfield1 varchar2(40),
  otherfield2 varchar2(40)
)
organization external (
  type oracle_loader
  default directory data_dir
  access parameters (
    fields terminated by ','
  )
  location ('mydata.csv')
);
(Note, the external table wouldn't have to be set up every time)
Then you can use the following command to merge the data into your table:
merge into yourtable t
using external_table e
on (t.name = e.name)
when matched then
  update set t.id = e.id,
             t.otherfield1 = e.otherfield1,
             t.otherfield2 = e.otherfield2
when not matched then
  insert (t.id, t.name, t.otherfield1, t.otherfield2)
  values (e.id, e.name, e.otherfield1, e.otherfield2)
This will upsert the rows in yourtable in one Oracle command, so all the work will be carried out by the database.
EDIT:
This merge command can be issued over plain JDBC (though I prefer using Spring's SimpleJdbcTemplate)
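For what it's worth, a minimal sketch of issuing that MERGE over plain JDBC might look like this (the class and method names are made up for illustration; the caller is assumed to supply an open Connection to the Oracle database):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class ExternalTableMerge {

    // The MERGE from above as a single statement string.
    static final String MERGE_SQL =
        "merge into yourtable t "
      + "using external_table e "
      + "on (t.name = e.name) "
      + "when matched then update set "
      + "  t.id = e.id, t.otherfield1 = e.otherfield1, t.otherfield2 = e.otherfield2 "
      + "when not matched then insert (t.id, t.name, t.otherfield1, t.otherfield2) "
      + "  values (e.id, e.name, e.otherfield1, e.otherfield2)";

    // One round trip: the database does all the upsert work.
    static int runMerge(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            return stmt.executeUpdate(MERGE_SQL); // number of rows merged
        }
    }
}
```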
EDIT2:
In MySQL you can use the following construct to perform the merge:
insert into yourtable (id, name, otherfield1, otherfield2)
values (?, ?, ?, ?),
       (?, ?, ?, ?),
       (?, ?, ?, ?) -- repeat for each row in the Excel sheet...
on duplicate key update
    otherfield1 = values(otherfield1),
    otherfield2 = values(otherfield2)
This can be issued as a plain JDBC statement and will perform better than a separate update plus insert. You can call it in batches of (say) a hundred rows from the spreadsheet, which means one JDBC call for every 100 rows in your Excel sheet and should perform well. That lets you do it without external tables. (You'd need a UNIQUE index on the name column for this to work; I wouldn't make name the primary key, as that could cause problems with foreign keys if you ever needed to change somebody's name.)
MySQL can also bulk-load CSV files directly (LOAD DATA INFILE), which I think would be faster still than inserting the data in batches as above. As long as the CSV file is uploaded to the correct location, the import should run quickly.
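Assembling that multi-row statement for a batch of n rows can be sketched as follows (the helper name is made up; binding the 4*n parameters and calling executeUpdate are left to the caller):

```java
public class MySqlBatchUpsert {

    // Build the multi-row INSERT ... ON DUPLICATE KEY UPDATE text for
    // one batch of `rows` spreadsheet rows (e.g. 100 at a time).
    static String buildUpsertSql(int rows) {
        StringBuilder sql = new StringBuilder(
            "insert into yourtable (id, name, otherfield1, otherfield2) values ");
        for (int i = 0; i < rows; i++) {
            sql.append(i == 0 ? "(?, ?, ?, ?)" : ", (?, ?, ?, ?)");
        }
        sql.append(" on duplicate key update"
                 + " otherfield1 = values(otherfield1),"
                 + " otherfield2 = values(otherfield2)");
        return sql.toString();
    }
}
```

Each Excel row then binds its four values at parameter offsets 4*i+1 through 4*i+4 of the resulting PreparedStatement.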
Maybe it's reasonable to read all the existing names into a Set and compare it with the Set of names read from the Excel file.
Set<String> dbSet = ...;   // fill from a SQL query
Set<String> newSet = ...;  // fill from the Excel file

Set<String> toInsert = new HashSet<>(newSet);
toInsert.removeAll(dbSet);  // names not yet in the DB -> insert

Set<String> toUpdate = new HashSet<>(newSet);
toUpdate.retainAll(dbSet);  // names already in the DB -> update
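As a self-contained illustration of this set approach with sample data (the names are made up):

```java
import java.util.HashSet;
import java.util.Set;

public class NameDiff {

    // Names from the Excel file that are NOT yet in the database -> insert.
    static Set<String> toInsert(Set<String> dbSet, Set<String> newSet) {
        Set<String> result = new HashSet<>(newSet);
        result.removeAll(dbSet);
        return result;
    }

    // Names from the Excel file that ARE already in the database -> update.
    static Set<String> toUpdate(Set<String> dbSet, Set<String> newSet) {
        Set<String> result = new HashSet<>(newSet);
        result.retainAll(dbSet);
        return result;
    }
}
```

For example, with `dbSet = {"alice", "bob"}` and `newSet = {"bob", "carol"}`, "carol" goes to the insert set and "bob" to the update set.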
Let's assume that I have an Oracle database with a table called RUN_LOG I am using to record when jobs have been executed.
The table has a primary key JOB_NAME which uniquely identifies the job that has been executed, and a column called LAST_RUN_TIMESTAMP which reflects when the job was last executed.
When a job starts, I would like to update the existing row for the job (if it exists), or otherwise insert a new row into the table.
Given Oracle does not support a REPLACE INTO-style query, it is necessary to try an UPDATE, and if zero rows are affected follow this up with an INSERT.
This is typically achieved with JDBC using something like the following:
PreparedStatement updateStatement = connection.prepareStatement("UPDATE ...");
PreparedStatement insertStatement = connection.prepareStatement("INSERT ...");
updateStatement.setString(1, "JobName");
updateStatement.setTimestamp(2, timestamp);
// If there are no rows to update, it must be a new job...
if (updateStatement.executeUpdate() == 0) {
// Follow-up
insertStatement.setString(1, "JobName");
insertStatement.setTimestamp(2, timestamp);
insertStatement.executeUpdate();
}
This is a fairly well-trodden path, and I am very comfortable with this approach.
However, let's assume my use case requires me to insert a very large number of these records. Performing individual SQL queries against the database would be far too "chatty"; instead, I would like to batch these INSERT / UPDATE queries.
Given that execution of the UPDATE queries is deferred until the batch is committed, I cannot observe how many rows were affected until later.
What is the best mechanism for achieving this REPLACE INTO-like result?
I'd rather avoid using a stored procedure, as I'd prefer to keep my persistence logic in this one place (class), rather than distributing it between the Java code and the database.
What about the SQL MERGE statement? You can insert a large number of records into a temporary table, then merge the temp table with RUN_LOG. For example:
merge into RUN_LOG tgt using (
  select job_name, last_run_timestamp
  from my_new_temp_table
) src
on (src.job_name = tgt.job_name)
when matched then update set
  tgt.last_run_timestamp = src.last_run_timestamp
when not matched then insert
  values (src.job_name, src.last_run_timestamp);
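The Java side of that (fill the temp table in fixed-size JDBC batches, then issue the MERGE once) can be sketched as below. The class and threshold are made up, and flushes are only counted here so the logic is self-contained; in real code add() would also call addBatch() on the INSERT PreparedStatement and flush() would call executeBatch():

```java
public class TempTableBatcher {

    private final int batchSize;
    private int pending = 0;
    private int flushes = 0;

    TempTableBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    // Queue one row for the temp table; flush when the batch is full.
    void add(String jobName, java.sql.Timestamp ts) {
        // real code: insertStatement.setString(1, jobName); ...; addBatch();
        pending++;
        if (pending == batchSize) {
            flush();
        }
    }

    // Flush any remainder, then (in real code) run the MERGE and commit.
    void finish() {
        if (pending > 0) {
            flush();
        }
    }

    private void flush() {
        // real code: insertStatement.executeBatch();
        flushes++;
        pending = 0;
    }

    int flushCount() {
        return flushes;
    }
}
```

With a batch size of 100, inserting 250 rows costs three executeBatch round trips plus the single MERGE.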
I'm relatively new to working with JDBC and SQL. I have two tables, CustomerDetails and Cakes. I want to create a third table, called Transactions, which uses the 'Names' column from CustomerDetails, 'Description' column from Cakes, as well as two new columns of 'Cost' and 'Price'. I'm aware this is achievable through the use of relational databases, but I'm not exactly sure about how to go about it. One website I saw said this can be done using ResultSet, and another said using the metadata of the column. However, I have no idea how to go about either.
What you're probably looking to do is to create a 'SQL View' (to simplify - a virtual table), see this documentation
CREATE VIEW view_transactions AS
SELECT c.Name, k.Description   -- ... plus your Cost and Price columns
FROM customerdetails c
JOIN cakes k ON ...;           -- join condition depends on your schema
Or something along those lines
That way you can then query the View view_transactions for example as if it was a proper table.
Also, why have you tagged this as mysql when you are using sqlite?
You should create the new table manually, i.e. outside of your program. Use the command-line client sqlite3, for example.
If you need to, you can use the command .schema CustomerDetails in that tool to show the DDL ("metadata" if you want) of the table.
Then you can write your new CREATE TABLE Transactions (...) defining your new columns, plus those from the old tables as they're shown by the .schema command before.
Note that the .schema is only used here to show you the exact column definitions of the existing tables, so you can create matching columns in your new table. If you already know the present column definitions, because you created those tables yourself, you can of course skip that step.
Also note that SELECT Name from CUSTOMERDETAILS will always return the data from that table, but never the structure, i.e. the column definition. That data is useless when trying to derive a column definition from it.
If you really want/have to access the DB's metadata programmatically, the documented way is to query the sqlite_master system table. See also SQLite Schema Information Metadata for example.
You should read up on the concept of data modelling and how relational databases can help you with it, then your transaction table might look just like this:
CREATE TABLE transactions (
id int not null primary key
, customer_id int not null references customerdetails( id )
, cake_id int not null references cakes( id )
, price numeric( 8, 2 ) not null
, quantity int not null
);
This way, you can ensure that for each transaction (which in this case would be just a single line item of an invoice), the cake and the customer exist.
And I agree with #hanno-binder, that it's not the best idea to create all this in plain JDBC.
I would like to insert into a local table (in a local database) all the rows from a distant table. Here's what I'm looking for:
insert into LocalTable (Column1,Column2,...,ColumnN) values (select * from DistantTable);
Does anybody know how I could do this (if there is a way)?
I'm aware that there is a way using a Java program: copy the DistantTable rows to a file, extract the rows with a StringTokenizer, then put them into LocalTable. But it would be really good if I could do this using only SQL queries.
You can create a database link in the local database, pointing at the remote database, and then type:
INSERT INTO LocalTable SELECT * FROM RemoteTable#DBLink;
I have a table named "preference" in Oracle which includes more than 100 columns. I wrote a somewhat complicated SQL query which needs the keywords UNION/INTERSECT/MINUS.
Take a simple example:
select a.* from preference a where a.id = ? union
select a.* from preference a where a.id = ?
The business case has changed: unlimited-length string storage is needed on demand, so one column has to be re-defined as a CLOB. Oracle doesn't allow UNION on CLOB types, so a.* can no longer be used here.
I changed SQL to like below:
select a.a,a.b,a.c... from preference a where a.id = ? union
select a.a,a.b,a.c... from preference a where a.id = ?
It lists all columns except the CLOB, and then I have to do another select to fetch the CLOB value separately. Is that a good idea?
The other issue arising from this case: as I mentioned, this table has many columns, and listing them all makes the SQL much longer. Is there an expression to select all columns except a specific one?
Oracle, when dealing with LOBs, does not allow UNION/MINUS but does allow UNION ALL, so maybe you can rewrite your query using UNION ALL. In the select clause you can then still use select a.* or list every column.
After reading your question my main concern is memory usage in Java. Are you using an ORM to load the data, or the JDBC API directly?
If you load all the CLOBs into Strings you could end up with an OutOfMemoryError. My advice is to load the CLOB only for the rows you need to show to the user (or for the rows where the CLOB field has to be processed).
Can you give more insight into your application (the number of rows it has to process) and your data (especially the CLOB size)?
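When a CLOB does have to be read, pulling it through its character stream in fixed-size chunks keeps memory bounded by the buffer size. A sketch (the class is made up; in real code the Clob would come from ResultSet.getClob(...)):

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;

public class ClobReader {

    // Copy the CLOB's contents chunk by chunk instead of materializing
    // the whole value with clob.getSubString(1, (int) clob.length()).
    static String readChunked(Clob clob, int chunkSize)
            throws SQLException, IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = clob.getCharacterStream()) {
            char[] buf = new char[chunkSize];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n); // in practice: process or stream each chunk
            }
        }
        return sb.toString();
    }
}
```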
I have a project in Java where I want certain data from one table (in SQL Server Management Studio) to be selected and inserted into another table, so that I can access the data from the second table on a JSP page. How do I do this?
One method would be to iterate through the table, writing the values into an array. Once the data has been stored in the array, you can iterate through the array again, this time inserting the values into the new table.
This may not be the most efficient method, I am sure someone else will chime in if so.
Another method, which does not require Java, would be to use a CREATE TABLE ... AS SELECT statement in SQL; see the example.
CREATE TABLE suppliers
AS (SELECT *
FROM companies
WHERE id > 1000);
Or if you already have a table created you can do the following,
INSERT INTO suppliers
(supplier_id, supplier_name)
SELECT account_no, name
FROM customers
WHERE city = 'Newark';
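Either statement can also be issued from the Java side in a single JDBC call, so no row-by-row copying is needed (a sketch; the class is made up and the caller supplies the open Connection):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class CopyRows {

    // The INSERT ... SELECT from above; one round trip copies all
    // matching rows entirely inside the database.
    static final String COPY_SQL =
        "INSERT INTO suppliers (supplier_id, supplier_name) "
      + "SELECT account_no, name FROM customers WHERE city = 'Newark'";

    static int copy(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            return stmt.executeUpdate(COPY_SQL); // number of rows copied
        }
    }
}
```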
If you are using SQL Server, you can use a SELECT INTO statement to achieve this easily:
SELECT Column1,Column2
INTO SecondTable
FROM FirstTable
WHERE Column3='Whatever'
This will copy the data from FirstTable into SecondTable.