INSERT ALL - For more than 1000 rows - java

I have an INSERT ALL query in my program like:
<insert id="insertRecord" parameterType="java.util.List">
  INSERT ALL
  <foreach collection="myList" item="addrElement" index="index">
    INTO MYTABLE (COLUMN1,COLUMN2,COLUMN3) values (#{addrElement.element1},#{addrElement.element2},#{addrElement.element3})
  </foreach>
  SELECT * FROM dual
</insert>
The list will hold a minimum of 10000 records.
Obviously, this is throwing an exception since INSERT ALL cannot handle more than 1000 records.
; bad SQL grammar []; nested exception is java.sql.SQLSyntaxErrorException: ORA-00913: too many values
I have checked many answers on SO as well as other sites, and the 1000-row limit seems to be documented only for SELECT queries, not for INSERT queries.
Can someone lend me a hand with this? It would be much appreciated.

You need to perform a batch insert.
int batchSize = 100;
try (SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH)) {
    YourMapper mapper = sqlSession.getMapper(YourMapper.class);
    int size = list.size();
    for (int i = 0; i < size;) {
        mapper.insertRecord(list.get(i));
        i++;
        if (i % batchSize == 0 || i == size) {
            sqlSession.flushStatements();
            sqlSession.clearCache();
        }
    }
    sqlSession.commit();
}
You should find an appropriate value for the batchSize (it depends on various factors).
The insert statement is pretty straightforward.
<insert id="insertRecord">
INSERT INTO MYTABLE (COLUMN1, COLUMN2, COLUMN3)
VALUES (#{addrElement.element1}, #{addrElement.element2}, #{addrElement.element3})
</insert>
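For reference, here is a minimal sketch of the mapper interface the batch code above assumes; YourMapper, insertRecord and AddrElement are placeholder names, and @Param is used so the addrElement. prefix in the XML resolves against the single argument:
import org.apache.ibatis.annotations.Param;

public interface YourMapper {
    // Inserts a single record; called repeatedly inside the BATCH session above.
    // AddrElement is a hypothetical POJO with element1/element2/element3 properties.
    int insertRecord(@Param("addrElement") AddrElement addrElement);
}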
We have an FAQ entry.

Related

Will a MySQL INSERT statement ever return something other than 1?

I'm trying to think of some error scenarios, and let's say we have a simple statement like:
INSERT INTO CAT_TABLE (NAME, BREED, AGE) VALUES ("Henry", "Siamese", 2)
The primary key here is the name of the cat. Under most, if not all, circumstances this should return 1. In terms of the JDBC API, is there a time it can return something that isn't 1? I know if there is an issue with the values themselves it would just throw a SQLException, but when would it return 0 or some numerical value other than 1?
UPDATE:
Java code for using this statement would be something like:
Connection connection = getConnection(...);
PreparedStatement ps = connection.prepareStatement(sql);
int result = ps.executeUpdate();
connection.commit();
return result;
I assume you mean the affected-rows count returned for the INSERT statement. An INSERT statement itself has no result set. The rows affected for an INSERT statement would be the number of rows inserted.
You can INSERT from a SELECT result set:
INSERT INTO mytable (col1, col2, col3) SELECT col1, col2, col3 FROM ...
That SELECT could have a result of zero rows, or one row, or many rows.
It might also not be the number of rows of the SELECT result, if you use INSERT IGNORE and some of the rows succeed while other rows are ignored because of errors.
You can also make an INSERT statement that inserts multiple rows without a SELECT statement:
INSERT INTO mytable (col1, col2, col3)
VALUES (1,2,3), (4,5,6), (7,8,9), ...
That could also report more than 1 row affected.
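As a quick illustration, a hedged JDBC sketch (mytable and its columns are placeholder names) showing executeUpdate reporting more than one affected row for a multi-row VALUES insert:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch only: assumes an open MySQL connection and a table mytable(col1, col2, col3).
void multiRowInsert(Connection connection) throws SQLException {
    String sql = "INSERT INTO mytable (col1, col2, col3) VALUES (1,2,3), (4,5,6), (7,8,9)";
    try (PreparedStatement ps = connection.prepareStatement(sql)) {
        int affected = ps.executeUpdate();
        System.out.println("Rows affected: " + affected); // prints 3 here, not 1
    }
}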

How can I use exception handling while sending duplicate values in an insert statement, if I'm sending many values in batches?

I've written a class method that will take "batches" of data (each row that makes a "value" to be inserted, via SQL, to the database comes from a two-dimensional array labeled "data_values").
However, there will be instances when my program will be getting redundant data, i.e. data that might already be in the database. Because there's a primary key in the database, the program will break if it cannot upload the data because of a duplicate entry.
Is there a way to use a try/catch so that the program will continue uploading data, effectively "skipping" the duplicates? If so, how can I implement it?
Thank you in advance. If I could clarify my question, please let me know.
My current code is here:
public void insertData(ArrayList<String> data_types, String[][] data_values) {
    try {
        c.setAutoCommit(false);
        // creates insert statement
        String insertDataScript = "INSERT INTO "+tableName+" VALUES (";
        for (int q = 0; q < data_types.size()-1; q++) {
            insertDataScript += "?, ";
        }
        insertDataScript += "?)";
        PreparedStatement stmt = c.prepareStatement(insertDataScript);
        for (int i = 0; i < data_values.length; i++) {
            for (int j = 1; j < data_types.size()+1; j++) {
                if (data_types.get(j-1).toLowerCase().equals("double")) {
                    stmt.setDouble(j, Double.valueOf(data_values[i][j-1]));
                }
                else if (data_types.get(j-1).toLowerCase().equals("string")) {
                    stmt.setString(j, data_values[i][j-1]);
                }
                else {
                    System.out.println("Error");
                }
            }
            stmt.addBatch();
        }
        stmt.executeBatch();
        c.commit();
        c.setAutoCommit(true);
        stmt.close();
    }
    catch (Exception e) {
        System.err.println(e.getClass().getName() + ": " + e.getMessage());
        System.exit(0);
    }
}
My first suggestion would be to deduplicate the data before inserting it into the db. (Edit: totally missed the "already in the db" part, so this probably won't work unless you want to do a query before every insert. Maybe you can use an INSERT IGNORE?)
If you cannot do this because you do not have control over the primary key or there is no way to ignore duplicates in the insert, then there are ways to catch specific exception types and continue the program instead of calling System.exit. In order to do that you would probably need to have smaller prepared statements and put the try/catch inside the for loop over data_values.
Here is a post talking about catching this type of exception: Catch duplicate key insert exception.
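A minimal sketch of that idea, assuming your existing connection c and one insert per row; it also assumes the driver reports duplicates as SQLIntegrityConstraintViolationException, which many (but not all) JDBC drivers do:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import java.util.Collections;

// Sketch: insert rows one at a time and skip duplicates instead of aborting.
public void insertDataSkippingDuplicates(Connection c, String tableName, String[][] dataValues) throws SQLException {
    if (dataValues.length == 0) {
        return;
    }
    // Assumes every row has the same number of columns as the first one.
    String placeholders = String.join(", ", Collections.nCopies(dataValues[0].length, "?"));
    String sql = "INSERT INTO " + tableName + " VALUES (" + placeholders + ")";
    try (PreparedStatement stmt = c.prepareStatement(sql)) {
        for (String[] row : dataValues) {
            try {
                for (int j = 0; j < row.length; j++) {
                    stmt.setString(j + 1, row[j]);
                }
                stmt.executeUpdate();
            } catch (SQLIntegrityConstraintViolationException dup) {
                // Duplicate key: log it and carry on with the next row.
                // If your driver does not use this subclass, catch SQLException
                // and inspect the vendor error code instead.
                System.err.println("Skipping duplicate row: " + String.join(", ", row));
            }
        }
    }
}
Note this trades the batching speed-up for per-row error handling; the INSERT OR IGNORE approach below keeps the batch.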
INSERT OR IGNORE
Simply change (albeit this is not really exception handling, but rather exception bypassing)
String insertDataScript = "INSERT INTO "+tableName+" VALUES (";
to
String insertDataScript = "INSERT OR IGNORE INTO "+tableName+" VALUES (";
Consider the following demo (first the suggested INSERT OR IGNORE, then the equivalent of what you currently have):-
rowid has been used for convenience as it's basically a built-in primary key.
The only reason the columns have been specified, i.e. (rowid,othercolumn,mydatecolumn), is that rowid is normally hidden. In your case just VALUES (without the preceding column list) will expect values for all columns and thus include the defined primary key column(s).
They are shown/actioned in reverse order (suggested change first) so both can be run together.
:-
INSERT OR IGNORE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'x','x'),
(11,'x','x'),
(12,'x','x'),
(13,'x','x'),
(14,'x','x'),
(10,'x','x')
;
INSERT INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(20,'x','x'),
(21,'x','x'),
(22,'x','x'),
(23,'x','x'),
(24,'x','x'),
(20,'x','x')
;
results in :-
INSERT OR IGNORE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'x','x'),
(11,'x','x'),
(12,'x','x'),
(13,'x','x'),
(14,'x','x'),
(10,'x','x')
> Affected rows: 5
> Time: 0.208s
i.e. 5 of the 6 were added; the 6th, a duplicate (according to the primary key), was skipped.
INSERT INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(20,'x','x'),
(21,'x','x'),
(22,'x','x'),
(23,'x','x'),
(24,'x','x'),
(20,'x','x')
> UNIQUE constraint failed: mytable.rowid
> Time: 0.006s
i.e. none are inserted due to 1 duplicate.
INSERT OR REPLACE (may be useful)
If you wanted the data from the duplicates to be applied then instead of INSERT OR IGNORE, you could use INSERT OR REPLACE.
e.g. the following (run after the above, i.e. all are duplicates but with different data):-
INSERT OR REPLACE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'xx','x'),
(11,'x','xx'),
(12,'aa','x'),
(13,'x','aa'),
(14,'x','bb'),
(10,'cc','x')
;
then you get :-
INSERT OR REPLACE INTO mytable (rowid,othercolumn,mydatecolumn) -- rowid is a PRIMARY KEY as such
VALUES
(10,'xx','x'),
(11,'x','xx'),
(12,'aa','x'),
(13,'x','aa'),
(14,'x','bb'),
(10,'cc','x')
> Affected rows: 6
> Time: 0.543s
i.e. now all 6 INSERTs are actioned (5 rows updated as the 1st and last update the same row twice).

How to resolve ORA-01795 in Java code

I am getting an ORA-01795 error in my Java code when there are more than 1000 values in an IN clause.
I am thinking of breaking it into batches of 1000 entries using multiple IN clauses separated by OR, like below:
select * from table_name
where
column_name in (V1,V2,V3,...V1000)
or
column_name in (V1001,V1002,V1003,...V2000)
I have a string of ids like -18435,16690,1719,1082,1026,100759... which gets generated dynamically based on user selection. How do I write the logic in Java to split this into chunks of records 1-1000, 1001-2000, etc.? Can anyone help me here?
There are three potential ways around this limit:
1) As you have already mentioned: split up the statement into batches of 1000
2) Create a derived table using the values and then join them:
with id_list (id) as (
select 'V1' from dual union all
select 'V2' from dual union all
select 'V3' from dual
)
select *
from the_table
where column_name in (select id from id_list);
Alternatively you could also join to those values; this might even be faster:
with id_list (id) as (
select 'V1' from dual union all
select 'V2' from dual union all
select 'V3' from dual
)
select t.*
from the_table t
join id_list l on t.column_name = l.id;
This still generates a really, really huge statement, but doesn't have the limit of 1000 ids. I'm not sure how fast Oracle will parse this though.
3) Insert the values into a (global) temporary table and then use an IN clause (or a JOIN). This is probably going to be the fastest solution.
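Going back to option 2, here is a hedged Java sketch of how that derived-table statement could be assembled from a list of numeric ids (the_table and column_name are the placeholder names from above; adapt them to your schema):
import java.util.List;
import java.util.stream.Collectors;

// Sketch: build the WITH ... UNION ALL query for the derived-table approach.
// The ids are numeric and embedded as literals; parse/validate them first so
// nothing user-controlled is concatenated into the SQL.
String buildIdListQuery(List<Long> ids) {
    String idRows = ids.stream()
            .map(id -> "select " + id + " from dual")
            .collect(Collectors.joining(" union all "));
    return "with id_list (id) as (" + idRows + ") "
            + "select t.* from the_table t join id_list l on t.column_name = l.id";
}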
With so many values I'd avoid both in and or, and the hard-parse penalty of embedded values in the query, if at all possible. You can pass an SQL collection of values and use the table() collection expression as a table you can join your real table to.
This uses a hard-coded array of integers as an example, but you can populate that array from your user input instead. I'm using the built-in collection type definitions, like sys.odcinumberlist, which is a varray of numbers and is limited to 32k values, but you can define your own table type if you prefer or might need to handle more than that.
int[] ids = { -18435, 16690, 1719, 1082, 1026, 100759 };
ArrayDescriptor aDesc = ArrayDescriptor.createDescriptor("SYS.ODCINUMBERLIST", conn);
oracle.sql.ARRAY ora_ids = new oracle.sql.ARRAY(aDesc, conn, ids);
String sql = "select t.* "
    + "from table(?) a "
    + "left join table_name t "
    + "on t.column_name = a.column_value "
    + "order by id";
pStmt = (OraclePreparedStatement) conn.prepareStatement(sql);
pStmt.setArray(1, ora_ids);
rSet = (OracleResultSet) pStmt.executeQuery();
...
Your array can have as many values as you like (well, as many as the collection type you use and your JVM's memory can handle) and isn't subject to the in list's 1000-member limit.
Essentially table(?) ends up looking like a table containing all your values, and this is going to be easier and faster than populating a real or temporary table with all the values and joining to that.
Of course, don't really use t.*, list the columns you need; I'm assuming you used * to simplify the question...
(Here is a more complete example, but for a slightly different scenario.)
I very recently hit this wall myself:
Oracle has an architectural limit of a maximum of 1000 terms inside an IN()
There are two workarounds:
Refactor the query to become a join
Leave the query as it is, but call it multiple times in a loop, each call using less than 1000 terms
Option 1 depends on the situation. If your list of values comes from a query, you can refactor to a join
Option 2 is also easy, but less performant:
List<String> terms; // your full list of values
for (int i = 0; i * 1000 < terms.size(); i++) {
    List<String> next1000 = terms.subList(i * 1000, Math.min((i + 1) * 1000, terms.size()));
    // build and execute query using next1000 instead of terms
}
In such situations, when I have ids in a List in Java, I use a utility class like this to split the list to partitions and generate the statement from those partitions:
import java.util.ArrayList;
import java.util.List;

public class ListUtils {
    public static <T> List<List<T>> partition(List<T> orig, int size) {
        if (orig == null) {
            throw new NullPointerException("The list to partition must not be null");
        }
        if (size < 1) {
            throw new IllegalArgumentException("The target partition size must be 1 or greater");
        }
        int origSize = orig.size();
        List<List<T>> result = new ArrayList<>(origSize / size + 1);
        for (int i = 0; i < origSize; i += size) {
            result.add(orig.subList(i, Math.min(i + size, origSize)));
        }
        return result;
    }
}
Let's say your ids are in a list called ids; you could get sublists of size at most 1000 with:
ListUtils.partition(ids, 1000)
Then you could iterate over the results to construct the final query string.
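For illustration, a hedged sketch of that last step, assuming the ids are numeric, a JDBC Connection is available, and using the ListUtils.partition helper above (table_name and column_name are placeholders):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Sketch: run one IN (...) query per partition of at most 1000 ids.
void queryInPartitions(Connection conn, List<Long> ids) throws SQLException {
    for (List<Long> chunk : ListUtils.partition(ids, 1000)) {
        String placeholders = String.join(",", Collections.nCopies(chunk.size(), "?"));
        String sql = "select * from table_name where column_name in (" + placeholders + ")";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < chunk.size(); i++) {
                ps.setLong(i + 1, chunk.get(i));
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process each row
                }
            }
        }
    }
}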

preparedstatement with multiple rows where some row values are not set

I have a prepared statement like so
insert into mytable (id, name) values (?,?) , (?,?);
I am using multiple rows per PreparedStatement because I was seeing massive speed gains.
Now if I have an odd number of rows to enter, then preparedStatement.executeBatch() does not enter any rows in the DB. It does not throw any error.
Here is how I insert the values:
int count = 0;
for (int i = 0; i < size; i++) {
    statement.setObject(1, id[i]);
    statement.setObject(2, name[i]);
    // second row
    if (i + 1 != size) {
        statement.setObject(1, id[i + 1]);
        statement.setObject(2, name[i + 1]);
    }
    statement.addBatch();
    if (count % 200 == 0 && count > 0) {
        statement.executeBatch();
    }
}
statement.executeBatch();
What can I do to make it work?
You can do this automatically using the "rewriteBatchedStatements" option in the MySQL driver. You can write a single insert statement and execute it as a batch and the driver will rewrite it for you automatically to execute in as few round-trips as possible. c.f. http://dev.mysql.com/doc/connector-j/en/connector-j-reference-configuration-properties.html
With this solution, you do not have to use the multiple row form of INSERT.
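A hedged sketch of that setup; the property goes on the JDBC URL (host, database, credentials and table names here are placeholders), and the statement keeps a single (?, ?) row:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Sketch: with rewriteBatchedStatements=true the MySQL driver coalesces the batch
// into multi-row INSERTs behind the scenes, so the SQL stays single-row.
void batchInsert(Object[] id, String[] name) throws Exception {
    String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true";
    try (Connection conn = DriverManager.getConnection(url, "user", "password");
         PreparedStatement ps = conn.prepareStatement("insert into mytable (id, name) values (?, ?)")) {
        for (int i = 0; i < id.length; i++) {
            ps.setObject(1, id[i]);
            ps.setString(2, name[i]);
            ps.addBatch();
        }
        ps.executeBatch();
    }
}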

What is the best alternative to BatchStatement execute for retrieving values from the database (MSSQL 2008)

I have a SQL query as shown below.
SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID =?
A collection of ids is passed to this query and it is run as a batch query. This executes 10000 times to retrieve values from the database. (Someone else's mess.)
public static Map getOBLDefinitionsAsMap(Collection oblIDs)
    throws java.sql.SQLException
{
    Map retVal = new HashMap();
    if (oblIDs != null && (!oblIDs.isEmpty()))
    {
        BatchStatementObject stmt = new BatchStatementObject();
        stmt.setSql("SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID=?");
        stmt.setParameters(
            PWMUtils.convertCollectionToSubLists(taskIDs, 1));
        stmt.setResultsAsArray(true);
        QueryResults rows = stmt.executeBatchSelect();
        int rowSize = rows.size();
        for (int i = 0; i < rowSize; i++)
        {
            QueryResults.Row aRow = (QueryResults.Row) rows.getRow(i);
            CoblDefinition ctd = new CoblDefinition(aRow);
            retVal.put(aRow.getLong(0), ctd);
        }
    }
    return retVal;
}
Now we have identified that the query can be modified to
SELECT O_DEF,O_DATE,O_MOD from OBL_DEFINITVE WHERE OBL_DEFINITVE_ID in (???)
so that we can reduce it to 1 query.
The problem here is that MSSQL Server throws an exception:
Prepared or callable statement has more than 2000 parameter
and we are stuck here. Can someone provide a better alternative to this?
There is a maximum number of allowed parameters, let's call it n. You can do one of the following:
If you have m*n + k parameters, you can create m batches (or m+1 batches, if k is not 0). If you have 10000 parameters and 2000 is the maximum allowed parameters, you will only need 5 batches.
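A hedged sketch of this first approach using plain JDBC; the table and column names follow the question, the 2000 limit comes from the error message, and the Object[] value simply stands in for whatever CoblDefinition needs:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: chunk the ids so each query stays below the parameter limit,
// run one IN (...) query per chunk, and merge the results into one map.
static Map<Long, Object[]> loadDefinitions(Connection conn, List<Long> oblIDs) throws SQLException {
    final int maxParams = 2000;
    Map<Long, Object[]> result = new HashMap<>();
    for (int from = 0; from < oblIDs.size(); from += maxParams) {
        List<Long> chunk = oblIDs.subList(from, Math.min(from + maxParams, oblIDs.size()));
        String placeholders = String.join(",", Collections.nCopies(chunk.size(), "?"));
        String sql = "SELECT OBL_DEFINITVE_ID, O_DEF, O_DATE, O_MOD FROM OBL_DEFINITVE"
                + " WHERE OBL_DEFINITVE_ID IN (" + placeholders + ")";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < chunk.size(); i++) {
                ps.setLong(i + 1, chunk.get(i));
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.put(rs.getLong("OBL_DEFINITVE_ID"),
                            new Object[] { rs.getObject("O_DEF"), rs.getObject("O_DATE"), rs.getObject("O_MOD") });
                }
            }
        }
    }
    return result;
}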
Another solution is to generate the query string in your application and add your parameters as strings. This way you will run your query only once. This is an obvious optimization in speed, but you'll have a query string generated in your application. You would set your where clause like this:
String myWhereClause = "where TaskID = " + taskIDs[0];
for (int i = 1; i < numberOfTaskIDs; i++)
{
myWhereClause += " or TaskID = " + taskIDs[i];
}
It looks like you are using your own wrapper around PreparedStatement and addBatch(). You are clearly reaching a limit of how many statements/parameters can be batched at once. You will need to call executeBatch periodically (e.g. every 100 or 1000 statements), instead of letting the batch build up until the limit is reached.
Edit: Based on the comment below I reread the problem. The solution: make sure you use fewer than 2000 parameters when building the query, breaking it up into two or more queries as required.

Categories

Resources