I have successfully called a Stored Procedure ("Users_info") from my java app, which returns a table that contains two columns: ("user" and "status").
CallableStatement proc_stmt = con.prepareCall("{call Users_info (?)}");
proc_stmt.setInt(1, 1);
ResultSet rs = proc_stmt.executeQuery();
I managed to get all the user names and their respective status ("0" for inactive and "1" for active) into a ArrayList using ResultSet.getString.
ArrayList<String> userName= new ArrayList<String>();
ArrayList<String> userStatus= new ArrayList<String>();
while ( rs.next() ) {
userName.add(rs.getString("user"));
userStatus.add(rs.getString("CHKStatus_Id"));
}
Database shows:
user | status|
------------|--------
Joshua H. | 1 |
Mark J. | 0 |
Jonathan R. | 1 |
Angie I. | 0 |
Doris S. | 0 |
I want to find the best or most efficient way to separate the active users from the inactive users.
I can do this by comparing values using IF STATEMENTS, which, because of my limited knowledge in Java programming I think is not the most efficient way and maybe there is a better way that you guys can help me understand.
Thanks.
Related
I have a table in DynamoDB with my users (Partial key = key, Sort key = no):
key isActive
user1 true
user2 false
... ...
In my code I need to return a next user with status not active (isActive=false). What is the best way to do this, if I need solution is based on that I have
A huge table
Concurrent environment
I wrote code that works, BUT I amn't sure it is a good solution due to Scan and Filter expression:
public String getFreeUser() throws IOException {
Table table = dynamoDB.getTable("usersTableName");
ScanSpec spec = new ScanSpec()
.withFilterExpression("isActive = :is_active")
.withValueMap(new ValueMap().withBoolean(":is_active", false));
ItemCollection<ScanOutcome> items = table.scan(spec);
Iterator<Item> iterator = items.iterator();
Item item = null;
while (iterator.hasNext()) {
item = iterator.next();
try {
UpdateItemSpec updateItemSpec = new UpdateItemSpec()
.withPrimaryKey(new PrimaryKey("key", item.getString("key")))
.withUpdateExpression("set #ian=:is_active_new")
.withConditionExpression("isActive = :is_active_old")
.withNameMap(new NameMap()
.with("#ian", "isActive"))
.withValueMap(new ValueMap()
.withBoolean(":is_active_old", false)
.withBoolean(":is_active_new", true))
.withReturnValues(ReturnValue.ALL_OLD);
UpdateItemOutcome outcome = table.updateItem(updateItemSpec);
return outcome.getItem().getString("key");
} catch (Exception e) {
}
}
throw new IOException("No active users were found");
}
GSI + Query == GOOD
userID (PK) | isActive | otherAttribute | ...
user1 | true | foo | ...
user2 | false | bar | ...
user3 | true | baz | ...
user4 | false | 42 | ...
...
GSI:
userID | isActive (GSI-PK)
user1 | true
user2 | false
user3 | true
user4 | false
Add a GSI with a hash key of isActive. This will allow you to query directly the items where isActive == false.
The benefit vs scan and filter is that reads will be much more efficient. The cost is that your GSI requires it's own storage, so if your table is huge (as per your assumption) then you might want to consider a sparse index.
Sparse Index GSI + Query == BETTER
userID (PK) | isNotActive | otherAttribute | ...
user1 | | foo | ...
user2 | false | bar | ...
user3 | | baz | ...
user4 | false | 42 | ...
...
GSI:
userId | isNotActive (GSI-PK)
user2 | false
user4 | false
Consider replacing the attribute isActive with isNotActive and don't give this attribute to the active users. That is, the inactive users will have true but the active users will not have this attribute at all. You can then create your GSI with this isNotActive attribute. Since it only contains the inactive users, it will be smaller and more efficient to store and query.
Note that when a user becomes active you will need to delete this attribute, and vice versa for active users that become inactive.
Attribute Projections
Regardless of which GSI you decide is best for you, if you know which attribute(s) you will need when querying these inactive users - even if it's just "all of them" - you can project these to your GSI so that you don't need to do the second lookup by key. This will increase the size of your GSI, but may be a tradeoff worth making depending on your table size, the ratio of active to inactive users, and your expected access patterns.
UPDATE
In response to the first comment, to be clear the GSI key (now labelled "GSI-PK") is not the userID. I could put the isActive or active column on the far left in the GSI tables, but that's not how it appears in the AWS console so I've left it in the original order for consistency with the way AWS display it.
Re the second comment on concurrency, you're right I didn't address this. My solution will work in a concurrent environment except for one thing - you can only do eventually consistent reads, not strongly consistent reads. What this means is that a very recent newly inactive user (and by recent I mean a fraction of a second in most circumstances) might not have replicated to the GSI yet. Similarly, a user that has recently changed from inactive to active might not have updated the GSI yet. You'll need to consider whether eventually consistent reads are acceptable for your use case.
Another consideration is that if this is going to be a very large table, if the query results are going to total >1MB you're going to get a paginated result anyway because DynamoDB enforce that limit. Without a global table lock, you're going to get some inconsistency due to updates from other clients between page queries, in which case eventually consistent reads will need to work for you.
I am developing an app for ordering products online. I will first present you a part of the ER Diagram and then explain.
Here the "products" are food items and they will go inside the fresh_products table. As you can see a fresh_product consists of product species, category, type, size and product grade.
Then when I place an order, the order id and order_status will be saved in order table. All the ordered items will be saved in the order_item table. In order_item table you can see I have a direct connection with the fresh_products.
At certain times, the management will decide to change the existing fresh_products. For an example, lets take fish products such as Tuna. This fish item has a category called Loin(fished with Loin), after sometime management decide to remove it or rename it because they no longer fish with Loin.
Now, my design will be affected because of the above change. Why? When you make a change to any of the fresh_product fields, it will directly affect the order_item which holds the information of already ordered items. So if you rename Loin , all order_items which has Loin will now be linked to the new name which is wrong. If you decide to delete a fresh_product you can't do that either, because there are existing order history bound to that fresh_product.
My Suggestion
As a solution for this problem, I am thinking of removing the relationship of fresh_products from order_item. Then, I will add String fields into order_item representing all fields of fresh_products, for an example productType, prodyctCategory and so on.
Now I don't have a direct connection with fresh_products but have all the information needed. At the same time, anyfresh_product item can undertake any change without affecting already purchased items at order_item.
My question is, is my suggestion the best way to solve this issue? If you have better solutions, I am open.
Consider the following; in this example, only the name product changes, but you could easily add columns for each attribute (although at some point this would move from sublime to ridiculous)...
Also, I rarely use correlated subqueries, so apologies if there's an error there...
create table product_history
(product_id int not null
,product_name varchar(12) not null
,date date not null
,primary key (product_id,date)
);
insert into product_history values
(1,'soda','2020-01-01'),
(1,'pop','2020-01-04'),
(1,'cola','2020-01-07');
create table order_detail
(order_id int not null
,product_id int not null
,quantity int not null default 1
,primary key(order_id,product_id)
);
insert into order_detail values
(100,1,4);
create table orders
(order_id serial primary key
,customer_id int not null
,order_date date not null
);
insert into orders values
(100,22,'2020-01-05');
SELECT o.order_id
, o.customer_id
, o.order_date
, ph.product_id
, ph.product_name
, od.quantity
FROM orders o
JOIN order_detail od
ON od.order_id = o.order_id
JOIN product_history ph
ON ph.product_id = od.product_id
WHERE ph.date IN
( SELECT MAX(x.date)
FROM product_history x
WHERE x.product_id = ph.product_id
AND x.date <= o.order_date
);
| order_id | customer_id | order_date | product_id | product_name | quantity |
| -------- | ----------- | ---------- | ---------- | ------------ | -------- |
| 100 | 22 | 2020-01-05 | 1 | pop | 4 |
---
View on DB Fiddle
I need to check from Java if a certain user has at least one group membership. In Oracle (12 by the way) there is a big able that looks like this:
DocId | Group
-----------------
1 | Group-A
1 | Group-E
1 | Group-Z
2 | Group-A
3 | Group-B
3 | Group-W
In Java I have this information:
docId = 1
listOfUsersGroups = { "Group-G", "Group-A", "Group-Of-Something-Totally-Different" }
I have seen solutions like this, but this is not the approach I want to go for. I would like to do something like this (I know this is incorrect syntax) ...
SELECT * FROM PERMSTABLE WHERE DOCID = 1 AND ('Group-G', 'Group-A', 'Group-Of-Something-Totally-Different' ) HASATLEASTONE OF Group
... and not use any temporary SQL INSERTs. The outcome should be that after executing this query I know that my user has a match because he is member of Group-A.
You can do this (using IN condition):
SELECT * FROM PERMSTABLE WHERE DocId = 1 AND Group IN
('Group-G', 'Group-A', 'Group-Of-Something-Totally-Different')
The use case that we are working to solve with Cassandra is this: We need to retrieve a list of entity UUIDs that have been updated within a certain time range within the last 90 days. Imagine that we're building a document tracking system, so our relevant entity is a Document, whose key is a UUID.
The query we need to support in this use case is: Find all Document UUIDs that have changed between StartDateTime and EndDateTime.
Question 1: What's the best Cassandra table design to support this query?
I think the answer is as follows:
CREATE TABLE document_change_events (
event_uuid TIMEUUID,
document_uuid uuid,
PRIMARY KEY ((event_uuid), document_uuid)
) WITH default_time_to_live='7776000';
And given that we can't do range queries on partition keys, we'd need to use the token() method. As such the query would then be:
SELECT document_uuid
WHERE token(event_uuid) > token(minTimeuuid(?))
AND token(event_uuid) < token(maxTimeuuid(?))
For example:
SELECT document_uuid
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00+0000'))
AND token(event_uuid) < token(maxTimeuuid('2015-05-20 00:00+0000'))
Question 2: I can't seem to get the following Java code using DataStax's driver to reliability return the correct results.
If I run the following code 10 times pausing 30 seconds between, I will then have 10 rows in this table:
private void addEvent() {
String cql = "INSERT INTO document_change_events (event_uuid, document_uuid) VALUES(?,?)";
PreparedStatement preparedStatement = cassandraSession.prepare(cql);
BoundStatement boundStatement = new BoundStatement(preparedStatement);
boundStatement.setConsistencyLevel(ConsistencyLevel.ANY);
boundStatement.setUUID("event_uuid", UUIDs.timeBased());
boundStatement.setUUID("document_uuid", UUIDs.random());
cassandraSession.execute(boundStatement);
}
Here are the results:
cqlsh:> select event_uuid, dateOf(event_uuid), document_uuid from document_change_events;
event_uuid | dateOf(event_uuid) | document_uuid
--------------------------------------+--------------------------+--------------------------------------
414decc0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:09-0500 | 92b6fb6a-9ded-47b0-a91c-68c63f45d338
9abb4be0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:39-0500 | 548b320a-10f6-409f-a921-d4a1170a576e
6512b960-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:09-0500 | 970e5e77-1e07-40ea-870a-84637c9fc280
53307a20-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:51:39-0500 | 11b4a49c-b73d-4c8d-9f88-078a6f303167
ac9e0050-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:54:10-0500 | b29e7915-7c17-4900-b784-8ac24e9e72e2
88d7fb30-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:53:09-0500 | c8188b73-1b97-4b32-a897-7facdeecea35
0ba5cf70-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:49:39-0500 | a079b30f-be80-4a99-ae0e-a784d82f0432
76f56dd0-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:52:39-0500 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
1d88f910-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:09-0500 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
2f6b3850-0014-11e5-93a9-51f9a7931084 | 2015-05-21 18:50:39-0500 | db42271b-04f2-45d1-9ae7-0c8f9371a4db
(10 rows)
But if I then run this code:
private static void retrieveEvents(Instant startInstant, Instant endInstant) {
String cql = "SELECT document_uuid FROM document_change_events " +
"WHERE token(event_uuid) > token(?) AND token(event_uuid) < token(?)";
PreparedStatement preparedStatement = cassandraSession.prepare(cql);
BoundStatement boundStatement = new BoundStatement(preparedStatement);
boundStatement.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
boundStatement.bind(UUIDs.startOf(Date.from(startInstant).getTime()),
UUIDs.endOf(Date.from(endInstant).getTime()));
ResultSet resultSet = cassandraSession.execute(boundStatement);
if (resultSet == null) {
System.out.println("None found.");
return;
}
while (!resultSet.isExhausted()) {
System.out.println(resultSet.one().getUUID("document_uuid"));
}
}
It only retrieves three results:
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
Why didn't it retrieve all 10 results? And what do I need to change to achieve the correct results to support this use case?
For reference, I've tested this against dsc-2.1.1, dse-4.6 and using the DataStax Java Driver v2.1.6.
First of all, please only ask one question at a time. Both of your questions here could easily stand on their own. I know these are related, but it just makes the readers come down with a case of tl;dr.
I'll answer your 2nd question first, because the answer ties into a fundamental understanding that is central to getting the data model correct. When I INSERT your rows and run the following query, this is what I get:
aploetz#cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events
WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500'))
AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));
document_uuid
--------------------------------------
a079b30f-be80-4a99-ae0e-a784d82f0432
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
(4 rows)
Which is similar to what you are seeing. Why didn't that return all 10? Well, the answer becomes apparent when I include token(event_uuid) in my SELECT:
aploetz#cqlsh:stackoverflow2> SELECT token(event_uuid),document_uuid FROM document_change_events WHERE token(event_uuid) > token(minTimeuuid('2015-05-10 00:00-0500')) AND token(event_uuid) < token(maxTimeuuid('2015-05-22 00:00-0500'));
token(event_uuid) | document_uuid
----------------------+--------------------------------------
-2112897298583224342 | a079b30f-be80-4a99-ae0e-a784d82f0432
2990331690803078123 | 3b593ca6-220c-4a8b-8c16-27dc1fb5adde
5049638908563824288 | ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
5577339174953240576 | db42271b-04f2-45d1-9ae7-0c8f9371a4db
(4 rows)
Cassandra stores partition keys (event_uuid in your case) in order by their hashed token value. You can see this when using the token function. Cassandra generates partition tokens with a process called consistent hashing to ensure even cluster distribution. In other words, querying by token range doesn't make sense unless the actual (hashed) token values are meaningful to your application.
Getting back to your first question, this means you will have to find a different column to partition on. My suggestion is to use a timeseries mechanism called a "date bucket." Picking the date bucket can be tricky, as it depends on your requirements and query patterns...so that's really up to you to pick a useful one.
For the purposes of this example, I'll pick "month." So I'll re-create your table partitioning on month and clustering by event_uuid:
CREATE TABLE document_change_events2 (
event_uuid TIMEUUID,
document_uuid uuid,
month text,
PRIMARY KEY ((month),event_uuid, document_uuid)
) WITH default_time_to_live='7776000';
Now I can query by a date range, when also filtering by month:
aploetz#cqlsh:stackoverflow2> SELECT document_uuid FROM document_change_events2
WHERE month='201505'
AND event_uuid > minTimeuuid('2015-05-10 00:00-0500')
AND event_uuid < maxTimeuuid('2015-05-22 00:00-0500');
document_uuid
--------------------------------------
a079b30f-be80-4a99-ae0e-a784d82f0432
ec155e0b-39a5-4d2f-98f0-0cd7a5a07ec8
db42271b-04f2-45d1-9ae7-0c8f9371a4db
92b6fb6a-9ded-47b0-a91c-68c63f45d338
11b4a49c-b73d-4c8d-9f88-078a6f303167
970e5e77-1e07-40ea-870a-84637c9fc280
3b593ca6-220c-4a8b-8c16-27dc1fb5adde
c8188b73-1b97-4b32-a897-7facdeecea35
548b320a-10f6-409f-a921-d4a1170a576e
b29e7915-7c17-4900-b784-8ac24e9e72e2
(10 rows)
Again, month may not work for your application. So put some thought behind coming up with an appropriate column to partition on, and then you should be able to solve this.
I'm having a big problem with the embedded firebird database engine and Java SE.
I'm currently developing a filtering tool for users to filter out data.
So I have made two Options for filtering, user can chose one or both:
Filter out from a black list(the black list is controled by user).
Filter out according to a massive list that records every record ever uploaded and filtered out.
The data the user uploads its in plain text comma or token separated like this:
(SET OF COLUMNS)| RECORD TO FILTER |
0-MANY COLUMNS | ABC2 |
0-MANY COLUMNS | ABC5 |
When I Upload it to the DB, I add A FLAG for every filter
(SET OF COLUMNS) | RECORD TO FILTER | FLAG FOR FIlTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC2 | | |
0-MANY COLUMNS | ABC5 | | |
So, when it comes to the second filter, the program has a main empty table on the first run of the software, then it fills that table with all the records from the very first upload.
The main table will have unique records like the following table after a few text uploads made by the user:
Record | Date criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0|
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0|
When the data is processed, for example, the software updates both, the main table and the user table:
(SET OF COLUMNS| RECORD TO FILTER | FLAG FOR FIlTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | |
So the main table will update:
Record | Day criteria for filtering |
ABC1 | 08/11/2012:1,07/11/2012:3,06/11/2012:5|
ABC2 | 05/11/2012:1,04/11/2012:0,03/11/2012:0|
ABC3 | 12/11/2012:3,11/11/2012:0,10/11/2012:0|
ABC4 | 12/11/2012:1,11/11/2012:0,10/11/2012:0| ->12/11/2012:2,11/11/2012:0,10/11/2012:0
ABC5 | 12/11/2012:3,11/11/2012:0,10/11/2012:3|
ABC9 | 11/11/2012:3,10/11/2012:1,09/11/2012:0| ->12/11/2012:1,11/11/2012:3,10/11/2012:1
If in the last three days the data criteria event has reached more than four, the user table will flag filter B. Notice that each date has an integer next to it.
(SET OF COLUMNS)| RECORD TO FILTER | FLAG FOR FIlTER A | FLAG FOR FILTER B |
0-MANY COLUMNS | ABC4 | | |
0-MANY COLUMNS | ABC9 | | X |
Both updates are in a single transaction, the problem is that when the user uploads more than 800,000 records my program throws the following exception in the while loop.
I use StringBuilder parsing and append methods for maximun performance on the mutable days string.
java.lang.OutOfMemoryError: Java heap space
Here it's my code, I use five days:
FactoriaDeDatos factoryInstace = FactoriaDeDatos.getInstance();
Connection cnx = factoryInstace.getConnection();
cnx.setAutoCommit(false);
PreparedStatement pstmt = null;
ResultSet rs=null;
pstmt = cnx.prepareStatement("SELECT CM.MAIL,CM.FECHAS FROM TCOMERCIALMAIL CM INNER JOIN TEMPMAIL TMP ON CM.MAIL=TMP."+colEmail);
rs=pstmt.executeQuery();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_MAIL INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("ALTER INDEX IDX_FECHAS INACTIVE");
pstmtDet.executeUpdate();
pstmtDet = cnx.prepareStatement("SET STATISTICS INDEX IDX_FECHAS");
pstmtDet.executeUpdate();
sql_com_local_tranx=0;
int trxNum=0;
int ix=0;
int ixE1=0;
int ixAc=0;
StringBuilder sb;
StringTokenizer st;
String fechas;
int pos1,pos2,pos3,pos4,pos5,pos6,pos7,pos8,pos9;
StringBuilder s1,s2,sSQL,s4,s5,s6,s7,s8,s9,s10;
long startLoop = System.nanoTime();
long time2 ;
boolean ejecutoMax=false;
//int paginador_sql=1000;
//int trx_ejecutada=0;
sb=new StringBuilder();
s1=new StringBuilder();
s2=new StringBuilder();
sSQL=new StringBuilder();
s4=new StringBuilder();
s6=new StringBuilder();
s8=new StringBuilder();
s10=new StringBuilder();
while(rs.next()){
//De aqui
actConteoDia=0;
sb.setLength(0);
sb.append(rs.getString(2));
pos1= sb.indexOf(":",0);
pos2= sb.indexOf(",",pos1+1);
pos3= sb.indexOf(":",pos2+1);
pos4= sb.indexOf(",",pos3+1);
pos5= sb.indexOf(":",pos4+1);
pos6= sb.indexOf(",",pos5+1);
pos7= sb.indexOf(":",pos6+1);
pos8= sb.indexOf(",",pos7+1);
pos9= sb.indexOf(":",pos8+1);
s1.setLength(0);
s1.append(sb.substring(0, pos1));
s2.setLength(0);
s2.append(sb.substring(pos1+1, pos2));
s4.setLength(0);
s4.append(sb.substring(pos3+1, pos4));
s6.setLength(0);
s6.append(sb.substring(pos5+1, pos6));
s8.setLength(0);
s8.append(sb.substring(pos7+1, pos8));
s10.setLength(0);
s10.append(sb.substring(pos9+1));
actConteoDia=Integer.parseInt(s2.toString());
actConteoDia++;
sb.setLength(0);
//sb.append(s1).a
if(actConteoDia>MAXIMO_LIMITE_POR_SEMANA){
actConteoDia=MAXIMO_LIMITE_POR_SEMANA+1;
}
sb.append(s1).append(":").append(actConteoDia).append(",").append(rs.getString(2).substring(pos2+1, rs.getString(2).length()));
//For every date record it takes aprox 8.3 milisec by record
sSQL.setLength(0);
sSQL.append("UPDATE TCOMERCIALMAIL SET FECHAS='").append(sb.toString()).append("' WHERE MAIL='").append(rs.getString(1)).append("'");
pstmtDet1.addBatch(sSQL.toString());
//actConteoDia=0;
//actConteoDia+=Integer.parseInt(s2.toString());
actConteoDia+=Integer.parseInt(s4.toString());
actConteoDia+=Integer.parseInt(s6.toString());
actConteoDia+=Integer.parseInt(s8.toString());
actConteoDia+=Integer.parseInt(s10.toString());
if(actConteoDia>MAXIMO_LIMITE_POR_SEMANA){
sSQL.setLength(0);
sSQL.append("UPDATE TEMPMAIL SET DIASLIMITE='S' WHERE ").append(colEmail).append("='").append(rs.getString(1)).append("'");
pstmtDet.addBatch(sSQL.toString());
}
sql_com_local_tranx++;
if(sql_com_local_tranx%2000==0 || sql_com_local_tranx%7000==0 ){
brDias.setString("PROCESANDO "+sql_com_local_tranx);
pstmtDet1.executeBatch();
pstmtDet.executeBatch();
}
if(sql_com_local_tranx%100000==0){
System.gc();
System.runFinalization();
}
}
pstmtDet1.executeBatch();
pstmtDet.executeBatch();
cnx.commit();
I've made telemetry tests so I can trace where the problem lies.
The big while it's the problem, I think, but I don't know where the problem may exactly be.
I'm adding some images of the telemtry tests, please I need to interpretate them properly.
The gc becomes inverse to the time the jvm to keep objects alive:
http://imageshack.us/photo/my-images/849/66780403.png
The memory heap goes from 50 MB to 250 MB, the used heap reaches the 250 MB creating the outOfMemory exception:
50 MB
http://imageshack.us/photo/my-images/94/52169259.png
REACHING 250 MB
http://imageshack.us/photo/my-images/706/91313357.png
OUT OF MEMORY
http://imageshack.us/photo/my-images/825/79083069.png
The final stack of objetcts generated ordered by LiveBytes:
http://imageshack.us/photo/my-images/546/95529690.png
Any help, suggestion, answer will be vastly appreciated.
The problem is that you are using PreparedStatement as if it is a Statement, as you are calling addBatch(string). The javadoc of this method says:
Note:This method cannot be called on a PreparedStatement or CallableStatement.
This comment was added with JDBC 4.0, before that it said the method was optional. The fact that Jaybird allows you to call this method on PreparedStatement is therefor a bug: I created issue JDBC-288 in the Jaybird tracker.
Now to the cause of the OutOfMemoryError: When you use addBatch(String) on the PreparedStatement implementation of Jaybird (FBPreparedStatement), it is added to a list internal to the Statement implementation (FBStatement). In case of FBStatement, when you call executeBatch(), it will execute all statements in this list and then clear it. In FBPreparedStatement however executeBatch() is overridden to execute the originally prepared statement with batch parameters (in your example it won't do anything, as you never actually add a PreparedStatement-style batch). It will never execute the statements you added with addBatch(String), but it will also not clear the list of statements in FBStatement and that is most likely the cause of your OutOfMemoryError.
Based on this, the solution should be to create a Statement using cnx.createStatement and use that to execute your queries, or investigate if you could benefit from using one or more PreparedStatement objects with a parameterized query. It looks like you should be able to use two separate PreparedStatements, but I am not 100% sure; the added benefit would be protection against SQL injection and a minor performance improvement.
Addendum
This issue has been fixed since Jaybird 2.2.2
Full disclosure: I am the developer of the Jaybird / Firebird JDBC driver.
do not execute batch statements while iterating through the result set. store the sql you want to execute in a collection and then when you have finished processing the result set
start executing the new sql. does everything have to happen inside the same transaction?