JDBC getString(i) significantly slower on server - Java

I have an Oracle 12c database query that pulls a table of 13 columns and more than 114,470 rows on a daily basis.
I was not concerned about this until I moved the same code from my DEV server to my PROD server.
In my DEV environment the query takes 3 min 26 sec to complete.
However, on PROD the exact same code takes 15 min 34 sec to finish.
These times were measured by adding logging around the execution of the following code:
private List<Map<String, String>> getFieldInformation(ResultSet sqlResult) throws SQLException {
    // Map between each column name and the designated data as String
    List<Map<String, String>> rows = new ArrayList<Map<String, String>>();
    // Get the number of returned columns
    ResultSetMetaData rsmd = sqlResult.getMetaData();
    int numberOfColumns = rsmd.getColumnCount();
    boolean continueLoop = sqlResult.next();
    // If there are no results we return an empty list, not null
    if (!continueLoop) {
        return rows;
    }
    while (continueLoop) {
        Map<String, String> columns = new LinkedHashMap<String, String>();
        // Reset variables for data
        String columnLabel = null;
        String dataInformation = null;
        // Append to the map the column name and related data
        for (int i = 1; i <= numberOfColumns; i++) {
            columnLabel = rsmd.getColumnLabel(i);
            dataInformation = sqlResult.getString(i);
            if (columnLabel != null && columnLabel.length() > 0 && (dataInformation == null || dataInformation.length() <= 0)) {
                dataInformation = "";
            }
            columns.put(columnLabel, dataInformation);
        }
        rows.add(columns);
        continueLoop = sqlResult.next();
    }
    return rows;
}
I understand that getString is not the best way to retrieve non-text data, but due to the nature of the project I do not always know the data type in advance.
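For reference, a rough sketch of what avoiding getString for every column could look like here (illustrative only; neither the column-type branching nor the explicit fetch size is part of the code above):
// Hypothetical variant of the inner loop: branch on the reported column type
// instead of calling getString for every column, and hint a larger fetch size.
sqlResult.setFetchSize(500); // driver hint only: fetch more rows per round trip
for (int i = 1; i <= numberOfColumns; i++) {
    String columnLabel = rsmd.getColumnLabel(i);
    String dataInformation;
    switch (rsmd.getColumnType(i)) {
        case java.sql.Types.TIMESTAMP:
            java.sql.Timestamp ts = sqlResult.getTimestamp(i);
            dataInformation = (ts == null) ? "" : ts.toString();
            break;
        case java.sql.Types.NUMERIC:
        case java.sql.Types.DECIMAL:
            java.math.BigDecimal num = sqlResult.getBigDecimal(i);
            dataInformation = (num == null) ? "" : num.toPlainString();
            break;
        default:
            String s = sqlResult.getString(i);
            dataInformation = (s == null) ? "" : s;
    }
    columns.put(columnLabel, dataInformation);
}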
Furthermore, I checked in PROD under Task Manager that "Memory (Private Working Set)" is being reserved very slowly.
So I would appreciate your help with the following questions:
Why is there such a discrepancy in execution time between the two environments? Can you suggest some ways to investigate this?
Is there a way to see how much memory my result set requires and reserve it up front? Would that improve performance?
How can I improve the performance of the getString(i) call?
Thank you in advance for your assistance.
Best regards,
Ziza

Related

Improve performance of loading 100,000 records from database

We created a program to make using the database easier in other programs, so the code I'm showing is used in multiple other programs.
One of those programs gets about 10,000 records from one of our clients and has to check whether these are already in our database. If not, we insert them (they can also change and then have to be updated).
To make this easy, we load every entry from the whole table (at the moment 120,000), create an object for each entry and put all of them into a HashMap.
Loading the whole table this way takes around 5 minutes. We also sometimes have to restart the program because we run into a GC overhead error, since we work on limited hardware. Do you have any ideas how we can improve the performance?
Here is the code that loads all entries (we have a global limit of 10,000 entries per query, so we use a loop):
public Map<String, IMasterDataSet> getAllInformationObjects(ISession session) throws MasterDataException {
    IQueryExpression qe;
    IQueryParameter qp;
    // our main SDP class
    Constructor<?> constructorForSDPbaseClass = getStandardConstructor();
    SimpleDateFormat itaTimestampFormat = new SimpleDateFormat("yyyyMMddHHmmssSSS");
    // search in standard time range (modification date!)
    Calendar cal = Calendar.getInstance();
    cal.set(2010, Calendar.JANUARY, 1);
    Date startDate = cal.getTime();
    Date endDate = new Date();
    Long startDateL = Long.parseLong(itaTimestampFormat.format(startDate));
    Long endDateL = Long.parseLong(itaTimestampFormat.format(endDate));
    IDescriptor modDesc = IBVRIDescriptor.ModificationDate.getDescriptor(session);
    // count once before to determine initial capacities for hash map/set
    IBVRIArchiveClass SDP_ARCHIVECLASS = getMasterDataPropertyBag().getSDP_ARCHIVECLASS();
    qe = SDP_ARCHIVECLASS.getQueryExpression(session);
    qp = session.getDocumentServer().getClassFactory()
            .getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
    qp.setExpression(qe);
    qp.setHitLimitThreshold(0);
    qp.setHitLimit(0);
    int nrOfHitsTotal = session.getDocumentServer().queryCount(session, qp, "*");
    int initialCapacity = (int) (nrOfHitsTotal / 0.75 + 1);
    // MD sets; and objects already done (here: document ID)
    HashSet<String> objDone = new HashSet<>(initialCapacity);
    HashMap<String, IMasterDataSet> objRes = new HashMap<>(initialCapacity);
    qp.close();
    // do queries until hit count is smaller than 10,000
    // use modification date
    boolean keepGoing = true;
    while (keepGoing) {
        // construct query expression
        // - basic part: modification date & class type
        // a. doc. class type
        qe = SDP_ARCHIVECLASS.getQueryExpression(session);
        // b. modification date range
        qe = SearchUtil.appendQueryExpressionWithANDoperator(session, qe,
                new PlainExpression(modDesc.getQueryLiteral() + " BETWEEN " + startDateL + " AND " + endDateL));
        // 2. query parameter: set database; set expression
        qp = session.getDocumentServer().getClassFactory()
                .getQueryParameterInstance(session, new String[] {SDP_ARCHIVECLASS.getDatabaseName(session)}, null, null);
        qp.setExpression(qe);
        // order by modification date; hitlimit = 0 -> no hit limit, but the usual 10,000 max applies
        qp.setOrderByExpression(session.getDocumentServer().getClassFactory().getOrderByExpressionInstance(modDesc, true));
        qp.setHitLimitThreshold(0);
        qp.setHitLimit(0);
        // do not sort by modification date
        qp.setHints("+NoDefaultOrderBy");
        keepGoing = false;
        IInformationObject[] hits = null;
        IDocumentHitList hitList = null;
        hitList = session.getDocumentServer().query(qp, session);
        IDocument doc;
        if (hitList.getTotalHitCount() > 0) {
            hits = hitList.getInformationObjects();
            for (IInformationObject hit : hits) {
                String objID = hit.getID();
                if (!objDone.contains(objID)) {
                    // do something with this object and the class
                    // here: construct a new SDP subclass object and give it back via interface
                    doc = (IDocument) hit;
                    IMasterDataSet mdSet;
                    try {
                        mdSet = (IMasterDataSet) constructorForSDPbaseClass.newInstance(session, doc);
                    } catch (Exception e) {
                        // cause for this
                        String cause = (e.getCause() != null) ? e.getCause().toString() : MasterDataException.ERRMSG_PART_UNKNOWN;
                        throw new MasterDataException(MasterDataException.ERRMSG_NOINSTANCE_POSSIBLE, this.getClass().getSimpleName(), e.toString(), cause);
                    }
                    objRes.put(mdSet.getID(), mdSet);
                    objDone.add(objID);
                }
            }
            doc = (IDocument) hits[hits.length - 1];
            Date lastModDate = ((IDateValue) doc.getDescriptor(modDesc).getValues()[0]).getValue();
            startDateL = Long.parseLong(itaTimestampFormat.format(lastModDate));
            keepGoing = (hits.length >= 10000 || hitList.isResultSetTruncated());
        }
        qp.close();
    }
    return objRes;
}
Loading 120,000 rows (and more) each time will not scale very well, and your solution may not work in the future as the record count grows. Instead, let the database server handle the problem.
Your table needs a primary key or unique key based on the columns of the records. Iterate through the 10,000 incoming records, performing a JDBC SQL UPDATE that modifies all field values with a WHERE clause that exactly matches the primary/unique key.
update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?; // ... AND PKCOL2 =? ...
This modifies an existing row or does nothing at all, and JDBC executeUpdate() will return 0 or 1 indicating the number of rows changed. If the number of rows changed is zero, you have detected a new record that does not exist yet, so perform an INSERT for that new record only.
insert into BLAH (COL1, COL2, ... PKCOL) values (?,?, ..., ?);
You can decide whether to run 10,000 updates followed by however many inserts are needed, or to do an update plus optional insert per record; also remember that JDBC batch statements and turning auto-commit off may help speed things up, as sketched below.
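A minimal sketch of that update-then-insert flow with auto-commit off and batched inserts (conn, incomingRecords and the MyRecord accessors are placeholder names, the columns are assumed to be strings, and the statement text has to match your real schema):
conn.setAutoCommit(false);
String updateSql = "update BLAH set COL1 = ?, COL2 = ? where PKCOL = ?";
String insertSql = "insert into BLAH (COL1, COL2, PKCOL) values (?, ?, ?)";
try (PreparedStatement update = conn.prepareStatement(updateSql);
     PreparedStatement insert = conn.prepareStatement(insertSql)) {
    for (MyRecord r : incomingRecords) {
        update.setString(1, r.getCol1());
        update.setString(2, r.getCol2());
        update.setString(3, r.getPk());
        if (update.executeUpdate() == 0) {   // no existing row matched, so this is a new record
            insert.setString(1, r.getCol1());
            insert.setString(2, r.getCol2());
            insert.setString(3, r.getPk());
            insert.addBatch();               // collect the inserts and send them in one round trip
        }
    }
    insert.executeBatch();
    conn.commit();
} catch (SQLException e) {
    conn.rollback();
    throw e;
}
The updates could be batched too, in which case the per-row 0/1 counts come back from executeBatch() as an int array instead of from executeUpdate().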

Mongo DB Java Driver Cursor Doesn't contain full Collection

I currently have a cursor that goes through a MongoDB collection, takes a couple of different values out of each document and adds them to another collection. However, I have noticed that while the process is running the cursor isn't covering all the documents within the collection (found out by adding a counter).
The beacon lookup collection has 3342 documents, but from the logging I can only see that it has iterated through 1114 of them, and the cursor finishes with no error. Looking at the cursor when debugging, it does contain all 3343 documents.
Below is the method I am trying to run and am currently having issues with:
public void flattenCollection() {
    MongoCollection<Document> beaconLookup = getCollection("BEACON_LOOKUP");
    MongoCollection<Document> triggers = getCollection("DIM_TRIGGER");
    System.out.println(beaconLookup.count());
    // count = 3342
    long count = beaconLookup.count();
    MongoCursor<Document> beaconLookupCursor = beaconLookup.find().batchSize((int) count).noCursorTimeout(true).iterator();
    MongoCursor<Document> triggersCursor = triggers.find().iterator();
    try {
        while (beaconLookupCursor.hasNext()) {
            int major = (Integer) beaconLookupCursor.next().get("MAJOR");
            int minor = (Integer) beaconLookupCursor.next().get("MINOR");
            if (major == 1215) {
                System.out.println("MAJOR " + major + " MINOR " + minor);
            }
            triggers.updateMany(and(eq("MAJOR", major),
                    eq("MINOR", minor)),
                    combine(set("BEACON_UUID", beaconLookupCursor.next().get("UUID"))));
            count = count - 1;
            System.out.println(count);
        }
    } finally {
        beaconLookupCursor.close();
    }
}
Any advice would be great!
You are calling next() more than once per iteration.
Change
int major = (Integer) beaconLookupCursor.next().get("MAJOR");
int minor = (Integer) beaconLookupCursor.next().get("MINOR");
to
Document doc = beaconLookupCursor.next();
int major = (Integer) doc.get("MAJOR");
int minor = (Integer) doc.get("MINOR");
It looks like there is one more next() call for the UUID. Update that to use the doc reference too, as shown in the sketch below.
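Putting the fix together, the loop body would look roughly like this (same collections and field names as in the question):
while (beaconLookupCursor.hasNext()) {
    Document doc = beaconLookupCursor.next();   // advance the cursor exactly once per iteration
    int major = (Integer) doc.get("MAJOR");
    int minor = (Integer) doc.get("MINOR");
    if (major == 1215) {
        System.out.println("MAJOR " + major + " MINOR " + minor);
    }
    triggers.updateMany(and(eq("MAJOR", major), eq("MINOR", minor)),
            combine(set("BEACON_UUID", doc.get("UUID"))));  // reuse doc instead of calling next() again
    count = count - 1;
    System.out.println(count);
}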

Merge data by date

I have the data below, which I fetched from the database using a Hibernate NamedQuery:
TXN_ID END_DATE
---------- ------------
121 15-JUN-16
122 15-JUN-16
123 16-MAY-16
Each row of data can be stored in a Java class object.
Now I want to combine the data depending on END_DATE. If the END_DATE values are the same, then merge the TXN_ID data.
From the above data the output would be:
TXN_ID END_DATE
---------- ------------
121|122 15-JUN-16
123 16-MAY-16
I want to do this in Java. What is an easy way to program that?
Using the accepted printMap function to iterate through the HashMap and check that the output is correct, with the code below:
public static void main(String[] args) {
    String[][] b = {{"1","15-JUN-16"},{"2","16-JUN-16"},{"3","13-JUN-16"},{"4","16-JUN-16"},{"5","17-JUN-16"}};
    Map<String, String> mapb = new HashMap<String, String>();
    for (int j = 0; j < b.length; j++) {
        String c = mapb.get(b[j][1]);
        if (c == null)
            mapb.put(b[j][1], b[j][0]);
        else
            mapb.put(b[j][1], c + " " + b[j][0]);
    }
    printMap(mapb);
}
You get the following output:
13-JUN-16 = 3
16-JUN-16 = 2 4
17-JUN-16 = 5
15-JUN-16 = 1
I think this will solve your problem.
With Hibernate you can put the query result into a list of objects:
Query q = session.createSQLQuery( sql ).addEntity(ObjDataQuery.class);
List<ObjDataQuery> res = q.list();
Now you can create a HashMap to store the final result; to populate it you can iterate over res:
Map<String, String> finalResult = new HashMap<>();
for (int i = 0; i < res.size(); i++) {
    if (finalResult.get(res.get(i).date) == null) {
        // new element
        finalResult.put(res.get(i).date, res.get(i).txn);
    } else {
        // update element
        finalResult.put(res.get(i).date,
                finalResult.get(res.get(i).date) + "|" + res.get(i).txn);
    }
}
I've not tested it, but the logic should be correct.
Another way is to change the query to obtain the final result directly (in Oracle, see LISTAGG).
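For example, something along these lines would push the merging into Oracle (a sketch only; TXN_TABLE is a made-up table name and the columns are assumed to be TXN_ID and END_DATE as in the question):
// Let Oracle do the merging with LISTAGG and a GROUP BY on the end date.
String sql = "select LISTAGG(TXN_ID, '|') WITHIN GROUP (ORDER BY TXN_ID) as TXN_IDS, END_DATE "
        + "from TXN_TABLE group by END_DATE";
List<Object[]> merged = session.createSQLQuery(sql).list();
for (Object[] row : merged) {
    System.out.println(row[0] + " " + row[1]); // e.g. 121|122 15-JUN-16
}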

How to mass delete multiple rows in hbase?

I have the following rows with these keys in the HBase table "mytable":
user_1
user_2
user_3
...
user_9999999
I want to use the HBase shell to delete rows from:
user_500 to user_900
I know there is no built-in way to delete a range of rows, but is there a way I could use the "BulkDeleteProcessor" to do this?
I see here:
https://github.com/apache/hbase/blob/master/hbase-examples/src/test/java/org/apache/hadoop/hbase/coprocessor/example/TestBulkDeleteProtocol.java
I would like to just paste in the imports and then paste this code into the shell, but I have no idea how to go about this. Does anyone know how I can use this endpoint from the JRuby HBase shell?
Table ht = TEST_UTIL.getConnection().getTable("my_table");
long noOfDeletedRows = 0L;
Batch.Call<BulkDeleteService, BulkDeleteResponse> callable =
        new Batch.Call<BulkDeleteService, BulkDeleteResponse>() {
    ServerRpcController controller = new ServerRpcController();
    BlockingRpcCallback<BulkDeleteResponse> rpcCallback =
            new BlockingRpcCallback<BulkDeleteResponse>();

    public BulkDeleteResponse call(BulkDeleteService service) throws IOException {
        Builder builder = BulkDeleteRequest.newBuilder();
        builder.setScan(ProtobufUtil.toScan(scan));
        builder.setDeleteType(deleteType);
        builder.setRowBatchSize(rowBatchSize);
        if (timeStamp != null) {
            builder.setTimestamp(timeStamp);
        }
        service.delete(controller, builder.build(), rpcCallback);
        return rpcCallback.get();
    }
};
Map<byte[], BulkDeleteResponse> result = ht.coprocessorService(BulkDeleteService.class,
        scan.getStartRow(), scan.getStopRow(), callable);
for (BulkDeleteResponse response : result.values()) {
    noOfDeletedRows += response.getRowsDeleted();
}
ht.close();
If there is no way to do this through JRuby, a Java or other alternative way to quickly delete multiple rows is fine.
Do you really want to do it in the shell? There are various better ways. One way is to use the native Java API:
Construct an ArrayList of Delete objects
Pass this list to the Table.delete method
Method 1: if you already know the range of keys.
public void massDelete(byte[] tableName) throws IOException {
    HTable table = (HTable) hbasePool.getTable(tableName);
    String tablePrefix = "user_";
    int startRange = 500;
    int endRange = 999;
    List<Delete> listOfBatchDelete = new ArrayList<Delete>();
    for (int i = startRange; i <= endRange; i++) {
        String key = tablePrefix + i;
        Delete d = new Delete(Bytes.toBytes(key));
        listOfBatchDelete.add(d);
    }
    try {
        table.delete(listOfBatchDelete);
    } finally {
        if (hbasePool != null && table != null) {
            hbasePool.putTable(table);
        }
    }
}
Method 2: If you want to do a batch delete on the basis of a scan result.
public void bulkDelete(final HTable table) throws IOException {
    Scan s = new Scan();
    List<Delete> listOfBatchDelete = new ArrayList<Delete>();
    // add your filters to the scanner, e.g. s.setFilter(yourFilter);
    ResultScanner scanner = table.getScanner(s);
    for (Result rr : scanner) {
        Delete d = new Delete(rr.getRow());
        listOfBatchDelete.add(d);
    }
    try {
        table.delete(listOfBatchDelete);
    } catch (Exception e) {
        LOGGER.log(e);
    }
}
Now, coming to the use of a coprocessor: only one piece of advice, 'DON'T USE a coprocessor' unless you are an expert in HBase.
Coprocessors have many inbuilt issues; if you need, I can provide a detailed description.
Secondly, when you delete anything from HBase it is never deleted directly; a tombstone marker gets attached to the record and it is only removed later during a major compaction. So there is no need to use a coprocessor, which is highly resource intensive.
Modified code to support batched deletes:
int batchSize = 50;
int batchCounter = 0;
for (int i = startRange; i <= endRange; i++) {
    String key = tablePrefix + i;
    Delete d = new Delete(Bytes.toBytes(key));
    listOfBatchDelete.add(d);
    batchCounter++;
    if (batchCounter == batchSize) {
        try {
            table.delete(listOfBatchDelete);
            listOfBatchDelete.clear();
            batchCounter = 0;
        } catch (Exception e) {
            LOGGER.log(e);
        }
    }
}
// flush any remaining deletes that did not fill a complete batch
if (!listOfBatchDelete.isEmpty()) {
    table.delete(listOfBatchDelete);
}
Creating the HBase configuration and getting a table instance:
Configuration hConf = HBaseConfiguration.create(conf);
hConf.set("hbase.zookeeper.quorum", "Zookeeper IP");
hConf.set("hbase.zookeeper.property.clientPort", ZookeeperPort);
HTable hTable = new HTable(hConf, tableName);
If you are already aware of the row keys of the records that you want to delete from the HBase table, then you can use the following approach.
1. First create a List of Delete objects with these row keys:
for (int rowKey = 1; rowKey <= 10; rowKey++) {
    deleteList.add(new Delete(Bytes.toBytes(rowKey + "")));
}
2. Then get the Table object by using an HBase Connection:
Table table = connection.getTable(TableName.valueOf(tableName));
3. Once you have the Table object, call delete() passing the list:
table.delete(deleteList);
The complete code will look like the following:
Configuration config = HBaseConfiguration.create();
config.addResource(new Path("/etc/hbase/conf/hbase-site.xml"));
config.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
String tableName = "users";
Connection connection = ConnectionFactory.createConnection(config);
Table table = connection.getTable(TableName.valueOf(tableName));
List<Delete> deleteList = new ArrayList<Delete>();
for (int rowKey = 500; rowKey <= 900; rowKey++) {
    deleteList.add(new Delete(Bytes.toBytes("user_" + rowKey)));
}
table.delete(deleteList);

Tracing operations in Mongodb Bulk Operation

I am using MongoDB 2.6.1. My question is:
"Is it possible to keep track of _id values in bulk operations?"
Suppose I have created one BulkWriteOperation object, for example to insert 50 documents from collection 'A' into collection 'B'. I need to keep a list of the successful write operations as well as the failed ones.
Bulk inserts and deletes are working fine. But the question is:
-- "I need to keep track of the _ids. For a query, find the documents in A and insert them into collection B. Meanwhile, I need to keep a list of _ids for both successful and failed operations. I then need to delete from collection A only the documents whose operations succeeded, and keep the failed documents as they are" --
Please help me out.
Thank you :) :)
First, you'll need to use an unordered bulk operation so that the entire batch is executed. You will also need a try/catch around your BulkWriteOperation.execute(), catching BulkWriteException, which gives you access to a list of BulkWriteError objects as well as the BulkWriteResult.
Here's a quick and dirty example:
MongoClient m = new MongoClient("localhost");
DB db = m.getDB("test");
DBCollection coll = db.getCollection("bulk");
coll.drop();
coll.createIndex(new BasicDBObject("i", 1), new BasicDBObject("unique", true));

BulkWriteOperation bulkWrite = coll.initializeUnorderedBulkOperation();
for (int i = 0; i < 100; i++) {
    bulkWrite.insert(new BasicDBObject("i", i));
}
// Now add 10 documents to the batch that will generate a unique index error
for (int i = 0; i < 10; i++) {
    bulkWrite.insert(new BasicDBObject("i", i));
}

BulkWriteResult result = null;
List<BulkWriteError> errors = null;
try {
    result = bulkWrite.execute();
} catch (BulkWriteException bwe) {
    bwe.printStackTrace();
    errors = bwe.getWriteErrors();
    result = bwe.getWriteResult();
}
if (errors != null) {   // errors is only populated when an exception was thrown
    for (BulkWriteError e : errors) {
        System.out.println(e.getIndex() + " failed");
    }
}
System.out.println(result);
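To get back to the tracking part of the question: BulkWriteError.getIndex() is the position of the failed operation in the order it was added to the batch, so if you keep your own list of the _id values you queued (called sourceIds here, a made-up name), you can work out which ones succeeded and are therefore safe to delete from the source collection:
// Continues the example above; sourceIds is filled with each _id while the batch is built.
List<Object> sourceIds = new ArrayList<Object>();
Set<Integer> failedIndexes = new HashSet<Integer>();
if (errors != null) {
    for (BulkWriteError e : errors) {
        failedIndexes.add(e.getIndex());
    }
}
List<Object> succeededIds = new ArrayList<Object>();
for (int i = 0; i < sourceIds.size(); i++) {
    if (!failedIndexes.contains(i)) {
        succeededIds.add(sourceIds.get(i)); // these documents made it into B and can be removed from A
    }
}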
