I am using the code below to get all the job details and display them in my UI, but it is taking too long (around 35 seconds) to fetch the job details (around 150 Quartz jobs) from Quartz using the JDBC job store (Oracle DB).
Is there any alternative way to speed up this task?
Is there any API support for pagination in Quartz?
// All of the code below could be written using Java 8 Streams as well; I broke it down to see which operation is taking the time.
Scheduler scheduler = schedulerFactoryBean.getScheduler();
// groupName - my group name - filtering by group name
Set<TriggerKey> triggerKeys = scheduler.getTriggerKeys(GroupMatcher.triggerGroupEquals(groupName));
Iterator<TriggerKey> iterator = triggerKeys.iterator();

// This while block is taking around 12 seconds for around 150 jobs
while (iterator.hasNext()) {
    TriggerKey triggerKey = iterator.next();
    if (scheduler.getTriggerState(triggerKey).equals(TriggerState.NONE)) {
        iterator.remove();
    }
}

Map<TriggerKey, MyPOJO> myJobMap = new HashMap<>();
// This for loop is taking around 22 seconds for around 150 jobs
for (TriggerKey key : triggerKeys) {
    Trigger trigger = scheduler.getTrigger(key);
    JobDataMap jobDataMap = scheduler.getJobDetail(trigger.getJobKey()).getJobDataMap();
    MyPOJO myPOJO = (MyPOJO) jobDataMap.get("MYJOBPOJO");
    myJobMap.put(key, myPOJO);
}
I have gone through the Quartz documentation but did not find any specific details on how to improve the performance of this read operation.
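One option worth trying: each getTriggerState / getTrigger / getJobDetail call is its own JDBC round trip, so for 150 jobs the two loops pay well over 150 sequential round trips. The lookups are independent of each other, so they can be overlapped. Below is a minimal sketch, assuming the standard JDBC job store and a Scheduler that is safe for concurrent reads; it folds the two loops into one pass and is not code from the question:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.quartz.JobDataMap;
import org.quartz.SchedulerException;
import org.quartz.Trigger;
import org.quartz.Trigger.TriggerState;
import org.quartz.TriggerKey;

// Fan the per-key lookups out over a parallel stream so the JDBC
// round trips overlap instead of running one after another.
Map<TriggerKey, MyPOJO> myJobMap = new ConcurrentHashMap<>();
triggerKeys.parallelStream().forEach(key -> {
    try {
        // Skip triggers in state NONE, as in the original while loop.
        if (scheduler.getTriggerState(key) == TriggerState.NONE) {
            return;
        }
        Trigger trigger = scheduler.getTrigger(key);
        JobDataMap jobDataMap = scheduler.getJobDetail(trigger.getJobKey()).getJobDataMap();
        myJobMap.put(key, (MyPOJO) jobDataMap.get("MYJOBPOJO"));
    } catch (SchedulerException e) {
        throw new IllegalStateException(e);
    }
});

This only hides the latency rather than reducing the number of queries, so the connection pool needs enough spare connections for it to pay off.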
I'm working on a Java project where I'm calling a REST API using RestTemplate. I'm using CompletableFuture to make parallel calls; this is my code:
List<CompletableFuture<Object>> allUsersFuturesCalls = new ArrayList<>();
// User list contains 300 users
for (User user : listOfUsers) {
    // 1. Map Excel row to model
    User userToCreate = userService.MapUser(user);
    // Add the create-user call to the list
    // (userService.createUser calls restTemplate POST)
    allUsersFuturesCalls.add(userService.createUser(userToCreate));
}
// Trigger the calls
CompletableFuture.allOf(allUsersFuturesCalls.toArray(new CompletableFuture[0])).join();
This code works fine when listOfUsers contains 300 elements; it takes about 1 minute. But when I run it with another listOfUsers containing 4,000 users, the execution time increases to more than 15 minutes for the 4k calls.
Do you have any idea how to improve the performance of my CompletableFuture calls?
Regards!
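A follow-up thought that may help whoever hits this: if createUser builds its future with CompletableFuture.supplyAsync(...) without passing an executor, all 4,000 calls share the common ForkJoinPool, which is sized to the number of CPU cores, so the blocking HTTP calls queue up behind a handful of threads. Here is a minimal sketch of passing a dedicated executor instead; the pool size of 100, the createUserUrl field, and the method body are assumptions for illustration, not code from the question:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A pool sized for blocking I/O, instead of the CPU-sized common
// ForkJoinPool that supplyAsync falls back to by default.
private final ExecutorService httpPool = Executors.newFixedThreadPool(100);

public CompletableFuture<Object> createUser(User userToCreate) {
    return CompletableFuture.supplyAsync(
            () -> restTemplate.postForObject(createUserUrl, userToCreate, Object.class),
            httpPool); // explicit executor instead of the common pool
}

The right pool size depends on how many concurrent requests the downstream API tolerates, and the pool should be shut down once the batch is finished.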
I'm having a performance issue trying to query a large table with multiple threads.
I'm using Oracle, Spring 2, and Java 7.
I use a PoolDataSource (driver oracle.jdbc.pool.OracleDataSource) with as many connections as there are threads analysing a single table.
I made sure, by logging poolDataSource.getStatistics(), that I have enough available connections at any time.
Here is the code:
ExecutorService executorService = Executors.newFixedThreadPool(nbThreads);
List<Foo> foo = new ArrayList<>();
List<Callable<List<Foo>>> callables = new ArrayList<>();
int offset = 1;
// Build one callable per 10000-row chunk until the maximum offset is reached
while (!offSetMaxReached) {
    callables.add(new Callable<List<Foo>>() {
        @Override
        public List<Foo> call() throws SQLException, InterruptedException {
            return dao.doTheJob(...);
        }
    });
    offset += 10000;
}
for (Future<List<Foo>> fooFuture : executorService.invokeAll(callables)) {
    foo.addAll(fooFuture.get());
}
executorService.shutdown();
executorService.awaitTermination(1, TimeUnit.DAYS);
In the DAO, I use a connection from the PoolDataSource and Spring's JdbcTemplate.query(query, PreparedStatementSetter, RowCallbackHandler).
The doTheJob method does the exact same thing for every query result.
My queries look like: SELECT A, B, C FROM MY.BIGTABLE OFFSET ? ROWS FETCH NEXT ? ROWS ONLY
So, to sum up, I have n threads created by a fixed thread pool, each thread dealing with the exact same amount of data and doing the exact same things.
But each thread's completion takes longer than the last one!
Example:
4 threads launched at the same time, but the first row of each ResultSet (i.e. in the RowCallbackHandlers) is processed at:
thread 1 : 1.5s
thread 2 : 9s
thread 3 : 18s
thread 4 : 35s
and so on...
What can be the cause of such behavior?
Edit:
The main cause of the problem was within the queried table itself.
With OFFSET x ROWS FETCH NEXT y ROWS ONLY, Oracle has to scan through all the rows from the first one up to the offset.
So accessing offset 10 is much faster than accessing offset 10,000,000!
I was able to get good response times with temporary tables and good indexes.
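For reference, the usual alternative to deep OFFSET paging is keyset (seek) pagination: each chunk filters on an indexed key instead of making Oracle skip rows. A minimal sketch, assuming BIGTABLE has an indexed numeric ID column; the column name, the Foo constructor, and the JdbcTemplate wiring are assumptions, not part of the original code:

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;

// Keyset pagination: resume from the last key seen instead of making
// Oracle scan past 'offset' rows on every query.
List<Foo> fetchChunk(JdbcTemplate jdbcTemplate, long lastSeenId, int chunkSize) {
    String sql = "SELECT ID, A, B, C FROM MY.BIGTABLE "
               + "WHERE ID > ? ORDER BY ID FETCH NEXT ? ROWS ONLY";
    return jdbcTemplate.query(sql, new Object[] { lastSeenId, chunkSize },
            new RowMapper<Foo>() {
                @Override
                public Foo mapRow(ResultSet rs, int rowNum) throws SQLException {
                    return new Foo(rs.getString("A"), rs.getString("B"), rs.getString("C"));
                }
            });
}

With an index on ID, every chunk costs roughly the same no matter how deep into the table it is, which removes the "each thread slower than the last" effect.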
I am trying to write points to InfluxDB using their Java client.
Batching is important to me.
If I use influxDB.enableBatch with influxDB.write(Point), no data is inserted.
If I use BatchPoints and influxDB.write(batchPoints), the data is inserted successfully.
Both code samples are taken from: https://github.com/influxdata/influxdb-java/tree/influxdb-java-2.7
InfluxDB influxDB = InfluxDBFactory.connect(influxUrl, influxUser, influxPassword);
influxDB.setDatabase(dbName);
influxDB.setRetentionPolicy("autogen");
// Flush every 2000 Points, at least every 100ms
influxDB.enableBatch(2000, 100, TimeUnit.MILLISECONDS);
influxDB.write(Point.measurement("cpu")
        .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
        .addField("idle", 90L)
        .addField("user", 9L)
        .addField("system", 1L)
        .build());
Query query = new Query("SELECT idle FROM cpu", dbName);
QueryResult result = influxDB.query(query);
Returns nothing.
BatchPoints batchPoints = BatchPoints.database(dbName).tag("async", "true").build();
Point point1 = Point
        .measurement("cpu")
        .tag("atag", "test")
        .addField("idle", 90L)
        .addField("usertime", 9L)
        .addField("system", 1L)
        .build();
batchPoints.point(point1);
influxDB.write(batchPoints);
Query query = new Query("SELECT * FROM cpu ", dbName);
QueryResult result = influxDB.query(query);
This returns data successfully.
As mentioned, I need the first way to function.
How can I achieve that?
versions:
influxdb-1.3.6
influxdb-java:2.7
Regards, Ido
Maybe it's too late or you have already resolved your issue, but I will answer your question anyway; it may be useful for others.
I think your first example is not working because you enabled the batch functionality, which will "flush every 2000 points, at least every 100ms". So basically it is working, but you are running your select before the actual save is performed.
When you use the influxDB.enableBatch(...) functionality, influxdb-java creates an internal thread pool that stores your data once enough points have been collected or the timeout fires, so the write does not happen immediately.
In the second example, when you use influxDB.write(batchPoints), influxdb-java writes your data to InfluxDB synchronously. That's why your select statement is able to return the data immediately.
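To see this in action, here is a minimal sketch: the batch write stays exactly as in the question, but the select waits past the 100 ms flush interval (the 200 ms sleep is an arbitrary value for illustration, and the surrounding method must allow InterruptedException):

influxDB.enableBatch(2000, 100, TimeUnit.MILLISECONDS);

influxDB.write(Point.measurement("cpu")
        .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
        .addField("idle", 90L)
        .build());

// The batch is flushed either when 2000 points accumulate or when the
// 100 ms timer fires, so give the background thread a chance to run.
Thread.sleep(200);

QueryResult result = influxDB.query(new Query("SELECT idle FROM cpu", dbName));

In a real application you would not sleep, of course; the point is only that with batching enabled the write is asynchronous, so a read issued immediately afterwards may not see the data yet.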
I am working on a system in which I need to store Avro schemas in a Cassandra database. So in Cassandra we will be storing something like this:
SchemaId AvroSchema
1 some schema
2 another schema
Now suppose that as soon as I insert another row in the above table, it looks like this:
SchemaId AvroSchema
1 some schema
2 another schema
3 another new schema
As soon as I insert a new row in the above table, I need my Java program to go and pull the new schema id and the corresponding schema.
What is the right way to solve this kind of problem?
I know one way is polling: say every 5 minutes we go and pull the data from the above table. But this does not feel like the right way to solve the problem, since every 5 minutes I am doing a pull whether or not there are any new schemas.
Is there any other solution apart from this?
Can we use Apache Zookeeper? Or is Zookeeper not a fit for this problem?
Or any other solution?
I am running Apache Cassandra 1.2.9.
Some solutions:
With database triggers: Cassandra 2.0 has some trigger support, but it looks like it is not final and might change a little in 2.1, according to this article: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support. Triggers are a common solution.
You brought up polling, but that is not always a bad option, especially if you have something that marks a row as not yet pulled, so you can pull only the new rows out of Cassandra (see the sketch below, after the ZooKeeper note). Polling once every 5 minutes is nothing load-wise for Cassandra or any database, as long as the query is not expensive. This option might not be good if new rows get inserted very infrequently.
Zookeeper would not be a perfect solution; see this quote:
"Because watches are one time triggers and there is latency between getting the event and sending a new request to get a watch you cannot reliably see every change that happens to a node in ZooKeeper. Be prepared to handle the case where the znode changes multiple times between getting the event and setting the watch again. (You may not care, but at least realize it may happen.)"
Quote sourced from: http://zookeeper.apache.org/doc/r3.4.2/zookeeperProgrammers.html#sc_WatchRememberThese
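As an illustration of the polling option above, here is a minimal sketch using the DataStax Java driver (which needs the native transport enabled on the 1.2.x node). The schemas table, its pulled marker column, and the secondary index on that column are assumptions for the example, not part of the original question:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SchemaPoller {
    public static void main(String[] args) throws InterruptedException {
        // Hypothetical table: schemas(schemaid int PRIMARY KEY,
        // avroschema text, pulled boolean) with an index on 'pulled'.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace");
        while (true) {
            ResultSet rs = session.execute(
                    "SELECT schemaid, avroschema FROM schemas WHERE pulled = false");
            for (Row row : rs) {
                // Hand the new schema to the application here.
                System.out.println("New schema " + row.getInt("schemaid")
                        + ": " + row.getString("avroschema"));
                // Mark the row as consumed so the next poll skips it.
                session.execute("UPDATE schemas SET pulled = true WHERE schemaid = "
                        + row.getInt("schemaid"));
            }
            Thread.sleep(5 * 60 * 1000L); // the 5-minute interval from the question
        }
    }
}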
Cassandra 3.0:
You can use the trigger below, and it will give you everything in the insert as a JSON object.
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;

import org.apache.cassandra.config.ColumnDefinition;
import org.apache.cassandra.db.Clustering;
import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.db.partitions.Partition;
import org.apache.cassandra.db.rows.Cell;
import org.apache.cassandra.db.rows.Unfiltered;
import org.apache.cassandra.db.rows.UnfilteredRowIterator;
import org.apache.cassandra.triggers.ITrigger;
import org.json.JSONObject; // any JSON library with a similar put() API works
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HelloWorld implements ITrigger
{
    private static final Logger logger = LoggerFactory.getLogger(HelloWorld.class);

    public Collection<Mutation> augment(Partition partition)
    {
        String tableName = partition.metadata().cfName;
        logger.info("Table: " + tableName);
        JSONObject obj = new JSONObject();
        // The partition key, rendered through the table's key validator
        obj.put("message_id", partition.metadata().getKeyValidator().getString(partition.partitionKey().getKey()));
        try {
            UnfilteredRowIterator it = partition.unfilteredIterator();
            while (it.hasNext()) {
                Unfiltered un = it.next();
                Clustering clt = (Clustering) un.clustering();
                Iterator<Cell> cells = partition.getRow(clt).cells().iterator();
                Iterator<ColumnDefinition> columns = partition.getRow(clt).columns().iterator();
                while (columns.hasNext()) {
                    ColumnDefinition columnDef = columns.next();
                    Cell cell = cells.next();
                    String data = new String(cell.value().array()); // if the cell type is text
                    obj.put(columnDef.toString(), data);
                }
            }
        } catch (Exception e) {
            logger.error("Failed to read partition of " + tableName, e); // don't swallow errors silently
        }
        logger.debug(obj.toString());
        return Collections.emptyList();
    }
}
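To actually fire this, the class has to be packaged into a jar, dropped into Cassandra's triggers directory, and registered with CQL along the lines of CREATE TRIGGER my_trigger ON mykeyspace.mytable USING 'HelloWorld'; (the trigger, keyspace, and table names here are placeholders).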
I wanted to integrate MongoDB into my application. I have tested it using the Apache benchmarking tool (ab), producing 100,000 incoming requests with a concurrency level of 1000. After some insertion tests, I can see that MongoDB is inserting around 1000 rec/sec. But that is not sufficient for my application. Can anybody suggest the best way to improve performance, so that I can achieve the goal of 2000 rec/sec?
My code is:
// One shared Mongo instance and collection, configured once at class load
private static final MongoOptions mo = new MongoOptions();
private static final Mongo m;
private static final DB db;
private static final DBCollection coll;

static {
    mo.connectionsPerHost = 20;
    mo.threadsAllowedToBlockForConnectionMultiplier = 100;
    try {
        m = new Mongo("127.0.0.1", mo);
    } catch (UnknownHostException e) {
        throw new ExceptionInInitializerError(e);
    }
    db = m.getDB("mydb");
    coll = db.getCollection("mycoll");
}

// Per-request insert logic:
DBObject dbObj = (DBObject) JSON.parse(msg);
db.requestStart();
coll.insert(dbObj);
dbObj.removeField("_id");
dbObj.put("val", "-10");
coll.insert(dbObj);
db.requestDone();
Having 1000 clients hitting the DB at once (which is what I assume you mean by a concurrency level of 1000) sounds high to me. If MongoDB is running on a 1-2 core system, your box is probably spending a lot of time switching between the different processes. Are the DB and the benchmarking tool running on the same box? That will also increase the time spent on process switching.
You could try putting the client on one multi-core box and the DB on another.
Or try running fewer simulated clients, maybe 10-20.
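One more thing that may be worth trying: the 2.x Java driver accepts a list of documents in a single insert call, which amortizes the network round trip over many documents. A minimal sketch, where the batch size of 100 and the incomingMessages source are placeholders for illustration:

import java.util.ArrayList;
import java.util.List;

import com.mongodb.DBObject;
import com.mongodb.util.JSON;

// Accumulate parsed documents and insert them in one call instead of
// paying one network round trip per document.
List<DBObject> batch = new ArrayList<DBObject>();
for (String msg : incomingMessages) {
    batch.add((DBObject) JSON.parse(msg));
    if (batch.size() == 100) {
        coll.insert(batch); // DBCollection.insert(List<DBObject>) in the 2.x driver
        batch.clear();
    }
}
if (!batch.isEmpty()) {
    coll.insert(batch);
}

Whether this helps depends on where the bottleneck really is; if the server is CPU-bound under 1000 concurrent clients, client-side batching alone won't reach 2000 rec/sec.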