How to improve CompletableFuture join performance - Java

I'm working on a Java project where I'm calling a REST API using RestTemplate. I'm using CompletableFuture to make parallel calls. This is my code:
List<CompletableFuture<Object>> allUsersFuturesCalls = new ArrayList<>();
// The user list contains 300 users
for (User user : listOfUsers) {
    // 1. Map the Excel row to the model
    User userToCreate = userService.MapUser(user);
    // Add the create-user call to the list
    // (userService.createUser calls restTemplate POST)
    allUsersFuturesCalls.add(userService.createUser(userToCreate));
}
// Trigger the calls
CompletableFuture.allOf(allUsersFuturesCalls.toArray(new CompletableFuture[0])).join();
This code works fine when listOfUsers contains 300 elements: it takes about 1 minute. But when I run it with another listOfUsers that contains 4,000 users, the execution time balloons to more than 15 minutes for the 4k calls.
Do you have any idea how to improve the performance of my CompletableFuture calls?
Regards!
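For reference, one common cause of this pattern slowing down: if createUser builds its future with CompletableFuture.supplyAsync and no explicit Executor, every blocking RestTemplate call runs on the shared ForkJoinPool.commonPool(), which is sized for CPU-bound work, not for thousands of blocking HTTP calls. A minimal sketch of passing a dedicated, bounded pool instead (the pool, its size, and the callRestPost wrapper are assumptions, not the poster's actual code):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class UserService {
    // Assumption: a dedicated pool for blocking REST calls; size it to what
    // the downstream API can handle concurrently
    private final ExecutorService restPool = Executors.newFixedThreadPool(50);

    CompletableFuture<Object> createUser(Object userToCreate) {
        return CompletableFuture.supplyAsync(
                () -> callRestPost(userToCreate), // hypothetical wrapper around the RestTemplate POST
                restPool);                        // explicit executor instead of the common pool
    }

    private Object callRestPost(Object user) {
        // restTemplate.postForObject(...) would go here
        return user;
    }
}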

Related

Android Room Paging Results For Export. Potential Problems with my solution

I'm writing an Android app that supports exporting the app database to various formats. I don't want to run out of memory, but I want to page through the results easily without receiving updates when the data changes. So I put it in a service and came up with the following method of paging.
I use a LIMIT clause in my query to cap the number of results returned, and I sort on the primary key, so it should be fast. I use a pair of nested for loops to execute the series of queries until no results are returned, walking through each page of results, so it's linear. It's in a service, so it doesn't matter that I'm using immediate (non-observable) results here.
I feel like I might be doing something bad here. Am I?
// Page through all results
for (List<CountedEventType> typeEvents = dao.getEventTypesPaged2(0);
        typeEvents.size() > 0;
        typeEvents = dao.getEventTypesPaged2(typeEvents.get(typeEvents.size() - 1).uid)) {
    for (CountedEventType type : typeEvents) {
        // Do something for every result.
    }
}
Here's my dao method.
@Dao
interface ExportDao {
    @Query("SELECT * FROM CountedEventType WHERE uid > :lastUid ORDER BY uid ASC LIMIT 4")
    List<CountedEventType> getEventTypesPaged2(int lastUid);
}
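The keyset approach itself looks sound. One tweak worth considering: LIMIT 4 means one query round-trip per 4 rows, which is a lot of round-trips for a full export. Room can bind the limit as a query parameter, so the page size becomes tunable in one place; a sketch (the :pageSize parameter is my addition, not the original DAO):

@Dao
interface ExportDao {
    // Same keyset pagination, but with the page size bound as a parameter
    @Query("SELECT * FROM CountedEventType WHERE uid > :lastUid ORDER BY uid ASC LIMIT :pageSize")
    List<CountedEventType> getEventTypesPaged2(int lastUid, int pageSize);
}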

How can I get JPA/Entity Manager to make parallel queries instead of lumping them into one batch?

Inside the doGet method in my servlet I'm using a JPA TypedQuery to retrieve my data. I'm able to get the data I want through an HTTP GET request. The method to get the data takes roughly 10 seconds, and when I make a single request all is good. The problem occurs when I get multiple requests at the same time: if I make 4 requests at once, all 4 queries are lumped together and it takes 40 seconds to get the data back for all of them. How can I get JPA to run 4 separate queries in parallel? Is this something that needs to be set in persistence.xml, or is it a code issue? Note: I've also tried executing this code in a thread. A link and some appropriate terminology to increase my understanding would be appreciated.
Thanks!
EntityManager em = null; // declared outside try so the finally block can see it
try {
    String sequenceNo = request.getParameter("sequenceNo");
    // Note: creating the factory on every request is expensive; see the note after the answer
    EntityManagerFactory emf = Persistence.createEntityManagerFactory("mydbcon");
    em = emf.createEntityManager();
    long startTime = System.currentTimeMillis();
    List<Myeo> returnData = methodToGetData(em);
    System.out.println(sequenceNo + " " + (System.currentTimeMillis() - startTime));
    String myJson = new Gson().toJson(returnData);
    resp.getOutputStream().print(myJson);
    resp.getOutputStream().flush();
} finally {
    resp.getOutputStream().close();
    if (em != null && em.isOpen())
        em.close();
}
Four simultaneous request samples:
localhost/myservlet/mycodeblock?sequenceNo=A
localhost/myservlet/mycodeblock?sequenceNo=B
localhost/myservlet/mycodeblock?sequenceNo=C
localhost/myservlet/mycodeblock?sequenceNo=D
Resulting print statements:
A 38002
B 38344
C 38785
D 39065
What I want
A 9002
B 9344
C 9785
D 10065
If you make 4 separate GET requests, they should run in parallel. They should not be lumped together, since they run in different transactions.
If that is not what you observe, check whether you have configured a database connection pool size or a servlet thread pool size that serializes the calls to the DBMS.
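One more thing worth checking in the posted code: Persistence.createEntityManagerFactory("mydbcon") is called inside doGet, so every request builds a new factory (and typically a new connection pool), which is expensive and can serialize work across requests. A sketch of creating the factory once per servlet instead (the persistence-unit name is from the question; the servlet class name is illustrative):

import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.servlet.http.HttpServlet;

public class MyServlet extends HttpServlet {
    // Thread-safe and expensive to create: build it once, share across requests
    private EntityManagerFactory emf;

    @Override
    public void init() {
        emf = Persistence.createEntityManagerFactory("mydbcon");
    }

    @Override
    public void destroy() {
        emf.close();
    }

    // In doGet, create one short-lived EntityManager per request:
    //   EntityManager em = emf.createEntityManager();
    //   try { ... } finally { em.close(); }
}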

Influx db java client batch does not write to DB

I am trying to write points to InfluxDB using their Java client.
Batching is important to me.
If I use influxDB.enableBatch with influxDB.write(Point), no data is inserted.
If I use BatchPoints and influxDB.write(batchPoints), data is inserted successfully.
Both code samples are taken from: https://github.com/influxdata/influxdb-java/tree/influxdb-java-2.7
InfluxDB influxDB = InfluxDBFactory.connect(influxUrl, influxUser, influxPassword);
influxDB.setDatabase(dbName);
influxDB.setRetentionPolicy("autogen");
// Flush every 2000 points, at least every 100 ms
influxDB.enableBatch(2000, 100, TimeUnit.MILLISECONDS);
influxDB.write(Point.measurement("cpu")
        .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
        .addField("idle", 90L)
        .addField("user", 9L)
        .addField("system", 1L)
        .build());
Query query = new Query("SELECT idle FROM cpu", dbName);
QueryResult result = influxDB.query(query);
Returns nothing.
BatchPoints batchPoints = BatchPoints.database(dbName).tag("async", "true").build();
Point point1 = Point
        .measurement("cpu")
        .tag("atag", "test")
        .addField("idle", 90L)
        .addField("usertime", 9L)
        .addField("system", 1L)
        .build();
batchPoints.point(point1);
influxDB.write(batchPoints);
Query query = new Query("SELECT * FROM cpu", dbName);
QueryResult result = influxDB.query(query);
This returns data successfully.
As mentioned, I need the first approach to work.
How can I achieve that?
versions:
influxdb-1.3.6
influxdb-java:2.7
Regards, Ido
Maybe it's too late, or you have already resolved your issue, but I'll answer anyway; it may be useful for others.
I think your first example appears not to work because you enabled batching, which flushes every 2000 points, or at least every 100 ms. So it is actually working, but you are running the SELECT before the save has been flushed.
When you use influxDB.enableBatch(...), the influxdb-java client creates an internal thread pool that writes your data after enough points are collected or a timeout elapses; the write does not happen immediately.
In the second example, influxDB.write(batchPoints) writes your data to InfluxDB synchronously. That's why your SELECT statement returns the data immediately.
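To see the batched write land, either wait past the flush interval before querying, or write synchronously as in the second example. A sketch using only the calls already shown, with a sleep comfortably above the 100 ms flush interval (the 200 ms value is arbitrary):

influxDB.enableBatch(2000, 100, TimeUnit.MILLISECONDS);
influxDB.write(Point.measurement("cpu")
        .time(System.currentTimeMillis(), TimeUnit.MILLISECONDS)
        .addField("idle", 90L)
        .build());

// The background batch thread flushes at most every 100 ms;
// give it time to run before reading the point back
Thread.sleep(200);

QueryResult result = influxDB.query(new Query("SELECT idle FROM cpu", dbName));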

Getting 'No Node Available Exception' while using TransportClient of Elasticsearch

I am pretty new to Elasticsearch. Here is my setup, and I am looking for a solution to the 'No Node Available Exception' problem in the following scenario:
1) We have ES running on a system with 1 node and 1 cluster.
2) We have 4 indexes on ES. (Each index holds a different type of data, for example: Customer Preferences / Customer Address / Customer Interests / Customer Basic Details.)
3) We have a web application (as a web service) running on Tomcat.
4) We call web service methods as controllers. These receive requests from consumers in the form of JSON data.
5) Based on that data (for example, if a consumer asks for Customer Preferences for a given customer ID, we go to the 'Customer Preferences' index), we dispatch to the service layers (using Spring).
6) In each service layer we get the TransportClient instance from a singleton object, wait for its response, and return the result to the controller.
In a scenario where a consumer asks for all 4 types of data for a customer, requesting preferences, address, interests, and basic details in sequence works well, but it costs performance. So we want to fetch the data in parallel.
We used Spring task executors to do this in parallel. In that case we get data from one index and the others throw 'No Node Available Exception'. It's pretty random which request hits this problem.
Please help me here.
Thanks in advance!
I had a similar issue when trying to write data into the same ES node from multiple web applications. I fixed it by creating separate nodes for each.
I suggest you try these ES settings:
client.transport.sniff=true
sniffOnConnectionFault=true
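A sketch of applying sniffing when constructing the client (shown against the Elasticsearch 2.x transport client API; the cluster name and address are assumptions, so adjust to your version and setup):

Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "my-cluster")    // assumption: your cluster name
        .put("client.transport.sniff", true)  // discover the rest of the cluster's nodes
        .build();
TransportClient client = TransportClient.builder().settings(settings).build()
        .addTransportAddress(new InetSocketTransportAddress(
                new InetSocketAddress("localhost", 9300)));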
You can also get the data from all 4 indexes in a single query, covering Customer Preference / Customer Address / Customer Interests / Customer Basic Details.
Example code:
SearchRequestBuilder srb = client
        .prepareSearch("preference_index", "address_index", "interests_index", "details_index")
        .setTypes("preference_doc", "address_doc", "interests_doc", "details_doc")
        .setSearchType(SearchType.DEFAULT);
QueryBuilder boolBuilder = QueryBuilders.boolQuery().should(
        QueryBuilders.matchQuery("id_customer", "14"));
// Attach the query to the request (it was built but never set in the original)
SearchResponse response = srb.setQuery(boolBuilder).setSize(4).execute().actionGet();
SearchHit[] docs = response.getHits().getHits();

Twitter API rate limit

Here is what I'm trying to do:
I have a list of Twitter user IDs; for each one of them I need to retrieve the complete list of his follower IDs and friend IDs. I don't need anything else, no screen names etc.
I'm using twitter4j, by the way.
Here is how I'm doing it:
For each user, I execute the following code to get the complete list of his follower IDs:
long lCursor = -1;
do {
    IDs response = t.getFollowersIDs(id, lCursor);
    long tab[] = response.getIDs();
    for (long val : tab) {
        myIdList.add(val);
    }
    lCursor = response.getNextCursor();
} while (lCursor != 0);
My problem:
According to this page: https://dev.twitter.com/docs/api/1.1/get/followers/ids
the rate limit for getFollowersIDs() is 15 requests per window. Since this method returns at most 5,000 IDs per call, that means it is only possible to get 15 * 5000 = 75,000 IDs per window (or 15 users, if each has fewer than 5,000 followers).
This is really not enough for what I'm trying to do.
Am I doing something wrong? Are there any solutions to improve this, even slightly?
Thanks for your help :)
The rate limit for that endpoint in v1.1 is 15 calls per 15 minutes per access token. See https://dev.twitter.com/docs/rate-limiting/1.1 for more information about the limits.
With that in mind, if you have an access token for each of your users, you should be able to fetch up to 75,000 (15*5000) follower IDs every 15 minutes for each access token.
If you only have one access token, you'll unfortunately be limited in the manner you described, and will just have to detect when your application hits the rate limit and continue processing once the 15 minutes are up. A sketch of that handle-and-wait loop follows.
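This sketch reuses the poster's variable names plus Twitter4J's rate-limit accessors (the wrapping method is my addition; error handling beyond the rate limit is omitted):

import java.util.ArrayList;
import java.util.List;
import twitter4j.IDs;
import twitter4j.Twitter;
import twitter4j.TwitterException;

class FollowerFetcher {
    // Fetch all follower IDs, sleeping through rate-limit windows as needed
    static List<Long> fetchAllFollowerIds(Twitter t, long id)
            throws TwitterException, InterruptedException {
        List<Long> myIdList = new ArrayList<>();
        long lCursor = -1;
        do {
            try {
                IDs response = t.getFollowersIDs(id, lCursor);
                for (long val : response.getIDs()) {
                    myIdList.add(val);
                }
                lCursor = response.getNextCursor();
            } catch (TwitterException e) {
                if (!e.exceededRateLimitation()) {
                    throw e; // not a rate-limit problem; rethrow
                }
                // Sleep until the window resets (plus a second of slack),
                // then retry the same cursor
                int wait = e.getRateLimitStatus().getSecondsUntilReset() + 1;
                Thread.sleep(wait * 1000L);
            }
        } while (lCursor != 0);
        return myIdList;
    }
}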
