Multiple database queries in parallel, for a single client request

To complete certain user requests, my application issues multiple DB queries from a single method. They currently execute sequentially, so the application blocks until it has received the response for one query before proceeding to the next. I would like to issue the queries in parallel instead.
Also, after issuing the queries I would like to do some other work (instead of blocking until the responses arrive), and when the response for each query arrives I would like to execute a code block specific to that query's data. How can I do this?
Edit: My DB API does provide connection pooling.
I am only slightly familiar with Java multithreading.
Using:
Java 1.6
Cassandra 1.1 Database with Hector

Things you should understand before you start:
To benefit from concurrency, you need multiple DB connections. The best way to get them is to use a connection pool.
You need to create a Runnable or Callable class for executing a DB statement. You will also need some messaging mechanism to alert listeners when a query has completed.
When you send multiple requests at the same time, all bets are off as to which will complete first, and there may be conflicts between statements that destabilize your app.
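The points above (a Callable per statement, plus a way to alert a listener on completion) can be sketched roughly as follows. This is only an illustration under assumptions: QueryListener, QueryRunner and QueryRunnerDemo are hypothetical names, the two "queries" are placeholders rather than real Hector calls, and the lambdas need Java 8 (on Java 1.6 they would be anonymous classes).

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical callback type: error is null when the query succeeded. */
interface QueryListener<T> {
    void onComplete(T data, Exception error);
}

class QueryRunner {
    private final ExecutorService pool;

    QueryRunner(int threads) {
        pool = Executors.newFixedThreadPool(threads);
    }

    /** Runs the query on a pool thread, then calls the listener on that same thread. */
    <T> void submit(Callable<T> query, QueryListener<T> listener) {
        pool.execute(() -> {
            T data;
            try {
                data = query.call();
            } catch (Exception e) {
                listener.onComplete(null, e);
                return;
            }
            listener.onComplete(data, null);
        });
    }

    /** Stops accepting queries and waits for the submitted ones to finish. */
    boolean shutdownAndWait(long timeout, TimeUnit unit) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

class QueryRunnerDemo {
    /** Placeholder queries stand in for real database calls. */
    static List<String> run() {
        List<String> handled = new CopyOnWriteArrayList<>();
        Callable<Integer> countUsers = () -> 3;
        Callable<String> readStatus = () -> "ok";
        QueryRunner runner = new QueryRunner(2);
        runner.submit(countUsers, (rows, err) -> handled.add("users: " + rows + " rows"));
        runner.submit(readStatus, (status, err) -> handled.add("status: " + status));
        // The caller is free to do other work here while the queries run.
        runner.shutdownAndWait(5, TimeUnit.SECONDS);
        return handled;
    }
}
```

Because each listener runs on a worker thread, anything it touches (here a CopyOnWriteArrayList) has to be thread-safe, and the order in which the listeners fire is not guaranteed.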

I have a similar task. To build a complete result I need to send several requests to several different services (some over REST, some over Thrift), and to decrease latency I need to send them in parallel. My idea is to use java.util.concurrent.Future and write a simple aggregation manager that issues all the requests together, waits until the last response has been retrieved, and returns all the needed data. A more advanced version of this manager could build the final result while other queries are still running, but that approach may not be thread-safe.
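A minimal sketch of such an aggregation manager, assuming every request can be wrapped in a Callable. The class name Aggregator and the method fetchAll are made up for illustration; invokeAll and Future.get are standard java.util.concurrent calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class Aggregator {
    /** Runs all requests in parallel and returns their results in submission order. */
    static <T> List<T> fetchAll(List<Callable<T>> requests) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, requests.size()));
        try {
            List<T> results = new ArrayList<T>();
            // invokeAll blocks until every task has completed, normally or with an exception.
            for (Future<T> future : pool.invokeAll(requests)) {
                results.add(future.get()); // the task is already done, so this does not wait
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for responses", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The result list is assembled on the calling thread only after all responses are in, which sidesteps the thread-safety concern at the cost of not overlapping aggregation with the slower requests.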

Here's a very trivial/limited approach:
final Connection conn = ...;
final Object[] result = new Object[1];
Thread t1 = new Thread(new Runnable() {
    public void run() {
        // runs on the worker thread
        result[0] = conn.executeQuery();
    }
});
t1.setName("DBQueryWorker");
t1.start();
// do other work
try {
    t1.join(); // block until the worker finishes instead of spinning on isAlive()
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
// after join() returns normally, result[0] holds the query result
This is a simple approach, but many others are possible (e.g. thread pooling via the java.util.concurrent executors, Spring task executors, etc.).

Related

LDAP VLV throws error "Other sort requests already in progress"

I am trying to implement pagination in LDAP using VLV, following the documentation at https://docs.ldap.com/ldap-sdk/docs/javadoc/com/unboundid/ldap/sdk/controls/VirtualListViewRequestControl.html
It works fine with a single thread, and with up to 5 concurrent threads. When the number of threads goes beyond that, only 5 threads run successfully and the rest fail with the error message below:
LDAPException(resultCode=51 (busy), numEntries=0, numReferences=0, diagnostiMessage='Other sort requests already in progress', ldapSDKVersion=5.1.1..
I am using OpenLDAP with the UnboundID API from Java. The data size is around 100k.
I get the same error with a single connection and with multiple connections (with multiple concurrent threads).
I tried putting the data fetch in a synchronized block.
I tried making the thread wait and retry on exception.
None of this worked; the threads still could not fetch data from LDAP.
After closing and reopening the connection, as described in https://www.openldap.org/lists/openldap-technical/201107/msg00006.html
a failed thread can fetch data, but only after a lot of retries: in my case the thread retried about 2k times before it started fetching data.
Is there a better solution? Retrying 2k times to get a result is not a good option.

From my experience in Java, it is better to use thread pools, which shift your solution from "how to manage threads" to a more robust, task-oriented one.
To the point (of your use case): you may want to define a thread pool with a fixed number of threads. The pool will handle all incoming load by reusing its threads. This is efficient because more threads do not equal more performance. You want a mechanism that reuses threads, rather than one that keeps opening and closing threads and ends up using too many of them.
You may start with something similar to this:
ExecutorService executorService = Executors.newFixedThreadPool(10);
Future<SearchResult> task1 = executorService.submit(() -> {
    // your logic goes here
    return result;
});
SearchResult result = task1.get();
This is an oversimplified piece of code, but you can clearly see that:
Tasks can be submitted dynamically (for example, from a stack)
Results are fetched through the Future (get() blocks until the result is ready, so no polling is needed)
The thread pool manages the load, so you can tweak your configuration and boost performance without changing your code (perfect for various environments that may want to configure your solution to suit their hardware profile)
I think you should give it a try. After all, retrying 2000 times before success is really not ideal 🙃
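To make the fixed-size idea concrete for this case, here is a small sketch that caps the number of searches in flight at the pool size. It assumes, based only on the question's observation, that 5 concurrent sort requests are tolerated; BoundedSearches and peakConcurrency are hypothetical names, and Thread.sleep stands in for the real VLV search.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedSearches {
    /** Submits all searches at once and returns the highest number that ran simultaneously. */
    static int peakConcurrency(int searches, int maxConcurrent) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < searches; i++) {
                final int page = i;
                futures.add(pool.submit(() -> {
                    // record how many searches are running right now
                    peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    try {
                        Thread.sleep(20); // placeholder for one VLV page search
                        return page;
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }));
            }
            for (Future<Integer> future : futures) {
                future.get(); // wait for every page
            }
            return peak.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling BoundedSearches.peakConcurrency(20, 5) queues 20 searches but never runs more than 5 at a time; the extra tasks wait in the pool's queue instead of hitting the server and being rejected as busy.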

Creating workers in Spring

I'm writing a web server with Spring (MVC, Data, Security) that serves tasks to physical devices (around 100 devices).
The devices have no built-in query implementation. For example, to execute a task you need to write something like this:
Device driver = new DeviceDriver();
driver.setSettings(settingsJson);
driver.open(); // no one else can connect to this device while it is open; open() can take up to 1 second
driver.setTask(taskJson);
driver.processTask(); // each task takes a few seconds to execute
String results = driver.getResults();
driver.close();
I'm not really an expert in designing architecture, so for now I have implemented the web server like this:
TaskController (@RestController) - processes incoming POST requests with tasks and persists them to the database.
DeviceService (@Service) - has an init method that gets the list of devices from the DB and creates and starts one worker per device. It passes taskRepository to each worker, so the worker can save task results.
Worker - extends Thread and fetches the next task from the database periodically (via a loop with sleep). When a task has been executed, the worker saves the result to the DB and updates the status of the task.
Does this approach make any sense? Maybe there is a better way to do this using Spring components instead of Thread.

I would not create a worker for each device (client), because your controller can already serve concurrent requests when it is deployed on a thread-per-request server. Additionally, this is not scalable at all: what if a new device is onboarded? With the current design you need to make changes in the database and restart the service.
If you need device-specific actions, you can just pass them in the request parameters from the device client. Therefore, there is no need to keep a predefined set of workers.
So the design looks good except for the worker set.

Use the @Scheduled annotation on your methods to build something like cron.

How to optimize Tomcat for Feed pull

We have a mobile app that presents a feed to users. The feed REST API is implemented on Tomcat and makes parallel calls to different data sources, such as Couchbase and MySQL, to assemble the content. Simplified code is given below:
Future<List<CardDTO>> pnrFuture = null;
Future<List<CardDTO>> newsFuture = null;
ExecutionContext ec = ExecutionContexts.fromExecutorService(executor);
final List<CardDTO> combinedDTOs = new ArrayList<CardDTO>();
// Array list of futures
List<Future<List<CardDTO>>> futures = new ArrayList<Future<List<CardDTO>>>();
futures.add(future(new PNRFuture(pnrService, userId), ec));
futures.add(future(new NewsFuture(newsService, userId), ec));
futures.add(future(new SettingsFuture(userPreferenceManager, userId), ec));
Future<Iterable<List<CardDTO>>> futuresSequence = sequence(futures, ec);
// combine the cards
Future<List<CardDTO>> futureSum = futuresSequence.map(
    new Mapper<Iterable<List<CardDTO>>, List<CardDTO>>() {
        @Override
        public List<CardDTO> apply(Iterable<List<CardDTO>> allDTOs) {
            for (List<CardDTO> cardDTOs : allDTOs) {
                if (cardDTOs != null) {
                    combinedDTOs.addAll(cardDTOs);
                }
            }
            Collections.sort(combinedDTOs);
            return combinedDTOs;
        }
    }
);
Await.result(futureSum, Duration.Inf());
return combinedDTOs;
Right now we have around 4-5 parallel tasks per request. But it is expected to grow to almost 20-25 parallel tasks as we introduce new kinds of items in feed.
My question is, how can I improve this design? What kind of tuning is required in Tomcat to make sure such 20-25 parallel calls can be served optimally under heavy load.
I understand this is a broad topic, but any suggestions would be very helpful.

Tomcat just manages the incoming HTTP connections and pushes the bytes back and forth. There is no Tomcat optimization that can be done to make your application run any better.
If you need 25 parallel processes to run for each incoming HTTP request, and you think that's crazy, then you need to re-think how your application works.
No tomcat configuration will help with what you've presented in your question.

I understand you are calling this from a mobile app and the number of feeds could go up.
Depending on the amount of data being returned, would it be possible to return the results of some feeds in the same call?
That way the server does the work.
You are in control of the server; you are not in control of the user's device and their connection speed.
As nickebbit suggested, things like DeferredResult are really easy to implement.
Is it possible that the data from these feeds does not change very quickly? If so, you should investigate the use of EHCache and the @Cacheable annotation.
You could come up with a solution where the user always pulls a cached version of your content from your Tomcat server, while the Tomcat server constantly updates that cache in the background.
It's an extra piece of work, but at the end of the day, if the user experience is not fast, users will not want to use this app.
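The "serve from cache, refresh in the background" idea can be sketched in plain Java as below. Note that this does not use EHCache or @Cacheable; it swaps in a ScheduledExecutorService and an AtomicReference to show the shape of the idea, and RefreshingCache is a hypothetical class name.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class RefreshingCache<T> {
    private final AtomicReference<T> current;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Loads once up front, then reloads in the background after every period. */
    RefreshingCache(Supplier<T> loader, long period, TimeUnit unit) {
        current = new AtomicReference<>(loader.get());
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                current.set(loader.get());
            } catch (RuntimeException e) {
                // Keep serving the previous value; an uncaught exception
                // would cancel all further refreshes.
            }
        }, period, period, unit);
    }

    /** Returns the most recently loaded value without touching the data source. */
    T get() {
        return current.get();
    }

    /** Stops the background refresh thread. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

With something like this, a request handler only calls get(), so its latency no longer depends on the slowest data source; the trade-off is that users may see content that is up to one refresh period old.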

It looks like you're using Akka but not really embracing the Actor model; doing so will likely increase the parallelism, and therefore the scalability, of your app.
If it was me I'd hand requests off from my REST API to a single or pool of coordinating actors that will process the request asynchronously. Using Spring's RestController this can be done using a Callable or DeferredResult but there will obviously be an equivalent in whatever framework you are using.
This coordinating actor would then in turn hand off processing to other actors (i.e. workers) that take care of the I/O bound tasks (preferably using their own dispatcher to ensure other CPU bound threads do not get blocked) and respond to the coordinator with their results.
Once all workers have fetched their data and replied to the coordinator with the results then the original request can be completed with the full result set.

How to listen new db records through java

Currently I use while(true) and Thread.sleep() to check for new records in the DB and execute Java code.
Here is an example:
public class StartCommands implements Runnable {
    private Active_Job activeJob;
    Runnable execute_command;

    public StartCommands() {
        activeJob = new Active_Job();
    }

    @Override
    public void run() {
        int jobId = 0;
        while (true) {
            // access the db and get one row from the table by the status
            jobId = activeJob.get(Status.NEW);
            if (jobId > 0) {
                activeJob.updateStatus(Status.INIT);
                execute_command = activeJob.getCommand();
                new Thread(execute_command).start();
                activeJob = new Active_Job();
                jobId = 0;
            }
            try {
                Thread.sleep(10 * 1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // stop polling when interrupted
            }
        }
    }
}
I use this method in a few places in the code, but I don't like the endless loop that checks for a new row every 10 seconds.
So what I'm looking for is some kind of listener: once a new record has been inserted, execute Java code. Some of the inserts are executed from the application and some are not.

The technique you are using is called polling. You are checking for new records, waiting a set amount of time, then checking again for new records. One good way to respond to new records might be to create a controller that handles inserting new records into the database and force all clients (who update database records) to use the controller to do so. Then the controller can alert you when there is a new record. To facilitate the controller's alerts, you can set up a web service where the controller can contact you.
I say that this "might" be a good way to do it because creating a controller and a web service is obviously extra work. However, it would make polling unnecessary. If you want to continue using your polling technique, you could make a service (producer) that does the polling and fills a queue with the new results. Your other program (consumer) can then retrieve items from the queue and do something with them.

There is no builtin "update listener" in MySQL (or any SQL database I'm aware of), so you have to build your own.
Notice that in your implementation, if two new rows are added you will handle one, wait 10 seconds, then handle the next one. Your code cannot handle more than one event every 10 seconds.
What you want to do is separate the polling of the database from the dispatching of the worker threads. Have the polling loop wake up every n seconds, read ALL new records from the database, and add them to a work queue. Have a consumer thread that waits on the queue and, as messages appear on it, launches processors using a thread pool implementation.
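The separation described above might look roughly like this. It is a sketch under assumptions: JobPoller is a hypothetical class, and fetchNewJobs is a caller-supplied function that must read all NEW rows and mark them as taken (for example by setting their status to INIT), so that the next poll does not return them again.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

class JobPoller<J> {
    private final BlockingQueue<J> queue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService poller = Executors.newSingleThreadScheduledExecutor();
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();
    private final ExecutorService workers;

    JobPoller(int workerThreads) {
        workers = Executors.newFixedThreadPool(workerThreads);
    }

    void start(Supplier<List<J>> fetchNewJobs, Consumer<J> handler, long period, TimeUnit unit) {
        // Producer: on every tick, read ALL new jobs and put them on the queue.
        poller.scheduleWithFixedDelay(() -> {
            try {
                queue.addAll(fetchNewJobs.get());
            } catch (RuntimeException e) {
                // swallow and try again on the next tick
            }
        }, 0, period, unit);
        // Consumer: blocks on the queue and hands each job to the worker pool.
        dispatcher.execute(() -> {
            try {
                while (true) {
                    J job = queue.take();
                    workers.execute(() -> handler.accept(job));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop() was called
            } catch (RejectedExecutionException e) {
                // worker pool already shut down by stop(); nothing more to dispatch
            }
        });
    }

    /** Stops polling and dispatching; jobs already handed to workers still finish. */
    void stop() {
        poller.shutdownNow();
        dispatcher.shutdownNow();
        workers.shutdown();
    }
}
```

Unlike the original loop, several rows inserted at once are all picked up in the same tick, and the number of concurrently running jobs is bounded by the worker pool rather than by one new Thread per job.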

@nir, since there is no MySQL database update listener in Java so far, what you can do is create an update trigger on the table whose changes you want to listen for. Within the trigger, call a function, and from within that function call a Java function. The Java function should modify some text, say "a". Now register a listener for changes to "a", and put the code you want to execute in the class that implements the text-change listener for "a".

The Condition Interface would work nicely for your needs. It will give you the granular control you are looking for, and it will avoid the problem of spinning the thread constantly.
http://docs.oracle.com/javase/tutorial/essential/concurrency/newlocks.html
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
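As a rough illustration of how Condition could replace the sleep loop, the sketch below lets a listener thread wait until some other thread in the same JVM announces an insert. NewRecordSignal and its methods are hypothetical names. It only covers inserts made by the application itself; rows inserted by other programs would still need polling or a database-side mechanism.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class NewRecordSignal {
    private final Lock lock = new ReentrantLock();
    private final Condition hasNew = lock.newCondition();
    private int pending = 0; // guarded by lock

    /** Called by application code right after it inserts a row. */
    void recordInserted() {
        lock.lock();
        try {
            pending++;
            hasNew.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Waits up to the timeout; returns how many inserts were announced (0 on timeout). */
    int awaitNew(long timeout, TimeUnit unit) {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            // Loop to guard against spurious wakeups.
            while (pending == 0) {
                if (nanos <= 0L) {
                    return 0;
                }
                nanos = hasNew.awaitNanos(nanos);
            }
            int announced = pending;
            pending = 0;
            return announced;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        } finally {
            lock.unlock();
        }
    }
}
```

The listener thread would call awaitNew in a loop and query the table only when it returns a positive count, using the timeout as a fallback poll for rows inserted from outside the application.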

Use a trigger, call a User Defined Function that uses sys_exec() to run an external app that signals an inter-process semaphore. Your listener thread can wait on that and, when signaled, process the new records.

Oracle has something called database change notification (http://docs.oracle.com/cd/E11882_01/java.112/e16548/dbchgnf.htm), and I have just implemented a component like yours. Is there something like that in MySQL, or what approach did you arrive at?

Notifying postgres changes to java application

Problem
I'm building a Postgres database for a few hundred thousand products. I will set up an index (Solr or maybe ElasticSearch) to improve query times for complex search queries.
The question now is how to keep the index synchronized with the database.
In the past I had an application that polled the database periodically to check for updates that needed to be applied, but with that approach the index is out of date for some time (from the database update until the poll that updates the index).
I would prefer a solution in which the database notifies my application (a Java application) that something has changed within the database, and at that point the application decides whether the index needs to be updated or not. To be more precise, I would build a kind of producer/consumer structure in which the replica receives notifications from Postgres that something changed and, if the change is pertinent to the indexed data, stores it in a stack of updates-to-do. The consumer would consume this stack and build the documents to be stored in the index.
Possible Solutions
One solution would be to write a kind of replica endpoint, in which the application behaves as a Postgres instance that is used to replicate the data from the original database. Does anyone have experience with this approach?
Which other solution do I have for this problem?

Which other solution do I have for this problem?
Use LISTEN and NOTIFY to tell your app that things have changed.
You can send the NOTIFY from a trigger that also records changes in a queue table.
You'll need a PgJDBC connection that has sent a LISTEN for the event(s) you're using. It must poll the database by sending periodic empty queries ("") if you're using SSL; if you are not using SSL this can be avoided by use of the async notification checks. You'll need to unwrap the Connection object from your connection pool to be able to cast the underlying connection to a PgConnection to use listen/notify with. See related answer
The producer/consumer bit will be harder. To have multiple crash-safe concurrent consumers in PostgreSQL you need to use advisory locking with pg_try_advisory_lock(...). If you don't need concurrent consumers then it's easy, you just SELECT ... LIMIT 1 FOR UPDATE a row at a time.
Hopefully 9.4 will include an easier method of skipping locked rows with FOR UPDATE, as there's work in development for it.

To use LISTEN and NOTIFY of postgres you need to use a driver that can support asynchronous notifications. The postgres JDBC driver does not support asynchronous notifications.
To constantly LISTEN over a channel from Application Server go for the pgjdbc-ng 0.6 driver.
http://impossibl.github.io/pgjdbc-ng/
It supports async notifications, without polling.

In general, I would recommend implementing loose coupling using the EAI patterns. Then, if you decide to exchange the database, the code on the index side does not change.
If you want to stick with tight coupling, I would recommend using LISTEN/NOTIFY.
In Java, it is important to use the pgjdbc-ng driver, because it supports async notifications without polling.
Here's an asynchronous pattern (based on this answer):
import com.impossibl.postgres.api.jdbc.PGConnection;
import com.impossibl.postgres.api.jdbc.PGNotificationListener;
import com.impossibl.postgres.jdbc.PGDataSource;
import java.sql.Statement;

public static void listenToNotifyMessage() {
    PGDataSource dataSource = new PGDataSource();
    dataSource.setHost("localhost");
    dataSource.setPort(5432);
    dataSource.setDatabase("database_name");
    dataSource.setUser("postgres");
    dataSource.setPassword("password");
    PGNotificationListener listener = (int processId, String channelName, String payload)
            -> System.out.println("notification = " + payload);
    try (PGConnection connection = (PGConnection) dataSource.getConnection()) {
        Statement statement = connection.createStatement();
        statement.execute("LISTEN test");
        statement.close();
        connection.addNotificationListener(listener);
        // it only works if the connection is open. Therefore, we do an endless loop here.
        while (true) {
            Thread.sleep(500);
        }
    } catch (Exception e) {
        System.err.println(e);
    }
}
From other connections, you can now execute NOTIFY test, 'This is a payload';. You can also execute NOTIFY in triggers etc.
