Separate processes, each run one after another, in multithreaded Java

I have 4 separate processes which need to run one after another:
1st process
2nd process
3rd process
4th process
Since every process depends on the one before it, each process should run only after the previous one finishes.
Each process has its own duration, which will vary as the program's input data grows.
A rough sketch would look like this:
Program Runs
1st process - lasts 10 seconds
2nd process - has 300 HTTP get requests, lasts 3 minutes
3rd process - has 600 HTTP get requests, lasts 6 minutes
4th process - lasts 1 minute
The program is written in Java.
Thanks for any answer!

There is no concurrency support in the java API for your use case because what you're asking for is the opposite of concurrent. You have a set of four mutually dependent operations that need to be run in a specific order. You only need, and should probably only use, one thread to correctly handle this case.
It would be reasonable and prudent to put each operation in its own method or class, based on how complex the operations are.
If you insist on using multiple threads, your main thread should maintain a list of Runnables and iterate through it: pop the next Runnable from the list, create a new thread for it, start the thread, and then invoke join() on that thread. The main thread will block until the Runnable is complete, and the loop will take you through all the Runnables in order. Again, there is no good reason to do this. There may or may not be a bad reason.
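For illustration, a minimal sketch of both variants, where runFirstProcess() and friends are placeholder methods standing in for the real work:

import java.util.List;

public class Pipeline {

    // Preferred: one thread, four steps, run strictly in order.
    static void runSequentially() {
        runFirstProcess();    // ~10 seconds
        runSecondProcess();   // ~300 HTTP GET requests, ~3 minutes
        runThirdProcess();    // ~600 HTTP GET requests, ~6 minutes
        runFourthProcess();   // ~1 minute
    }

    // Thread-per-step variant: join() blocks until each step finishes,
    // so execution is still strictly sequential (no real benefit over the above).
    static void runWithThreads() throws InterruptedException {
        List<Runnable> steps = List.of(
                Pipeline::runFirstProcess,
                Pipeline::runSecondProcess,
                Pipeline::runThirdProcess,
                Pipeline::runFourthProcess);
        for (Runnable step : steps) {
            Thread thread = new Thread(step);
            thread.start();
            thread.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runSequentially();   // or runWithThreads();
    }

    // Placeholder implementations; substitute the real work of each process.
    static void runFirstProcess()  { }
    static void runSecondProcess() { }
    static void runThirdProcess()  { }
    static void runFourthProcess() { }
}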

Related

Let a queue build up to a certain amount before processing

So let me give you an idea of what I'm trying to do:
I've got a program that records statistics, lots and lots of them, but it records them as they happen one at a time and puts them into an ArrayList, for example:
Please note this is an example, I'm not recording these stats, I'm just simplifying it a bit
User clicks -> Add user_click to array
User clicks -> Add user_click to array
Key press -> Add key_press to array
After each event (clicks, key presses, etc.), it checks the size of the ArrayList; if it is > 150, the following happens:
A new thread is created
That thread is given a copy of the ArrayList
The original ArrayList is .clear()'ed
The new thread combines similar items so user_click would now be one item with a quantity of 2, instead of 2 items with a quantity of 1 each
The thread processes the data to a MySQL db
I would love to find a better approach to this, although this works just fine. The issue with thread pools and processing immediately is that there would be literally thousands of MySQL queries per day without combining them first.
Is there a better way to accomplish this? Is my method okay?
The other thing to keep in mind is the thread where events are fired and recorded can't be slowed down so I don't really want to combine items in the main thread.
If you've got code examples that would be great, if not just an idea of a good way to do this would be awesome as-well!
For anyone interested, this project is hosted on GitHub, the main thread is here, the queue processor is here and please forgive my poor naming conventions and general code cleanliness, I'm still(always) learning!
The logic described seems pretty good, with two adjustments:
Don't copy the list and clear the original. Send the original and create a new list for future events. This eliminates the O(n) processing time of copying the entries.
Don't create a new thread each time. Events are delayed anyway, since you're collecting them, so timeliness of writing to database is not your major concern. Two choices:
Start a single thread up front, then use a BlockingQueue to send list from thread 1 to thread 2. If thread 2 is falling behind, the lists will simply accumulate in the queue until thread 2 can catch up, without delaying thread 1, and without overloading the system with too many threads.
Submit the job to a thread pool, e.g. using an Executor. This would allow multiple (but limited number of) threads to process the lists, in case processing is slower than event generation. Disadvantage is that events may be written out of order.
For the purpose of separation of concerns and reusability, you should encapsulate the logic of collecting events, and sending them to the processing thread in blocks, in a separate class, rather than having that logic embedded in the event-generation code.
That way you can easily add extra features, e.g. a timeout for flushing pending events before reaching normal threshold (150), so events don't sit there too long if event generation slows down.
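A minimal sketch of the first option (a single long-lived writer thread fed through a BlockingQueue), assuming the events can be represented as Strings and that record() is only ever called from the event thread:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class EventBatcher {
    private static final int BATCH_SIZE = 150;

    // Full batches are handed from the event thread to the writer thread through this queue.
    private final BlockingQueue<List<String>> pending = new LinkedBlockingQueue<>();
    private List<String> current = new ArrayList<>(BATCH_SIZE);

    // Called on the event thread only; cheap, never touches the database.
    public void record(String event) {
        current.add(event);
        if (current.size() >= BATCH_SIZE) {
            pending.add(current);                    // hand off the full list as-is (no copying)
            current = new ArrayList<>(BATCH_SIZE);   // start a fresh list for new events
        }
    }

    // Run once at startup; the single writer thread lives for the life of the application.
    public void startWriter() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    List<String> batch = pending.take();   // blocks until a batch is available
                    // combine duplicate entries and write the batch to MySQL here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.setDaemon(true);
        writer.start();
    }
}

The event thread only ever appends to an in-memory list and occasionally swaps it out, so the database work never slows it down; take() simply parks the writer thread when there is nothing to do.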

JMeter - force all transactions/threads to stop when test reaches duration

I have a custom controller type which runs its own specific test fragments. The important thing to note is that these fragments contain Transaction Controllers, which contain Gaussian timers simulating wait times of up to 5 minutes.
The tests I am running are data driven, and should be runnable for a varying length of time. To specify the runtime of a test I have been using the "Duration" option on the Thread Group scheduler.
In the event where a test has run beyond its duration, I've noticed that when these timer fragments are in use, the test is delayed and cannot end until the transaction (or at least the timer) has completed. The other timings and samplers recorded seem to be unaffected; however, the runtime of the test is impacted.
I'd like to solve this issue without having to rely on the user manually killing a test when it has reached its duration. Is there any option within JMeter to kill or interrupt any type of running thread when a duration has been reached?
As per my understanding of JMeter, there is no element that can stop a running test once it reaches a specific duration.
However, a 'Test Action' element can be used to trigger Pause/Stop/Stop Now actions on your test at run time, and it can be placed under an 'If Controller' so that you can set the condition under which the thread should stop.
Although JMeter provides many elements to handle different conditions head-on, in the rare cases where no existing element provides a direct solution, you can combine multiple elements in a parent-child hierarchy to handle the condition (as done above with the Test Action and If Controller elements).
I believe this has to do with stop test vs. shutdown. When a test reaches its duration, it issues a stop test, at which point any timer will finish, the request will happen, and then the thread stops. This is why manually shutting it down works: shutdown doesn't respect timers, etc.
I don't think there's a way to set duration to use shutdown rather than stop. One thing you might try is multiple, smaller timers, and see if it still waits for all of them.

Reading huge file in Java

I read a huge file (almost 5 million lines). Each line contains a Date and a Request, and I must parse the Requests between specific Dates. I use a BufferedReader to read the file until the start Date and then start parsing lines. Can I use Threads for parsing the lines, since it takes a lot of time?
It isn't entirely clear from your question, but it sounds like you are reparsing your 5 million-line file every time a client requests data. You certainly can solve the problem by throwing more threads and more CPU cores at it, but a better solution would be to improve the efficiency of your application by eliminating duplicate work.
If this is the case, you should redesign your application to avoid reparsing the entire file on every request. Ideally you should store data in a database or in-memory instead of processing a flat text file on every request. Then on a request, look up the information in the database or in-memory data structure.
If you cannot eliminate the 5 million-line file entirely, you can periodically recheck the large file for changes, skip/seek to the end of the last record that was parsed, then parse only new records and update the database or in-memory data structure. This can all optionally be done in a separate thread.
Firstly, 5 million lines of 1000 characters is only 5Gb, which is not necessarily prohibitive for a JVM. If this is actually a critical use case with lots of hits then buying more memory is almost certainly the right thing to do.
Secondly, if that is not possible, most likely the right thing to do is to build an ordered Map based on the date. So every date is a key in the map and points to a list of line numbers which contain the requests. You can then go direct to the relevant line numbers.
Something of the form
HashMap<Date, ArrayList<String>>
would do nicely. That should have a memory usage on the order of 5,000,000 * 32 / 8 bytes = 20 MB, which should be fine.
You could also use the FileChannel class to keep the I/O handle open as you jump from one line to another. This allows memory mapping.
See http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html
And http://en.wikipedia.org/wiki/Memory-mapped_file
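A minimal sketch of such an index, assuming each line begins with an ISO yyyy-MM-dd date (adjust the parsing to the real log format) and mapping each java.time.LocalDate to the line numbers of its requests rather than storing the lines themselves:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RequestIndex {
    // Maps each date to the line numbers of the requests logged on that date.
    private final Map<LocalDate, List<Long>> index = new HashMap<>();

    public void build(String path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            long lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                // Assumes the first 10 characters of each line are the ISO date.
                LocalDate date = LocalDate.parse(line.substring(0, 10));
                index.computeIfAbsent(date, d -> new ArrayList<>()).add(lineNumber);
                lineNumber++;
            }
        }
    }

    public List<Long> linesFor(LocalDate date) {
        return index.getOrDefault(date, List.of());
    }
}

With the index built once, a request for a date range only needs to visit the listed line numbers (for example via a memory-mapped FileChannel) instead of rescanning the whole file.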
A good way to parallelize a lot of small tasks is to wrap the processing of each task in a FutureTask and then pass each task to a ThreadPoolExecutor to run them. The executor should be initialized with the number of CPU cores your system has available.
When you call executor.execute(future), the future will be queued for background processing. To avoid creating and destroying too many threads, the executor will only create as many threads as you specified and will execute the futures one after another.
To retrieve the result of a future, call future.get(). If the future hasn't completed yet (or hasn't even been started), this method will block until it has. Other futures get executed in the background while you wait.
Remember to call executor.shutdown() when you don't need it anymore, to make sure it terminates the background threads it otherwise keeps around until the keepalive time has expired or it is garbage-collected.
tl;dr pseudocode:
create executor
for each line in file:
    create a new FutureTask which parses that line
    pass the future task to the executor
    add the future task to a list
for each entry in the task list:
    call entry.get() to retrieve the result
executor.shutdown()
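Translated into real Java, a minimal sketch of that pseudocode could look like this (parse() and the in-memory list of lines are hypothetical stand-ins for the actual parsing logic and file input):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelParsing {
    public static void main(String[] args) throws Exception {
        // Size the pool to the number of available CPU cores.
        ExecutorService executor =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        List<String> lines = List.of("2014-01-01 GET /a", "2014-01-02 GET /b"); // stand-in for the file
        List<Future<String>> futures = new ArrayList<>();

        for (String line : lines) {
            // submit() wraps the Callable in a FutureTask and queues it on the pool.
            futures.add(executor.submit(() -> parse(line)));
        }

        for (Future<String> future : futures) {
            // get() blocks until this particular task is done; the others keep running meanwhile.
            String result = future.get();
            System.out.println(result);
        }

        executor.shutdown();
    }

    // Hypothetical parse routine; replace with the real request parsing.
    static String parse(String line) {
        return line.trim();
    }
}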

Deciding number of threads in SOA Application (using Servlet)

I have an SOA application which consists of many servlets. When client submits the requests, my application connects to 4 external applications, exchanges data between them and provides the result.
Now, due to these 4 connections, the response of requests gets delayed considerably. Hence, we are planning to separate these 4 calls into threads so that the main thread can respond back quickly saying 'we are processing your data'.
The question is, how many threads should I start for these tasks? I can do all of the tasks in a single thread vs 4 different threads. What is the optimal solution?
Also, what affects CPU most? Number of threads OR length of the duration of execution of a particular thread?
My application receives 5 to 7 requests per second. So, what would be better? 1 separate (and longer running) thread OR 4 separate (but shorter running) threads per request?
Thanks in advance.
The number of threads you should start depends on the number of independent tasks you have. The more tasks/modules/functions (whatever you call them) you have, the more threads you can start for them. Based on how much of the work can actually be done concurrently, you need to decide how many threads to use and how to utilize them effectively.
What affects the CPU most, the number of threads or the duration of execution of a particular thread?
Both will have an effect; how much depends on the application and the code you have. At 5 to 7 requests per second, though, it should not be a problem.
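For the "four separate threads per request" option, here is a rough sketch, assuming the four external calls can be wrapped in Callables (appA() through appD() and the pool size are hypothetical stand-ins, not part of the original question):

import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExternalCalls {
    // One shared pool for the whole application, sized roughly for
    // 4 calls x 5-7 requests/second; tune the size under real load.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(16);

    public String handleRequest(String payload) throws Exception {
        List<Callable<String>> calls = List.of(
                () -> appA(payload),
                () -> appB(payload),
                () -> appC(payload),
                () -> appD(payload));

        // invokeAll() runs the four calls concurrently and waits for all of them.
        List<Future<String>> results = POOL.invokeAll(calls);

        StringBuilder combined = new StringBuilder();
        for (Future<String> result : results) {
            combined.append(result.get());
        }
        return combined.toString();
    }

    // Hypothetical stand-ins for the four external application calls.
    private String appA(String p) { return "A"; }
    private String appB(String p) { return "B"; }
    private String appC(String p) { return "C"; }
    private String appD(String p) { return "D"; }
}

The whole handleRequest() call can itself be handed to another executor (or a Servlet 3.0 AsyncContext) so the servlet can reply immediately with "we are processing your data" while the four calls run in the background.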

Performance issue designing threaded consumer of queue

I'm really new to programming and having performance problems with my software. Basically, I get some data and run a 100-iteration loop on it (i = 0; i < 100; i++), and during that loop my program makes 1 of 3 decisions: keep the data it's working on, discard it, or send a version of it back to the queue to process. The individual work each thread does is very small, but there's a lot of it (which is why I'm using a queue server to scale horizontally).
My problem is that it never comes close to using my entire CPU; my program runs at around 40% per core. After profiling, it seems the majority of the time is spent sending/receiving data from the queue (approx. 64% in a part called com.rabbitmq.client.impl.Frame.readFrom(DataInputStream) and com.rabbitmq.client.impl.SocketFrameHandler.readFrame(), approx. 17% getting it into the format for the queue, which I brought down from 40% before, and the rest is spent on my program's logic). Obviously, I want my work to be done faster and don't want it to spend so much time in the queue, so I'm wondering if there's a better design I can use.
My code is actually quite large, but here's an overview of what it does:
I create a connection to the queue server (RabbitMQ, Java client)
I fork as many threads as I have CPU cores (all using the same connection)
The flow of data in each thread is:
Each thread creates its own channel to the queue server using the shared connection.
There's a while loop that polls the server and gets X messages without acknowledgements
Once I get a message, I use a thread executor to send an acknowledgement while my job is running
I parse the message and run my loop
If data is sent back to the queue, I send it to a thread executor that sends it back so my program can proceed with the next data set.
One weird thing I did: although I use a thread executor for acknowledgements and for sending back to the queue, my main worker threads are just forked threads (using public void run()). Because my program is dedicated to this single process, I did that to make sure there were always X threads ready to work (and no shutting down/respawning of them). The rest is in executor threads because I figured that work could wait/be queued while my main program runs.
I'm not sure how to design it better so it spends less time gathering/sending data. Are there any designs, RabbitMQ features, or Java techniques I can use to help?
If it's not IO wait, then I suspect that it's down to some locking going on inside those methods.
It looks to me like your threads are spending a significant amount of time waiting for those calls to return. Somewhat counter-intuitively, you might well be able to increase your performance by cutting down on the number of threads, since they'll spend less time tripping over each other and more time actively doing something.
Give it a try and see what effect it has on the profile.
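If you want something concrete to experiment with alongside a smaller thread count: switching from polling to push-based consumers with a prefetch window (basicQos) lets the client library buffer messages ahead of your workers, which may cut the per-message time spent in readFrame(). A rough sketch, assuming the amqp-client 5.x API, a broker on localhost, and a hypothetical queue named "work-queue":

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class WorkConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");               // assumed broker location
        Connection connection = factory.newConnection();

        // Fewer consumers than cores, per the advice above; tune while profiling.
        int consumers = Math.max(1, Runtime.getRuntime().availableProcessors() / 2);

        for (int i = 0; i < consumers; i++) {
            Channel channel = connection.createChannel();   // one channel per consumer
            channel.basicQos(50);                           // prefetch a batch of unacked messages

            DeliverCallback onDeliver = (consumerTag, delivery) -> {
                // ... parse the message and run the 100-iteration loop here ...
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            };
            channel.basicConsume("work-queue", false, onDeliver, tag -> { });
        }
    }
}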
