I don't understand threads - java

Let's say I am given a function
Data[] foo(double[] someNumbers, Data[] someData, AnalyzeTool tool, int numOfThreads)
where both arrays have the same length len.
Now I would like to start numOfThreads threads in the method, have each one use tool to process one Data object at a time, and write the results into a Data[] so that the output array keeps the same order as the input array.
Say a thread has just finished processing one Data object. How do I tell it that unprocessed data is still left, and how do I assign and "lock" the next Data object for it? "Locking" should prevent one Data object from being processed several times by multiple threads.
Does someone have an example of how to do that? Any sort of constructive help is welcome.

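I'd do it using JDK 8 and streams. Note that map() takes a one-argument function, so to pair each Data with its number you can stream over the indices; collecting an ordered parallel stream with Collectors.toList() keeps the original order. I'm imagining something like this (needs java.util.stream.IntStream and java.util.stream.Collectors):
List<Data> foo(List<Double> someNumbers, List<Data> someData, AnalyzeTool tool) {
    return IntStream.range(0, someData.size()).parallel()
            .mapToObj(i -> someData.get(i).doSomething(someNumbers.get(i)))
            .collect(Collectors.toList());
}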

I'm new to multi-threading myself, but in my experience I would do something like this.
Say there is a lot of work to be done for each Data[] item; think of every data item as one job.
An ExecutorService gives you a group of workers (a thread pool) to complete all of your jobs. It hands the jobs to the workers (threads) one by one: as soon as a worker finishes and more work is waiting, it is given the next job. Each job is handed to exactly one worker, so no item is processed twice.
Consider this example:
ExecutorService executor = Executors.newFixedThreadPool(5);
// let's say we have 5 workers
// then submit all your work (Work is a Runnable) to the pool
for (int i = 0; i < n; i++) {
    executor.submit(new Work(someData[i]));
}
The executor starts working as soon as you submit, and each free worker picks up the next job from the queue, and so on.
Then simply call
executor.shutdown();
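To keep the output in the same order as the input, one option is to remember the Future for each index and read the futures back in index order. The following is only a minimal sketch under stated assumptions: Data and AnalyzeTool are not shown in the question, so plain doubles and a DoubleBinaryOperator stand in for them, and the class name OrderedPool is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoubleBinaryOperator;

class OrderedPool {
    // Assumption: "tool" is modelled as a function of (data, number); the real AnalyzeTool is unknown.
    static double[] foo(double[] someNumbers, double[] someData,
                        DoubleBinaryOperator tool, int numOfThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(numOfThreads);
        try {
            // One task per item; the pool hands each task to exactly one thread.
            List<Future<Double>> futures = new ArrayList<>();
            for (int i = 0; i < someData.length; i++) {
                final int index = i;
                Callable<Double> task =
                        () -> tool.applyAsDouble(someData[index], someNumbers[index]);
                futures.add(executor.submit(task));
            }
            // Reading the futures by index restores the input order,
            // no matter which task finished first.
            double[] result = new double[someData.length];
            for (int i = 0; i < result.length; i++) {
                result[i] = futures.get(i).get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

No explicit locking is needed here because each index is submitted exactly once and each task only writes its own result.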

Related

Can all asynchronous tasks read an ArrayList at the same time without any delay

In my Android app, I have an ArrayList as shown below.
public List<String> prefCoinList = new ArrayList<String>();
I will be executing 10 asynchronous tasks using THREAD_POOL_EXECUTOR as shown below.
new asyncTask().executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR, order);
Each asynchronous task will only read the ArrayList "prefCoinList" and check for a particular value.
Question
Will all 10 asynchronous tasks run without any deadlocks on the ArrayList "prefCoinList"?
Will there be any thread locking or hanging issues?
If reading the ArrayList at the same time is possible, will each thread get its own copy, or will all threads wait and read the ArrayList when they get their turn?
If the list is not modified after construction then you can access it from multiple threads without issue. Each thread will read the single copy of the object.
If the list is modified occasionally then you could use a ReadWriteLock or other access control mechanism.
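For the occasionally-modified case, a ReadWriteLock could be wrapped around the list roughly as follows. This is a sketch, not the asker's code: the class name PrefCoinStore and its two methods are hypothetical, and only the java.util.concurrent.locks types are real library API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class PrefCoinStore {
    private final List<String> prefCoinList = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Many threads may hold the read lock at the same time.
    boolean contains(String coin) {
        lock.readLock().lock();
        try {
            return prefCoinList.contains(coin);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: it waits for readers to leave and blocks new ones.
    void add(String coin) {
        lock.writeLock().lock();
        try {
            prefCoinList.add(coin);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

If the list is never modified after the tasks start, none of this is needed and the plain list can be read directly.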
Let me answer each question.
1) Yes, all 10 asynchronous tasks will run without any problem; it is not a lot of data to process.
2) You will not get thread locks from read-only access. Just take care when you use runOnUiThread, because that code runs on the UI thread rather than in the asynchronous task.
3) Java passes object references, so every task receives a reference to the same list, not a copy. That leaves you two options: pass the same list to every task and do not modify it, or give each task its own copy of the list to avoid problems.

Java multithreading for the purpose of simulating data

So I am currently creating a data analytics and predictive program, and for testing purposes, I am simulating a large number (in the range of 10,000 - 1,000,000) of "trials". Each trial is a simulated Match for a theoretical game. Each Match has rounds. The basic pseudocode for the program is this:
main(){
data = create(100000);
saveToFile(data);
}
Data create(){
Data returnData = new Data(playTestMatch());
}
Match playTestMatch(){
List<Round> rounds = new List<Round>();
while(!GameFinished){
rounds.add(playTestRound());
}
Match returnMatch = new Match(rounds);
}
Round playTestRound(){
//Do round stuff
}
Right now, I am wondering whether I can handle the simulation of these rounds over multiple threads to speed up the process. I am NOT familiar with the theory behind multithreading, so would someone please either help me accomplish this, OR explain to me why this won't work (won't speed up the process). Thanks!
If you are new to Java multi-threading, this explanation might seem a little difficult to understand at first but I'll try and make it seem as simple as possible.
Generally, whenever you have large datasets, running operations concurrently on multiple threads can significantly speed up the process compared with a single-threaded approach, but there are exceptions of course.
You need to think about three things:
Creating threads
Managing Threads
Communicating/sharing results computed by each thread with main thread
Creating Threads:
Threads can be created manually by extending the Thread class, or you can use the Executors class.
I would prefer the Executors class, as it lets you create a thread pool and does the thread management for you. That is, it re-uses existing idle threads in the pool instead of creating a new thread per task, which reduces the overhead of the application.
You also have to look at the ExecutorService interface, as you will be using it to execute your tasks.
Managing threads:
Executors/ExecutorService does a great job of managing threads automatically, so if you use it you don't have to worry much about thread management.
Communication: This is the key part of the entire process. Here you have to think carefully about the thread safety of your app.
I would recommend using two queues to do the job: a read queue to read data from and a write queue to write data to.
But if you are using a simple ArrayList, make sure that you synchronize your code for thread safety by enclosing every access to the list in a synchronized block
synchronized(arrayList){
// do stuff
}
If your code is thread-safe and you can split the task into discrete chunks that do not rely on each other then it is relatively easy. Make the class that does the work Callable and add the chunks of work to a List, and then use ExecutorService, like this:
List<Simulation> simulations = new ArrayList<>();
for (int i = 0; i < chunks; i++)
    simulations.add(new Simulation(i));
ExecutorService executor = Executors.newFixedThreadPool(nthreads); // how many threads
try {
    // invokeAll blocks until every simulation has finished
    List<Future<Result>> results = executor.invokeAll(simulations);
    for (Future<Result> result : results)
        result.get().print(); // get() unwraps the Result from its Future
} catch (InterruptedException | ExecutionException e) {
    e.printStackTrace();
} finally {
    executor.shutdown();
}
So, Simulation is Callable and returns a Result, and results is a List which gets filled when executor.invokeAll is called with the list of simulations. Once you've got your results you can print them or whatever. It is probably best to set nthreads equal to the number of cores you have available.
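A self-contained variant of the same idea might look like the sketch below. The names MatchSimulator, match and simulate are invented for illustration, and the "match" is a placeholder that just returns a round count; the real playTestMatch logic is not shown in the question. The pool size is taken from Runtime.getRuntime().availableProcessors(), which is one common way to get the core count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MatchSimulator {
    // Placeholder task: "plays" one match and returns its number of rounds.
    // Each task uses only local state, so no synchronization is needed.
    static Callable<Integer> match(int seed) {
        return () -> 1 + (seed % 5);
    }

    static List<Integer> simulate(int matches) {
        int nthreads = Runtime.getRuntime().availableProcessors();
        ExecutorService executor = Executors.newFixedThreadPool(nthreads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < matches; i++) {
                tasks.add(match(i));
            }
            // invokeAll waits for all tasks and returns futures in task order.
            List<Integer> rounds = new ArrayList<>();
            for (Future<Integer> future : executor.invokeAll(tasks)) {
                rounds.add(future.get());
            }
            return rounds;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while simulating", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a simulation failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

Whether this is actually faster depends on how expensive one match is; for very cheap tasks the scheduling overhead can outweigh the gain.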

Pass data to another thread, java

I'm creating a small application learning about Java threading. I want to have a thread running that will analyze a small piece of data (a poker hand), and output a display message when the hand is detected to be a winning hand.
I already have the part completed that generates hands until the deck is empty, I just need to figure out how to pass that data over into the other thread which analyzes and triggers the display message (just a simple System.out).
I'd like to do this to a currently running thread, instead of spawning a new thread for every hand that is dealt and passing the cards in the constructor.
public static void main(String[] args) {
Deck myDeck = new PokerDeck();
DeckHandlerInterface deckHandler = new DeckHandler();
(new Thread(new ThreadWin())).start();
for(int x = 0; x < 2; x++) {
while(myDeck.getDeck().size() >= deckHandler.getHandSize()) {
deckHandler.dealHand(myDeck.getDeck());
}
deckHandler.resetDeck();
}
}
My deckHandler returns a collection object which is what I want to pass to the other thread. That's the part I'm not sure how to do.
You probably want to use a couple of BlockingQueues. Have the thread that generates hands stick the hands in one queue. The thread checking hands polls that queue and checks any hands it finds. Then it writes the results to a 2nd queue which the hand-generating thread can poll and display.
There are many ways to accomplish this.
A simple approach might be to create a Queue that you pass in a reference to via the ThreadWin constructor.
Then you just add the objects you wish to pass to the queue from the main thread, and listen for new objects on the queue in your ThreadWin thread. In particular it seems like a BlockingQueue might be a good fit here.
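A rough sketch of that queue hand-off is shown below. It makes several assumptions that are not in the question: a hand is modelled as a List<String>, the "winning" rule is a placeholder, the class name HandPipeline is invented, and a sentinel object is used to tell the analyzing thread that no more hands are coming.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class HandPipeline {
    // Sentinel ("poison pill") marking the end of the stream; compared by identity.
    private static final List<String> END = new ArrayList<>();

    // Placeholder rule (assumption): a hand wins if it contains an ace.
    static boolean isWinner(List<String> hand) {
        return hand.contains("A");
    }

    static int countWinners(List<List<String>> hands) {
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        int[] winners = {0}; // written only by the analyzer, read after join()

        Thread analyzer = new Thread(() -> {
            try {
                while (true) {
                    List<String> hand = queue.take(); // blocks until a hand arrives
                    if (hand == END) {
                        break;
                    }
                    if (isWinner(hand)) {
                        winners[0]++;
                        System.out.println("Winning hand: " + hand);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        analyzer.start();

        try {
            for (List<String> hand : hands) {
                queue.put(hand); // the dealing thread hands each hand over
            }
            queue.put(END);
            analyzer.join();
        } catch (InterruptedException e) {
            analyzer.interrupt();
            Thread.currentThread().interrupt();
        }
        return winners[0];
    }
}
```

The analyzer thread is started once and stays alive for all hands, which matches the goal of not spawning a new thread per hand.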
It sounds like you may want your "ThreadWin" to observe (http://en.wikipedia.org/wiki/Observer_pattern) the DeckHandler
Basically, the ThreadWin thread will "register" with the DeckHandler so it gets notified when the DeckHandler gets a new batch of PokerHands.
When the ThreadWin thread is notified it will "stop resting" and determine which hand was best.
You can use a BlockingQueue to create a simple producer-consumer scenario; there is even a simple example in the documentation.
You should also read this to have a better understanding of concurrency.
Probably the best method is to use a thread pool from the java.util.concurrent package to execute tasks. Thread pools are nice and easy to use, but you will not learn much from them apart from how to use thread pools.

Multithreaded file processing and reporting

I have an application that processes data stored in a number of files from an input directory and then produces some output depending on that data.
So far, the application works in a sequential basis, i.e. it launches a "manager" thread that
Reads the contents of the input directory into a File[] array
Processes each file in sequence and stores results
Terminates when all files are processed
I would like to convert this into a multithreaded application, in which the "manager" thread
Reads the contents of the input directory into a File[] array
Launches a number of "processor" threads, each of which processes a single file, stores results and returns a summary report for that file to the "manager" thread
Terminates when all files have been processed
The number of "processor" threads would be at most equal to the number of files, since they would be recycled via a ThreadPoolExecutor.
Any solution avoiding the use of join() or wait()/notify() would be preferable.
Based on the above scenario:
What would be the best way of having those "processor" threads reporting back to the "manager" thread? Would an implementation based on Callable and Future make sense here?
How can the "manager" thread know when all "processor" threads are done, i.e. when all files have been processed?
Is there a way of "timing" a processor thread and terminating it if it takes "too long" (i.e., it hasn't returned a result despite the lapse of a pre-configured amount of time)?
Any pointers to, or examples of, (pseudo-)source code would be greatly appreciated.
You can definitely do this without using join() or wait()/notify() yourself.
You should take a look at java.util.concurrent.ExecutorCompletionService to start with.
The way I see it you should write the following classes:
FileSummary - Simple value object that holds the result of processing a single file
FileProcessor implements Callable<FileSummary> - The strategy for converting a file into a FileSummary result
FileManager - The high level manager that creates FileProcessor instances, submits them to a work queue and then aggregates the results.
The FileManager would then look something like this:
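class FileManager {
    private CompletionService<FileSummary> cs; // Initialize this in constructor
    public FinalResult processDir(File dir) throws InterruptedException, ExecutionException {
        FinalResult result = new FinalResult(); // assumes FinalResult has a no-arg constructor
        int fileCount = 0;
        for (File f : dir.listFiles()) {
            cs.submit(new FileProcessor(f));
            fileCount++;
        }
        for (int i = 0; i < fileCount; i++) {
            FileSummary summary = cs.take().get(); // blocks until the next file is done
            // aggregate summary into result
        }
        return result;
    }
}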
If you want to implement a timeout you can use the poll() method on CompletionService instead of take().
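As a hedged illustration of the poll() variant: the sketch below uses String summaries instead of FileSummary, and the class TimedCollector and its collect method are invented names. Note that the timeout passed to poll() bounds the wait for the next completed task, not the running time of any particular task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TimedCollector {
    // Collects results until one poll() times out, then gives up on the rest.
    static List<String> collect(List<Callable<String>> tasks, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        List<String> summaries = new ArrayList<>();
        try {
            for (Callable<String> task : tasks) {
                cs.submit(task);
            }
            for (int i = 0; i < tasks.size(); i++) {
                // poll returns null if nothing completed within the timeout
                Future<String> done = cs.poll(timeoutMillis, TimeUnit.MILLISECONDS);
                if (done == null) {
                    break;
                }
                summaries.add(done.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            // shutdownNow interrupts tasks that are still running
            pool.shutdownNow();
        }
        return summaries;
    }
}
```

Interruption only stops tasks that respond to it (for example, blocking calls that throw InterruptedException), so a task stuck in a tight loop would need its own cancellation check.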
wait()/notify() are very low level primitives and you are right in wanting to avoid them.
The simplest solution would be to use a thread-safe queue (or stack, etc.; it doesn't really matter in this case). Before starting the worker threads, your main thread can add all the Files to the thread-safe queue/stack. Then start the worker threads, and let them all pull Files and process them until there are none left.
The worker threads can add results to another thread-safe queue/stack, where the main thread can get them from. The main thread knows how many Files there were, so when it has retrieved the same number of results, it will know that the job is finished.
Something like a java.util.concurrent.BlockingQueue would work, and there are other thread-safe collections in java.util.concurrent which would also be fine.
You also asked about terminating worker threads which are taking too long. I will tell you right up front: if you can make the code which runs on the worker threads robust enough that you can safely leave this feature out, you will make things a lot simpler.
If you do need this feature, the simplest and most reliable solution is to have a per-thread "terminate" flag, and make the worker task code check that flag frequently and exit if it is set. Make a custom class for workers, and include a volatile boolean field for this purpose. Also include a setter method (because of volatile, it doesn't need to be synchronized).
If a worker discovers that its "terminate" flag is set, it could push its File object back on the work queue/stack so another thread can process it. Of course, if there is some problem which means the File cannot be successfully processed, this will lead to an infinite cycle.
The best is to make the worker code very simple and robust, so you don't need to worry about it "not terminating".
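The per-worker "terminate" flag described above could be sketched like this. The class CancellableWorker, its method names and the demo helper are all hypothetical, and the loop body is a stand-in for processing one chunk of a file.

```java
class CancellableWorker implements Runnable {
    // volatile makes a write by one thread visible to the worker without synchronization
    private volatile boolean terminate = false;
    private long unitsDone = 0; // touched only by the worker; read by others after join()

    public void requestTerminate() {
        terminate = true;
    }

    public long unitsDone() {
        return unitsDone;
    }

    @Override
    public void run() {
        // Check the flag between small units of work and exit promptly once it is set.
        while (!terminate) {
            unitsDone++; // placeholder for "process the next chunk"
        }
    }

    // Starts a worker, asks it to stop shortly afterwards, and reports whether it stopped.
    static boolean demo() {
        CancellableWorker worker = new CancellableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        try {
            Thread.sleep(50);
            worker.requestTerminate();
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

This only works if the worker reaches the flag check often enough; a single long blocking call between checks delays the exit by that long.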
There is no need for them to report back. Just keep a count of the jobs remaining to be done and have each thread decrement that count when it finishes a job.
When the count of remaining jobs reaches zero, all the "processor" threads are done.
Sure, just add that code to the thread. When it starts working, check the time and compute the stop time. Periodically (say when you go to read more from the file), check to see if it's past the stop time and, if so, stop.
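One ready-made way to keep the "jobs remaining" count from this answer is java.util.concurrent.CountDownLatch, which is a substitution for a hand-written counter rather than something the answer names. In the sketch below, JobCounter and processAll are invented names and the job body is a placeholder.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class JobCounter {
    static int processAll(int jobs) {
        CountDownLatch remaining = new CountDownLatch(jobs); // jobs still to be done
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                try {
                    processed.incrementAndGet(); // placeholder for processing one file
                } finally {
                    remaining.countDown(); // decrement even if the job fails
                }
            });
        }
        try {
            remaining.await(); // returns once the count has reached zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }
}
```

The manager thread blocks in await() without calling join() or wait()/notify() itself.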

Separate hashset to run the list at several threads

I googled and searched here for this question and did not find anything similar to what I'm looking for.
I populated a HashSet with a few objects called Person, and I need to set up four or five threads to search for these Person objects in a huge text. Threads seem to be the best way to make better use of the hardware.
My question is: how can I split this HashSet and start 4 threads? I tried dividing the HashSet into 4 new HashSets and starting a new thread for each part.
It seems to be a good solution, but is there a better way to do it? How can I split the HashSet and send the pieces to 4 or 5 new threads?
Access to a HashSet is O(1), so if you split it across multiple threads, it won't go any faster. You are better off attempting to split the file if searching is expensive. However, if it's efficient enough, one thread will be optimal.
It is worth remembering that using all the cores on your machine can mean your program is slower. If you just want to use up all the CPU on your machine, you can create a thread pool which does nothing but use up all the CPU on your machine.
You can implement a producer-consumer scheme: have a single thread read the values from the hash set one by one and put them in a queue which is then processed by several worker threads. You can use the ExecutorService class to manage the workers.
Edit: Here's what you can do:
Define your worker class:
public class Worker implements Runnable {
private Person p;
public Worker(Person p) {
this.p = p;
}
public void run() {
// search for p
}
}
In the main thread:
ExecutorService s = Executors.newCachedThreadPool();
for(Person p: hashSet) {
s.submit(new Worker(p));
}
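To also collect the results and wait for the workers to finish, the same pattern could be extended roughly as follows. This is a sketch under assumptions: persons are represented by their names as plain strings, "searching" is a simple String.contains call, a fixed pool of 4 threads replaces the cached pool, and the class name PersonSearch is invented.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PersonSearch {
    static Set<String> findMentioned(Set<String> names, String text) {
        // Thread-safe set, so workers can add results without extra locking.
        Set<String> found = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String name : names) {
            pool.submit(() -> {
                if (text.contains(name)) {
                    found.add(name);
                }
            });
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            // If this times out, the returned set may be incomplete.
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return found;
    }
}
```

The input set is only read, never modified, so it can be shared by all workers as it is.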
A couple of things to consider:
1) You could use the same HashSet, but you will need to synchronize it (wrap the calls to it in a synchronized block) if any thread modifies it. But if all you are doing is looking things up in the hash, being multi-threaded will not buy you much.
2) If you want to split the HashSet, then you can consider a split on key ranges. So for example if you are searching for a name, names that start with A-F go in HashSet1, G-L HashSet2, etc. This way your searches can be completely parallel.
You can iterate through the hash set using an Iterator, and while iterating, fetch each value, create a thread for it and start it.
Alternatively, you can use the ExecutorService API, which lets you run tasks in parallel.
