Java multithreading for the purpose of simulating data


So I am currently creating a data analytics and predictive program, and for testing purposes, I am simulating a large number (in the range of 10,000 - 1,000,000) of "trials". The data is a simulated Match for a theoretical game. Each Match has rounds. The basic pseudocode for the program is this:
main(){
data = create(100000);
saveToFile(data);
}
Data create(){
Data returnData = new Data(playTestMatch());
}
Match playTestMatch(){
List<Round> rounds = new List<Round>();
while(!GameFinished){
rounds.add(playTestRound());
}
Match returnMatch = new Match(rounds);
}
Round playTestRound(){
//Do round stuff
}
Right now, I am wondering whether I can handle the simulation of these rounds over multiple threads to speed up the process. I am NOT familiar with the theory behind multithreading, so would someone please either help me accomplish this, OR explain to me why this won't work (won't speed up the process). Thanks!

If you are new to Java multi-threading, this explanation might seem a little difficult to understand at first, but I'll try to make it as simple as possible.
Generally, when you have large datasets, running operations concurrently on multiple threads can significantly speed up the process as opposed to a single-threaded approach, but there are exceptions of course.
You need to think about three things:
Creating threads
Managing Threads
Communicating/sharing results computed by each thread with main thread
Creating Threads:
Threads can be created manually by extending the Thread class, or you can use the Executors class.
I would prefer the Executors class, as it allows you to create a thread pool and does the thread management for you. That is, it will re-use existing threads that are idle in the thread pool, thus reducing the memory footprint of the application.
You should also look at the ExecutorService interface, as you will be using it to execute your tasks.
Managing threads:
Executors/ExecutorService does a great job of managing threads automatically, so if you use it you don't have to worry much about thread management.
Communication: This is the key part of the entire process. Here you have to consider the thread safety of your app in great detail.
I would recommend using two queues to do the job: a read queue to read data from and a write queue to write data to.
But if you are using a simple ArrayList, make sure that you synchronize your code for thread safety by enclosing every access to the ArrayList in a synchronized block:
synchronized(arrayList){
// do stuff
}
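As a rough illustration of the synchronized block above, here is a minimal sketch in which several threads append to one shared ArrayList. The class name SynchronizedListSketch and the method fillConcurrently are made up for this example; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;

class SynchronizedListSketch {
    // Starts `threads` threads that each append `perThread` integers to one shared ArrayList.
    static int fillConcurrently(int threads, int perThread) {
        final List<Integer> arrayList = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    synchronized (arrayList) {   // only one thread mutates the list at a time
                        arrayList.add(i);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();                   // wait until every writer has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return arrayList.size();                 // safe to read: all writers were joined
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 10_000)); // prints 40000
    }
}
```

Without the synchronized block, a plain ArrayList can lose elements or throw when several threads add to it at once, so the final size would not be reliable.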

If your code is thread-safe and you can split the task into discrete chunks that do not rely on each other then it is relatively easy. Make the class that does the work Callable and add the chunks of work to a List, and then use ExecutorService, like this:
ArrayList<Simulation> SL=new ArrayList<Simulation>();
for(int i=0; i<chunks; i++)
SL.add(new Simulation(i));
ExecutorService executor=Executors.newFixedThreadPool(nthreads);//how many threads
List<Future<Result>> results=null;
try {
results = executor.invokeAll(SL);
} catch (InterruptedException e) {
e.printStackTrace();
}
executor.shutdown();
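for(Future<Result> result:results) {
try {
result.get().print(); // get() blocks until the task has finished and returns its Result
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
}
So, Simulation is a Callable that returns a Result, and results is a List of Futures that is filled when executor.invokeAll is called with the ArrayList of simulations. Once you've got your results you can print them or whatever. It's probably best to set nthreads equal to the number of cores you have available.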
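To make the Callable/invokeAll idea concrete for the match simulation, here is a self-contained sketch. MatchResult, the seed-per-task scheme, and the "1-in-10 chance the game ends" rule are all placeholders invented for illustration; the real playTestMatch() logic would go inside call().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MatchSimulationSketch {
    // Hypothetical stand-in for the question's Match: it only records the round count.
    static final class MatchResult {
        final int rounds;
        MatchResult(int rounds) { this.rounds = rounds; }
    }

    // One independent unit of work; each task owns its Random, so no state is shared.
    static final class Simulation implements Callable<MatchResult> {
        private final long seed;
        Simulation(long seed) { this.seed = seed; }

        @Override
        public MatchResult call() {
            Random random = new Random(seed);
            int rounds = 1;
            while (random.nextInt(10) != 0) {  // placeholder rule: 1-in-10 chance the game ends
                rounds++;
            }
            return new MatchResult(rounds);
        }
    }

    // Runs `matches` simulations on `nthreads` threads; results come back in task order.
    static List<MatchResult> simulate(int matches, int nthreads) {
        List<Simulation> tasks = new ArrayList<>();
        for (int i = 0; i < matches; i++) {
            tasks.add(new Simulation(i));
        }
        ExecutorService executor = Executors.newFixedThreadPool(nthreads);
        try {
            List<MatchResult> results = new ArrayList<>();
            for (Future<MatchResult> future : executor.invokeAll(tasks)) {
                results.add(future.get());     // rethrows any failure from call()
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(simulate(100_000, cores).size() + " matches simulated");
    }
}
```

Because every task is seeded independently, the results should not depend on how many threads run them, which makes the sketch easy to check against a single-threaded run.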

Related

java application multi-threading design and optimization

I designed a Java application. A friend suggested using multi-threading; he claims that running my application as several threads will decrease the run time significantly.
In my main class, I carry out several operations (outside the scope of this question) that fill global static variables and hash maps used throughout the lifetime of the process. Then I run the core of the application on the entries of an array list.
for(int customerID : customers){
ConsumerPrinter consumerPrinter = new ConsumerPrinter();
consumerPrinter.runPE(docsPath,outputPath,customerID);
System.out.println("Customer with CustomerID:"+customerID+" Done");
}
For each iteration of this loop, the XMLs of the given customer are fetched from the machine and parsed, and calculations are performed on the parsed data. Later, the processed results are written to a text file (fetched and written data can reach several gigabytes at most and is 50 MB on average). More than one iteration can write to the same file.
Should I make this piece of code multi-threaded so that each group of customers is handled in an independent thread?
How can I know the most optimal number of threads to run?
What are the best practices to take into consideration when implementing multi-threading?

Should I make this piece of code multi-threaded so each group of customers is taken in an independent thread?
Yes, multi-threading can reduce your processing time. While iterating over your list, you can spawn a new thread on each iteration and process the customer in it. But you need proper synchronization: if processing two customers requires operating on the same resource, you must synchronize that operation to avoid race conditions or memory inconsistency issues.
How can I know the most optimal number of threads to run?
You cannot really know without measuring the processing time for n customers with different numbers of threads. It will depend on the number of cores your processor has and on what processing actually takes place for each customer.
What are the best practices to take into consideration when implementing multi-threading?
First and foremost, you must have multiple cores and your OS must support multi-threading. Almost every system does nowadays, but it is still worth checking. Secondly, you must analyze all the possible scenarios that may lead to a race condition. Every resource that you know will be shared among multiple threads must be thread-safe. You must also look out for possible memory inconsistency issues (for example, declare a variable that is shared between threads as volatile). Finally, there are some things that you cannot predict or analyze until you actually run test cases, such as deadlocks (you need to analyze a thread dump) or memory leaks (you need to analyze a heap dump).
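One way to follow the "measure with different numbers of threads" advice is a small timing harness like the sketch below. processCustomer is a made-up CPU-bound stand-in for runPE (which in the question also does file I/O, so real numbers will differ), and the class and method names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadCountTrialSketch {
    // Hypothetical CPU-bound stand-in for processing one customer.
    static long processCustomer(int customerId) {
        long acc = customerId;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Processes customers 0..count-1 on a pool of the given size and returns a checksum.
    static long runWith(int nthreads, int count) {
        ExecutorService pool = Executors.newFixedThreadPool(nthreads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int id = 0; id < count; id++) {
                final int customerId = id;
                Callable<Long> task = () -> processCustomer(customerId);
                futures.add(pool.submit(task));
            }
            long checksum = 0;
            for (Future<Long> future : futures) {
                checksum += future.get();
            }
            return checksum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Try 1, 2, 4, ... threads up to twice the core count and print the wall-clock time.
        for (int n = 1; n <= cores * 2; n *= 2) {
            long start = System.nanoTime();
            runWith(n, 500);
            System.out.printf("%d threads: %d ms%n", n, (System.nanoTime() - start) / 1_000_000);
        }
    }
}
```

The checksum is only there so that runs with different thread counts can be compared for correctness; the timings themselves depend entirely on the machine and the real workload.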

The idea of multithreading is to move a heavy process onto another, let's say, "block of memory".
Any UI updates have to be done on the main/default thread, such as printing messages or inflating a view. You can ask the app to draw a bitmap, download images from the internet, or run a heavy validation/loop block on a separate thread; imagine that you are creating a second, short-lived app to handle those tasks for you.
Remember, you can ask the app to download/draw an image on another thread, but you have to print this image on the screen on the main thread.
This is commonly used to load a large bitmap on a separate thread, do the math to resize this large image and then, on the main thread, inflate/print/paint/show the smaller version of that image to the user.
In your case, I don't know how heavy the runPE() method is or what it does. You could try creating another thread for it, but the rest should stay on the main thread, since it is the main process of your UI.
You could optimize your loop by placing "ConsumerPrinter consumerPrinter = new ConsumerPrinter();" before the "for(...)": since it does not change dynamically (assuming ConsumerPrinter keeps no per-customer state), you can move it out of the loop to avoid creating the same object each time the loop restarts : )

While straight java multi-threading can be used (java.util.concurrent) as other answers have discussed, consider also alternate programming approaches to multi-threading, such as the actor model. The actor model still uses threads underneath, but much complexity is handled by the actor framework rather than directly by you the programmer. In addition, there is less (or no) need to reason about synchronizing on shared state between threads because of the way programs using the actor model are created.
See Which Actor model library/framework for Java? for a discussion of popular actor model libraries.

Pass data to another thread, java

I'm creating a small application learning about Java threading. I want to have a thread running that will analyze a small piece of data (a poker hand), and output a display message when the hand is detected to be a winning hand.
I already have the part completed that generates hands until the deck is empty, I just need to figure out how to pass that data over into the other thread which analyzes and triggers the display message (just a simple System.out).
I'd like to do this to a currently running thread, instead of spawning a new thread for every hand that is dealt and passing the cards in the constructor.
public static void main(String[] args) {
Deck myDeck = new PokerDeck();
DeckHandlerInterface deckHandler = new DeckHandler();
(new Thread(new ThreadWin())).start();
for(int x = 0; x < 2; x++) {
while(myDeck.getDeck().size() >= deckHandler.getHandSize()) {
deckHandler.dealHand(myDeck.getDeck());
}
deckHandler.resetDeck();
}
}
My deckHandler returns a collection object which is what I want to pass to the other thread. That's the part I'm not sure how to do.

You probably want to use a couple of BlockingQueues. Have the thread that generates hands stick the hands in one queue. The thread checking hands polls that queue and checks any hands it finds. Then it writes the results to a 2nd queue which the hand-generating thread can poll and display.

There are many ways to accomplish this.
A simple approach might be to create a Queue that you pass in a reference to via the ThreadWin constructor.
Then you just add the objects you wish to pass to the queue from the main thread, and listen for new objects on the queue in your ThreadWin thread. In particular it seems like a BlockingQueue might be a good fit here.
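Here is a minimal sketch of that approach: the main thread puts hands on a BlockingQueue, and an already-running ThreadWin takes them off. Hands are modelled as lists of card strings such as "KH", and the "all cards share a suit" rule in isWinner is a placeholder assumption, not real poker logic; the sentinel object used to stop the thread is also an addition for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class HandQueueSketch {
    // Sentinel telling the consumer that no more hands will arrive (compared by identity).
    private static final List<String> NO_MORE_HANDS = new ArrayList<>();

    // Placeholder rule for illustration: a non-empty hand wins if all cards share a suit.
    static boolean isWinner(List<String> hand) {
        char suit = hand.get(0).charAt(hand.get(0).length() - 1);
        for (String card : hand) {
            if (card.charAt(card.length() - 1) != suit) return false;
        }
        return true;
    }

    // Plays the role of ThreadWin: it receives the queue through its constructor.
    static final class ThreadWin implements Runnable {
        private final BlockingQueue<List<String>> hands;
        final List<List<String>> winners = new ArrayList<>();

        ThreadWin(BlockingQueue<List<String>> hands) { this.hands = hands; }

        @Override
        public void run() {
            try {
                while (true) {
                    List<String> hand = hands.take();      // blocks until a hand is available
                    if (hand == NO_MORE_HANDS) return;
                    if (isWinner(hand)) {
                        winners.add(hand);
                        System.out.println("Winning hand: " + hand);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();        // stop quietly if interrupted
            }
        }
    }

    // Feeds the dealt hands to one long-running analyzer thread and returns the winners.
    static List<List<String>> analyze(List<List<String>> dealt) {
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        ThreadWin analyzer = new ThreadWin(queue);
        Thread thread = new Thread(analyzer);
        thread.start();
        try {
            for (List<String> hand : dealt) {
                queue.put(hand);
            }
            queue.put(NO_MORE_HANDS);
            thread.join();                                 // winners is safe to read after join
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return analyzer.winners;
    }

    public static void main(String[] args) {
        analyze(Arrays.asList(
                Arrays.asList("2H", "9H", "KH"),
                Arrays.asList("2H", "9S", "KH")));
    }
}
```

In the question's code, the loop that calls deckHandler.dealHand(...) would be the place to call queue.put(...) with the returned collection.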

It sounds like you may want your "ThreadWin" to observe (http://en.wikipedia.org/wiki/Observer_pattern) the DeckHandler
Basically, the ThreadWin thread will "register" with the DeckHandler so it gets notified when the DeckHandler gets a new batch of PokerHands.
When the ThreadWin thread is notified it will "stop resting" and determine which hand was best.

You can use BlockingQueue to create simple consumer-producer scenario, there is even a simple example in the documentation.
You should also read this to have a better understanding of concurrency.
Probably the best method is to use a thread pool from the java.util.concurrent package to execute tasks. Thread pools are nice and easy to use, but you will not learn much apart from how to use thread pools.

Does it make sense to reuse Runnables in a thread pool?

I'm implementing a thread pool for processing a high volume market data feed and have a question about the strategy of reusing my worker instances that implement runnable which are submitted to the thread pool for execution. In my case I only have one type of worker that takes a String and parses it to create a Quote object which is then set on the correct Security. Given the amount of data coming off the feed it is possible to have upwards of 1,000 quotes to process per second and I see two ways to create the workers that get submitted to the thread pool.
The first option is simply to create a new instance of a Worker every time a line is retrieved from the underlying socket and add it to the thread pool; the instance is eventually garbage collected after its run method has executed. But this got me thinking about performance: does it really make sense to instantiate 1,000 new instances of the Worker class every second? In the same spirit as a thread pool, is it a common pattern to have a runnable pool or queue as well, so I can recycle my workers and avoid object creation and garbage collection? The way I see this being implemented is that, before returning from the run() method, the Worker adds itself back to a queue of available workers, which is then drawn from when processing new feed lines instead of creating new instances of Worker.
From a performance perspective, do I gain anything by going with the second approach or does the first make more sense? Has anyone implemented this type of pattern before?
Thanks - Duncan

I use a library I wrote called Java Chronicle for this. It is designed to persist and queue one million quotes per second without producing any significant garbage.
I have a demo here where it sends quote like objects with nano second timing information at a rate of one million messages per second and it can send tens of millions in a JVM with a 32 MB heap without triggering even a minor collection. The round trip latency is less than 0.6 micro-seconds 90% of the time on my ultra book. ;)
from a performance perspective, do I gain anything by going with the second approach or does the first make more sense?
I strongly recommend not filling your CPU caches with garbage. In fact, I avoid any constructs which create any significant garbage. You can build a system which creates less than one object per event end to end. I have an Eden size which is larger than the amount of garbage I produce in a day, so there are no GCs, minor or full, to worry about.
Has anyone implemented this type of pattern before?
I wrote a profitable low latency trading system in Java five years ago. At the time it was fast enough at 60 micro-seconds tick to trade in Java, but you can do better than that these days.
If you want low latency market data processing system, this is the way I do it. You might find this presentation I gave at JavaOne interesting as well.
http://www.slideshare.net/PeterLawrey/writing-and-testing-high-frequency-trading-engines-in-java
EDIT I have added this parsing example
ByteBuffer wrap = ByteBuffer.allocate(1024);
ByteBufferBytes bufferBytes = new ByteBufferBytes(wrap);
byte[] bytes = "BAC,12.32,12.54,12.56,232443".getBytes();
int runs = 10000000;
long start = System.nanoTime();
for (int i = 0; i < runs; i++) {
bufferBytes.reset();
// read the next message.
bufferBytes.write(bytes);
bufferBytes.position(0);
// decode message
String word = bufferBytes.parseUTF(StopCharTesters.COMMA_STOP);
double low = bufferBytes.parseDouble();
double curr = bufferBytes.parseDouble();
double high = bufferBytes.parseDouble();
long sequence = bufferBytes.parseLong();
if (i == 0) {
assertEquals("BAC", word);
assertEquals(12.32, low, 0.0);
assertEquals(12.54, curr, 0.0);
assertEquals(12.56, high, 0.0);
assertEquals(232443, sequence);
}
}
long time = System.nanoTime() - start;
System.out.println("Average time was " + time / runs + " nano-seconds");
when set with -verbose:gc -Xmx32m it prints
Average time was 226 nano-seconds
Note: there are no GCs triggered.

I'd use the Executor from the concurrency package. I believe it handles all this for you.

does it really make sense to instantiate 1,000 new instances of the Worker class every second.
Not necessarily. However, to reuse the Runnables you would have to put them into some sort of BlockingQueue, and the cost of the queue's concurrency may outweigh the GC overhead. Using a profiler or watching the GC numbers via JConsole will tell you whether the application is spending a lot of time in GC and whether this needs to be addressed.
If this does turn out to be a problem, a different approach would be to just put your String into your own BlockingQueue and submit the Worker objects to the thread-pool only once. Each of the Worker instances would dequeue from the queue of Strings and would never quit. Something like:
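public void run() {
    while (!shutdown) {
        try {
            String value = myQueue.take(); // blocks until a line is available
            ...
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop the worker
            return;
        }
    }
}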
So you would not need to create your 1000s of Workers per second.
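A fuller sketch of this idea might look like the following. The workers are submitted to the pool once and then loop over a shared queue of Strings; the POISON sentinel used for shutdown, the comma-split "parsing", and all names here are assumptions made for the example rather than anything from the question's feed handler.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FeedWorkerSketch {
    // Sentinel that tells one worker to stop; a distinct object so it can be compared by identity.
    private static final String POISON = new String("<stop>");

    // Feeds the lines to `nworkers` long-lived workers and returns how many lines parsed cleanly.
    static int process(String[] lines, int nworkers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger parsed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(nworkers);
        for (int w = 0; w < nworkers; w++) {
            pool.execute(() -> {                       // each worker is created exactly once
                try {
                    while (true) {
                        String line = queue.take();
                        if (line == POISON) return;
                        String[] fields = line.split(",");   // stand-in for building a Quote
                        if (fields.length > 1) parsed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            for (String line : lines) {
                queue.put(line);                       // only the String is handed over per quote
            }
            for (int w = 0; w < nworkers; w++) {
                queue.put(POISON);                     // one sentinel per worker
            }
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return parsed.get();
    }

    public static void main(String[] args) {
        String[] feed = {"BAC,12.32,12.54", "IBM,101.5,101.7", "garbage"};
        System.out.println(process(feed, 2));          // prints 2
    }
}
```

Whether this beats creating a short-lived Worker per line is exactly the kind of thing the profiler or GC numbers should decide.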

Yes, of course, something like this: the OS and JVM don't care about what is going on in a thread, so reusing a recyclable object is generally good practice.

I see two questions in your problem. One is about thread pooling, and the other is about object pooling. For your thread pooling issue, Java provides ExecutorService. Below is an example of using an ExecutorService.
Runnable r = new Runnable() {
public void run() {
//Do some work
}
};
// Thread pool of size 2
ExecutorService executor = Executors.newFixedThreadPool(2);
// Add the runnables to the executor service
executor.execute(r);
The ExecutorService provides many different types of thread pools with different behaviors.
As far as object pooling is concerned (does it make sense to create 1000 of your objects per second and then leave them for garbage collection?), this all depends on the statefulness and expense of your object. If you're worried about the state of your worker threads being compromised, you can look at using the flyweight pattern to encapsulate your state outside of the worker. Additionally, if you were to follow the flyweight pattern, you can also look at how useful Future and Callable objects would be in your application architecture.

Trouble understanding Java threads

I learned about multiprocessing from Python and I'm having a bit of trouble understanding Java's approach. In Python, I can say I want a pool of 4 processes and then send a bunch of work to my program and it'll work on 4 items at a time. I realized, with Java, I need to use threads to achieve this same task and it seems to be working really really well so far.
But.. unlike in Python, my cpu(s) aren't getting 100% utilization (they are about 70-80%) and I suspect it's the way I'm creating threads (code is the same between Python/Java and processes are independent). In Java, I'm not sure how to create one thread so I create a thread for every item in a list I want to process, like this:
for (int i = 0; i < 500; i++) {
Runnable task = new MyRunnable(10000000L + i);
Thread worker = new Thread(task);
// We can set the name of the thread
worker.setName(String.valueOf(i));
// Start the thread, never call method run() direct
worker.start();
// Remember the thread for later usage
threads.add(worker);
}
I took it from here. My question is: is this the correct way to launch threads, or is there a way to have Java itself manage the number of threads so it's optimal? I want my code to run as fast as possible, and I'm trying to understand how to detect and resolve any issues that may be arising from too many threads being created.
This is not a major issue, just curious as to how it works under the Java hood.

You use an Executor, the implementation of which handles a pool of threads, decides how many, and so forth. See the Java tutorial for lots of examples.
In general, bare threads aren’t used in Java except for very simple things. Instead, there will be some higher-level API that receives your Runnable or Task and knows what to do.

Take a look at the Java Executor API. See this article, for example.
Although creating Threads is much 'cheaper' than it used to be, creating large numbers of threads (one per runnable as in your example) isn't the way to go - there's still an overhead in creating them, and you'll end up with too much context switching.
The Executor API allows you to create various types of thread pool for executing Runnable tasks, so you can reuse threads, flexibly manage the number that are created, and avoid the overhead of thread-per-runnable.
Incidentally, the Java threading model and the Python threading model (not multiprocessing) are really quite similar. Java has no Global Interpreter Lock like Python's, though, so there's usually less need to fork off multiple processes.
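For comparison with the thread-per-item loop in the question, here is a sketch that hands the same 500 tasks to a fixed pool sized to the number of cores. The summing loop is a made-up stand-in for MyRunnable, with a smaller limit than the question's 10000000L so that it finishes quickly; the class and method names are assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class PooledTasksSketch {
    // Submits `taskCount` tasks to a pool with one thread per core and returns the combined total.
    static long runAll(int taskCount) {
        int nthreads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(nthreads);
        AtomicLong total = new AtomicLong();
        for (int i = 0; i < taskCount; i++) {
            final long limit = 10_000L + i;
            pool.execute(() -> {
                long sum = 0;
                for (long n = 1; n <= limit; n++) {
                    sum += n;                          // stand-in for the real work
                }
                total.addAndGet(sum);
            });
        }
        pool.shutdown();                               // accept no new tasks, finish queued ones
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(500));
    }
}
```

Only as many threads as there are cores exist at any time; the remaining tasks wait in the pool's queue instead of each getting a thread of its own.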

Thread is a "low level" API.
Depending on what you want to do and the version of Java you use, there may be a better solution.
If you use Java 7, and if your task allows it, you can use the fork/join framework : http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
However, take a look at the java concurrency tutorial : http://docs.oracle.com/javase/tutorial/essential/concurrency/executors.html

Forcing multiple threads to use multiple CPUs when they are available

I'm writing a Java program which uses a lot of CPU because of the nature of what it does. However, lots of it can run in parallel, and I have made my program multi-threaded. When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?

There are two basic ways to multi-thread in Java. Each logical task you create with these methods should run on a fresh core when needed and available.
Method one: define a Runnable or Thread object (which can take a Runnable in the constructor) and start it running with the Thread.start() method. It will execute on whatever core the OS gives it -- generally the less loaded one.
Tutorial: Defining and Starting Threads
Method two: define objects implementing the Runnable (if they don't return values) or Callable (if they do) interface, which contain your processing code. Pass these as tasks to an ExecutorService from the java.util.concurrent package. The java.util.concurrent.Executors class has a bunch of methods to create standard, useful kinds of ExecutorServices. Link to Executors tutorial.
From personal experience, the Executors fixed & cached thread pools are very good, although you'll want to tweak thread counts. Runtime.getRuntime().availableProcessors() can be used at run-time to count available cores. You'll need to shut down thread pools when your application is done, otherwise the application won't exit because the ThreadPool threads stay running.
Getting good multicore performance is sometimes tricky, and full of gotchas:
Disk I/O slows down a LOT when run in parallel. Only one thread should do disk read/write at a time.
Synchronization of objects provides safety to multi-threaded operations, but slows down work.
If tasks are too trivial (small work bits, execute fast) the overhead of managing them in an ExecutorService costs more than you gain from multiple cores.
Creating new Thread objects is slow. The ExecutorServices will try to re-use existing threads if possible.
All sorts of crazy stuff can happen when multiple threads work on something. Keep your system simple and try to make tasks logically distinct and non-interacting.
One other problem: controlling work is hard! A good practice is to have one manager thread that creates and submits tasks, and then a couple working threads with work queues (using an ExecutorService).
I'm just touching on key points here -- multithreaded programming is considered one of the hardest programming subjects by many experts. It's non-intuitive, complex, and the abstractions are often weak.
Edit -- Example using ExecutorService:
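import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public abstract class TaskThreader {
    // One task: runs a single input through the three steps in order.
    class DoStuff implements Callable<Object> {
        Object in;
        public Object call() {
            in = doStep1(in);
            in = doStep2(in);
            in = doStep3(in);
            return in;
        }
        public DoStuff(Object input) {
            in = input;
        }
    }
    public abstract Object doStep1(Object input);
    public abstract Object doStep2(Object input);
    public abstract Object doStep3(Object input);
    // Runs every input through the steps on a pool sized to the core count
    // and returns the outputs in input order. Call it on a concrete subclass.
    public List<Object> runAll(List<Object> inputs) throws Exception {
        ExecutorService exec = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
        for (Object input : inputs) {
            tasks.add(new DoStuff(input));
        }
        List<Future<Object>> results = exec.invokeAll(tasks);
        exec.shutdown();
        List<Object> outputs = new ArrayList<Object>();
        for (Future<Object> f : results) {
            outputs.add(f.get()); // get() rethrows any exception thrown by call()
        }
        return outputs;
    }
}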

When I run it, it only seems to use one CPU until it needs more then it uses another CPU - is there anything I can do in Java to force different threads to run on different cores/CPUs?
I interpret this part of your question as meaning that you have already addressed the problem of making your application multi-thread capable. And despite that, it doesn't immediately start using multiple cores.
The answer to "is there any way to force ..." is (AFAIK) not directly. Your JVM and/or the host OS decide how many 'native' threads to use, and how those threads are mapped to physical processors. You do have some options for tuning. For example, I found this page which talks about how to tune Java threading on Solaris. And this page talks about other things that can slow down a multi-threaded application.

First, you should prove to yourself that your program would run faster on multiple cores. Many operating systems put effort into running program threads on the same core whenever possible.
Running on the same core has many advantages. The CPU cache is hot, meaning that data for that program is loaded into the CPU. The lock/monitor/synchronization objects are in CPU cache which means that other CPUs do not need to do cache synchronization operations across the bus (expensive!).
One thing that can very easily make your program run on the same CPU all the time is over-use of locks and shared memory. Your threads should not talk to each other. The less often your threads use the same objects in the same memory, the more often they will run on different CPUs. The more often they use the same memory, the more often they must block waiting for the other thread.
Whenever the OS sees one thread block for another thread, it will run that thread on the same CPU whenever it can. It reduces the amount of memory that moves over the inter-CPU bus. That is what I guess is causing what you see in your program.

First, I'd suggest reading "Java Concurrency in Practice" by Brian Goetz.
This is by far the best book describing concurrent Java programming.
Concurrency is 'easy to learn, difficult to master'. I'd suggest reading plenty about the subject before attempting it. It's very easy to get a multi-threaded program to work correctly 99.9% of the time, and fail 0.1%. However, here are some tips to get you started:
There are two common ways to make a program use more than one core:
Make the program run using multiple processes. An example is Apache compiled with the Pre-Fork MPM, which assigns requests to child processes. In a multi-process program, memory is not shared by default. However, you can map sections of shared memory across processes. Apache does this with its 'scoreboard'.
Make the program multi-threaded. In a multi-threaded program, all heap memory is shared by default. Each thread still has its own stack, but can access any part of the heap. Typically, most Java programs are multi-threaded, and not multi-process.
At the lowest level, one can create and destroy threads. Java makes it easy to create threads in a portable cross platform manner.
As it tends to get expensive to create and destroy threads all the time, Java now includes Executors to create re-usable thread pools. Tasks can be assigned to the executors, and the result can be retrieved via a Future object.
Typically, one has a task which can be divided into smaller tasks, but the end results need to be brought back together. For example, with a merge sort, one can divide the list into smaller and smaller parts, until one has every core doing the sorting. However, as each sublist is sorted, it needs to be merged in order to get the final sorted list. Since this "divide-and-conquer" problem is fairly common, there is a JSR framework which can handle the underlying distribution and joining. This framework will likely be included in Java 7.
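The merge sort example can be sketched with the fork/join classes that ship in java.util.concurrent (ForkJoinPool and RecursiveAction). The threshold of 1,000 elements and the class names below are arbitrary choices for illustration, not tuned values, and the common pool used here requires Java 8 or later.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class ParallelMergeSortSketch {
    // Sorts data[from, to) by splitting it in half, sorting both halves in parallel, then merging.
    static final class SortTask extends RecursiveAction {
        private static final int THRESHOLD = 1_000;   // below this size, sort sequentially
        private final int[] data;
        private final int from;
        private final int to;

        SortTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected void compute() {
            if (to - from <= THRESHOLD) {
                Arrays.sort(data, from, to);
                return;
            }
            int mid = (from + to) >>> 1;
            // Fork both halves and wait for both to finish before merging.
            invokeAll(new SortTask(data, from, mid), new SortTask(data, mid, to));
            merge(mid);
        }

        // Merges the sorted halves data[from, mid) and data[mid, to) back into data.
        private void merge(int mid) {
            int[] left = Arrays.copyOfRange(data, from, mid);
            int i = 0, j = mid, k = from;
            while (i < left.length && j < to) {
                data[k++] = (left[i] <= data[j]) ? left[i++] : data[j++];
            }
            while (i < left.length) {
                data[k++] = left[i++];
            }
            // Any elements left in the right half are already in their final positions.
        }
    }

    static void sort(int[] data) {
        ForkJoinPool.commonPool().invoke(new SortTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 9, 1, 7};
        sort(data);
        System.out.println(Arrays.toString(data));    // prints [1, 3, 5, 7, 9]
    }
}
```

The pool takes care of distributing the sub-tasks across cores; the code only describes how to split and how to join.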

There is no way to set CPU affinity in Java. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4234402
If you have to do it, use JNI to create native threads and set their affinity.

You should write your program to do its work in the form of a lot of Callables handed to an ExecutorService and executed with invokeAll(...).
You can then choose a suitable implementation at runtime from the Executors class. A suggestion would be to call Executors.newFixedThreadPool() with a number roughly corresponding to the number of cpu cores to keep busy.

The easiest thing to do is break your program into multiple processes. The OS will allocate them across the cores.
Somewhat harder is to break your program into multiple threads and trust the JVM to allocate them properly. This is -- generally -- what people do to make use of available hardware.
Edit
How can a multi-processing program be "easier"? Here's a step in a pipeline.
import java.io.*;

public class SomeStep {
    public static void main( String args[] ) throws IOException {
        BufferedReader stdin= new BufferedReader( new InputStreamReader( System.in ) );
        BufferedWriter stdout= new BufferedWriter( new OutputStreamWriter( System.out ) );
        String line= stdin.readLine();
        while( line != null ) {
            // process line, writing to stdout
            line = stdin.readLine();
        }
        stdout.flush(); // push any buffered output to the next step in the pipeline
    }
}
Each step in the pipeline is similarly structured: about a dozen lines of overhead for whatever processing is included.
This may not be the absolute most efficient. But it's very easy.
The overall structure of your concurrent processes is not a JVM problem. It's an OS problem, so use the shell.
java -cp pipline.jar FirstStep | java -cp pipline.jar SomeStep | java -cp pipline.jar LastStep
The only thing left is to work out some serialization for your data objects in the pipeline.
Standard Serialization works well. Read http://java.sun.com/developer/technicalArticles/Programming/serialization/ for hints on how to serialize. You can replace the BufferedReader and BufferedWriter with ObjectInputStream and ObjectOutputStream to accomplish this.
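As a rough sketch of that replacement, the code below writes Serializable objects to an ObjectOutputStream and reads them back from an ObjectInputStream. Item is a hypothetical data class, and using a trailing null as the end-of-stream marker is just one possible convention; in a real pipeline step, writeAll would be given System.out and readAll would be given System.in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ObjectPipelineSketch {
    // Hypothetical record passed between pipeline steps; any Serializable class would do.
    static final class Item implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int value;
        Item(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    // Producer side of a step: writes every item, then a null as an end marker.
    static void writeAll(List<Item> items, OutputStream rawOut) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(rawOut);
        for (Item item : items) {
            out.writeObject(item);
        }
        out.writeObject(null);
        out.flush();
    }

    // Consumer side of a step: reads items until the null end marker.
    static List<Item> readAll(InputStream rawIn) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(rawIn);
        List<Item> items = new ArrayList<>();
        Object next;
        while ((next = in.readObject()) != null) {
            items.add((Item) next);
        }
        return items;
    }

    // Simulates the pipe between two steps with an in-memory buffer.
    static List<Item> roundTrip(List<Item> items) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeAll(items, buffer);
            return readAll(new ByteArrayInputStream(buffer.toByteArray()));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Item> sent = new ArrayList<>();
        sent.add(new Item("first", 1));
        System.out.println(roundTrip(sent).get(0).name);   // prints first
    }
}
```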

I think this issue is related to the Java Parallel Processing Framework (JPPF). Using it, you can run different jobs on different processors.

JVM performance tuning has been mentioned before in Why does this Java code not utilize all CPU cores?. Note that this only applies to the JVM, so your application must already be using threads (and more or less "correctly" at that):
http://ch.sun.com/sunnews/events/2009/apr/adworkshop/pdf/5-1-Java-Performance.pdf

With Java 8, you can use the following API from Executors:
public static ExecutorService newWorkStealingPool()
It creates a work-stealing thread pool using all available processors as its target parallelism level.
Thanks to the work-stealing mechanism, idle threads steal tasks from the task queues of busy threads, which tends to increase overall throughput.
From grepcode, implementation of newWorkStealingPool is as follows
/**
* Creates a work-stealing thread pool using all
* {@link Runtime#availableProcessors available processors}
* as its target parallelism level.
* @return the newly created thread pool
* @see #newWorkStealingPool(int)
* @since 1.8
*/
public static ExecutorService newWorkStealingPool() {
return new ForkJoinPool
(Runtime.getRuntime().availableProcessors(),
ForkJoinPool.defaultForkJoinWorkerThreadFactory,
null, true);
}
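A small usage sketch, assuming Java 8 or later: the tasks below just square numbers, which is a placeholder workload chosen so that the result is easy to check, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkStealingSketch {
    // Computes 1^2 + 2^2 + ... + n^2 with one task per term on a work-stealing pool.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newWorkStealingPool();
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                tasks.add(() -> value * value);
            }
            long total = 0;
            for (Future<Long> future : pool.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100));   // prints 338350
    }
}
```

Tasks this tiny cost more to schedule than to run, so treat this only as a demonstration of the API, not as a case where the pool pays off.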
