My application is supposed to have a "realtime with pause" functionality. The user can pause execution, do some things that modify what's going to happen, then unpause and let stuff happen. Stuff happens at regular intervals as specified by the user, can be slow, can be fast.
My goal at using threading here is to improve performance on multicore systems. The amount of data that the application is supposed to crunch at the time intervals is supposed to be arbitrarily large (I expect lots and lots of loops over collections, modifying object properties and generating random numbers, but precious little disk access). I don't want the application to be constrained by the capacity of a single core, if it can use more to run faster.
Will this actually work this way?
I've run some tests (made a program crunch numbers a lot, and looked at CPU usage during its activity), but it's not really conclusive - usage is certainly in the proximity of 100% on my dual core machine, but hardly ever 100%. Does a single-threaded (main only) Java application use all available cores for computation?
Does a single-threaded (main only) Java application use all available cores for computation?
No, it will normally use a single core.
Making a program do computations in parallel with multiple threads may make it faster, but it's not a magical solution for any kind of problem. Whether this is a suitable solution for your program depends on what your program is doing exactly, and if the algorithm can be parallelized. If, for example, you are doing lots of computations where the next computation depends on the result of the previous computation, then making it multi-threaded will not help a lot, because you can't do the computations at the same time - the next one first has to wait for the answer of the previous one. So, you first have to think about what computations in your program could be run in parallel.
Java has a lot of support for multi-threading. You can program with threads directly, or use an executor service, or use the fork/join framework. Whatever is appropriate depends on what exactly you want to do.
Does a single-threaded (main only) Java application use all available cores for computation?
Not usually, but you could make use of some higher level apis in java that is actually using threads for you and youre not even usinfpg threads directly, more obviousiously fork/join and executors, less obvious the new Streams API on collections (ie parallelStream).
In general, though, to make use of all cores, you need to do some kind of concurrency. Further...its really hard to just observe you OS monitor to see what is going on (especially with only 2 cores)...your OS has other things going on (trying to manage itself, running your IDE, running crontab, running a browers to post to stackoverflow ;).
Finally, just implementing (concurrency) itself may not help, you have to do it "right" for your code/algorithm.
a java thread will run in a single cpu. to use multiple CPUs, you should have multiple threads.
Imagine that u have to do various tasks using your hand. You will do it slowly using one hand and more effciently using both your hands. Similarly, in java or in any other language multi threading provides the system with many hands. The good news is that you can have many threads to do different tasks. Running operations in a single thread will make the program sluggish and sometimes unresponsive. A good practice is to do long running tasks in a separate thread. For example loading large chunks of data from a database should be processed in a separate thread. Downloading data from the internet should also be processed in a separate thread. What happens if you do long running operations in the main thread? The program HANGS and will become unresponsive till the task gets completed and the user will think that there is someting wrong. I hope you get it
Related
I am new to multithreading in Java, after looking at Java virtual machine - maximum number of threads it would appear there isn't a limit to how many threads a Java/Android app can run. However, is there an advisable limit? What I mean by this is, is there a number of threads where if you run past this number then it is unwise because you are unable to determine what thread does what at what time? I hope my question makes sense.
There are some advisable limits, however they don't really have anything to do with keeping track of them.
Most multithreading comes with locking. If you are using central data storage or global mutable state then the more threads you have, the more lock contention you will get. This is app-specific and depends on how much of said state you have and how often threads read and write it.
There are no limits in desktop JVMs by default, but there are OS limits.It should be in the tens of thousands for modern Windows machines, but don't rely on the ability to create much more than that.
Running multiple tasks in parallel is great, but the hardware can only cope with so much. If you are using small threads that get fired up sometimes, and spend most their time idle, that's no biggie (Java servers were written like this for years). However if your threads are very intensive, making more of them than the number of cores you have is not likely to give you any benefit. (I believe the standard practice is twice the number of cores if you anticipate threads going idle sometimes).
Threads have a cost to them. Whenever you switch Threads you switch context, and while it isn't that expensive, doing it constantly will hurt performance. It's not a good idea to create a Thread to sum up two integers and write back a result.
If Threads need visibility of each others state, then they are greatly slowed down, since a lot of their writes have to be written back to main memory. Threads are best used for standalone tasks that require little interaction with each other.
TL;DR
Depends on OS and Hardware: on servers creating thousands of threads is fine, on desktop machines you should limit yourself to 50-200 and choose carefully what you do with them.
Note: Androids default and suggested "UI multithread helper" - the AsyncTask is not actually a thread. It's a task invoked from a ThreadPool, and as such there is no limit or penalty to using it. It has an upper limit on the number of threads it spawns and reuses them rather than creating new ones. Most Android apps should use it instead of spawning their own threads. In general, Thread Pools are fairly widespread and are a great choice unless you are forced into blocking operations.
In order to improve the execution speed of a Java program running in Google App Engine, can I create additional Java threads during the runtime to make use of idle machines in the data center?
I've found conflicting data thus far.
If your primary concern is to improve the execution time, take a look at Memcache and Tasks. They can be used to reduce or avoid the latency of reading from or writing to the Datastore or other storage options, fetching URLs, sending emails, etc. If you do a lot of difficult computations that can run in parallel, look at MapReduce API.
Once you remove all the delays from your program, there will be no reason to use multiple threads within a single request.
Note that App Engine instances can use multithreading to execute multiple requests at the same time, so they tend to use allocated resources efficiently. To enable it, see:
https://developers.google.com/appengine/docs/java/config/appconfig#Java_appengine_web_xml_Using_concurrent_requests
If you have a problem that calls for a multithreaded solution, you can use threads (as described on the link that you included in your question).
However, based on your reasoning ("to make use of idle machines in the datacenter"), it seems like you're misguided. You should not use threads for that reason. You use the machines hours that you pay for and not more. The only time you will have an idle machine is if you tell App Engine to keep around an extra idle machine so that it doesn't have to start up an extra machine your app gets a big usage spike.
Most of the time, unless you are truly doing parallel computation, you won't need to use multiple threads in App Engine. For instance, the datastore has an asynchronous API so that you can do multiple datastore operations in parallel without having to deal with threads yourself.
Does that make sense?
This is a similar question to the one appearing at: How to ensure Java threads run on different cores. However, there might have been a lot of progress in that in Java, and also, I couldn't find the answer I am looking for in that question.
I just finished writing a multithreaded program. The program spawns several threads, but it doesn't seem to be using more than a single core. The program is faster (I am parallelizing something which makes it faster), but it definitely does not use all cores available, judging by running "top".
Any ideas? Is that an expected behavior?
The general code structure is as following:
for (some values in i)
{
start a thread of instantiated as MyThread(i)
(this thread uses heavily ConcurrentHashMap and arrays and basic arithmetic, no IO)
add the thread to a list T
}
foreach (thread in T)
{
do thread.join()
}
If its almost exactly 100% of one CPU, it can mean you really have
one core thread which is doing all the work and the others are not doing so much.
one resource which you are locking on and only one thread has a chance to run.
If you are using approximately one CPU it can mean this is all the work your CPUs have because you are waiting for something such as IO (network and/or disk)
I suggest you look at the state of your threads in VisualVM. It will help you identify which threads are running and give you an ideal of their pattern of behaviour. I also suggest you use a CPU profiler to help find your bottlenecks.
I think I read in the SCJP book by Katherine Sierra that JVM's ask the underlying OS to create a new OS thread for every Java thread.
So it's up to the underlying Operating System to decide how to balance Java (and any other kind of) threads between the available CPU's.
I was just wondering whether we actually need the algorithm to be muti-threaded if it must make use of the multi-core processors or will the jvm make use of multiple core's even-though our algorithm is sequential ?
UPDATE:
Related Question:
Muti-Threaded quick or merge sort in java
I don't believe any current, production JVM implementations perform automatic multi-threading. They may use other cores for garbage collection and some other housekeeping, but if your code is expressed sequentially it's difficult to automatically parallelize it and still keep the precise semantics.
There may be some experimental/research JVMs which try to parallelize areas of code which the JIT can spot as being embarrassingly parallel, but I haven't heard of anything like that for production systems. Even if the JIT did spot this sort of thing, it would probably be less effective than designing your code for parallelism in the first place. (Writing the code sequentially, you could easily end up making design decisions which would hamper automatic parallelism unintentionally.)
Your implementation needs to be multi-threaded in order to take advantage of the multiple cores at your disposal.
Your system as a whole can use a single core per running application or service. Each running application, though, will work off a single thread/core unless implemented otherwise.
Java will not automatically split your program into threads. Currently, if you want you code to be able to run on multiple cores at once, you need to tell the computer through threads, or some other mechanism, how to split up the code into tasks and the dependencies between tasks in your program. However, other tasks can run concurrently on the other cores, so your program may still run faster on a multicore processor if you are running other things concurrently.
An easy way to make you current code parallizable is to use JOMP to parallelize for loops and processing power intensize, easily parellized parts of your code.
I dont think using multi-threaded algorithm will make use of multi-core processors effectively unless coded for effectiveness. Here is a nice article which talks about making use of multi-core processors for developers -
http://java.dzone.com/news/building-multi-core-ready-java
On a multicore box, the java thread schedulers decisions are rather arbitrary, it assigns thread priorities based on when the thread was created, from which thread it was created etc.
The idea is to run a tuning process using pso that would randomly set thread priorities and then eventually reach optimal priorities where the fitness function is the total run time of the program?
Of course there would be more parameters, like the priorities would shift during the run to find an optimal priority function.
How practical, interesting does the idea sound? and any suggestions.
Just some background,
ive been programming in java/c/c++ for a few years now with various projects, another alternative would be making a thread scheduler based on this in c, where the default thread scheduler is the OS.
Your approach as described is a static approach, i.e. you need to run the program several times, then come up with a scheduling solution, then ship your scheduling information with the program.
The problem is that for most non-trivial programs, their performance will partly depend on the specific data they're working with. Even if you find an optimal way to schedule threads for one data set, there is absolutely no guarantee that it will improve speed on another one. In most cases, running what will be a long and arduous optimization every time they want to do a new release will not be worth it for devs, unless perhaps for large computation efforts (where the programs are likely to be manually tuned and not written in java anyway).
I'd say a self-learning thread scheduler is a nice idea, but you can't treat it as a classical optimization problem here. You either need to be sure that your scheduling order will remain optimal (unlikely) or find an optimization method that works at runtime. And the issue here might be that it wouldn't take much for the overhead of your scheduler to destroy any performance gain you might get.
I think this is a somewhat subjective question, but overall no, don't think it would work.
Best way to find out -- start an open source project and see people's usage/reaction.
It sounds very interesting to me -- but I personally don't find it very much useful. Perhaps we're just not at the point where concurrent programming is as prevalent and easy as it could be.
With the promotion of functional programming, I guess the world would move towards avoiding thread synchronization as much as possible (thus making thread scheduling less of an impact in overall performance)
From my personal subjective experience, most performance problems in software can be solved by improving one single bottleneck area that accounts for 90% of the slowdown. This optimizer may help find that out. I am not sure how much the scheduling strategy could improve overall performance, though.
Don't get discouraged, though! I'm just talking out of thin air. It sounds fun, so why not just play with it anyway :)