On a multicore box, the Java thread scheduler's decisions are rather arbitrary: it assigns thread priorities based on when a thread was created, which thread created it, and so on.
The idea is to run a tuning process using PSO (particle swarm optimization) that would randomly set thread priorities and eventually converge on optimal priorities, where the fitness function is the program's total run time.
Of course there would be more parameters; for example, the priorities could shift during the run to find an optimal priority function.
How practical and interesting does the idea sound? Any suggestions?
Just some background:
I've been programming in Java/C/C++ for a few years now on various projects. Another alternative would be making a thread scheduler based on this idea in C, where the default scheduler is the OS's.
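To make the idea concrete, here's a minimal sketch of what one fitness evaluation might look like, assuming a PSO particle is just a vector of priorities. The class name, workload, and `evaluate` method are all hypothetical; note that `Thread.setPriority` is only a hint to the underlying OS scheduler and may have little or no effect.

```java
// Hypothetical sketch: one PSO fitness evaluation = run the workload once
// with a candidate priority vector and return the elapsed time.
public class PriorityTuning {

    // Returns elapsed nanoseconds (the fitness value for one particle).
    // Each priority must be between Thread.MIN_PRIORITY and Thread.MAX_PRIORITY.
    public static long evaluate(int[] priorities) throws InterruptedException {
        Thread[] workers = new Thread[priorities.length];
        long start = System.nanoTime();
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                long sum = 0;
                for (int j = 0; j < 1_000_000; j++) sum += j; // stand-in workload
            });
            workers[i].setPriority(priorities[i]); // only a hint to the OS
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        long fitness = evaluate(new int[] {
            Thread.MIN_PRIORITY, Thread.NORM_PRIORITY, Thread.MAX_PRIORITY });
        System.out.println("elapsed ns: " + fitness);
    }
}
```

The PSO loop itself would repeatedly call something like `evaluate` with mutated priority vectors and keep the best one; noise between runs is a real issue, so each particle would likely need several repetitions.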
Your approach as described is static, i.e. you run the program several times, come up with a scheduling solution, then ship that scheduling information with the program.
The problem is that for most non-trivial programs, their performance will partly depend on the specific data they're working with. Even if you find an optimal way to schedule threads for one data set, there is absolutely no guarantee that it will improve speed on another one. In most cases, running what will be a long and arduous optimization every time they want to do a new release will not be worth it for devs, unless perhaps for large computation efforts (where the programs are likely to be manually tuned and not written in java anyway).
I'd say a self-learning thread scheduler is a nice idea, but you can't treat it as a classical optimization problem here. You either need to be sure that your scheduling order will remain optimal (unlikely) or find an optimization method that works at runtime. And the issue here might be that it wouldn't take much for the overhead of your scheduler to destroy any performance gain you might get.
I think this is a somewhat subjective question, but overall no, I don't think it would work.
Best way to find out -- start an open source project and see people's usage/reaction.
It sounds very interesting to me, but I personally don't find it very useful. Perhaps we're just not at the point where concurrent programming is as prevalent and easy as it could be.
With the rise of functional programming, I'd guess the world will move towards avoiding thread synchronization as much as possible (thus making thread scheduling less of a factor in overall performance).
From my personal subjective experience, most performance problems in software can be solved by improving one single bottleneck area that accounts for 90% of the slowdown. This optimizer may help find that out. I am not sure how much the scheduling strategy could improve overall performance, though.
Don't get discouraged, though! I'm just talking out of thin air. It sounds fun, so why not just play with it anyway :)
I'm opening this question since I can't find easy-to-understand, summarized information about this topic. There isn't even a good YouTube video that explains it.
I'm currently studying real-time programming, and static and dynamic scheduling are part of it. I just can't seem to get my head around them.
If someone could explain the advantages and disadvantages of static and dynamic scheduling in an educational way, that would be really helpful.
What I've got so far is the following:
Static scheduling:
An off-line approach where the schedule is generated manually. It can be modified at run time, but this isn't recommended because it can then cause threads to miss their deadlines. It's easy to implement and to analyze, and because it's easy to analyze, it's easy to see whether the system will make all of its deadlines.
Dynamic scheduling:
An on-line approach where the schedule is generated automatically. It can be modified at run time by the system, and in most cases this shouldn't cause threads to miss their deadlines. If the system changes, it's easy to generate a new schedule, since it's produced automatically. However, there is no guarantee that the system will meet all of its deadlines.
Can anyone explain these two a bit better than I have, or add more information about them? Perhaps illustrate them with an image so they're easier to wrap my head around.
In simple terms,
Static scheduling is the mechanism where we have already controlled, in our code, the order/way the threads/processes execute (at compile time). If you have used any control over threads in your program (locks, semaphores, joins, sleeps) to achieve some goal, then you have applied static (compile-time) scheduling.
Dynamic scheduling is the mechanism where thread scheduling is done by the operating system, based on whatever scheduling algorithm is implemented at the OS level. The execution order of threads is then completely dependent on that algorithm, unless we impose some control on it (with static scheduling).
I don't think 'advantages' is the best term here. Simply put: when you implement any control over threads in your code to achieve some task, make sure you use as few controls as possible, and in the most optimized way. :))
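A tiny sketch of the distinction (the class and names are illustrative): with the `join` below, the execution order is fixed by the code itself, which is static control in the sense above; remove it and the interleaving of A and B is left entirely to the OS scheduler, i.e. dynamic.

```java
// Illustrative sketch: static (compile-time) control of thread order via join().
public class OrderedThreads {
    static final StringBuilder log = new StringBuilder();

    public static String run() throws InterruptedException {
        log.setLength(0);
        Thread a = new Thread(() -> log.append("A"));
        Thread b = new Thread(() -> log.append("B"));
        a.start();
        a.join();  // static control: B is not started until A has finished,
        b.start(); // so the order is always "AB" regardless of the OS scheduler
        b.join();
        return log.toString();
    }
}
```

Without the first `join`, starting both threads immediately would make the result "AB" or "BA" depending on what the OS decides, which is the dynamic case.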
Addition:
Comparison between Static & Dynamic Scheduling
Generally, we would never have a computer program that depends entirely on only one of static or dynamic scheduling.
Instead, we have programs that are pretty much controlled from the code itself (strongly static). This would be a good example of that.
And some programs are strongly dynamic (weakly static). This would be a good example of that: apart from the start of two threads, the rest of the program's execution is a free flyer.
Please don't try to find a definitive criterion that would label a program as either strongly static or strongly dynamic. :))
Positives & Negatives
Dynamic scheduling is faster in execution than static scheduling, since it's basically a free flyer without any intentional waits, joins, etc. (any kind of synchronization/protection between threads).
Dynamic scheduling is not aware of any thread dependencies (safety, synchronization, etc.). If you followed the sources I mentioned above, you probably have the idea.
So, generally, how good a multithreaded programmer you are depends on how few restrictions, dependencies, and bottlenecks you impose on your threads while still achieving your task successfully. :))
I think I have covered quite a few things. Please ask me if you have any questions. :))
Dynamic Scheduling –
o Main advantages (PROs):
- Enables handling cases where dependences are unknown at compile time
- Simplifies the compiler
- Allows code compiled for one pipeline to run efficiently on a different pipeline
o Disadvantages (CONs):
- Significant increase in hardware complexity
- Increased power consumption
- Can generate imprecise exceptions
With static scheduling, the order of the threads or processes is already controlled by the compiler, so it is fixed at compile time.
A data dependency involving memory cannot be resolved or even recognized at compile time, which is why the concept of dynamic scheduling was introduced.
Dynamic scheduling also determines the order of execution, but here the hardware does it rather than the compiler.
I am a beginner in Java. My requirement is to develop an agent application that checks whether a system (CPU) is in good enough health to handle/run more Java applications. (There are several CPUs available to run a Java application, so we should select the healthiest CPU according to its performance.)
What factors should I consider when checking CPU health? I have already included RAM and CPU load.
* Is it possible to check the heap memory? I can get the heap usage of the currently running program; is there any way to find the heap memory used by all programs running in the Java Virtual Machine together?
* Can I use the number of threads here?
Thanks in advance.
You're talking about task scheduling. This is a complex problem. Unless your project's core value lies in better CPU scheduling, I really recommend you rely on the operating system's scheduler instead, which likely has been improved over years or decades. This makes the endeavor very simple and you can ask more specific questions about how to influence the system's scheduler using Java APIs.
You'll want to look at the Java threading API and other concurrency-related packages.
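For the specific metrics the question mentions (heap usage, thread count, system load), the standard `java.lang.management` beans are a reasonable starting point. A minimal sketch; note these report on the current JVM only, not on every program on the machine:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

// Sketch: reading basic JVM health metrics via the standard management beans.
public class JvmHealth {

    // Heap currently in use by this JVM, in bytes.
    public static long usedHeapBytes() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        return mem.getHeapMemoryUsage().getUsed();
    }

    // Number of live threads in this JVM (daemon and non-daemon).
    public static int liveThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.getThreadCount();
    }

    // System load average over the last minute, or -1.0 if the
    // platform does not support it (e.g. Windows).
    public static double systemLoadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }
}
```

To see heap usage across several JVMs you would need to attach to each one (e.g. via JMX remote connections), since each JVM manages its own heap.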
If you really think you can get some benefits from very simple ("naive") scheduling, make sure to test all your scenarios to confirm. Often you'll encounter unexpected ramifications to your heuristics that may make things worse.
If you're an expert in task scheduling and your project's core value does lie in better scheduling, I suggest you rephrase your question to make it more explicit that you're looking for Java-related features. Note that the JVM is quite abstract; it might not provide the flexibility you require.
If you're not an expert in task scheduling and your project's core value still lies in better scheduling, I guess you're in for a nice ride. I suggest starting with more in-depth resources and asking more specific questions on SO or other places as you encounter them.
Good luck.
My application is supposed to have a "realtime with pause" functionality. The user can pause execution, do some things that modify what's going to happen, then unpause and let stuff happen. Stuff happens at regular intervals as specified by the user, can be slow, can be fast.
My goal at using threading here is to improve performance on multicore systems. The amount of data that the application is supposed to crunch at the time intervals is supposed to be arbitrarily large (I expect lots and lots of loops over collections, modifying object properties and generating random numbers, but precious little disk access). I don't want the application to be constrained by the capacity of a single core, if it can use more to run faster.
Will this actually work this way?
I've run some tests (made a program crunch numbers a lot, and looked at CPU usage during its activity), but it's not really conclusive - usage is certainly in the proximity of 100% on my dual core machine, but hardly ever 100%. Does a single-threaded (main only) Java application use all available cores for computation?
Does a single-threaded (main only) Java application use all available cores for computation?
No, it will normally use a single core.
Making a program do computations in parallel with multiple threads may make it faster, but it's not a magical solution for any kind of problem. Whether this is a suitable solution for your program depends on what your program is doing exactly, and if the algorithm can be parallelized. If, for example, you are doing lots of computations where the next computation depends on the result of the previous computation, then making it multi-threaded will not help a lot, because you can't do the computations at the same time - the next one first has to wait for the answer of the previous one. So, you first have to think about what computations in your program could be run in parallel.
Java has a lot of support for multi-threading. You can program with threads directly, or use an executor service, or use the fork/join framework. Whatever is appropriate depends on what exactly you want to do.
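For the case described above where the work is independent, a minimal executor-service sketch looks like the following; the chunked range sum is just an illustrative workload, and the pool sizing and chunking scheme are assumptions that only pay off when each task does substantial work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: splitting an independent computation across a pool
// sized to the machine's core count.
public class ParallelSum {

    // Sums 0..n-1 by dividing the range into independent chunks.
    public static long sum(long n, int chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            long chunkSize = n / chunks;
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < chunks; i++) {
                long lo = i * chunkSize;
                long hi = (i == chunks - 1) ? n : lo + chunkSize;
                results.add(pool.submit(() -> {      // each chunk is independent
                    long s = 0;
                    for (long x = lo; x < hi; x++) s += x;
                    return s;
                }));
            }
            long total = 0;
            for (Future<Long> f : results) total += f.get(); // combine results
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no chunk depends on another, there is no synchronization beyond collecting the futures, which is exactly the easy case described above.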
Does a single-threaded (main only) Java application use all available cores for computation?
Not usually, but you can make use of some higher-level APIs in Java that use threads for you, so you're not even using threads directly: most obviously fork/join and executors, and less obviously the new Streams API on collections (i.e. parallelStream).
In general, though, to make use of all cores you need to do some kind of concurrency. Furthermore, it's really hard to tell what's going on just by observing your OS monitor (especially with only 2 cores): your OS has other things going on (managing itself, running your IDE, running crontab, running a browser to post to Stack Overflow ;)).
Finally, just implementing concurrency by itself may not help; you have to do it "right" for your code/algorithm.
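As a minimal sketch of the parallelStream route mentioned above (the range and the squaring are just a stand-in workload), the Streams API runs the reduction over the common fork/join pool with no explicit thread code at all:

```java
import java.util.stream.LongStream;

// Sketch: parallelizing a reduction via the Streams API.
public class StreamSum {

    // Sum of squares 1..n, computed in parallel on the common pool.
    public static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()          // distributes work across all cores
                .map(x -> x * x)
                .sum();
    }
}
```

This only helps when `n` is large enough that the per-element work outweighs the coordination overhead; for tiny ranges the sequential version is usually faster.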
A Java thread will run on a single CPU. To use multiple CPUs, you should have multiple threads.
Imagine that you have to do various tasks using your hands. You will do them slowly using one hand and more efficiently using both. Similarly, in Java or any other language, multithreading provides the system with many hands. The good news is that you can have many threads doing different tasks. Running all operations in a single thread makes the program sluggish and sometimes unresponsive, so a good practice is to do long-running tasks in a separate thread. For example, loading large chunks of data from a database should be processed in a separate thread, as should downloading data from the internet. What happens if you do long-running operations in the main thread? The program hangs and becomes unresponsive until the task completes, and the user will think something is wrong. I hope you get it.
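A minimal sketch of moving a long-running task off the main thread; `loadData` and the class name are hypothetical stand-ins for a slow database or network call:

```java
// Sketch: doing a slow task in a background thread so the main
// thread stays responsive.
public class BackgroundWork {

    // Stand-in for a slow database query or download.
    static String loadData() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "rows";
    }

    public static String runInBackground() throws InterruptedException {
        final String[] result = new String[1];
        Thread worker = new Thread(() -> result[0] = loadData());
        worker.start();
        // ...the main thread is free to handle user input here...
        worker.join(); // block only when the result is actually needed
        return result[0];
    }
}
```

In a GUI application you would normally hand the result back to the UI thread through the framework's own mechanism rather than join directly, but the principle is the same.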
I'm learning reactive programming techniques, with async I/O etc, and I just can't find decent authoritative comparative data about the benefits of not switching threads.
Apparently switching threads is "expensive" compared to computations. But what scale are we talking on?
The essential question is "How many processor cycles/instructions does it take to switch a java thread?" (I'm expecting a range)
Is it affected by the OS?
I presume it's affected by the number of threads, which is why async I/O is so much better than blocking: the more threads, the further away the context has to be stored (presumably even out of the cache into main memory).
I've seen Approximate timings for various operations which although it's (way) out of date, is probably still useful for relating processor cycles (network would likely take more "instructions", SSD disk probably less).
I understand that reactive applications enable web apps to go from thousands to tens of thousands of requests per second (per server), but that's hard to verify too; comments welcome.
NOTE - I know this is a bit of a vague, useless, fluffy question at the moment because I have little idea on the inputs that would affect the speed of a context switch. Perhaps statistical answers would help - as an example I'd guess >=60% of threads would take between 100-10000 processor cycles to switch.
Thread switching is done by the OS, so Java has little to do with it. Also, on Linux at least (and I presume on many other operating systems too), the scheduling cost does not depend on the number of threads; Linux has used an O(1) scheduler since version 2.6.
The thread switch overhead on Linux is some 1.2 µs (article from 2018). Unfortunately the article doesn't list the clock speed at which that was measured, but the overhead should be some 1000-2000 clock cycles or thereabout. On a given machine and OS the thread switching overhead should be more or less constant, not a wide range.
Apart from this direct switching cost there's also the cost of changing workload: the new thread is most likely using a different set of instructions and data, which need to be loaded into the cache, but this cost doesn't differ between a thread switch or an asynchronous programming 'context switch'. And for completeness, switching to an entirely different process has the additional overhead of changing the memory address space, which is also significant.
By comparison, the switching overhead between goroutines in the Go programming language (which uses userspace threads which are very similar to asynchronous programming techniques) was around 170 ns, so one seventh of a linux thread switch.
Whether that is significant for you depends on your use case of course. But for most tasks, the time you spend doing computation will be far more than the context switching overhead. Unless you have many threads that do an absolutely tiny amount of work before switching.
Threading overhead has improved a lot since the early 2000s, and according to the linked article, running 10,000 threads in production shouldn't be a problem on a recent server with a lot of memory. General claims of thread switching being slow are often based on yesteryear's computers, so take those with a grain of salt.
One remaining fundamental advantage of asynchronous programming is that the userspace scheduler has more knowledge about the tasks, and so can in principle make smarter scheduling decisions. It also doesn't have to deal with processes from different users doing wildly different things that still need to be scheduled fairly. But even that can be worked around, and with the right kernel extensions these Google engineers were able to reduce the thread switching overhead to the same range as goroutine switches (200 ns).
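If you want to observe the switch cost on your own machine, a crude sketch is to force a handoff between two threads on every iteration, for instance with a `SynchronousQueue`. This measures queue overhead as well as the switch itself, and numbers will vary by OS and hardware, so treat the result as a rough upper bound rather than a precise figure:

```java
import java.util.concurrent.SynchronousQueue;

// Sketch: two threads ping-pong a token, forcing a thread switch
// on every exchange, to get a rough upper bound on switch cost.
public class PingPong {

    public static long nanosPerRoundTrip(int rounds) throws InterruptedException {
        SynchronousQueue<Integer> ping = new SynchronousQueue<>();
        SynchronousQueue<Integer> pong = new SynchronousQueue<>();
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) pong.put(ping.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            ping.put(i);   // hand off to the echo thread (switch)
            pong.take();   // wait for it to hand back (switch again)
        }
        long elapsed = System.nanoTime() - start;
        echo.join();
        return elapsed / rounds; // two switches plus queue overhead per round trip
    }
}
```

A warm-up run before timing, and pinning both threads' cores if your OS allows it, would make the measurement considerably less noisy.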
Rugal has a point. In modern architectures theoretical turn-around times are usually far off from actual measurements because the hardware, as well as the software have become so much more complex. It also inherently depends on your application. Many web-applications for example are I/O-bound where the context switch time matters a lot less.
Also note that context switching (what you refer to as thread switching) is an OS thing and not a Java thing. There is no guarantee as to how "heavy" a context switch in your OS is. It used to take tens if not hundreds of thousands of CPU cycles to do a kernel-level switch, but there are also user-level switches, as well as experimental systems, where even kernel-level switches can take only a few hundred cycles.
I need to parallelize a CPU intensive Java application on my multicore desktop but I am not so comfortable with threads programming. I looked at Scala but this would imply learning a new language which is really time consuming. I also looked at Ateji PX Java parallel extensions which seem very easy to use but did not have a chance yet to evaluate it. Would anyone recommend it? Other suggestions welcome.
Thanks in advance for your help
Bill
I would suggest you try the built-in ExecutorService for distributing multiple tasks across multiple threads/cores. Do you have any requirements which this might not do for you?
The Java concurrency utilities:
http://download.oracle.com/javase/1.5.0/docs/guide/concurrency/overview.html
make parallel programming on Java even easier than it already was. I would suggest starting there - if you are uncomfortable with that level of working with threads, I would think twice about proceeding further. Parallelizing anything requires some level of technical comfort with how concurrent computation is done and coordinated. In my opinion, it can't get much easier than that framework - which is part of the reason why you see so few alternatives.
Second, the main thing you should think about is what the unit of work is for parallelization. If your unit of work is independent (i.e., each parallel task does not impact the others), this is generally far easier because you don't need to worry about much (or any) synchronization at all. Put effort into thinking how to model the problem so that computation is as independent as possible. If you model it well, you will almost certainly reduce the lines of code (which reduces the error, etc).
Admittedly, frameworks that automatically parallelize for you are less error prone, but can be suboptimal if your model unit of work doesn't play to their parallelization scheme.
I am the lead developer of Ateji PX. As you mention, guaranteeing thread safety is an important topic. It is also a very difficult one, and there's not much help out there besides hand-written and hand-checked #ThreadSafe annotations. See e.g. "The problem with threads".
We are currently working on a parallel verifier for Ateji PX. This has become possible because parallelism in Ateji PX is compositional (unlike threads) and based on a sound mathematical foundation, namely pi-calculus. Even without a tool, experience shows that expressing parallelism in an intuitive and compositional way makes it much easier to "think parallel" and catch errors earlier.
I browsed quickly through the Ateji PX web site. Seems to be a nice product but I'm afraid you will be disappointed at some point since Ateji PX only provides you an intuitive simple way of performing high level parallel operations such as distributing the work load on several workers, creating rendez-vous points between parallel tasks, etc. However as you can read in the FAQ in the section How do you detect and prevent data dependencies? Ateji PX does not ensure that the underlying code is thread safe. So at any rate you'll still be needing skills in Java thread programming.
Edit:
Also consider that when maintenance time comes and you're not available to perform it, it'll be easier to find a contractor, employee, or trainee with skills in standard Java multithreaded programming than in Ateji PX.
Last word: there's a free 30-day evaluation, so try it.
Don't worry: Java 7 is coming with a fork/join framework by Doug Lea for parallel processing.
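A minimal sketch of the fork/join style mentioned above; the threshold and the summing workload are illustrative, and in practice you would tune the threshold so leaf tasks do real work:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch: a fork/join task that recursively splits a range sum
// until the pieces are small enough to compute directly.
public class ForkJoinSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long lo, hi; // sums the half-open range [lo, hi)

    public ForkJoinSum(long lo, long hi) {
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {        // small enough: compute directly
            long s = 0;
            for (long x = lo; x < hi; x++) s += x;
            return s;
        }
        long mid = (lo + hi) / 2;          // otherwise: split in two
        ForkJoinSum left = new ForkJoinSum(lo, mid);
        ForkJoinSum right = new ForkJoinSum(mid, hi);
        left.fork();                           // run left half asynchronously
        return right.compute() + left.join();  // compute right here, then join
    }

    public static long sum(long n) {
        return new ForkJoinPool().invoke(new ForkJoinSum(0, n));
    }
}
```

The work-stealing pool keeps all cores busy as long as the splitting produces enough tasks, which is the main appeal over hand-rolled threading for divide-and-conquer problems.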