Making computation-threads cancellable in a smart way

I am wondering how to reach a compromise between fast cancel-responsiveness and performance with my threads, whose bodies look similar to this loop:
for(int i=0; i<HUGE_NUMBER; ++i) {
//some easy computation like adding numbers
//which are result of previous iteration of this loop
}
If a computation in loop body is quite easy then adding simple check-reaction to each iteration:
if (Thread.currentThread().isInterrupted()) {
throw new InterruptedException("Cancelled");
}
may slow down execution of the code.
Even if I change the above condition to:
if (i % 100 == 0 && Thread.currentThread().isInterrupted()) {
throw new InterruptedException("Cancelled");
}
Then the compiler cannot just precompute the values of i at which the condition needs checking, since HUGE_NUMBER is a variable and can have different values.
So I'd like to ask if there's any smart way of adding such check to a presented code knowing that:
HUGE_NUMBER is variable and can have different values
the loop body consists of some easy-to-compute code that relies on the results of previous iterations.
What I want to say is that one iteration of a loop is quite fast, but HUGE_NUMBER of iterations can take a little more time and this is what I want to avoid.

First of all, use Thread.interrupted() instead of Thread.currentThread().isInterrupted() in that case.
You should first ask whether checking the interruption flag really slows down your calculation too much. On the one hand, if the loop body is VERY simple, even a huge number of iterations (the upper limit is Integer.MAX_VALUE) will run in a few seconds. Even if checking the interruption flag adds an overhead of 20 or 30%, this will not add very much to the total runtime of your algorithm.
On the other hand, if the loop body is not that simple and therefore runs longer, testing the interruption flag will not be a noticeable overhead, I think.
Don't do tricks like if (i % 10000 == 0), as this will slow down calculation much more than a 'short' Thread.interrupted().
There is one small trick that you could use - but think twice because it makes your code more complex and less readable:
Whenever you have a loop like that:
for (int i = 0; i < max; i++) {
// loop-body using i
}
you can split up the total range of i into several intervals of size INTERVAL_SIZE:
int start = 0;
while (start < max) {
final int next = Math.min(start + INTERVAL_SIZE, max);
for(int i = start; i < next; i++) {
// loop-body using i
}
start = next;
}
Now you can add your interruption check right before or after the inner loop!
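A minimal, self-contained sketch of the interval loop with the check placed before each inner loop might look like the following. The class name ChunkedLoop, the method compute and the chosen INTERVAL_SIZE are assumptions made for illustration, and the loop body is only a stand-in. Unlike the snippet above, the sketch computes the chunk bound in long arithmetic so that start + INTERVAL_SIZE cannot overflow when max is close to Integer.MAX_VALUE.

```java
final class ChunkedLoop {
    // Hypothetical chunk size; tune it to trade cancel latency against overhead.
    private static final int INTERVAL_SIZE = 1 << 20;

    /** Runs a simple loop over [0, max), checking for interruption once per chunk. */
    static long compute(int max) throws InterruptedException {
        long x = 0;
        int start = 0;
        while (start < max) {
            // Checked once per chunk rather than once per iteration.
            if (Thread.interrupted()) {
                throw new InterruptedException("Cancelled");
            }
            // Long arithmetic so that start + INTERVAL_SIZE cannot overflow.
            final int next = (int) Math.min((long) start + INTERVAL_SIZE, max);
            for (int i = start; i < next; i++) {
                if (i % 2 == 0) x++;   // stand-in for the real loop body
            }
            start = next;
        }
        return x;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(compute(10_000_000));
    }
}
```

With this layout the worst-case cancel latency is the time one chunk takes, so INTERVAL_SIZE is the knob that trades responsiveness against overhead.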
I've done some tests on my system (JDK 7) using the following loop-body
if (i % 2 == 0) x++;
and Integer.MAX_VALUE / 2 iterations. The results are as follows (after warm-up):
Simple loop without any interruption checks: 1,949 ms
Simple loop with check per iteration: 2,219 ms (+14%)
Simple loop with check per 1 million-th iteration using modulo: 3,166 ms (+62%)
Simple loop with check per 1 million-th iteration using bit-mask: 2,653 ms (+36%)
Interval-loop as described above with check in outer loop: 1,972 ms (+1.1%)
So even if the loop-body is as simple as above, the overhead for a per-iteration check is only 14%! So it's recommended to not do any tricks but simply check the interruption flag via Thread.interrupted() in every iteration!

Make your calculation an Iterator.
Although this does not sound terribly useful the benefit here is that you can then quite easily write filter iterators that can be surprisingly flexible. They can be added and removed simply - even through configuration if you wish. There are a number of benefits - try it.
You can then add a filtering Iterator that watches the time and checks for interrupt on a regular basis - or something even more flexible.
You can even add further filtering without compromising the original calculation by interspersing it with brittle status checks.
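As a rough sketch of what such a filtering Iterator could look like (all class names here are hypothetical, and the running sum merely stands in for the real calculation), the computation is exposed as an Iterator and a wrapper polls the interrupt flag every N elements. Because Iterator.next() cannot throw the checked InterruptedException, the sketch assumes an unchecked CancelledException instead.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class CancellableIteration {

    /** The computation as an Iterator: each next() performs one step of a running sum. */
    static final class RunningSum implements Iterator<Long> {
        private final int steps;
        private int i = 0;
        private long sum = 0;

        RunningSum(int steps) { this.steps = steps; }

        @Override public boolean hasNext() { return i < steps; }

        @Override public Long next() {
            if (!hasNext()) throw new NoSuchElementException();
            sum += i++;   // depends on the result of the previous step
            return sum;
        }
    }

    /** Unchecked, because Iterator.next() cannot throw InterruptedException. */
    static final class CancelledException extends RuntimeException {
        CancelledException() { super("Cancelled"); }
    }

    /** Filter that polls the interrupt flag once every `every` elements. */
    static final class InterruptChecking<T> implements Iterator<T> {
        private final Iterator<T> source;
        private final int every;
        private int sinceCheck = 0;

        InterruptChecking(Iterator<T> source, int every) {
            this.source = source;
            this.every = every;
        }

        @Override public boolean hasNext() { return source.hasNext(); }

        @Override public T next() {
            if (++sinceCheck >= every) {
                sinceCheck = 0;
                // isInterrupted() leaves the flag set for the caller to handle.
                if (Thread.currentThread().isInterrupted()) {
                    throw new CancelledException();
                }
            }
            return source.next();
        }
    }

    /** Drives the wrapped computation to completion and returns the last value. */
    static long run(int steps) {
        Iterator<Long> it = new InterruptChecking<>(new RunningSum(steps), 1024);
        long last = 0;
        while (it.hasNext()) last = it.next();
        return last;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000));
    }
}
```

Note that an Iterator<Long> boxes every value, so for a very cheap loop body this design may cost more than the check it is trying to amortise; it fits better when each step does a meaningful amount of work.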

Related

Why does the iteration speed increase over time? [JAVA]

I was playing around with loops in java, when I saw that the iteration speed keeps increasing.
Kind of seemed interesting.
Any ideas why?
Code:
import org.junit.jupiter.api.Test;
public class RandomStuffTest {
public static long iterationsPerSecond = 0;
@Test
void testIterationSpeed() {
Thread t = new Thread(()->{
try{
while (true){
System.out.println("Iterations per second: "+iterationsPerSecond);
iterationsPerSecond = 0;
Thread.sleep(1000);
}
} catch (Exception e) {
e.printStackTrace();
}
});
t.setDaemon(true);
t.start();
while (true){
for (long i = 0; i < Long.MAX_VALUE; i++) {
iterationsPerSecond++;
}
}
}
}
Output:
Iterations per second: 6111
Iterations per second: 2199824206
Iterations per second: 4539572003
Iterations per second: 6919540856
Iterations per second: 9442209284
Iterations per second: 11899448226
Iterations per second: 14313220638
Iterations per second: 16827637088
Iterations per second: 19322118707
Iterations per second: 21807781722
Iterations per second: 24256315314
Iterations per second: 26641505580
Another thing that I noticed:
The CPU usage was around 20% all the time and not really increasing...
Maybe because I was running the code as a test using Junit?

The problem is the Java Memory Model (JMM).
Every thread is allowed to have (does not have to do this) a local copy of each field. Whenever it writes or reads this field it is free to just set its local copy and sync it up with other threads' local copies much, much later.
Said differently, the JVM is free to re-order instructions, do things in parallel, and otherwise apply whatever weird stuff it wants to optimize your code, as long as certain guarantees are never broken.
One guarantee that is easy to understand: The JVM is free to reorder or parallelize 2 sequential instructions, but it must never be possible to write code that can observe this except through timing.
In other words, int x = 0; x = 5; System.out.println(x); must necessarily print 5 and never 0.
You can establish such relationships between 2 threads as well but this involves the use of volatile and/or synchronized and/or something that does this internally (most things in the java.util.concurrent package).
You didn't, so this result is meaningless. Most likely, the instruction iterationsPerSecond = 0 is having no effect; the code iterationsPerSecond++ reads 9442209284, increments by one, and writes it back - and that field got written to 0 someplace in the middle of all that, which thus accomplished nothing whatsoever.
If you want to test this properly, try a volatile variable, or better yet an AtomicLong.
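One way this could be done with an AtomicLong is sketched below. The class name IterationCounter and the method sample are made up for this example; the reporter uses getAndSet(0) so that reading and resetting the counter happen as one atomic step, and the worker stops when interrupted rather than running forever.

```java
import java.util.concurrent.atomic.AtomicLong;

final class IterationCounter {

    /** Starts a counting worker and returns `samples` readings taken `intervalMillis` apart. */
    static long[] sample(int samples, long intervalMillis) throws InterruptedException {
        final AtomicLong iterations = new AtomicLong();

        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                iterations.incrementAndGet();   // atomic read-modify-write
            }
        });
        worker.setDaemon(true);
        worker.start();

        long[] counts = new long[samples];
        for (int s = 0; s < samples; s++) {
            Thread.sleep(intervalMillis);
            counts[s] = iterations.getAndSet(0);   // read and reset in one atomic step
        }

        worker.interrupt();
        worker.join();
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        for (long c : sample(5, 1000)) {
            System.out.println("Iterations per second: " + c);
        }
    }
}
```

With this version the readings should stay roughly level instead of growing from one second to the next, although the absolute numbers will be much lower than in the racy version because every increment is now an atomic operation.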

Like already indicated, the code is broken due to a data race.
The JIT can do some funny stuff with your code because of the data race:
while (true){
for (long i = 0; i < Long.MAX_VALUE; i++) {
iterationsPerSecond++;
}
}
Since it doesn't know that another thread is also messing with the iterationsPerSecond, the compiler could fold the for loop because it can calculate the outcome of the loop:
while (true){
iterationsPerSecond = Long.MAX_VALUE;
}
And it could even decide to pull the write out of the loop, since the same value is written every time (loop-invariant code motion):
iterationsPerSecond = Long.MAX_VALUE;
while (true){
}
It could even decide to throw away the store, because it doesn't know there are any readers. Effectively it is a dead store, and hence dead code elimination can be applied:
while (true){
}
An atomic or a volatile would solve the problem because a happens-before edge is established. Using a volatile or AtomicLong.get/set is equally expensive: both impose the same compiler restrictions and the same fences at the hardware level.
If you want to run microbenchmarks, I would suggest checking out JMH. It will protect you against a lot of trivial mistakes.

break is slowing down my loop?

I have a nested loop which iterates over all combinations of two elements from an array. However, if the sum of the two values is too large, I want to skip to the next x.
Here's the Java code snippet:
/* Let array be an array of integers
* and size be equal to its length.
*/
for (int a = 0; a < size; a++)
{
int x = array[a];
for (int b = 0; b < size; b++)
{
int y = array[b];
if ((x + y) < MAX)
{
// do stuff with x and y
}
else
{
// x + y is too big; skip to next x
break;
}
}
}
This works exactly as expected.
However, if I replace the break statement with b = size;, it surprisingly runs about 20% faster. Note that by setting b = size;, the inner for conditional becomes false and execution continues to the next iteration of the outer a loop.
Why would this happen? It seems like break should be faster, as I would have thought it saves an assignment, jump, and compare. Though clearly it does not.

Why would this happen? It seems like break should be faster ...
IMO, the most likely explanation is some kind of JVM warmup effect, especially since the overall times (120ms versus 74ms) are so small. If you wrapped that loop in another one, so that you could perform the time measurements repeatedly in the same run, this anomaly is likely to go away.
(Just increasing the array sizes isn't necessarily going to help. The best way to be sure that you have accounted for JVM warmup anomalies is to use a benchmarking framework, e.g. Caliper. But, failing that, put the "snippet" into a method and call it repeatedly.)
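A bare-bones version of the "put it into a method and call it repeatedly" approach might look like this. The class name RepeatedTiming, the value of MAX, the array contents and the counting body are all assumptions, since the question does not show what "do stuff" is. This is a rough sketch and not a substitute for a benchmarking framework.

```java
final class RepeatedTiming {
    static final int MAX = 1_000;   // assumed threshold

    /** The snippet under test: counts pairs whose sum stays below MAX, using break. */
    static long countPairs(int[] array) {
        long count = 0;
        for (int a = 0; a < array.length; a++) {
            int x = array[a];
            for (int b = 0; b < array.length; b++) {
                int y = array[b];
                if (x + y < MAX) {
                    count++;   // stand-in for "do stuff with x and y"
                } else {
                    break;     // x + y is too big; skip to next x
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] array = new int[2_000];
        for (int i = 0; i < array.length; i++) array[i] = i;   // ascending values

        // Time the same method several times in one JVM run; the first runs
        // include interpretation and JIT compilation, the later ones do not.
        for (int run = 0; run < 10; run++) {
            long t0 = System.nanoTime();
            long result = countPairs(array);
            long micros = (System.nanoTime() - t0) / 1_000;
            // Printing the result keeps the computation from being discarded.
            System.out.println("run " + run + ": " + micros + " us, result=" + result);
        }
    }
}
```

If the first few runs are noticeably slower than the later ones, that difference is the warmup effect, and only the later timings are worth comparing between the break and b = size variants.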
... as I would have thought it saves an assignment, jump, and compare. Though clearly it does not.
It is not clear at all. Your Java code gets compiled to bytecodes by javac (or your IDE). When you run the code, it starts out interpreting the bytecodes, and then after a bit they are compiled to native code by the JIT compiler:
The JIT compilation takes time that is (probably) included in your time measurements ... and one source of warmup anomalies.
The code produced by the JIT compiler is influenced by statistics gathered while interpreting. One of the things that is typically measured is whether branches (e.g. if tests) go one way or the other. This is used to make branch predictions ... which if correct make the test-and-branch instruction sequences a lot faster.

Is it bad practice to use break to exit a loop in Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I was wondering if it is a "bad practice" to use a break statement to exit a loop instead of fulfilling the loop condition?
I do not have enough insight in Java and the JVM to know how a loop is handled, so I was wondering if I was overlooking something critical by doing so.
The focus of this question: is there a specific performance overhead?

Good lord no. Sometimes there is a possibility that something can occur in the loop that satisfies the overall requirement, without satisfying the logical loop condition. In that case, break is used, to stop you cycling around a loop pointlessly.
Example
String item = null;
for(int x = 0; x < 10; x++)
{
// Linear search.
if(array[x].equals("Item I am looking for"))
{
//you've found the item. Let's stop.
item = array[x];
break;
}
}
What makes more sense in this example. Continue looping to 10 every time, even after you've found it, or loop until you find the item and stop? Or to put it into real world terms; when you find your keys, do you keep looking?
Edit in response to comment
Why not set x to 11 to break the loop? It's pointless. We've got break! Unless your code is making the assumption that x is definitely larger than 10 later on (and it probably shouldn't be) then you're fine just using break.
Edit for the sake of completeness
There are definitely other ways to simulate break. For example, adding extra logic to your termination condition in your loop. Saying that it is either loop pointlessly or use break isn't fair. As pointed out, a while loop can often achieve similar functionality. For example, following the above example..
while(x < 10 && item == null)
{
if(array[x].equals("Item I am looking for"))
{
item = array[x];
}
x++;
}
Using break simply means you can achieve this functionality with a for loop. It also means you don't have to keep adding in conditions into your termination logic, whenever you want the loop to behave differently. For example.
for(int x = 0; x < 10; x++)
{
if(array[x].equals("Something that will make me want to cancel"))
{
break;
}
else if(array[x].equals("Something else that will make me want to cancel"))
{
break;
}
else if(array[x].equals("This is what I want"))
{
item = array[x];
}
}
Rather than a while loop with a termination condition that looks like this:
while(x < 10 && !array[x].equals("Something that will make me want to cancel") &&
!array[x].equals("Something else that will make me want to cancel"))

Using break, just as practically any other language feature, can be a bad practice, within a specific context, where you are clearly misusing it. But some very important idioms cannot be coded without it, or at least would result in far less readable code. In those cases, break is the way to go.
In other words, don't listen to any blanket, unqualified advice—about break or anything else. It is not once that I've seen code totally emaciated just to literally enforce a "good practice".
Regarding your concern about performance overhead, there is absolutely none. At the bytecode level there are no explicit loop constructs anyway: all flow control is implemented in terms of conditional jumps.

The JLS specifies that a break statement completes abruptly, that is, it is an abnormal termination of a loop. However, just because it is considered abnormal does not mean that it is not used in many different code examples, projects, products, space shuttles, etc. The JVM specification does not state either an existence or absence of a performance loss, though it is clear that code execution will continue after the loop.
However, code readability can suffer with odd breaks. If you're sticking a break in a complex if statement surrounded by side effects and odd cleanup code, with possibly a multilevel break with a label(or worse, with a strange set of exit conditions one after the other), it's not going to be easy to read for anyone.
If you want to break your loop by forcing the iteration variable to be outside the iteration range, or by otherwise introducing a not-necessarily-direct way of exiting, it's less readable than break.
However, looping extra times in an empty manner is almost always bad practice as it takes extra iterations and may be unclear.
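For readers who have not met the multilevel break with a label mentioned above, here is a small sketch of the mechanism. The class name LabeledBreak, the method find and the grid are invented for the example.

```java
final class LabeledBreak {

    /** Returns {row, col} of the first cell equal to target, or null if absent. */
    static int[] find(int[][] grid, int target) {
        int[] found = null;
        search:
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (grid[r][c] == target) {
                    found = new int[] {r, c};
                    break search;   // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = {{1, 2}, {3, 4}};
        int[] pos = find(grid, 3);
        System.out.println(pos[0] + "," + pos[1]);   // prints 1,0
    }
}
```

In a method this small the label is easy to follow; the readability concern applies when the labeled loop is long or the break is buried among side effects and cleanup code.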

In my opinion a For loop should be used when a fixed amount of iterations will be done and they won't be stopped before every iteration has been completed. In the other case where you want to quit earlier I prefer to use a While loop. Even if you read those two little words it seems more logical. Some examples:
for (int i=0;i<10;i++) {
System.out.println(i);
}
When I read this code quickly I will know for sure it will print out 10 lines and then go on.
for (int i=0;i<10;i++) {
if (someCondition) break;
System.out.println(i);
}
This one is already less clear to me. Why would you first state you will take 10 iterations, but then inside the loop add some extra conditions to stop sooner?
I prefer the previous example written in this way (even when it's a little more verbose, but that's only with 1 line more):
int i=0;
while (i<10 && !someCondition) {
System.out.println(i);
i++;
}
Everyone who reads this code will see immediately that there is an extra condition that might terminate the loop earlier.
Of course, in very small loops you can always argue that every programmer will notice the break statement. But I can tell from my own experience that in larger loops those breaks can be overlooked. (And that brings us to another topic: splitting up code into smaller chunks.)

Using break in loops can be perfectly legitimate and it can even be the only way to solve some problems.
However, its bad reputation comes from the fact that new programmers usually abuse it, leading to confusing code, especially by using break to stop the loop in conditions that could have been written in the loop condition statement in the first place.

No, it is not a bad practice to break out of a loop when a certain desired condition is reached (for example, a match is found). Many times you may want to stop iterating because you have already achieved what you want, and there is no point iterating further. But be careful to make sure you are not accidentally missing something or breaking out when not required.
Breaking out can also improve performance, since you avoid iterating over thousands of records after the purpose of the loop has been fulfilled (e.g. the required record has already been matched).
Example :
for (int j = 0; j < type.size(); j++) {
if (condition) {
// do stuff after which you want
break; // stop further iteration
}
}

It isn't bad practice, but it can make code less readable. One useful refactoring to work around this is to move the loop to a separate method, and then use a return statement instead of a break. For example, this (lifted from @Chris's answer):
String item;
for(int x = 0; x < 10; x++)
{
// Linear search.
if(array[x].equals("Item I am looking for"))
{
//you've found the item. Let's stop.
item = array[x];
break;
}
}
can be refactored (using extract method) to this:
public String searchForItem(String itemIamLookingFor)
{
for(int x = 0; x < 10; x++)
{
if(array[x].equals(itemIamLookingFor))
{
return array[x];
}
}
return null; // not found
}
Which when called from the surrounding code can prove to be more readable.

There are a number of common situations for which break is the most natural way to express the algorithm. They are called "loop-and-a-half" constructs; the paradigm example is
while (true) {
item = stream.next();
if (item == EOF)
break;
process(item);
}
If you can't use break for this you have to repeat yourself instead:
item = stream.next();
while (item != EOF) {
process(item);
item = stream.next();
}
It is generally agreed that this is worse.
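In Java, the loop-and-a-half shape turns up when reading input. The sketch below uses BufferedReader.readLine(), which returns null at end of input; the class name LoopAndAHalf, the method readLines and the trimming step are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class LoopAndAHalf {

    /** Reads all lines of `text`, trimming each one. */
    static List<String> readLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (true) {
                String line = reader.readLine();   // first half: fetch the next item
                if (line == null) break;           // exit test in the middle of the loop
                lines.add(line.trim());            // second half: process it
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readLines("alpha\n beta \ngamma"));   // prints [alpha, beta, gamma]
    }
}
```

The fetch appears exactly once, which is the point of the construct: the version without break would have to call readLine() both before the loop and at the end of its body.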
Similarly, for continue, there is a common pattern that looks like this:
for (item in list) {
if (ignore_p(item))
continue;
if (trivial_p(item)) {
process_trivial(item);
continue;
}
process_complicated(item);
}
This is often more readable than the alternative with chained else if, particularly when process_complicated is more than just one function call.
Further reading: Loop Exits and Structured Programming: Reopening the Debate

If you start to do something like this, then I would say it starts to get a bit strange and you're better off moving it to a separate method that returns a result upon the matchedCondition.
boolean matched = false;
for(int i = 0; i < 10; i++) {
for(int j = 0; j < 10; j++) {
if(matchedCondition) {
matched = true;
break;
}
}
if(matched) {
break;
}
}
To elaborate on how to clean up the above code: you can refactor it by moving the loops into a method that returns instead of using breaks. This is, in general, a better way of dealing with complex or messy breaks.
public boolean matches() {
for(int i = 0; i < 10; i++) {
for(int j = 0; j < 10; j++) {
if(matchedCondition) {
return true;
}
}
}
return false;
}
However, for something simple like the example below, by all means use break!
for(int i = 0; i < 10; i++) {
if(wereDoneHere()) { // we're done, break.
break;
}
}
And by changing the loop variables (i and j in the above case) to force an exit, you would just make the code really hard to read. Also, there could be a case where the upper limits (10 in the example) are variables, so it would be even harder to guess what value to set them to in order to exit the loop. You could of course just set i and j to Integer.MAX_VALUE, but I think you can see this starts to get messy very quickly. :)

No, it is not a bad practice. It is the easiest and most efficient way.

While it's not bad practice to use break and there are many excellent uses for it, it should not be all you rely upon. Almost any use of a break can be written into the loop condition. Code is far more readable when real conditions are used, but in the case of a long-running or infinite loop, breaks make perfect sense. They also make sense when searching for data, as shown above.

If you know in advance where the loop will have to stop, it will probably improve code readability to state the condition in the for, while, or do-while loop.
Otherwise, that's the exact use case for break.

break and continue hurt readability, although they are often useful.
Not as much as the "goto" concept, but almost.
Besides, if you look at newer languages like Scala (inspired by Java and by functional programming languages like OCaml), you will notice that break and continue have simply disappeared.
Especially in functional programming, this style of code is avoided:
Why scala doesn't support break and continue?
To sum up: break and continue are widely used in Java for an imperative style, but to coders who are used to functional programming, they might seem... weird.

checking a value for reset value before resetting it - performance impact?

I have a variable that gets read and updated thousands of times a second. It needs to be reset regularly. But "half" the time, the value is already the reset value. Is it a good idea to check the value first (to see if it needs resetting) before resetting it (a write operation), or should I just reset it regardless? The main goal is to optimize the code for performance.
To illustrate:
Random r = new Random();
int val = Integer.MAX_VALUE;
for (int i=0; i<100000000; i++) {
if (i % 2 == 0)
val = Integer.MAX_VALUE;
else
val = r.nextInt();
if (val != Integer.MAX_VALUE) //skip check?
val = Integer.MAX_VALUE;
}
I tried to use the above program to test the 2 scenarios (by un/commenting the 2nd "if" line), but any difference is masked by the natural variance of the run duration time.
Thanks.

Don't check it.
It's more execution steps = more cycles = more time.
As an aside, you are breaking one of the basic software golden rules: "Don't optimise early". Unless you have hard evidence that this piece of code is a performance problem, you shouldn't be looking at it. (Note that this doesn't mean you code without performance in mind. You still follow normal best practice, but you don't add any special code whose only purpose is "performance related".)

The check has no actual performance impact. We'd be talking about a single clock cycle or something, which is usually not relevant in a Java program (as hard-core number crunching usually isn't done in Java).
Instead, base the decision on readability. Think of the maintainer who's going to change this piece of code five years on.
In the case of your example, using my rationale, I would skip the check.

Most likely the JIT will optimise the code away because it doesn't do anything.
Rather than worrying about performance, it is usually better to worry about what is
simpler to understand
cleaner to implement
In both cases, you might remove the code as it doesn't do anything useful and it could make the code faster as well.
Even if it did make the code a little slower it would be very small compared to the cost of calling r.nextInt() which is not cheap.

For Loops Code Optimization

I had a challenge to print out multiples of 7 (non-negative) to the 50th multiple in the simplest way humanly possible using for loops.
I came up with this (Ignoring the data types)
for(int i = 0; i <= 350; i += 7)
{System.out.println(i);}
The other guy came up with this
for(int i=0;i <=50; i++)
{
System.out.println(7*i);
}
However, I feel the two code snippets could be further optimized. If it actually can please tell. And what are the advantages/disadvantages of one over the other?

If you really want to optimize it, do this:
System.out.print("0\n7\n14\n21\n28\n35\n42\n49\n56\n63\n70\n77\n84\n91\n98\n105\n112\n119\n126\n133\n140\n147\n154\n161\n168\n175\n182\n189\n196\n203\n210\n217\n224\n231\n238\n245\n252\n259\n266\n273\n280\n287\n294\n301\n308\n315\n322\n329\n336\n343\n350");
and it's O(1) :)

The first one technically performs fewer operations (no multiplication).
The second one is slightly more readable (50 multiples of 7 vs. multiples of 7 up to 350).
Probably can't be optimized any further.
Unless you're willing to optimize away multiple println calls by doing:
StringBuilder s = new StringBuilder();
for(int i = 0; i <= 350; i += 7) s.append(i).append(", ");
System.out.println(s.toString());
(IIRC printlns are relatively expensive.)
This is getting to the point where you gain a tiny bit of optimization at the expense of simplicity.

In theory, your code is faster since it needs one less multiplication instruction per loop iteration.
However, the multiple calls to System.out.println (and the integer-to-string conversion) will dwarf the runtime the multiplication takes. To optimize, aggregate the Strings with a StringBuilder and output the whole result (or output the result when memory becomes a problem).
However, in real-world code, this is extremely unlikely to be the bottleneck. Profile, then optimize.

The second function is the best you would get:
O(n)