CPU usage is 100% during Thread.onSpinWait()

I'm writing a backtesting raw data collector for my crypto trading bot and I've run into a weird optimization issue.
I constantly have 30 runnables in an Executors.newCachedThreadPool() running get requests from an API. Since the API has a request limit of 1200 per minute I have this bit of code in my runnable:
while (minuteRequests.get() >= 1170) {
    Thread.onSpinWait();
}
Yes, minuteRequests is an AtomicInteger, so I'm not running into any issues there.
Everything works, the issue is that even though I'm using the recommended busy-waiting onSpinWait method, I shoot from 24% CPU usage or so to 100% when the waiting is initiated. For reference I'm running this on a 3900X (24 thread).
Any recommendations on how to better handle this situation?

My recommendation would be to not do busy waiting at all.
The javadocs for Thread.onSpinWait say this:
Indicates that the caller is momentarily unable to progress, until the occurrence of one or more actions on the part of other activities. By invoking this method within each iteration of a spin-wait loop construct, the calling thread indicates to the runtime that it is busy-waiting. The runtime may take action to improve the performance of invoking spin-wait loop constructions.
Note that the final sentence uses the word may rather than will. That means that it also may not do anything. Also, "improve the performance" does not mean that your code will be objectively efficient.
The javadoc also implies that the improvements may be hardware dependent.
In short, this is the right way to use onSpinWait ... but you are expecting too much of it. It won't make your busy-wait code efficient.
So what would I recommend you actually do?
I would recommend that you replace the AtomicInteger with a Semaphore (javadoc). This particular loop would be replaced by the following:
semaphore.acquire();
This blocks [1] until 1 "permit" is available and acquires it. Refer to the class javadocs for an explanation of how semaphores work.
Note: since you haven't shown us the complete implementation of your rate limiting, it is not clear how your current approach actually works. Therefore, I can't tell you exactly how to replace AtomicInteger with Semaphore throughout.
[1] The blocked thread is "parked" until some other thread releases a permit. While it is parked, the thread does not run and is not associated with a CPU core. The core is either left idle (typically in a low power state) or it is assigned to some other thread. This is typically handled by the operating system's thread scheduler. When another thread releases a permit, the Semaphore.release method will tell the OS to unpark one of the threads that is blocked in acquire.
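To make the Semaphore suggestion concrete, here is a minimal sketch of one possible shape. It assumes a fixed-window limit that is topped up once per period by a scheduler thread; the class name MinuteRateLimiter and its methods are made up for illustration and are not from the question's code. A fixed window can still allow a burst across the window boundary, so treat this as a starting point only.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: a fixed-window rate limiter backed by a Semaphore.
class MinuteRateLimiter {
    private final int maxPerWindow;
    private final Semaphore permits;

    MinuteRateLimiter(int maxPerWindow) {
        this.maxPerWindow = maxPerWindow;
        this.permits = new Semaphore(maxPerWindow);
    }

    // Blocks (the thread is parked, not spinning) until a permit is available.
    void acquire() throws InterruptedException {
        permits.acquire();
    }

    // Non-blocking variant: returns false when the current window is used up.
    boolean tryAcquire() {
        return permits.tryAcquire();
    }

    // Starts a new window: discard leftover permits, then hand out a full budget.
    void refill() {
        permits.drainPermits();
        permits.release(maxPerWindow);
    }

    // Schedules refill() once per period on the given scheduler.
    void startRefilling(ScheduledExecutorService scheduler, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(this::refill, period, period, unit);
    }
}
```

Each worker would call limiter.acquire() before sending a request. Workers never release permits themselves; only refill() does, which is what turns the semaphore into a per-window budget.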

Related

Why prefer wait/notify to while cycle?

I have some misunderstanding about the advantages of wait/notify. As I understand it, the processor core will do nothing helpful in both cases, so what's the reason to write complex wait/notify code blocks instead of just waiting in a cycle?
I'm clear that wait/notify will not steal processor time in case when two threads are executed on only one core.

"Waiting in a cycle" is most commonly referred to as a "busy loop" or "busy wait":
while ( ! condition()) {
// do nothing
}
workThatDependsOnConditionBeingTrue();
This is very disrespectful of other threads or processes that may need CPU time (it takes 100% time from that core if it can). So there is another variant:
while ( ! condition()) {
sleepForShortInterval();
// do nothing
}
workThatDependsOnConditionBeingTrue();
The small sleep in this variant will drop CPU usage dramatically, even if it is ~100ms long, which should not be noticeable unless your application is real-time.
Note that there will generally be a delay between when the condition actually becomes true and when sleepForShortInterval() ends. If, to be more polite to others, you sleep longer -- the delay will increase. This is generally unacceptable in real-time scenarios.
The nice way to do this, assuming that whatever condition() is checking is being changed from another thread, is to have the other thread wake you up when it finishes whatever you are waiting for. Cleaner code, no wasted CPU, and no delays.
Of course, it's quicker to implement a busy wait, and it may be justified for quick'n'dirty situations.
Beware that, in a multithreaded scenario where condition() can be changed to false as well as true, you will need to protect your code between the while and the workThatDependsOnConditionBeingTrue() to avoid other threads changing its value at this precise point in time (this is called a race condition, and is very hard to debug after the fact).
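As a rough illustration of "have the other thread wake you up", here is a small sketch using wait/notifyAll. The class ReadyFlag and its method names are invented for this example; the condition being waited on is just a boolean.

```java
// Hypothetical example class: a one-shot "ready" flag guarded by its own monitor.
class ReadyFlag {
    private boolean ready = false;

    // Called by the thread that makes the condition true.
    synchronized void markReady() {
        ready = true;
        notifyAll(); // wake every thread parked in awaitReady()
    }

    // Called by waiting threads; blocks without consuming CPU.
    synchronized void awaitReady() throws InterruptedException {
        while (!ready) { // re-check after every wakeup
            wait();      // releases the monitor while the thread is parked
        }
    }

    synchronized boolean isReady() {
        return ready;
    }
}
```

The waiting thread calls awaitReady() and then does its dependent work; the other thread calls markReady() when it is done. Because both the check and the change happen while holding the same monitor, the check cannot race with the update.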

I think you answered your question almost by saying
I'm clear that wait/notify will not steal processor time in case.
The only thing I would add is that this is true irrespective of one core or multi-core. wait/notify won't keep the CPU in a busy-wait situation, unlike a while loop or periodic check.
what's the reason not to run core but wait? There's no helpful work in any case and you're unable to use core when it's in waiting state.
I think you are looking at it from a single-application perspective, where there is only one application with one thread running. Think of it from a real-world application (like web/app servers or standalone) where there are many threads running and competing for CPU cycles, and you can see the advantage of wait/notify. You would definitely not want even a single thread to just do a busy-wait and burn CPU cycles.
Even if it is a single application/thread running on the system, there are always OS processes and related processes that keep competing for the CPU cycles. You don't want to starve them because the application is doing a while busy-wait.
Quoting from Gordon's comment
waiting in cycle as you suggest you are constantly checking whether the thing you are waiting for has finished, which is wasteful and if you use sleeps you are just guessing with timing, whereas with wait/notify you sit idle until the process that you are waiting on tells you it is finished.

In general, your application is not the only one running on the CPU. Using non-spinning waiting is, first of all, an act of courtesy towards the other processes/threads which are competing for the CPU in order to do some useful job. The CPU scheduler cannot know a-priori if your thread is going to do something useful or just spin on a false flag. So, it can't tune itself based on that, unless you tell it you don't want to be run, because there's nothing for you to do.
Indeed, busy-waiting is faster than getting the thread to sleep, and that's why usually the wait() method is implemented in a hybrid way. It first spins for a while, and then it actually goes to sleep.
Besides, it's not just waiting in a loop. You still need to synchronize access to the resources you're spinning on. Otherwise, you'll fall victim to race conditions.
If you feel the need of a simpler interface, you might also consider using CyclicBarrier, CountDownLatch or a SynchronousQueue.

difference between wait() and yield()

So far what I have understood about the wait() and yield() methods is that yield() is called when the thread is not carrying out any task and lets the CPU execute some other thread. wait() is used when some thread is put on hold and is usually used in the context of synchronization. However, I fail to understand the difference in their functionality and I'm not sure if what I have understood is right or wrong. Can someone please explain the difference between them (apart from the package they are present in)?

aren't they both doing the same task - waiting so that other threads can execute?
Not even close, because yield() does not wait for anything.
Every thread can be in one of a number of different states: Running means that the thread is actually running on a CPU, Runnable means that nothing is preventing the thread from running except, maybe the availability of a CPU for it to run on. All of the other states can be lumped into a category called blocked. A blocked thread is a thread that is waiting for something to happen before it can become runnable.
The operating system preempts running threads on a regular basis: Every so often (between 10 times per second and 100 times per second on most operating systems) the OS tags each running thread and says, "your turn is up, go to the back of the run queue" (i.e., change state from running to runnable). Then it lets whatever thread is at the head of the run queue use that CPU (i.e., become running again).
When your program calls Thread.yield(), it's saying to the operating system, "I still have work to do, but it might not be as important as the work that some other thread is doing. Please send me to the back of the run queue right now." If there is an available CPU for the thread to run on though, then it effectively will just keep running (i.e., the yield() call will immediately return).
When your program calls foobar.wait() on the other hand, it's saying to the operating system, "Block me until some other thread calls foobar.notify()."
Yielding was first implemented on non-preemptive operating systems and in non-preemptive threading libraries. On a computer with only one CPU, the only way that more than one thread ever got to run was when the threads explicitly yielded to one another.
Yielding also was useful for busy waiting. That's where a thread waits for something to happen by sitting in a tight loop, testing the same condition over and over again. If the condition depended on some other thread to do some work, the waiting thread would yield() each time around the loop in order to let the other thread do its work.
Now that we have preemption and multiprocessor systems and libraries that provide us with higher-level synchronization objects, there is basically no reason why an application program would need to call yield() anymore.

wait is for waiting on a condition. This might not jump into the eye when looking at the method as it is entirely up to you to define what kind of condition it is. But the API tries to force you to use it correctly by requiring that you own the monitor of the object on which you are waiting, which is necessary for a correct condition check in a multi-threaded environment.
So a correct use of wait looks like:
synchronized(object) {
    while( ! /* your defined condition */)
        object.wait();
    /* execute other critical actions if needed */
}
And it must be paired with another thread executing code like:
synchronized(object) {
    /* make your defined condition true */
    object.notify();
}
In contrast Thread.yield() is just a hint that your thread might release the CPU at this point of time. It’s not specified whether it actually does anything and, regardless of whether the CPU has been released or not, it has no impact on the semantics in respect to the memory model. In other words, it does not create any relationship to other threads which would be required for accessing shared variables correctly.
For example the following loop accessing sharedVariable (which is not declared volatile) might run forever without ever noticing updates made by other threads:
while(sharedVariable != expectedValue) Thread.yield();
While Thread.yield might help other threads to run (they will run anyway on most systems), it does not enforce re-reading the value of sharedVariable from the shared memory. Thus, without other constructs enforcing memory visibility, e.g. declaring sharedVariable as volatile, this loop is broken.
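For comparison, here is a minimal sketch of the same loop with the field declared volatile, so that each iteration is guaranteed to observe writes from other threads. The class VisibleFlag is a made-up wrapper for illustration. This only fixes visibility; it is still a busy wait and still burns CPU.

```java
// Hypothetical wrapper around the shared variable from the loop above.
class VisibleFlag {
    // volatile: every read in the loop sees the latest write from any thread
    private volatile int sharedVariable = 0;

    void set(int value) {
        sharedVariable = value;
    }

    // Busy-waits until the expected value is observed, then returns.
    void spinUntil(int expectedValue) {
        while (sharedVariable != expectedValue) {
            Thread.yield(); // only a scheduling hint; visibility comes from volatile
        }
    }
}
```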

The first difference is that yield() is a method of Thread, whereas wait() is originally a method of Object, inherited by Thread as by every other class. That is the difference in form; as for what they do (quoting the Javadoc):
wait()
Causes the current thread to wait until another thread invokes the notify() method or the notifyAll() method for this object. In other words, this method behaves exactly as if it simply performs the call wait(0).
yield()
A hint to the scheduler that the current thread is willing to yield its current use of a processor. The scheduler is free to ignore this hint.
and here you can see the difference between yield() and wait()

Yield(): When a running thread is stopped to give its place to another thread with a higher priority, this is called yield. Here the running thread changes to a runnable thread.
Wait(): A thread is waiting to get resources from a thread to continue its execution.

I've been taught that conditions in concurrency do not necessarily need to be written in a loop, against what the oracle doc says

So basically I am learning a bit more serious concurrency (studying how things actually work, instead of just using random stuff if needed).
And my professor, when I asked him about this, told me that he and his colleagues hadn't been able to reproduce a spurious wakeup, and believes that line is an old line that was never deleted (like, it was there, Java got "better", it's no longer needed, but the line is still there).
Link:
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Condition.html
It's right below the point called:
Implementation Considerations
In his opinion, a condition that looked kind of like this:
lock.lock();
if (p > q) {
    lock.newCondition().await();
}
Would be perfectly fine, since he says a spurious wake up can not happen, it wouldn't be needed a loop:
lock.lock();
while (p > q) {
    lock.newCondition().await();
}
I am MORE than likely mixing things up and understanding both the doc and my teacher the wrong way, but I have spent some time trying to understand the reason for each, and can't come up with an "answer" of my own; I either believe one or the other (not that it matters, it's pure I-want-to-learn).
My teacher does spend time telling us how explaining concurrency in Java is pretty silly, but I didn't choose it either, so there's that.

Would be perfectly fine, since he says a spurious wake up can not happen, it wouldnt be needed a loop:
Your teacher is wrong for two reasons:
Spurious wakeups do happen. It may not happen on the architecture that they tested on but if you don't take it into account, when you move your application to a different piece of hardware or a different OS revision, you will see problems. It may also be that the spurious wakeups happen occasionally during an exceptional kernel event such as a signal getting delivered at precisely the wrong time. Again, your application may run fine in testing but when you move it into production with a lot higher load, the frequency of the exceptional event may increase...
The underlying problem is that certain native thread implementations may choose to wakeup all conditions associated with an application instead of the specific one that was notified. This is well documented in the javadocs for Object.wait():
As in the one argument version, interrupts and spurious wakeups are possible, and this method should always be used in a loop:
Here's one example of an architecture that has this limitation. I'll quote from this interesting blog entry:
Internally, wait is implemented as a call to the 'futex' system call. Each blocking system call on Linux returns abruptly when the process receives a signal -- because calling signal handler from kernel call is tricky. What if the signal handler calls some other system function? And a new signal arrives? It's easy to run out of kernel stack for a process. Exactly because each system call can be interrupted, when glibc calls any blocking function, like 'read', it does it in a loop, and if 'read' returns EINTR, calls 'read' again.
The while loop is also very important to protect against race conditions -- especially in multiple thread producer/consumer models. If you have multiple threads that are consuming from a queue (for example), a notification that there are items in the queue may wakeup a thread but by the time it is able to get the lock, another thread has already dequeued the item.
This is well documented on my page here with a sample program that demonstrates the race condition without the use of while.
Producer Consumer Thread Race Conditions
In your example, thread A may be waiting in await() while another thread B may be waiting to get the lock(). Thread C has the lock and is adding to the queue.
// B is here waiting for the lock
lock.lock();
while (p > q) {
    // A is here waiting for the signal
    lock.newCondition().await();
}
// dequeue
lock.unlock();
Then if the producer adds something to the queue and calls signal() the thread A moves from the WAIT state to the BLOCKED state to get the lock itself. But it may be behind thread B which is already waiting. Once the lock is released, thread B dequeues the element, not thread A. When thread A then gets a chance to dequeue, the queue is empty. Without the while loop, you can get out-of-bounds exceptions or other problems by trying to dequeue from an empty queue.
See my link for more explicit details of the race.
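Here is a small sketch of the loop pattern applied to a queue, under a few assumptions of mine. The class GuardedQueue is invented for illustration (the JDK's LinkedBlockingQueue is what you would use in practice). Note also that it creates the Condition once and keeps it in a field: calling lock.newCondition() on every wait, as in the snippets above, produces a fresh Condition that no other thread can ever signal.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical minimal unbounded queue guarded by a lock and one condition.
class GuardedQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    // Created once and shared, so that signal() reaches the threads in await().
    private final Condition notEmpty = lock.newCondition();
    private final Deque<T> items = new ArrayDeque<>();

    void put(T item) {
        lock.lock();
        try {
            items.addLast(item);
            notEmpty.signal(); // wake one waiting consumer, if any
        } finally {
            lock.unlock();
        }
    }

    T take() throws InterruptedException {
        lock.lock();
        try {
            // while, not if: another consumer may have emptied the queue
            // between the signal and this thread re-acquiring the lock.
            while (items.isEmpty()) {
                notEmpty.await();
            }
            return items.removeFirst();
        } finally {
            lock.unlock();
        }
    }
}
```

With an if instead of the while, thread A in the scenario above would fall through to removeFirst() on an empty deque and get a NoSuchElementException.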

It is still necessary. Your professor is not necessarily incorrect, but has created a strawman argument to knock down.
There are two reasons why you must protect your conditions in a loop.
The first is spurious wake-up. Your professor seems to have been unable to reproduce this, and it may well not be a problem on the platforms he tests on. This does not mean it is unreproducible on all platforms.
The second is that between the times that you wake up and actually go to do the logic, the condition may no longer be true. You must guard against this potential race condition. This is also notoriously difficult to reproduce in the lab, and will probably only happen in bizarre circumstances in production.

Java performance issue with Thread.sleep()

Inline Java IDE hint states, "Invoking Thread.sleep in loop can cause performance problems." I can find no elucidation elsewhere in the docs re. this statement.
Why? How? What other method might there be to delay execution of a thread?

It is not that Thread.sleep in a loop itself is a performance problem, but it is usually a hint that you are doing something wrong.
while(! goodToGoOnNow()) {
Thread.sleep(1000);
}
Use Thread.sleep only if you want to suspend your thread for a certain amount of time. Do not use it if you want to wait for a certain condition.
For this situation, you should use wait/notify instead or some of the constructs in the concurrency utils packages.
Polling with Thread.sleep should be used only when waiting for conditions external to the current JVM (for example waiting until another process has written a file).

It depends on whether the wait is dependent on another thread completing work, in which case you should use guarded blocks, or the high-level concurrency classes introduced in Java 5 (java.util.concurrent). I recently had to fix some CircularByteBuffer code that used Thread sleeps instead of guarded blocks. With the previous method, there was no way to ensure proper concurrency. If you just want the thread to sleep as a game might, in the core game loop to pause execution for a certain amount of time so that other threads have a good period in which to execute, Thread.sleep(..) is perfectly fine.

It depends on why you're putting it to sleep and how often you run it.
I can think of several alternatives that could apply in different situations:
Let the thread die and start a new one later (creating threads can be expensive too)
Use Thread.join() to wait for another thread to die
Use Thread.yield() to allow another thread to run
Let the thread run but set it to a lower priority
Use wait() and notify()

http://www.jsresources.org/faq_performance.html
1.6. What precision can I expect from Thread.sleep()?
The fundamental problem with short sleeps is that a call to sleep finishes the current scheduling time slice. Only after all other threads/process finished, the call can return.
For the Sun JDK, Thread.sleep(1) is reported to be quite precise on Windows. For Linux, it depends on the timer interrupt of the kernel. If the kernel is compiled with HZ=1000 (the default on alpha), the precision is reported to be good. For HZ=100 (the default on x86) it typically sleeps for 20 ms.
Using Thread.sleep(millis, nanos) doesn't improve the results. In the Sun JDK, the nanosecond value is just rounded to the nearest millisecond. (Matthias)

why? that is because of context switching (part of the OS CPU scheduling)
How? Calling Thread.sleep(t) moves the current thread from the running queue to the waiting queue. After the time 't' has elapsed, the current thread is moved from the waiting queue to the ready queue, and then it takes some time to be picked by the CPU and start running.
Solution: call Thread.sleep(t*10) once instead of calling Thread.sleep(t) inside a loop of 10 iterations.

I have faced this problem before when waiting for an asynchronous process to return a result.
Thread.sleep is a problem in a multithreaded scenario. It tends to oversleep. This is because internally it rearranges its priority and yields to other long-running processes (threads).
A newer approach is to use the ScheduledExecutorService interface or the ScheduledThreadPoolExecutor introduced in Java 5.
Reference: http://download.oracle.com/javase/1,5.0/docs/api/java/util/concurrent/ScheduledExecutorService.html
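As a rough sketch of what that can look like, the following replaces a "do something, sleep, repeat" loop with a task scheduled at a fixed rate. The class PeriodicTaskDemo and the method runPeriodically are names I made up; the latch is only there so the example can stop after a known number of runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: run a task periodically without a sleep loop.
class PeriodicTaskDemo {
    // Runs task every periodMillis until it has run `times` times; returns the run count.
    static int runPeriodically(Runnable task, int times, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(times);
        scheduler.scheduleAtFixedRate(() -> {
            if (done.getCount() == 0) {
                return; // already finished; ignore any late tick
            }
            task.run();
            runs.incrementAndGet();
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(); // the caller is parked here, not polling
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }
}
```

For example, PeriodicTaskDemo.runPeriodically(() -> System.out.println("poll"), 3, 100) would run the task three times, roughly 100 ms apart.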

It might NOT be a problem, it depends.
In my case, I use Thread.sleep() to wait for a couple of seconds before another reconnect attempt to an external process. I have a while loop for this reconnect logic until it reaches the max # of attempts. So in my case, Thread.sleep() is purely for timing purposes and not for coordinating among multiple threads, so it's perfectly fine.
You can configure in your IDE how this warning should be handled.

I suggest looking into the CountDownLatch class. There are quite a few trivial examples out there online. Back when I just started multithreaded programming they were just the ticket for replacing a "sleeping while loop".
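A minimal sketch of that replacement, with invented names (LatchDemo, runAndWait): the main thread waits on the latch instead of sleeping and re-checking a flag, and each worker counts down when it finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: wait for N workers without a sleeping while loop.
class LatchDemo {
    // Starts `workers` threads and blocks until all have finished or the timeout elapses.
    static boolean runAndWait(int workers, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                // ... real work would go here ...
                done.countDown(); // signal that this worker is finished
            }).start();
        }
        try {
            // Returns true as soon as the count reaches zero, false on timeout.
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```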

Deadlock in a single threaded java program [duplicate]

Read that deadlock can happen in a single threaded java program. I am wondering how since there won't be any competition after all. As far as I can remember, books illustrate examples with more than one thread. Can you please give an example if it can happen with a single thread.

It's a matter of how exactly you define "deadlock".
For example, this scenario is somewhat realistic: a single-threaded application that uses a size-limited queue that blocks when its limit is reached. As long as the limit is not reached, this will work fine with a single thread. But when the limit is reached, the thread will wait forever for a (non-existing) other thread to take something from the queue so that it can continue.
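A small sketch of that scenario, using ArrayBlockingQueue as the size-limited queue (my choice; the answer does not name a class). To keep the example from hanging, it uses offer with a timeout where the stuck program would call put.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a single thread filling a bounded queue that nobody drains.
class SingleThreadQueueDemo {
    // Returns true if a second insert into a capacity-1 queue succeeds within the timeout.
    static boolean secondOfferSucceeds(long timeoutMillis) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        try {
            queue.put("first"); // fits: the queue was empty
            // queue.put("second") would block forever here, because no other
            // thread exists to call take(). offer with a timeout gives up instead.
            return queue.offer("second", timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```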

Before multicore processors became cheap, all desktop computers had single-core processors. A single-core processor runs only one thread at a time. So how did multithreading work then? The simplest implementation for Java would be:
thread1's code:
doSomething();
yield(); // may switch to another thread
doSomethingElse();
thread2's code:
doSomething2();
yield(); // may switch to another thread
doSomethingElse2();
This is called cooperative multithreading: everything is done with just 1 thread, and this is how multithreading was done in Windows 3.1.
Today's multithreading, called preemptive multithreading, is just a slight modification of cooperative multithreading where this yield() is called automatically from time to time.
All that may reduce to the following interleavings:
doSomething();
doSomething2();
doSomethingElse2();
doSomethingElse();
or:
doSomething();
doSomething2();
doSomethingElse();
doSomethingElse2();
And so on... We converted multithreaded code to single-threaded code. So yes, if a deadlock is possible in a multithreaded program, it is possible in a single-threaded one as well. For example:
thread1:
queue.put(x);
yield();
thread2:
x = queue.waitAndGet()
yield();
It's OK with this interleaving:
queue.put(x);
x = queue.waitAndGet()
But here we get deadlock:
x = queue.waitAndGet()
queue.put(x);
So yes, deadlocks are possible in single-threaded programs.

Well I dare say yes
If you try to acquire the same lock within the same thread consecutively, it depends on the type of lock or locking implementation whether it checks if the lock is acquired by the same thread. If the implementation does not check this, you have a deadlock.
For synchronized this is checked, but I could not find the guarantee for Semaphore.
If you use some other type of lock, you have to check the spec as how it is guaranteed to behave!
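To illustrate the difference, here is a small sketch (class and method names are mine). ReentrantLock, like synchronized, tracks its owner and lets the same thread lock again. Semaphore has no notion of an owner, so a second acquire by the same thread just finds no permit. The sketch uses tryLock/tryAcquire so that it reports the outcome instead of hanging.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: re-acquiring from the same thread.
class ReentrancyDemo {
    // A ReentrantLock can be taken again by the thread that already holds it.
    static boolean lockIsReentrant() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            boolean again = lock.tryLock(); // succeeds: same owner
            if (again) {
                lock.unlock(); // undo the second hold
            }
            return again;
        } finally {
            lock.unlock();
        }
    }

    // A Semaphore has no owner, so a second acquire finds no permit left.
    static boolean semaphoreIsReentrant() {
        Semaphore semaphore = new Semaphore(1);
        semaphore.acquireUninterruptibly();
        try {
            // A blocking acquire() here would wait forever in a single thread.
            return semaphore.tryAcquire();
        } finally {
            semaphore.release();
        }
    }
}
```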
Also as has already been pointed out, you may block (which is different from deadlock) by reading/ writing to a restricted buffer. For instance you write things into a slotted buffer and only read from it on certain conditions. When you can no longer insert, you wait until a slot becomes free, which won't happen since you yourself do the reading.
So I daresay the answer should be yes, albeit not that easy and usually easier to detect.
hth
Mario

Even if your java stuff is single-threaded there are still signal handlers, which are executed in a different thread/context than the main thread.
So, a deadlock can indeed happen even on single-threaded solutions, if/when java is running on linux.
QED.
-pbr

No, sounds pretty impossible to me.
But you could theoretically lock a system resource while another app locks another that you're going to request and that app is going to request the one you've already locked. Bang Deadlock.
But the OS should be able to sort this out by detecting it and giving both resources to one app at a time. The chances of this happening are slim to none, but any good OS should be able to handle this one-in-a-billion chance.
If you design carefully and only lock one resource at a time, this cannot happen.

No.
Deadlock is a result of multiple threads (or processes) attempting to acquire locks in such a way that neither can continue.
Consider a quote from the Wikipedia article: (http://en.wikipedia.org/wiki/Deadlock)
"When two trains approach each other at a crossing, both shall come to a full stop and neither shall start up again until the other has gone."

It is actually quite easy:
BlockingQueue<Object> bq = new ArrayBlockingQueue<>(1);
bq.take(); // the queue is empty and no other thread will ever put
will deadlock.
