What exactly is a blocking method in Java? - java

The definition of a blocking method is very clear. There is still something that puzzles me. In a Java program, if I create a thread and try to take() from an empty BlockingQueue, that thread becomes in WAITING state according to the debugger. This is as expected.
On the other hand, if I create a thread and try to call accept() method of ServerSocket class(This is also a blocking code according to the JavaDoc), I see that this thread always in RUNNING state.
I am expecting a blocking method to be parked with monitors in Java. If a method is blocking like ServerSocket::accept, how come this method does not progress accept line and still have the status of RUNNING?

There's the concept of 'a blocking call' / 'a blocking method', as one might use in conversation, or as a tutorial might use, or even as a javadoc might use it.
Then there is the specific and extremely precisely defined java.lang.Thread state of BLOCKING.
The two concepts do not align, as you've already figured out with your test. The BLOCKING state effectively means 'blocking by way of this list of mechanisms that can block' (mostly, waiting to acquire a monitor, i.e. what happens when you enter a synchronized(x) block or try to pick up again from an x.wait() call: In both cases the thread needs to become 'the thread' that owns the lock on the object x is pointing at, and if it can't do that because another thread holds it, then the thread's state becomes BLOCKING.
This is spelled out in the javadoc. Here's the quote:
A thread that is blocked waiting for a monitor lock is in this state.
('monitor lock' is JVM-ese for the mechanism that synchronized and obj.wait/notify/notifyAll work with, and nothing else).
Keep reading the full javadoc of that page, as the detailed descriptions of these states usually spell out precisely which methods can cause these states.
This lets you figure out that if you write this code:
synchronized (foo) {
foo.wait();
}
then that thread goes through these states, in the worst case scenario)
RUNNING -> BLOCKED (another thread is in a synchronized(foo) block already).
BLOCKED -> RUNNING (that other thread is done)
RUNNING -> WAITING (obj.wait() is called, now waiting for a notify)
WAITING -> BLOCKED (we've been notified, but the thread still cannot continue until that monitor is picked up again, that's how wait and notify work).
BLOCKED -> RUNNING (got the lock on foo)
So why is my I/O thing on RUNNING then?
Unfortunately, I/O-related blocking is highly undefined behaviour.
However, I can explain a common scenario (i.e. what most combinations of OS, hardware, and JVM provider end up doing).
The reason for the undefined behaviour is the same reason for the RUNNING state: Java is supposed to run on a lot of hardware/operation system combos, and most of the I/O is, in the end, just stuff that java 'farms out' to the OS, and the OS does whatever the OS is going to do. Java is not itself managing the state of these threads, it just calls on the OS to do a thing, and then the OS ends up blocking, waiting, etc. Java doesn't need to manage it; not all OSes even allow java to attempt to update this state, and in any case there'd be no point whatsoever to it for the java process, it would just slow things down, and add to the pile of 'code that needs to be custom written for every OS that java should run on', and that's a pile you'd prefer remain quite small. The only benefit would be for you to write code that can programatically inspect thread states... and that's more a job for agents, not for java code running inside the same VM.
But, as I said, undefined, mostly. Don't go relying on the fact that x.get() on a socket's InputStream will keep the thread in RUNNING state.
Similar story when you try to interrupt() a thread that is currently waiting in I/O mode. That means the I/O call that is currently waiting for data might exit immediately with some IOException (not InterruptedException, though, that is guaranteed; InterruptedException is checked, InputStream.read() isn't declared to throw it, therefore, it won't) - or, it might do nothing at all. Depends on OS, java version, hardware, vendor, etc.

So the thread states don’t match up with OS thread states. They are defined in https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.State.html:
public static enum Thread.State
extends Enum<Thread.State>
A thread state. A thread can be in one of the following states:
NEW
A thread that has not yet started is in this state.
RUNNABLE
A thread executing in the Java virtual machine is in this state.
BLOCKED
A thread that is blocked waiting for a monitor lock is in this state.
WAITING
A thread that is waiting indefinitely for another thread to perform a particular action is in this state.
TIMED_WAITING
A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.
TERMINATED
A thread that has exited is in this state.
A thread can be in only one state at a given point in time. These states are virtual machine states which do not reflect any operating system thread states.
When we say something is blocked or waiting, we have broad ideas about what that means. But blocked here doesn’t mean blocked on I/O, it doesn’t mean blocked trying to acquire a ReentrantLock, it specifically means blocked trying to acquire a monitor lock. So that is why your socket accept call shows the thread as running, the definitions are very narrow. Read the rest of the linked Java doc, it is extremely specific about what qualifies as a given state.

Related

Is atomicity truly uninterruptible or virtually uninterruptible?

For example, I have a method called increase.
public synchronized void increase() {
count++;
}
Two threads (T_1 and T_2) both execute this method. We know count++ is a compound operation that consists of read, modify and write. If T_1 gets the lock firstly, execute read, can T_1 be interrupted at this point (Although T_2 can't do anything besides waiting for the lock to be released)?
From Concurrency in Go by Katherine Cox-Buday, it says:
When something is considered atomic, or to have the property of atomicity, this means that within the context that it is operating, it is indivisible, or uninterruptible.
I think it means that atomicity is truly uninterruptible.
But from one answer to java - What does "atomic" mean in programming? - Stack Overflow, it says:
"Atomic operation" means an operation that appears to be instantaneous from the perspective of all other threads. You don't need to worry about a partly complete operation when the guarantee applies.
I think it means that atomicity is virtually uninterruptible (it can be interrupted, but from the perspective of all other threads, it seems it can't be interrupted.).
So, which one is right?
A it was pointed out there is now way for your program to understand the difference. But threads scheduler on OS level is free to suspend a thread and resume another thread.
For example if inside synchronized method you are doing some blocking I/O (like reading user input or reading from socket) scheduler can detect it and pause a thread which holds lock but is blocked on I/O and try to resume any other thread (maybe even the one which is also waiting for the same lock).

difference between wait() and yield()

So far what I have understood about wait() and yield () methods is that yield() is called when the thread is not carrying out any task and lets the CPU execute some other thread. wait() is used when some thread is put on hold and usually used in the concept of synchronization. However, I fail to understand the difference in their functionality and i'm not sure if what I have understood is right or wrong. Can someone please explain the difference between them(apart from the package they are present in).
aren't they both doing the same task - waiting so that other threads can execute?
Not even close, because yield() does not wait for anything.
Every thread can be in one of a number of different states: Running means that the thread is actually running on a CPU, Runnable means that nothing is preventing the thread from running except, maybe the availability of a CPU for it to run on. All of the other states can be lumped into a category called blocked. A blocked thread is a thread that is waiting for something to happen before it can become runnable.
The operating system preempts running threads on a regular basis: Every so often (between 10 times per second and 100 times per second on most operating systems) the OS tags each running thread and says, "your turn is up, go to the back of the run queue' (i.e., change state from running to runnable). Then it lets whatever thread is at the head of the run queue use that CPU (i.e., become running again).
When your program calls Thread.yield(), it's saying to the operating system, "I still have work to do, but it might not be as important as the work that some other thread is doing. Please send me to the back of the run queue right now." If there is an available CPU for the thread to run on though, then it effectively will just keep running (i.e., the yield() call will immediately return).
When your program calls foobar.wait() on the other hand, it's saying to the operating system, "Block me until some other thread calls foobar.notify().
Yielding was first implemented on non-preemptive operating systems and, in non-preemptive threading libraries. On a computer with only one CPU, the only way that more than one thread ever got to run was when the threads explicitly yielded to one another.
Yielding also was useful for busy waiting. That's where a thread waits for something to happen by sitting in a tight loop, testing the same condition over and over again. If the condition depended on some other thread to do some work, the waiting thread would yield() each time around the loop in order to let the other thread do its work.
Now that we have preemption and multiprocessor systems and libraries that provide us with higher-level synchronization objects, there is basically no reason why an application programs would need to call yield() anymore.
wait is for waiting on a condition. This might not jump into the eye when looking at the method as it is entirely up to you to define what kind of condition it is. But the API tries to force you to use it correctly by requiring that you own the monitor of the object on which you are waiting, which is necessary for a correct condition check in a multi-threaded environment.
So a correct use of wait looks like:
synchronized(object) {
while( ! /* your defined condition */)
object.wait();
/* execute other critical actions if needed */
}
And it must be paired with another thread executing code like:
synchronized(object) {
/* make your defined condition true */)
object.notify();
}
In contrast Thread.yield() is just a hint that your thread might release the CPU at this point of time. It’s not specified whether it actually does anything and, regardless of whether the CPU has been released or not, it has no impact on the semantics in respect to the memory model. In other words, it does not create any relationship to other threads which would be required for accessing shared variables correctly.
For example the following loop accessing sharedVariable (which is not declared volatile) might run forever without ever noticing updates made by other threads:
while(sharedVariable != expectedValue) Thread.yield();
While Thread.yield might help other threads to run (they will run anyway on most systems), it does not enforce re-reading the value of sharedVariable from the shared memory. Thus, without other constructs enforcing memory visibility, e.g. decaring sharedVariable as volatile, this loop is broken.
The first difference is that yield() is a Thread method , wait() is at the origins Object method inheritid in thread as for all classes , that in the shape, in the background (using java doc)
wait()
Causes the current thread to wait until another thread invokes the notify() method or the notifyAll() method for this object. In other words, this method behaves exactly as if it simply performs the call wait(0).
yield()
A hint to the scheduler that the current thread is willing to yield its current use of a processor. The scheduler is free to ignore this hint.
and here you can see the difference between yield() and wait()
Yield(): When a running thread is stopped to give its space to another thread with a high priority, this is called Yield.Here the running thread changes to runnable thread.
Wait(): A thread is waiting to get resources from a thread to continue its execution.

Thread behaviour

From a Thread perspective, what is a block, wait and lock? Rather,is it necessary to have all these three in any operation? For example, in a producer-consumer pattern how this things are implemented.
Thanks in advance
A blocking operation is one that blocks the thread until the operation completes. Blocking a thread is the process of telling the thread scheduler (usually the operating system, although there are user-level thread libraries) not to run a thread until that thread is woken up. There are many kinds of blocking operations, and one example is file I/O. As with any other blocking operation, the method doesn't return until the relevant operation (in this case, file I/O) has completed.
A wait is a particular kind of blocking operation used for thread synchronization. Specifically, it says "please block the thread that called wait until some other thread wakes it up." In Java, wait is a method. The corresponding wake-up method is notify.
A lock is a higher-level abstraction that says "only allow a limited number of threads into this region of code." Most commonly, that limited number is 1, in which case a mutex (which I explain in plenty of detail in this SO answer) is the preferred locking primitive in a lower-level language like C. In Java, the most common locking primitive is called a monitor. There is a notion of owning an object's monitor (every object has a monitor), and waiting on a monitor, and waking up a thread that is waiting on a monitor. How do we accomplish this? You guessed it - we use the wait method to wait on a monitor, and notify to wake up one of the threads that is waiting on the monitor.
Now an answer that will probably sound a bit like Greek, given that you are just starting with concurrency: To implement the producer-consumer pattern, the most common strategy is to use two semaphores (plus a mutex to synchronize access to the buffer). A semaphore is usually implemented with a mutex, but is a higher-order construct because it allows counting some resource. So you keep one semaphore to count the number of items in the buffer, and one to count the number of empty spaces in the buffer. The producer waits on the empty space semaphore and adds items to the buffer whenever space becomes available, and the consumer waits on the items semaphore and consumes an item whenever an item becomes available.
Now I've defined what these things are, but I haven't really talked about how to use them. That, however, is worth several lectures in a college course, and is certainly too much for a StackOverflow answer. I'd recommend the concurrency lessons in the Java tutorials as a way to get started with threading. Also, look up college courses on the web. Many schools post notes publicly online, so with a little searching you can often find high-quality material.
EDIT: A description of the difference between wait and blocking I/O
Before you begin reading this, make sure you're familiar with what a thread is, and what a process is. I give an explanation in the first four paragraphs of this SO answer, and Wikipedia has a more detailed explanation (albeit with less historical context).
Each thread has one very important piece of information: an instruction pointer (there are other important pieces of information associated with each thread, but they aren't important now). The instruction pointer is a JVM-maintained pointer to the currently executing bytecode instruction. Every time you execute an instruction (each instruction is an abstract representation of a very simple operation, such as "call method foo on object x), the instruction pointer is moved forward to some "next instruction." To run your program, the JVM sets the instruction pointer to the beginning of main and keeps executing instructions and moving the instruction pointer forward until the program exits somehow.
A blocking operation stops the instruction pointer from moving forward until some event occurs to cause the instruction pointer to move forward again. Certainly the thread that initiated the blocking operation can't make this event happen, because that thread's instruction pointer isn't moving forward i.e. that thread is doing nothing.
Now, there are a lot of different kinds of blocking operations. One is blocking I/O. If you call System.out.println, for example, the println method doesn't return until the text is written out to the console. In this case, the instruction pointer stops somewhere inside System.out.println, and the operating system signals the thread to wake up whenever the console printing finishes. So the thread doesn't have to start its own instruction pointer moving again, but the method still returns just after the text is written to the console. So, at a very high level:
Thread 0 calls System.out.println("foo")
Thread 0's instruction pointer stops moving while the operating system writes "foo" to the console
When the operating system is done writing to the console, it notifies the JVM, and the JVM automatically starts moving thread 0's instruction pointer moving again. All of this happens without the programmer who writes System.out.println having to think about it.
Another completely separate kind of blocking operation is encapsulated in the Object.wait method. Whenever a thread calls Object.wait, that thread's instruction pointer stops moving, but instead of the operating system starting the movement of the instruction pointer again, another thread does the job. In this case, there is no external event that will cause the thread's instruction pointer to be restarted (as in the blocking I/O case), but there is an event internal to the program. As I said, another thread will start the instruction pointer moving again by calling Object.notify. So, at a very high level:
Thread 0 calls x.wait() on some object
Thread 0's instruction pointer stops moving
Thread 1 calls x.notify() on the same object x
Thread 0's instruction pointer starts moving again
Thread 0 and thread 1 are now executing concurrently
Notice that a lot more work has to go into writing wait/notify code correctly - the JVM and the operating system don't do all the work for you this time. They still actually do most of the work for you, but you actually have to think about calling wait and notify, and how they allow you to communicate between threads, implement locks, and more.
So there are two morals to this story. The first is that blocking I/O and wait are completely different beasts. In both cases, a thread is blocked, but in the blocking I/O case the thread is woken up automatically by the operating system, while in the wait case the thread has to rely on another thread calling notify in order to wake it up. The second is that concurrent programming is harder to reason about than serial programming. The toy examples I've put in this answer don't really do the second point justice.
No, you don't necessarily need a lock or a wait just because you're using threads. However, if you want the threads to exchange data, they are often useful.
Here's a good explanation with an example of the consumer producer model:
http://www.ase.md/~aursu/JavaThreadsSynchronization.html
Cheers!
Block : Prevent the Executing.
Wait : Suspends the current thread.
Lock : When you lock it others can't Use it.
Consider online purchase when a customer buys a Movie Ticket
As soon as he chooses the seat. Others won't be able to get those seat at the same time(Locking those seats).

Confusing use of synchronized in Java: pattern or anti-pattern?

I'm doing a code review for a change in a Java product I don't own. I'm not a Java expert, but I strongly suspect that this is pointless and indicates a fundamental misunderstanding of how synchronization works.
synchronized (this) {
this.notify();
}
But I could be wrong, since Java is not my primary playground. Perhaps there is a reason this is done. If you can enlighten me as to what the developer was thinking, I would appreciate it.
It certainly is not pointless, you can have another thread that has a reference to the object containing the above code doing
synchronized(foo) {
foo.wait();
}
in order to be woken up when something happens. Though, in many cases it's considered good practice to synchronize on an internal/private lock object instead of this.
However, only doing a .notify() within the synchronization block could be quite wrong - you usually have some work to do and notify when it's done, which in normal cases also needs to be done atomically in regards to other threads. We'd have to see more code to determine whether it really is wrong.
If that is all that is in the synchonized block then it is an antipattern, the point of synchronizing is to do something within the block, setting some condition, then call notify or notifyAll to wake up one or more waiting threads.
When you use wait and notify you have to use a condition variable, see this Oracle tutorial:
Note: Always invoke wait inside a loop that tests for the condition being waited for. Don't assume that the interrupt was for the particular condition you were waiting for, or that the condition is still true.
You shouldn't assume you received a notification just because a thread exited from a call to Object#wait, for multiple reasons:
When calling the version of wait that takes a timeout value there's no way to know whether wait ended due to receiving a notification or due to timing out.
You have to allow for the possibility that a Thread can wake up from waiting without having received a notification (the "spurious wakeup").
The waiting thread that receives a notification still has to reacquire the lock it gave up when it started waiting, there is no atomic linking of these two events; in the interval between being notified and reacquiring the lock another thread can act and possibly change the state of the system so that the notification is now invalid.
You can have a case where the notifying thread acts before any thread is waiting so that the notification has no effect. Assuming one thread will enter a wait before the other thread will notify is dangerous, if you're wrong the waiting thread will hang indefinitely.
So a notification by itself is not good enough, you end up guessing about whether a notification happened when the wait/notify API doesn't give you enough information to know what's going on. Even if other work the notifying thread is doing doesn't require synchronization, updating the condition variable does; there should at least be an update of the shared condition variable in the synchronized block.
This is perfectly fine. According to the Java 6 Object#notify() api documentation:
This method should only be called by a thread that is the owner of this object's monitor.
This is generally not a anti-pattern, if you still want to use intrinsic locks. Some may regard this as an anti pattern, as the new explicit locks from java.util.concurrent are more fine grained.
But your code is still valid. For instance, such code can be found in a blocking queue, when an blocking operation has succeeded and another waiting thread should be notified. Note however that concurrency issues are highly dependent on the usage and the surrounding code, so your simple snippet is not that meaningful.
The Java API documentation for Object.notify() states that the method "should only be called by a thread that is the owner of this object's monitor". So the use could be legitimate depending upon the surrounding context.

What does it mean when a thread moves out of an object's monitor?

All I want to know is that when a thread enters out of a lock, does it means it "ends" or just that it has finished using that function or code which employed the use of the object whose monitor that particular thread is in?
Just that it has finished using that function or code which employed the use of the object. Such pieces of code are commonly known as critical section(s).
For your general understanding: methods run on threads. So it is possible that one method is being executed by multiple threads at the same time.
Imagine you want to make sure that a method, or part of it, can only be executed by one thread at a time. This is called a critical section.
A critical section in Java can be protected by a lock: implicitly via synchronized or explicitly via java.util.concurrent.locks.
Only one thread at a time can acquire the lock and entering the critical section requires that the lock be acquired first. At the end of the critical section the lock is released and the thread continues running but now without holding that lock.
A thread encountering a lock held by another thread (not necessarily for the same critical section) cannot proceed at that point and must wait. The thread, and other threads waiting on the same lock, will be notified when they can retry to acquire the lock. Again, only one thread will win and the process repeats (unless you have a deadlock for example).

Categories

Resources