I am at an intermediate level in Java. I am quite new at my company, and they have asked me to present a session on "threading concepts in Java with real examples". As I have no hands-on experience with threading, all I can prepare are slides on how threads can be implemented using the Thread class or the Runnable interface.
Can anybody please help me with a real-world scenario for threads and its implementation?
Thanks in advance
I would recommend Brian Goetz's "Java Concurrency in Practice". It talks about features added to the JDK above and beyond Thread and Runnable that will make your life better.
In 2016 it's a better idea to dig into the java.util.concurrent package and JDK 8 lambdas and parallel streams. No one should be trying to write multithreaded code with raw Thread unless they know what they're doing. We've been given better abstractions - use them.
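As a rough sketch of what those abstractions buy you (the class and method names below are made up for illustration), a JDK 8 parallel stream lets the library split the work across cores without a single explicit Thread:

```java
import java.util.stream.LongStream;

class ParallelSumSketch {
    // Sums the squares of 1..n; parallel() lets the stream library
    // divide the range among the threads of the common fork/join pool.
    static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```

Whether this is actually faster than a sequential loop depends on the amount of work per element and on the machine, so measure before relying on it.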
I'd probably start with Sun's (now Oracle's) excellent documentation on concurrency.
As an example, you can create something like a banking application where you have a shared data structure (accounts), and you have multiple threads operating on the account (performing withdrawals and deposits).
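A minimal sketch of that banking idea follows. The Account class, its methods and the deposit/withdraw amounts are assumptions chosen for the demo; the point is only that synchronized methods keep the shared balance consistent when several threads update it:

```java
import java.util.ArrayList;
import java.util.List;

class BankSketch {
    // Shared data structure: synchronized methods make each update atomic.
    static class Account {
        private long balance;

        Account(long initial) { balance = initial; }

        synchronized void deposit(long amount) { balance += amount; }

        synchronized boolean withdraw(long amount) {
            if (amount > balance) return false; // insufficient funds
            balance -= amount;
            return true;
        }

        synchronized long balance() { return balance; }
    }

    // Starts `threads` threads; each deposits 2 and then withdraws 1, `rounds` times.
    static long simulate(int threads, int rounds) throws InterruptedException {
        Account account = new Account(0);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < rounds; i++) {
                    account.deposit(2);
                    account.withdraw(1);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) worker.join(); // wait for all threads to finish
        return account.balance();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(simulate(4, 10_000));
    }
}
```

For a session, it can be instructive to remove the synchronized keywords and run it again: the final balance then usually comes out wrong, which demonstrates a race condition.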
The question is too generic to answer, but here are a few details. Threads are used to run multiple things in parallel (in theory only; whether they really run in parallel depends on other factors such as the number of CPUs and cores). Multithreading has gone through a lot of improvements in the JDK since its inception. You can read the tutorial here: http://download.oracle.com/javase/tutorial/essential/concurrency/
A real-life example is a telecom application, an SCP (service control point), that receives a lot of requests (on the order of 400 per second). The application that handles the requests employs a master-slave configuration. There is a thread pool, each thread of which is waiting for a signal to run.
The master thread receives the request and posts the request data in some object that the thread function reads, and then the thread is signaled to run. When the processing is finished, the worker thread is returned to the thread pool.
There can be a flag that reports the status of each thread, for example busy, idle, bad, etc.
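The hand-off described above could be sketched roughly like this. It is not the telecom application's code: the class names, the String requests and the use of a BlockingQueue as the "signal" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class MasterWorkerSketch {
    enum Status { IDLE, BUSY }

    static class Worker extends Thread {
        private final BlockingQueue<String> requests;
        private final AtomicInteger handled;
        private final CountDownLatch remaining;
        volatile Status status = Status.IDLE; // the status flag

        Worker(BlockingQueue<String> requests, AtomicInteger handled, CountDownLatch remaining) {
            this.requests = requests;
            this.handled = handled;
            this.remaining = remaining;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    String request = requests.take(); // idle: blocks until the master posts data
                    status = Status.BUSY;
                    process(request);
                    handled.incrementAndGet();
                    status = Status.IDLE;             // back in the pool
                    remaining.countDown();
                }
            } catch (InterruptedException e) {
                // The master interrupts the workers to shut the pool down.
            }
        }

        private void process(String request) {
            // Placeholder: real request handling would go here.
        }
    }

    // The calling thread plays the master: it posts requests and waits for completion.
    static int serve(int workerCount, int requestCount) throws InterruptedException {
        BlockingQueue<String> requests = new LinkedBlockingQueue<>();
        AtomicInteger handled = new AtomicInteger();
        CountDownLatch remaining = new CountDownLatch(requestCount);
        List<Worker> pool = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Worker worker = new Worker(requests, handled, remaining);
            pool.add(worker);
            worker.start();
        }
        for (int i = 0; i < requestCount; i++) {
            requests.put("request-" + i);
        }
        remaining.await();                        // all requests processed
        for (Worker worker : pool) worker.interrupt();
        for (Worker worker : pool) worker.join();
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(serve(4, 400));
    }
}
```

In production code you would more likely let a ThreadPoolExecutor manage the queue and the workers, but writing it out by hand shows what the pool is doing.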
I have been wondering why JDBC offers only blocking operations and why I can't register a listener with some hypothetical event handler such as onResultSetArrived(ResultSet rs). Why do I have to block one thread for each JDBC query?
After a while I dived into Java sockets (I suppose JDBC is built on top of them) and realised that there is no event handling there either. The only option for a non-blocking read is the available() method, but this is very inefficient, as it has to be checked periodically in a loop.
As far as I'm aware, interrupts are a fundamental thing in a PC, going from the hardware up to the operating system. In Java, this could be exposed as an event-driven approach to reading a value from a Socket.
Now, my question is: am I missing something and there exists some workaround, or is it true that the current architecture in Java really is one thread per one blocking operation? And if so, isn't that inefficient?
In Java, you can have many threads. A thread does its work until it is blocked somewhere (typically on a mutex or an I/O operation). Of course, this does not block other threads.
The fundamental scenario for multithreaded applications is that you use multiple threads when waiting for a blocked thread would introduce too much waiting. The definition of "too much" here depends entirely on you, but in general this is how you achieve better performance through better utilization of resources.
There are some limitations in how threads in Java work, however. Most, if not all, of them arise when the thread is blocked somewhere "outside" of Java, such as in an OS call or an external (native) library. Theoretically, if native code blocks a thread, Java cannot do anything about it. Normally, this should not be a problem unless the native code has a bug.
So in the case of a blocking JDBC response, you would create a new thread which does other work while the first thread is waiting for the database to complete. Alternatively, you could make a thread just for doing JDBC. You could make it work exactly as you want (with listeners etc.), except for limitations imposed by the OS. So it's possible, but it's probably not provided out of the box by JDBC drivers. There is a lot of infrastructure already in core Java which you might find useful (thread pools, workers, synchronized collections). But as with any multithreading, you need to be very careful when accessing data from different threads simultaneously.
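A rough sketch of the "thread just for doing JDBC" idea follows. Nothing here is a real JDBC API: ResultListener, submit and slowQuery are hypothetical names, and the query is simulated with a sleep so that the example runs without a database.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class AsyncQuerySketch {
    // Hypothetical listener, modelled on the handler the question asks for.
    interface ResultListener<T> {
        void onResultArrived(T result);
        void onError(Exception error);
    }

    // One dedicated thread performs all blocking calls; callers never wait on it.
    private final ExecutorService dbThread = Executors.newSingleThreadExecutor();

    <T> void submit(Supplier<T> blockingQuery, ResultListener<T> listener) {
        dbThread.execute(() -> {
            T result;
            try {
                result = blockingQuery.get(); // blocks only the dedicated thread
            } catch (RuntimeException e) {
                listener.onError(e);
                return;
            }
            listener.onResultArrived(result);
        });
    }

    void shutdown() { dbThread.shutdown(); }

    // Stand-in for a blocking JDBC call.
    static String slowQuery() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row-1";
    }

    static String demo() throws InterruptedException {
        AsyncQuerySketch runner = new AsyncQuerySketch();
        AtomicReference<String> received = new AtomicReference<>("nothing");
        CountDownLatch arrived = new CountDownLatch(1);
        runner.submit(AsyncQuerySketch::slowQuery, new ResultListener<String>() {
            @Override public void onResultArrived(String result) {
                received.set(result);
                arrived.countDown();
            }
            @Override public void onError(Exception error) {
                received.set("error: " + error);
                arrived.countDown();
            }
        });
        // The calling thread is free to do other work here.
        arrived.await(5, TimeUnit.SECONDS);
        runner.shutdown();
        return received.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

Note that the listener runs on the database thread, so anything it touches must be safe to access from there.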
Java also supports non-blocking I/O through NIO (since Java 1.4), and since Java 7 NIO.2 adds asynchronous channels. The latter is almost exactly what you are describing: I/O is offloaded to the OS, so your operations return immediately and you get a callback when the operation is finished. However, not all libraries support NIO. In my work I have never had a reason to use it, because I could always implement the same thing with my own threads at least as well.
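To illustrate the callback style, here is a small sketch using AsynchronousFileChannel from the JDK. The helper names readAsync and demo are mine, and for brevity it reads at most the first 1024 bytes of the file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

class AsyncReadSketch {
    // Starts the read and returns immediately; the handler runs when the I/O completes.
    static CompletableFuture<String> readAsync(Path file) throws IOException {
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(file, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        CompletableFuture<String> result = new CompletableFuture<>();
        channel.read(buffer, 0, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                buffer.flip();
                String text = StandardCharsets.UTF_8.decode(buffer).toString();
                closeQuietly(channel);
                result.complete(text);
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                closeQuietly(channel);
                result.completeExceptionally(error);
            }
        });
        return result;
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // Nothing useful to do in a sketch.
        }
    }

    // Writes a temporary file and reads it back asynchronously.
    static String demo() throws Exception {
        Path file = Files.createTempFile("async-read", ".txt");
        try {
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
            return readAsync(file).get(5, TimeUnit.SECONDS);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The demo blocks on get() only so that it can print the result; in an event-driven program you would attach further callbacks to the future instead.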
If the question is whether the "current architecture in Java really is one thread per one blocking operation", and by "blocking operation" you mean "database operation", then the answer is no. Most database drivers currently available for Java are JDBC-based and do work that way. But there are usable alternatives (https://spring.io/blog/2016/11/28/going-reactive-with-spring-data) and more on the way (https://blogs.oracle.com/java/jdbc-next:-a-new-asynchronous-api-for-connecting-to-a-database , https://dzone.com/articles/spring-5-webflux-and-jdbc-to-block-or-not-to-block). For how this works, see "How is ReactiveMongo implemented so that it is considered non-blocking?"
For JDBC there are also ways to wrap the JDBC calls (see "Wrapping blocking I/O in project reactor" and "Spring webflux and reading from database") and projects pursuing this approach (https://dzone.com/articles/myth-asynchronous-jdbc).
What I understood from Vert.x documentation (and a little bit of coding in it) is that Vert.x is single threaded and executes events in the event pool. It doesn't wait for I/O or any network operation(s) rather than giving time to another event (which was not before in any Java multi-threaded framework).
But I couldn't understand following:
How single thread is better than multi-threaded? What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
Verticles depend on CPU cores. As many CPU cores you have, you can have that many verticles running in parallel. How come a language that works on a virtual machine can make use of CPU as needed? As far as I know, the Java VM (JVM) is an application that uses just another OS process for (here my understanding is less about OS and JVM hence my question might be naive).
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environment? Won't it be faster? Or again, is it because CPU can execute one thread at a time?
What I understood from Vert.x documentation (and a little bit of coding in it) is that Vert.x is single threaded and executes events in the event pool.
It is event-driven, callback-based. It isn't single-threaded:
Instead of a single event loop, each Vertx instance maintains several event loops. By default we choose the number based on the number of available cores on the machine, but this can be overridden.
It doesn't wait for I/O or any network operation(s)
It uses non-blocking or asynchronous I/O, it isn't clear which. Use of the Reactor pattern suggests non-blocking, but it may not be.
rather than giving time to another event (which was not before in any Java multi-threaded framework).
This is meaningless.
How single thread is better than multi-threaded?
It isn't.
What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
Yes.
Verticles depend on CPU cores. As many CPU cores you have, you can have that many verticles running in parallel. How come a language that works on a virtual machine can make use of CPU as needed? As far as I know, the Java VM (JVM) is an application that uses just another OS process for (here my understanding is less about OS and JVM hence my question might be naive).
It uses a thread per core, as per the quotation above, or whatever you choose by overriding that.
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environment?
You can.
Won't it be faster?
Yes.
Or again, is it because CPU can execute one thread at a time?
A multi-core CPU can execute more than one thread at a time. I don't know what 'it' in 'is it because' refers to.
First of all, Vertx isn't single threaded by any means. It just doesn't spawn more threads than it needs.
Second, and this is not related to Vertx at all, JVM maps threads to native OS threads.
Third, we can have non-blocking behavior in multithreaded environment. It's not one thread per CPU, but one thread per core.
But then the question is: "what are those threads doing?" Usually, to be useful, they need other resources: network, DB, filesystem, memory. And here it becomes tricky. When you're single threaded, you don't have race conditions. The only one accessing the memory at any point in time is you. But if you're multithreaded, you need to concern yourself with mutexes, or some other way to keep your data consistent.
Q:
How single thread is better than multi-threaded? What if there are millions of incoming HTTP requests? Won't it be slower than other multi-threaded frameworks?
A:
Vert.x isn't a single-threaded framework. It does, however, make sure that a "verticle", which is something you deploy within your application and register with Vert.x, is mostly single-threaded.
The reason for this is that concurrency across multiple threads is complicated by locks, synchronisation and other concepts that have to be taken care of in multithreaded communication.
While verticles are single-threaded, they do use something called an event loop, which is the true power behind this paradigm, called the reactor pattern (or the multi-reactor pattern in Vert.x's case). Multiple verticles can be registered within one application. Communication between these verticles runs through an event bus, which lets verticles use an event-based transfer protocol internally; this can also be distributed, using some other technology to manage the clustering.
Event loops handle incoming events on one thread, but everything is asynchronous, so computation gets handled by the loop, and when it's done a signal notifies that a result can be used.
So all computation is either callback-based or uses something like ReactiveX, fibers, coroutines, channels and the like.
Due to the simpler communication model for concurrency and other nice features of Vert.x, it can actually be faster than a lot of the blocking and purely multithreaded models out there.
Q:
If a single threaded, non-blocking concept is so effective then why can't we have the same non-blocking concept in a multi-threaded environment? Won't it be faster? Or again, is it because CPU can execute one thread at a time?
A:
As I said in answer to the first question, it's not really single-threaded. In fact, when you know something is blocking, you have to register the computation with a method called executeBlocking, which will make it run multithreaded on an ExecutorService managed by Vert.x.
The reason why Vert.x's model is mostly faster also lies here: event loops make better use of CPU computation features and constraints. This is mostly powered by the Netty project.
The overhead of multithreading, with its locks and synchronisation, imposes too much strain to outdo Vert.x with its multi-reactor pattern.
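To make the idea concrete without depending on Vert.x, here is a plain-JDK sketch of the same shape: one "event loop" thread that must never block, plus a worker pool for blocking calls. This is not the Vert.x API; the executor names and slowLookup are assumptions for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class EventLoopSketch {
    // Stand-in for a blocking call (database, file system, remote service).
    static String slowLookup(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase();
    }

    static String demo() throws Exception {
        // A single thread plays the role of the event loop.
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        // Blocking work is pushed to a separate pool so the loop stays responsive.
        ExecutorService workers = Executors.newFixedThreadPool(4);
        try {
            return CompletableFuture
                    .supplyAsync(() -> slowLookup("user-42"), workers)
                    // The result is handed back to the event loop thread for handling.
                    .thenApplyAsync(
                            value -> value + " handled on " + Thread.currentThread().getName(),
                            eventLoop)
                    .get(5, TimeUnit.SECONDS);
        } finally {
            workers.shutdown();
            eventLoop.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Because only the event-loop thread touches the handler's state, the handler itself needs no locks, which is the simplification the answer above is describing.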
This is a similar question to the one appearing at: How to ensure Java threads run on different cores. However, there might have been a lot of progress in that in Java, and also, I couldn't find the answer I am looking for in that question.
I just finished writing a multithreaded program. The program spawns several threads, but it doesn't seem to be using more than a single core. The program is faster (I am parallelizing something which makes it faster), but it definitely does not use all cores available, judging by running "top".
Any ideas? Is that an expected behavior?
The general code structure is as follows:
List<Thread> threads = new ArrayList<>();
for (int i : values) {   // for some values of i
    // MyThread makes heavy use of ConcurrentHashMap, arrays and basic arithmetic; no IO
    Thread thread = new MyThread(i);
    thread.start();
    threads.add(thread); // add the thread to a list
}
for (Thread thread : threads) {
    thread.join();
}
If it's almost exactly 100% of one CPU, it can mean you really have either
one core thread which is doing all the work while the others are not doing much, or
one resource which you are locking on, so that only one thread has a chance to run.
If you are using approximately one CPU, it can mean this is all the work your CPUs have, because you are waiting for something such as IO (network and/or disk).
I suggest you look at the state of your threads in VisualVM. It will help you identify which threads are running and give you an idea of their pattern of behaviour. I also suggest you use a CPU profiler to help find your bottlenecks.
I think I read in the SCJP book by Kathy Sierra that JVMs ask the underlying OS to create a new OS thread for every Java thread.
So it's up to the underlying operating system to decide how to balance Java (and any other kind of) threads between the available CPUs.
Recently, I've been working on the deployment of concurrent objects onto multicore. In a sample, I use the BlockingQueue.take() method, whose specification mentions that it is blocking. This means that the method does not release the enclosing thread's resources so that the thread could be reused for other concurrent tasks. Such reuse would be useful, since the total number of live threads in a JVM instance is limited, and if the application needs thousands of live threads it is vital to be able to reuse suspended threads. On the other hand, the JVM uses a 1:1 mapping from application-level threads to OS-level threads; i.e. each Java Thread instance becomes an underlying OS-level thread.
The current solution is based on java.util.concurrent in Java 1.5+. Still, we need worker threads that scale to a large number. I am interested in answers to the following:
Is there any way to replace the implementation of java.lang.Thread in JVM such that I can plug my own Thread implementation?
Is this only possible through tweaking C++ sections of the thread implementation in JVM and recompiling it?
Is there any library to provide a way to replace the classical thread in Java?
Along the same lines, is there a library or a way to control how several threads in Java can be mapped to a single thread at the OS level?
I also found this discussing different implementations of JVM and I am not sure if they could help.
Thanks for your comments and ideas in advance.
If you are creating thousands of threads, you're doing it wrong.
Instead, consider using the Executor framework. (Start with the Executors and ThreadPoolExecutor classes.) They allow you to queue thousands of tasks while having a sane number of threads handling them.
I guess this approach is what you meant by "library to replace the classical threads". I highly recommend you look into executors.
One caveat: Executors, by default, use non-daemon threads. Therefore, you must shut down your executor when you're done with it. You can do this at program exit, if there is a normal way to exit your program that doesn't simply involve waiting for all threads to finish. :-)
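A minimal sketch of that approach, with a made-up class name and a trivial task standing in for real work: thousands of tasks are queued, a small fixed pool runs them, and the pool is shut down in a finally block so its non-daemon threads do not keep the JVM alive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecutorSketch {
    // Queues taskCount small tasks on a pool sized to the number of cores.
    static long runTasks(int taskCount) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int n = i;
                results.add(pool.submit(() -> n * 2)); // queued, not one thread per task
            }
            long sum = 0;
            for (Future<Integer> result : results) {
                sum += result.get(); // waits for that task to finish
            }
            return sum;
        } finally {
            pool.shutdown(); // otherwise the pool's non-daemon threads keep the JVM running
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTasks(10_000));
    }
}
```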
Where can I find asynchronous programming example using Java? I'm interested in finding patterns in asynchronous programming for building applications that present responsiveness (preventing applications that periodically hang and stop responding to user input and server applications that do not respond to client requests in a timely fashion) and scalability.
In particular, it would be helpful to see a sample which performs I/O operations (such as file reads/writes, web requests, and database queries) and also involves a lot of CPU processing, like a shopping suggester in a web page.
Which are the Java libraries which can help in determining when an application's responsiveness is unpredictable - because the application's thread performs I/O requests, the application is basically giving up control of the thread's processing to the I/O device (a hard drive, a network, or whatever)
In a GUI, you could use threads to perform background tasks.
Java supports non blocking I/O in the new I/O API (NIO).
If your question is more architecturally oriented, this book offers an in-depth discussion of asynchronous patterns: Patterns of Enterprise Application Architecture, by Martin Fowler.
For examples performing asynchronous operations with emphasis on non-blocking IO on files, you may check some samples here: https://github.com/javasync/idioms (disclaimer I am the author).
I use these samples in an introduction to asynchronous programming in Java, where we explore callback-based code, CompletableFuture and finally reactive streams.
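Independently of that repository, here is a small CompletableFuture sketch in the spirit of the "shopping suggester" from the question: one I/O-like step and one CPU-like step run concurrently and are then combined. The method names, the sleep and the scoring formula are all invented for the example.

```java
import java.util.concurrent.CompletableFuture;

class SuggesterSketch {
    // Stand-in for an I/O call such as a database query or web request.
    static String fetchHistory(String user) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "books";
    }

    // Stand-in for CPU-heavy scoring.
    static int computeScore(String user) {
        return user.length() * 7;
    }

    // Both steps start immediately on the common pool; neither blocks the caller.
    static CompletableFuture<String> suggest(String user) {
        CompletableFuture<String> history =
                CompletableFuture.supplyAsync(() -> fetchHistory(user));
        CompletableFuture<Integer> score =
                CompletableFuture.supplyAsync(() -> computeScore(user));
        return history.thenCombine(score, (h, s) -> h + ":" + s);
    }

    public static void main(String[] args) {
        // join() is used here only to print the result before the program exits.
        System.out.println(suggest("alice").join());
    }
}
```

In a GUI or server you would normally attach thenAccept or a similar callback to the returned future rather than calling join(), so that the calling thread stays responsive.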
Which are the Java libraries which can help in determining when an application's responsiveness is unpredictable - because the application's thread performs I/O requests, the application is basically giving up control of the thread's processing to the I/O device (a hard drive, a network, or whatever)
If I understand you correctly, you are asking for some library that examines other threads to determine if they are blocked in I/O calls.
I don't think that this is possible in a standard JVM. Furthermore, I don't think that this would necessarily be sufficient to guarantee "responsiveness".
If you are using some kind of I/O operation (for example, read on an InputStream, which can block), you can put it into a thread. The simplest solution is then to call join on the thread with a timeout:
MyThread myThread = new MyThread();
myThread.start();
myThread.join(10000); // wait at most 10,000 ms for the thread to finish
This join will wait for at most 10 seconds. After that time you can just ignore the thread, ...
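A variation on the same idea, swapping the raw thread for an ExecutorService and Future.get with a timeout, might look like the sketch below. The class and method names are mine, and the sleep stands in for the blocking read. Note that cancel(true) only interrupts the worker, so it helps only if the task responds to interruption; a plain blocking InputStream.read typically does not.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutSketch {
    // Runs a simulated blocking task and gives up after timeoutMillis.
    static String readWithTimeout(long workMillis, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(() -> {
            Thread.sleep(workMillis); // stand-in for a blocking read
            return "data";
        });
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);      // ask the worker to stop by interrupting it
            return "timed out";
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readWithTimeout(10, 2_000));
        System.out.println(readWithTimeout(2_000, 50));
    }
}
```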
You can also use the Decorator pattern. You can read more here.
In a web environment, you can make use of the asynchronous features introduced in Java EE 6.
Take a look at
http://docs.oracle.com/javaee/6/tutorial/doc/gkkqg.html