I am writing a web crawler and using threads to download pages.
The first limiting factor on my program's performance is bandwidth: I can never download more pages than it allows.
The second thing is what interests me. I am using threads to download many pages at the same time, but as I create more threads, more sharing of the processor occurs. Is there some metric/way/class of tests to determine the ideal number of threads, or whether after a certain number the performance stays flat or decreases?
We've developed a multithreaded, parallel web crawler. Benchmarking throughput is the best way to get an idea of how the beast will handle its job. For a dedicated Java server, one thread per core is a baseline to start from; then I/O comes into play and changes things.
Performance does decrease after a certain number of threads. But it also depends on the site you crawl, the OS you use, etc. Try to find a site with a fairly constant response time for your first benchmarks (like Google, but hit different services).
With slow websites, a higher number of threads tends to compensate for I/O blocking.
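If it helps, here is a rough sketch of the kind of throughput benchmark meant above, in Java. downloadPage() is a hypothetical stand-in for your real fetch code, and the URL sample and thread counts are placeholders you would fill in; the idea is simply to watch where pages-per-second stops improving:

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class CrawlerBenchmark {

        // Hypothetical stand-in for your real page-fetching code.
        static void downloadPage(String url) { /* ... */ }

        static double pagesPerSecond(List<String> urls, int threads) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            for (final String url : urls) {
                pool.execute(new Runnable() {
                    public void run() { downloadPage(url); }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);
            double seconds = (System.nanoTime() - start) / 1e9;
            return urls.size() / seconds;
        }

        public static void main(String[] args) throws InterruptedException {
            List<String> urls = Arrays.asList(/* a fixed sample of URLs to crawl */);
            for (int threads = 1; threads <= 64; threads *= 2) {
                System.out.printf("%d threads: %.1f pages/s%n", threads, pagesPerSecond(urls, threads));
            }
        }
    }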
Have a look at my answer in this thread: How to find out the optimal amount of threads?
Your example will likely be CPU bound, so you need a way to measure the contention in order to work out the right number of threads for your box and keep them all busy. Profiling will help there, but remember that it will depend on the number of cores (as well as the network latency already mentioned, etc.), so use the runtime to get the number of cores when wiring up your thread pool size.
No quick answer, I'm afraid; there will be an element of test, measure, adjust, repeat!
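For the core-count part, a minimal sketch of sizing a pool from the runtime; the oversubscription factor for I/O-heavy work (2 here) is only a guess you would tune by measurement:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolSizing {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            // CPU-bound work: roughly one thread per core.
            ExecutorService cpuPool = Executors.newFixedThreadPool(cores);
            // I/O-heavy work: threads spend much of their time blocked, so oversubscribe;
            // the factor of 2 is only a starting point to tune by measurement.
            ExecutorService ioPool = Executors.newFixedThreadPool(cores * 2);
            // ... submit tasks, then shut the pools down when done.
            cpuPool.shutdown();
            ioPool.shutdown();
        }
    }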
The ideal number of threads should be close to the number of cores (virtual cores) your hardware provides. This is to avoid thread context switching and thread scheduling overhead. If you're doing heavy I/O with many blocking reads (your thread blocks on a socket read), I suggest you redesign your code to use non-blocking I/O APIs. Typically this will involve one "selector" thread that monitors the activity of thousands of sockets and a small number of worker threads that do the processing. If your code is in Java, the APIs are NIO. The only blocking call will be selector.select(), and it will only block if there is nothing to be processed on any of the thousands of sockets. Event-driven frameworks such as netty.io use this model and have proven to be very scalable and to make the best use of the system's hardware resources.
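A minimal sketch of that selector model with java.nio, assuming a server listening on port 8080; accept and read are shown, and the hand-off to worker threads is left as a comment:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;

    public class SelectorSketch {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(8080));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            ByteBuffer buffer = ByteBuffer.allocate(8192);
            while (true) {
                selector.select();  // blocks only when nothing is ready on any socket
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n == -1) {
                            key.cancel();
                            client.close();
                        }
                        // else: hand the bytes off to a small pool of worker threads
                    }
                }
            }
        }
    }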
I'd say use something like Akka to manage the threads for you. Use the Jersey HTTP client library with non-blocking I/O, which works with callbacks if I remember correctly. It's possibly the ideal setup for that type of task.
Related
So, I was starting up Minecraft a few days ago and opened up its developer console to see what it was doing while it was updating itself. I noticed one of the lines said the following:
Downloading 32 files. (16 threads)
Now, the first thing that came to mind was: the processor can still only do one thing at a time; all threads do is split their tasks up and distribute the CPU time between them. So what would the purpose be of downloading multiple files on multiple threads if each thread is still only being run on a single processor?
Then, in the process of deciding whether or not I should ask this question on SO, I remembered that multiple cores can reside on one processor. For example, my processor is quad-core, so you can actually accomplish 4 downloads truly simultaneously. Now that sounds like it makes sense. Except for the fact that 16 threads are being used for Minecraft's download. So, basically my question is:
Does increasing the number of threads during a download help the speed at all? (Assuming a multi-core processor, and the thread count is less than the core count.)
And
If you increase the number of threads to past the number of cores, does speed still increase? (It sounds to me like the downloads would be max-speed after 4 threads, on a quad-core processor.)
Downloads are network-bound, not CPU-bound. So theoretically, using multiple threads will not make it faster.
On the one hand, if your program downloads using synchronous (blocking) I/O, then multiple threads simply mean that less time is spent blocked overall. On the other hand, it is generally more sensible to just use a single thread with asynchronous I/O.
On the gripping hand, asynchronous I/O is trickier to code correctly than synchronous I/O (which is straightforward). So the developers may have just decided to favour ease of programming over pure performance. (Or they may favour compatibility with older Java platforms: real async I/O is only available with NIO2, which came with Java 7.)
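For reference, a rough sketch of what NIO2-style asynchronous I/O looks like, using Java 7's AsynchronousSocketChannel; the host, port and buffer size are placeholders, and a real client would write its request before reading:

    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.AsynchronousSocketChannel;
    import java.nio.channels.CompletionHandler;

    public class AsyncDownloadSketch {
        public static void main(String[] args) throws Exception {
            final AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
            channel.connect(new InetSocketAddress("example.com", 80), null,
                    new CompletionHandler<Void, Void>() {
                        public void completed(Void result, Void attachment) {
                            // A real client would write the HTTP request first;
                            // the read below just shows the callback shape.
                            final ByteBuffer buf = ByteBuffer.allocate(8192);
                            channel.read(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
                                public void completed(Integer bytesRead, ByteBuffer b) {
                                    // process b, possibly issue another read
                                }
                                public void failed(Throwable exc, ByteBuffer b) {
                                    exc.printStackTrace();
                                }
                            });
                        }
                        public void failed(Throwable exc, Void attachment) {
                            exc.printStackTrace();
                        }
                    });
            Thread.sleep(5000); // keep the JVM alive long enough for the callbacks (sketch only)
        }
    }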
When one thread downloads one file, it will spend some time waiting. When one thread downloads N files, one after another, it will spend, on average, N times as much total wait time.
When N threads each download one file, each of those threads will spend some time waiting, but some of those waits will be overlapped (e.g., thread A and thread B are both waiting at the same time.) The end result is that it may take less wall-clock time to get all N of the files.
On the other hand, if the threads are waiting for files from the same server, each thread's individual wait time may be longer.
Whether there is an overall performance benefit depends on the client, on the server, and on the available network bandwidth. If the network can't carry bytes as fast as the server can pump them out, then multi-threading the client probably won't save any time. If the server is single-threaded, then multi-threading the client definitely won't help. But if the conditions are right (e.g., you have a fast internet connection, and especially if the files are coming from a server farm instead of a single machine), then multi-threading can potentially speed things up.
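A small sketch of that overlap using plain blocking I/O and a thread pool; the URLs and the pool size of 4 are arbitrary, and error handling is trimmed:

    import java.io.InputStream;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelDownloads {
        public static void main(String[] args) throws Exception {
            List<String> files = Arrays.asList(
                    "http://example.com/a.bin", "http://example.com/b.bin",
                    "http://example.com/c.bin", "http://example.com/d.bin");
            ExecutorService pool = Executors.newFixedThreadPool(4);

            List<Callable<Integer>> tasks = new ArrayList<Callable<Integer>>();
            for (final String u : files) {
                tasks.add(new Callable<Integer>() {
                    public Integer call() throws Exception { return download(u); }
                });
            }
            // The four threads wait on the network at the same time,
            // instead of one after another.
            List<Future<Integer>> results = pool.invokeAll(tasks);
            for (Future<Integer> f : results) {
                System.out.println("downloaded " + f.get() + " bytes");
            }
            pool.shutdown();
        }

        // Reads the whole response and returns the byte count.
        static int download(String url) throws Exception {
            try (InputStream in = new URL(url).openStream()) {
                byte[] buf = new byte[8192];
                int total = 0, n;
                while ((n = in.read(buf)) != -1) total += n;
                return total;
            }
        }
    }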
Normally it will not be faster, but there are always exceptions.
Assuming you open a new connection for each download thread, then if:
The network (either your own network or the target system) limits the download speed for each connection, or
You are downloading from multiple servers, etc., or
The "download" is not a plain download, but involves downloading something and doing some CPU-intensive processing on it,
then you may see the download speed become faster with multiple threads.
I am new to multithreading in Java. After looking at Java virtual machine - maximum number of threads, it would appear there isn't a limit to how many threads a Java/Android app can run. However, is there an advisable limit? What I mean is: is there a number of threads beyond which it becomes unwise to go, because you can no longer keep track of which thread does what and when? I hope my question makes sense.
There are some advisable limits; however, they don't really have anything to do with keeping track of the threads.
Most multithreading comes with locking. If you are using central data storage or global mutable state then the more threads you have, the more lock contention you will get. This is app-specific and depends on how much of said state you have and how often threads read and write it.
There are no limits in desktop JVMs by default, but there are OS limits. It should be in the tens of thousands for modern Windows machines, but don't rely on being able to create much more than that.
Running multiple tasks in parallel is great, but the hardware can only cope with so much. If you are using small threads that get fired up occasionally and spend most of their time idle, that's no biggie (Java servers were written like this for years). However, if your threads are very intensive, making more of them than the number of cores you have is not likely to give you any benefit. (I believe the standard practice is twice the number of cores if you anticipate threads going idle sometimes.)
Threads have a cost to them. Whenever you switch Threads you switch context, and while it isn't that expensive, doing it constantly will hurt performance. It's not a good idea to create a Thread to sum up two integers and write back a result.
If Threads need visibility of each others state, then they are greatly slowed down, since a lot of their writes have to be written back to main memory. Threads are best used for standalone tasks that require little interaction with each other.
TL;DR
Depends on OS and Hardware: on servers creating thousands of threads is fine, on desktop machines you should limit yourself to 50-200 and choose carefully what you do with them.
Note: Android's default and suggested "UI multithread helper", the AsyncTask, is not actually a thread. It's a task invoked from a ThreadPool, and as such there is little cost to using it: it has an upper limit on the number of threads it spawns and reuses them rather than creating new ones. Most Android apps should use it instead of spawning their own threads. In general, thread pools are fairly widespread and are a great choice unless you are forced into blocking operations.
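To make the thread-pool point concrete, a tiny sketch in which 100 tasks share four reused worker threads instead of each task getting its own Thread:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolReuse {
        public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(4); // at most 4 worker threads ever exist
            for (int i = 0; i < 100; i++) {
                final int task = i;
                pool.execute(new Runnable() {
                    public void run() {
                        // the same 4 worker thread names show up again and again
                        System.out.println("task " + task + " ran on " + Thread.currentThread().getName());
                    }
                });
            }
            pool.shutdown();
        }
    }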
I run multiple game servers and I want to develop a custom application to manage them. Basically, all the game servers will connect to the application to exchange data. I don't want any of this data getting lost, so I think it would be best to use TCP. I have looked into networking and understand how it works; however, I have a question about CPU usage. More servers are being added, and in the next few months it could potentially reach around 100 - 200 and will continue to grow as needed. Will new threads for each server use a lot of CPU, and is it a good idea to do this? Does anyone have any suggestions on how to go about this? Thanks.
You should have a look at non-blocking I/O. With blocking I/O, each socket will consume one thread, and the number of threads in a system is limited. Even if you can create 1000+, it is a questionable approach.
With non-blocking I/O, you can serve multiple sockets with a single thread. This is a more scalable approach, and you control how many threads are running at any given moment.
More servers are being added and in the next few months it could potentially reach around 100 - 200 and will continue to grow as needed. Will new threads for each server use a lot of cpu and is it a good idea to do this?
The standard answer is to caution against hundreds of threads and to point to the NIO solution. However, it is important to note that the NIO approach has a significantly more complex implementation. Isolating the interaction with a server connection to a single thread has its advantages from a code standpoint.
Modern OSes can spawn thousands of threads with little overhead aside from the stack memory. If you are sure of your scaling factors (i.e. you're not going to reach 10k connections or something) and you have the memory, then I would say that a thread per TCP connection could work very well. I've very successfully run applications with thousands of threads and have not seen the performance fall-offs due to context switching that used to occur with earlier processors/kernels.
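If you do go thread-per-connection, the blocking version is pleasantly simple. A sketch, with the port and the per-message handling as placeholders:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class ThreadPerConnectionServer {
        public static void main(String[] args) throws Exception {
            ServerSocket server = new ServerSocket(9000);   // port is arbitrary
            while (true) {
                final Socket socket = server.accept();      // blocks until a game server connects
                new Thread(new Runnable() {
                    public void run() {
                        try (BufferedReader in = new BufferedReader(
                                new InputStreamReader(socket.getInputStream()))) {
                            String line;
                            while ((line = in.readLine()) != null) {
                                // parse and handle one message from this game server
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }).start();
            }
        }
    }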
I have two IO intensive processes that don't do much computing: one is getting and parsing a webpage and the other is storing some data obtained with the parsing in a database. This is going to repeat while the crawling of the web continues.
Is there a method for adding and subtracting the number of threads that are working on each task dynamically so the performance is optimal for the machine where the whole system is running? The method should not involve benchmarking because it's going to be distributed to a number of machines I cannot access beforehand.
Please guide me to some sources or information.
Instead of using threads directly, you should just create a ThreadPool to which you add a number of Runnables that do the actual work. From your description, a CachedThreadPool might be suitable. Check out http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html for some guidelines on how to implement it.
Well, dynamically adjusting the thread count should be no problem (using ThreadPoolExecutor, for example).
But it looks to me that the optimal number of threads is limited by two factors:
The network bandwidth for your "downloading threads"
The maximum number of allowed database connections for your "database threads"
I'm not sure if the downloading part should be multithreaded at all, because each thread will just steal bandwidth from the others unless the pages are really small.
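On the dynamic-adjustment point above, ThreadPoolExecutor lets you resize the pool at runtime. A sketch; the initial sizes and whatever signal you use to trigger the resize are up to you:

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class AdjustablePool {
        public static void main(String[] args) {
            ThreadPoolExecutor downloaders = new ThreadPoolExecutor(
                    2, 2, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

            // Later, based on whatever signal you choose (queue length, measured
            // throughput, database connection limit), grow the pool.
            // Raise the maximum before the core size when growing;
            // do the reverse when shrinking.
            downloaders.setMaximumPoolSize(8);
            downloaders.setCorePoolSize(8);

            downloaders.shutdown();
        }
    }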
I've read several posts about java.net vs java.nio here on StackOverflow and on some blogs. But I still cannot get a clear idea of when one should prefer NIO over threaded sockets. Can you please examine my conclusions below and tell me which ones are incorrect and what I have missed?
Since in the threaded model you need to dedicate a thread to each active connection, and each thread takes something like 250 kilobytes of memory for its stack, with the thread-per-socket model you will quickly run out of memory on a large number of concurrent connections. Unlike NIO.
In modern operating systems and processors, a large number of active threads and the context-switch time can be considered almost insignificant for performance.
NIO throughput can be lower because the select() and poll() used by asynchronous NIO libraries in high-load environments are more expensive than waking threads up and putting them to sleep.
NIO has always been slower but it allows you to process more concurrent connections. It's essentially a time/space trade-off: traditional IO is faster but has a heavier memory footprint, NIO is slower but uses less resources.
Java has a hard limit of 15,000 / 30,000 concurrent threads depending on the JVM, and this will cap the thread-per-connection model at that number of concurrent connections, but JVM 7 will have no such limit (I cannot confirm this data).
So, in conclusion:
If you have tens of thousands of concurrent connections, NIO is a better choice unless request processing speed is a key factor for you.
If you have fewer than that, thread-per-connection is a better choice (given that you can afford the amount of RAM needed to hold the stacks of all concurrent threads, up to the maximum).
With Java 7 you may want to look at NIO 2.0 in either case.
Am I correct?
That seems right to me, except for the part about Java limiting the number of threads – that is typically limited by the OS it's running on (see How many threads can a Java VM support? and Can't get past 2542 Threads in Java on 4GB iMac OSX 10.6.3 Snow Leopard (32bit)).
To reach that many threads you'll probably need to adjust the stack size of the JVM.
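The JVM-wide default stack size is normally changed with the -Xss startup flag. Java also lets you pass a per-thread stack-size hint to the Thread constructor, which the JVM is free to ignore. A sketch:

    public class SmallStackThread {
        public static void main(String[] args) {
            Runnable task = new Runnable() {
                public void run() { /* per-connection work */ }
            };
            // 256 KB stack hint for this one thread; the value is illustrative and
            // platform-dependent, and some JVMs ignore the hint entirely.
            Thread t = new Thread(null, task, "connection-worker", 256 * 1024);
            t.start();
        }
    }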
I still think the context-switch overhead for the threads in traditional IO is significant. At a high level, you only gain performance from multiple threads if they don't contend much for the same resources, or if the time they spend on those resources is much greater than the context-switch overhead.
The reason for bringing this up is that with new storage technologies like SSDs, your threads come back to contend for the CPU much more quickly.
There is not a single "best" way to build NIO servers, but the preponderance of this particular question on SO suggests that people think there is! Your question summarizes the use cases that are suited to both options well enough to help you make the decision that is right for you.
Hybrid solutions are possible too! You could hand the channel off to threads when they are going to do something worthy of their expense, and stick to NIO when it is better.
I would say start with thread-per-connection and adapt from there if you run into problems.
If you really need to handle a million connections you should consider writing (or finding) a simple request broker in C (or whatever) that will use far less memory per connection than any java implementation can. The broker can receive requests asynchronously and queue them to backend workers written in your language of choice.
The backends thus only need a thread per active request, and you can just have a fixed number of them so the memory and database use is predetermined to some degree. When large numbers of requests are running in parallel the requests are made to wait a bit longer.
Thus I think you should never have to resort to NIO select channels or asynchronous I/O (NIO 2) on 64-bit systems. The thread-per-connection model works well enough and you can do your scaling to "tens or hundreds of thousands" of connections using some more appropriate low-level technology.
It is always helpful to avoid premature optimization (i.e. writing NIO code before you really have massive numbers of connections coming in) and not to reinvent the wheel (Jetty, nginx, etc.) if possible.
What is most often overlooked is that NIO allows zero-copy handling. E.g. if you listen to the same multicast traffic from within multiple processes using old-school sockets on one single server, any multicast packet is copied from the network/kernel buffer to each listening application. So if you build a grid of, e.g., 20 processes, you get memory bandwidth issues. With NIO you can examine the incoming buffer without having to copy it to application space. The process then copies only the parts of the incoming traffic it is interested in.
For another application example, see http://www.ibm.com/developerworks/java/library/j-zerocopy/.
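As a rough illustration of the multicast case with NIO: the group address, port, and interface name below are placeholders, and this shows the API shape rather than a tuned zero-copy setup:

    import java.net.InetAddress;
    import java.net.InetSocketAddress;
    import java.net.NetworkInterface;
    import java.net.StandardProtocolFamily;
    import java.net.StandardSocketOptions;
    import java.nio.ByteBuffer;
    import java.nio.channels.DatagramChannel;

    public class MulticastSketch {
        public static void main(String[] args) throws Exception {
            NetworkInterface nic = NetworkInterface.getByName("eth0");   // placeholder
            InetAddress group = InetAddress.getByName("239.1.2.3");      // placeholder

            DatagramChannel channel = DatagramChannel.open(StandardProtocolFamily.INET)
                    .setOption(StandardSocketOptions.SO_REUSEADDR, true)
                    .bind(new InetSocketAddress(5000))                   // placeholder port
                    .setOption(StandardSocketOptions.IP_MULTICAST_IF, nic);
            channel.join(group, nic);

            // A direct buffer lives outside the Java heap, so the channel can fill it
            // without an extra copy into a heap array.
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
            while (true) {
                buf.clear();
                channel.receive(buf);
                buf.flip();
                // inspect only the parts of the packet this process cares about
            }
        }
    }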