I am designing a microservices-based system. Most of the services are deployed as standalone Jersey processes with an embedded Grizzly web server.
Assuming that many of those services will run on the same machine, should I change any threading configuration in Grizzly to prevent having too many threads machine-wide?
What is the default threading model for Grizzly? Is there a limit for number of threads that a single web server can create?
It depends on what you do with the incoming data.
If you need to process the data (CPU time > IO time), then you need to match the number of data-processing threads to the number of physical cores.
If most of the time is spent in IO (retrieving/storing the data), then you can start with cores * 2 and set the max to a value you determine by testing CPU usage and throughput. I personally like powers of 4 per core (4, 16, 64, 256); this quickly narrows you down to the right order of magnitude.
https://javaee.github.io/grizzly/coreconfig.html#/Thread_Pool_Configuration
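For illustration, a minimal sketch of bounding the Grizzly worker pool when starting a Jersey resource via GrizzlyHttpServerFactory; the core/max sizes and the package name are illustrative, not a recommendation:

    import java.net.URI;
    import org.glassfish.grizzly.http.server.HttpServer;
    import org.glassfish.grizzly.http.server.NetworkListener;
    import org.glassfish.grizzly.threadpool.ThreadPoolConfig;
    import org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpServerFactory;
    import org.glassfish.jersey.server.ResourceConfig;

    public class BoundedGrizzlyServer {
        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ResourceConfig app = new ResourceConfig().packages("com.example.api"); // hypothetical package

            // Create the server but do not start it yet, so the worker pool can still be tuned.
            HttpServer server = GrizzlyHttpServerFactory.createHttpServer(
                    URI.create("http://localhost:8080/"), app, false);

            // Start at cores * 2 and cap at cores * 4; tune the cap by measuring CPU usage and throughput.
            ThreadPoolConfig workerPool = ThreadPoolConfig.defaultConfig()
                    .setCorePoolSize(cores * 2)
                    .setMaxPoolSize(cores * 4);

            for (NetworkListener listener : server.getListeners()) {
                listener.getTransport().setWorkerThreadPoolConfig(workerPool);
            }
            server.start();
        }
    }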
Related
Assume we have a computer with four physical cores, and we want to decrease the latency of a task, and the best number of threads to do that is 4.
But if we are in a web application and we use an application server (or servlet container) like Tomcat, Jetty, or Netty, I think that application server already uses those 4 threads for throughput.
In that case, if we want to use 4 threads to decrease the latency of a task, given that those 4 threads are already used by the application server, we cannot get much benefit from multithreading. Is that true for web applications?
Thank you so much in advance.
There is this stateless REST application/API, written and maintained by me using the Spring Integration API, with the following underlying concepts working hand in hand:
1) Inbound HTTP gateway as the RESTful entrypoint
2) A handful of Service Activators, Routers, Channels and Transformers
3) A Splitter (and an Aggregator), with the former subscribed to a channel which, in turn, has a task executor wired in, comprising a thread pool of size 100 for parallelised execution of the split messages
The application has performed seamlessly so far - as the next step, my aim is to scale it to handle a higher number of requests, in order to accommodate a worst-case situation where all 100 threads in the pool are occupied at the exact same time.
Please note that the behaviour of the service is always meant to be synchronous (this is a business need) and there are times when the service can be a slightly long-running one. The worst-case roundtrip is ~15 seconds and the best case is ~2 seconds, both of which are within acceptable limits for the business team.
The application server at hand is WebSphere 8.5 in a multi-instance clustered environment and there is a provision to grow the size of the cluster as well as the power of each instance in terms of memory and processor cores.
That said, I am exploring ways to solve the problem of scaling the application within the implementation layer and these are a couple of ways I could think of:
1) Increase the size of the task executor thread pool by many times, say, to 1000 or 10000 instead of 100, to accommodate a higher number of parallel requests (a configuration sketch follows this list).
2) Keep the size of the task executor thread pool intact and instead scale up by using some Spring code to convert the single application context into a pool of contexts, so that each request can grab one that is available and every context has full access to the thread pool.
Example: A pool of 250 application contexts with each context having a thread pool of size 100, facilitating a total of 250 × 100 = 25000 threads in parallel.
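For reference, approach 1) amounts to little more than raising the bounds on the executor definition. A minimal sketch assuming a ThreadPoolTaskExecutor bean (the bean name and sizes are illustrative and may not match the actual wiring):

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    @Configuration
    public class SplitterExecutorConfig {

        @Bean
        public ThreadPoolTaskExecutor splitterTaskExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(100);     // current size
            executor.setMaxPoolSize(1000);     // approach 1): raise the ceiling
            executor.setQueueCapacity(500);    // bounded queue so bursts wait instead of failing fast
            executor.setThreadNamePrefix("split-");
            return executor;
        }
    }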
The 2nd approach may lead to high memory consumption, so I am thinking I should start with approach 1).
However, what I am not sure of is whether either approach is practical in the long run.
Can anyone kindly throw some light? Thanks in advance for your time.
In my experience, it is very easy to hit a roadblock when scaling up. In contrast, scaling out is more flexible but adds complexity to the system.
The application server at hand is WebSphere 8.5 in a multi-instance clustered environment and there is a provision to grow the size of the cluster as well as the power of each instance in terms of memory and processor cores.
I would continue in this direction (scaling out by adding instances to the cluster); if possible, I would add a load-balancing mechanism in front of it. Start by distributing the load randomly and then enhance it by distributing the load based on "free threads in the instance's pool".
Moreover, identify the heavier portions of the system and evaluate whether you would gain anything by migrating them to their own dedicated services.
Please note that the behaviour of the service is always meant to be synchronous (this is a business need) and there are times when the service can be a slightly long-running one.
The statement above raises some eyebrows. I understand when the business says "only return the results when everything is done". If that is the case, then this system would benefit a lot if you could change the paradigm from synchronous request/response to an Observer pattern.
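A minimal sketch of what that shift might look like, using a hypothetical observer interface and a placeholder for the real processing (none of these names come from the actual application):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executor;
    import java.util.concurrent.Executors;

    // Hypothetical observer: the caller registers it and is notified when the work finishes.
    interface ResultObserver {
        void onCompleted(String requestId, String result);
    }

    class AsyncRequestService {
        private final Executor pool = Executors.newFixedThreadPool(100);

        // Accept the request and return immediately instead of blocking for 2-15 seconds.
        void submit(String requestId, String payload, ResultObserver observer) {
            CompletableFuture
                    .supplyAsync(() -> process(payload), pool)
                    .thenAccept(result -> observer.onCompleted(requestId, result));
        }

        private String process(String payload) {
            return payload.toUpperCase(); // placeholder for the real splitter/aggregator pipeline
        }
    }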
We are running the same Jetty service on two servers but are seeing a different number of threads created by each (50 vs ~100 threads).
Both servers are running identical Java code on RedHat 5 (they do have slightly different kernels), yet Jetty on one of the servers creates more threads than on the other. How is that possible?
Thread counts are dynamic and depend on many factors.
The number of threads that you see at any one point can vary greatly based on hardware differences (number of CPU cores, number of network interfaces, etc.), kernel differences, Java differences, load differences, active user counts, active connection counts, transactions per second, external dependencies (such as databases), how async processing is done, how async I/O is done, use of HTTP/2 vs HTTP/1, use of WebSocket, and even ${jetty.base} configuration differences.
As for the counts you are seeing, 50 vs 100, that's positively tiny for a production server. Many production servers on moderately busy systems can use 500 (Java) threads, and on very busy commodity systems it can be in the 5,000+ range. Even on specialized hardware (like Azul Systems devices) it's not unheard of to be in the 90,000+ thread range with multiple active network interfaces.
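If you want the two servers to behave alike regardless of hardware, you can pin the pool bounds explicitly. A minimal embedded-Jetty sketch (the sizes are illustrative; standalone Jetty exposes the same bounds through its thread pool configuration):

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.ServerConnector;
    import org.eclipse.jetty.util.thread.QueuedThreadPool;

    public class PinnedPoolJetty {
        public static void main(String[] args) throws Exception {
            // Fix the bounds so both servers create comparable pools regardless of hardware.
            QueuedThreadPool threadPool = new QueuedThreadPool(200 /* max */, 8 /* min */);
            Server server = new Server(threadPool);

            ServerConnector connector = new ServerConnector(server);
            connector.setPort(8080);
            server.addConnector(connector);

            server.start();
            server.join();
        }
    }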
I'm developing an enrollment application. The client side is an Android application that lets the client enter their information, which is stored using the data storage service of Google Cloud, while the images entered are stored using the blob storage service.
The server side is a J2EE application that extracts the data and blobs entered previously and runs tests on them such as face recognition, alphanumeric matching, etc. These tests are done asynchronously and continuously. I thought of using multithreading for these server-side processes.
So is that recommended in such a case? Is there another solution?
There are several limitations in GAE that somewhat restrict its multiprocessing abilities:
Each request can create up to 50 threads, but threads cannot outlive the request, which itself has a 60-second limit. Also, threads must be created via GAE's own ThreadManager, which limits the use of most external processing libraries.
Background threads, independent of the current request, are available and can be long-lived, but there is a limit of 10 background threads per instance.
For async processing you should look into Task Queues - they have all of the above limitations, but a task can run for 10 minutes. You can start periodic processing via cron jobs.
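For illustration, a minimal sketch of both options against the GAE APIs described above (the class name, handler URL, and parameter name are hypothetical):

    import com.google.appengine.api.ThreadManager;
    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskOptions;

    public class GaeProcessingSketch {

        // Request-scoped thread: must be created via ThreadManager and finish before the request does.
        public static void runInRequest(Runnable work) throws InterruptedException {
            Thread worker = ThreadManager.createThreadForCurrentRequest(work);
            worker.start();
            worker.join(); // the thread cannot outlive the request
        }

        // Async alternative: enqueue the check as a task; a servlet mapped to the
        // hypothetical /tasks/face-recognition URL picks it up and gets 10 minutes to run.
        public static void enqueueFaceRecognition(String blobKey) {
            Queue queue = QueueFactory.getDefaultQueue();
            queue.add(TaskOptions.Builder.withUrl("/tasks/face-recognition")
                    .param("blobKey", blobKey));
        }
    }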
Note that GAE instances are quite limited (the default is a single 600 MHz core with 128 MB of RAM). They are also quite expensive given how low-powered they are. If you need more processing power you should look into Compute Engine (powerful, stand-alone, unmanaged, no access to GAE services, fairly priced for the power), or in your case preferably Managed VMs (powerful, managed, limited access to GAE services, same price as CE).
So if you have light processing, use Task Queues; if you need more power, use Managed VMs (currently in preview).
A jk_connector worker is basically a Tomcat instance waiting to process requests from a web server.
The Apache docs tell you that you should have multiple workers if you have multiple apps, but don't really explain why.
What are the pros/cons of having a worker per web app vs 1 worker for multiple apps?
Processor affinity, for one. If the working set is bound to one execution unit, its built-in cache can be utilized more effectively. The more applications share that space, the more contention there is.
Most systems today are based on multiple CPU cores on which threads can execute independently. This means that a busy server can better utilize system resources if there are more threads (e.g., 1 thread per CPU), both for multicore (SMP) and multithreading (SMT) systems. A common approach for servers is to provide a process/thread pool of workers that can be used and reused to serve multiple simultaneous requests.
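A minimal sketch of that worker-pool idea in plain Java (the pool size and the fake request loop are illustrative):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class WorkerPoolSketch {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            // A fixed pool of reusable workers, roughly one thread per core.
            ExecutorService pool = Executors.newFixedThreadPool(cores);

            for (int i = 0; i < 100; i++) {        // stand-in for incoming requests
                final int requestId = i;
                pool.submit(() -> System.out.println(
                        "request " + requestId + " handled on " + Thread.currentThread().getName()));
            }
            pool.shutdown();
        }
    }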