Sharing a thread pool between different WARs


Within a Tomcat 8 server, we have several WAR projects that need thread pools to execute tasks (schedulers and parallel processing for faster performance specifically).
As each pool manages its own threads, we ended up adding too many threads to the container, so the obvious question came up: is it possible to share a single thread pool among several WAR projects within Tomcat?
The pools are a mix between Spring's schedulers and the standard Java ThreadPoolExecutor, but I guess they could be standardized into a single type if needed.
PS: Does Tomcat's Executor (thread pool) actually help here? If so, how?

You can configure a single ThreadPool as a global JNDI resource and then use ResourceLinks to make that resource available to as many or as few web applications as you require. You'll probably need to code up a simple custom resource factory to make this work.
Tomcat's JNDI documentation provides a worked example for a simple factory.
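As a rough sketch of what such a factory could look like: the class below implements javax.naming.spi.ObjectFactory and always hands back the same ExecutorService. The class name SharedExecutorFactory, the pool size of 8 and the thread name are made up for illustration and are not taken from Tomcat's documentation. It also assumes the class is packaged in a JAR under Tomcat's lib directory, so that the global resource and every linked web application resolve the same class and therefore the same static pool.

```java
import java.util.Hashtable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.naming.Context;
import javax.naming.Name;
import javax.naming.spi.ObjectFactory;

// Hypothetical resource factory: class name, pool size and thread name are illustrative.
class SharedExecutorFactory implements ObjectFactory {

    // A single pool for everything that resolves this class through the same class loader.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8, runnable -> {
        Thread thread = new Thread(runnable, "shared-pool-worker");
        thread.setDaemon(true); // do not keep the JVM alive at shutdown
        return thread;
    });

    // Called by the JNDI machinery on lookup; this sketch ignores all arguments.
    @Override
    public Object getObjectInstance(Object obj, Name name, Context nameCtx,
                                    Hashtable<?, ?> environment) {
        return POOL;
    }

    public static void main(String[] args) {
        Object first = new SharedExecutorFactory().getObjectInstance(null, null, null, null);
        Object second = new SharedExecutorFactory().getObjectInstance(null, null, null, null);
        System.out.println(first == second); // prints true
    }
}
```

The factory would then be named in the factory attribute of a Resource element under GlobalNamingResources in server.xml, each web application would declare a ResourceLink pointing at it, and application code would obtain the pool with a JNDI lookup under java:comp/env.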

Related

What is best practise when creating new threads when inside a servlet container?

Today I was looking at some web app code that used Executors.newSingleThreadScheduledExecutor() on a per request basis to poll a REST API for a final status (think COMPLETE, FAILED, etc.). Knowing that a servlet container is already a highly concurrent environment, it worried me each request thread was creating another thread.
I have looked at creating a fixed sized thread pool but I'm not certain if it has to be managed in a servlet container specific way. I'm unsure how to decide on an optimal size. I'm also not completely confident that this is the right course of action.
I'd like to confirm that this is indeed dangerous, with an explanation of why, and understand what a superior solution might look like.

The servlet AsyncContext.start(Runnable) is the preferred way if you want to interact with the Servlet API and contexts from within your Runnable (this will run within the Servlet Container's Thread Pool and allow the container to manage the various contexts around your thread, like classloaders, security, cdi, sessions, etc).
The ONLY downside with the AsyncContext approach is that you are consuming Servlet Container threads to do your processing. If you also use Servlet Async I/O, then you've offset this negative in a grand way, and you'll actually have a noticeable improvement in your ability to scale.
If you have no requirement to interact with the Servlet API from your Runnable, then go with simple threads executed from a Thread Pool you have obtained from the Java Executors.
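For the polling case in the question, one possible shape of the "plain thread pool" option is a single application-wide ScheduledExecutorService that every request shares, instead of a new executor per request. This is only a sketch: the class name StatusPoller, the pool size of 2 and the "PENDING" status value are assumptions, and the Supplier stands in for the real REST call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: one shared scheduler for all requests.
class StatusPoller {

    private static final ScheduledExecutorService SCHEDULER =
            Executors.newScheduledThreadPool(2, runnable -> {
                Thread thread = new Thread(runnable, "status-poller");
                thread.setDaemon(true);
                return thread;
            });

    // Polls statusSource until it reports anything other than "PENDING" (assumed value).
    static CompletableFuture<String> pollUntilFinal(Supplier<String> statusSource,
                                                    long periodMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        ScheduledFuture<?> handle = SCHEDULER.scheduleWithFixedDelay(() -> {
            try {
                String status = statusSource.get();
                if (!"PENDING".equals(status)) {
                    result.complete(status);
                }
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        // Stop the periodic task as soon as a final status or a failure is known.
        result.whenComplete((status, error) -> handle.cancel(false));
        return result;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Fake status source: reports PENDING twice, then COMPLETE.
        String status = pollUntilFinal(
                () -> calls.incrementAndGet() < 3 ? "PENDING" : "COMPLETE", 10).join();
        System.out.println(status); // prints COMPLETE
    }
}
```

The pool size caps how many polls run at once no matter how many requests arrive, which is the main difference from creating an executor per request.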

Difference between Quartz Thread Pool and Task Executor

I'm working in Spring-quartz batch. I'm trying to implement Multi-threading for the Batch application.
I came across two possible ways of multithreading:
Use the Quartz thread pool
Use task executors
I used the Quartz thread pool and it is working fine, but I was wondering what advantage I would get if I also implemented a task executor.
I'm doing all this as xml configuration.
Please suggest which should be used and what the benefit of one over the other is.
Thanks

I would choose task executors if all you need is to keep N workers picking pieces of work from the common queue. The advantage is that you do not need any external libraries for this. Quartz thread pool was created before Java 5 - that is why it exists.
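To illustrate the "N workers picking pieces of work from the common queue" idea, here is a minimal sketch using plain java.util.concurrent rather than Spring's TaskExecutor abstraction. The class and method names are made up, and squaring numbers stands in for a real unit of batch work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: a fixed number of workers draining one shared queue.
class WorkerPoolDemo {

    static int sumOfSquares(int workers, int items) {
        // The fixed pool owns the queue; idle workers take the next submitted task.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= items; i++) {
                final int n = i;
                results.add(pool.submit(() -> n * n));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // blocks until that piece of work is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4, 10)); // prints 385
    }
}
```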

Executor is good enough for running concurrent tasks within a JVM. But if you want to distribute tasks across multiple JVMs in a clustered environment, then you should explore Quartz using the JDBC Store.
Quartz is more of a scheduling framework where you can setup jobs to run on a periodic basis. But I have also used it heavily for concurrent programming.

Mapping the heroku cedar model to a multithreaded application

I'm not really understanding the dyno and worker process model of Heroku as it relates to a single process but multi-threaded Java-based server.
For example: How do I know (for a single dyno) how many processors are available for my background threads? Do I need to use something like RabbitMQ and create a separate process (app) for each background processing task and communicate between the server and these? Seems a little overkill for some Scheduled Tasks using Thread Cached Executors. Should all Futures be changed to inter-process Futures?
I guess it comes down to this question. Can I no longer write a multi-threaded server and scale the processors available to my server process in order to accommodate my thread activity? Or do I need to refactor my architecture to use separate processes for concurrency? If the former, do I need workers or just multiple dynos?
Thanks.

Heroku supports multiple concurrency models, so it's really up to you how you would like to architect your application. You have access to the full Java stack, so if something makes more sense to just be run as multiple threads in your web processes, you can definitely do that, or you can always enqueue jobs on something like RabbitMQ or Redis and process them on separate worker dynos. Multithreading is simpler and makes sense if the amount of work is light and proportional to your web requests because it will be scaled along with the web dynos; however, if the work is large, not proportional, and/or needs to be scaled independently, then breaking it out into a separate process would be better.
Heroku was originally just a Ruby platform, which does not have the same threading capabilities as Java, so the use of separate worker dynos is more important for Ruby and this is reflected in some of the documentation and examples out there, which might have led to your confusion. Luckily, with Java you have more options available to you and can use what's best for the job at hand.

Servlets should not start threads due to issues that may arise when clustering ....what issues?

I know that we should not start threads in a servlet because threads should be managed by the container. If the container is told to shut down while threads it does not know about are still hanging around, it won't shut down. I take care of this by making it a daemon thread...
But other than the above "unable to shutdown" situation what other reason could there be to not allow a servlet to start threads. I have seen some mentions that if the environment is clustered it will cause a problem. But no actual walk-through of what could happen that would be BAD.
EDIT:
This is currently being done in a servlet, and I am having trouble convincing the author of this code that it is not a good idea. The argument that one has to understand complexity is not going to fly...
I am looking for one specific, concrete case where something bad can happen without it being intended.
In my situation, the servlet in question launches n threads, and this happens in each VM on the cluster by design.
There is no transactional requirement

From the official FAQ:
Why is thread creation and management disallowed?
The EJB specification assigns to the EJB container the responsibility for managing threads. Allowing enterprise bean instances to create and manage threads would interfere with the container's ability to control its components' lifecycle. Thread management is not a business function, it is an implementation detail, and is typically complicated and platform-specific. Letting the container manage threads relieves the enterprise bean developer of dealing with threading issues. Multithreaded applications are still possible, but control of multithreading is located in the container, not in the enterprise bean.
That said, if the problem of startup and shutdown is not considered, it is partly a "philosophical" issue in the sense that thread is an implementation detail, and also the fact that multi-threading is considered a scalability concern, which should be managed by the app. server.
For instance, most app. servers allow the integrator to define pools and configure the number of threads, etc. An app that spawns threads itself escapes this configuration and does not cooperate nicely with the scalability plan.
Also, if you want a single background thread in a clustered environment, it becomes tricky.
And finally, the app. server controls the transactions. If you spawn threads yourself, you must take care to understand all the details of what can be used safely or not (e.g. get a connection from the pool) and how to use UserTransaction if necessary. The idea is that you shouldn't worry about such detail if you use an app. server, but you will need to if you start dealing with threads yourself.
I have, however, seen a web app spawn a background thread from a ServletContextListener, and guess what, that was fine, even though the app was deployed on more than one node. You just need to understand what it means to have several JVMs running and make sure you support that correctly.
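A sketch of the lifecycle side of that approach, kept free of the Servlet API so it runs on a plain JDK: the hypothetical BackgroundJob class below has a start method that a ServletContextListener could call from contextInitialized and a stop method it could call from contextDestroyed. The counter stands in for the real background work, and every node in a cluster would run its own copy.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical background job with an explicit lifecycle.
class BackgroundJob {

    private final AtomicInteger runs = new AtomicInteger();
    private ScheduledExecutorService scheduler;

    // Intended to be called once when the web application starts.
    synchronized void start(long periodMillis) {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(runs::incrementAndGet, 0, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    // Intended to be called when the web application stops.
    // Returns true if the worker thread ended within the timeout.
    synchronized boolean stop(long timeoutMillis) {
        scheduler.shutdownNow();
        try {
            return scheduler.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int runs() {
        return runs.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BackgroundJob job = new BackgroundJob();
        job.start(10);
        Thread.sleep(100);
        System.out.println("stopped cleanly: " + job.stop(1000));
    }
}
```

Stopping the executor explicitly is what lets the container shut down or redeploy the application without a leftover thread.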

There are a lot of issues, depending on your use case. What if the particular server in the cluster that your thread/job is running on crashes, which makes your thread go away? Would that be a bad thing? Should someone be notified? Should the job move over to another server in the cluster? Should it restart once the server starts up again? All of this you have to implement in your thread. Or you could use JMS, which will even run in Tomcat with the addition of ActiveMQ or some other messaging container of your choice, and just write the code that executes your logic and let the container worry about the rest.
YMMV

Why is spawning threads in Java EE container discouraged?

One of the first things I've learned about Java EE development is that I shouldn't spawn my own threads inside a Java EE container. But when I come to think about it, I don't know the reason.
Can you clearly explain why it is discouraged?
I am sure most enterprise applications need some kind of asynchronous jobs like mail daemons, idle sessions, cleanup jobs etc.
So, if indeed one shouldn't spawn threads, what is the correct way to do it when needed?

It is discouraged because all resources within the environment are meant to be managed, and potentially monitored, by the server. Also, much of the context in which a thread is being used is typically attached to the thread of execution itself. If you simply start your own thread (which I believe some servers will not even allow), it cannot access other resources. What this means, is that you cannot get an InitialContext and do JNDI lookups to access other system resources such as JMS Connection Factories and Datasources.
There are ways to do this "correctly", but it is dependent on the platform being used.
The commonj WorkManager is common for WebSphere and WebLogic as well as others
UPDATE: Please note that this question and answer relate to the state of Java EE in 2009, things have improved since then!

For EJBs, it's not only discouraged, it's expressly forbidden by the specification:
An enterprise bean must not use thread synchronization primitives to synchronize execution of multiple instances.
and
The enterprise bean must not attempt to manage threads. The enterprise bean must not attempt to start, stop, suspend, or resume a thread, or to change a thread’s priority or name. The enterprise bean must not attempt to manage thread groups.
The reason is that EJBs are meant to operate in a distributed environment. An EJB might be moved from one machine in a cluster to another. Threads (and sockets and other restricted facilities) are a significant barrier to this portability.

The reason that you shouldn't spawn your own threads is that these won't be managed by the container. The container takes care of a lot of things that a novice developer can find hard to imagine. For example, things like thread pooling, clustering and crash recovery are performed by the container. When you start a thread you may lose some of those. Also, the container lets you restart your application without affecting the JVM it runs on. How would this be possible if there are threads outside the container's control?
This is the reason that timer services were introduced in J2EE 1.4. See this article for details.

Concurrency Utilities for Java EE
There is now a standard and correct way to create threads with the core Java EE API:
JSR 236: Concurrency Utilities for Java™ EE
By using Concurrency Utils, you ensure that your new thread is created and managed by the container, guaranteeing that all EE services are available.
Examples here

There is no real reason not to do so. I used Quartz with Spring in a webapp without problems. The concurrency framework java.util.concurrent may also be used. If you implement your own thread handling, set the threads to daemon or use your own daemon thread group for them so the container may unload your webapp at any time.
But be careful: the session and request bean scopes do not work in spawned threads! Other code based on ThreadLocal does not work out of the box either; you need to transfer the values to the spawned threads yourself.
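A small sketch of both points, using only the JDK: a pool whose threads are daemons, and a ThreadLocal value that the worker cannot see unless it is captured on the calling thread and handed over explicitly. The names ThreadLocalHandoff and CURRENT_USER and the value "alice" are invented for the example; CURRENT_USER stands in for whatever request-bound state a framework keeps in a ThreadLocal.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo of handing ThreadLocal state to a pooled worker thread.
class ThreadLocalHandoff {

    // Stand-in for request-scoped state bound to the calling thread.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Daemon worker threads do not block JVM exit.
    static final ExecutorService POOL = Executors.newFixedThreadPool(2, runnable -> {
        Thread thread = new Thread(runnable, "webapp-worker");
        thread.setDaemon(true);
        return thread;
    });

    // Returns what the worker sees: [value read from the ThreadLocal, value handed over].
    static String[] runInWorker() {
        CURRENT_USER.set("alice");
        final String captured = CURRENT_USER.get(); // copy while still on the calling thread
        try {
            Future<String[]> seen =
                    POOL.submit(() -> new String[] { CURRENT_USER.get(), captured });
            return seen.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("worker failed", e.getCause());
        } finally {
            CURRENT_USER.remove();
        }
    }

    public static void main(String[] args) {
        String[] seen = runInWorker();
        System.out.println(seen[0] + " / " + seen[1]); // prints null / alice
    }
}
```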

You can always tell the container to start stuff as part of your deployment descriptors. These can then do whatever maintenance tasks you need to do.
Follow the rules. You will be glad some day you did :)

Threads are prohibited in Java EE containers according to the blueprints. Please refer to the blueprints for more information.

I've never read that it's discouraged, except for the fact that it's not easy to do correctly.
It is fairly low-level programming, and like other low-level techniques you ought to have a good reason. Most concurrency problems can be resolved far more effectively using built-in constructs like thread pools.

One reason I have found: if you spawn some threads in your EJB and then try to have the container unload or update your EJB, you are going to run into problems. There is almost always another way to do something where you don't need a thread, so just say NO.
