I know there are some sources on the web for this topic, but it confuses me that every application server seems to have its own way of handling concurrency and offers its own thread pools.
Is it a no-go to create threads yourself on an application server (e.g. new Thread()...)?
If yes, is there a common approach that works independently of the application server used?
The EJB specification assigns to the EJB container the responsibility for managing threads.
Allowing enterprise bean instances to create and manage threads would interfere with the container's ability to control its components' lifecycle. Thread management is not a business function, it is an implementation detail, and is typically complicated and platform-specific.
Letting the container manage threads relieves the enterprise bean developer of dealing with threading issues. Multithreaded applications are still possible, but control of multithreading is located in the container, not in the enterprise bean.
source: http://java.sun.com/blueprints/qanda/ejb_tier/restrictions.html#threads
EJB 3.1 introduces the @Asynchronous annotation, which lets you run some code in a separate thread managed by the container. Howto here.
You should not create your own threads when you are in an EJB. Indeed, EJB 3.1 introduces the @Asynchronous annotation for this purpose. Previous versions of WebSphere and WebLogic implement WorkManager, another ad-hoc standard (the JSR 237 for making it a standard was abandoned in the end) for worker threads that can be launched from an EJB.
Related
There is a Java EE application where we have batches of jobs to process. Processing involves calling an external service with a limitation: we can send only N requests concurrently. This limit has to be enforced in our application logic and I am wondering how we could achieve this in the best way. Fortunately clustering is not a requirement, so we can confine the problem to a single server instance.
My first idea would be using an ExecutorService backed by a ThreadPool with N working threads so that the ThreadPool object would act as the regulator. Of course this is not an EE solution.
My second idea would be somehow configuring such a ThreadPool in the container and using that, but I have not found any feature like this so far.
The third idea is using a Semaphore(N) object in a @Singleton EJB.
The fourth idea is somehow creating a limited pool of stateless session beans and putting the limited-resource access in those. As the number of beans is managed by the container, the resource usage will be limited as well.
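To make the third idea concrete, here is a minimal sketch in plain Java SE so that it runs without a container. The names ThrottledGateway and ThrottleDemo are made up for illustration. In a real deployment the gateway would presumably live in a @Singleton bean, and since a singleton's default container-managed write lock serializes all calls, that bean would likely need @Lock(READ) or bean-managed concurrency for the semaphore to be the actual regulator.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: at most maxConcurrent callers may be inside call() at once.
class ThrottledGateway {
    private final Semaphore permits;

    ThrottledGateway(int maxConcurrent) {
        // Fair semaphore: waiting callers are served in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    <T> T call(Supplier<T> externalRequest) {
        permits.acquireUninterruptibly();   // blocks while N calls are in flight
        try {
            return externalRequest.get();
        } finally {
            permits.release();              // always hand the permit back
        }
    }
}

class ThrottleDemo {
    // Fires `callers` simulated requests and returns the highest number
    // that were ever inside the gateway at the same time.
    static int maxObservedConcurrency(int limit, int callers) {
        ThrottledGateway gateway = new ThrottledGateway(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < callers; i++) {
            pool.execute(() -> gateway.call(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20);       // stand-in for the external service call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                inFlight.decrementAndGet();
                return null;
            }));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxObservedConcurrency(3, 10));
    }
}
```

The peak never exceeds the limit passed to the gateway, no matter how many callers there are; the extra callers simply wait for a permit.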
(To clarify: a general solution would be the best, but it is known that we're running on Glassfish 3.1.1 and maybe later on JBoss 6.x)
Could you suggest a good architecture for this problem and/or comment on my ideas to help my decision?
Why don't you use Works? Have a look here for an overview of how to use Works in JBoss and WebLogic. I don't know about GlassFish, I'll leave the research to you now ;)
In short, Works are EE compliant threads.
The canonical solution for concurrent message processing in Java EE is to use MDBs. You can limit the number of concurrently running tasks by limiting the MDB pool size.
Setting MDB Pool Size in Glassfish
JBoss 7 EJB3 Subsystem Configuration Guide
Okay. This is again a question of industry practice.
Tomcat = Web Container
JBoss, WebLogic, etc = Application Servers that have a Web Container within (for JBoss, it's a forked Tomcat)
Spring does not need an application server like JBoss. If we use enterprise services like JMS, we can use independent systems like RabbitMQ, Apache ActiveMQ, etc.
The question is: why do people still use JBoss and other application servers for purely Spring-based applications?
What advantages can Spring make use of by running on an application server? Object pooling, for example? What specific advantages does an application server offer? How are those configured?
If not for Spring itself, for what other purposes are application servers used with a Spring/Hibernate stack? (Use cases)
Actually I would say listening for JMS is probably the best reason for an application server. A standalone message broker does not fix the problem since you still need a component that's listening for messages. The best way to do this is to use an MDB. In theory you can use Spring's MessageListenerContainer. However, this has several disadvantages: JMS only supports blocking reads, and Spring therefore needs to spin up its own threads, which is totally unsupported (even in Tomcat) and can break transactions, security, naming (JNDI) and class loading (which in turn can break remoting). A JCA resource adapter is free to do whatever it wants, including spinning up threads via WorkManager. Likely a database is used besides JMS (or another destination), at which point you need XA transactions and JTA, in other words an application server. Yes, you can patch this into a servlet container, but at this point it becomes indistinguishable from an application server.
IMHO the biggest reason against application servers is that it takes years after a spec is published (which in turn takes years as well) until servers implement the spec and have ironed out the worst bugs. Only now, right before EE 7 is about to be published, do we have EE 6 servers starting to appear that are not totally riddled with bugs. It gets comical to the point where some vendors no longer fix bugs in their EE 6 line because they're already busy with the upcoming EE 7 line.
Edit
Long explanation of the last paragraph:
Java EE in a lot of places relies on what's called contextual information. Information that's not explicitly passed as an argument from the server/container to the application but implicitly "there". For example the current user for security checks. The current transaction or connection. The current application for looking up classes to lazily load code or deserialize objects. Or the current component (servlet, EJB, …) for doing JNDI look ups. All this information is in thread locals that the server/container sets before calling a component (servlet, EJB, …). If you create your own threads then the server/container doesn't know about them and all the features relying on this information don't work anymore. You might get away with this by just not using any of those features in threads you spawn.
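A small plain-Java illustration of the point about thread locals, in case it helps. CURRENT_USER is only a stand-in for whatever a real container keeps per thread (user, transaction, class loader, JNDI context); the class name ContextDemo is made up.

```java
import java.util.concurrent.atomic.AtomicReference;

class ContextDemo {
    // Stand-in for container state such as the current user or transaction.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns what a freshly created, unmanaged thread sees in CURRENT_USER.
    static String userSeenByNewThread() {
        AtomicReference<String> seen = new AtomicReference<>("unset");
        Thread worker = new Thread(() -> seen.set(CURRENT_USER.get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        // Roughly what a container does before calling a servlet or EJB.
        CURRENT_USER.set("alice");
        System.out.println("request thread sees: " + CURRENT_USER.get());   // alice
        System.out.println("new thread sees:     " + userSeenByNewThread()); // null
    }
}
```

The new thread gets null because a plain ThreadLocal value belongs to the thread that set it. A container-managed thread avoids this only because the container sets the context up again before running your code.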
Some links
http://www.oracle.com/technetwork/java/restrictions-142267.html#threads
http://www.ibm.com/developerworks/websphere/techjournal/0609_alcott/0609_alcott.html#spring-4
If we check the Servlet 3.0 specification we find:
2.3.3.3 Asynchronous processing
Java Enterprise Edition features such as Section 15.2.2, “Web Application Environment” on page 15-174 and Section 15.3.1, “Propagation of Security Identity in EJB™ Calls” on page 15-176 are available only to threads executing the initial request or when the request is dispatched to the container via the AsyncContext.dispatch method. Java Enterprise Edition features may be available to other threads operating directly on the response object via the AsyncContext.start(Runnable) method.
This is about asynchronous processing but the same restrictions apply for custom threads.
public void start(Runnable r) - This method causes the container to dispatch a thread, possibly from a managed thread pool, to run the specified Runnable. The container may propagate appropriate contextual information to the Runnable.
Again, asynchronous processing but the same restrictions apply for custom threads.
15.2.2 Web Application Environment
This type of servlet container should support this behavior when performed on threads created by the developer, but are not currently required to do so. Such a requirement will be added in the next version of this specification. Developers are cautioned that depending on this capability for application-created threads is not recommended, as it is non-portable.
Non-portable means it may work in one server but not in another.
When you want to receive messages with JMS outside of an MDB you can use four methods on javax.jms.MessageConsumer:
#receiveNoWait() you can do this in a container thread, it doesn't block, but it's like peeking. If no message is present it just returns null. This isn't very well suited for listening to messages.
#receive(long) you can do this in a container thread, but it does block. You generally don't want to do blocking waits in a container thread. Again not very well suited for listening to messages.
#receive(), this blocks, possibly indefinitely. Again not very well suited for listening to messages.
#setMessageListener() this is what you want, you get a callback when a message arrives. However, unless the library can hook into the application server, this won't be a container thread. The hooks into the application server are only available via JCA to resource adapters.
So yes, it may work, but it's not guaranteed and there are a lot of things that may break.
You are right that you don't really need a true application server (implementing all Java EE specs) to use Spring. The biggest reason people don't use true Java EE app servers like JBoss is that they have been slow as #$##% on cold start-up, making development a pain (hot deploy still doesn't work that well).
You see there are two camps:
Java EE
Spring Framework.
One of the camps believes in the spec/committee process and the other believes in benevolent dictator / organic OSS process. Both have people with their "agendas".
You're probably not going to get a very good unbiased answer, as these two camps are much like the Emacs vs Vim war.
Answers to your questions, with a Spring bias:
Because in theory it buys you less vendor lock-in (albeit I have found the opposite to be true).
Spring's biggest advantage is AspectJ AOP. By far.
I guess see Philippe's answer.
(start of rant)
Since @PhilippeMarschall defended Java EE I will say that I have done the Tomcat+RabbitMQ+Spring route and it works quite well. @PhilippeMarschall's discussion is valid if you want proper JTA+JMS, but with a proper setup with Spring AMQP and a good transactional database like PostgreSQL this is less of an issue. Also he is incorrect about the message queue transactions not being bound/synchronized to the platform transactions, as Spring supports this (and IMHO much more elegantly with @Transactional AOP). Also AMQP is just plain superior to JMS.
(end of rant)
We are using JBoss over Tomcat for the JNDI data sources and pooling. That way the programmer doesn't have to know anything about the database except its JNDI name.
I'd like to have a Spring-managed bean schedule execution of itself (or some other bean, simple factoring) if certain conditions are met (e.g. checking successful startup).
I'd also like to be able to see and control the timer from within the application, which will be running on a Java EE 5-compliant container.
Not sure how best to do this - I know about the dangers of doing thread management myself in an EE environment.
You could have a base class that is a wrapper to schedule background tasks (could be e.g. an Executor or TimerTask) and be parameterized by the timing intervals or even the task to schedule and you could derive more specific classes specialized on certain tasks.
These you would configure/instantiate via Spring configuration and of course your app could modify these via the properties of the classes/beans.
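A rough sketch of such a base class, using a ScheduledExecutorService from Java SE. The class names and the period parameter are invented for illustration, and note that this creates its own (unmanaged) scheduler thread, so the caveats about threads in EE containers discussed below still apply. In Spring you would presumably declare the concrete subclass as a bean and wire start() and shutdown() as init and destroy methods.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical base class: owns the timer, subclasses only supply the task.
abstract class ScheduledTaskBase {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long periodMillis;
    private ScheduledFuture<?> handle;

    ScheduledTaskBase(long periodMillis) {
        this.periodMillis = periodMillis;
    }

    // The work to run on every tick.
    protected abstract void runOnce();

    synchronized void start() {
        if (handle == null || handle.isCancelled()) {
            handle = scheduler.scheduleAtFixedRate(
                    this::runOnce, 0, periodMillis, TimeUnit.MILLISECONDS);
        }
    }

    // Stops future ticks but keeps the scheduler usable, so start() works again.
    synchronized void pause() {
        if (handle != null) {
            handle.cancel(false);
        }
    }

    synchronized boolean isRunning() {
        return handle != null && !handle.isCancelled();
    }

    // Call on application shutdown; the instance cannot be restarted afterwards.
    void shutdown() {
        pause();
        scheduler.shutdownNow();
    }
}

// Example subclass: a placeholder "startup check" that just counts its runs.
class StartupCheck extends ScheduledTaskBase {
    final AtomicInteger runs = new AtomicInteger();

    StartupCheck(long periodMillis) {
        super(periodMillis);
    }

    @Override
    protected void runOnce() {
        runs.incrementAndGet();
    }
}
```

The isRunning(), pause() and start() methods are what the application would call to see and control the timer at runtime.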
Concerning thread management, I also had concerns, since the Java EE specs (I believe specifically the EJB spec) disallow it, but this perhaps depends on the container. For example in Tomcat, which of course is not a full EE container, I never had an issue with my own threads.
You don't mention which container you are interested in.
Also (friends here can correct if I am wrong) my understanding is that threads are disallowed e.g. in EJB containers etc if you access various resources handled by the container threads.
So if you only want to do some sanity checks (e.g. checking successful startup) and similar, I don't think that this would be an issue. But this is MHO. I am not sure, to be honest.
Probably a repeat! I am using Tomcat as my server and want to know what is best way to spawn threads in the servlet with deterministic outcomes. I am running some long running updates from a servlet action and would like for the request to complete and the updates to happen in the background. Instead of adding a messaging middleware like RabbitMQ, I thought I could spawn a thread that could run in the background and finish in its own time. I read in other SO threads that the server terminates threads spawned by the server in order for it to manage resources well.
Is there a recommended way of spawning threads or background jobs when using Tomcat? I also use Spring MVC for the application.
In a barebones servlet container like Tomcat or Jetty, your safest bet is using an application-wide thread pool with a maximum number of threads, so that the tasks will be queued whenever necessary. The ExecutorService is very helpful in this.
Upon application startup or servlet initialization use the Executors class to create one:
executor = Executors.newFixedThreadPool(10); // Max 10 threads.
Then during servlet's service (you could ignore the result for the case that you aren't interested, or store it in the session for later access):
Future<ReturnType> result = executor.submit(new YourTask(yourData));
Where YourTask must implement Runnable or Callable and can look something like this, whereby yourData is just your data, e.g. populated with request parameter values (just keep in mind that you should absolutely not pass Servlet API artifacts such as HttpServletRequest or HttpServletResponse along!):
public class YourTask implements Runnable {

    private YourData yourData;

    public YourTask(YourData yourData) {
        this.yourData = yourData;
    }

    @Override
    public void run() {
        // Do your task here based on your data.
    }
}
Finally, during the application's shutdown or the servlet's destroy you need to explicitly shut it down, else the threads may run forever and prevent the server from properly shutting down.
executor.shutdownNow(); // Returns the list of tasks that never started, in case you need it.
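If dropping in-flight work is a concern, a gentler variant is to ask for an orderly shutdown first and only fall back to shutdownNow() after a grace period. This is a sketch of that pattern using only the standard ExecutorService methods; the class name PoolShutdown and the grace period are arbitrary choices, not part of the answer above.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolShutdown {
    // Stops the executor and returns the number of queued tasks that never started.
    static int stop(ExecutorService executor, long graceMillis) {
        executor.shutdown();  // reject new tasks, let running and queued ones finish
        try {
            if (executor.awaitTermination(graceMillis, TimeUnit.MILLISECONDS)) {
                return 0;     // everything finished within the grace period
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Grace period over (or we were interrupted): interrupt workers, drain the queue.
        List<Runnable> neverStarted = executor.shutdownNow();
        return neverStarted.size();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.submit(() -> System.out.println("background task ran"));
        int dropped = stop(executor, 1000);
        System.out.println("dropped tasks: " + dropped);
    }
}
```

In a web application, stop(...) would typically be called from the same place as the shutdownNow() line above, i.e. the servlet's destroy or a context listener.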
In case you're actually using a normal JEE server such as WildFly, Payara, TomEE, etc, where EJB is normally available, then you can simply put the @Asynchronous annotation on an EJB method which you invoke from the servlet. You can optionally let it return a Future<T> with AsyncResult<T> as concrete value.
@Asynchronous
public Future<ReturnType> submit() {
    // ... Do your job here.
    return new AsyncResult<ReturnType>(result);
}
see also:
Using special auto start servlet to initialize on startup and share application data
How to run a background task in a servlet based web application?
Is it safe to manually start a new thread in Java EE?
You could maybe use a CommonJ WorkManager (JSR 237) implementation like Foo-CommonJ:
CommonJ − JSR 237 Timer & WorkManager
Foo-CommonJ is a JSR 237 Timer and
WorkManager implementation. It is
designed to be used in containers that
do not come with their own
implementation – mainly plain servlet
containers like Tomcat. It can also be
used in fully blown Java EE applications
servers that do not have a WorkManager
API or have a non-standard API like
JBoss.
Why using WorkManagers?
The common use case is that a Servlet
or JSP needs to aggregate data from
multiple sources and display them in
one page. Doing your own threading in a
managed environment like a J2EE
container is inappropriate and should
never be done in application level
code. In this case the WorkManager API
can be used to retrieve the data in
parallel.
Install/Deploy CommonJ
The deployment of JNDI resources is
vendor dependent. This implementation
comes with a Factory class that
implements the
javax.naming.spi.ObjectFactory
interface, which makes it easily
deployable in the most popular
containers. It is also available as a
JBoss service.
Update: Just to clarify, here is what the Concurrency Utilities for Java EE Preview (looks like this is the successor of JSR-236 & JSR-237) writes about unmanaged threads:
2.1 Container-Managed vs. Unmanaged Threads
Java EE application servers
require resource management in order
to centralize administration and
protect application components from
consuming unneeded resources. This can
be achieved through the pooling of
resources and managing a resource’s
lifecycle. Using Java SE concurrency
utilities such as the
java.util.concurrent API,
java.lang.Thread and
java.util.Timer in a server
application component such as a
servlet or EJB are problematic since
the container and server have no
knowledge of these resources.
By extending the
java.util.concurrent API,
application servers and Java EE
containers can become aware of the
resources that are used and provide
the proper execution context for the
asynchronous operations to run with.
This is largely achieved by providing
managed versions of the predominant
java.util.concurrent.ExecutorService
interfaces.
So nothing new IMO, the "old" problem is the same, unmanaged threads are still unmanaged threads:
They are unknown to the application server and do not have access to Java EE contextual information.
They can use resources behind the back of the application server, and without any administrative ability to control their number and resource usage, this can affect the application server's ability to recover resources after a failure or to shut down gracefully.
References
Concurrency Utilities for Java EE interest site
Concurrency Utilities for Java EE Preview (PDF)
I know it is an old question, but people keep asking it, trying to do this kind of thing (explicitly spawning threads while processing a servlet request) all the time... It is a very flawed approach - for more than one reason... Simply stating that Java EE containers frown upon such practice is not enough, although generally true...
Most importantly, one can never predict how many concurrent requests the servlet will be receiving at any given time. A web application, a servlet, by definition, is meant to be capable of processing multiple requests on the given endpoint at a time. If you program your request processing logic to explicitly launch a certain number of concurrent threads, you risk an all but inevitable situation of running out of available threads and choking your application. Your task executor is always configured to work with a thread pool that is limited to a finite, reasonable size. Most often, it is not larger than 10-20 (you don't want too many threads executing your logic, depending on the nature of the task, the resources they compete for, the number of processors on your server, etc.). Let's say your request handler (e.g. an MVC controller method) invokes one or more @Async-annotated methods (in which case Spring abstracts the task executor and makes things easy for you) or uses the task executor explicitly. As your code executes, it starts grabbing the available threads from the pool. That's fine if you are always processing one request at a time with no immediate follow-up requests. (In that case, you are probably trying to use the wrong technology to solve your problem.) However, if it is a web application that is exposed to arbitrary (or even known) clients who may be hammering the endpoint with requests, you will quickly deplete the thread pool, and the requests will start piling up, waiting for threads to become available. For that reason alone, you should realize that you may be on the wrong path if you are considering such a design.
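To make the depletion visible, here is a small plain-Java sketch. It deliberately uses a bounded queue with the default rejection policy, so overload shows up as rejected submissions instead of a silently growing backlog; with an unbounded queue the same requests would simply pile up. The pool size, queue size and class name are arbitrary choices for the demo.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SaturationDemo {
    // Submits `requests` slow tasks to a pool of `threads` workers with a queue
    // of `queueSize` slots, and returns how many submissions were rejected.
    static int countRejected(int threads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));    // default handler: AbortPolicy
        CountDownLatch release = new CountDownLatch(1);  // keeps every task "slow"
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();                 // simulate a long-running update
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;                              // workers and queue are both full
            }
        }
        release.countDown();                             // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        // 2 workers + 2 queue slots accept 4 tasks; the other 6 are rejected.
        System.out.println("rejected: " + countRejected(2, 2, 10)); // rejected: 6
    }
}
```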
A better solution may be to stage the data to be processed asynchronously (that could be a queue, or any other type of temporary/staging data store) and return the response. Have an external, independent application or even multiple instances of it (deployed outside your web container) poll the staging endpoint(s) and process the data in the background, possibly using a finite number of concurrent threads. Not only will such a solution give you the advantage of asynchronous/concurrent processing, it will also scale, because you will be able to run as many instances of such a poller as you need, and they can be distributed, all pointing to the staging endpoint.
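The shape of that design, reduced to a single process for illustration: the in-memory queue below is only a stand-in for the real staging store (a database table or a message broker), and the method names are made up. In the real setup handleRequest would run in the web application and pollBatch in the separate poller application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class StagingDemo {
    // Stand-in for the staging store shared by the web app and the poller.
    static final BlockingQueue<String> STAGING = new LinkedBlockingQueue<>();

    // What the request handler does: stage the work and return immediately.
    static String handleRequest(String payload) {
        STAGING.add(payload);
        return "accepted";
    }

    // What the poller does: take up to `max` items, waiting briefly for each.
    static List<String> pollBatch(int max, long waitMillis) {
        List<String> processed = new ArrayList<>();
        try {
            for (int i = 0; i < max; i++) {
                String item = STAGING.poll(waitMillis, TimeUnit.MILLISECONDS);
                if (item == null) {
                    break;                          // nothing staged right now
                }
                processed.add(item.toUpperCase());  // placeholder for the real work
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("update-1"));
        System.out.println(handleRequest("update-2"));
        System.out.println(pollBatch(10, 50));
    }
}
```

The request thread never waits for the work itself, and the number of items processed concurrently is decided entirely on the poller side.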
HTH
Spring supports asynchronous tasks (in your case long-running ones) through spring-scheduling. Instead of using Java threads directly, I suggest using it with Quartz.
Resources:
Spring reference: Chapter 23
Strictly speaking, you're not allowed to spawn threads according to the Java EE spec. I would also consider the possibility of a denial of service attack (deliberate or otherwise) if multiple requests come in at once.
A middleware solution would definitely be more robust and standards-compliant.
I have a Java EE application that has two components: the first is a service that scrapes some information from the internet and stores it in a database. The second is a web interface (deployed on Tomcat) where the user can browse that information.
What could be the best approach to implement the first component? Should it be run as a background Daemon/Service or a thread within the container?
I would personally separate them into different processes. Aside from anything else, it means you can restart one without worrying about the other. It also means you can really easily deploy them on different machines without pointlessly installing Tomcat for a service which doesn't actually need a web interface.
Depending on the type of application framework, Spring lets you use Quartz or the java.util.concurrent framework. Spring has a TaskExecutor abstraction (see the Spring documentation) which simplifies a lot of this, but check to see which fits best with your design.
Spring or Quartz (managed by Spring) then controls the creation and starting/stopping of Threads or Executors or Jobs, along with their frequency/period and other scheduling parameters, and also manages any pooling of jobs you might require.
I use these for all my background tasks and batch jobs in any Java EE applications I write with no problems. Since the jobs are Spring-managed POJOs, they have access to the full dependency injection framework and so on that Spring entails, and of course you can switch between scheduler frameworks with a simple change to your application configuration XML file as your needs change or scale.
There is nothing wrong with having background jobs inside a web container, but you MUST let the web container know about it so it can be stopped and started properly.
Have a look at the load-on-startup tag in web.xml. There is some advice at http://wiki.metawerx.net/wiki/Web.xml.LoadOnStartup