I'm running JBoss AS 4.2 with Java 1.6 and my problem is as follows:
Our application is distributed as an EAR and a WAR containing some servlets.
These servlets call the needed EJBs, which do some processing and then return the result.
I'm looking for a way to define a timeout so that, in case the processing takes too long, it simply returns and ends the processing (the processing involves several EJBs).
I could use a ThreadLocal and add code that checks at the beginning of each method whether the time limit has been exceeded - but is there some other mechanism I can use without adding such code to my application?
Any idea / reference would be welcome.
There might be different ways, but the very first thing coming to my mind based on your requirements is interceptors.
You would intercept all your problematic (potentially long-running) calls and throw an exception in case they exceed the time limit.
And yes, a ThreadLocal could do the job of keeping the start time (as long as you don't use asynchronous stateless session bean calls).
To get some idea of what interceptors are capable of, see: http://www.adam-bien.com/roller/abien/entry/interceptors_ejb_3_for_absolute
The main advantage of this solution is a clear separation of the timeout functionality (in the interceptor) from your business logic (in the EJBs).
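A minimal sketch of such an interceptor, assuming EJB 3 style @AroundInvoke and a fixed time budget (the timeout value, class name and exception type are assumptions, not from the question):

import javax.interceptor.AroundInvoke;
import javax.interceptor.InvocationContext;

// Hypothetical timeout interceptor: the outermost intercepted call records its start
// time in a ThreadLocal; any nested intercepted call that begins after the budget has
// elapsed fails fast instead of continuing the processing.
public class TimeoutInterceptor {

    private static final long TIMEOUT_MS = 30000L; // assumed time budget
    private static final ThreadLocal<Long> START = new ThreadLocal<Long>();

    @AroundInvoke
    public Object checkTimeout(InvocationContext ctx) throws Exception {
        boolean outermost = (START.get() == null);
        if (outermost) {
            START.set(Long.valueOf(System.currentTimeMillis()));
        } else if (System.currentTimeMillis() - START.get().longValue() > TIMEOUT_MS) {
            throw new IllegalStateException("Processing exceeded " + TIMEOUT_MS + " ms");
        }
        try {
            return ctx.proceed();
        } finally {
            if (outermost) {
                START.remove(); // pooled threads must not keep stale start times
            }
        }
    }
}

The interceptor would be bound to the relevant beans via @Interceptors or ejb-jar.xml; note that it can only cut the work short at method boundaries, not in the middle of a long-running method.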
I'm developing a Spring MVC web app, and I would like to store the actions of my users (what they click on, etc.) in a database for offline analysis. Let's say an action is a tuple (long userId, long actionId, Date timestamp). I'm not specifically interested in the actions of my users; I just take this as an example.
I expect a lot of actions by a lot of (different) users per minute (or even per second). Hence the processing time is crucial.
In my current implementation, I've defined a datasource with a connection pool to store the actions in a database. I call a service from the request method of a controller, and this service calls a DAO which saves the action into the database.
This implementation is not efficient because it waits for the call to travel from the controller all the way down to the database before the response is returned to the user. Therefore I was thinking of wrapping this "action saving" in a thread, so that the response to the user is faster. The thread does not need to finish for the response to be sent.
I have no experience with these massive, concurrent and time-critical applications, so any feedback/comments would be very helpful.
Now my questions are:
How would you design such system?
would you implement a service and then wrap it into a thread called at every action?
What should I use?
I checked spring Batch, and this JobLauncher, but I'm not sure if it is the right thing for me.
What happens when there are concurrent accesses at the controller, the service, the DAO and the datasource level?
In more general terms, what are the best practices for designing such applications?
Thank you for your help!
Use a singleton object at application level and update it with every user action.
This singleton should hold a collection (e.g. a HashMap) that gets flushed periodically, say once it reaches a threshold of 10,000 entries, and saved to the database as a Spring Batch job.
Also, periodically clean it up to the last record that was processed. You could also re-initialize the singleton instance weekly/monthly. Remember that this can cause consistency issues if your app is deployed on multiple JVMs, and you should also prevent the singleton from being cloned (e.g. throw CloneNotSupportedException from clone()).
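A minimal sketch of that idea, assuming a synchronized in-memory buffer and a hypothetical batch writer standing in for the DAO / Spring Batch layer (all names are made up for illustration):

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Application-level buffer: actions are held in memory and flushed as one batch
// once the threshold is reached.
public class ActionBuffer {

    public static class UserAction {
        public final long userId;
        public final long actionId;
        public final Date timestamp;

        public UserAction(long userId, long actionId, Date timestamp) {
            this.userId = userId;
            this.actionId = actionId;
            this.timestamp = timestamp;
        }
    }

    // Stand-in for your DAO / Spring Batch writer.
    public interface BatchWriter {
        void saveBatch(List<UserAction> batch);
    }

    private static final int THRESHOLD = 10000;

    private final BatchWriter writer;
    private final List<UserAction> buffer = new ArrayList<UserAction>();

    public ActionBuffer(BatchWriter writer) {
        this.writer = writer;
    }

    public synchronized void record(UserAction action) {
        buffer.add(action);
        if (buffer.size() >= THRESHOLD) {
            writer.saveBatch(new ArrayList<UserAction>(buffer)); // flush a copy
            buffer.clear();
        }
    }
}

Keep in mind the caveat above: with multiple JVMs each instance has its own buffer, and anything still in memory is lost if the JVM goes down.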
Here's what I did for that:
Used AspectJ to mark all the actions of the user I wanted to collect.
Then I sent this to log4j with an asynchronous DB appender...
This lets you turn it on or off with the log4j logging level.
It works perfectly.
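A rough sketch of what such an aspect can look like (the pointcut, package and logger name are assumptions; log4j's asynchronous appender does the off-thread database writing):

import org.apache.log4j.Logger;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;

// Logs every matched controller method; the actual persistence happens in the
// asynchronous log4j appender, off the request thread.
@Aspect
public class UserActionLoggingAspect {

    private static final Logger ACTION_LOG = Logger.getLogger("userActions");

    @Before("execution(* com.example.web.controller..*.*(..))")
    public void logAction(JoinPoint jp) {
        ACTION_LOG.info("action=" + jp.getSignature().toShortString());
    }
}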
If you are interested in the actions your users take, you should be able to figure that out from the HTTP requests they send, so you might be better off logging the incoming requests in an Apache webserver that forwards to your application server. Putting a cluster of web servers in front of application servers is a typical practice (they're good for serving static content) and they are usually logging requests anyway. That way the logging will be fast, your application will not have to deal with it, and the biggest work will be writing a script to slurp the logs into a database where you can do analysis.
Typically it is considered bad form to spawn your own threads in a Java EE application.
A better approach would be to write to a local queue via JMS and then have a separate component, e.g., a message driven bean (pretty easy with EJB or Spring) which persists it to the database.
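A rough sketch of the consuming side, assuming one MapMessage per action and a JBoss-style queue binding (the queue name, activation properties and persistence call are assumptions):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.MapMessage;
import javax.jms.Message;
import javax.jms.MessageListener;

// The web tier sends one MapMessage per user action; this bean persists it
// to the database outside the request thread.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/userActions")
})
public class UserActionMdb implements MessageListener {

    public void onMessage(Message message) {
        try {
            MapMessage m = (MapMessage) message;
            long userId = m.getLong("userId");
            long actionId = m.getLong("actionId");
            long timestamp = m.getLong("timestamp");
            // persist (userId, actionId, timestamp) via your DAO / a JDBC batch here
        } catch (Exception e) {
            throw new RuntimeException(e); // let the container roll back and redeliver
        }
    }
}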
Another approach would be to just write to a log file and then have a process read the log file and write to the database once a day or whenever.
The things to consider are:
How up-to-date do you need the information to be?
How critical is the information, can you lose some?
How reliable does the order need to be?
All of these will factor into how many threads you have processing your queue/log file, whether you need a persistent JMS queue and whether you should have the processing occur on a remote system to your main container.
Hope this answers your questions.
If my web application and EJB application are on the same machine (in the same JVM) and all the EJB calls are local calls, will the use of ThreadLocal create any issue while passing information from the web tier to the EJBs?
Is there any workaround if the EJB calls are remote? Will the ThreadLocal information be available from the web application to the EJB application? Is the use of ThreadLocal advisable in such a scenario?
For the first question, there is no problem as long as you remove the ThreadLocal variables at the end of every call. This is important because containers (servlet or EJB) typically use thread pools and therefore reuse threads. This has two effects: one "call" may see ThreadLocal data left over from a previous call, and if you undeploy an app without stopping the JVM, some classes may not be garbage collected because they are still referenced by a container thread. So set the data in a try / finally block and remove it in the finally part.
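A minimal sketch of that pattern (the holder class and attribute are hypothetical):

// Per-request holder; set at the start of the request, always cleared in finally
// so pooled container threads never leak state into the next request.
public final class UserContext {

    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<String>();

    private UserContext() { }

    public static void set(String user) {
        CURRENT_USER.set(user);
    }

    public static String get() {
        return CURRENT_USER.get();
    }

    public static void clear() {
        CURRENT_USER.remove();
    }
}

Typical usage in a servlet filter (or at the top of the servlet call):

UserContext.set(request.getRemoteUser());
try {
    chain.doFilter(request, response); // local EJB calls on this thread see the same value
} finally {
    UserContext.clear();
}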
Here is a post showing one way to handle the problem: ThreadLocal in web applications
For the second question: as the data is thread-local it will not travel with a remote call. You have to add a parameter to your interfaces, extract the ThreadLocal data on one side and recreate it on the other side.
When using EJB 3.1 you can pass around contextual information in the EJBContext using its context data. This is just a Map<String, Object>.
ThreadLocal shouldn't be relied on in EJB contexts. You cannot guarantee that the EJB method invocations all happen on the same thread (although they usually do).
In EJB there is also a different approach called TransactionSynchronizationRegistry. See Explanation/Usage for further details.
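A hedged sketch of the context-data approach (the bean and the "auditUser" key are assumptions; the map is typically populated by an interceptor via InvocationContext.getContextData()):

import javax.annotation.Resource;
import javax.ejb.SessionContext;
import javax.ejb.Stateless;

// Reads per-invocation data from the EJBContext instead of a ThreadLocal.
@Stateless
public class AuditedBean {

    @Resource
    private SessionContext ctx;

    public void doWork() {
        Object caller = ctx.getContextData().get("auditUser"); // set e.g. by an interceptor
        // ... use it for auditing/logging
    }
}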
all the ejb calls are local calls, will the use of ThreadLocal create any issue while ...
No, you answered your question yourself. Since calls are local they are executed in the context of one thread.
Any workaround if the ejb calls are remote?
In the case of remote calls, the Java EE container runs in another JVM and spawns its own threads to handle incoming RMI requests; there is no way for a remote Java EE container to know about ThreadLocal variables that were set on the other side. Pass the data as a parameter object.
It depends what information you are passing! The first question is too generic. I suggest reading the JavaDoc for ThreadLocal here.
A ThreadLocal lives on the server side of the application and is used to keep per-thread state, so that calls running on your threads stay thread-safe.
For local calls, the ThreadLocal should work fine, as long as everything is done in the same thread.
For remote calls, which can potentially run on a different server, you will need to come up with something else. Either pass all values as parameters (which will work, but introduces complexity in the code) or use something like a distributed cache, e.g. Hazelcast, which will function like a global HashMap, which all cluster nodes have access to.
ThreadLocal cannot be used with 100% certainty in web applications. You simply do not have the guarantee that one thread will be used for one session. In my view this can become a very hard-to-find security hole!
ctx.getContextData() does not work for me, it always returns null!
I also tried TransactionSynchronizationRegistry, but I get null as well.
The only thing that worked was using JAAS as a workaround. But it is not a nice solution.
Probably a repeat! I am using Tomcat as my server and want to know the best way to spawn threads from a servlet with deterministic outcomes. I am running some long-running updates from a servlet action and would like the request to complete while the updates happen in the background. Instead of adding messaging middleware like RabbitMQ, I thought I could spawn a thread that runs in the background and finishes in its own time. I read in other SO threads that the server may terminate threads spawned this way in order to manage its resources well.
Is there a recommended way of spawning threads / background jobs when using Tomcat? I also use Spring MVC for the application.
In a barebones servlet container like Tomcat or Jetty, your safest bet is to use an application-wide thread pool with a maximum number of threads, so that tasks are queued whenever necessary. The ExecutorService is very helpful here.
Upon application startup or servlet initialization, use the Executors class to create one:
executor = Executors.newFixedThreadPool(10); // Max 10 threads.
Then, during the servlet's service method, submit your task (you can ignore the result if you aren't interested in it, or store it in the session for later access):
Future<ReturnType> result = executor.submit(new YourTask(yourData));
Where YourTask must implement Runnable or Callable and can look something like this, whereby yourData is just your data, e.g. populated with request parameter values (just keep in mind that you should absolutely not pass Servlet API artifacts such as HttpServletRequest or HttpServletResponse along!):
public class YourTask implements Runnable {

    private YourData yourData;

    public YourTask(YourData yourData) {
        this.yourData = yourData;
    }

    @Override
    public void run() {
        // Do your task here based on your data.
    }
}
Finally, during the application's shutdown or the servlet's destroy, you need to shut the executor down explicitly, or else its threads may run forever and prevent the server from shutting down properly.
executor.shutdownNow(); // Returns a list of tasks that were still awaiting execution.
In case you're actually using a normal Java EE server such as WildFly, Payara, TomEE, etc., where EJB is normally available, then you can simply put the @Asynchronous annotation on an EJB method which you invoke from the servlet. You can optionally let it return a Future<T> with AsyncResult<T> as the concrete value.
@Asynchronous
public Future<ReturnType> submit() {
    // ... Do your job here.
    return new AsyncResult<ReturnType>(result);
}
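For completeness, a hedged sketch of the calling side, reusing the ReturnType placeholder from above (the servlet and bean names are assumptions):

import java.io.IOException;
import java.util.concurrent.Future;
import javax.ejb.EJB;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Triggers the @Asynchronous EJB method shown above and returns immediately.
@WebServlet("/submit")
public class SubmitServlet extends HttpServlet {

    @EJB
    private AsyncBean asyncBean; // hypothetical bean owning the @Asynchronous submit() method

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        Future<ReturnType> result = asyncBean.submit(); // does not block
        response.getWriter().println("Job started"); // the EJB keeps running in the background
    }
}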
See also:
Using special auto start servlet to initialize on startup and share application data
How to run a background task in a servlet based web application?
Is it safe to manually start a new thread in Java EE?
You could maybe use a CommonJ WorkManager (JSR 237) implementation like Foo-CommonJ:
CommonJ − JSR 237 Timer & WorkManager
Foo-CommonJ is a JSR 237 Timer and WorkManager implementation. It is designed to be used in containers that do not come with their own implementation – mainly plain servlet containers like Tomcat. It can also be used in fully blown Java EE application servers that do not have a WorkManager API or have a non-standard API like JBoss.
Why use WorkManagers?
The common use case is that a Servlet or JSP needs to aggregate data from multiple sources and display them in one page. Doing your own threading in a managed environment like a J2EE container is inappropriate and should never be done in application-level code. In this case the WorkManager API can be used to retrieve the data in parallel.
Install/Deploy CommonJ
The deployment of JNDI resources is vendor dependent. This implementation comes with a Factory class that implements the javax.naming.spi.ObjectFactory interface, which makes it easily deployable in the most popular containers. It is also available as a JBoss service. more...
Update: Just to clarify, here is what the Concurrency Utilities for Java EE Preview (looks like this is the successor of JSR-236 & JSR-237) writes about unmanaged threads:
2.1 Container-Managed vs. Unmanaged Threads
Java EE application servers require resource management in order to centralize administration and protect application components from consuming unneeded resources. This can be achieved through the pooling of resources and managing a resource's lifecycle. Using Java SE concurrency utilities such as the java.util.concurrency API, java.lang.Thread and java.util.Timer in a server application component such as a servlet or EJB is problematic since the container and server have no knowledge of these resources.
By extending the java.util.concurrent API, application servers and Java EE containers can become aware of the resources that are used and provide the proper execution context for the asynchronous operations to run with. This is largely achieved by providing managed versions of the predominant java.util.concurrent.ExecutorService interfaces.
So nothing new IMO; the "old" problem is the same, unmanaged threads are still unmanaged threads:
They are unknown to the application server and do not have access to Java EE contextual information.
They can use resources behind the application server's back, and without any administrative ability to control their number and resource usage, this can affect the application server's ability to recover resources after failure or to shut down gracefully.
References
Concurrency Utilities for Java EE interest site
Concurrency Utilities for Java EE Preview (PDF)
I know it is an old question, but people keep asking it, trying to do this kind of thing (explicitly spawning threads while processing a servlet request) all the time. It is a very flawed approach, for more than one reason. Simply stating that Java EE containers frown upon the practice is not enough, although it is generally true.
Most importantly, one can never predict how many concurrent requests the servlet will be receiving at any given time. A web application, a servlet, by definition, is meant to be capable of processing multiple requests on a given endpoint at a time. If you program your request-processing logic to explicitly launch a certain number of concurrent threads, you risk an all but inevitable situation of running out of available threads and choking your application. Your task executor is always configured to work with a thread pool that is limited to a finite, reasonable size. Most often, it is not larger than 10-20 (you don't want too many threads executing your logic - depending on the nature of the task, the resources they compete for, the number of processors on your server, etc.).
Let's say your request handler (e.g. an MVC controller method) invokes one or more @Async-annotated methods (in which case Spring abstracts the task executor and makes things easy for you) or uses the task executor explicitly. As your code executes, it starts grabbing the available threads from the pool. That's fine if you are always processing one request at a time with no immediate follow-up requests. (In that case, you are probably trying to use the wrong technology to solve your problem.) However, if it is a web application that is exposed to arbitrary (or even known) clients who may be hammering the endpoint with requests, you will quickly deplete the thread pool, and the requests will start piling up, waiting for threads to become available. For that reason alone, you should realize that you may be on the wrong path - if you are considering such a design.
A better solution may be to stage the data to be processed asynchronously (that could be a queue, or any other type of temporary/staging data store) and return the response. Have an external, independent application, or even multiple instances of it (deployed outside your web container), poll the staging endpoint(s) and process the data in the background, possibly using a finite number of concurrent threads. Not only will such a solution give you the advantage of asynchronous/concurrent processing, it will also scale, because you can run as many instances of such a poller as you need, and they can be distributed, pointing at the staging endpoint(s).
HTH
Spring supports asynchronous tasks (in your case, long-running ones) through spring-scheduling. Instead of using Java threads directly, I suggest using it with Quartz.
Resources:
Spring reference: Chapter 23
Strictly speaking, you're not allowed to spawn threads according to the Java EE spec. I would also consider the possibility of a denial of service attack (deliberate or otherwise) if multiple requests come in at once.
A middleware solution would definitely be more robust and standards-compliant.
I'm refactoring a big piece of code at the moment where a long-running operation is executed in a servlet. Sometimes I don't get a response after the operation has finished. (It has finished, because it is printed in the logs.)
What I wish to achieve would be some "fire and forget" behaviour from the servlet. I would pass my params to the action and the servlet would immediately return a status (something like: the operation has started, check your logs for further info).
Is this possible with the Servlet 2.5 spec? I think I could get such behaviour with JMS; are there maybe any other solutions out there?
Asynchronous servlets would serve your purpose, but they are available only as part of the Servlet 3.0 spec. You can read more about async servlets here.
There are a couple of ways of doing this. Asynchronous servlets are part of the Servlet api 3.0. I've known a lot of people that would fire off a separate thread, usually a daemon thread. The drawback to spawning your own threads is that you lose any "container" advantages you might have, since the thread runs more or less independently within the JVM. What I've used most often is a message driven bean fed by JMS, it runs in the EJB container with all those attendant advantages and disadvantages. YMMV.
Instead of starting (and managing) your own threads you should consider using Java's ExecutorService abstraction (the Executor/Future framework). If you're using Spring you can define an Executor as just another bean in Spring's context, and your servlet can simply call it, passing your task as an instance of Runnable. There should be plenty of samples if you Google it.
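As a hedged illustration, assuming Spring's TaskExecutor abstraction with a ThreadPoolTaskExecutor bean wired in (the class and bean names are assumptions):

import org.springframework.core.task.TaskExecutor;

// Hands long-running work to a Spring-managed executor instead of raw threads.
public class LongRunningService {

    private final TaskExecutor taskExecutor;

    public LongRunningService(TaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    public void submit(final String jobId) {
        taskExecutor.execute(new Runnable() {
            public void run() {
                // do the long-running work for jobId here
            }
        });
    }
}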
If upgrading to Servlet 3.0 (part of Java EE 6, with GlassFish v3 as, so far, the only implementation; Tomcat 7 is still on its way and expected in about a month) is not an option, then an alternative is Comet. Almost all Java servlet containers have facilities for this. It's unclear which one you're using, so here's a Tomcat 6 targeted document: What is the Apache Tomcat Comet API.
Alternatively, you can fire a separate Thread in a servlet so that the servlet method can return directly. You can also store the Thread in the session so that the status can be retrieved in subsequent requests. If necessary, let it implement HttpSessionBindingListener as well so that you can interrupt it whenever the session expires.
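A hedged sketch of that idea (the class name and status handling are assumptions):

import javax.servlet.http.HttpSessionBindingEvent;
import javax.servlet.http.HttpSessionBindingListener;

// Stored in the session by the servlet; valueUnbound() interrupts the worker
// when the session expires or the attribute is removed.
public class BackgroundJob implements Runnable, HttpSessionBindingListener {

    private volatile String status = "PENDING";
    private volatile Thread worker;

    public void start() {
        worker = new Thread(this);
        worker.start();
    }

    public void run() {
        status = "RUNNING";
        // ... long-running work ...
        status = "DONE";
    }

    public String getStatus() {
        return status; // subsequent requests can poll this via the session attribute
    }

    public void valueBound(HttpSessionBindingEvent event) {
        // nothing to do
    }

    public void valueUnbound(HttpSessionBindingEvent event) {
        if (worker != null) {
            worker.interrupt(); // stop the work when the session goes away
        }
    }
}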
I want to start a background process in a Java EE (OC4J 10) environment. It seems wrong to just start a thread with "new Thread", but I can't find a good way to do this.
Using a JMS queue is difficult in my special case, since my parameters for this method call are not serializable.
I also thought about using an onTimeout timer method on a session bean, but this does not allow me to pass parameters (as far as I know).
Is there any canonical way to handle such a task, or do I just have to resort to "new Thread" or a java.concurrent.ThreadPool?
Java EE usually attempts to remove threading from the developer's concerns. (Its success at this is a completely different topic.)
JMS is clearly the preferred approach to handle this.
With most parameters, you have the option of forcing or faking serialization, even if they aren't serializable by default. Depending on the data, consider wrapping it in a serializable object that can reload the data. This will clearly depend on the parameter and application.
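For illustration only, a minimal sketch of such a wrapper (the names and the reload strategy are assumptions):

import java.io.Serializable;

// Instead of serializing the non-serializable object itself, carry just enough
// identifying data for the receiving side (e.g. an MDB) to reload it.
public class ReloadableReference implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String entityType;
    private final long entityId;

    public ReloadableReference(String entityType, long entityId) {
        this.entityType = entityType;
        this.entityId = entityId;
    }

    public String getEntityType() {
        return entityType;
    }

    public long getEntityId() {
        return entityId;
    }
}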
JMS is the Java EE way of doing this. You can start your own threads if the container lets you, but that does violate the Java EE spec (you may or may not care about this).
If you don't care about Java EE generic compliance (if you would in fact resort to threads rather than deal with JMS), the Oracle container will for sure have proprietary ways of doing this (such as the OracleAS Job Scheduler).
I don't know OC4J in detail, but I have used the Thread approach and a java.util.Timer approach to perform some tasks in a Tomcat-based application. In Java 5+ there is the option to use one of the Executor services (Scheduled, Priority).
I don't know about onTimeout, but you could pass parameters around in the session itself, the app context or in a static variable (discouraged, some would say). But the name tells me it is invoked when the user's session times out and you want to do some cleanup.
Using JMS is the right way to do it, but it's heavier weight.
The advantage you get is that if you need multiple servers, one server or whatever, once the servers are configured, your "Threading" can now be distributed to multiple machines.
It also means you don't want to send a message for a truly trivial amount of work or with a massive amount of data. Choose your interface points well.
see here for some more info:
stackoverflow.com/questions/533783/why-spawning-threads-in-j2ee-container-is-discouraged
I've been creating threads in a container (Tomcat, JBoss) with no problem, but they were really simple queues, and I don't rely on clustering.
However, EJB 3.1 will introduce asynchronous invocation that you may find useful:
http://www.theserverside.com/tt/articles/article.tss?track=NL-461&ad=700869&l=EJB3-1Maturity&asrc=EM_NLN_6665442&uid=2882457
Java EE doesn't really forbid you to create your own threads; it's the EJB spec that says "unmanaged threads" aren't allowed. The reason is that these threads are unknown to the application server, and therefore the container cannot manage things like security and transactions on them.
Nevertheless, there are lots of frameworks out there that do create their own threads, for example Quartz, Axis and Spring. Chances are you're already using one of these, so it's not that bad to create your own threads as long as you're aware of the consequences. That said, I agree with the others that the use of JMS or JCA is preferred over manual thread creation.
By the way, OC4J allows you to create your own threads. However it doesn't allow JNDI lookups from these unmanaged threads. You can disable this restriction by specifying the -userThreads argument.
I come from a .NET background, and JMS seems quite heavyweight to me. Instead, I recommend Quartz, which is a background-scheduling library for Java and Java EE apps. (I used Quartz.NET in my ASP.NET MVC app with much success.)