I'm creating a web service written in Java and hosted on JBoss AS. I'm not a professional in web-service design yet, but do I understand correctly that each call to the service starts a new thread rather than a new process? Does it make sense to have synchronized methods in my service? I need a method that is invoked by only one user at a time, never simultaneously by multiple users.
Yes, requests are handled by individual handler threads. There is a single process for all of JBoss.
Synchronization can be problematic if your application ends up getting hosted across multiple nodes in a cluster. The locks won't propagate across multiple JVMs without the help of some magic like Terracotta. For a simple solution you can use a pessimistic row lock in your database to control access. One would of course be inclined to challenge the entire design that requires a blocking method and look for an alternative that can run in parallel.
Also, Locks from the java.util.concurrent package are preferred to the synchronized keyword if you are going that route.
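For instance, within a single JVM a hypothetical service method guarded by a ReentrantLock might look like this (a minimal sketch; the class and method names are made up):

    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    public class ExclusiveService {

        // One lock instance shared by every request thread that hits this service
        private final Lock lock = new ReentrantLock();

        public void processForSingleUser(String userId) {
            lock.lock();
            try {
                // Only one thread at a time executes this section
                // ... do the exclusive work here ...
            } finally {
                lock.unlock();   // always release, even if the work throws
            }
        }
    }

As noted above, this only serializes calls within one JVM; in a cluster you would need a database lock or a distributed locking mechanism instead.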
I'm developing a Spring MVC web app, and I would like to store the actions of my users (what they click on, etc.) in a database for offline analysis. Let's say an action is a tuple (long userId, long actionId, Date timestamp). I'm not specifically interested in the actions of my users; I'm just using this as an example.
I expect many actions from many (different) users per minute (or even per second), so processing time is crucial.
In my current implementation, I've defined a datasource with a connection pool to store the actions in a database. I call a service from the request method of a controller, and this service calls a DAO which saves the action into the database.
This implementation is not efficient because the response to the user is not returned until the call has travelled from the controller all the way down to the database and back. I was therefore thinking of wrapping this "action saving" in a thread, so that the response to the user is faster; the thread would not need to finish before the response is returned.
I have no experience with such massive, concurrent, time-critical applications, so any feedback/comments would be very helpful.
Now my questions are:
How would you design such a system?
Would you implement a service and then wrap it in a thread that is started for every action?
What should I use?
I looked at Spring Batch and its JobLauncher, but I'm not sure it is the right tool for this.
What happens when there are concurrent accesses at the controller, service, DAO, and datasource levels?
In more general terms, what are the best practices for designing such applications?
Thank you for your help!
Use a singleton object at the application level and update it with every user action.
This singleton should hold a collection (e.g. a HashMap) of the actions and be flushed periodically, say once it reaches a threshold of 10,000 entries, saving the batch to the database (for example via Spring Batch).
Also, on every flush, clear the entries that have just been processed; you could additionally re-initialize the singleton instance weekly or monthly. Remember that this can lead to problems with keeping the data consistent if your application is deployed on multiple JVMs, and you should also protect the singleton itself, for example by having clone() throw CloneNotSupportedException.
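A rough sketch of what such an application-level buffer might look like, assuming a flush threshold of 10,000 entries and a stubbed-out persistence call (all names here are invented):

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    public final class ActionBuffer {

        private static final ActionBuffer INSTANCE = new ActionBuffer();
        private static final int FLUSH_THRESHOLD = 10_000;   // assumed threshold

        private final List<UserAction> pending = new ArrayList<>();

        private ActionBuffer() { }

        public static ActionBuffer getInstance() {
            return INSTANCE;
        }

        // Called from the controller/service for every user action
        public synchronized void record(UserAction action) {
            pending.add(action);
            if (pending.size() >= FLUSH_THRESHOLD) {
                flush();
            }
        }

        // Hand the accumulated batch to the persistence layer (e.g. a Spring Batch job) and clear the buffer
        private void flush() {
            List<UserAction> batch = new ArrayList<>(pending);
            pending.clear();
            // saveBatch(batch);   // hypothetical DAO / Spring Batch call
        }
    }

    // Simple value object matching the (userId, actionId, timestamp) tuple from the question
    class UserAction {
        final long userId;
        final long actionId;
        final Date timestamp;

        UserAction(long userId, long actionId, Date timestamp) {
            this.userId = userId;
            this.actionId = actionId;
            this.timestamp = timestamp;
        }
    }

Note the caveat above: with multiple JVMs each node would accumulate its own buffer, and any actions still held in memory are lost if a node crashes.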
Here's what I did for that:
Used AspectJ to mark all the user actions I wanted to collect.
Then I sent these events to log4j with an asynchronous DB appender.
This lets you turn the logging on or off with the log4j logging level.
Works perfectly.
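A minimal sketch of what such an aspect could look like, assuming AspectJ annotation style and a dedicated log4j logger with an asynchronous DB appender configured for it (the pointcut and package names are invented):

    import org.apache.log4j.Logger;
    import org.aspectj.lang.JoinPoint;
    import org.aspectj.lang.annotation.AfterReturning;
    import org.aspectj.lang.annotation.Aspect;

    @Aspect
    public class UserActionLoggingAspect {

        // Dedicated logger; attach an asynchronous DB appender to it in the log4j configuration
        private static final Logger ACTION_LOG = Logger.getLogger("user.actions");

        // Hypothetical pointcut: every public method of the web controllers
        @AfterReturning("execution(public * com.example.web..*Controller.*(..))")
        public void logAction(JoinPoint jp) {
            if (ACTION_LOG.isInfoEnabled()) {        // lowering the level switches the logging off
                ACTION_LOG.info(jp.getSignature().toShortString());
            }
        }
    }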
If you are interested in the actions your users take, you should be able to work them out from the HTTP requests they send, so you might be better off logging the incoming requests in an Apache web server that forwards to your application server. Putting a cluster of web servers in front of the application servers is a typical practice (they are good at serving static content), and they usually log requests anyway. That way the logging will be fast, your application will not have to deal with it, and the biggest job left is writing a script to load the logs into a database where you can do your analysis.
Typically it is considered bad form to spawn your own threads in a Java EE application.
A better approach would be to write to a local queue via JMS and then have a separate component, e.g., a message driven bean (pretty easy with EJB or Spring) which persists it to the database.
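A hedged sketch of that approach, assuming a Spring JmsTemplate on the producing side and an EJB message-driven bean on the consuming side (the queue name, payload format, and DAO call are all assumptions):

    // --- ActionPublisher.java: called from the web tier, returns immediately ---
    import org.springframework.jms.core.JmsTemplate;

    public class ActionPublisher {

        private final JmsTemplate jmsTemplate;   // configured elsewhere with a connection factory

        public ActionPublisher(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        public void publish(long userId, long actionId) {
            // "queue/userActions" is an assumed destination name
            jmsTemplate.convertAndSend("queue/userActions", userId + ";" + actionId);
        }
    }

    // --- ActionConsumer.java: message-driven bean that persists each action ---
    import javax.ejb.ActivationConfigProperty;
    import javax.ejb.MessageDriven;
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.TextMessage;

    @MessageDriven(activationConfig = {
        @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
        @ActivationConfigProperty(propertyName = "destination", propertyValue = "queue/userActions")
    })
    public class ActionConsumer implements MessageListener {

        @Override
        public void onMessage(Message message) {
            try {
                String payload = ((TextMessage) message).getText();
                // actionDao.save(payload);   // hypothetical DAO call writing to the database
            } catch (JMSException e) {
                throw new RuntimeException(e);
            }
        }
    }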
Another approach would be to just write to a log file and then have a process read the log file and write to the database once a day or whenever.
The things to consider are:
How up-to-date do you need the information to be?
How critical is the information, can you lose some?
How reliable does the order need to be?
All of these will factor into how many threads you have processing your queue/log file, whether you need a persistent JMS queue, and whether the processing should occur on a system separate from your main container.
Hope this answers your questions.
I have five separate Java processes running as business logic modules. I would like to develop a process management application where I can start/ping/monitor/message the child processes.
It may also share resources such as a cache with the child processes, over REST web services or, as a worst case, RMI calls (which require additional overhead).
I was leaning towards a web-service-based API that would keep sending information about the business logic running within the processes. The processes can be data-churning, computation, or notification engines.
Any ideas?
One option is to use JMX and publish one or more MBeans. Oracle has documentation on it. You can use them to request information from the processes, or to send them signals to change their behavior.
The bare-bones outline of what you would do: decide which methods you need to expose remotely in each of your child processes. Each process should define an interface with those methods and an implementation of that interface, and those implementations then need to be registered with the MBeanServer.
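For example (the interface, attribute, and object name below are invented for illustration):

    // --- WorkerStatusMBean.java: the management interface; by JMX convention its name ends in "MBean" ---
    public interface WorkerStatusMBean {
        long getProcessedCount();
        void pause();
    }

    // --- WorkerStatus.java: the implementation, registered with the platform MBeanServer ---
    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class WorkerStatus implements WorkerStatusMBean {

        private volatile long processedCount;
        private volatile boolean paused;

        public long getProcessedCount() { return processedCount; }
        public void pause()             { paused = true; }

        // Called once at startup of each child process
        public static WorkerStatus register() throws Exception {
            WorkerStatus status = new WorkerStatus();
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            server.registerMBean(status, new ObjectName("com.example:type=WorkerStatus"));
            return status;
        }
    }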
The advantage of this approach is that you will immediately get a bare-bones 'management application', because you can open JConsole against your processes and use the MBeans. If you then wish to create a separate application that will more cleanly present your data, you can do so at your leisure, without changing the child processes.
This approach does not really get you anywhere towards 'sharing a cache', but sharing a cache between processes (or machines) should really be a separate question (I think).
In my current work, for one use case we make several remote service calls (SOAP over HTTP) in sequence. These are independent calls, and I have to collate the data from each call and finally prepare my response. I want to parallelize these calls.
Sounds like you should use an ExecutorService.
Make a class that performs your query and implements Callable (Runnable will do if you don't need a result back). You can then submit instances of this class to the executor, and it will look after running them in multiple threads (pooling etc., all configurable). You get back a Future object for each submission, and you simply call get() on it to obtain your result.
The framework means you don't have to worry about instantiating threads, setting up pooling, determining what's run etc.
Here's the tutorial.
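A bare-bones sketch of what that looks like; the remote SOAP call itself is stubbed out:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelServiceCalls {

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            try {
                List<Future<String>> futures = new ArrayList<>();
                for (final String endpoint : new String[] {"serviceA", "serviceB", "serviceC"}) {
                    futures.add(pool.submit(new Callable<String>() {
                        @Override
                        public String call() {
                            return callRemoteService(endpoint);   // stands in for the real SOAP call
                        }
                    }));
                }
                StringBuilder response = new StringBuilder();
                for (Future<String> f : futures) {
                    response.append(f.get());   // blocks until that particular call has completed
                }
                System.out.println(response);
            } finally {
                pool.shutdown();
            }
        }

        private static String callRemoteService(String endpoint) {
            return "data from " + endpoint + " ";   // hypothetical stub
        }
    }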
Spawning threads in Java EE is a no-go we're told. However, the OP doesn't say whether Java EE or Java SE is used.
For Java EE the WorkManager API may be useful.
Other than that: yes, an ExecutorService, or Spring's TaskScheduler (though that is rather unlikely to fit, if I understood the problem correctly).
If my web application and EJB application are on the same machine (in the same JVM) and all the EJB calls are local calls, will the use of ThreadLocal create any issue while passing information from web to EJB?
Any workaround if the ejb calls are remote? Will ThreadLocal information be available from web application to ejb application? Is use of ThreadLocal advisable in such scenario?
For the first question, there is no problem as long as you remove the ThreadLocal variables at the end of every call. This is important because containers (servlet or EJB) typically use thread pools and therefore reuse threads. That has two effects: one "call" may see ThreadLocal data left over from a previous call, and if you undeploy an application without stopping the JVM, some classes may not be garbage collected because they are still referenced by a container thread. So set the ThreadLocal data inside a try/finally block and remove it in the finally part.
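As an illustration, a servlet filter could do the set/remove around each request; the holder class and parameter name here are made up:

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;

    public class UserContextFilter implements Filter {

        // Hypothetical holder for per-request data, readable further down the call chain
        private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<String>();

        public static String currentUser() {
            return CURRENT_USER.get();
        }

        @Override
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            CURRENT_USER.set(req.getParameter("user"));   // or derive it from the session / principal
            try {
                chain.doFilter(req, res);
            } finally {
                CURRENT_USER.remove();   // critical: the pooled thread will be reused for other requests
            }
        }

        @Override
        public void init(FilterConfig config) { }

        @Override
        public void destroy() { }
    }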
Here is a post showing one way to handle the problem: ThreadLocal in web applications
For the second question: since the data is thread-local, it will not travel with a remote call. You have to add a parameter to your interfaces, extract the ThreadLocal data on one side, and recreate it on the other side.
When using EJB 3.1 you can pass around contextual information in the EJBContext using its context data. This is just a Map<String,Object>.
ThreadLocal shouldn't be used in EJB contexts. You cannot guarantee that the EJB method invocations all happen on the same thread (although of course they usually do).
In EJB there is a different approach called TransactionSynchronizationRegistry. See Explanation/Usage for further details.
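A rough sketch of the EJB 3.1 context-data approach mentioned above; the interceptor, bean, and key names are invented:

    import java.util.Map;
    import javax.annotation.Resource;
    import javax.ejb.SessionContext;
    import javax.ejb.Stateless;
    import javax.interceptor.AroundInvoke;
    import javax.interceptor.Interceptors;
    import javax.interceptor.InvocationContext;

    // Interceptor that places contextual information into the per-invocation context data
    class CallerInfoInterceptor {
        @AroundInvoke
        public Object addCallerInfo(InvocationContext ic) throws Exception {
            ic.getContextData().put("callerInfo", "web-tier-user");   // assumed key and value
            return ic.proceed();
        }
    }

    @Stateless
    @Interceptors(CallerInfoInterceptor.class)
    public class AuditedService {

        @Resource
        private SessionContext ctx;

        public void doWork() {
            // In EJB 3.1 the same Map<String, Object> is reachable via the EJBContext during the invocation
            Map<String, Object> contextData = ctx.getContextData();
            Object caller = contextData.get("callerInfo");
            // ... use caller ...
        }
    }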
all the EJB calls are local calls, will the use of ThreadLocal create any issue while...
No, you answered your question yourself. Since calls are local they are executed in the context of one thread.
Any workaround if the ejb calls are remote?
In the case of remote calls, the Java EE container runs in another JVM and spawns its own threads to handle incoming RMI requests; there is no way for a remote Java EE container to know about ThreadLocal variables that were set on the other side. Pass the data as a parameter object instead.
It depends on what information you are passing! The first question is too generic. I suggest reading the JavaDoc for ThreadLocal here.
ThreadLocal values live on the server side of the application and are used to confine state to a single thread, keeping your calls thread-safe.
For local calls, the ThreadLocal should work fine, as long as everything is done in the same thread.
For remote calls, which can potentially run on a different server, you will need to come up with something else. Either pass all values as parameters (which will work, but introduces complexity in the code) or use something like a distributed cache, e.g. Hazelcast, which will function like a global HashMap, which all cluster nodes have access to.
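For example, with Hazelcast the shared map could be obtained roughly like this (the map name and keys are arbitrary):

    import java.util.Map;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    public class SharedContextDemo {

        public static void main(String[] args) {
            // Each node starts (or joins) a Hazelcast cluster member
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // A map visible to every node in the cluster; "request-context" is an arbitrary name
            Map<String, String> shared = hz.getMap("request-context");
            shared.put("userId", "42");

            System.out.println(shared.get("userId"));
            hz.shutdown();
        }
    }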
ThreadLocal cannot be used with 100% certainty in web applications. You simply do not have the guarantee that one thread will be used for one session. In my view this can become a very hard-to-find security hole!
ctx.getContextData() does not work for me, it always returns null!
I also tried TransactionSynchronizationRegistry, but I get null as well.
The only thing that worked was using JAAS as a workaround, but it is not a nice solution.
When do we need the single-threaded model when designing a web application in Java?
The single-threaded model should almost always be avoided. (I'm assuming you're talking about the SingleThreadModel interface.) Basically it was introduced in an attempt to save people from having to think about concurrency, but it was a bad idea. Concurrency is inherent in web applications - introducing a bottleneck like the single threaded model is the wrong solution. The right solution is to educate developers about concurrency better, and introduce better building blocks for handling it.
The interface is deprecated as of the Java Servlet API 2.4, with this note:
"Note that SingleThreadModel does not solve all thread safety issues. For example, session attributes and static variables can still be accessed by multiple requests on multiple threads at the same time, even when SingleThreadModel servlets are used. It is recommended that a developer take other means to resolve those issues instead of implementing this interface, such as avoiding the usage of an instance variable or synchronizing the block of the code accessing those resources. This interface is deprecated in Servlet API version 2.4."
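In code terms, the alternative the note describes is simply to keep per-request data in local variables and synchronize only around genuinely shared state, for example:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class CounterServlet extends HttpServlet {

        // Shared, mutable state: access must be guarded explicitly
        private int hitCount;

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            // Per-request data lives in local variables and is naturally thread-safe
            String name = req.getParameter("name");

            int current;
            synchronized (this) {        // short critical section instead of SingleThreadModel
                current = ++hitCount;
            }

            resp.getWriter().println("Hello " + name + ", visit #" + current);
        }
    }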
When your servlet has state (which is a bad idea) and you want to prevent multiple requests from stepping on each other's toes (or data).
I would recommend you avoid it because at some point you will mess something up. Also, performance drops like a brick.
The single thread model for servlets is used to signal that the servlet cannot handle multiple concurrent threads from client connections. Marking a servlet as single-threaded results in the servlet container (application server) serializing requests to each instance, typically by maintaining a pool of servlet instances so that each instance handles only one request at a time.
It is best practice not to use the single thread model for servlets. Data kept per client connection is typically stored in the client Session object.