Clustered event driven Java application - Should I use Websockets or polling? - java

I'm creating a monitor application that monitors the activities of a user. There are four elements in my system:
EventCatcher: The EventCatcher is responsible for catching all the events that happen in a subsystem and pushes the data to the EventHandler. Based from observation, there is an average of 10 events per second that is being pushed to the EventHandler. Some events are UserLogin, UserLogout.
EventHandler: The EventHandler is a singleton class that handles all the incoming events from the EventCatcher. It also keeps track of all the logged in users in the system. So, whenever the EventHandler receives a UserLogin event, the User object is extracted from the event and is stored in a HashMap. When a UserLogout event is received, that User object will be remove from the HashMap. This class also maintains a Set of all active Websocket sessions because everytime an event has occurred, I would want to inform all the open sessions that a particular event happened.
Websocket Endpoint: This is just a simple Java class annotated with #ServerEndpoint.
Clients: The system I will be building is for internal (company) use only. At production, at most, there will only be around 5 - 10 clients. All the clients will be receiving the same information every time an event has occurred.
So right now I am trying to convince my supervisor that Websockets is the way to go, however, my supervisor finds it really unnecessary because a simple polling solution would do the trick.
His points are:
We don't really need up-to-date information by the millisecond. We can poll every second.
If I was to maintain a list of open WebSocket sessions, how would that work in a clustered environment (we use a load balancer)
If I plan to send information to the client every time an event (UserLogin, UserLogout) has occurred, I should be able to just send small updates to all WebSocket sessions - meaning, I can't be sending a whole JSON dump of everything. So that means, for every WebSocket instance, I would have to maintain another Set of Users and properly maintain it to mirror the Set contained in the EventHandler.
What my supervisor suggests is that I lose the WebSocket and just convert it to a simple Servlet and let the clients poll every second to receive the entire JSON dump.
In this scenario, should I stick with WebSockets? Or should I just poll?
The main advantage, as far as I've read, of Websockets vs. polling is that by using Websockets, you will have a persistent connection from client to server. HTTP is not really meant for real-time data.
Also, polling requires sending an HTTP request every time and every request comes with HTTP headers. If an HTTP request header contains 800 bytes, then that's 48kb sent per minute per client. With a WebSocket, this isn't problem.
But then again, we won't really have a lot of active clients. We're not concerned about third parties sniffing our requests because this system is for company use only - internal use! And I believe my supervisor wants something simple and reliable.
I am fine with either way. I just want to be sure whether I'm using the right tool for the job.
Additional question: If WebSockets is the way to go, is there any reason why I should consider polling?

The entire purpose of WebSocket is to efficiently support continuing connections between client and server.
I’m not clear on how you are implementing your app. If this is a web app running in a Servlet environment leveraging WebSocket support in the web server, be aware that you need to use recent versions of the Servlet container. For example, with Tomcat you must use either version 8 or the latest updates to version 7.
And of course the web browser must have support for WebSocket.
Be aware that WebSocket is still a new technology that has been changing and evolving in both the specs and the implementations.
Atmosphere
You may want to consider using the Atmosphere framework. Atmosphere supports multiple techniques of Push including WebSocket & Comet.
The Vaadin web-app framework leverages Atmosphere to provide automatic support for Push in your app. By default, WebSocket is automatically attempted first. If WebSocket is not available, Vaadin+Atmosphere falls back automatically to the other techniques including polling.

Related

How to handle thousands of request in REST based web service at a time?

Is making REST based web service (POST) asynchronous is the best way to handle thousands of requests at one time (Keeping in mind that I have only single instance of server serving the request)?
Edited:
Jersey is wrongly tagged.
For eg: I have a rest based web service, which is supposed to be consumed by 100 thousand clients within a very short span of time (~60 seconds). I understand that if I am allowed to deploy multiple instance of the server, then I can use a load balancer to handle all my incoming request and delegate them accordingly. But I am restricted to use only single instance. What design could I opt within this restriction?
I could think of making the request asynchronous( which will not respond to client immediately ) in order to be able to let the server be free from this load and handle the requests at it's own pace.
For now we can ignore memory limitations.
Please let me know if this clarifies your doubt?
The term asynchronous could have different meanings in different places. For a web application code, it could refer to a Nonblocking I/O server such as Node or Netty/Akka which is a way for HTTP Requests to time multiplex on the same worker threads. If you're writing callbacks or using async or future constructs, it probably is non-blocking I/O which people sometimes refer to as asynchronous.
However, I could have REST API running on Node which implements non-blocking I/O, but the API or the overall architecture is still fully synchronous. For example, let's say I have an API endpoint POST /photos, which takes in a photo, creates image thumbnails, stores the URLs of the photo in a SQL Db and then stores the images in S3. The REST API could still block from the initial POST until after the image is processed and stored.
A second way is for the server to accept the photo process as a job and return immediately. Then the server could store the photo in a in memory or network based queue to be processed later by some other worker thread. In fact, I could even implement this async architecture even with a blocking server like some good old Java 7 and Jetty.

Restful Server Response triggered Via Client

This question might sound a bit abstract,answered (but did my search didn't stumble on a convenient answer) or not specific at all ,but I will try to provide as much information as I can.
I am building a mobile application which will gather and send sensory data to a remote server. The remote server will collect all these data in a mySQL database and make computations (not the mysql database ,another process/program) . What I wanna know is :
After some updates in the database , is it doable to send a response from a RESTful Server to a certain client (the one who like did the last update probably) ,using something like "a background thread"? Or this should be done via socket connection through server-client response?
Some remarks:
I am using javaEE, Spring MVC with hibernate and tomcat (cause I am familiar with the environment though in a more asynchronous manner).
I thought this would be a convenient way because the SQL schema is not much complicated and security and authentication issues are not needed (it's a prototype).
Also there is a front-end webpage that will have to visualize these data, so such a back-end system would look like a good option for getting the job done fast.
Lastly I saw this solution :
Is there a way to 'listen' for a database event and update a page in real time?
My issue is that besides the page I wanna update the client's side with messages from the RESTful server.
If all these above are unecessary and a more simple client-server application will prove better and less complex please be welcome to inform me.
Thank you in advance.
Generally you should upload your data to a resource on the server (e.g. POST /widgets and the server should immediately return with a 201 Created or (if creation is too slow and needs to happen later) 202 Accepted status. There are several approaches after that happens, each has their merits:
Polling - The server's response includes a location field which the client can then proceed to poll until a change happens (e.g. check for an update every second). This is the easiest approach and quite efficient if you use HTTP caching effectively and the average number of checks is relatively low.
Push notification - Server sends a push notification when the change happens, report's generated, etc. Obviously this requires you to store the client's details and their notification requirements. This is probably the cleanest approach and also easy to scale. In the case of Android (also iOS) you have free push notifications available via Google Cloud Messaging.
Set up a persistent connection between client and server, e.g. using a Websocket or low-level TCP connection. This should yield the fastest response times, but will probably be a drain on phone battery, harder to scale on the server, and more complex to code.

fire and forget compared to http request

I'm looking for opinion from you all. I have a web application that need to records data into another web application database. I not prefer to use HTTP request GET on 2nd application because of latency issue. I looking for fast way to save records on 2nd application quickly, I came across the idea of "fire and forget" , will JMS suit for this scenario? from my understanding JMS will guarantee message delivery, guarantee whether message will be 100% deliver is not important as long as can serve as many requests as possible. Let say I need to call at least 1000 random requests per seconds to 2nd application should I use JMS? HTTP request? or XMPP instead?
I think you're misunderstanding networking in general. There's positively no reason that a HTTP GET would have to be any slower than anything else, and if HTTP takes advantage of keep alives it's faster that most options.
JMX isn't a protocol, it's a specification that wraps many other protocols including, possibly, HTTP or XMPP.
In the end, at the levels where Java will operate, there's either UDP or TCP. TCP has more overhead by guarantees delivery (via retransmission) and ordering. UDP offers neither guaranteed delivery nor in-order delivery. If you can deal with UDP's limitations you'll find it "faster", and if you can't then any lightweight TCP wrapper (of which HTTP is one) is just about the same.
Your requirements seem to be:
one client and one server (inferred from your first sentence),
HTTP is mandatory (inferred from your talking about a web application database),
1000 or more record updates per second, and
individual updates do not need to be acknowledged synchronously (you are willing to use "fire and forget" approach.
The way I would approach this is to have the client threads queue the updates internally, and implement a client thread that periodically assembles queued updates into one HTTP request and sends it to the server. If necessary, the server can send a response that indicates the status for individual updates.
Batching eliminates the impact of latency on the client, and potentially allows the server to process the updates more efficiently.
The big difference between HTTP and JMS or XMPP is that JMS and XMPP allow asynchronous fire and forget messaging (where the client does not really know when and if a message will reach its destination and does not expect a response or an acknowledgment from the receiver). This would allow the first app to respond fast regardless of the second application processing time.
Asynchronous messaging is usually preferred for high-volume distributed messaging where the message consumers are slower than the producers. I can't say if this is exactly your case here.
If you have full control and the two web applications run in the same web container and hence in the same JVM, I would suggest using JNDI to allow both web applications to get access to a common data structure (a list?) which allows concurrent modification, namely to allow application A to add new entries and application B to consume the oldest entries simultaneously.
This is most likely the fastest way possible.
Note, that you should keep the information you put in the list to classes found in the JRE, or you will most likely run into class cast exceptions. These can be circumvented, but the easiest is most likely to just transfer strings in the common data structure.

Real world use of JMS/message queues? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I was just reading abit about JMS and Apache ActiveMQ.
And was wondering what real world use have people here used JMS or similar message queue technologies for ?
JMS (ActiveMQ is a JMS broker implementation) can be used as a mechanism to allow asynchronous request processing. You may wish to do this because the request take a long time to complete or because several parties may be interested in the actual request. Another reason for using it is to allow multiple clients (potentially written in different languages) to access information via JMS. ActiveMQ is a good example here because you can use the STOMP protocol to allow access from a C#/Java/Ruby client.
A real world example is that of a web application that is used to place an order for a particular customer. As part of placing that order (and storing it in a database) you may wish to carry a number of additional tasks:
Store the order in some sort of third party back-end system (such as SAP)
Send an email to the customer to inform them their order has been placed
To do this your application code would publish a message onto a JMS queue which includes an order id. One part of your application listening to the queue may respond to the event by taking the orderId, looking the order up in the database and then place that order with another third party system. Another part of your application may be responsible for taking the orderId and sending a confirmation email to the customer.
Use them all the time to process long-running operations asynchronously. A web user won't want to wait for more than 5 seconds for a request to process. If you have one that runs longer than that, one design is to submit the request to a queue and immediately send back a URL that the user can check to see when the job is finished.
Publish/subscribe is another good technique for decoupling senders from many receivers. It's a flexible architecture, because subscribers can come and go as needed.
I've had so many amazing uses for JMS:
Web chat communication for customer service.
Debug logging on the backend. All app servers broadcasted debug messages at various levels. A JMS client could then be launched to watch for debug messages. Sure I could've used something like syslog, but this gave me all sorts of ways to filter the output based on contextual information (e.q. by app server name, api call, log level, userid, message type, etc...). I also colorized the output.
Debug logging to file. Same as above, only specific pieces were pulled out using filters, and logged to file for general logging.
Alerting. Again, a similar setup to the above logging, watching for specific errors, and alerting people via various means (email, text message, IM, Growl pop-up...)
Dynamically configuring and controlling software clusters. Each app server would broadcast a "configure me" message, then a configuration daemon that would respond with a message containing all kinds of config info. Later, if all the app servers needed their configurations changed at once, it could be done from the config daemon.
And the usual - queued transactions for delayed activity such as billing, order processing, provisioning, email generation...
It's great anywhere you want to guarantee delivery of messages asynchronously.
Distributed (a)synchronous computing.
A real world example could be an application-wide notification framework, which sends mails to the stakeholders at various points during the course of application usage. So the application would act as a Producer by create a Message object, putting it on a particular Queue, and moving forward.
There would be a set of Consumers who would subscribe to the Queue in question, and would take care handling the Message sent across. Note that during the course of this transaction, the Producers are decoupled from the logic of how a given Message would be handled.
Messaging frameworks (ActiveMQ and the likes) act as a backbone to facilitate such Message transactions by providing MessageBrokers.
I've used it to send intraday trades between different fund management systems. If you want to learn more about what a great technology messaging is, I can thoroughly recommend the book "Enterprise Integration Patterns". There are some JMS examples for things like request/reply and publish/subscribe.
Messaging is an excellent tool for integration.
We use it to initiate asynchronous processing that we don't want to interrupt or conflict with an existing transaction.
For example, say you've got an expensive and very important piece of logic like "buy stuff", an important part of buy stuff would be 'notify stuff store'. We make the notify call asynchronous so that whatever logic/processing that is involved in the notify call doesn't block or contend with resources with the buy business logic. End result, buy completes, user is happy, we get our money and because the queue is guaranteed delivery the store gets notified as soon as it opens or as soon as there's a new item in the queue.
I have used it for my academic project which was online retail website similar to Amazon.
JMS was used to handle following features :
Update the position of the orders placed by the customers, as the shipment travels from one location to another. This was done by continuously sending messages to JMS Queue.
Alerting about any unusual events like shipment getting delayed and then sending email to customer.
If the delivery is reached its destination, sending a delivery event.
We had multiple also implemented remote clients connected to main Server. If connection is available, they use to access the main database or if not use their own database. In order to handle data consistency, we had implemented 2PC mechanism.
For this, we used JMS for exchange the messages between these systems i.e one acting as coordinator who will initiate the process by sending message on the queue and others will respond accordingly by sending back again a message on the queue.
As others have already mentioned, this was similar to pub/sub model.
I have seen JMS used in different commercial and academic projects. JMS can easily come into your picture, whenever you want to have a totally decoupled distributed systems. Generally speaking, when you need to send your request from one node, and someone in your network takes care of it without/with giving the sender any information about the receiver.
In my case, I have used JMS in developing a message-oriented middleware (MOM) in my thesis, where specific types of object-oriented objects are generated in one side as your request, and compiled and executed on the other side as your response.
Apache Camel used in conjunction with ActiveMQ is great way to do Enterprise Integration Patterns
We have used messaging to generate online Quotes
We are using JMS for communication with systems in a huge number of remote sites over unreliable networks. The loose coupling in combination with reliable messaging produces a stable system landscape: Each message will be sent as soon it is technically possible, bigger problems in network will not have influence on the whole system landscape...

Whats the best way to process an asynchronous queue continuously in Java?

I'm having a hard time figuring out how to architect the final piece of my system. Currently I'm running a Tomcat server that has a servlet that responds to client requests. Each request in turn adds a processing message to an asynchronous queue (I'll probably be using JMS via Spring or more likely Amazon SQS).
The sequence of events is this:
Sending side:
1. Take a client request
2. Add some data into a DB related to this request with a unique ID
3. Add a message object representing this request to the message queue
Receiving side:
1. Pull a new message object from the queue
2. Unwrap the object and grab some information from a web site based on information contained in the msg object.
3. Send an email alert
4. update my DB row (same unique ID) with the information that operation was completed for this request.
I'm having a hard figuring out how to properly deal with the receiving side. On one hand I can probably create a simple java program that I kick off from the command line that picks each item in the queue and processes it. Is that safe? Does it make more sense to have that program running as another thread inside the Tomcat container? I will not want to do this serially, meaning the receiving end should be able to process several objects at a time -- using multiple threads. I want this to be always running, 24 hours a day.
What are some options for building the receiving side?
"On one hand I can probably create a simple java program that I kick off from the command line that picks each item in the queue and processes it. Is that safe?"
What's unsafe about it? It works great.
"Does it make more sense to have that program running as another thread inside the Tomcat container?"
Only if Tomcat has a lot of free time to handle background processing. Often, this is the case -- you have free time to do this kind of processing.
However, threads aren't optimal. Threads share common I/O resources, and your background thread may slow down the front-end.
Better is to have a JMS queue between the "port 80" front-end, and a separate backend process. The back-end process starts, connects to the queue, fetches and executes the requests. The backend process can (if necessary) be multi-threaded.
If you are using JMS, why are you placing the tasks into a DB?
You can use a durable Queue in JMS. This would keep tasks, even if the JMS broker dies, until they have been acknowledged. You can have redundant brokers so that if one broker dies, the second automatically takes over. This could be more reliable than using a single DB.
If you are already using Spring, check out DefaultMessageListenerContainer. It allows you to create a POJO message driven bean. This can be used from within an existing application container (your WAR file) or as a separate process.
I've done this sort of thing by hosting the receiver in an app server, weblogic in my case, but tomcat works fine, too. Don't poll the queue, use an event-based model. This could be hand-coded or it could be a message-driven web service. If the database update is idempotent, you could update the database and send the email, then issue the commit on the queue. It's not a problem to have several threads that all read from the same queue.
I've use various JMS solutions, including tibco, activemq (before apache subsumed it) and joram. Joram was the more reliable opensource solution, but that may have changed now that it's part of apache.

Categories

Resources