I'd like to implement a web-based dashboard with a variety of metrics, where one changes every minute and others change perhaps twice a day. The metrics should be updated via AJAX as quickly as possible whenever a change occurs. This means the same page would be running for at least several hours.
What would be the most efficient way (technology-/implementation-wise) of dealing with this in the Java world?
Well, there are two obvious options here:
Comet, aka long polling: the AJAX request is held open by the server until it times out after a few minutes or until a change occurs, whichever happens first. The downside of this is that handling many connections can be tricky; aside from anything else, you won't want the typical "one thread per request, handling it synchronously" model which is common.
Frequent polling from the AJAX page, where each request returns quickly. This would probably be simpler to implement, but is less efficient in network terms (far more requests) and will be less immediate; you could send a request every 5 seconds for example, but if you have a lot of users you're going to end up with a lot of traffic.
The best solution will depend on how many users you've got. If there are only going to be a few clients, you may well want to go for the "poll every 5 seconds" approach - or even possibly long polling with a thread per request (although that will probably be slightly harder to implement). If you've got a lot of clients I'd definitely go with long polling, but you'll need to look at how to detach the thread from the connection in your particular server environment.
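For instance, with the Servlet 3.0 async API the "detaching" looks roughly like this. This is a minimal sketch under assumed names (MetricLongPollServlet, /metrics, publish), not production code; a real version would also register an AsyncListener so timed-out contexts get removed from the queue.

import java.io.IOException;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Long polling without tying up a container thread per waiting client.
@WebServlet(urlPatterns = "/metrics", asyncSupported = true)
public class MetricLongPollServlet extends HttpServlet {

    // Connections currently parked, waiting for a metric change.
    private static final Queue<AsyncContext> WAITERS = new ConcurrentLinkedQueue<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();   // the request thread is released here
        ctx.setTimeout(30_000);                // client re-polls after 30 s of no news
        WAITERS.add(ctx);
    }

    // Called by whatever detects a metric change; completes all waiting requests.
    public static void publish(String json) {
        AsyncContext ctx;
        while ((ctx = WAITERS.poll()) != null) {
            try {
                ctx.getResponse().setContentType("application/json");
                ctx.getResponse().getWriter().write(json);
            } catch (IOException ignored) {
                // the client probably went away; nothing useful to do
            } finally {
                ctx.complete();
            }
        }
    }
}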
I think the time of Comet has passed. The newer Socket.IO protocol is gaining popularity, and I suggest using netty-socketio: it supports both long-polling and WebSocket transports, and JavaScript, iOS, and Android client libraries are also available.
Related
We are developing a site that will allow users to send semi-real-time events to other users. The UI will display an icon when there is a new event for a user (pretty standard stuff).
I have read that periodic short polling does not scale as well as WebSockets because it puts more pressure on the web server, but I am not quite sure why this would be the case.
We are using Tomcat NIO (which does not have a one-to-one connection-per-thread ratio). As I understand it, Tomcat NIO is pretty good at handling longer HTTP connection timeouts with a small number of threads.
So, if the periodic polling time is less than the connection timeout, then the polling should not have to create another TCP handshake, as it will just reuse an existing HTTP 1.1 connection.
Thus, the above does not seem like it would put too much pressure on the server. It may not be as real-time as long polling or WebSockets, but I do not see why it should not scale (assuming the server can quickly respond with whether there is a new event or not – we use an in-memory ConcurrentHashMap, so this should be pretty fast, with no database access needed).
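For concreteness, the lookup behind each poll is roughly of this shape (simplified, with illustrative names):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Short-poll handler that answers from memory, so each poll is one cheap map lookup.
public class EventPoller {

    // userId -> number of unseen events, updated elsewhere when events arrive
    private final ConcurrentMap<String, Integer> pendingEvents = new ConcurrentHashMap<>();

    // Returns a tiny payload the UI can use to show or hide the "new event" icon.
    public String poll(String userId) {
        int count = pendingEvents.getOrDefault(userId, 0);
        return "{\"newEvents\":" + count + "}";
    }

    // Called on the event-producing side.
    public void recordEvent(String userId) {
        pendingEvents.merge(userId, 1, Integer::sum);
    }
}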
Am I missing anything?
Thanks,
-Adam
Short polling may not be as trendy as long polling and web sockets but it works and works everywhere.
Trello (backed by some of the same people as SO) normally uses web sockets but when they encountered a crippling bug in their web sockets implementation on launch day they were saved by short polling:
We hit a problem right after launch. Our WebSocket server implementation started behaving very strangely under the sudden and heavy real-world usage of launching at TechCrunch disrupt, and we were glad to be able to revert to plain polling and tune server performance by adjusting the active and idle polling intervals. It allowed us to degrade gracefully as we increased from 300 to 50,000 users in under a week. We’re back on WebSockets now, but having a working short-polling system still seems like a very prudent fallback.
The full story is well worth a read.
I'd particularly highlight,
The use of HAProxy to terminate the client connection, meaning that internal web servers are shielded from slow and misbehaving clients, and that the overhead of repeatedly creating connections becomes less of an issue thanks to HAProxy's scalability/efficiency;
Trello's polling frequency was adjustable, meaning that under heavy load they could tell all clients to poll less frequently, exchanging responsiveness for increased capacity (a rough sketch of that idea follows).
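As a sketch of that second point (purely illustrative, not Trello's actual code), the poll response itself can tell the client how long to wait before polling again, so the interval can be raised at runtime when the servers are busy:

import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/poll")
public class TunablePollServlet extends HttpServlet {

    // Raised by an ops endpoint or a load monitor when capacity gets tight.
    static final AtomicInteger pollIntervalMillis = new AtomicInteger(2_000);

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("application/json");
        resp.getWriter().write(
            "{\"events\":[],\"nextPollMillis\":" + pollIntervalMillis.get() + "}");
    }
}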
In Brazil at least there are many retail trading platforms that use short polling, with very short polling intervals for rapid publication of stock prices, and regularly support thousands of concurrent users.
Unlike long polling and web sockets, short polling doesn't require a persistent connection so with something like HAProxy in the middle your maximum number of "connections" could actually be greater than the number of concurrent sockets supported by your hardware (although at that point you'd probably be seeing some degradation in responsiveness).
My Java web application pulls some data from external systems (JSON over HTTP), both live whenever a user of my application requests it and in batch (nightly updates for cases where no user has requested it). The data changes, so the options for caching are pretty much exhausted.
The external systems have some throttling in place, the exact parameters of which I don't know, and which likely change depending on system load (e.g., at peak times 10 requests per second from one IP address, at off-peak times 100 requests per second from one IP address). If the requests are too frequent, they time out or return HTTP 503.
Right now I attempt the request 5 times with a 2000 ms delay between attempts, giving up if an error is received every time. This is not optimal: at peak times nearly all requests sometimes fail, and I could avoid making some of these requests and perhaps get at least a few to succeed instead.
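For reference, the current retry policy is essentially the following (a simplified sketch; the Callable stands in for the real HTTP call):

import java.util.concurrent.Callable;

public final class FixedDelayRetry {

    // Up to 5 attempts with a fixed 2000 ms pause between them, then give up.
    public static <T> T call(Callable<T> request) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= 5; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {          // timeout or HTTP 503 surfaced as an exception
                last = e;
                if (attempt < 5) Thread.sleep(2000);
            }
        }
        throw last;
    }
}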
My goals are a somewhat simple, reliable design with enough flexibility that I can both pull some metrics from the throttler to understand how well the external systems are responding (and thus adjust how often they are invoked), and auto-adjust the interval at which I call them (individually per system) so that it is optimal during both off-peak and peak hours.
My infrastructure is Java with RabbitMQ and MongoDB, running on Linux.
I'm thinking of three main options:
Since I already use RabbitMQ for batch processing, I could introduce a queue to which the web processes send the requests they have for external systems; worker processes would then read from that queue, throttle themselves as needed, and return the results. This would allow running multiple parallel worker processes on more servers if needed. My main concerns are that it isn't a very simple solution, and how to handle peak-hour throughput being low, with the web processes waiting for a long while. It also turns RabbitMQ into a critical single point of failure: if it dies, the whole system stops (as opposed to the nightly batch processes just not running, which is less critical). I suppose RPC is the correct RabbitMQ usage pattern here, but I'm not sure. Edit – I've posted a related question, How to properly implement RabbitMQ RPC from Java servlet web container?, on how to implement this.
Introduce nginx (e.g. ngx_http_limit_req_module), HAProxy, or other proxy software into the mix (as reverse proxies?) and have them take care of the throttling through some configuration magic. The pro is that I don't have to make code changes. The con is that it is another technology, and one I've not used before, so the chances of misconfiguring something are quite high. It would also likely not be easy to do dynamic throttling depending on external server load, to prioritize live requests over batch requests, or to get statistics on how the throttling is doing. Also, most documentation and examples will likely be about throttling incoming requests, not outgoing ones.
Do a pure-Java solution (e.g., a leaky-bucket implementation; a minimal sketch follows below). It would be simple in the sense that it is "just code", but the devil is in the details; debugging all the deadlocks, starvation, and race conditions isn't always fun.
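To make option 3 concrete, this is roughly what I have in mind – a minimal token/leaky-bucket style limiter (rate and capacity are placeholders; a real one would need per-system configuration and dynamic adjustment based on observed 503s):

public final class SimpleRateLimiter {

    private final double permitsPerSecond;
    private final double maxPermits;
    private double storedPermits;
    private long lastRefillNanos;

    public SimpleRateLimiter(double permitsPerSecond, double maxPermits) {
        this.permitsPerSecond = permitsPerSecond;
        this.maxPermits = maxPermits;
        this.storedPermits = maxPermits;
        this.lastRefillNanos = System.nanoTime();
    }

    // Blocks the calling worker until it may issue one outgoing request.
    public synchronized void acquire() throws InterruptedException {
        refill();
        while (storedPermits < 1.0) {
            long waitMillis = Math.max(1,
                (long) Math.ceil((1.0 - storedPermits) / permitsPerSecond * 1000));
            wait(waitMillis);
            refill();
        }
        storedPermits -= 1.0;
    }

    // Adds back permits for the time elapsed since the last refill, up to the bucket size.
    private void refill() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1e9;
        storedPermits = Math.min(maxPermits, storedPermits + elapsedSeconds * permitsPerSecond);
        lastRefillNanos = now;
    }
}

(Guava's RateLimiter implements essentially this idea, so "just code" might not even mean writing it from scratch.)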
What am I missing here?
Which is the best solution in this case?
P.S. A somewhat related question – what's the proper approach to logging all the external system invocations, so that statistics are collected on how often I invoke them and what the success rate is?
E.g., after every invocation I'd invoke something like .logExternalSystemInvocation(externalSystemName, wasSuccessful, elapsedTimeMills), and then get some aggregate data out of it whenever needed.
Is there a standard library/tool to use, or do I have to roll my own?
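If I do end up rolling my own, I imagine something minimal like this (illustrative sketch only; a metrics library would presumably do the aggregation better):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

public class InvocationStats {

    static final class Stats {
        final LongAdder calls = new LongAdder();
        final LongAdder failures = new LongAdder();
        final LongAdder totalMillis = new LongAdder();
    }

    private final ConcurrentMap<String, Stats> bySystem = new ConcurrentHashMap<>();

    // Record one call to an external system.
    public void logExternalSystemInvocation(String externalSystemName,
                                            boolean wasSuccessful,
                                            long elapsedTimeMillis) {
        Stats s = bySystem.computeIfAbsent(externalSystemName, k -> new Stats());
        s.calls.increment();
        if (!wasSuccessful) s.failures.increment();
        s.totalMillis.add(elapsedTimeMillis);
    }

    // Aggregate view for one system: call count, success rate, average latency.
    public String report(String externalSystemName) {
        Stats s = bySystem.getOrDefault(externalSystemName, new Stats());
        long calls = s.calls.sum();
        double successRate = calls == 0 ? 1.0 : 1.0 - (double) s.failures.sum() / calls;
        double avgMillis = calls == 0 ? 0.0 : (double) s.totalMillis.sum() / calls;
        return String.format("%s: %d calls, %.1f%% success, avg %.0f ms",
                externalSystemName, calls, successRate * 100, avgMillis);
    }
}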
If I use option 1 with RabbitMQ, is there a way to organize the flow so that I get this out of the box from the RabbitMQ console? I wouldn't want to send all failed messages to a poison queue – it would fill up too quickly, and in most cases there is no need to re-process these failed requests, as the user has sadly already moved on.
Perhaps this open source system can help you a little: http://code.google.com/p/valogato/
I just started coding in node.js a little while ago. Here is one of my questions about it:
In HTTP apps, given the request-response model, the single app thread is blocked until all the back-end tasks are done and the response is returned to the client, so the performance improvement seems limited to fine-tuning back-end things like parallelizing IO requests. (Granted, this improvement matters when many heavy and independent IO operations are involved, but usually that condition also implies that by redesigning the data structures you could eliminate a large number of IO requests and possibly end up with even better performance than just issuing parallelized operations.)
If that is true, how could it deliver better performance than frameworks based on Java (or PHP, Python, etc.)?
I also referred to an article Understanding the node.js event loop, which also explains that situation:
It really is a single thread running: you can’t do any parallel code execution; doing a “sleep” for example will block the server for one second:
var now = new Date().getTime();
while (new Date().getTime() < now + 1000) {
  // do nothing
}
…however, everything runs in parallel except your code.
I personally verified this by putting exactly that "sleep" code into one IO callback closure, then submitting a request that leads to this callback, followed by another one. Each request triggers a console log when it is processed, and my observation was that the latter request was blocked until the former had returned a response.
So, does it imply that only in socket mode, where both sides can emit events and push messages to each other at any time, would the full power of its asynchronous processing capability be utilized?
I'm a little confused about that. Any comment or advice is welcome. Thanks!
update
I ask this question because some performance evaluation cases are reported, for instance Node.js is taking over the Enterprise – whether you like it or not, and LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster.
Some radical opinion claims that J2EE will be totally replaced: J2EE is Dead: Long-live Javascript Backed by JSON Services.
NodeJS uses libuv, so IO operations are non-blocking. Yes, your Node app uses one thread; however, all the IO requests are pushed onto an event queue. When a request is made, its response obviously won't be readable from the socket, file, etc. instantly, so whatever is ready in the queue is popped and handled. In the meantime, your requests can be answered; there might be chunks or full data ready to be read, but they just wait in the queue to be processed. This goes on until no events remain or the open sockets are closed; only then can NodeJS finally end its execution.
As you can see, NodeJS is not like other frameworks; it is quite different. If you have a long-running, non-IO (i.e., blocking) operation, like matrix operations or image and video processing, you can spawn other processes and assign them the job, using message passing however you like (TCP, IPC).
The main point of NodeJS is to remove unnecessary context switches, which bring significant overhead when not used properly. In NodeJS, why would you want context switches? All the jobs are pushed onto the event queue, and they are probably small in computation, since all they do is issue multiple IOs (read from the DB, update the DB, write to the client, write to a bare TCP socket, read from a cache); it is not logical to stop them in the middle and switch to another job. So, with the help of libuv, whichever IO is ready can be handled right away.
For reference please look at libuv documentation: http://nikhilm.github.io/uvbook/basics.html#event-loops
I have also noticed a lot of radical opinions regarding Node.js performance when compared to Java. From a queuing theory perspective, I was skeptical about how a single thread that never blocks could outperform multiple threads that block. I thought I would conduct my own investigation into just how well Node.js performs against a more established and mature technology.
I evaluated Node.js by writing a functionally identical multiple-datasource micro-service in both Node.js and DropWizard / Java, then subjected both implementations to the same load test. I collected performance measurements from both tests and analyzed the data.
At one fifth the code size, Node.js had comparable latency and 16% lower throughput than DropWizard.
I can see how Node.js has caught on with early stage start-up companies. It is easier to write micro-services very quickly in Node.js and get them running than it is with Java. As companies mature, their focus tends to shift from finding product / market fit to improving economies of scale. This might explain why more established companies prefer Java with its higher scalability.
As far as my (admittedly brief) experience with node.js goes, I agree that the performance of a node.js server cannot be compared with that of web servers like Tomcat, as stated somewhere in the node.js docs:
It really is a single thread running: you can’t do any parallel code execution; doing a “sleep” for example will block the server for one second.
So we used it not as an alternative to a full-fledged web server like Tomcat, but just to take some load off Tomcat in places where the single-threaded model works for us. So it has to be a trade-off somewhere.
Also see http://www.sitepoint.com/node-js-is-the-new-black/ – that's a nice article about node.js.
I have a Java servlet that's getting overloaded by client requests during peak hours. Some clients spawn concurrent requests. Sometimes the number of requests per second is just too great.
Should I implement application logic to restrict the number of request client can send per second? Does this need to be done on the application level?
The two most common ways of handling this are to turn away requests when the server is too busy, or handle each request slower.
Turning away requests is easy; just run a fixed number of instances. The OS may or may not queue up a few connection requests, but in general the users will simply fail to connect. A more graceful way of doing it is to have the service return an error code indicating the client should try again later.
Handling requests more slowly is a bit more work, because it requires separating the servlet that handles the requests from the class that does the work in a different thread. You can have a larger number of servlets than worker bees. When a request comes in, the servlet accepts it, waits for a worker bee, grabs it and uses it, frees it, then returns the results.
The two can communicate through one of the classes in java.util.concurrent, like LinkedBlockingQueue or ThreadPoolExecutor. If you want to get really fancy, you can use something like a PriorityBlockingQueue to serve some customers before others.
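A rough sketch of that hand-off, assuming a bounded queue in front of a small fixed pool (sizes are illustrative):

import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerBees {

    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            10, 10,                                  // 10 worker bees
            0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(100),          // at most 100 requests waiting
            new ThreadPoolExecutor.AbortPolicy());   // beyond that, reject immediately

    // Called from the servlet; blocks the request thread until a worker has finished.
    public String handle(String request) throws InterruptedException, ExecutionException {
        Future<String> result = pool.submit(() -> doExpensiveWork(request));
        return result.get();
    }

    private String doExpensiveWork(String request) {
        // ... the work the servlet used to do inline ...
        return "done: " + request;
    }
}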
Me, I would throw more hardware at it like Anon said ;)
Some solid answers here. I think more hardware is the way to go. Having too many clients or traffic is usually a good problem to have.
However, if you absolutely must throttle clients, there are some options.
The most scalable solutions that I've seen revolve around a distributed caching system, like Memcached, and using integers to keep counts.
Figure out a rate at which your system can handle traffic. Either overall, or per client. Then put a count into memcached that represents that rate. Each time you get a request, decrement the value. Periodically increment the counter to allow more traffic through.
For example, if you can handle 10 requests/second, put a count of 50 in every 5 seconds, up to a maximum of 50. That way you aren't refilling it all the time, but you can also handle a bit of bursting limited to a window. You will need to experiment to find a good refresh rate. The key for this counter can either be a global key, or based on user id if you need to restrict that way.
The nice thing about this system is that it works across an entire cluster AND the mechanism that refills the counters need not be in one of your current servers. You can dedicate a separate process for it. The loaded servers only need to check it and decrement it.
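To illustrate the mechanism in-process (a single-server sketch; with Memcached the counter and the refill job simply move out of process so the whole cluster shares one budget):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RefillCounterLimiter {

    private static final int MAX = 50;               // burst budget
    private final AtomicInteger remaining = new AtomicInteger(MAX);
    private final ScheduledExecutorService refiller = Executors.newSingleThreadScheduledExecutor();

    public RefillCounterLimiter() {
        // Top the budget back up to MAX every 5 seconds (~10 requests/second sustained).
        refiller.scheduleAtFixedRate(() -> remaining.set(MAX), 5, 5, TimeUnit.SECONDS);
    }

    // Returns true if this request may proceed, false if it should be turned away.
    public boolean tryAcquire() {
        while (true) {
            int current = remaining.get();
            if (current <= 0) return false;
            if (remaining.compareAndSet(current, current - 1)) return true;
        }
    }
}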
All that being said, I'd investigate other options first. Throttling your customers is usually a good way to annoy them. Most probably NOT the best idea. :)
I'm assuming you're not in a position to increase capacity (either via hardware or software), and you really just need to limit the externally-imposed load on your server.
Dealing with this from within your application should be avoided unless you have very special needs that are not met by the existing solutions out there, which operate at HTTP server level. A lot of thought has gone into this problem, so it's worth looking at existing solutions rather than implementing one yourself.
If you're using Tomcat, you can configure the maximum number of simultaneous requests allowed via the maxThreads and acceptCount settings. Read the introduction at http://tomcat.apache.org/tomcat-6.0-doc/config/http.html for more info on these.
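For example (illustrative values, not recommendations), the relevant fragment of server.xml looks like this; maxThreads caps how many requests are processed at once, and acceptCount is the queue of connections waiting when all threads are busy:

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="200"
           acceptCount="100" />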
For more advanced controls (like per-user restrictions), if you're proxying through Apache, you can use a variety of modules to help deal with the situation. A few modules to google for are limitipconn, mod_bw, and mod_cband. These are quite a bit harder to set up and understand than the basic controls that are probably offered by your appserver, so you may just want to stick with those.
I want to implement a chat website on App Engine, but I found that App Engine will not allow me to use server push (as it will kill the response after 30 seconds).
So what's the other method that can be used? Will polling cause a bad user experience? That is, will the user have to wait for some time to retrieve new messages from the server? What would be the ideal polling interval?
If I use very small polling intervals, will my bandwidth get exhausted? Will I suffer performance problems?
This is quite an old question now, but I was looking for a similar answer. I think the Channel API (http://code.google.com/appengine/docs/java/channel/) is much better suited to the task. From what I understand, XMPP is good for interacting with the app, but not with other users. The Channel API implements push notifications via HttpRequest. I just found an example of a chat room here: https://bitbucket.org/keakon/channelchat
Can't you just use XMPP instead of a website? It would be a much better approach. Polling certainly isn't going to scale very well and will definitely not give a good user experience.
XMPP with appengine
I've heard of people working around that by holding the connection (i.e. sending no response) until it dies then reestablishing it. 30 seconds is not that much though.
If done this way it would still feel more responsive to the user than polling every 30 secs.
About the bandwidth usage: depending on the payload, "typical" HTTP requests can range from a few hundred bytes to some kilobytes, especially with cookies.
With an average size of, let's say, 5 kB (pessimistic) every 30 seconds, that would sum up to around 14 MB per 24 hours. Maybe you can cut down the size by setting a path on your cookies so they don't get sent for these connections. Maybe you don't need to send the whole payload again every 30 seconds.
Yeah, the Channel API is the best solution; with GWT it's even better.
http://www.dev-articles.com/article/Google-App-Engine-sending-messages-with-XMPP-393002