We have a RESTful service deployed on multiple nodes, and we want to limit the number of calls coming to the service from each client, with a different per-minute quota for each client.
Our stack: JBoss application server, Java/Spring RESTful service.
What could be a possible technique to implement this?
Some time ago I read a good article that covered the same topic.
The idea is to move this logic into the load-balancing proxy, and here are some good reasons to do it:
Eliminates technical debt - If you’ve got rate-limiting logic coupled in with app logic, you’ve got technical debt you don’t need. You can lift and shift that debt out to the proxy.
Efficiency gains - You’re offloading logic upstream, which means all your compute resources are dedicated to compute, and you can better predict the capacity your app tier actually needs.
Security - It’s well understood that application layer (request-response) attacks are on the rise, including denial of service. By leveraging an upstream proxy with greater capacity for connections you can stop those attacks in their tracks, because they never get anywhere near the actual server.
If the only way to access your API is through a UI client that you manage, then you can add a check in the client code (JavaScript in the case of a web app) to make a call only when that user has not crossed the limit. Otherwise there is no way around it: a user can always hit your API directly, and the only thing you can do at the server level is decide whether to send an error or a valid result as the API response.
To limit the rate, you need to keep state, at least keyed by some client identifier. That may require maintaining a central counter, e.g. in a database (Cassandra), which lets you look up the current request count per minute; then, within a Java servlet filter, you can reject requests as necessary.
Alternatively, if you can track the client's session, you can use sticky sessions, forcing each client onto a specific node for the duration of its session. Then you can simply count, within a Java filter, the number of requests per client and send a 503 (or something more relevant, such as a 429). A rough sketch of such a filter follows.
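Below is a minimal sketch of that filter idea, assuming the client is identified by a (hypothetical) X-Client-Id header and the counters live in local memory, which only works on a single node or with sticky sessions; without stickiness, the map would be replaced by a shared store such as Cassandra or Redis.

```java
import java.io.IOException;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RateLimitFilter implements Filter {

    // clientId -> request count for the current minute (node-local; swap for a
    // shared store such as Cassandra/Redis if you don't use sticky sessions)
    private final ConcurrentMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();
    private volatile long currentMinute = Instant.now().getEpochSecond() / 60;

    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) res;

        long minute = Instant.now().getEpochSecond() / 60;
        if (minute != currentMinute) {          // a new minute has started: reset the counters
            currentMinute = minute;
            counters.clear();
        }

        String clientId = request.getHeader("X-Client-Id");   // hypothetical client identifier
        if (clientId == null) {
            clientId = request.getRemoteAddr();                // fall back to the caller's IP
        }
        int quota = quotaFor(clientId);                        // per-client quota from config/DB

        int count = counters.computeIfAbsent(clientId, k -> new AtomicInteger()).incrementAndGet();
        if (count > quota) {
            response.sendError(429, "Rate limit exceeded");    // or 503 if you prefer
            return;
        }
        chain.doFilter(req, res);
    }

    private int quotaFor(String clientId) {
        return 100; // placeholder: look up this client's per-minute quota here
    }

    @Override public void init(FilterConfig cfg) {}
    @Override public void destroy() {}
}
```

The minute rollover here is deliberately naive; a real implementation would use a sliding window and do the check-and-increment atomically against the shared store.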
This question might sound a bit abstract, already answered (though my search didn't stumble on a convenient answer), or not specific enough, but I will try to provide as much information as I can.
I am building a mobile application that will gather sensor data and send it to a remote server. The remote server will collect all of this data in a MySQL database, and a separate process/program (not the MySQL database itself) will run computations on it. What I want to know is:
After some updates in the database, is it doable to send a response from a RESTful server to a particular client (probably the one who made the last update), using something like "a background thread"? Or should this be done via a socket connection between server and client?
Some remarks:
I am using Java EE, Spring MVC with Hibernate, and Tomcat (because I am familiar with that environment, though here it would be used in a more asynchronous manner).
I thought this would be a convenient approach because the SQL schema is not very complicated, and security and authentication are not needed (it's a prototype).
Also there is a front-end webpage that will have to visualize these data, so such a back-end system would look like a good option for getting the job done fast.
Lastly I saw this solution :
Is there a way to 'listen' for a database event and update a page in real time?
My issue is that, besides updating the page, I want to update the client side with messages from the RESTful server.
If all of the above is unnecessary and a simpler client-server application would prove better and less complex, please feel welcome to tell me.
Thank you in advance.
Generally you should upload your data to a resource on the server (e.g. POST /widgets), and the server should immediately return a 201 Created or, if creation is too slow and needs to happen later, a 202 Accepted status. There are several approaches after that happens, each with its merits:
Polling - The server's response includes a Location field which the client can then proceed to poll until a change happens (e.g. check for an update every second). This is the easiest approach and quite efficient if you use HTTP caching effectively and the average number of checks is relatively low (see the sketch after this list).
Push notification - The server sends a push notification when the change happens, the report is generated, etc. Obviously this requires you to store the client's details and their notification requirements. This is probably the cleanest approach and also easy to scale. In the case of Android (and also iOS) you have free push notifications available via Google Cloud Messaging.
Set up a persistent connection between client and server, e.g. using a Websocket or low-level TCP connection. This should yield the fastest response times, but will probably be a drain on phone battery, harder to scale on the server, and more complex to code.
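To make the polling option concrete, here is a rough Spring MVC sketch of the "accept now, poll later" pattern; MeasurementService, MeasurementDto and StatusDto are made-up names standing in for your own upload and processing code.

```java
import java.net.URI;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/measurements")
public class MeasurementController {

    private final MeasurementService service; // hypothetical service that stores and processes uploads

    public MeasurementController(MeasurementService service) {
        this.service = service;
    }

    // The client uploads sensor data; processing happens later, so we return
    // 202 Accepted plus a Location header the client can poll.
    @PostMapping
    public ResponseEntity<Void> upload(@RequestBody MeasurementDto dto) {
        long id = service.enqueue(dto);   // persist and schedule the computation
        return ResponseEntity.accepted()
                .location(URI.create("/measurements/" + id + "/status"))
                .build();
    }

    // The client polls this resource until the status says the computation is done.
    @GetMapping("/{id}/status")
    public ResponseEntity<StatusDto> status(@PathVariable long id) {
        return ResponseEntity.ok(service.statusOf(id));
    }
}
```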
I am developing Restful services where we will be inserting/updating new records into database.
Since REST uses HTTP for its communication and HTTP is not reliable, I am worried that the request may not reach the server in case of a connection failure.
One of the suggestions I found in the link was to simply retry from the client side if the connection fails. But we don't have any control over the client applications.
Other solution was to implement messaging systems like RabbitMQ/JMS to ensure reliability.
I also found, in the following link, the claim that adding session state improves reliability. I am not able to understand how that happens, and more importantly, isn't a good RESTful service always stateless?
So to summarize my questions:
To achieve reliability, is Messaging systems best possible approach?
How does session management help me in achieving reliability?
Messaging can help, as long as you don't do any processing when you receive a command to insert or update information: you immediately put the command in a queue. This solution usually adds quite a bit of complexity, as you need to notify your client asynchronously when you finish processing the command (was it successful or did it fail?... or did I fail to send the outcome?).
Session management? For reliability? Never heard of that :). Restful services are usually stateless... so no sessions here!
Another option (depending on how your clients integrate with you) is to allow your clients to generate the IDs of the items you are going to store/update. In that case, if they get an error back but you actually processed the command successfully, the client can retry and the same update will happen. You can pair this with versioning to prevent stale updates arriving late; a sketch follows.
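As an illustration, here is a hedged Spring MVC sketch of such an idempotent, client-keyed update with an optional version check; OrderRepository, OrderDto and the use of an If-Match header are assumptions for the example, not something taken from the question.

```java
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/orders")
public class OrderController {

    private final OrderRepository repository; // hypothetical persistence layer

    public OrderController(OrderRepository repository) {
        this.repository = repository;
    }

    // The client generates the id (e.g. a UUID). Resending the same PUT after a
    // network failure overwrites the record with identical data, so retries are safe.
    @PutMapping("/{id}")
    public ResponseEntity<Void> upsert(@PathVariable String id,
                                       @RequestHeader(value = "If-Match", required = false) String expectedVersion,
                                       @RequestBody OrderDto dto) {
        Order current = repository.find(id);
        if (current != null && expectedVersion != null
                && !expectedVersion.equals(current.getVersion())) {
            // a stale update arrived late; reject it instead of clobbering newer data
            return ResponseEntity.status(HttpStatus.PRECONDITION_FAILED).build();
        }
        boolean created = repository.save(id, dto);
        return created ? ResponseEntity.status(HttpStatus.CREATED).build()
                       : ResponseEntity.noContent().build();
    }
}
```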
I am studying Java for the web, and it mentions that HTTP is stateless.
What does that mean, and how does it affect programming?
I was also studying the Spring framework, and it mentions that some beans have to be declared as inner beans because their state changes. What does that mean?
HTTP -- that is the actual transport protocol between the server and the client -- is "stateless" because it remembers nothing between invocations. EVERY resource that is accessed via HTTP is a single request with no threaded connection between them. If you load a web page with an HTML file that within it contains three <img> tags hitting the same server, there will be four TCP connections negotiated and opened, four data transfers, four connections closed. There is simply no state kept at the server at the protocol level that will have the server know anything about you as you come in.
(Well, that's true for HTTP up to 1.0 at any rate. HTTP 1.1 adds persistent connection mechanisms of various sorts because of the inevitable performance problems that a truly stateless protocol engenders. We'll overlook this for the moment because they don't really make HTTP stateful, they just make it dirty-stateless instead of pure-stateless.)
To help you understand the difference, imagine that a protocol like Telnet or SSH were stateless. If you wanted to get a directory listing of a remote file, you would have to, as one atomic operation, connect, sign in, change to the directory and issue the ls command. When the ls command finished displaying the directory contents, the connection would close. If you then wanted to display the contents of a specific file you would have to again connect, sign in, change to the directory and now issue the cat command. When the command displaying the file finished, the connection would again close.
When you look at it that way, through the lens of Telnet/SSH, that sounds pretty stupid, doesn't it? Well, in some ways it is and in some ways it isn't. When a protocol is stateless, the server can do some pretty good optimizations and the data can be spread around easily. Servers using stateless protocols can scale very effectively, so while the actual individual data transfers can be very slow (opening and closing TCP connections is NOT cheap!) an overall system can be very, very efficient and can scale to any number of users.
But...
Almost anything you want to do other than viewing static web pages will involve sessions and states. When HTTP is used for its original purpose (sharing static information like scientific papers) the stateless protocol makes a lot of sense. When you start using it for things like web applications, online stores, etc. then statelessness starts to be a bother because these are inherently stateful activities. As a result people very rapidly came up with ways to slather state on top of the stateless protocol. These mechanisms have included things like cookies, like encoding state in the URLs and having the server dynamically fire up data based on those, like hidden state requests, like ... well, like a whole bunch of things up to and including the more modern things like Web Sockets.
Here are a few links you can follow to get a deeper understanding of the concepts:
http://en.wikipedia.org/wiki/Stateless_server
http://en.wikipedia.org/wiki/HTTP
http://en.wikipedia.org/wiki/HTTP_persistent_connection
HTTP is stateless - this means that when using HTTP the end point does not "remember" things (such as who you are). It has no state. This is in contrast to a desktop application - if you have a form and you go to a different form, then go back, the state has been retained (so long as you haven't shut down the application).
Normally, in order to maintain state in a web application, one uses cookies.
A stateless protocol does not require the server to retain information or status about each user for the duration of multiple requests. For example, when a web server is required to customize the content of a web page for a user, the web application may have to track the user's progress from page to page.
A common solution is the use of HTTP cookies. Other methods include server side sessions, hidden variables (when the current page is a form), and URL-rewriting using URI-encoded parameters, e.g., /index.php?session_id=some_unique_session_code.
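As a concrete example, a plain servlet can lean on the container's cookie-backed session to keep per-user state on top of stateless HTTP; this is only a minimal sketch.

```java
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class VisitCounterServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // getSession(true) creates a server-side session and sends a JSESSIONID cookie
        // to the client; the browser returns that cookie on every later request, which
        // is how otherwise independent HTTP requests get tied together.
        HttpSession session = req.getSession(true);

        Integer visits = (Integer) session.getAttribute("visits");
        visits = (visits == null) ? 1 : visits + 1;
        session.setAttribute("visits", visits);

        resp.getWriter().println("You have loaded this page " + visits + " time(s) in this session.");
    }
}
```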
From here:
HTTP is called a stateless protocol because each command is executed independently, without any knowledge of the commands that came before it.
This shortcoming of HTTP is being addressed in a number of new technologies, including cookies.
When it's said that something is stateless it usually means that you can't assume that the server tracks any state between interactions.
By default the HTTP protocol assumes a truly stateless server. Every request is treated as an independent request.
In practice this is worked around by most servers using a tracking cookie in the request to match some state on the server with a specific client. It works because of the way cookies operate: they are sent to the server on each subsequent request once they have been set on the client.
Basically, a server that isn't stateless is an impediment to scaling. You need to either make sure that you route all the requests from a specific browser to the same instance, or do backend replication of the state. This is usually a limiting factor when trying to scale an application.
There are some other solutions for keeping track of state (see Rails's encrypted state cookie), but basically, if you want to grow, you need to figure out a way to avoid tracking state on the server :).
I'm looking for opinions from you all. I have a web application that needs to record data into another web application's database. I'd prefer not to use HTTP GET requests against the 2nd application because of latency issues; I'm looking for a fast way to save records in the 2nd application, and I came across the idea of "fire and forget". Would JMS suit this scenario? From my understanding JMS guarantees message delivery, but 100% guaranteed delivery is not important here, as long as it can serve as many requests as possible. Say I need to make at least 1000 requests per second to the 2nd application: should I use JMS, HTTP requests, or XMPP instead?
I think you're misunderstanding networking in general. There's positively no reason that an HTTP GET would have to be any slower than anything else, and if HTTP takes advantage of keep-alives it's faster than most options.
JMS isn't a protocol; it's a specification that wraps many other protocols including, possibly, HTTP or XMPP.
In the end, at the levels where Java operates, there's either UDP or TCP. TCP has more overhead but guarantees delivery (via retransmission) and ordering. UDP offers neither guaranteed delivery nor in-order delivery. If you can deal with UDP's limitations you'll find it "faster", and if you can't, then any lightweight TCP wrapper (of which HTTP is one) is just about the same.
Your requirements seem to be:
one client and one server (inferred from your first sentence),
HTTP is mandatory (inferred from your talking about a web application database),
1000 or more record updates per second, and
individual updates do not need to be acknowledged synchronously (you are willing to use a "fire and forget" approach).
The way I would approach this is to have the application threads queue the updates internally, and to have a dedicated client thread that periodically assembles the queued updates into one HTTP request and sends it to the server. If necessary, the server can send a response that indicates the status of the individual updates.
Batching eliminates the impact of latency on the client, and potentially allows the server to process the updates more efficiently. A rough sketch of such a batching client follows.
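Here is that sketch, assuming Java 11+ for the built-in HttpClient; the endpoint URL, the flush interval and the JSON-array payload format are all illustrative choices.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BatchingUploader {

    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>(); // queued updates as JSON snippets
    private final HttpClient http = HttpClient.newHttpClient();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final URI endpoint = URI.create("https://second-app.example.com/updates"); // hypothetical URL

    public void start() {
        // every 100 ms, drain whatever has accumulated and send it as one request
        scheduler.scheduleAtFixedRate(this::flush, 100, 100, TimeUnit.MILLISECONDS);
    }

    /** Called by application threads; returns immediately ("fire and forget"). */
    public void submit(String updateJson) {
        pending.offer(updateJson);
    }

    private void flush() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (batch.isEmpty()) {
            return;
        }
        String body = "[" + String.join(",", batch) + "]";   // one JSON array per HTTP request
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        try {
            // the response could report per-update statuses if the server provides them
            http.send(request, HttpResponse.BodyHandlers.discarding());
        } catch (Exception e) {
            // sketch only: real code would log and retry or re-queue the failed batch
        }
    }

    public void stop() {
        scheduler.shutdown();
    }
}
```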
The big difference between HTTP and JMS or XMPP is that JMS and XMPP allow asynchronous fire and forget messaging (where the client does not really know when and if a message will reach its destination and does not expect a response or an acknowledgment from the receiver). This would allow the first app to respond fast regardless of the second application processing time.
Asynchronous messaging is usually preferred for high-volume distributed messaging where the message consumers are slower than the producers. I can't say if this is exactly your case here.
If you have full control and the two web applications run in the same web container, and hence in the same JVM, I would suggest using JNDI to give both web applications access to a common data structure (a list?) that allows concurrent modification, namely to allow application A to add new entries while application B consumes the oldest entries simultaneously.
This is most likely the fastest way possible.
Note that you should keep the objects you put in the list to classes found in the JRE, or you will most likely run into class cast exceptions (the two webapps have separate classloaders). These can be circumvented, but the easiest approach is most likely to just transfer strings via the common data structure.
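A sketch of what that could look like, with the big caveat that binding into a JNDI namespace visible to both webapps is container-specific (the name used here is hypothetical, and some containers expose only read-only namespaces to applications); the payload is kept to plain Strings for the classloader reason above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class SharedQueueAccess {

    // hypothetical name; which namespace is actually writable and shared
    // between webapps depends on the container
    private static final String JNDI_NAME = "global/sharedUpdates";

    /** Application A: create and bind the shared queue once at startup. */
    public static void bind() throws NamingException {
        Queue<String> queue = new ConcurrentLinkedQueue<>(); // JRE-only type, safe across webapp classloaders
        new InitialContext().bind(JNDI_NAME, queue);
    }

    /** Application A: producer side. */
    public static void publish(String message) throws NamingException {
        lookup().offer(message);
    }

    /** Application B: consumer side; returns null when the queue is empty. */
    public static String poll() throws NamingException {
        return lookup().poll();
    }

    @SuppressWarnings("unchecked")
    private static Queue<String> lookup() throws NamingException {
        return (Queue<String>) new InitialContext().lookup(JNDI_NAME);
    }
}
```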
My application makes Web Service requests; there is a max rate of requests the provider will handle, so I need to throttle them down.
When the app ran on a single server, I used to do it at the application level: an object that keeps track of how many requests have been made so far, and waits if the current request makes it exceeds the maximum allowed load.
Now, we're migrating from a single server to a cluster, so there are two copies of the application running.
I can't keep checking the max load in the application code, because the two nodes combined might exceed the allowed load.
I can't simply reduce the load on each server, because if the other node is idle, the first node can send out more requests.
This is a JavaEE 5 environment. What is the best way to throttle the requests the application sends out ?
Since you are already in a Java EE environment, you can create an MDB that handles all requests to the web service based on a JMS queue. The instances of the application simply post their requests to the queue, and the MDB will receive them and call the web service.
The queue can be configured with the appropriate number of sessions, which limits the concurrent access to your web service, so your throttling is handled via the queue configuration.
The results can be returned via another queue (or even a queue per application instance).
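A sketch of such an MDB follows; the queue name and the maxSession activation property are assumptions, and the exact property names vary between containers.

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination",     propertyValue = "queue/webServiceRequests"),
    // caps how many sessions consume concurrently, i.e. how many parallel calls
    // hit the external web service; the property name is container-dependent
    @ActivationConfigProperty(propertyName = "maxSession",      propertyValue = "5")
})
public class ThrottledWebServiceCaller implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            String payload = ((TextMessage) message).getText();
            callExternalWebService(payload);   // hypothetical web service client call
            // optionally publish the result to a reply queue here
        } catch (Exception e) {
            throw new RuntimeException(e);     // let the container handle redelivery
        }
    }

    private void callExternalWebService(String payload) {
        // sketch only: invoke the rate-limited web service
    }
}
```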
The N nodes need to communicate. There are various strategies:
broadcast: each node broadcasts to everybody else that it's making a call, and all other nodes take that into account. Nodes are equal and each maintains its own view of the global count (each node knows about every other node's calls).
master node: one node is special; it's the master, and all other nodes ask permission from the master before making a call. The master is the only one that knows the global count.
dedicated master: same as master, but the 'master' doesn't make calls itself; it is just a service that keeps track of the calls.
Depending on how far you anticipate scaling later, one or the other strategy may be best. For 2 nodes the simplest one is broadcast, but as the number of nodes increases the problems start to mount (you'll spend more time broadcasting and responding to broadcasts than actually doing WS requests).
How the nodes communicate is up to you. You can open a TCP pipe, you can broadcast over UDP, you can build a fully fledged WS for this purpose alone, you can use a file-share protocol. Whatever you do, you are now no longer inside a single process, so all the fallacies of distributed computing apply.
There are many ways of doing this: you might have a "Coordination Agent" which is responsible for handing "tokens" to the servers. Each "token" represents permission to perform a task, etc. Each application needs to request "tokens" in order to place calls.
Once an application depletes its tokens, it must ask for some more before proceeding to hit the Web Service again.
Of course, this all gets complicated when there are requirements regarding the timing of the calls each application makes, because of concurrency towards the Web Service.
You could rely on RabbitMQ as the messaging framework: Java bindings are available. A toy sketch of the token idea follows.
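This sketch shows a coordinator that refills a fixed number of permits every minute; in a real deployment the application nodes would reach it over the network (or via a queue, as suggested above) rather than in-process, and the class and method names are made up.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy coordination agent: hands out up to maxCallsPerMinute tokens per minute. */
public class TokenCoordinator {

    private final int maxCallsPerMinute;
    private final Semaphore tokens;
    private final ScheduledExecutorService refiller = Executors.newSingleThreadScheduledExecutor();

    public TokenCoordinator(int maxCallsPerMinute) {
        this.maxCallsPerMinute = maxCallsPerMinute;
        this.tokens = new Semaphore(maxCallsPerMinute);
        // once a minute, top the bucket back up to its maximum
        refiller.scheduleAtFixedRate(this::refill, 1, 1, TimeUnit.MINUTES);
    }

    /** Nodes call this before hitting the web service; false means "wait and ask again". */
    public boolean tryAcquire() {
        return tokens.tryAcquire();
    }

    private void refill() {
        int missing = maxCallsPerMinute - tokens.availablePermits();
        if (missing > 0) {
            tokens.release(missing);
        }
    }
}
```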
I recommend using beanstalkd to periodically pump a collection of requests (jobs) into a tube (queue), each with an appropriate delay. Any number of "worker" threads or processes will wait for the next request to be available, and if a worker finishes early it can pick up the next request. The down side is that there isn't any explicit load balancing between workers, but I have found that distribution of requests out of the queue has been well balanced.
This is an interesting problem, and the difficulty of the solution depends to a degree on how strict you want to be on the throttling.
My usual solution to this is JBossCache, partly because it comes packaged with JBoss AppServer, but also because it handles the task rather well. You can use it as a kind of distributed hashmap, recording the usage statistics at various degrees of granularity. Updates to it can be done asynchronously, so it doesn't slow things down.
JBossCache is usually used for heavy-duty distributed caching, but I rather like it for these lighter-weight jobs too. It's pure java, and requires no mucking about with the JVM (unlike Terracotta).
Hystrix was designed for pretty much the exact scenario you're describing. You can define a thread pool size for each service so you have a set maximum number of concurrent requests, and it queues up requests when the pool is full. You can also define a timeout for each service and when a service starts exceeding its timeout, Hystrix will reject further requests to that service for a short period of time in order to give the service a chance to get back on its feet. There's also real time monitoring of the entire cluster through Turbine.
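For illustration, a minimal HystrixCommand sketch; the group key, pool size and timeout are arbitrary, and the exact property setter names can differ between Hystrix versions.

```java
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import com.netflix.hystrix.HystrixCommandProperties;
import com.netflix.hystrix.HystrixThreadPoolProperties;

public class RemoteServiceCommand extends HystrixCommand<String> {

    private final String payload;

    public RemoteServiceCommand(String payload) {
        super(Setter
                .withGroupKey(HystrixCommandGroupKey.Factory.asKey("RemoteService"))
                // at most 10 concurrent calls; extra requests queue or are rejected
                .andThreadPoolPropertiesDefaults(HystrixThreadPoolProperties.Setter().withCoreSize(10))
                // calls slower than 2s count as failures and can trip the circuit breaker
                .andCommandPropertiesDefaults(HystrixCommandProperties.Setter()
                        .withExecutionTimeoutInMilliseconds(2000)));
        this.payload = payload;
    }

    @Override
    protected String run() throws Exception {
        return callRemoteService(payload);   // hypothetical web service client call
    }

    @Override
    protected String getFallback() {
        // returned when the pool is full, the call times out, or the circuit is open
        return "service unavailable";
    }

    private String callRemoteService(String payload) {
        return "ok"; // sketch only
    }
}
```

Callers would run new RemoteServiceCommand(payload).execute(); note that the thread pool limit is per JVM, so in a cluster each node gets its own pool.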