Load Balancer working for popular web sites - java

I was asked this in an interview and was really confused about how to answer the question below.
Say a site gets 500 requests per second, and the mechanism used to handle that large number of hits is a load balancer.
In that case, every request first goes to the load balancer, and it is then the load balancer's responsibility to pass the request on to the actual server that produces the response.
So the load balancer is the entry point for all requests, yet it too has some limit on the number of requests it can handle?
How many requests can a load balancer accept, and if it only accepts up to some limit, how does the system work beyond that point?
I am sorry if my question is meaningless; please explain.
Thanks.

A load balancer is usually running a lightweight, event-based HTTP server (NginX, for example) that maintains a connection to a backend server for each incoming request.
A popular setup is:
NginX
 ├── Apache
 ├── Apache
 └── ...
Since NginX can handle far more concurrent connections and has a fairly predictable memory usage pattern, it is commonly used as a frontend for several Apache backend servers (running PHP, Python, Ruby, etc.).
Replace Apache with Tomcat/Jetty/your favorite servlet container and the same logic applies.
The main point is that your backend servers are usually doing something much more time consuming (running a scripting language, querying a DB, etc.), so it is more than likely that each backend server is bottlenecked by its application logic rather than by its HTTP server component.
There is no one-size-fits-all solution to these kinds of problems. Another option (amongst many), once you outgrow the capacity of a single load balancer machine, is to use DNS round-robin between several load balancers, each supporting several backend servers.
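For illustration, here is a minimal sketch of the NginX-in-front-of-several-backends setup described above; the addresses and ports are invented for the example, not taken from the answer:

    # nginx.conf (sketch): NginX terminates the client connections and
    # proxies each request to one of several backend servers.
    events {}
    http {
        upstream backend {
            server 10.0.0.11:8080;   # Apache / Tomcat / Jetty instance 1
            server 10.0.0.12:8080;   # Apache / Tomcat / Jetty instance 2
        }
        server {
            listen 80;
            location / {
                proxy_pass http://backend;
            }
        }
    }

By default NginX round-robins requests across the servers listed in the upstream block, so no single backend has to absorb all of the traffic.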

I would say that if the number of connections rises to the limit of the load balancer entry point, you can add DNS-level balancing (provided name resolution is used in the requests). Each request is then directed (e.g. round-robined) to a different entry point by DNS resolution, which is responsible for rotating among the different addresses registered for the same name and thereby sending requests to different load balancers, or directly to different servers (the latter implies a different load-balancing technique).
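To see the DNS side of this from Java, a small sketch (the hostname is just an example): a round-robin name resolves to several addresses, and successive lookups or different clients can end up at different entry points.

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    // Sketch: list all the addresses a (hypothetical) round-robin DNS name resolves to.
    public class DnsRoundRobinDemo {
        public static void main(String[] args) throws UnknownHostException {
            InetAddress[] entryPoints = InetAddress.getAllByName("www.example.com");
            for (InetAddress addr : entryPoints) {
                System.out.println("possible entry point: " + addr.getHostAddress());
            }
        }
    }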
But 500 requests per second shouldn't be a problem for a server responsible for balancing.
HAProxy can handle thousands of requests per second without problems, directing different sessions to different servers or keeping active sessions distributed across different servers.
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware.

Related

Spring Boot Microservice With Configurable Communication Method

Consider this scenario: I have N>2 software components (microservices) that can communicate through two different communication protocols depending on how they are deployed. In other words, I have two deployment scenarios:
The components are deployed on the same machine. In this case I am not sure it makes sense to use HTTP between them, if I think about performance. I understand that there are more efficient ways for two processes on the same machine to communicate in Java, such as sockets, RMI, RPC ...
The components are deployed on N different machines. In this case it seems to make sense to use HTTP between them.
In short, what I want is to be able to configure the communication protocol depending on how I perform the deployment: on a single machine, for example, use RMI, but when I deploy on two machines, use HTTP.
Does anyone know how I can do this using Spring Boot?
Many Thanks!
The fundamental building block of protocols like RMI or HTTP is socket communication. If you are not looking for the comfort of HTTP or RMI, and the priority is performance, pure socket communication is your choice.
This raises other concerns, such as deployment difficulties: you have to know the IP addresses of both nodes in advance.
Another option is to go for a Unix domain socket for within-server communication. For that you can depend on JUnixSocket.
If you want to go another route, check all the inter-process communication options.
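As a rough sketch of what pure socket communication between two co-located components looks like (the port number and message format here are made up; a Unix domain socket via JUnixSocket would follow the same request/response pattern):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Sketch: raw TCP socket communication between two components on one machine.
    public class SocketDemo {
        public static void main(String[] args) throws Exception {
            // "Receiver" component: bind first so the sender cannot connect too early.
            ServerSocket listener = new ServerSocket(9090);
            Thread receiver = new Thread(() -> {
                try (Socket conn = listener.accept();
                     BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
                     PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());   // answer the single request
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            receiver.start();

            // "Sender" component: connect, send one request, print the reply.
            try (Socket socket = new Socket("localhost", 9090);
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
                out.println("hello");
                System.out.println(in.readLine());
            }

            receiver.join();
            listener.close();
        }
    }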
EDIT
As you said in a comment, "It is simply no longer a question of two components but of many". In that scenario, each component should be a micro-service and should be capable of interacting with the others. If that is the choice, the most scalable protocols are REST and RPC, both of which use HTTP under the hood. REST is the ideal solution for an API developed against a data source using CRUD operations; RPC leans more towards action-oriented APIs. You can find more details on the difference between REST and RPC here.
How I understand this is...
if the components (producer and consumer) are deployed on the same host then use an optimized protocol, and if on different hosts then use HTTP(S).
Firstly, there must be a serious driver to go down this route. I take it the driver here is performance: you would like to offer faster performance on a local deployment and comparatively compromised speeds on distributed deployments. By the way, given that we are in a distributed deployment world (or at least that is where we are headed), HTTP is what will survive; custom protocols are discouraged.
Anyway, I would say your producer application should be in a self-healing/discovery mode. On start-up (or periodically) it could check the health of the "optimized" end-point and decide whether the optimized receiver is around. The receiver would need to stand behind a load balancer. If the receiver is not up, fall back to HTTP(S) and set up this instance accordingly at runtime.
For the consumer, it would need to keep both gates (HTTP and optimized) open. It should be ready to handle requests from either channel.
In Spring Boot you can have a health check implemented and switch the emitter on/off depending on the health of the optimized end-point. If both end-points are unhealthy then the producer obviously cannot emit anything. Apart from this, the rest is just normal dependency injection.
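A minimal sketch of how the dependency-injection switch could look in Spring Boot, assuming a hypothetical app.transport property and ComponentClient interface (none of these names come from the question); whichever bean matches the property at startup is the one injected wherever a ComponentClient is required:

    import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.web.client.RestTemplate;

    // Sketch: the transport is chosen at deployment time via a property,
    // e.g. app.transport=http or app.transport=local in application.properties.
    interface ComponentClient {
        String send(String payload);
    }

    @Configuration
    public class TransportConfig {

        @Bean
        @ConditionalOnProperty(name = "app.transport", havingValue = "http", matchIfMissing = true)
        public ComponentClient httpClient() {
            // Distributed deployment: call the other component over HTTP.
            return payload -> new RestTemplate()
                    .postForObject("http://other-component/api", payload, String.class);
        }

        @Bean
        @ConditionalOnProperty(name = "app.transport", havingValue = "local")
        public ComponentClient localClient() {
            // Same-host deployment: swap in RMI, a domain socket, or an in-process call.
            return payload -> "handled locally: " + payload;
        }
    }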

Design considerations for J2EE webapp on Tomcat in Amazon WebServices

My project is looking to deploy a new J2EE application to Amazon's cloud. Elastic Beanstalk supports Tomcat apps, which seems perfect. Are there any particular design considerations to keep in mind when writing said app that might differ from just a standalone Tomcat on a server?
For example, I understand that the server is meant to scale automatically. Is this like a cluster? Our application framework tends to like to stick state in the HttpSession, is that a problem? Or when it says it scales automatically, does that just mean memory and CPU?
Automatic scaling on AWS is done via adding more servers, not adding more CPU/RAM. You can add more CPU/RAM manually, but it requires shutting down the server for a minute to make the change, and then configuring any software running on the server to take advantage of the added RAM, so that's not the way automatic scaling is done.
Elastic Beanstalk is basically a management interface for Amazon EC2 servers, Elastic Load Balancers and Auto Scaling Groups. It sets all that up for you and provides a convenient way of deploying new versions of your application easily. Elastic Beanstalk will create EC2 servers behind an Elastic Load Balancer and use an Auto Scaling configuration to add more servers as your application load increases. It handles adding the servers to the load balancer when they are ready to receive traffic, and removing them from the load balancer and deleting the extra servers when they are no longer needed.
For your Java application running on Tomcat you have a few options to handle horizontal scaling well. You can enable sticky sessions on the Load Balancer so that all requests from a specific user will go to the same server, thus keeping the HttpSession tied to the user. The main problem with this is that if a server is removed from the pool you may lose some HttpSessions and cause any users that were "stuck" to that server to be logged out of your application. The solution to this is to configure your Tomcat instances to store sessions in a shared location. There are Tomcat session store implementations out there that work with AWS services like ElastiCache (Redis) and DynamoDB. I would recommend using one of those, probably the Redis implementation if you aren't already familiar with DynamoDB.
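If the application can use Spring, one concrete way to get such a shared session store is Spring Session backed by Redis/ElastiCache. A minimal sketch (the endpoint is hypothetical, and this is only one of the session-store options mentioned above):

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
    import org.springframework.session.data.redis.config.annotation.web.http.EnableRedisHttpSession;

    // Sketch: store HttpSession data in a shared Redis (ElastiCache) node so that
    // any Tomcat instance behind the load balancer can serve any user's session.
    @Configuration
    @EnableRedisHttpSession
    public class SessionConfig {

        @Bean
        public LettuceConnectionFactory redisConnectionFactory() {
            // Hypothetical ElastiCache endpoint; replace with your cluster address.
            return new LettuceConnectionFactory("my-sessions.example.cache.amazonaws.com", 6379);
        }
    }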
Another consideration for moving a Java application to AWS is that you cannot use any tools or libraries that rely on multi-cast. You may not be using multi-cast for anything, but in my experience every Java app I've had to migrate to AWS relied on multi-cast for clustering and I had to modify it to use a different clustering method.
Also, for a successful migration to AWS I suggest you read up a bit on VPCs, private IP versus public IP, and Security Groups. A solid understanding of those topics is key to setting up your network so that your web servers can communicate with your DB and cache servers in a secure and performant manner.

Correct Spring Cloud+Eureka+Zuul architecture

I have a Spring-Cloud based application.
I have two gateway servers behind an HTTP Load balancer. The gateway servers redirect calls to 3 types of backend servers (let's call them UI1, UI2, REST) by querying a Eureka server for an endpoint.
However, if I am understanding the Spring documentation correctly, once a connection has been made between the client and the endpoint, Eureka is no longer needed until a disaster occurs. Client-side load balancing means that the client now has knowledge of an endpoint, and as long as it is working, it will not fetch a new endpoint from Eureka.
This is good in general, but in my setup, the client is actually the Gateway server, not the client browser. The client browser connects to the HTTP load balancer. Everything else is pretty much managed by the gateway.
So, it appears that if I have 2 Gateways and 6 backend servers of each type - I'm not getting any benefits of scalability. Each gateway will take ownership of the first two backend servers per server type, and that's it. The other 4 servers will just hang there, waiting for some timeout or crash to occur on the first 2 servers, so they could serve the next requests.
This doesn't sound like the correct architecture to me. I would have liked Eureka/Client side load balancing to have the ability to do some sort of round-robin or other approach, that would distribute my calls evenly between all servers as much as possible.
Am I wrong?

What is the best approach to build a system with high amount of data communication?

Hello
I have a cache server (written in Java with the Lucene framework) which keeps a large amount of data and serves it according to the request query.
It basically works like this:
On startup, it connects to the DB and loads all tables into RAM.
It listens for requests and returns the matching data as array lists (about 1,000 - 20,000 rows).
When a user visits the web page, the web app connects to the cache server, sends a request, and shows the server's response.
I planned to run the web and cache applications in different instances because of memory issues. The cache server runs as a service and the web app is on Tomcat.
What is your suggestion about how the communication should be built between the web side and the cache server?
I need to pass a large amount of data as array lists from one instance to another. Should I think about web services (XML communication), NIO socket communication (maybe Apache MINA), or solutions like CORBA?
Thanks.
It really depends very much on considerations you have not specified.
What are the clients? For example, if your clients are JavaScript running AJAX, obviously something over HTTP is more useful than a proprietary UDP solution.
What network is it working on? Local networks behave differently from the internet, and mobile internet is quite different from both.
How much use can you make of caching? If you use HTTP you can have rather good control (through HTTP headers) of both the client cache and network caches, and there is a plethora of existing software that can make use of both.
There are many other considerations to take into account, and there are many existing implementations of systems matching the more common needs. From the (not very detailed) description you gave, I would recommend having a look at Redis.
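For example, a minimal Redis round trip from Java using the Jedis client; the key name and host are made up, and the point is simply that batches of cached rows can be pushed and read back over a well-supported protocol instead of a custom one:

    import java.util.List;
    import redis.clients.jedis.Jedis;

    // Sketch: push a batch of cached rows into Redis and read them back.
    public class RedisCacheDemo {
        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                jedis.rpush("table:users", "row1", "row2", "row3");     // cache rows as a list
                List<String> rows = jedis.lrange("table:users", 0, -1); // fetch them all back
                System.out.println(rows);
            }
        }
    }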

How to route subdomains to one or more appropriate nodes within a cluster?

I am trying to solve a distributed computing architecture problem. Here is the scenario.
Users come to my website and register. As part of the registration process they get a subdomain, for example foo.xyz.com.
Each user's website is located/replicated on one or more cluster nodes using some arbitrary scheme.
When a user request comes in (an HTTP request via the browser), the appropriate subdomain must be routed to the matching cluster node. Essentially, I want my own dynamic domain name. I need to implement this in a fast and efficient way.
I have a Java-based web application which runs inside a Jetty 7 container.
thanks,
NG
This should definitely be implemented outside of your application. Your web application should be, as much as possible, agnostic of the way requests get balanced across the cluster. The best performance would come from a hardware load balancer (this one, for example).
If you want to go for software-based balancing, I would configure Apache to serve as the entry point and balance the traffic for your cluster with something like mod_proxy. See this tutorial that refers to Jetty.
Have you taken a look at Nginx? Nginx may be more than you need, but it does an effective job of routing subdomains to particular nodes.
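A rough sketch of what that routing could look like in Nginx, with hostnames and node addresses invented for the example; each subdomain's server block proxies to the cluster node (or nodes) holding that user's site:

    # nginx.conf (sketch): route each subdomain to the node(s) hosting it.
    events {}
    http {
        upstream foo_nodes {
            server 10.0.1.21:8080;   # Jetty node holding foo.xyz.com
            server 10.0.1.22:8080;   # replica
        }
        server {
            listen 80;
            server_name foo.xyz.com;
            location / {
                proxy_pass http://foo_nodes;
            }
        }
        # ...one upstream/server pair per subdomain, or a wildcard server_name
        # plus a lookup step if the mapping has to be fully dynamic.
    }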
