Scalability advice for a large game site - Java

I'm building a website where players can play a turn based game for virtual credits (like a Poker site, but different). The setup I came up with:
One data server which contains all player accounts with associated data (a database + service). Database and API may be split into two servers, if that helps.
One or more webservers which serve the website, connecting to the data server when needed.
One lobby server where players can find each other and set up games (multiple are possible, but less user-friendly)
Multiple game servers where the game is run (all rules and such are on the server, the client is just a remote control and viewer), with one load balancer.
A game client
The client will be made with Flash, the webserver will use PHP. The rest is all Java.
Communications
Player logs in on the site. Webserver sends username/password to data server, which creates a session key (like a cookie)
Player starts the client. Client connects to lobby server, passing the session key. Lobby server checks this key with the data server
Once a lobby is created and a game must start, the lobby server fetches a game server from the load balancer and sets up a game on this game server.
Lobby server tells the clients to connect to the game server and the game is played.
When the game is finished, the game server lets the lobby server know. The lobby server will check the score and update the credits in the data server.
Protocols:
Java to Java: RMI
PHP or Flash to Java: Custom binary protocol via socket. This protocol supports closing the socket when idle while keeping the virtual connection alive and resumable.
If the client gets his wish, the site will need to support thousands of concurrent players. With this information, can you see any bottlenecks in my setup? I'm personally a little worried about there being only one data server, but I'm not sure how to split it up. Other scalability remarks (or remarks of any other kind) are also welcome.

Your architecture has a lot of single services that are crucial for ANY part of the system to work for ANY user. I consider these single points of failure (SPOFs).
You might want to consider sharding (or horizontal partitioning) for your data server.
Consider multiple lobby servers. The Flash client can still disguise them as a single lobby if you want. Personally, I don't like playing games with people I cannot talk to in any language I understand. Also, I don't like joining a lobby server, finding n-thousand games, and not knowing anyone. Make multiple lobbies a feature (when you put thought into it, you really can). There's no real use for a lobby with 10,000 people. If you still want to go through with it, you could try partitioning based on the assumption that a player filters for specific parameters (opponent level, game type, etc.), splitting lobbies along one or even multiple criteria.
The load balancer probably doesn't need enough power to justify a physical server of its own. Why not replicate it on all lobby servers? All it has to know is the availability per game server. Assuming you have 10,000 game servers (which I think is a whole lot in this case) and a refresh rate of 1 second (far more than enough here), all you sync is 10,000 integers per second (let's assume you can represent availability as a single number, which I suppose you can). If you figure out something better than connecting every game server to every lobby server, this doesn't even require too many connections on a single machine.
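As a rough illustration of how little state that is, here is a minimal sketch (class and method names are made up) of a lobby-local balancer that just keeps the latest availability figure per game server and picks the least-loaded one:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: each lobby server keeps its own copy of the
// availability table instead of asking a dedicated balancer process.
public class LocalBalancer {
    // gameServerId -> free seats (or whatever capacity metric you use)
    private final Map<String, Integer> availability = new ConcurrentHashMap<>();

    // Called roughly once per second when a game server reports in.
    public void updateAvailability(String gameServerId, int freeSlots) {
        availability.put(gameServerId, freeSlots);
    }

    // Pick the game server with the most free slots known to this lobby.
    public String pickGameServer() {
        return availability.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no game servers available"));
    }
}
```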
In this type of application, I think horizontal partitioning is a good idea, because for one it can be done easily, and it adds reliability to the system. Assume your SPOFs are partitioned rather than redundant. This is easier and possibly cheaper. If part of a partitioned SPOF goes down (let's say 1 of your 20 independent and physically distributed data servers), this is bad, because 5% of your players are locked out. But it will probably come back up soon. If your SPOF is redundant, chances are lower that anything fails. But if it does, EVERYBODY is locked out. This is an issue, because you'll have everybody trying to get back online at the same time. Once your SPOF is back, it'll be hit by a volume of requests orders of magnitude higher than it usually has to handle. And you can still employ horizontal partitioning and redundancy at the same time, as proposed for the balancing service.

Having worked on a couple of Facebook games, I would say this:
Think about scalability for thousands of players, but you have to get tens of thousands of players before the effort of scaling for them will pay off.
That is to say, plan ahead, but worry about getting 1 player before you plan a system for thousands of concurrent players.
I suspect that the setup you describe will perform pretty well for your initial user base. While you are building, avoid doing things like requiring the login server to talk to the lobby server. Make each server stand on its own; the big thing that will kill you is inter-dependency between services.
But the most important thing is to get it done in the most expedient way you can. If you get enough users to tax your system, that will be a really good thing. You can hire a DBA to help you figure out how to scale out when you have that many users.

Related

Synchronise a variable between Java instances across a network

I have this assignment in college where they ask us to run a Java app as a socket server with multiple clients. Client sends a string, server returns the string in upper case with a request counter. Quite simple.
Each request made by any given client is counted on the server side and stored in a static variable shared by the client connection threads, so that each client request increments the counter globally on the server. That's working well.
Now, they ask us to run "backup" instances of that server on different machines on the network so that if the primary stops responding, the client connects to one of the backups. That, I got working. But the counter is obviously reset since it's a different server.
The challenge is that the request counter must be the same on the primary and the secondaries, so that if the primary responds to 10 requests and goes down, and the client switches to a backup and makes a request, the backup server responds with 11.
Here is what I considered:
1. If on the same PC, I'd use threads, but we're over the network, so I believe this will not work.
2. The server sends the counter to the client with the response, which in turn returns it to the server on the next request, and so forth. Not very "clean" IMO, but it could work.
3. Each server talks to the others to sync this counter. However, sockets don't seem very efficient for this, if it's even possible. RMI seems to be an alternative here, but I'd like confirmation before I start learning it.
Any leads or suggestions here? I'm not posting code because I don't need a complete solution, but if necessary I can invite you to the GitHub repo.
EDIT: There are no latency, reliability, or similar constraints for this project. There are X clients and Y servers (single master, multiple failovers). Additional third-party infrastructure like a DB isn't really an option, but third-party Java libraries are welcome. Basically, I just run it in Eclipse on multiple PCs. This is an introductory assignment on distributed systems, expected to be done in 2 weeks, so I believe "keep it simple" is the key here!
EDIT 2: The number and addresses of backup servers will be passed as arguments to the application so broadcast/discovery isn't necessary. We'll likely cover all those points in a later lab assignment in the semester :)
EDIT 3: From all your great suggestions, I'll try an implementation of some variation of #3 and let you know how it works. I think the issue I have here is to make sure all servers are aware of the others. But like I mentioned, they don't need to discover each other so I'll hard code it for now and revisit in the next assignment! Probably opt for some elected master... :)
If option #2 is allowed, then it is the easiest; however, I am not sure how it would work with multiple clients (so it depends on the requirements here).
Is it possible to back the servers by a shared db running on another computer? Ideally perhaps one clustered across multiple machines? Or can you use an event bus or 3rd party libraries / code (shared cache, JMS, or even EJBs)?
If not, then having the servers talk to each other is your best bet. Sockets can work, as could UDP multicast (be careful there though: there's no way to know if a message was missed, which is why TCP/sockets are safer). If the nodes are going to talk to each other, there are generally a few accepted ways to handle the setup:
Master / slaves: Current node is the master and all writes are to it. Slaves connect to the master and receive updates. When the master goes down a new master needs to be elected (see leader election). MongoDB works like this.
Everyone to everyone: Every node connects to every other known node. Can get complicated and might not scale well to lots of nodes.
Daisy chain: one node connects to the next node, which connects to the next, and so on. I don't believe this is widely used.
Ring network: Each node connects to two others in order to form a ring. This is generally superior to daisy chain, but a little bit more complicated to implement.
See here for more examples: https://en.wikipedia.org/wiki/Network_topology
If this was in the real world (i.e. not school), you would use either a shared cache (e.g. ehcache), local caches backed by an event bus (JMS of some sort), or a shared clustered db.
EDIT:
After re-reading your question, it seems you only have a single backup server to worry about, and my guess of the course requirements is that they simply want your backup server to connect to your primary server and also receive the variable count updates. This is completely fine to implement with sockets (it isn't inefficient for a single backup server), and is perhaps the solution they are expecting you to use.
E.g. the backup server connects to the primary server and either polls for updates across the held connection or simply listens for updates pushed from the primary server (see the sketch after the key notes below).
Key notes:
- You might need keep-alives to ensure the connection does not get killed.
- Make sure to implement re-connection logic if the connection from backup to primary dies.
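A minimal sketch of that backup side, assuming the primary simply writes the new count as a long on a dedicated replication connection after every request (the class name, port handling, and push format are assumptions, not part of the assignment):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicLong;

// Backup side: hold a connection to the primary and apply every counter
// update it pushes; reconnect with a small back-off if the link drops.
public class BackupCounterListener implements Runnable {
    private final String primaryHost;
    private final int primaryPort;
    private final AtomicLong counter = new AtomicLong();

    public BackupCounterListener(String primaryHost, int primaryPort) {
        this.primaryHost = primaryHost;
        this.primaryPort = primaryPort;
    }

    public long currentCount() {
        return counter.get();
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try (Socket socket = new Socket(primaryHost, primaryPort);
                 DataInputStream in = new DataInputStream(socket.getInputStream())) {
                while (true) {
                    // The primary is assumed to write the new count after each request.
                    counter.set(in.readLong());
                }
            } catch (IOException e) {
                // Primary is down or unreachable: back off, then reconnect.
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
}
```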
If this is for a networking course they may be expecting UDP multicast, but that may depend a little bit on the server / network environment.
This is a classic distributed systems problem. The right solution is some variation of your option #3, where the different servers communicate with each other.
Where it gets complicated is when you start to introduce latency, downtime, and/or network partitioning between the various servers. Eventually you'll need to arrive at some kind of consensus algorithm. Paxos is a well-known approach to this problem, but there are others; Raft is popular these days as well.
In my opinion, the best solution is to have a vector of counters, one counter per server. Each server increments its own counter and broadcasts the vector value to all the other servers. This data structure is a conflict-free replicated data type (CRDT).
The number of requests is calculated as the sum of all elements of the vector.
About consistency: if you need a strictly growing number on all servers, you need to synchronously replicate your new value before answering the client.
The penalty here is performance and availability.
About broadcasting: you can choose any broadcast algorithm you want. If the number of servers is not too large, you can use a full-mesh topology. If the number of servers becomes large, you can use a ring or star topology to replicate the data.
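A minimal sketch of that vector-of-counters idea (a grow-only counter CRDT); the class and method names are mine, and the broadcast transport is left out:

```java
import java.util.HashMap;
import java.util.Map;

// Each server only ever increments its own slot; merging two vectors takes
// the per-server maximum, so replicas converge regardless of message order
// or duplication. The total is the sum over all slots.
public class GCounter {
    private final String serverId;
    private final Map<String, Long> counts = new HashMap<>();

    public GCounter(String serverId) {
        this.serverId = serverId;
    }

    // Called when this server handles a client request.
    public synchronized void increment() {
        counts.merge(serverId, 1L, Long::sum);
    }

    // Called when a broadcast vector arrives from another server.
    public synchronized void merge(Map<String, Long> remote) {
        remote.forEach((id, value) -> counts.merge(id, value, Math::max));
    }

    // Snapshot to broadcast to the other servers.
    public synchronized Map<String, Long> snapshot() {
        return new HashMap<>(counts);
    }

    // Total number of requests handled across all servers.
    public synchronized long total() {
        return counts.values().stream().mapToLong(Long::longValue).sum();
    }
}
```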
The most true-to-life option is #3; it happens all the time. Nodes talk to one another on another port, and they self-discover by broadcast (UDP). Each server broadcasts its max on a UDP port; the other nodes listen and raise their value to that + 1 if their current value is less than that value, else they ignore it and instead broadcast their bigger value.
This will work best when there is a 200-300 ms gap between client calls. This also assumes that any server could be primary (as decided by a load balancer).
UDP is stable within a LAN and widely used.
Solutions to this problem trade off speed against consistency.
If you value consistency over speed you could try a synchronous approach (assuming servers A, B and C):
A receives initial request
A opens connection to B and C to request current counts from each
A calculates max count (based on its own value and the values from B and C), adds one and sends new count to B and C
A closes connections to B and C
A replies to original request, including new max count
At this point, all servers are in sync with the new max count, ready for a new request to any server.
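A sketch of that synchronous flow from A's point of view, assuming a hypothetical PeerClient transport (plain sockets or RMI would both fit):

```java
import java.util.List;

// Synchronous counter: ask every peer for its count, take the max, add one,
// push the new value back out, then reply to the client.
public class SyncCounter {
    // Hypothetical transport interface; could be implemented with sockets or RMI.
    public interface PeerClient {
        long getCount();
        void setCount(long value);
    }

    private final List<PeerClient> peers;  // B and C, from A's point of view
    private long count;

    public SyncCounter(List<PeerClient> peers, long initialCount) {
        this.peers = peers;
        this.count = initialCount;
    }

    // Handle one client request on this server.
    public synchronized long handleRequest() {
        long max = count;
        for (PeerClient peer : peers) {
            max = Math.max(max, peer.getCount());   // request current counts from B and C
        }
        count = max + 1;                            // calculate the new max count
        for (PeerClient peer : peers) {
            peer.setCount(count);                   // send the new count to B and C
        }
        return count;                               // reply to the original request
    }
}
```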
Edit: Of course, if you are able to use a shared database, the problem becomes much simpler.

Multi-client multi instance centralised server using TCP protocol gaming system in Java

I'm not a newbie in Java programming. I would like to know how I can proceed with my project.
I want to develop a centralized gaming system in Java using TCP/IP protocol socket system. It should get the player details and display the information in the gaming window.
These are my criteria:
What are the maximum and minimum numbers of players that can participate?
What should the server's behaviour be for a given state of the game board: should it invite one or more players to make their moves, notify a player of an opponent's move, or let a player declare the game over?
How do I update the game when a move is submitted by a player?
I'm not looking for a straight answer here. I'm looking for some guidance that would help me start the project. Are there any tools for a multi-client, multi-instance centralised server using the TCP protocol?
First the network layer
There are several networking libraries for Java, such as Mina and Netty.
With the help of those libraries, you can solve the networking problems easily.
And the Logic Layer
You should maintain all the user_context objects in your server's memory and bind each of them to the corresponding TCP connection. Most of the time, the user_context objects are kept in a hashmap/dictionary or an RB-tree.
So, when an event happens, you can find the corresponding user/client and send the message to them.
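For illustration, a minimal sketch of such a registry; the Channel interface here is a stand-in for whatever per-connection handle your networking library (Mina, Netty, ...) gives you:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One context object per logged-in player, looked up by user id whenever a
// game event needs to reach that player's connection.
public class UserRegistry {

    // Placeholder for a Mina/Netty connection handle.
    public interface Channel {
        void send(String message);
    }

    public static class UserContext {
        final String userId;
        final Channel channel;

        UserContext(String userId, Channel channel) {
            this.userId = userId;
            this.channel = channel;
        }
    }

    private final Map<String, UserContext> users = new ConcurrentHashMap<>();

    public void onLogin(String userId, Channel channel) {
        users.put(userId, new UserContext(userId, channel));
    }

    public void onDisconnect(String userId) {
        users.remove(userId);
    }

    // When a game event happens, find the affected player and notify them.
    public void notifyUser(String userId, String message) {
        UserContext ctx = users.get(userId);
        if (ctx != null) {
            ctx.channel.send(message);
        }
    }
}
```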
I would think the minimum number of players is 0.
The maximum is likely to depend on:
- your bandwidth: you need a significant upload speed if you want thousands of users.
- how much work there is in managing each user. You can connect 10,000 users to a single server if they are not doing much, but as you add functionality, the number of users per server will drop to 1,000, possibly only 100.
The choice of IO framework makes a big difference when you have unlimited bandwidth and trivial work per connection (usually copying byte[]s of zeros). For real applications, it's less likely to matter. I suggest that whatever solution you pick, you make it easy to replace, should you find a better one later.
Is there are any tools for Multi-client multi instance centralised server using TCP protocol???
A common tool for this is JMS, but games are one area where you might not use it. I would start with ActiveMQ, as this will get you up and running quickly; just make sure you can replace it easily later.

How to improve the performance of a stock data transfer application?

This is a problem I have worked on for several years, but I still don't have a good solution.
My application has two parts:
The first part runs on a server called the "ROOT server". It receives the real-time stock data from HKEx (the securities and futures exchange in Hong Kong) and broadcasts it to 5 other child servers, appending a timestamp to each data item when broadcasting.
The second part runs on the "child" servers. They receive the stock data from the ROOT server, parse each item, and extract the important information. Finally, they send it in a new text format to the clients. There may be hundreds to thousands of clients; each can register for certain stocks and get real-time information about them.
Performance is the most important thing. In the past several years, I have tried every solution I know to make it faster. "Faster" here means the first part receives and sends the data to the child servers as fast as it can, and the child servers receive, parse, and send the data to the clients as fast as they can.
For now, when the data rate from HKEx is 200K and there are 5 child servers, the first application has a 10 ms latency for each data item on average. The second part is not easy to test; it depends on the client count.
What I'm using:
OpenSUSE 10
Sun Java 5.0
Mina 2.0
The server hardware:
4-core CPU (I don't know the type)
4G ram
I'm considering how to improve the performance.
Do I need to use a concurrency framework such as Akka?
Should I try another language, e.g. Scala or C++?
Should I use a real-time Java system?
Your advice...
Need your help!
Update:
The applications log some important information for analysis, but I can't find any bottlenecks. HKEx will provide more data next year, and I don't think my application will be fast enough.
One of my customers has tested our application against another company's, and ours didn't have an advantage in speed. I just want to find a way to make it faster.
How the first application runs
The first application will receive the stock data from HKEx and broadcast them to several other servers. The steps are:
It connects to HKEx.
It logs in.
It reads the data. The data is in binary format; each item has a header, which is a 2-byte integer giving the length of the body, then the body, then the next item.
It puts the items into a hashmap in memory. The key is the sequence number of the item; the value is the byte array.
It logs the sequence number of each received item to disk, using log4j's buffered appender.
A daemon thread reads the data from the hashmap and inserts it into PostgreSQL every minute (this is just used to back up the data).
When clients connect to this server, it accepts them and tries to send all the data from the hashmap in memory. I used a thread pool in Mina; the acceptor and senders are in different threads.
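For reference, reading that length-prefixed format boils down to something like the following sketch (it assumes the 2-byte header is an unsigned big-endian length, which is what DataInputStream reads):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Reads one item at a time: a 2-byte length header followed by that many
// body bytes, repeated for each item on the stream.
public class FrameReader {
    private final DataInputStream in;

    public FrameReader(InputStream stream) {
        this.in = new DataInputStream(stream);
    }

    // Blocks until one complete item has arrived and returns its body.
    public byte[] readItem() throws IOException {
        int length = in.readUnsignedShort();  // 2-byte big-endian length header
        byte[] body = new byte[length];
        in.readFully(body);                   // read exactly 'length' body bytes
        return body;
    }
}
```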
I think the logic is very simple. With 5 clients, I measured a transfer speed of only 1.5M/s at most. I wrote the simplest possible socket program in Java and found it could reach 10M/s.
Actually, I've spent more than a year trying all kinds of solutions on this application, just to make it faster. That's why I feel desperate. Do I need to try a language other than Java?
About the 10 ms latency:
When the application receives a data item from HKEx, I record a timestamp for it. When the root server broadcasts the data to the child servers, it appends the timestamp to the data.
When a child server gets the data, it sends a message to the root server to get the current timestamp, then compares the two.
So, the 10 ms latency covers:
root server got the data ---> the child server got the data
child server sends a request for the root server's timestamp ---> root server receives it
But the second one is so small that we can ignore it.
The first thing to do to find performance bottlenecks is to find out where most of the time is spent. A way to determine this is to use a profiler.
There are open source profilers available such as http://www.eclipse.org/tptp/, or commercial profilers such as Yourkit Java Profiler.
One easy thing to do could be to upgrade the JVM to Java SE6 or Java 7. General JVM performance improved a lot at version 6. See the Java SE 6 Performance White Paper for more details.
If you have checked everything, and found no obvious performance optimizations, you may need to change the architecture to get better performance. This would obviously be most fruitful if you could at least identify where your application is spending time - sounds like there are several major components:
The HK Ex server (out of your control)
The network between the Exchange and your system
The "root" server
The network between the "root" and the "child" servers
The "child" servers
The network between "child" servers and the client
The clients
To know where to spend your time, money and energy, I'd at least want to see an analysis of those components, how long each component takes (min, max, avg), and what the specification is of each resource.
Easiest thing to change is hardware - bigger servers, more memory etc., or better bandwidth. Can you see if any of those resources are constrained?
Next thing to look at is changing the communication protocol to be more efficient - how do clients receive the stocks? Can you reduce the data size? 1.5M/s for only 5 clients sounds like a lot...
Next, you might look at some kind of quality of service solution - provide dedicated hardware for "premium" customers, with reduced resource contention, more servers, more bandwidth - this will probably require changes to the architecture.
Next, you could consider changing the architecture - right now, your clients "pull" data from the child servers. You could, instead, "push" data out - that way, you shave off the polling interval on the client end.
At the very end of the list, I'd consider a different technology stack; Java is a fine programming language, but if absolute performance is a key priority, C/C++ is still faster. Clearly, that's a huge change, and a well-written Java app will be faster than a poorly written C/C++ app (and far more stable).
To trace the source of the delay, I would add timing data to your end-to-end process. You can do this using an external log, or by adding metadata to your messages.
What you want is a timestamp at key stages in your application; 3-5 stages are enough to start with. Normally I would use System.nanoTime() because I am looking for microsecond delays, but in your case System.currentTimeMillis() is likely to be enough, especially if you average over many samples (you will still get 0.1 ms accuracy on an average, with Ubuntu).
Compare timestamps for the same messages as they pass through your system and look for the highest average delay. Once you have found it, try breaking this interval into more stages to zoom in on the problem.
I would analyse any stage which has an average delay of over 1 ms in your situation.
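A minimal sketch of that kind of instrumentation (the Message class and the stage indices are placeholders):

```java
// Stamp each message at a few key stages so per-stage latency can be averaged later.
public class LatencyProbe {

    public static class Message {
        final long[] stamps = new long[4];   // e.g. received, parsed, queued, sent
    }

    public static void stamp(Message msg, int stage) {
        msg.stamps[stage] = System.nanoTime();
    }

    // Average delay between two stages over a batch of messages, in microseconds.
    public static double averageMicros(Iterable<Message> batch, int from, int to) {
        long totalNanos = 0;
        int count = 0;
        for (Message m : batch) {
            totalNanos += m.stamps[to] - m.stamps[from];
            count++;
        }
        return count == 0 ? 0 : (totalNanos / 1000.0) / count;
    }
}
```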
If clients are only updating every minute, there might not be a good technical reason to do this, but you don't want to be seen as slow, with your traders at a disadvantage, even if in reality it won't make a difference.

Advice for writing Client-Server based game

I'm thinking about writing a game which is based around a server, and several client programs connect to it. The game (very) basically consists of a list of items which a user can 'accept', which would remove it from the list on all connected computers (this needs to update very quickly).
I'm thinking about using a Java applet for the client since I would like this to be portable and run from a browser (mostly in Windows), as well as updating fast, and either a C++ or Java server running on Linux (currently just a home server, but possibly to go on a VPS).
A previous 'incarnation' of this game ran in a browser, and used PHP+mySQL for the backend, but this swamped the server quite a bit when several people connected (that was with about 8 people, this would eventually need to handle a lot more).
The users would probably all be in the same physical location (with the same public IP address), and the system would get several requests per second, all of which would require sending the list back to the clients.
Some computers may have firewall restrictions on them, so would you recommend using HTTP traffic, a custom port, or perhaps through SSH or some existing protocol?
Could anyone suggest some tips (threading, multiple requests of one item?), tools, databases (mySQL?), or APIs which would help me get started on this project? I would prefer C++ for the backend as it would be faster, but using Java would allow me to reuse code.
Thanks!
I wouldn't use C++ for speed alone. It is highly unlikely that the difference in performance will make a real difference to your game. (Your network is likely to cloud any performance difference, unless you have 10 GigE between the client and server.) I would choose between C++ and Java based on which language you will get it working in first.
For anyone looking for a good networking API for C++, I always suggest Boost.Asio. It has the advantage of being platform independent, so you can compile a server for Linux, Windows, etc. However, if you are not too familiar with C++ templates/Boost, the code can be a little overwhelming. Have a look, give it a try.
In terms of general advice: given the description above, you seem to need a relatively simple server. I would suggest keeping it very basic: a single-threaded polling loop. Read a message from your connected clients (wait on multiple sockets) and respond appropriately. This eliminates any issues around multiple accesses to your list and other synchronization problems.
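A minimal sketch of such a single-threaded polling loop using java.nio (the port number and buffer handling are arbitrary, and request parsing is left as a stub):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One Selector watches the listening socket and every client socket, so the
// item list is only ever touched from this single thread.
public class SingleThreadedServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (true) {
            selector.select();   // wait on all sockets at once
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) == -1) {
                        client.close();
                    } else {
                        // parse the request, update the item list, and broadcast the change here
                    }
                }
            }
        }
    }
}
```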
I might also suggest, before you rewrite your initial incarnation, trying to improve it. As you have stated:
and the system would get several requests per second, all of which would require sending the list back to the clients.
Given that each request removes an item from this list, why not just inform your users which item was removed, rather than sending the entire list over the network time and time again? If this list is of any significant size, this minor change will result in a large improvement.

Critically efficient server

I am developing a client-server application for financial alerts, where the client can set a value as the alert for a chosen financial instrument, and when this value is reached the monitoring server will somehow alert the client (email, SMS... not important). The server will monitor updates that come from a data generator program. Now, the server has to be very efficient, as it has to handle many clients (possibly over 50,000-100,000 alerts, with updates coming every 1-2 seconds). I've written servers before, but never with such performance requirements, and I'm simply afraid that a basic approach (like before) just won't do it. So how should I design the server? What kind of data structures are best suited? What about multithreading? In general, what should I do (and what should I not do) to squeeze every drop of performance out of it?
Thanks.
I've worked on servers like this before. They were all written in C (or fairly simple C++). But they were even higher performance -- handling 20K updates per second (all updates from most major stock exchanges).
We would focus on not copying memory around. We were very careful in what STL classes we used. As far as updates, each financial instrument would be an object, and any clients that wanted to hear about that instrument would subscribe to it (ie get added to a list).
The server was multi-threaded, but not heavily so -- maybe a thread handling incoming updates, one handling outgoing client updates, one handling client subscribe/release notifications (I don't remember that part -- just remember it had fewer threads than I would have expected, but not just one).
EDIT: Oh, and before I forget, the number of financial transactions happening is growing at an exponential rate. That 20K/sec server was just barely keeping up and the architects were getting stressed about what to do next year. I hear all major financial firms are facing similar problems.
You might want to look into using a proven message queue system, as it sounds like this is basically what you are doing in your application.
Projects like Apache's ActiveMQ or RabbitMQ are already widely used and highly tuned, and should be able to support the type of load you are talking about out of the box.
I would think that squeezing every drop of performance out of it is not what you want to do, as you really never want that server to be under load significant enough to take it out of a real-time response scenario.
Instead, I would use a separate machine to handle messaging clients, and let that main, critical server focus directly on processing input data in "real time" to watch for alert criteria.
Best advice is to design your server so that it scales horizontally.
This means distributing your input events to one or more servers (on the same or different machines), that individually decide whether they need to handle a particular message.
Will you be supporting 50,000 clients on day 1? Then that should be your focus: how easily can you define a single client's needs, and how many clients can you support on a single server?
Second-best advice is not to artificially constrain yourself. If you say "we can't afford to have more than one machine," then you've already set yourself up for failure.
Beware of any architecture that needs clustered application servers to get a reasonable degree of performance. London Stock Exchange had just such a problem recently when they pulled an existing Tandem-based system and replaced it with clustered .Net servers.
You will have a lot of trouble getting this type of performance from a single Java or .Net server - really you need to consider C or C++. A clustered architecture is much more error prone to build and deploy and harder to guarantee uptime from.
For really high volumes you need to think in terms of using asynchronous I/O for networking (i.e. poll(), select() and asynchronous writes or their Windows equivalents), possibly with a pool of worker threads. Read up about the C10K problem for some more insight into this.
There is a very mature C++ framework called ACE (Adaptive Communications Environment) which was designed for high volume server applications in telecommunications. It may be a good foundation for your product - it has support for quite a variety of concurrency models and deals with most of the nuts and bolts of synchronisation within the framework. You might find that the time spent learning how to drive this framework pays you back in less development and easier implementation and testing.
One thread for receiving instrument updates, which will process each update and put it in a BlockingQueue.
One thread to take updates from the BlockingQueue and hand them off to the process that handles that instrument, or set of instruments. This process will need to serialize the events for an instrument so the customer does not receive notices out of order.
This process (thread) will need to iterate through the list of customers registered to receive notifications and create a list of customers who should be notified based on their criteria. It should then hand that list off to another process that will notify the customers of the change.
The notification process should iterate through the list and send each notification event to another process that handles how the customer wants to be notified (email, etc.).
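A rough sketch of the first two stages of that pipeline (the Update class and the dispatch step are placeholders):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Thread 1 puts incoming instrument updates on a BlockingQueue; thread 2
// drains the queue and hands each update to the handler for that instrument,
// preserving per-instrument ordering.
public class UpdatePipeline {

    public static class Update {
        final String instrument;
        final double value;

        Update(String instrument, double value) {
            this.instrument = instrument;
            this.value = value;
        }
    }

    private final BlockingQueue<Update> queue = new LinkedBlockingQueue<>();

    // Called by the feed-handling thread for every incoming instrument update.
    public void onUpdate(Update update) throws InterruptedException {
        queue.put(update);
    }

    // Run by the dispatcher thread.
    public void runDispatcher() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            Update update = queue.take();
            dispatchToInstrumentHandler(update);
        }
    }

    private void dispatchToInstrumentHandler(Update update) {
        // look up the handler (and registered customers) for update.instrument and pass it on
    }
}
```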
One of the problems will be synchronizing access to the list of 100,000 customers and the criteria they want monitored.
You should try to find a way to organize the alerts as a tree and be able to quickly decide what alerts can be triggered by an update.
For example, let's assume the alert is the level of a certain indicator. Said indicator can have a range of 0 to n. I would group the clients who want to be notified about the level of said indicator into a sort of binary tree. That way you can scale it properly (you can actually implement a subtree as a process on a different machine), and the number of comparisons required to find the proper subset of clients will always be logarithmic.
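A sketch of that idea for a single indicator, using a TreeMap keyed by threshold instead of a hand-rolled binary tree (the names are made up, and it assumes an alert fires when the indicator reaches or exceeds its threshold):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Alerts for one indicator, keyed by threshold. A single range query finds
// every alert triggered by a new value in O(log n) plus the number of matches.
public class ThresholdIndex {

    private final NavigableMap<Double, List<String>> alertsByThreshold = new TreeMap<>();

    // Client wants to be alerted when the indicator reaches 'threshold'.
    public synchronized void register(double threshold, String clientId) {
        alertsByThreshold.computeIfAbsent(threshold, k -> new ArrayList<>()).add(clientId);
    }

    // On an update, every client whose threshold is <= the new value is notified.
    public synchronized List<String> triggeredBy(double newValue) {
        List<String> triggered = new ArrayList<>();
        for (List<String> clients : alertsByThreshold.headMap(newValue, true).values()) {
            triggered.addAll(clients);
        }
        return triggered;
    }
}
```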
The Apache Mina network application framework and Apache Camel for message routing are probably a good starting point. The Kilim message-passing framework also looks very promising.
