I want to query a few game servers.
I have made a server refresher in Java which queries each server in a loop, one by one.
Will switching to C/C++ or PHP make querying faster, or should I stick with Java?
UDP packets are sent and received to query a server.
Also, is there a faster way to do this than one by one in a loop?
The worst-case time (when all servers are offline) is 200 ms × the number of servers (2 s is the timeout for each), which becomes large when the server list is huge.
You will not gain anything by switching languages. Since network I/O is your main bottleneck, you should do the querying concurrently. Use threads or a thread pool to query multiple servers at once.
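A minimal sketch of that idea, assuming a fixed thread pool of 50 workers and a placeholder one-byte probe packet (the real query payload depends on your game's protocol; the addresses, pool size, and `isOnline` helper are all illustrative):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Query each server on its own pooled thread so the 2 s timeouts overlap
// instead of adding up. The probe payload is a placeholder; substitute
// your game protocol's real query packet.
public class ConcurrentQuery {
    static boolean isOnline(InetSocketAddress server) {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(2000);                      // per-server timeout
            byte[] probe = new byte[]{(byte) 0xFF};         // placeholder query packet
            socket.send(new DatagramPacket(probe, probe.length, server));
            DatagramPacket reply = new DatagramPacket(new byte[1500], 1500);
            socket.receive(reply);                          // throws on timeout
            return true;
        } catch (Exception e) {
            return false;                                   // timeout or I/O error
        }
    }

    public static Map<InetSocketAddress, Boolean> queryAll(List<InetSocketAddress> servers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(50);
        try {
            // Submit every query first so they run concurrently...
            Map<InetSocketAddress, Future<Boolean>> futures = new LinkedHashMap<>();
            for (InetSocketAddress s : servers) {
                futures.put(s, pool.submit(() -> isOnline(s)));
            }
            // ...then collect the results.
            Map<InetSocketAddress, Boolean> result = new LinkedHashMap<>();
            for (Map.Entry<InetSocketAddress, Future<Boolean>> e : futures.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool of 50, the worst case drops from `2 s × N` to roughly `2 s × ceil(N / 50)`.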
There are a few ways you could speed this up:
Use C instead, which can be slightly faster if you know how to write good C code.
Add multi-threading by querying multiple servers at the same time.
Use multiple servers (e.g. VPSs) from different continents and use those to query the gameservers closest to them. This will significantly decrease the latency.
I'm just curious how to solve the connection-pooling problem in a scalable Java application.
Imagine I have a Java web application with HikariCP set up (max pool size is 20) and PostgreSQL with max allowed connections of 100.
Now I want to implement a scalability approach for my web app (no matter how), even with autoscaling. So I don't know how many web app replicas there will eventually be; the number may change dynamically (caused, e.g., by cluster workload).
But there is a problem: when I create more than 5 web app replicas, my total connection count exceeds the max allowed connections.
Are there any best practices to solve this problem (other than the obvious increasing max allowed connections/decreasing pool size)?
Thanks
You need an orchestrator over the web application. It would be responsible for the scaling in and out, and it would manage the connections so as not to exceed the limit of 100, opening and closing them according to the traffic.
Nevertheless, my recommendation is to consider migrating to a NoSQL database, which is a more suitable solution for scalability and performance.
I'll start by saying that whatever you do, as long as you're restricted by 100 connections to your DB - it will not scale!
That said, you can optimize and "squeeze" performance out of it by applying a couple of known tricks. It's important to understand the trade-offs (availability vs. consistency, latency vs. throughput, etc.):
Caching: if you can anticipate certain select queries, you can compute them offline (maybe even from a replica?) and cache the results. The trade-off: the user might get results which are not up to date.
Buffering/throttling: all updates/inserts go to a queue, and only a few workers are allowed to pull from the queue and update the DB. The trade-off: you get more availability but become "eventually consistent" (since updates won't be visible right away).
It might come to that you'll have to run the selects in async manner as well, which means that the user submits a query, and when it's ready it'll be "pushed" back to the client (or the client can keep "polling" every few seconds). It can be implemented with a callback as well.
By separating the updates (writes) from reads you'll be able to get more performance by creating replicas that are "read only" and which can be used by the webservers for read-queries.
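One mechanical safeguard, independent of the tricks above: if each replica can discover the current replica count (e.g. from the orchestrator), it can derive its own pool size so the total never exceeds the database limit. A minimal sketch; the class name, method name, and numbers are illustrative:

```java
// Derive a per-replica pool size so that replicas * poolSize never
// exceeds the database's max_connections. All names are illustrative.
public class PoolSizing {
    /**
     * @param maxDbConnections the database's max allowed connections (e.g. 100)
     * @param replicaCount     current number of web app replicas
     * @param reserved         connections held back for admin tools, migrations, etc.
     * @return pool size each replica may use (at least 1)
     */
    static int poolSizeFor(int maxDbConnections, int replicaCount, int reserved) {
        int usable = maxDbConnections - reserved;
        return Math.max(1, usable / replicaCount);
    }
}
```

The result would be fed into HikariCP's `maximumPoolSize` on startup (or on a scaling event, since HikariCP allows resizing at runtime via its MXBean). With 100 max connections, 10 reserved, and 5 replicas this gives 18 per replica; with 200 replicas it bottoms out at 1, which is the point at which an external pooler such as PgBouncer becomes the better answer.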
I have a program which spins up thousands of threads. I am currently using one host for all the threads, which takes a lot of time. If I want to use multiple hosts (say 10 hosts, each running 100 different threads), how should I proceed?
Having thousands of threads on a single JVM sounds like a bad idea - you may spend most time context-switching instead of doing the actual work.
To split your work across multiple hosts, you cannot use threads managed by a single JVM. You'll need each host to expose an API that can receive part of the work and return the result of the work done.
One approach would be to use Java RMI (remote method invocation) to complete this task, but really, your question lacks so many details important for the decision of what architecture to choose.
Creating 1000 threads in one JVM is very bad design; you need to minimise the count.
A high thread count will not give you a multi-threading benefit, as context switching will be very frequent and will hurt performance.
If you are thinking of dividing the work across multiple hosts, then you need a parallel processing system like Hadoop/Spark.
They internally handle task allocation and act as a central system for syncing all the hosts on which the threads/tasks are running.
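On a single host, the usual fix for "thousands of threads" is to submit thousands of *tasks* to a small fixed pool sized near the core count. A minimal sketch, with a trivial computation standing in for the real work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Instead of one thread per unit of work, queue the work on a pool of
// roughly one thread per core; threads are reused and context switching
// stays cheap. The sum-of-squares task is a stand-in for real work.
public class BoundedPool {
    public static long sumSquares(int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final long n = i;
                futures.add(pool.submit(() -> n * n));  // one task, not one thread
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();                       // collect results in order
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

The same pattern scales out naturally: each of the 10 hosts runs one such pool, and a coordinator (RMI, a message queue, or Hadoop/Spark as suggested above) hands each host its share of the tasks.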
I've been browsing questions that may already have my answer, but they don't directly answer my question. My situation: I'd like to write a plugin for a game that collects statistics. With enough players in the server, one statistic could easily generate about 200 queries within 3 seconds against a remote database, on the specifications shown below. I have two questions. First, is this going to cause noticeable network issues on a 100 Mbit port? Second, will all the queries show tremendous amounts of CPU usage on top of a highly intensive game engine that already takes a lot of CPU?
Server Specifications
- i3 3420 4 Cores
- 16GB RAM
- 100Mbit Port
On a side note:
Would moving the database to the local server reduce potential usage to the point where it's highly recommended?
Well, without knowing the amount of data being stored, it's hard to make any judgement calls on this one. However, a couple of things...
I doubt any database could handle 200 queries in 3 seconds on that kind of machine, unless you have tables with only a few records.
The 100 Mbit port won't be a problem; you're not actually transporting the whole database across the wire, just the query ("SELECT ... FROM ...") and the results (which should be a single row for statistics).
However, you will bog down the server with such queries, causing hiccups and delays for your gamers. My recommendation would be to replicate the game database to a separate server (a simple master/slave setup) and perform your statistics queries on the slave database.
This may not be possible, but I thought I might just give it a try. I have a worker that processes some data; it makes one of 3 decisions for each item it processes: keep, discard, or modify/reprocess (because it's unsure whether to keep or discard). This generates a very large amount of data, because reprocessing may break the data into many different parts.
My initial method was to send everything to the ExecutorService that was processing the data, but because the number of items to process was large, I would run out of memory very quickly. Then I decided to offload the queue to a messaging server (RabbitMQ), which works fine, but now I'm bound by network I/O. What I like about RabbitMQ is that it keeps messages in memory up to a certain level and then dumps old messages to the local drive, so if I have 8 GB of memory on my server I can still have a 100 GB message queue.
So my question is: is there any library with a similar feature in Java? Something I can use as a non-blocking queue that keeps only X items in memory (either by number of items or by size) and writes the rest to the local drive.
Note: right now I'm only asking for this to be used on one server. In the future I might add more servers, but because each server is self-generating data, I would try to take messages from one queue and push them to another if one server's queue is empty. The library would not need network access, but I would need to access the queue from another Java process. I know this is a long shot, but I thought if anyone knew, it would be SO.
Not sure if it is the approach you are looking for, but why not use a lightweight database like HSQLDB with a persistence layer like Hibernate? You can have your messages in memory, then commit to the DB to save them on disk, and later query them with a convenient SQL query.
Actually, as Cuevas wrote, HSQLDB could be a solution. If you use the "cached table" type it provides, you can specify the maximum amount of memory used; data exceeding that will be written to the hard drive.
Use the filesystem. It's old-school, yet so many engineers get bitten by libraries because they are lazy. True, HSQLDB provides lots of value-added features, but in the context of being lightweight...
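A minimal sketch of the filesystem approach: a bounded FIFO queue that keeps at most N messages in memory and spills the overflow to one file per message, refilling memory as items are polled. The class name, directory layout, and capacity are all illustrative, and a production version would need crash recovery and batched files:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

// Bounded in-memory queue that spills overflow to disk, one file per
// message. FIFO order is preserved: memory always holds the oldest items.
class SpillQueue {
    private final int memoryLimit;
    private final Deque<String> memory = new ArrayDeque<>();
    private final Deque<Path> spilled = new ArrayDeque<>();
    private final Path dir;
    private long counter = 0;   // gives spill files unique, ordered names

    SpillQueue(int memoryLimit, Path dir) throws IOException {
        this.memoryLimit = memoryLimit;
        this.dir = Files.createDirectories(dir);
    }

    synchronized void offer(String msg) throws IOException {
        if (memory.size() < memoryLimit && spilled.isEmpty()) {
            memory.addLast(msg);                      // fast path: keep in RAM
        } else {
            Path f = dir.resolve("msg-" + (counter++));
            Files.write(f, msg.getBytes(StandardCharsets.UTF_8));
            spilled.addLast(f);                       // overflow goes to disk
        }
    }

    synchronized String poll() throws IOException {
        String head = memory.pollFirst();             // null when empty
        // Refill memory from disk so the oldest items stay in RAM.
        while (!spilled.isEmpty() && memory.size() < memoryLimit) {
            Path f = spilled.pollFirst();
            memory.addLast(new String(Files.readAllBytes(f), StandardCharsets.UTF_8));
            Files.delete(f);
        }
        return head;
    }
}
```

This is essentially what RabbitMQ and HSQLDB's cached tables do internally, minus the network hop; sharing it with another Java process would still require some IPC layer on top.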
This is a question I have worked on for several years, but I still don't have a good solution.
My application has two part:
The first part runs on a server called the "ROOT server". It receives realtime stock data from HKEx (the securities and futures exchange in Hong Kong) and broadcasts it to 5 other child servers, appending a timestamp to each data item when broadcasting.
The second part runs on the "child" servers. They receive the stock data from the ROOT server, parse each item, and extract the important information. Finally, they send it in a new text format to the clients. The clients may number in the hundreds to thousands; they can register for certain stocks and get realtime information about them.
Performance is the most important thing. Over the past several years, I have tried every solution I know to make it faster. "Faster" here means: the first part should receive and send the data to the child servers as fast as it can, and the child servers should receive, parse, and send the data to the clients as fast as they can.
For now, when the data rate from HKEx is 200K and there are 5 child servers, the first application has 10 ms latency per data item on average. The second part is not easy to test; it depends on the client count.
What I'm using:
OpenSUSE 10
Sun Java 5.0
Mina 2.0
The server hardware:
4-core CPU (I don't know the type)
4G ram
I'm considering how to improve the performance.
Do I need to use a concurrency framework such as Akka?
Should I try another language, e.g. Scala or C++?
Should I use a real-time Java system?
Your advice...
I need your help!
Update:
The applications log some important information for analysis, but I haven't found any bottlenecks. HKEx will provide more data next year, and I don't think my application will be fast enough.
One of my customers tested our application against another company's, and ours didn't have an advantage in speed. I just want to find a way to make it faster.
How the first application runs
The first application receives the stock data from HKEx and broadcasts it to several other servers. The steps are:
It connects to HKEx
logs in
reads the data. The data is in binary format: each item has a header, a 2-byte integer giving the length of the body, then the body, then the next item.
puts the items into a HashMap in memory. The key is the sequence number of the item; the value is the byte array.
logs the sequence number of each received item to disk, using log4j's buffered appender.
a daemon thread reads the data from the HashMap and inserts it into PostgreSQL every 1 minute (this is just used to back up the data).
when clients connect to this server, it accepts them and tries to send all the data from the HashMap in memory. I used a thread pool in Mina; the acceptor and senders are in different threads.
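The read step above (a 2-byte length header followed by the body, repeated) can be sketched like this; the class name is illustrative, and a real implementation would read from the socket's stream rather than a byte array:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Parse the length-prefixed wire format described above: each item is a
// 2-byte (unsigned, big-endian) length header followed by that many
// body bytes, immediately followed by the next item.
public class FrameParser {
    public static List<byte[]> parse(byte[] stream) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        List<byte[]> items = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = in.readUnsignedShort();   // 2-byte header = body length
            } catch (EOFException eof) {
                break;                             // clean end of stream
            }
            byte[] body = new byte[length];
            in.readFully(body);                    // throws if the body is truncated
            items.add(body);
        }
        return items;
    }
}
```

One performance note on this step: allocating a fresh byte array per item and boxing it into a HashMap generates garbage at 200K items; reusing buffers or a ring buffer here is a common latency win.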
I think the logic is very simple. With 5 clients, the transfer speed I monitored is only 1.5 MB/s at most. I wrote the simplest possible socket program in Java and found it could reach 10 MB/s.
Actually, I've spent more than a year trying all kinds of solutions on this application just to make it faster. That's why I feel desperate. Do I need to try a language other than Java?
About the 10 ms latency
When the application receives a data item from HKEx, it records a timestamp for it. When the root server broadcasts the data to the child servers, it appends the timestamp to the data.
When a child server gets the data, it sends a message to the root server to get the current timestamp, then compares them.
So, the 10ms latency contains:
root server got the data ---> the child server got the data
child server sends a request for the root server's timestamp ---> root server receives it
But the second part is so small that we can ignore it.
The first thing to do to find performance bottlenecks is to find out where most of the time is spent. A way to determine this is to use a profiler.
There are open-source profilers available, such as Eclipse TPTP (http://www.eclipse.org/tptp/), and commercial profilers such as YourKit Java Profiler.
One easy thing to do could be to upgrade the JVM to Java SE 6 or Java 7. General JVM performance improved a lot in version 6. See the Java SE 6 Performance White Paper for more details.
If you have checked everything, and found no obvious performance optimizations, you may need to change the architecture to get better performance. This would obviously be most fruitful if you could at least identify where your application is spending time - sounds like there are several major components:
The HKEx server (out of your control)
The network between the Exchange and your system
The "root" server
The network between the "root" and the "child" servers
The "child" servers
The network between "child" servers and the client
The clients
To know where to spend your time, money and energy, I'd at least want to see an analysis of those components, how long each component takes (min, max, avg), and what the specification is of each resource.
Easiest thing to change is hardware - bigger servers, more memory etc., or better bandwidth. Can you see if any of those resources are constrained?
Next thing to look at is changing the communication protocol to be more efficient - how do clients receive the stocks? Can you reduce the data size? 1.5 MB/s for only 5 clients sounds like a lot...
Next, you might look at some kind of quality of service solution - provide dedicated hardware for "premium" customers, with reduced resource contention, more servers, more bandwidth - this will probably require changes to the architecture.
Next, you could consider changing the architecture - right now, your clients "pull" data from the child servers. You could, instead, "push" data out - that way, you shave off the polling interval on the client end.
At the very end of the list, I'd consider a different technology stack; Java is a fine programming language, but if absolute performance is a key priority, C/C++ is still faster. Clearly, that's a huge change, and a well-written Java app will be faster than a poorly written C/C++ app (and far more stable).
To trace the source of the delay, I would add timing data to your end-to-end process. You can do this using an external log, or by adding metadata to your messages.
What you want is a timestamp at key stages in your application; 3-5 stages are enough to start with. Normally I would use System.nanoTime() because I am looking for microsecond delays, but in your case System.currentTimeMillis() is likely to be enough, especially if you average over many samples (you will still get 0.1 ms accuracy on an average, with Ubuntu).
Compare timestamps for the same messages as they pass through your system and look for the stage with the highest average delay. Once you have found it, try breaking that interval into more stages to zoom in on the problem.
I would analyse any stage which has an average delay of over 1 ms in your situation.
If clients are updating every minute, there might not be a good technical reason to do this, but you don't want to be seen as slow, with your traders at a disadvantage, even if in reality it won't make a difference.
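The per-stage timing approach above can be sketched as a small accumulator; the class and stage names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Accumulate per-stage latencies so the average delay of each hop
// (receive, parse, broadcast, ...) can be compared. Thread-safe, so
// it can be shared across Mina's I/O threads.
public class StageTimer {
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    // Call with System.nanoTime() values captured before and after a stage.
    public void record(String stage, long startNanos, long endNanos) {
        totalNanos.computeIfAbsent(stage, k -> new LongAdder()).add(endNanos - startNanos);
        counts.computeIfAbsent(stage, k -> new LongAdder()).increment();
    }

    public double averageMillis(String stage) {
        long n = counts.get(stage).sum();
        return totalNanos.get(stage).sum() / 1_000_000.0 / n;
    }
}
```

Dumping the averages once a minute to the existing log4j log is enough to see which hop carries most of the 10 ms, without adding measurable overhead itself.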