Significant runtime differences when running a PHP script from Android (WiFi/GPRS) - java

I'm quite sure that I'm overlooking something very fundamental, but I can't understand the following case:
On the one hand, I have my Android app, which I can start on either WiFi or GPRS.
On the other hand, I have a PHP script on an independent server which executes a relatively complex algorithm but returns just a few bytes (about 80).
The Android app connects to the server and triggers the PHP script through a normal URLConnection, fetches the InputStream and processes it.
Now, when I do that on WiFi, it's quite fast. But if I use GPRS/EDGE, it's about 10-20 times slower.
That's what I don't understand. I would understand such a difference if the script returned a lot of bytes that had to be transferred, but these are just a few bytes.
I would have thought that the runtime of the PHP script on the server is fully independent of who is calling it, and that it delivers its result consistently fast.
Can somebody tell me where these performance differences might come from?
greetz
rob

Most likely your script isn't running slower; check your server logs to confirm. What you're seeing is the high latency of cellular data connections. It takes a long time for a request to get from your phone to the tower and on to the server, and even a single small request needs several round trips (connection setup, then the request and the response). It isn't causing your PHP to run slower; it's just taking a while for the data to travel to your server and back.
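To make the latency argument concrete, here is a rough back-of-the-envelope model in Java. Every number in it (round-trip times, number of round trips, bandwidth, server time) is an illustrative assumption, not a measurement of any real network; only the 80-byte response size comes from the question.

```java
/** Rough model of one HTTP request; all figures are illustrative assumptions, not measurements. */
class RequestTimeModel {

    /**
     * @param roundTrips     network round trips before the response arrives
     *                       (for example DNS lookup, TCP handshake, HTTP exchange)
     * @param rttMillis      round-trip time of the link
     * @param serverMillis   time the script runs on the server
     * @param payloadBytes   size of the response
     * @param bytesPerSecond usable bandwidth of the link
     */
    static double estimateMillis(int roundTrips, double rttMillis, double serverMillis,
                                 long payloadBytes, double bytesPerSecond) {
        double transferMillis = payloadBytes / bytesPerSecond * 1000.0;
        return roundTrips * rttMillis + serverMillis + transferMillis;
    }

    public static void main(String[] args) {
        // Assumed values: 3 round trips, 100 ms of server work, 80-byte response.
        double wifi = estimateMillis(3, 20.0, 100.0, 80, 1_000_000.0);
        double gprs = estimateMillis(3, 700.0, 100.0, 80, 5_000.0);
        System.out.printf("WiFi: %.1f ms, GPRS: %.1f ms%n", wifi, gprs);
    }
}
```

With these assumed inputs the GPRS estimate comes out at roughly 14 times the WiFi one, even though the server time is identical and the payload is tiny. The round trips dominate, not the bytes.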

Related

Finding the cause for bad XSLT performance in Lucee

For quite some time now, we've been struggling to find the cause of XSL transformations that occasionally perform really badly.
So far there's nothing we can make out as a real cause, since it can occur under heavy load but also when the server is basically idle. The attached example happened when there were 158 requests in 15 minutes, so there was no load worth mentioning.
We were suspecting some external XML documents that are used within the transformations, but that doesn't seem to be the problem either, since they usually load within milliseconds, sometimes maybe seconds, but nothing that would explain the 200+ seconds the requests took.
The same transformations run quite well when we try them later to check if there's a problem.
We are running Fusion Reactor to monitor our server but there's nothing unusual to see as well. In yesterday's cases, there was neither high CPU load nor anything else out of the ordinary.
I attached a screenshot from Fusion Reactor's profiler, where you can see the times taken and it always seems to be the "scanDocument" part that takes up 99.x% of the time, if we interpret the result correctly.
Is there any way to find out what's causing the delay here?
The versions we are currently running are:
Ubuntu: 14.04.5 LTS
Java: 1.8.0_45
Lucee: 4.5.4.017 final
Well, 99.8% of the time is spent in SocketInputStream.socketRead0, so I'd blame a slow network connection.
The rest of the program is just waiting for bytes to arrive over the slow network connection, which is why you don't see high CPU load.

Many Java threads downloading content from web - but not saturating the bandwidth

A while back I tried implementing a crawler in Java and left the project for a while (made a lot of progress since). Basically I have implemented a crawler with circa 200-400 threads, each thread connects and downloads the content of one page (simplified for clarity, but that's basically it):
// we're in a run() method of a truly generic Runnable.
// _url is a member passed to the Runnable object beforehand.
Connection c = Jsoup.connect(_url).timeout(10000);
c.execute();
Document d = c.response().parse();
// use Jsoup to get the links, add them to the backbone of the crawler
// to be checked and maybe later passed to the crawling queue.
This works. The problem is I only use a very small fraction of my internet bandwidth. Although I can download at >6MB/s, I've determined (using NetLimiter and my own calculations) that I only use about 1MB/s at best when downloading page sources.
I've done a lot of statistics and analyses and it is somewhat reasonable - if the computer cannot efficiently support over ~400 threads (I don't know about that also, but a larger number of threads seems to be ineffective) and each connection takes about 4 seconds to complete, then I'm supposed to download 100 pages per second which is indeed what happens. The bizarre thing is that many times while I run this program, the internet connection is completely clogged - neither I nor anyone else on my wifi connection can access the web normally (when I'm only using 16%! which does not happen when downloading other files, say movies).
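The thread arithmetic above is an instance of Little's law: throughput equals the number of concurrent workers divided by the time each one needs. A small sketch with the figures from this question follows; the implied page size is only derived arithmetic, and it assumes 1MB/s means 1,000,000 bytes per second.

```java
/** Throughput arithmetic for a pool of blocking download threads (Little's law). */
class CrawlerThroughput {

    /** Pages completed per second when each of the workers needs secondsPerPage per page. */
    static double pagesPerSecond(int threads, double secondsPerPage) {
        return threads / secondsPerPage;
    }

    /** Average page size implied by an observed byte rate and page rate. */
    static double impliedBytesPerPage(double bytesPerSecond, double pagesPerSecond) {
        return bytesPerSecond / pagesPerSecond;
    }

    public static void main(String[] args) {
        double pages = pagesPerSecond(400, 4.0); // 400 threads, about 4 s per connection
        double pageSize = impliedBytesPerPage(1_000_000.0, pages); // observed rate of about 1MB/s
        System.out.printf("%.0f pages/s, about %.0f bytes per page%n", pages, pageSize);
    }
}
```

If the average page really is that small, the byte rate is capped by per-connection latency rather than by link capacity, so only more concurrency or shorter connections would raise it.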
I've spent literally weeks calculating, analyzing and collecting various statistics (making sure all threads are operating with VM monitor, calculating mean run time for threads, excel charts...), before coming here, but I've run out of answers. I wonder if this behavior could be explained. I realize there's a lot of "ifs" in this question, but it's the best I can do without it turning into an essay.
My computer specs are an i5 4460 with 8GB DDR3-1600 and a 100Mb/s (effectively around 8MB/s) internet connection, connected directly via LAN to the crawler. I'm looking for general directions on where else I should look (I mean obvious stuff that is clear to experienced developers but not to me) in order to either:
Improve the download speed (maybe not Jsoup? a different number of threads? I've already tried using selectors instead of threads and it was slower), or:
Free up the internet when I'm running this program.
I've thought about the router itself (Netgear N600) limiting the number of outgoing connections (seems odd), so I'm saturating the number of connections, and not the bandwidth, but couldn't figure out if that's even possible.
Any general direction / advice would be warmly welcomed :) feel free to point out newbie mistakes, that's how I learn.
Amir.
The issue was not DNS resolution, as creating the connections with an IP address (I stored all addresses in advance and then used those) resulted in the exact same response times and bandwidth use. Nor was it the number of threads.
I now suspect it was the NetLimiter program's "fault". I've measured the number of bytes received directly and written these to disk (I had done this before, but apparently I had since made some changes in the program). It seems I really am saturating the bandwidth. Also, when switching to HttpURLConnection objects instead of Jsoup, NetLimiter does show a much larger bandwidth usage. Perhaps it has some issue with Jsoup.
I'm not sure this was the problem, but empirically, the program downloads a lot of data. So I hope this helps anyone who might encounter a similar issue in the future.
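For anyone who wants to measure received bytes directly instead of trusting an external tool, one possible approach is a counting wrapper around each connection's InputStream. This is a minimal sketch, not the code from my crawler; the class name is made up, and a byte array stands in for the real network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Wraps a stream and adds every byte read to a counter that all download threads can share. */
class CountingInputStream extends FilterInputStream {
    private final AtomicLong counter;

    CountingInputStream(InputStream in, AtomicLong counter) {
        super(in);
        this.counter = counter;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) counter.incrementAndGet();
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) counter.addAndGet(n);
        return n;
    }

    public static void main(String[] args) throws IOException {
        AtomicLong total = new AtomicLong();
        // A byte array stands in for connection.getInputStream() here.
        try (InputStream in = new CountingInputStream(new ByteArrayInputStream(new byte[5000]), total)) {
            byte[] buf = new byte[1024];
            while (in.read(buf) != -1) {
                // discard; a crawler would hand the bytes to its parser
            }
        }
        System.out.println("bytes received: " + total.get());
    }
}
```

Sampling the counter once per second gives a bytes-per-second figure. Note that it only counts response body bytes, so HTTP headers and TCP/IP overhead are not included.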

Java NIO server how to read variable-len packets w/out any header

I'm learning to make Minecraft servers similar to Bukkit for fun. I've dealt with NIO before but not very much and not in a practical way. I'm encountering an issue right now where Minecraft has many variable-length packets and since there's not any sort of consistent "header" for these packets of data, NIO is doing this weird thing where it fragments packets because the data isn't always sent immediately in full.
I learned recently that this is a thing from this thread: Java: reading variable-size packets using NIO I'd rather not use Netty/MINA/etc. because I'd like to learn this all myself as I'm doing this for the education and not with the intention of making it some huge project.
So my question is, how exactly do I go about preventing this sort of fragmenting of packets? I tried disabling Nagle's algorithm with java.net.Socket#setTcpNoDelay(boolean on), but oddly enough, all this does is fragment every single packet that is sent, whereas when I don't have it enabled, the first packet always comes through OK and only the following packets become fragmented.
I followed the Rox Java NIO Tutorial pretty closely so I know this code should work, but that tutorial only went as far as echoing a string message back to peers, not complicated bytestreams.
Here's my code. For some context, I'm using Executor#execute(Runnable) to create the two threads. Since I'm still learning about threads and concurrency and trying to piece them together with networking, any feedback on that would be very appreciated as well!!
ServerSocketManager
ServerDataManager
Thanks a lot, I know this is quite a bit of stuff to take in, so I can't thank you enough for taking the time to read & respond!!
TCP is ALWAYS a stream of bytes. You don't get to control when you get them or how many you get. It can come in at any time with any amount. That's why protocols exist.
Headers are a common part of a protocol to tell you how much data you need to read before you have the whole message.
So the short answer here is: You can't.
Everything you're saying you don't want to do -- that's what you have to do.
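A minimal sketch of what "read the header, then wait for the whole message" can look like with NIO buffers. It assumes a protocol where every message starts with a 4-byte big-endian length; that prefix is an illustrative choice, not Minecraft's actual wire format, and the class name is made up.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Accumulates bytes from successive reads and emits only complete length-prefixed messages. */
class FrameDecoder {
    private static final int HEADER_BYTES = 4; // assumed: 4-byte big-endian length prefix
    private ByteBuffer pending = ByteBuffer.allocate(1024); // kept in "write" mode between calls

    /** Feed whatever a channel read produced; returns every message completed by this chunk. */
    List<byte[]> feed(ByteBuffer chunk) {
        ensureCapacity(chunk.remaining());
        pending.put(chunk);
        pending.flip(); // switch to reading the accumulated bytes

        List<byte[]> messages = new ArrayList<>();
        while (pending.remaining() >= HEADER_BYTES) {
            pending.mark();
            int length = pending.getInt();
            if (length < 0) throw new IllegalStateException("negative frame length: " + length);
            if (pending.remaining() < length) {
                pending.reset(); // body not complete yet; wait for more bytes
                break;
            }
            byte[] body = new byte[length];
            pending.get(body);
            messages.add(body);
        }
        pending.compact(); // keep any partial frame and switch back to writing
        return messages;
    }

    private void ensureCapacity(int extra) {
        if (pending.remaining() >= extra) return;
        int needed = pending.position() + extra;
        ByteBuffer bigger = ByteBuffer.allocate(Math.max(pending.capacity() * 2, needed));
        pending.flip();
        bigger.put(pending);
        pending = bigger;
    }

    public static void main(String[] args) {
        FrameDecoder decoder = new FrameDecoder();
        byte[] wire = ByteBuffer.allocate(9).putInt(5)
                .put("hello".getBytes(StandardCharsets.US_ASCII)).array();
        // Simulate TCP delivering one message in two pieces.
        System.out.println(decoder.feed(ByteBuffer.wrap(wire, 0, 6)).size());
        System.out.println(decoder.feed(ByteBuffer.wrap(wire, 6, 3)).size());
    }
}
```

You would keep one decoder per connection and call feed() after every SocketChannel.read(). A real server should also cap the accepted length so a bad header cannot make the buffer grow without limit.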

How to improve the performance of a stock data transfer application?

This is a problem I have worked on for several years, but I still don't have a good solution.
My application has two parts:
The first one runs on a server called the "ROOT server". It receives the real-time stock data from HKEx (the securities and futures exchange in Hong Kong) and broadcasts it to 5 other children servers. It appends a timestamp to each data item when broadcasting.
The second part runs on the "children" servers. They receive the stock data from the ROOT server, parse each item, and extract the important information. Finally, they send it in a new text format to the clients. There may be hundreds to thousands of clients; they can register for certain stocks and get real-time information about them.
Performance is the most important thing. Over the past several years, I have tried every solution I know to make it faster. "Faster" here means that the first application receives the data and sends it to the children servers as fast as it can, and the children servers receive, parse and send the data to the clients as fast as they can.
For now, when the data rate from HKEx is 200K and there are 5 children servers, the first application has an average latency of 10ms per data item. The second one is not easy to test, since it depends on the number of clients.
What I'm using:
OpenSUSE 10
Sun Java 5.0
Mina 2.0
The server hardware:
4-core CPU (I don't know the type)
4G ram
I'm considering how to improve the performance.
Do I need to use a concurrency framework such as Akka?
Try another language, e.g. Scala? C++?
Use a real-time Java system?
Your advice...
Need your help!
Update:
The applications log some important information for analysis, but I haven't found any bottlenecks. HKEx will provide more data next year, and I don't think my application will be fast enough.
One of my customers has tested our application against another company's, and ours had no speed advantage. I just want to find a way to make it faster.
How is the first application running
The first application will receive the stock data from HKEx and broadcast them to several other servers. The steps are:
It connects to HKEx
logs in
reads the data. The data is in binary format: each item has a header, a 2-byte integer giving the length of the body, then the body, then the next item.
puts the items into a hashmap in memory. The key is the sequence number of the item; the value is the byte array.
logs the sequence number of each received item to disk, using log4j's buffered appender.
a daemon thread reads the data from the hashmap and inserts it into PostgreSQL every minute (this is just used to back up the data).
when clients connect to this server, it accepts them and tries to send them all the data from the in-memory hashmap. I use a thread pool in MINA; the acceptor and the senders run in different threads.
I think the logic is very simple. With 5 clients, I measured a transfer speed of only 1.5M/s at most. I wrote the simplest possible socket program in Java and found that it can reach 10M/s.
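For reference, the read step described above (2-byte length header, then body) could be sketched with plain blocking I/O as follows. This is not the production code; it assumes the length is unsigned and big-endian, which the description does not state, and the class name is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Reads items framed as a 2-byte length header followed by that many body bytes. */
class FeedReader {

    /** Reads items until the stream ends cleanly at an item boundary. */
    static List<byte[]> readItems(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        List<byte[]> items = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = in.readUnsignedShort(); // assumed: unsigned, big-endian
            } catch (EOFException endOfFeed) {
                return items; // no further header, so the feed is finished
            }
            byte[] body = new byte[length];
            in.readFully(body); // blocks until the whole body has arrived
            items.add(body);
        }
    }

    public static void main(String[] args) throws IOException {
        // Two made-up items: "abc" (length 3) and "z" (length 1).
        byte[] feed = {0, 3, 'a', 'b', 'c', 0, 1, 'z'};
        List<byte[]> items = readItems(new ByteArrayInputStream(feed));
        System.out.println("items read: " + items.size());
    }
}
```

Wrapping the socket stream in a BufferedInputStream before handing it to readItems avoids one system call per small read, which can matter at high message rates.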
Actually, I've spent more than a year trying all kinds of solutions on this application, just to make it faster. That's why I feel desperate. Do I need to try a language other than Java?
about 10ms latency
When the application receives a data item from HKEx, it records a timestamp for it. When the root server broadcasts the data to the children servers, it appends that timestamp to the data.
When a child server gets the data, it sends a message to the root server to get the current timestamp, then compares the two.
So, the 10ms latency contains:
root server got the data ---> the child server got the data
child server send a request for root server's timestamp ---> root server got it
But the second part is so small that we can ignore it.
The first thing to do to find performance bottlenecks is to find out where most of the time is spent. A way to determine this is to use a profiler.
There are open source profilers available such as http://www.eclipse.org/tptp/, or commercial profilers such as Yourkit Java Profiler.
One easy thing to do could be to upgrade the JVM to Java SE 6 or Java 7. General JVM performance improved a lot in version 6. See the Java SE 6 Performance White Paper for more details.
If you have checked everything, and found no obvious performance optimizations, you may need to change the architecture to get better performance. This would obviously be most fruitful if you could at least identify where your application is spending time - sounds like there are several major components:
The HK Ex server (out of your control)
The network between the Exchange and your system
The "root" server
The network between the "root" and the "child" servers
The "child" servers
The network between "child" servers and the client
The clients
To know where to spend your time, money and energy, I'd at least want to see an analysis of those components, how long each component takes (min, max, avg), and what the specification is of each resource.
Easiest thing to change is hardware - bigger servers, more memory etc., or better bandwidth. Can you see if any of those resources are constrained?
The next thing to look at is making the communication protocol more efficient: how do clients receive the stocks? Can you reduce the data size? 1.5M for only 5 clients sounds like a lot...
Next, you might look at some kind of quality of service solution - provide dedicated hardware for "premium" customers, with reduced resource contention, more servers, more bandwidth - this will probably require changes to the architecture.
Next, you could consider changing the architecture. Right now, your clients "pull" data from the child servers. You could instead "push" data out; that way, you shave off the polling interval on the client end.
At the very end of the list, I'd consider a different technology stack; Java is a fine programming language, but if absolute performance is a key priority, C/C++ is still faster. Clearly, that's a huge change, and a well-written Java app will be faster than a poorly written C/C++ app (and far more stable).
To trace the source of the delay I would add timing data to your end to end process. You can do this using an external log, or by adding meta data to your messages.
What you want is a timestamp at key stages in your application; 3-5 stages are enough to start with. Normally I would use System.nanoTime() because I am looking for microsecond delays, but in your case System.currentTimeMillis() is likely to be enough, especially if you average over many samples (you will still get 0.1 ms accuracy on an average, with Ubuntu).
Compare timestamps for the same message as it passes through your system and look for the highest average delay. Once you have found it, try breaking that interval into more stages to zoom in on the problem.
I would analyse any stage which has an average delay of over 1 ms in your situation.
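As a rough illustration of this kind of bookkeeping, here is a small helper that averages the delay per stage and reports the slowest one. The class, the stage names and the sample values are all made up for the example. Timestamps from System.nanoTime() are only comparable within one JVM, so delays that span two machines would need wall-clock timestamps and synchronized clocks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Accumulates per-stage delays from timestamps taken as a message moves through the system. */
class StageTimer {
    private final Map<String, long[]> totals = new LinkedHashMap<>(); // stage -> {sumNanos, count}

    /** Records the delay between two timestamps for one message in the given stage. */
    void record(String stage, long startNanos, long endNanos) {
        long[] t = totals.get(stage);
        if (t == null) {
            t = new long[2];
            totals.put(stage, t);
        }
        t[0] += endNanos - startNanos;
        t[1]++;
    }

    /** Average delay of a stage in milliseconds, or 0 if nothing was recorded for it. */
    double averageMillis(String stage) {
        long[] t = totals.get(stage);
        return (t == null || t[1] == 0) ? 0.0 : t[0] / (double) t[1] / 1_000_000.0;
    }

    /** Stage with the highest average delay, or null if nothing was recorded. */
    String slowestStage() {
        String slowest = null;
        for (String stage : totals.keySet()) {
            if (slowest == null || averageMillis(stage) > averageMillis(slowest)) {
                slowest = stage;
            }
        }
        return slowest;
    }

    public static void main(String[] args) {
        StageTimer timer = new StageTimer();
        // Made-up sample delays in nanoseconds; real code would pass System.nanoTime() readings.
        timer.record("receive", 0L, 200_000L);
        timer.record("store", 0L, 2_000_000L);
        timer.record("store", 0L, 4_000_000L);
        timer.record("broadcast", 0L, 900_000L);
        System.out.println("slowest stage: " + timer.slowestStage());
    }
}
```

Once the slowest stage is known, the same helper can be reused with finer-grained stage names inside it.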
If clients are updating every minute, there might not be a good technical reason to do this, but you don't want to be seen as being slow and have your traders at a disadvantage, even if in reality it won't make a difference.

Advice for writing Client-Server based game

I'm thinking about writing a game which is based around a server, and several client programs connect to it. The game (very) basically consists of a list of items which a user can 'accept', which would remove it from the list on all connected computers (this needs to update very quickly).
I'm thinking about using a Java applet for the client since I would like this to be portable and run from a browser (mostly in Windows), as well as updating fast, and either a C++ or Java server running on Linux (currently just a home server, but possibly to go on a VPS).
A previous 'incarnation' of this game ran in a browser, and used PHP+mySQL for the backend, but this swamped the server quite a bit when several people connected (that was with about 8 people, this would eventually need to handle a lot more).
The users would probably all be in the same physical location (with the same public IP address), and the system would get several requests per second, all of which would require sending the list back to the clients.
Some computers may have firewall restrictions on them, so would you recommend using HTTP traffic, a custom port, or perhaps through SSH or some existing protocol?
Could anyone suggest some tips (threading, multiple requests of one item?), tools, databases (mySQL?), or APIs which would help me get started on this project? I would prefer C++ for the backend as it would be faster, but using Java would allow me to reuse code.
Thanks!
I wouldn't use C++ because of speed alone. It is highly unlikely that the difference in performance will make a real difference to your game. (Your network is likely to cloud any performance difference, unless you have 10 GigE between the client and server.) I would pick whichever of C++ or Java will get you a working game first.
For anyone looking for a good networking API for C++, I always suggest Boost.Asio. It has the advantage of being platform independent, so you can compile a server for Linux, Windows etc. However, if you are not too familiar with C++ templates/Boost, the code can be a little overwhelming. Have a look, give it a try.
In terms of general advice: given the description above, you seem to need a relatively simple server. I would suggest keeping it very basic, with a single-threaded polling loop. Read a message from your connected clients (wait on multiple sockets), and respond appropriately. This eliminates any issue around concurrent access to your list and other synchronization problems.
I might also suggest that, before you rewrite your initial incarnation, you try improving it. As you have stated:
and the system would get several requests per second, all of which would require sending the list back to the clients.
Given that each request removes an item from this list, why not just inform your users which item was removed, rather than sending the entire list over the network time and time again? If this list is of any significant size, this minor change will result in a large improvement.
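A minimal sketch of that idea in Java, assuming a made-up text message of the form "REMOVE <id>"; the class name, item names and message format are all hypothetical. The server removes the item once and broadcasts only the small delta, and each client applies it to its local copy of the list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Item list that is kept in sync by sending small "REMOVE <id>" messages instead of the whole list. */
class ItemList {
    private final Map<Integer, String> items = new LinkedHashMap<>();

    void add(int id, String name) {
        items.put(id, name);
    }

    /** Server side: returns the delta to broadcast, or null if the item was already accepted. */
    String accept(int id) {
        return items.remove(id) != null ? "REMOVE " + id : null;
    }

    /** Client side: applies a delta message received from the server. */
    void apply(String message) {
        if (message.startsWith("REMOVE ")) {
            items.remove(Integer.parseInt(message.substring("REMOVE ".length())));
        }
    }

    List<String> names() {
        return new ArrayList<>(items.values());
    }

    public static void main(String[] args) {
        ItemList server = new ItemList();
        ItemList client = new ItemList();
        for (ItemList list : new ItemList[] {server, client}) {
            list.add(1, "sword");
            list.add(2, "shield");
        }
        String delta = server.accept(1); // one user accepts item 1
        client.apply(delta);             // every client receives only this short message
        System.out.println(client.names());
    }
}
```

Because accept() returns null when the item is already gone, two users requesting the same item at almost the same time are resolved on the server, which fits the single-threaded loop suggested above.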
