How to measure one-way latency? - java

I want to measure the time a packet takes to travel from my client to my server. Currently I can only measure the full trip (from client to server and back to client): I record the time just before sending a packet and again after receiving it back from the server. Technically speaking, if I divide the full trip time by two I get an average of each one-way latency.
But what if one direction actually takes longer than the other, like this:
In the image I created, the latency from client to server is 30 ms and from server to client is 90 ms. If packets had such arrival times, then measuring the full round trip and dividing it by 2 would not give an accurate one-way time. How can I accurately measure one-way arrival times?
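For illustration, here is a minimal sketch of this round-trip measurement, assuming a UDP echo service is available (the host name and the echo port are placeholders, not part of the original setup):
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class RttProbe {
    public static void main(String[] args) throws Exception {
        byte[] payload = "ping".getBytes();
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(2000);                       // don't block forever waiting for the echo
            InetAddress server = InetAddress.getByName("example.com"); // placeholder echo server
            long t0 = System.nanoTime();                     // just before sending
            socket.send(new DatagramPacket(payload, payload.length, server, 7)); // port 7 = assumed echo service
            DatagramPacket reply = new DatagramPacket(new byte[payload.length], payload.length);
            socket.receive(reply);                           // just after receiving the echo
            long rttNanos = System.nanoTime() - t0;
            System.out.printf("RTT: %.3f ms, RTT/2: %.3f ms%n",
                    rttNanos / 1e6, rttNanos / 2e6);         // RTT/2 is only an average, as described above
        }
    }
}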

How can I accurately measure one-way arrival times?
TL;DR - YOU CAN'T
This is actually a very deep philosophical question that is unanswered at the very core of physics (the rock bottom "metal" of the universe). Physics does not unequivocally know the one-way speed of light, only the two-way speed. No experiment we have so far devised can answer that question. See The One-Way Speed of Light.
Although the average speed over a two-way path can be measured, the one-way speed in one direction or the other is undefined (and not simply unknown), unless one can define what is "the same time" in two different locations. To measure the time that the light has taken to travel from one place to another it is necessary to know the start and finish times as measured on the same time scale. This requires either two synchronized clocks, one at the start and one at the finish, or some means of sending a signal instantaneously from the start to the finish. No instantaneous means of transmitting information is known. Thus the measured value of the average one-way speed is dependent on the method used to synchronize the start and finish clocks. This is a matter of convention.
You can get arbitrarily close for non-relativistic situations by synchronizing clocks, but how do you know the clocks stay synchronized? For your case you'd have to agree to synchronize on the same time signal, but propagation delays can introduce tens to hundreds of milliseconds of delay and jitter.
So if you want to pin down one-way times to an accuracy less than clock jitter you're out of luck. Here's the output from ntpq -p on one of my Linux systems.
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*unifi.versadns. 71.66.197.233    2 u  289 1024  377    2.298   -0.897   0.747
+eterna.binary.n 68.97.68.79      2 u  615 1024  377   42.258   -3.640   0.430
+homemail.org    139.78.97.128    2 u  160 1024  377   45.257   -0.209   0.391
-time.skylineser 130.207.244.240  2 u  418 1024  103   24.829    2.066   1.376
You might be able to pin down one-way time to within 5ms if both systems use the same master clock for synchronization and have had enough time for delay and jitter to stabilize.
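As a rough illustration of that clock-synchronization approach, here is a minimal sketch, assuming both machines are NTP-synchronized and that the sender embeds its wall-clock send time in the packet (the payload layout is made up for this example):
import java.nio.ByteBuffer;

public class OneWaySketch {
    // Client side: embed the wall-clock send time in the packet payload.
    static byte[] buildPayload() {
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(System.currentTimeMillis())   // send timestamp from the NTP-synced wall clock
                .array();
    }

    // Server side: subtract the embedded send time from the local receive time.
    static long oneWayMillis(byte[] payload) {
        long receivedAt = System.currentTimeMillis(); // the server's own (synchronized) clock
        long sentAt = ByteBuffer.wrap(payload).getLong();
        return receivedAt - sentAt;                   // only as good as the clock offset + jitter
    }

    public static void main(String[] args) {
        // In reality the payload travels over the network between two machines.
        byte[] payload = buildPayload();
        System.out.println("one-way estimate: " + oneWayMillis(payload) + " ms");
    }
}
The result is only meaningful to within the combined clock offset and jitter of the two machines, which is exactly the limitation described above.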

OWAMP - owping
You need to have the client and the server running,
and use NTP or chrony to synchronize time.
There is also expensive hardware for syncing time between machines.

Related

How to run 200k requests in an hour from JMeter

I want to hit my application with 200,000 requests in an hour. I am using JMeter to perform this testing.
I am doing this test on a local spring boot application.
The result I am looking for is to configure JMeter thread group for a throughput of 55 hits per second.
How should I configure the thread group to get 200,000 in an hour?
So far I've tried -
First Approach
i) Reduced my requirement to 10,000 requests in 180 seconds, which is a smaller proportion of my original target.
ii) Number of threads - 10,000
iii) Ramp-up period - 180 seconds
iv) This works fine and I get the desired throughput of 55/sec, but it breaks down as I scale up to 50,000 threads in 900 seconds (15 minutes): I still get the desired throughput, but JMeter and my system become extremely slow and I also see a 6% error rate in the responses. I believe this is not the correct approach when I want to reach 200k and more.
Second Approach
i) I found this solution to put in a Constant throughput timer, which I did.
ii) Number of threads - 10
iii) Ramp-up period - 900 seconds
iv) Infinite loop
v) Target throughput in minutes - 3300
vi) Calculate throughput based on - all active threads.
Although I had configured the ramp-up to be 15 minutes, it seems to run for more than 40 minutes before it reaches a throughput of 55/sec, due to the constant throughput timer. Found no errors in the responses with this approach.
The easiest option is to go for the Precise Throughput Timer.
However you need to understand 2 things:
Timers can only pause sampler execution in order to limit JMeter's throughput to the desired value, so you need to supply a sufficient number of threads in the Thread Group. For instance, 200k requests per hour is more or less 55 requests per second; if your application's response time is 1 second you will need 55 threads, if it's 2 seconds you will need 110 threads, and so on (a worked version of this arithmetic follows this list). The most convenient option is using the Throughput Shaping Timer and Concurrency Thread Group combination; they can be connected together via the Feedback Function so JMeter is able to kick off more threads if the current amount is not sufficient to conduct the required load.
The system under test must be capable of processing that many requests per hour; if it can't, you won't be able to achieve this whatever you do on the JMeter side.
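To make the thread-count arithmetic from the first point concrete, here is a tiny sketch; it is just Little's Law (required threads ≈ target requests per second × average response time), using the numbers from this question:
public class ThreadCountEstimate {
    public static void main(String[] args) {
        double requestsPerHour = 200_000;
        double requestsPerSecond = requestsPerHour / 3600.0;   // ~55.6 req/s

        // Little's Law: concurrency = arrival rate * average response time
        double[] responseTimesSeconds = {1.0, 2.0};
        for (double rt : responseTimesSeconds) {
            long threads = (long) Math.ceil(requestsPerSecond * rt);
            System.out.printf("response time %.1fs -> ~%d threads needed%n", rt, threads);
        }
    }
}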

Measuring time between hops in network (JAVA)

I'm trying to gather data on the time an HTTP request takes to travel from one node to another in a network. Here's a simple network topology that I'm working with:
I'm using Raspberry Pi 4 model B
PC ---- RaspPi(1) ---- RaspPi(2) ---- RaspPi(n) ---- ...
Each of these nodes has its own application that can handle HTTP. The idea for gathering the data is:
Suppose I have an HTTP request that has RaspPi(n) as its destination. As the request traverses each node, I log the TIMESTAMP when it reaches that node; from these I can calculate DeltaT, which is the time it takes for my request to travel between two consecutive nodes.
I have tried to use:
Date now = Calendar.getInstance().getTime();
TimeStamp ts = new TimeStamp(now.getTime());
And
System.currentTimeMillis()
to get the TIMESTAMP. The problem is that the data I gathered has negative DeltaT values, for example the TIMESTAMP at RaspPi(2) is earlier than the TIMESTAMP at RaspPi(1). I've done some searching and found that the two methods I used above are not monotonic (Source 1 and Source 2).
The other method I'm considering is System.nanoTime(), but that doesn't work across different JVMs, which is exactly my situation since every node runs its own JVM.
I don't know if there is a better approach to gather these data, or some work around that I can do to fix those methods I used.
Please let me know if I haven't made myself clear. Thanks for reading.
For getting the wall-clock time use System.currentTimeMillis().
System.nanoTime has an arbitrary origin and should only be used for measuring the duration of an operation on the same node/machine. You will get meaningless results if you try to use it to measure the time difference between two different computers.
In some cases System.currentTimeMillis may go backwards, but you're more likely to observe issues caused by the system clocks on those machines not being perfectly synchronized. This is challenging in general, and it also depends on what time intervals you expect to measure (microseconds? milliseconds?).
Check out the NTP protocol and how to keep clocks synchronized, and take the results you get with a grain of salt.
Here's some good info about this topic: https://codeburst.io/why-shouldnt-you-trust-system-clocks-72a82a41df93 - an interesting piece:
According to google, there is a 6ms drift in a clock which is synchronised every 30s
An interesting paper relevant to the topic: Clock Synchronization for One-Way Delay Measurement: A Survey
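Assuming the nodes' clocks are kept in sync with NTP or chrony (and accepting the residual offset), a per-hop arrival log could look something like this minimal sketch; the node names, request id, and log format are made up for the example:
import java.time.Instant;

public class HopTimestampLogger {
    // Hypothetical helper called by each node's application when the request arrives.
    static long logArrival(String nodeName, String requestId) {
        long arrivalMillis = System.currentTimeMillis();   // wall clock, requires NTP/chrony sync
        System.out.println(Instant.ofEpochMilli(arrivalMillis)
                + " node=" + nodeName + " request=" + requestId);
        return arrivalMillis;
    }

    public static void main(String[] args) {
        // DeltaT between two consecutive nodes is simply the difference of their logged arrival times.
        long atPi1 = logArrival("RaspPi(1)", "req-42");
        long atPi2 = logArrival("RaspPi(2)", "req-42");
        System.out.println("DeltaT = " + (atPi2 - atPi1) + " ms (can be negative if the clocks drift apart)");
    }
}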

Reasoning about threads and time outs: Could providing a timeout improve performance for a multithreaded application?

I have a Java application that uses 60 threads in a thread pool, where each thread makes a SOAP request to export data to a server. Our goal is to export as much data as possible, as quickly as possible.
We've recently noticed that the promised time out on the server-side is not happening. We were told that there was a 90 second time out which we were asked to tolerate.
Instead, we are seeing that sometimes, an error response doesn't come back for over 200 seconds. This is a latent issue that has been happening all along.
I am wondering whether this is an opportunity to improve our export rate, or whether changing the timeout on our side to 90 seconds would not change a thing.
We have found that roughly 60 threads provides the optimal output. If we increase it to 70 threads or 120 threads for example, we see no increase in export rate.
If we reduce thread count below 60 threads, we see a decrease in our export rate.
Assuming that the average response time for an error is 120 seconds (a response that slow always indicates an error) and a non-error response takes less than 30 seconds, would we necessarily gain any performance benefit by setting the timeout to 90 seconds? Or does the fact that increasing the thread count didn't help indicate that we are already at the maximum export rate?
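For what it's worth, the 90-second timeout itself can be set on the client side without touching the thread pool. A minimal sketch using java.net.http (Java 11+), where the endpoint URL and the SOAP payload are placeholders:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExportClientSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(60);   // the 60 worker threads from the question
        HttpClient client = HttpClient.newBuilder()
                .executor(pool)
                .connectTimeout(Duration.ofSeconds(10))
                .build();

        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/export")) // placeholder endpoint
                .timeout(Duration.ofSeconds(90))                   // give up after the promised 90 s instead of 200+ s
                .POST(HttpRequest.BodyPublishers.ofString("<soap:Envelope/>"))                 // placeholder payload
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("status=" + response.statusCode());
        pool.shutdown();
    }
}
Whether this actually improves the export rate is a separate question: freeing a thread after 90 seconds only helps if that thread would otherwise sit blocked for 200+ seconds on a call that is doomed to fail anyway.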

Accurate spending time solution

Consider the following code snippets:
class Time implements Runnable {
    long t = 0L;
    public void run() {
        try {
            while (true) {
                Thread.sleep(1000);
                t++; // show the time
            }
        } catch (InterruptedException ignored) {}
    }
}
// and roughly the same idea in C with pthreads:
long long t = 0L;
void* time_thread(void* a) {        // pthread thread start routine
    while (1) { sleep(1); t++; }    // show the time; sleep() comes from <unistd.h>
    return NULL;
}
I read in some tutorial that in Java Thread.sleep(1000) is not exactly 1 second, and that it might be more if the system is busy at the time, because the OS then switches back to the thread late.
Questions:
Is this actually true?
Is it the same for native (C/C++) code?
What is the accurate way to count seconds in an application?
Others have answered about the accuracy of timing. Unfortunately, there is no GUARANTEED way to sleep for X amount of time, and wake up at exactly X.00000 seconds (or milliseconds, nanoseconds, etc).
For displaying time in seconds, you can just lower the time you are waiting to, say, half a second. Then you won't have the time jump two seconds from time to time, because half a second isn't going to be extended to more than a second (unless the OS & system you are running on is absolutely overloaded and nothing gets to run when it should - in which case you should fix that problem [get a faster processor, more memory, or whatever it takes], not fiddle with the timing of your application). This works well for "relatively long periods of time", such as one second or 1/10th of a second. For higher precision, it won't really work, since we're now entering the "scheduling jitter" zone.
If you want very accurate timing, then you will probably need to use a real-time OS, or at least an OS with "real-time extensions enabled", which allows the OS to be stricter about time (at the cost of "ease of use" for the programmer, and possibly also the OS being less efficient in its handling of processes, because it "switches more often than it needs to" compared to a lazier timing approach).
Note also that the "may take longer", on an idle system, is mainly the rounding up of the timer (if the system tick happens every 10ms or 1ms, the timer is set to 1000ms plus whatever is left of the current timer tick, so it may be 1009.999ms, or 1000.75ms, for example). The other overhead, which comes from scheduling and general OS housekeeping, should be in the microseconds range if not nanoseconds on any modern system - after all, an OS can do quite a lot of work in a microsecond: a modern x86 CPU can execute around 3 instructions per clock cycle, and a clock cycle is around 0.3 ns. That's roughly 10 instructions per nanosecond (of course, cache misses and such will worsen this dramatically). If the OS needs more than a few thousand instructions to go from one process to another (fewer still for threads), then something is quite wrong. A few thousand instructions at 10 instructions per nanosecond = some hundreds of nanoseconds. Definitely less than a microsecond. Compare that to the 1ms or 10ms "jitter" of starting the timer just after the timer ticked off last time.
Naturally, if the CPU is busy running other tasks, this is different - then the time "left to run" on other processes will also influence the time taken to wake up a process.
Of course, in a heavily loaded memory system, the "just woken up" process may not be "ready to run", it could be swapped out to disk, for example. In which case, tens if not hundreds of milliseconds are needed to load it back from the disk.
To answer the first two questions: yes, it's true, and yes.
First there is the time between when the timeout expires and when the OS notices it, then there's the time it takes the OS to reschedule your process, and lastly there's the time from when the process has been "woken up" until it is its turn to run. How long will all this take? There's no way of saying.
And as it's all done on the OS level, it doesn't really matter what language you program in.
As for a more accurate way? There is none. You can use more high-precision timers, but there is no way of avoiding the lag described above.
Yes, it's true that it is not accurate.
It's the same for simple sleep functions in C/C++ and pretty much everything else.
Depending on your system, there could be better functions available,
but:
What is the accurate way
A really accurate way does not exist,
unless you have some really expensive special computer with an atomic clock included
(and no ordinary OS either; and even then, we could argue about what "accurate" means).
If busy waiting (high CPU load) is acceptable, look at nanoTime, native usleep, QueryPerformanceCounter, or whatever is applicable on your system.
The sleep call tells the system to stop the thread's execution for at least the time period specified as the argument. The system will then resume the thread when it has a chance (it actually depends on many factors, such as hardware, thread priorities, etc.). To measure the time more or less accurately, store the time at the beginning of execution and then calculate the time delta whenever it's needed.
The sleep function is not accurate, but if the intent is to display the total number of seconds elapsed, then you should store the current time at the beginning and display the time difference every now and then.
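A minimal sketch of that approach (counting seconds from a stored start time instead of counting sleep ticks; the half-second wake-up interval is just an example):
public class ElapsedSeconds {
    public static void main(String[] args) throws InterruptedException {
        long startNanos = System.nanoTime();   // monotonic start point
        long lastShown = -1;
        while (true) {
            Thread.sleep(500);                 // wake twice a second; late wake-ups don't accumulate error
            long elapsed = (System.nanoTime() - startNanos) / 1_000_000_000L;
            if (elapsed != lastShown) {        // only print when the displayed second changes
                System.out.println("elapsed: " + elapsed + " s");
                lastShown = elapsed;
            }
        }
    }
}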
This is true. Every sleep implementation in any language (C too) will fail to wait exactly 1 second. It has to deal with your OS scheduler; the sleep duration is just a hint - the minimum sleep duration, to be precise - and the actual difference depends on a huge number of factors.
Trying to figure out the deviation is tricky if you want a very high-resolution clock. In most cases it will be about 1-5 ms (roughly).
The thing is that the order of magnitude of the deviation stays the same whatever the sleep duration. If you want something "accurate", you can measure over a longer period and divide. For example, when you benchmark, you will prefer this kind of implementation, because the measured delta increases while the uncertainty stays the same:
long t0 = System.nanoTime();                    // get t0
for (int i = 0; i < n; i++) {
    doWork();                                   // the operation being benchmarked (placeholder)
}
long t1 = System.nanoTime();                    // get t1
double averageNanos = (double) (t1 - t0) / n;   // average time per iteration

Messaging latency in java (with zeromq)

I just ran the ZeroMQ hello world example and timed the request-response latency. It averaged about 0.1 ms using the IPC transport. This sounds quite slow to me... does this sound about right?
long start = System.nanoTime();
socket.send(request, 0);
// Get the reply.
byte[] reply = socket.recv(0);
System.out.println((System.nanoTime() - start) / 1000000.0);   // round-trip time in ms
I assume your average was taken over more than one sample? I would run the test for at least 2-10 seconds before taking an average. The average latency within the same process/thread may be misleading.
I would create a second process which echoes everything it gets, if you are not doing this already. (And divide the latency by two unless you want the RTT latency.)
Plain sockets can get an RTT latency of 20 microseconds on a typical multi-core box, and I would expect IPC to be faster. On a fast PC you can get a typical RTT latency of 9 microseconds using sockets.
If you want latency much lower than this, I would consider doing everything in one process or one thread if you can, in which case the cost of a method call is around 10 ns (if it's not inlined ;)
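To get a more trustworthy number, something along these lines could be used, assuming the same org.zeromq (jzmq/JeroMQ) binding and an already-connected REQ socket as in the question's snippet; the warm-up and iteration counts are arbitrary, and this measures RTT, which you would then halve per the note above:
import org.zeromq.ZMQ;

public class LatencyBench {
    // Average round-trip time over many request/reply exchanges on an already-connected REQ socket.
    static double averageRttMillis(ZMQ.Socket socket, byte[] request, int iterations) {
        for (int i = 0; i < 1_000; i++) {        // warm-up: let the JIT and the transport settle
            socket.send(request, 0);
            socket.recv(0);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            socket.send(request, 0);
            socket.recv(0);
        }
        return (System.nanoTime() - start) / 1_000_000.0 / iterations;   // ms per round trip
    }
}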
