I'm trying to gather data on the time an HTTP request takes to travel from one node to another in a network. Each node is a Raspberry Pi 4 Model B, and here's the simple topology I'm working with:
PC ---- RaspPi(1) ---- RaspPi(2) ---- RaspPi(n) ---- ...
Each of these nodes runs its own application that can handle HTTP. The idea for gathering the data is:
Suppose an HTTP request has RaspPi(n) as its destination. As the request traverses each node, I log the TIMESTAMP at which it reaches that node, and from those I can calculate DeltaT, the time the request takes to travel between two consecutive nodes.
I have tried to use:
Date now = Calendar.getInstance().getTime();
Timestamp ts = new Timestamp(now.getTime()); // java.sql.Timestamp
And
System.currentTimeMillis()
to get the TIMESTAMP. The problem is that the data I gathered contain negative DeltaT values, i.e. the TIMESTAMP at RaspPi(2) is earlier than the TIMESTAMP at RaspPi(1). I've done some searching around and found that the two methods above are not monotonic (Source 1 and Source 2).
The other method I've considered is System.nanoTime(), but that doesn't seem to work across different JVMs, which is exactly what my network nodes are.
I don't know whether there is a better approach to gathering these data, or some workaround to fix the methods I've used.
Please let me know if I haven't made myself clear. Thanks for reading.
For getting the wall-clock time, use System.currentTimeMillis().
System.nanoTime()'s origin is pretty much arbitrary, so it should only be used for measuring the duration of an operation on the same node/machine. You will get meaningless results if you try to use it to measure the time difference between two different computers.
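To illustrate the single-machine case the previous paragraph refers to (doWork() is just a placeholder for whatever operation you're timing):
long start = System.nanoTime();
doWork();                                   // placeholder for the operation being measured
long elapsedNanos = System.nanoTime() - start;
System.out.printf("took %.3f ms%n", elapsedNanos / 1e6);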
In some cases System.currentTimeMillis() may even go backward, but I suspect you're more likely to be seeing issues caused by the system clocks on those machines not being perfectly synchronized. Keeping clocks in sync is challenging in general, and how much it matters depends on what time intervals you expect to measure (microseconds? milliseconds?).
Check out the NTP protocol and how to keep clocks synchronized, and take the results you get with a grain of salt.
Here's some good info about this topic: https://codeburst.io/why-shouldnt-you-trust-system-clocks-72a82a41df93 - an interesting piece:
According to google, there is a 6ms drift in a clock which is synchronised every 30s
An interesting paper relevant to the topic: Clock Synchronization for One-Way Delay Measurement: A Survey
Related
I work at a retailer and we are considering introducing CQ5 as our CMS.
However, after doing some research and talking to consultants, it turns out that some things may be "complicated". Perhaps one of you can shed a little light on this.
The first thing is, we were told that when you use the Multi Site Manager to create multi-language pages (about 80 languages), the update process can take as long as half an hour until a change is ultimately published. Has anyone of you experienced something similar?
The other thing is that the TarOptimizer has pretty long running times. I was told that runs taking up to 24 hours are not uncommon. Again my question: has anyone of you had such a problem, or an explanation for this?
I am really looking forward to your response.
These are really two separate questions, but I'll address both based on my experience.
The update process for creating new multi-language pages will vary based on the number of languages, and also on the number of publish instances and web servers (assuming you're using the dispatcher to cache) you are running. This is because the replication process is where the bottleneck is (at least in my experience): if you're trying to push out a large amount of content across a large number of publishers, with a large number of front-end web servers whose cache needs to be cleared, there will be some delay, since replication is an asynchronous process. The longest delay I've seen for this has been in the 10-15 minute range, with 12 publishers and 12 front-end web servers, but this comes with the obvious caveat that your mileage may vary.
For the Tar optimization job, I'd encourage you to take a look at this page, as it has a lot of good info about the Tar Optimizer job and how to tune it. The job can take a long time to run when you have a large repository, especially on an instance with a large number of write operations, but the run times can be configured so that it only runs during a given window, and it will pick up where it left off the night before if the total run time is longer than the allowed window. By default it runs from 2-5 am each night, so if it needs more than that three-hour period, it will continue where it left off the next night, allowing it to optimize the entire repository over a period of a few days if needed.
I've been reading a lot about password storing, hashing, salting, "peppering", MACs, etc. because I'm about to build a new website and security is really important to me. However, there are some reasons, not relevant right now, why I'm considering not using Google Authentication (or Facebook, OpenID or any other), and that brings me to this point.
I'm new to Google App Engine; this is going to be my first project on it, and I'm a little confused about "Instance Hours" and the fact that there is no longer a "CPU time" quota but the aforementioned one instead. Even worse, I haven't been able to figure out what the Instance Hours free quota is.
Here's why I'm worried about the quotas and what that has to do with my security concerns: one recommendation I've read everywhere is to hash the password through many iterations, because that makes an attacker spend much, much more time (I don't have numbers, but they are everywhere on https://security.stackexchange.com/).
Multiple iterations have a direct impact on CPU time, and if GAE still had a CPU time quota, I think doing 1000 iterations on every login could be a problem. However, what they count now is Instance Hours, from the moment the request is made up to fifteen minutes later, and as the GAE quota docs put it:
In general, instance usage is billed on an hourly basis based on the instance's uptime. Billing begins when the instance starts and ends fifteen minutes after the instance shuts down. You will be billed only for idle instances up to the number of maximum idle instances set in the Performance Settings tab of the Admin Console. Runtime overhead is counted against the instance memory.
If that's the case, does it mean that when my users log in (hash 1000 times) and then keep using the site, the Instance Hours keep accumulating until all of them leave the page plus 15 minutes? If so, iterating 1000 times wouldn't have a significant impact on my quota, other than the "extra" time it takes a user to log in, but I'm aware of that and it's a price I'm willing to pay.
The number of iterations I'll use will be whatever keeps the login time acceptable and imperceptible to the user, so don't worry about that.
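For what it's worth, here is a minimal sketch of what the iterated hashing could look like on the Java side, using the JDK's built-in PBKDF2 support rather than a hand-rolled loop; the iteration count, key length and password below are purely illustrative:
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class HashDemo {
    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);      // per-user random salt
        int iterations = 10_000;                 // illustrative; tune so login stays imperceptible
        PBEKeySpec spec = new PBEKeySpec("s3cret".toCharArray(), salt, iterations, 256);
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        long start = System.nanoTime();
        byte[] hash = factory.generateSecret(spec).getEncoded();
        System.out.printf("%d-iteration hash took %.1f ms (%d bytes)%n",
                iterations, (System.nanoTime() - start) / 1e6, hash.length);
    }
}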
My questions are:
Will making MANY iterations have a direct impact on the Instance Hours, or are my assumptions about how Instance Hours are summed correct?
Is there a CPU time quota on Google App Engine I'm missing somehow? Does it have a Free Quota?
What is the Instance Hours Free Quota?
Answers:
Look at Moishe's accepted answer and the other question he asked (which has not been answered but has useful comments): When does the App Engine scheduler use a new thread vs. a new instance?
According to Google there is no CPU time quota: http://googleappengine.blogspot.com.es/2009/02/skys-almost-limit-high-cpu-is-no-more.html
Found an answer to question number 3 here: Google App Engine Frontend Instance Hours Limit Reached
If it takes a long time to process a request, because e.g. you're doing something very computationally intensive, and you don't want other users to wait a long time, the App Engine scheduler may spin up another instance of your application to serve incoming requests.
Imagine that computing the hash for a password takes 1 minute and during that minute your application gets a request from another user. That user could wait for a minute to get a response to their request, or the App Engine scheduler could spin up another instance to service that request and get a response back much sooner. You can tune whether or not another instance will come up using the Performance sliders on your Application Settings page in the admin console.
Basically the question you need to ask about instance hours is: is it likely you'll get overlapping requests (i.e. a new request coming in before the current request is complete)? If this happens not infrequently, and you want snappy responses for your users, you'll need to budget more instance hours.
I suspect that the big computation you'll need to do will be infrequent -- only on initial sign-in to generate a cookie, say, rather than for every request.
To explicitly answer your question #1, making many iterations will only have an effect on your instance hours if it causes overlapping requests. If you only get one request every 30 seconds, you could spend 30 seconds serving each request (including calculating each hash, and doing other operations) and not exceed your free instance-hours quota. Conversely if you get 10 requests per second and spend any more than 100ms serving the request, then you'll start to exceed your instance hours fairly quickly.
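A rough back-of-the-envelope version of that last point (the numbers are the illustrative ones from this answer, not measurements):
double requestsPerSecond = 10.0;
double secondsPerRequest = 0.1;   // 100 ms of work per request, hashing included
double busyInstances = requestsPerSecond * secondsPerRequest;
// ~1.0 means one instance is fully occupied; anything consistently above that
// forces the scheduler to spin up extra instances, which is what consumes instance hours.
System.out.println("Instances kept busy on average: " + busyInstances);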
Instance hours accrue for as long as the server is running, answering requests, etc. If your server isn't running, it can't wake up on a request or anything.
Imagine instance hours as having the computer on. You are billed when it's on, and not when it's off.
You could have multiple instances, so let's say you have two instances, you're burning twice as many instance hours.
Your password hashing won't affect this, because it will only incur instance hours while the instance is on; when it's off, it won't be incurring any instance hours, but it won't be hashing either.
There are multiple sources covering passwords. You evidently have read some that encourage multi-pass hashing. Consider the first link below before finalizing this decision. Excerpt from this page: "It's easy to get carried away and try to combine different hash functions, hoping that the result will be more secure. In practice, though, there is no benefit to doing it. All it does is create interoperability problems, and can sometimes even make the hashes less secure."
Two valuable links to consider (the first has the quote above, the second is a good "how to" source):
http://crackstation.net/hashing-security.htm
http://throwingfire.com/storing-passwords-securely/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+throwingfire+%28Throwing+Fire%29#notpasswordhashes
Is it correct to compare two values resulting from a call to System.nanoTime() on two different machines? I would say no, because System.nanoTime() returns a nanosecond-precision time relative to some arbitrary point in time, using the Time Stamp Counter (TSC), which is processor dependent.
If I am right, is there a way (in Java) to capture an instant on two different machines and to compare these values safely with at least microsecond precision, or even nanosecond precision?
System.currentTimeMillis() is not a solution, because it does not return a monotonically increasing sequence of timestamps. The user, or services such as NTP, can change the system clock at any time, and the time can leap backward and forward.
You might want to look into the various clock synchronization algorithms available. Apparently the Precision Time Protocol can get you within sub-microsecond accuracy on a LAN.
If you don't need a specific time value but rather would like to know the ordering of various events, you could for instance use Lamport timestamps.
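If ordering rather than an absolute time difference is all you need, a Lamport clock is only a few lines of Java. This is an illustrative sketch, not a library API:
import java.util.concurrent.atomic.AtomicLong;

// Minimal Lamport clock: produces a consistent ordering of events across nodes
// without requiring the machines' wall clocks to be synchronized.
class LamportClock {
    private final AtomicLong counter = new AtomicLong();

    // Call before a local event or before sending a message;
    // attach the returned value to the outgoing message.
    long tick() {
        return counter.incrementAndGet();
    }

    // Call when a message arrives, passing the timestamp it carried.
    long onReceive(long remoteTimestamp) {
        return counter.updateAndGet(local -> Math.max(local, remoteTimestamp) + 1);
    }
}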
You cannot use nanoTime between two different machines. From the Java API docs:
This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time. The value returned represents nanoseconds since some fixed but arbitrary time (perhaps in the future, so values may be negative).
There's no guarantee that nanoTime is relative to any timebase.
This is a processor- and OS-dependent question. Looking at POSIX clocks, for example, there are high-precision, time-of-day-aware timestamps (e.g. CLOCK_REALTIME returns a nanosecond epoch time value) and high-precision timestamps with an arbitrary origin (e.g. CLOCK_MONOTONIC) (NB: the difference between these two is nicely explained in this answer).
The latter is often something like the time since the box was booted, so there's no way to compare such values accurately across servers unless you have high-precision clock sync in the first place (e.g. PTP, as referenced in the other answer), since only then could you share an offset between them.
Whether NTP is good enough for you depends on what you're trying to measure. If you're trying to measure an interval of a few hundred microseconds (e.g. boxes connected to the same switch), your results will be rough. At the other extreme, NTP can be perfectly good if your servers are in entirely different geographical locations (e.g. London to NY), because the clock-sync error (as long as it's not way, way off) is swamped by the latency between the locations.
FWIW, the JNI required to access such clocks from Java is pretty trivial.
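As an illustration, the Java side of such a binding might look something like the sketch below; the library name and method names are made up for the example, and the matching C implementation would simply call clock_gettime with CLOCK_REALTIME or CLOCK_MONOTONIC:
public final class PosixClocks {
    static { System.loadLibrary("posixclocks"); } // hypothetical native library name

    // CLOCK_REALTIME: nanoseconds since the Unix epoch (wall clock, can jump)
    public static native long realtimeNanos();

    // CLOCK_MONOTONIC: nanoseconds since an arbitrary origin (never jumps backward)
    public static native long monotonicNanos();
}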
You can synchronize the time to currentTimeMillis(); however, even if you use NTP this can drift by 1 ms to 10 ms between machines. The only way to get microsecond-level synchronization between machines is to use specialist hardware.
nanoTime is not guaranteed to be determined the same way, or to have the same resolution, on two different OSes.
Is it possible to slow down time in the Java virtual machine according to CPU usage by modification of the source code of OpenJDK? I have a network simulation (Java to ns-3) which consumes real time, synchronised loosely to the wall clock. However, because I run so many clients in the simulation, the CPU usage hits 100% and hard guarantees aren't maintained about how long events in the simulator should take to process (i.e., a high amount of super-late events). Therefore, the simulation tops out at around 40 nodes when there's a lot of network traffic, and even then it's a bit iffy. The ideal solution would be to slow down time according to CPU, but I'm not sure how to do this successfully. A lesser solution is to just slow down time by some multiple (time lensing?).
If someone could give some guidance, the source code for the relevant file in question (for Windows) is at http://pastebin.com/RSQpCdbD. I've tried modifying some parts of the file, but my results haven't really been very successful.
Thanks in advance,
Chris
You might look at VirtualBox, which allows one to Accelerate or slow down the guest clock from the command line.
I'm not entirely sure if this is what you want, but with the Joda-Time library you can stop time completely, so that calls to new DateTime() and other Joda-Time time sources will continuously return the same time.
So, you could, in one Thread "stop time" with this call:
DateTimeUtils.setCurrentMillisFixed(System.currentTimeMillis());
Then your Thread could sleep for, say, 5000ms, and then call:
// advance the frozen Joda time by one second
DateTimeUtils.setCurrentMillisFixed(DateTimeUtils.currentTimeMillis() + 1000);
So, provided your application does whatever it does based on the time reported by Joda-Time, this will "slow" time down by setting it forward one second for every 5 real seconds.
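Putting those pieces together, here's a sketch of what such a "time lensing" thread could look like; it assumes your simulation reads the clock through Joda-Time, and the 1-second-per-5-seconds ratio is just an example:
import org.joda.time.DateTimeUtils;

// Joda-Time advances 1 second for every 5 real seconds, so code reading
// the clock through Joda-Time sees time running at one fifth speed.
Thread lens = new Thread(() -> {
    long frozen = System.currentTimeMillis();
    DateTimeUtils.setCurrentMillisFixed(frozen);
    try {
        while (!Thread.currentThread().isInterrupted()) {
            Thread.sleep(5000);            // 5 real seconds...
            frozen += 1000;                // ...advance simulated time by 1 second
            DateTimeUtils.setCurrentMillisFixed(frozen);
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    } finally {
        DateTimeUtils.setCurrentMillisSystem();  // restore the real clock on exit
    }
});
lens.setDaemon(true);
lens.start();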
But, as I said, I'm not sure this will work in your environment.
According to this discussion of Google App Engine on Hacker News,
A DB (read) request takes over 100ms on the datastore. That's insane and unusable for about 90% of applications.
How do you determine what is an acceptable response time for a DB read request?
I have been using App Engine without noticing any issues with DB responsiveness. But, on the other hand, I'm not sure I would even know what to look for in that regard :)
You can measure precisely how much time each RPC call (datastore or otherwise) is taking, thanks to Guido van Rossum's relatively new AppStats component (part of the standard SDK since 1.3.1); see here for more. 100 milliseconds is fine for most well-designed apps: if you need to make two or three queries to serve a page, you can still serve it in less than half a second even if there's lots of processing and rendering involved, which is not too shabby. Plus, you can use memcache to reduce many of those latencies, etc.
The poster is wrong. Datastore get operations are much faster - about 15-20ms each, currently. Datastore query operations can be slower, because they're much more involved and return more data, but they still complete in anywhere from 30-100ms for a typical query. Other posters have amply addressed whether that's "acceptable" or not.
What do you mean by acceptable? What kind of application are you writing? Acceptable means different things for different domains/applications/people. First, you should decide how quickly you want your app to respond to a request. Let's pick 1 second, just for argument's sake. Now, how many DB requests do you need to make to fulfill that request? Let's say 5. Let's also say that we have 400ms worth of other processing to do. OK, so that's 5 reads times 100ms each, plus 400ms of other stuff. 900ms total, which is less than our goal of 1 second. Perfect! 100ms is an acceptable read time. In fact, 120ms would still be acceptable, just barely.
Now, let's generalize:
numberOfReads * readTime + otherStuffTime = TotalTime
Fill in your numbers, and you can see what is an acceptable time for your particular situation.
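Plugging the example numbers from above into that formula (all values are the illustrative ones used in this answer):
int numberOfReads = 5;
double readTimeMs = 100;        // assumed per-read datastore latency
double otherStuffMs = 400;      // processing and rendering
double totalMs = numberOfReads * readTimeMs + otherStuffMs;
System.out.println("Total: " + totalMs + " ms");   // 900.0 ms, inside the 1-second budget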
If you haven't noticed any issues then it is by definition an acceptable response time. The only question is how long your users are happy to wait.
An "an acceptable response time for a DB read request" depends entirely on your application and your users.
If the net result is that your site runs fast enough to satisfy you and your users then the slow response time of the services provided by Google in their AppEngine are acceptable.
Now, looking deeper at this particular issue, it sounds like we are talking about GETs. Here are the figures for GET latency, and it looks to me like the average latency is closer to 50ms than 100ms. I'm not saying that's good, but I don't think it's accurate to say 100ms.