How to plot a graph if only event points are given?


I am trying to plot the availability of a node (machine). To save storage on the data collected, I record events (ADDED, REMOVED) instead of sampling at a fixed interval. ADDED means "up"; REMOVED means "down/unreachable".
Here's the sample data I have:
2012-11-25 11:11:11.1234 - node added.
2012-11-25 15:01:20.1234 - node removed.
2012-11-25 18:12:12.1234 - node added.
Let's say I want to plot a graph over the time range 2012-11-24 to 2012-11-25 (x-axis) against Up/Down (y-axis). How do I plot the graph?

I think there are some examples (I can't remember which ones) among the d3 tools (http://d3js.org).
If you look through the examples, you can choose the type of visualisation you want to use.
I think the data set you have is suited to what you are trying to do (you may need to write a small function to convert the up/down state to an integer).
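The conversion mentioned above could be sketched in Java roughly as follows. This is only an illustration: the names AvailabilitySeries, Event, Point and toSteps are made up for this example, records need Java 16 or later, and it assumes the node is down before its first recorded event, which the sample data does not confirm.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns ADDED/REMOVED events into step-function points.
class AvailabilitySeries {

    // One recorded event; up is true for ADDED and false for REMOVED.
    record Event(LocalDateTime time, boolean up) {}

    // One point of the plotted line; y is 1 (up) or 0 (down).
    record Point(LocalDateTime x, int y) {}

    // Events must be sorted by time. Assumption: the node is down before
    // the first event.
    static List<Point> toSteps(List<Event> events, LocalDateTime from, LocalDateTime to) {
        List<Point> points = new ArrayList<>();
        int state = 0;
        // Replay events before the window to learn the state at its start.
        for (Event e : events) {
            if (e.time().isBefore(from)) {
                state = e.up() ? 1 : 0;
            }
        }
        points.add(new Point(from, state));
        for (Event e : events) {
            if (e.time().isBefore(from) || e.time().isAfter(to)) {
                continue; // outside the plotted window
            }
            int next = e.up() ? 1 : 0;
            points.add(new Point(e.time(), state)); // hold the old level up to the event
            points.add(new Point(e.time(), next));  // then jump to the new level
            state = next;
        }
        points.add(new Point(to, state)); // extend the last level to the window end
        return points;
    }
}
```

The resulting points can be handed to any line-chart library; because each event contributes two points with the same x value, the line is drawn as vertical steps rather than slopes.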

If you have all your data stored in an array, simply use JavaScript's built-in Array.filter method, and use JavaScript's Date object to convert each timestamp into milliseconds (note that it only keeps millisecond precision, i.e. 3 decimal places).
var startTime = Date.parse("2012-11-24"),
    endTime = Date.parse("2012-11-25") + 86400000; /* Add one day */
var filteredData = data.filter(function(d) {
    var time = Date.parse(d.time);
    return (time >= startTime && time < endTime);
});
You may need to play around with the date ranges; I'm assuming you want the data that falls between 2012-11-24 and 2012-11-25 inclusive.
If your data is in a database, another way would be to query the database for only the data within the time range (you could call some PHP script, using d3.text, which outputs JSON and accepts two GET parameters, startTime and endTime).
d3.text("getNodes.php?startTime=" + startTime + "&endTime=" + endTime, function(json) {
    filteredData = JSON.parse(json);
});

Related

How to get the time taken for part of a program in Unix

I'm working on comparing a Binary Search Tree to an AVL one and want to see the usr/sys time for a search operation performed on both. Thing is: I have an application (SearchBST.java/SearchAVL.java) that reads in a file and populates the trees, and then searches them. I want to know if I can check the usr/sys time for just the searching instead of the entire thing (inserting and searching). It seems to me that the insertion is causing the AVL's time (using "time java SearchAVL") to be roughly the same as the BST's.
Should I be doing it differently (such that populating the tree doesn't affect the overall time)? I'll post some code as soon as I can, but I wanted to see if anyone has any thoughts in the meantime.

Why don't you measure the time inside your application?
// Read the file into a temporary collection or array
// to avoid measuring disk performance instead of tree performance
long t = System.nanoTime();
// populate tree
long tPopulate = System.nanoTime() - t;
t = System.nanoTime();
// search tree
long tSearch = System.nanoTime() - t;
System.out.println("tPopulate = " + tPopulate + " ns");
System.out.println("tSearch = " + tSearch + " ns");
This will only print the wall clock time, but since you don't have any Thread.sleep(...) commands or things like that in your program, the wall clock time shouldn't differ much from the user time.
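If user and system time are wanted separately rather than wall-clock time, the JDK's ThreadMXBean can report CPU time for the current thread. The sketch below is an assumption-laden illustration, not part of the original answer: the CpuTimer class and its measure method are invented names, and the figures are only meaningful when the JVM supports and has enabled thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that measures CPU time spent by the current thread.
class CpuTimer {

    // Runs the task and returns {userNanos, systemNanos} for the current
    // thread, or null if the JVM cannot measure per-thread CPU time.
    static long[] measure(Runnable task) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return null;
        }
        long cpuBefore = bean.getCurrentThreadCpuTime();   // user + system
        long userBefore = bean.getCurrentThreadUserTime(); // user only
        task.run();
        long user = bean.getCurrentThreadUserTime() - userBefore;
        long cpu = bean.getCurrentThreadCpuTime() - cpuBefore;
        // Clamp at zero because the two counters may have different granularity.
        return new long[] { user, Math.max(0L, cpu - user) };
    }
}
```

Calling CpuTimer.measure(() -> tree.search(key)) around only the search loop, where tree and key stand for your own objects, leaves the population phase out of the numbers. Work done by other threads (for example the garbage collector) is not included.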

How to use System.nanoTime() for sequencing events with more precision

I want to attach a timestamp whenever an event happens in my application. Let's assume that a client creates an Event object and I want to attach a creation timestamp to it. I can do this using System.currentTimeMillis() within the constructor. This works fine if the Event objects are created no faster than once every millisecond. In this case each Event object gets a different value from System.currentTimeMillis() and hence the Event objects are sequenced.
However, if the Event objects need to be created at a rate faster than one object per millisecond, my logic breaks. Depending on the rate of object creation, 2 or more Event objects end up having the same creation timestamp (since System.currentTimeMillis() returned the same value when called in quick succession).
Now how do I sequence the Event objects in this case? I am aware of the System.nanoTime() but that's not related to the epoch.
I am open to storing the creation timestamp within the Event class split into 2 instance variables - creationTimeInMS (long) and creationTimeInNS (long)
I do not want to use java.sql.Timestamp, which does support nanosecond precision.
Is there any way I could leverage System.nanoTime() to provide sequencing of the Event objects?
Note - It is guaranteed that the event will not get created faster than 1 per nanosecond. Hence nanosecond precision will suffice
The code I am using is below:
class Event {
    private long timestamp;
    public Event() {
        ...
        timestamp = System.currentTimeMillis();
    }
}
So if the Event constructor is called by multiple threads at a rate faster than 1 per millisecond, then two (or more) Event objects get the same timestamp.
System.nanoTime() is supposed to return a unique number if called no faster than once every nanosecond. However, I am not sure how I could use this number in conjunction with the timestamp. Do I add it to the timestamp to generate a nanosecond-precision time?

This is hard to achieve by relying on the wall clock: times are going to collide, and nanosecond resolution is hard to achieve in practice. A quick solution is to add a wrapper around the clock that remembers its last value. This won't work in a distributed environment, of course.
static class MonotonicClock {
    private long last;
    public MonotonicClock() {
        last = System.currentTimeMillis();
    }
    public synchronized long getNext() {
        long current = System.currentTimeMillis();
        if (last < current) { // last seen is less than "now"
            last = current;
        } else {
            last++; // collision, overclock the time
        }
        return last;
    }
}
In a distributed system, things are more complicated. You might need to look at Lamport timestamps and Vector Clocks for that.
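For the single-JVM case, one possible way to bring System.nanoTime() into the picture is to record a wall-clock anchor once and add the elapsed nanoTime to it. The sketch below is an illustration under assumptions, not an established recipe: EpochNanoClock and next are invented names, the result is only as accurate as the anchor (so it drifts if the system clock is adjusted later), and the AtomicLong bump plays the same role as the last++ in the wrapper above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical clock: approximate epoch nanoseconds, strictly increasing per JVM.
class EpochNanoClock {

    // Anchor taken once: wall-clock time and the matching nanoTime reading.
    private static final long BASE_EPOCH_NANOS = System.currentTimeMillis() * 1_000_000L;
    private static final long BASE_NANO_TIME = System.nanoTime();

    // Last value handed out, shared by all threads.
    private static final AtomicLong LAST = new AtomicLong(Long.MIN_VALUE);

    static long next() {
        long now = BASE_EPOCH_NANOS + (System.nanoTime() - BASE_NANO_TIME);
        // If this reading is not newer than the last value, bump by 1 ns
        // so that no two callers ever receive the same number.
        return LAST.updateAndGet(prev -> now > prev ? now : prev + 1);
    }
}
```

An Event constructor could then store EpochNanoClock.next() in a single long field; dividing by 1_000_000 gives back an approximate millisecond timestamp.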

Maximum occurrence of any event in time range

I have a collection of timestamp pairs, e.g. 10:18:07.490,11:50:18.251, where the first is the start time and the second is the end time of an event. I need to find the range within a 24-hour period where the maximum number of events are happening. The events are recorded with millisecond precision.
What I am doing is dividing the 24 hours on a millisecond scale, attaching events to every millisecond, and then finding the range where the maximum number of events are happening.
LocalTime start = LocalTime.parse("00:00");
LocalTime end = LocalTime.parse("23:59");
for (LocalTime x = start; x.isBefore(end); x = x.plus(Duration.ofMillis(1))) {
    for (int i = 0; i < startTime.size(); i++) {
        // an event covers x when it starts before x and ends after x
        if (startTime.get(i).isBefore(x) && endTime.get(i).isAfter(x)) {
            // add them to list;
        }
    }
}
Certainly this is not a good approach; it takes too much memory. How can I do it properly? Any suggestions?

A solution finding the first period of maximum concurrent events:
If you're willing to use a third-party library, this can be implemented "relatively easily" in a SQL style with jOOλ's window functions. The idea is the same as explained in amit's answer:
System.out.println(
    Seq.of(tuple(LocalTime.parse("10:18:07.490"), LocalTime.parse("11:50:18.251")),
           tuple(LocalTime.parse("09:37:03.100"), LocalTime.parse("16:57:13.938")),
           tuple(LocalTime.parse("08:15:11.201"), LocalTime.parse("10:33:17.019")),
           tuple(LocalTime.parse("10:37:03.100"), LocalTime.parse("11:00:15.123")),
           tuple(LocalTime.parse("11:20:55.037"), LocalTime.parse("14:37:25.188")),
           tuple(LocalTime.parse("12:15:00.000"), LocalTime.parse("14:13:11.456")))
       .flatMap(t -> Seq.of(tuple(t.v1, 1), tuple(t.v2, -1)))
       .sorted(Comparator.comparing(t -> t.v1))
       .window(Long.MIN_VALUE, 0)
       .map(w -> tuple(
           w.value().v1,
           w.lead().map(t -> t.v1).orElse(null),
           w.sum(t -> t.v2).orElse(0)))
       .maxBy(t -> t.v3)
);
The above prints:
Optional[(10:18:07.490, 10:33:17.019, 3)]
So, during the period between 10:18... and 10:33..., there were 3 events, which is the largest number of events that overlap at any time during the day.
Finding all periods of maximum concurrent events:
Note that there are several periods when there are 3 concurrent events in the sample data. maxBy() returns only the first such period. In order to return all such periods, use maxAllBy() instead (added to jOOλ 0.9.11):
.maxAllBy(t -> t.v3)
.toList()
Yielding then:
[(10:18:07.490, 10:33:17.019, 3),
(10:37:03.100, 11:00:15.123, 3),
(11:20:55.037, 11:50:18.251, 3),
(12:15 , 14:13:11.456, 3)]
Or, a graphical representation
3                 /-----\       /-----\       /-----\       /-----\
2          /-----/       \-----/       \-----/       \-----/       \-----\
1    -----/                                                               \-----\
0                                                                                \--
  08:15  09:37  10:18  10:33  10:37  11:00  11:20  11:50  12:15  14:13  14:37  16:57
Explanations:
Here's the original solution again with comments:
// This is your input data
Seq.of(tuple(LocalTime.parse("10:18:07.490"), LocalTime.parse("11:50:18.251")),
       tuple(LocalTime.parse("09:37:03.100"), LocalTime.parse("16:57:13.938")),
       tuple(LocalTime.parse("08:15:11.201"), LocalTime.parse("10:33:17.019")),
       tuple(LocalTime.parse("10:37:03.100"), LocalTime.parse("11:00:15.123")),
       tuple(LocalTime.parse("11:20:55.037"), LocalTime.parse("14:37:25.188")),
       tuple(LocalTime.parse("12:15:00.000"), LocalTime.parse("14:13:11.456")))
   // Flatten "start" and "end" times into a single sequence, with start times being
   // accompanied by a "+1" event, and end times by a "-1" event, which can then be summed
   .flatMap(t -> Seq.of(tuple(t.v1, 1), tuple(t.v2, -1)))
   // Sort the "start" and "end" times according to the time
   .sorted(Comparator.comparing(t -> t.v1))
   // Create a "window" between the first time and the current time in the sequence
   .window(Long.MIN_VALUE, 0)
   // Map each time value to a tuple containing
   // (1) the time value itself
   // (2) the subsequent time value (lead)
   // (3) the "running total" of the +1 / -1 values
   .map(w -> tuple(
       w.value().v1,
       w.lead().map(t -> t.v1).orElse(null),
       w.sum(t -> t.v2).orElse(0)))
   // Now, find the tuple that has the maximum "running total" value
   .maxBy(t -> t.v3)
I have written up more about window functions and how to implement them in Java in this blog post.
(disclaimer: I work for the company behind jOOλ)

It can be done significantly better in terms of memory (well, assuming O(n) is good enough for you, and you don't regard 24*60*60*1000 as a tolerable constant):
1. Create a list of items [time, type] (where time is the time, and type is either start or end).
2. Sort the list by time.
3. Iterate over the list; when you see a "start", increment a counter, and when you see an "end", decrement it.
4. By storing the maximum seen so far, you can easily identify a single point in time at which the maximal number of events are occurring.
If you want the interval containing this point, take the time at which the first maximum occurs and the time at which it ends. That end is the next [time, type] pair; or, if you allow a start and an end at the same time and do not count them as overlapping, scan linearly from this point until the counter decreases and the time has moved on. This scan is done only once and does not change the total complexity of the algorithm.
It is really easy to modify this approach to get the interval from the point.
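The steps above can be sketched in plain Java without any third-party library. This is one possible reading of the approach, not the answerer's own code: MaxOverlap, Peak, Edge and find are illustrative names, records need Java 16 or later, and ties at the same instant are resolved by processing ends before starts so that events which merely touch are not counted as overlapping.

```java
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sweep over sorted start/end markers.
class MaxOverlap {

    // First interval [from, to) with the highest number of concurrent events.
    record Peak(LocalTime from, LocalTime to, int count) {}

    // A start (+1) or end (-1) marker at a given time.
    private record Edge(LocalTime time, int delta) {}

    // Each element of events is a two-element array {start, end}.
    static Peak find(List<LocalTime[]> events) {
        List<Edge> edges = new ArrayList<>();
        for (LocalTime[] e : events) {
            edges.add(new Edge(e[0], +1));
            edges.add(new Edge(e[1], -1));
        }
        // Sort by time; at equal times, ends (-1) come before starts (+1).
        edges.sort(Comparator.comparing(Edge::time).thenComparingInt(Edge::delta));

        int running = 0;
        int best = 0;
        LocalTime from = null;
        LocalTime to = null;
        for (int i = 0; i < edges.size(); i++) {
            running += edges.get(i).delta();
            if (running > best) { // new maximum: remember where it starts and ends
                best = running;
                from = edges.get(i).time();
                to = i + 1 < edges.size() ? edges.get(i + 1).time() : null;
            }
        }
        return new Peak(from, to, best);
    }
}
```

Memory use is proportional to the number of events rather than to the number of milliseconds in a day, and the sort dominates the running time.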

timestamp milliseconds distinction in Dao operation

I have this really simple DAO operation, which works fine (part of a JUnit test):
for (int i = 0 ; i < 5000 ; i++)
    mUserDao.saveUser(lUser, new Date().getTime());
The second parameter is a timestamp as a long value. I am testing a kind of bulk save.
My question is: is it theoretically possible to have two entries with the same long value in my database (MySQL), within the same process?
A first look at the table shows different long values for each entry (at least the last millisecond digit is increased).
Thx in advance
Stefan

You can't guarantee that new Date() will give you the current time accurately. Often it can be wrong by up to about 10ms. Calling new Date() uses System.currentTimeMillis(). The Javadoc for currentTimeMillis() says
Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds.
So it's OS dependent. My experience is that Windows is particularly bad at giving you accurate dates.
And you certainly can't guarantee that you'll get different dates from successive calls to new Date().

Yes, this is possible, especially if you have fast hardware and two save operations happen at the same time. Both created entries will then have the same long value.

I think milliseconds since the UNIX epoch are still the best way to measure time in a reasonably accurate way. However, it's not really good to have a timestamp only as a primary key. As long as you have a unique primary key you don't really need unique timestamps.
In case for some reason you still want the timestamp to be unique, you can apply an 'artificial smear'. For example:
long last = 0;
for (int i = 0 ; i < 5000 ; i++) {
    long now = new Date().getTime();
    if (now <= last) {
        now = last + 1;
    }
    last = now;
    mUserDao.saveUser(lUser, now);
}
There are many ways this can be improved but the code above is just to illustrate the idea of a smear.

Number of visitors per hour

I have a server log formatted like this:
128.33.100.1 2011-03-03 15:25 test.html
I need to extract several things from it but I am mostly stuck on how to get the total number of visits per hour as well as the number of unique visitors per page. Any pointers would be appreciated.

Assuming you have only these lines in your log file, here is how I think you should go about it (this assumes no database is involved):
Create a class that represents each line (a model), with IP, date, time and file fields.
You can add a method to this model that returns a Java timestamp based on the date and time.
Then create a HashMap that stores file names as keys and lists of objects of the above class as values.
Read the file one line at a time.
For each line:
a. Use StringTokenizer to get the IP, date, time and file as tokens.
b. Populate an object of the above class.
c. Append this object to the list matching the file name in the hash map (create a new list if one doesn't exist).
Now you have all the data in a usable data structure.
To get the number of unique visitors for each page:
1. Retrieve the list corresponding to the file name from the hash map. Then you can run a simple algorithm to count the number of unique IP addresses. You can also use the Java Collections functionality to do this.
To get the number of visits per hour for each page:
1. Again retrieve the correct list as above, and find the min and max timestamps.
2. Work out the elapsed time in hours, then divide the total number of entries in the list by that number of hours.
Hope that helps.

If you are splitting each line into an array, I would suggest taking the hour from the 3rd element and checking each subsequent line the same way: from the first time you see hour 15 until the first time you see hour 16, use a counter to store the number of hits in that hour.
Splitting a String can be done like this:
String[] temp;
String str = "firstElement secondElement thirdElement";
String delimiter = " ";
temp = str.split(delimiter); // temp is filled with three elements.
As for the unique visitors per page, you can take the 1st element of the split array and put it into a HashMap, with the IP as the key and the page visited as the value. Then check the HashMap for every IP that comes in; if it is not in the map, insert it, and by the end you will have a HashMap filled with unique IPs.
Hope that gives you some help.
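Counting unique visitors per page can also be sketched with a map from page to a set of IPs, since a set silently ignores repeated entries. This is a variation on the idea above rather than the exact structure described: UniqueVisitors and perPage are invented names, and the code assumes every line has the four whitespace-separated fields shown in the question.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that counts distinct IP addresses per requested page.
class UniqueVisitors {

    static Map<String, Integer> perPage(List<String> logLines) {
        Map<String, Set<String>> ipsByPage = new HashMap<>();
        for (String line : logLines) {
            String[] parts = line.trim().split("\\s+"); // ip, date, time, page
            if (parts.length < 4) {
                continue; // skip lines that do not match the assumed format
            }
            // The set keeps each IP only once per page.
            ipsByPage.computeIfAbsent(parts[3], page -> new HashSet<>()).add(parts[0]);
        }
        Map<String, Integer> counts = new HashMap<>();
        ipsByPage.forEach((page, ips) -> counts.put(page, ips.size()));
        return counts;
    }
}
```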

Convert the log entries into java.util.Calendar objects and then perform your maths per unique IP address.
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
public class Visit
{
    public static void main(String[] args) throws Exception
    {
        String[] stats = "128.33.100.1 2011-03-03 15:25 test.html".split("\\s+");
        System.out.println("IP Address: " + stats[0]);
        // HH is the 24-hour clock; hh would be the 12-hour clock
        SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm");
        Date date = formatter.parse(stats[1] + " " + stats[2]);
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        // Calendar.MONTH is zero-based, so add 1 for display
        System.out.println("On Date: " + cal.get(Calendar.DATE) + "/" + (cal.get(Calendar.MONTH) + 1) + "/" + cal.get(Calendar.YEAR));
        System.out.println("At time: " + cal.get(Calendar.HOUR_OF_DAY) + ":" + cal.get(Calendar.MINUTE));
        System.out.println("Visited page: " + stats[3]);
        /*
         * You have the Calendar object now perform your maths
         */
    }
}

While parsing the line from the log do:
Allocate a Dictionary (this should be done just once at the start of the program)
Extract the date time part
Convert it to DateTime object (.NET) or similar object for your programming language
Set the minutes and seconds of the date time object to 0
Put the date time in Dictionary object if it doesn't already exist.
Increment the value in the dictionary item where Date time is your current parsed date time
In the end this dictionary will have hourly hits
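In Java, the same steps might look roughly like the sketch below, with a TreeMap standing in for the Dictionary and LocalDateTime for the DateTime object. The names HourlyHits and count are made up for this example, and the code assumes the four-field log format from the question.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that counts log lines per clock hour.
class HourlyHits {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    static Map<LocalDateTime, Integer> count(List<String> logLines) {
        // TreeMap keeps the hours in chronological order.
        Map<LocalDateTime, Integer> hits = new TreeMap<>();
        for (String line : logLines) {
            String[] parts = line.trim().split("\\s+"); // ip, date, time, page
            if (parts.length < 4) {
                continue; // skip lines that do not match the assumed format
            }
            // Drop minutes (and anything smaller) so all hits in the same hour share a key.
            LocalDateTime hour = LocalDateTime.parse(parts[1] + " " + parts[2], FORMAT)
                    .truncatedTo(ChronoUnit.HOURS);
            hits.merge(hour, 1, Integer::sum); // insert 1 or increment the existing count
        }
        return hits;
    }
}
```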
