I need a Java utility method (for an application that receives thousands of requests per second) with the following behavior.
Each request carries an arrivalTime in the format DD-MM-YYYY-HH:MM:SS and a bucketNumber (1-100).
If a request arrives with the same arrivalTime as the previous request for the same bucketNumber, the method should increment the request's arrivalTime by 1 millisecond.
For example :
If for bucketNumber=1 the arrival time of the 1st, 2nd and 3rd requests is 01-01-2016-10:00:00 (to millisecond precision, actually 01-01-2016-10:00:00:000), and a 4th request arrives with 01-01-2016-10:00:01, then:
for the 2nd request the utility method will return 01-01-2016-10:00:00 (but actually 01-01-2016-10:00:00:001),
for the 3rd request the utility method will return 01-01-2016-10:00:00 (but actually 01-01-2016-10:00:00:002),
and for the 4th request it will return 01-01-2016-10:00:01 unchanged, without performing any operation.
I don't want to keep a huge cache to perform this (and if I used a set, I would also have to keep removing the redundant data).
// signature should be like below
Date getIncrementedArrivalTimeIfSame(Date arrivaltime, int bucketNumber) {
    // return the incremented time if equal, else return the original arrivaltime
}
Should I use a global map with bucketNumber as the key and a set of arrival times as the value? Please help me implement this. The method will be invoked inside a synchronized block, so it is called in a thread-safe way.
Below is my solution.
I finally used a map:
static Map<Integer, Date> arrivalTimeMap = new HashMap<>();

static Date getIncrementedArrivalTimeIfEqual(Date arrivaltime, int bucketNumber) {
    // put() stores the new time and returns the previously stored time, or null
    Date lastArrivalTime = arrivalTimeMap.put(bucketNumber, arrivaltime);
    if (lastArrivalTime != null && !lastArrivalTime.before(arrivaltime)) {
        // the stored time is equal or later: bump it by one millisecond and store that instead
        arrivaltime = incrementDateByMilliSeconds(lastArrivalTime, 1);
        arrivalTimeMap.put(bucketNumber, arrivaltime);
    }
    return arrivaltime;
}

static Date incrementDateByMilliSeconds(Date date, int millis) {
    return new Date(date.getTime() + millis);
}

Note that only the latest arrival time per bucket is stored, so the map never holds more than 100 entries.
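For illustration, here is a minimal usage sketch (assuming the map and method above live in the same class; the class name Demo and the millisecond-extended format string are illustrative, not part of the original):

import java.text.SimpleDateFormat;
import java.util.Date;

public class Demo {
    public static void main(String[] args) throws Exception {
        SimpleDateFormat fmt = new SimpleDateFormat("dd-MM-yyyy-HH:mm:ss.SSS");
        Date t = fmt.parse("01-01-2016-10:00:00.000");
        // three requests with the same arrival time in bucket 1
        System.out.println(fmt.format(getIncrementedArrivalTimeIfEqual(t, 1))); // 01-01-2016-10:00:00.000
        System.out.println(fmt.format(getIncrementedArrivalTimeIfEqual(t, 1))); // 01-01-2016-10:00:00.001
        System.out.println(fmt.format(getIncrementedArrivalTimeIfEqual(t, 1))); // 01-01-2016-10:00:00.002
    }
}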
I'm using Camel JPA endpoints to poll a database and copy the data to a second one.
To not poll duplicates, I'm planning to save the highest ID of the copied data and only poll data with an ID higher than that one.
To save a few database writes, I want to write back the highest ID after the current polling / copying run is over, not for every single data element. I can access the element (and its ID) in the Camel Route class:
private Long startId = 0L;
private Long lastId = 0L;

from("jpa://Data").routeId("dataRoute")
    .onCompletion().onCompleteOnly().process(ex -> {
        if (lastId > startId) {
            startId = lastId;
            logger.info("New highest ID: {}", startId);
        }
    }).end()
    .process(ex -> {
        Data data = ex.getIn().getBody(Data.class);
        lastId = data.getId();
        NewData newData = (NewData) convertData(data);
        ex.getMessage().setBody(newData);
    }).to("jpa://NewData");
Now I want to save startId after the current polling is over. To do so, I overrode the PollingConsumerPollStrategy with my custom one where I want to access lastId inside the commit method (which gets executed exactly when I want it to, after the current polling is complete).
However, I can't access the route there. I tried via the route ID:
@Override
public void commit(Consumer consumer, Endpoint endpoint, int polledMessages) {
    var route = (MyRoute) endpoint.getCamelContext().getRoute("dataRoute");
    var lastId = route.getLastId();
    log.debug("LastID: {}", lastId);
}
However, I'm getting a ClassCastException (DefaultRoute cannot be cast to MyRoute). Is there a better way of handing the ID over from my route?
I would do it a bit differently.
Instead of using RouteBuilder instance vars to store startId and lastId, you could put these values into the GlobalOptions (basically a map of key-value pairs) of the current CamelContext.
This way, you can easily obtain their value using:
public void commit(Consumer consumer, Endpoint endpoint, int polledMessages) {
    String lastId = endpoint.getCamelContext().getGlobalOption("lastId");
}
It is also (theoretically) a better implementation because it supports potential concurrent executions, as the IDs are shared across all instances running in the context.
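The write side isn't shown above; a minimal sketch of how the route could publish the value (this assumes the map returned by getGlobalOptions() may be mutated directly, and that global options hold String values):

// Inside the polling route's processor:
.process(ex -> {
    Data data = ex.getIn().getBody(Data.class);
    // global options are String key/value pairs, so convert the Long ID
    ex.getContext().getGlobalOptions().put("lastId", String.valueOf(data.getId()));
})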
I have to design a REST API in Java which:
Accepts a POST request with the below JSON:
{
  "instrument": "ABC",
  "price": "200.90",
  "timestamp": "2018-09-25T12:00:00"
}
These records would be saved in an in-memory collection, not in any kind of database.
There would be a GET API which returns the statistics for the records of a specific instrument received in the last 60 seconds. The GET request would be /statistics/{instrumentName}, e.g. /statistics/ABC. The response looks as mentioned below:
{
  "count": "3",
  "min": "100.00",
  "max": "200.00",
  "sum": "450.00",
  "avg": "150.00"
}
There would be another GET request, /statistics, which returns the statistics of all the instruments received in the last 60 seconds (not specific to a particular instrument like #2).
What makes this algorithm complex to implement is that the GET call should execute in O(1) time and space.
The approach I have thought of for #3 is to have a collection with 60 buckets (since we have to calculate over the past 60 seconds, sampling once per second). Every time a transaction comes in, it goes to a specific bucket depending on the key, i.e. hour-min-sec (it would be a map with this key and the statistics for that second).
But what I am not able to understand is how to address problem #2, where we have to get the statistics of a specific instrument (/statistics/ABC) for the last 60 seconds in O(1) time and space.
What would be the best strategy to clean up records which are older than 60 seconds?
Any help with the algorithm will be appreciated.
Store the data in a Map<String, Instrument>, and have the class look like this:
class Instrument {
    private String name;
    private SortedMap<LocalDateTime, BigDecimal> prices;
    private BigDecimal minPrice;
    private BigDecimal maxPrice;
    private BigDecimal sumPrice;

    // Internal helper method
    private void cleanup() {
        LocalDateTime expireTime = LocalDateTime.now().minusSeconds(60);
        Map<LocalDateTime, BigDecimal> expiredPrices = this.prices.headMap(expireTime);
        for (BigDecimal expiredPrice : expiredPrices.values()) {
            if (this.minPrice.compareTo(expiredPrice) == 0)
                this.minPrice = null; // the min expired; recompute below
            if (this.maxPrice.compareTo(expiredPrice) == 0)
                this.maxPrice = null; // the max expired; recompute below
            this.sumPrice = this.sumPrice.subtract(expiredPrice);
        }
        expiredPrices.clear(); // removes expired prices from this.prices (headMap is a view)
        if (this.minPrice == null && !this.prices.isEmpty())
            this.minPrice = this.prices.values().stream().min(Comparator.naturalOrder()).get();
        if (this.maxPrice == null && !this.prices.isEmpty())
            this.maxPrice = this.prices.values().stream().max(Comparator.naturalOrder()).get();
    }

    // other code
}
All the public methods of Instrument must be synchronized and must start with a call to cleanup(), since time has elapsed since any previous call. The addPrice(LocalDateTime, BigDecimal) method must of course also update the 3 statistics fields.
To keep the statistics in sync, it would be appropriate to have a Statistics class that can be used as the return value, so that all 4 main statistics values (including count, obtained from this.prices.size()) represent the same set of prices.
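A hedged sketch of what addPrice and a statistics accessor could look like inside Instrument (the Statistics record, Java 16+ record syntax, and the rounding choice are assumptions, not part of the answer above):

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDateTime;

record Statistics(int count, BigDecimal min, BigDecimal max, BigDecimal sum, BigDecimal avg) {}

// Inside Instrument:
synchronized void addPrice(LocalDateTime time, BigDecimal price) {
    cleanup(); // time has passed since the last call
    prices.put(time, price);
    minPrice = (minPrice == null) ? price : minPrice.min(price);
    maxPrice = (maxPrice == null) ? price : maxPrice.max(price);
    sumPrice = (sumPrice == null) ? price : sumPrice.add(price);
}

synchronized Statistics getStatistics() {
    cleanup();
    int count = prices.size();
    BigDecimal avg = (count == 0) ? BigDecimal.ZERO
            : sumPrice.divide(BigDecimal.valueOf(count), 2, RoundingMode.HALF_UP);
    return new Statistics(count, minPrice, maxPrice, sumPrice, avg);
}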
I have a list of session expiration times and a session timeout value.
So I can get each session's start time by subtracting the timeout from its expiration time.
I know how to check whether two date ranges overlap:
public boolean isOverlapped(LocalDateTime start1, LocalDateTime end1,
                            LocalDateTime start2, LocalDateTime end2) {
    // inclusive overlap check: start1 <= end2 && start2 <= end1
    return (start1.isBefore(end2) || start1.equals(end2))
        && (start2.isBefore(end1) || start2.equals(end1));
}
but I have no idea how to do this for a list of dates.
As a result, I want a list with the longest chain of overlapping (concurrent) sessions.
Appreciate any help!
First off, make a new class that represents these sessions (if you don't have one already):
class Session {
    private LocalDateTime start;
    private LocalDateTime end;

    public boolean isOverlapped(Session other) {
        return (start.isBefore(other.end) || start.equals(other.end))
            && (end.isAfter(other.start) || end.equals(other.start));
    }
    ...
}
Your input list will have to be a list of Sessions.
Next, here is an algorithm that does what you asked for; it takes a list and checks, for each element, whether it overlaps with any other element in the list (except itself). If that is the case, it puts it in the result list:
public static List<Session> filter(List<Session> in) {
    List<Session> result = new ArrayList<>();
    for (Session current : in) {
        for (Session other : in) {
            if (current != other && current.isOverlapped(other)) {
                result.add(current);
                break;
            }
        }
    }
    return result;
}
Here is also an example program: Ideone
The result will be a list containing sessions that were concurrent with any other session.
This is a rather classic problem. You didn't specify whether you want the longest chain in terms of time or in terms of number of intervals, but they both work the same way.
First sort all your sessions by start time. Then the logic would be the following:
current_chain = []
best_chain = []
for session in sessions_sorted_by_start:
    if session doesn't overlap with any session in current_chain:  [1]
        update best_chain if current_chain is better  [2]
        current_chain = []
    current_chain.insert(session)  [3]
update best_chain if current_chain is better  [2]
The idea here is the following: we maintain a current chain. If a new session overlaps with any session in the chain, we just add it to the chain. If it doesn't overlap with any session in the current chain, then its start is to the right of the end of the furthest-reaching session in the current chain, so no other remaining session will overlap with anything in the current chain (since they are sorted by start date). That means the current chain is as long as it will ever get, so we can check whether it is better than the best chain so far ([2]) by whichever criterion (time or number of sessions), and reset it.
Now, to make it linear in time, the overlap check between a session and a chain at [1] should run in constant time. This is easily done if you always maintain the furthest-reaching session of the chain. When you insert a new session at [3], if its end extends beyond the end of the current furthest session, update the furthest session; otherwise do not. This way, at [1] you only need to check the overlap with the furthest session instead of checking all of them, which makes that particular check constant time and the entire algorithm linear (not counting the initial sort, which is of course O(n log n)).
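A Java sketch of this sweep, measuring the chain by number of sessions (the getStart()/getEnd() accessors on Session are assumptions, since the fields in the class above are private):

import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ChainFinder {
    static List<Session> longestChain(List<Session> sessions) {
        List<Session> sorted = new ArrayList<>(sessions);
        sorted.sort(Comparator.comparing(Session::getStart)); // sort by start time

        List<Session> best = new ArrayList<>();
        List<Session> current = new ArrayList<>();
        LocalDateTime furthestEnd = null; // end of the furthest-reaching session in the chain

        for (Session s : sorted) {
            // [1] constant-time overlap check against the whole chain
            if (furthestEnd != null && s.getStart().isAfter(furthestEnd)) {
                if (current.size() > best.size()) best = new ArrayList<>(current); // [2]
                current = new ArrayList<>();
                furthestEnd = null;
            }
            current.add(s); // [3]
            if (furthestEnd == null || s.getEnd().isAfter(furthestEnd)) {
                furthestEnd = s.getEnd();
            }
        }
        if (current.size() > best.size()) best = current; // [2]
        return best;
    }
}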
I'm fetching Google Analytics data using Core Reporting API v4. I'm able to capture at most 10,000 records for a given combination of Dimensions & Metrics.
My question is: if my query can produce more than 10,000 results, how can I fetch all those records? I have gone through the documentation and found that in a single request we can't access more than 10,000 records by setting the properties of the ReportRequest object.
ReportRequest request = new ReportRequest()
    .setDateRanges(Arrays.asList(dateRange))
    .setViewId(VIEW_ID)
    .setDimensions(Arrays.asList(dimension))
    .setMetrics(Arrays.asList(metric))
    .setPageSize(10000);
How can we issue multiple requests in a single run, depending upon the number of search results?
For example: if my query can return 35,000 records, then there should be 4 requests (10,000, 10,000, 10,000 & 5,000) managed internally.
Please look into this and give me some guidance. Thanks in advance.
The Analytics Core Reporting API returns a maximum of 10,000 rows per request, no matter how many you ask for.
If the request you are making will generate more than 10,000 rows, there will be additional rows you can request. The response returned from the first request will contain a parameter called nextPageToken, which you can use to request the next set of data.
You will have to dig around the Java library; the only documentation on how to do it that I have found is for HTTP.
POST https://analyticsreporting.googleapis.com/v4/reports:batchGet
{
  "reportRequests": [
    {
      ...
      # Taken from `nextPageToken` of a previous response.
      "pageToken": "XDkjaf98234xklj234",
      "pageSize": "10000"
    }
  ]
}
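The Java client exposes the same mechanism through ReportRequest.setPageToken. A minimal paging loop might look like the sketch below (it reuses service, VIEW_ID, dateRange, dimension and metric from the question's snippet; everything else is an assumption):

String pageToken = null;
do {
    ReportRequest request = new ReportRequest()
        .setViewId(VIEW_ID)
        .setDateRanges(Arrays.asList(dateRange))
        .setDimensions(Arrays.asList(dimension))
        .setMetrics(Arrays.asList(metric))
        .setPageSize(10000)
        .setPageToken(pageToken); // null on the first request
    GetReportsResponse response = service.reports()
        .batchGet(new GetReportsRequest().setReportRequests(Arrays.asList(request)))
        .execute();
    // ... consume response.getReports().get(0) here ...
    pageToken = response.getReports().get(0).getNextPageToken(); // null when no more pages
} while (pageToken != null);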
Here's a stable and extensively tested solution in Java. It is a recursive solution that stores every batch of 10,000 results (if any) and calls itself until it finds a null nextToken. In this specific solution, every batch of 10,000 results is saved to a CSV file before the recursive call is performed. Note that the first time this function is called from outside, the pageToken must also be null! Focus on the recursion rationale and the null check!
private static int getComplexReport(AnalyticsReporting service, int reportIndex,
        String startDate, String endDate, ArrayList<String> metricNames,
        ArrayList<String> dimensionNames, String pageToken) throws IOException {

    ReportRequest req = createComplexRequest(startDate, endDate, metricNames, dimensionNames, pageToken);
    ArrayList<ReportRequest> requests = new ArrayList<>();
    requests.add(req);

    // Create the GetReportsRequest object.
    GetReportsRequest getReport = new GetReportsRequest().setReportRequests(requests);

    // Call the batchGet method.
    GetReportsResponse response = service.reports().batchGet(getReport).execute();
    //printResponse(response);

    saveBatchToCsvFile("dummy_" + startDate + "_" + endDate + "_" + Integer.toString(reportIndex) + ".csv",
            startDate + "_" + endDate, response, metricNames, dimensionNames);

    String nextToken = response.getReports().get(0).getNextPageToken();
    //System.out.println(nextToken);
    if (nextToken != null)
        return getComplexReport(service, reportIndex + 1, "2016-06-21", "2016-06-21",
                metricNames, dimensionNames, nextToken);
    return reportIndex;
}
var reportRequest = new ReportRequest
{
    DateRanges = new List<DateRange> { dateRange },
    Dimensions = new List<Dimension> { date, UserId, DeviceCategory },
    Metrics = new List<Metric> { sessions },
    ViewId = view,
    PageSize = 400000
};
My problem
Let's say I want to hold my messages in some sort of data structure for a long-polling application:
1. "dude"
2. "where"
3. "is"
4. "my"
5. "car"
Asking for messages at indexes [4,5] should return:
"my", "car".
Next, let's assume that after a while I would like to purge old messages because they aren't useful anymore and I want to save memory. Let's say after time x, messages [1-3] become stale. I assume it would be most efficient to just do the deletion once every x seconds. My data structure should then contain:
4. "my"
5. "car"
My solution?
I was thinking of using a ConcurrentSkipListSet or ConcurrentSkipListMap. I was also thinking of deleting the old messages from inside a newSingleThreadScheduledExecutor. I would like to know how you would implement this (efficiently/thread-safely), or maybe suggest a library?
The big concern, as I gather it, is how to let certain elements expire after a period. I had a similar requirement, and I created a message class that implements the Delayed interface. This class holds everything I need for a message and (through the Delayed interface) tells me when it has expired.
I used instances of this object within a concurrent collection; you could use a ConcurrentMap, because it will allow you to key those objects with an integer key.
I reaped the collection once every so often, removing items whose delay has passed. We test for expiration by using the getDelay method of the Delayed interface:
message.getDelay(TimeUnit.MILLISECONDS);
I used a normal thread that would sleep for a period then reap the expired items. In my requirements it wasn't important that the items be removed as soon as their delay had expired. It seems that you have a similar flexibility.
If you needed to remove items as soon as their delay expired, then instead of sleeping a set period in your reaping thread, you would sleep for the delay of the message that will expire first.
Here's my delayed message class:
class DelayedMessage implements Delayed {
    long endOfDelay;
    Date requestTime;
    String message;

    public DelayedMessage(String m, int delay) {
        requestTime = new Date();
        endOfDelay = System.currentTimeMillis() + delay;
        this.message = m;
    }

    public long getDelay(TimeUnit unit) {
        return unit.convert(
                endOfDelay - System.currentTimeMillis(),
                TimeUnit.MILLISECONDS);
    }

    public int compareTo(Delayed o) {
        DelayedMessage that = (DelayedMessage) o;
        if (this.endOfDelay < that.endOfDelay) {
            return -1;
        }
        if (this.endOfDelay > that.endOfDelay) {
            return 1;
        }
        return this.requestTime.compareTo(that.requestTime);
    }

    @Override
    public String toString() {
        return message;
    }
}
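And a minimal sketch of the reaping described above (the MessageStore class, the map layout, and the one-second reap interval are assumptions, not part of my original code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class MessageStore {
    private final Map<Integer, DelayedMessage> messages = new ConcurrentHashMap<>();
    private final ScheduledExecutorService reaper = Executors.newSingleThreadScheduledExecutor();

    MessageStore() {
        // Periodically remove messages whose delay has elapsed (getDelay <= 0 means expired).
        reaper.scheduleAtFixedRate(
                () -> messages.values().removeIf(m -> m.getDelay(TimeUnit.MILLISECONDS) <= 0),
                1, 1, TimeUnit.SECONDS);
    }

    void put(int index, DelayedMessage message) {
        messages.put(index, message);
    }
}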
I'm not sure if this is what you want, but it looks like you need a NavigableMap<K,V> to me.
import java.util.*;
public class NaviMap {
    public static void main(String[] args) {
        NavigableMap<Integer,String> nmap = new TreeMap<Integer,String>();
        nmap.put(1, "dude");
        nmap.put(2, "where");
        nmap.put(3, "is");
        nmap.put(4, "my");
        nmap.put(5, "car");

        System.out.println(nmap);
        // prints "{1=dude, 2=where, 3=is, 4=my, 5=car}"

        System.out.println(nmap.subMap(4, true, 5, true).values());
        // prints "[my, car]"        ^inclusive^

        nmap.subMap(1, true, 3, true).clear();
        System.out.println(nmap);
        // prints "{4=my, 5=car}"

        // wrap into synchronized SortedMap
        SortedMap<Integer,String> ssmap = Collections.synchronizedSortedMap(nmap);

        System.out.println(ssmap.subMap(4, 5));
        // prints "{4=my}"           ^exclusive upper bound!

        System.out.println(ssmap.subMap(4, 5+1));
        // prints "{4=my, 5=car}"    ^ugly but "works"
    }
}
Now, unfortunately, there's no easy way to get a synchronized version of a NavigableMap<K,V> (Java 8 later added Collections.synchronizedNavigableMap); a SortedMap does have subMap, but only one overload, where the upper bound is strictly exclusive.
API links
SortedMap.subMap
NavigableMap.subMap
Collections.synchronizedSortedMap