How to compare Guava Ranges?

I decided to create a Map to store metric names and the Ranges representing the live period of each metric. At first I used a TreeRangeMap to store the Ranges, but since each metric has a single Range, I switched to plain Ranges as shown below.
My goal is to keep the latest time range in DEFAULT_METRICS_MAP when I receive a Range for a metric from an external API.
When the Ranges were in a TreeRangeMap, comparing them was easy. I added the new Range and then got the max range like this:
private static Optional<Range<Long>> maxRange(TreeRangeSet<Long> rangeSet) {
Set<Range<Long>> ranges = rangeSet.asRanges();
return ranges.stream().max(Comparator.comparing(Range::upperEndpoint));
}
What would be the correct way to compare Ranges when they are not wrapped into a TreeRangeMap?
public static final Map<String, Range<Long>> DEFAULT_METRICS_MAP;
static {
Map<String, Range<Long>> theMap = new HashMap<>();
theMap.put("Metric1", Range.closed(Long.MIN_VALUE, Long.MAX_VALUE));
theMap.put("Metric2", Range.closed(10L, 20L));
theMap.put("Metric3", Range.closed(30L, 50L));
DEFAULT_METRICS_MAP = Collections.unmodifiableMap(theMap);
}

First of all, it was the right decision to avoid using TreeRangeMap/TreeRangeSet in this particular case. As I understand it (correct me if I'm wrong), you don't need to keep all the ranges for all the metrics. What you need is the latest range for each metric at any moment in time.
Ideally you would like to have a very fast way of retrieving it, like:
Range<Long> range = getRange(metric);
The most efficient way is to compare Range objects on receiving them:
public void setRange(String metric, Range<Long> newRange) {
Range<Long> oldRange = metricRanges.get(metric);
if (comparator.compare(newRange, oldRange) > 0) {
metricRanges.put(metric, newRange);
}
}
Here is the full example:
// Better keep this map encapsulated
private final Map<String, Range<Long>> metricRanges = new HashMap<>();
private final Comparator<Range<Long>> comparator =
Comparator.nullsFirst(Comparator.comparing(Range::upperEndpoint));
{
// Instance initializer (metricRanges is not static): fill in your map with default ranges
}
public void setRange(String metric, Range<Long> newRange) {
Range<Long> oldRange = metricRanges.get(metric);
if (comparator.compare(newRange, oldRange) > 0) {
metricRanges.put(metric, newRange);
}
}
public Range<Long> getRange(String metric) {
return metricRanges.get(metric);
}
If you still need Optional:
public Optional<Range<Long>> getRange(String metric) {
// ofNullable, because Optional.of throws a NullPointerException for an unknown metric
return Optional.ofNullable(metricRanges.get(metric));
}
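The compare-on-write idea can also be sketched with Map.merge, which applies a "keep the larger one" function only when the key already holds a value. This is a minimal sketch, not the answer's code: Period is a hypothetical stand-in for Guava's Range<Long> so that the snippet needs no external dependency, and LatestPeriods and its method names are made up for illustration. With Guava on the classpath the key extractor would be Range::upperEndpoint, which throws IllegalStateException for a range that is unbounded above.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for Guava's Range<Long>; only the upper endpoint is compared.
final class Period {
    final long start;
    final long end;

    Period(long start, long end) {
        this.start = start;
        this.end = end;
    }
}

final class LatestPeriods {
    // Picks the period with the larger upper endpoint; on a tie the stored one wins.
    private static final BinaryOperator<Period> LATEST =
            BinaryOperator.maxBy(Comparator.comparingLong((Period p) -> p.end));

    private final Map<String, Period> periods = new HashMap<>();

    void setPeriod(String metric, Period candidate) {
        // merge stores the candidate directly when the metric is new,
        // otherwise it stores LATEST.apply(stored, candidate).
        periods.merge(metric, candidate, LATEST);
    }

    Optional<Period> getPeriod(String metric) {
        return Optional.ofNullable(periods.get(metric));
    }
}
```

Because merge handles the missing-key case itself, no nullsFirst comparator is needed in this variant.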

Related

Averaging across multiple fields with IntSummaryStatistics

I'm trying to use Java 8 streams to create a single CarData object, which consists of an average of all the CarData fields in the list coming from getCars;
CarData = new CarData();
CarData.getBodyWeight returns Integer
CarData.getShellWeight returns Integer
List<CarData> carData = carResults.getCars();
IntSummaryStatistics averageBodyWeight = carData.stream()
.mapToInt((x) -> x.getBodyWeight())
.summaryStatistics();
averageBodyWeight.getAverage();
IntSummaryStatistics averageShellWeight = carData.stream()
.mapToInt((x) -> x.getShellWeight())
.summaryStatistics();
averageShellWeight.getAverage();
I don't want to have to put each of these back together in my final returned result.
Visually, this is my list
getCars() : [
{CarData: { getBodyWeight=10, getShellWeight=3 } }
{CarData: { getBodyWeight=6, getShellWeight=5 } }
{CarData: { getBodyWeight=8, getShellWeight=19 } }
]
and the output I'm trying to achieve is a single object that has the average of each of the fields I specify. I'm not sure if I need to use Collectors.averagingInt or some combination of IntSummaryStatistics to achieve this. It is easy to do for one field with either of these techniques; I'm just not sure what I'm missing when using multiple integer fields.
{CarData: { getBodyWeight=8, getShellWeight=9 } }
Starting with JDK 12, you can use the following solution:
CarData average = carData.stream().collect(Collectors.teeing(
Collectors.averagingInt(CarData::getBodyWeight),
Collectors.averagingInt(CarData::getShellWeight),
(avgBody, avgShell) -> new CarData(avgBody.intValue(), avgShell.intValue())));
For older Java versions, you can either add the teeing implementation of this answer to your code base and use it exactly as above, or create a custom collector tailored to your task, as shown in Andreas’ answer.
Or consider that streaming twice over a List in memory is not necessarily worse than doing two operations in one stream, both, readability- and performance-wise.
Note that calling intValue() on Double objects has the same behavior as the (int) casts in Andreas’ answer. So in either case, you have to adjust the code if other rounding behavior is intended.
Or consider using a different result object, capable of holding two floating point values for the averages.
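For comparison, the two-pass variant mentioned above might look like the following sketch. The CarData class here is a hypothetical minimal version with a two-argument constructor, which the question does not show. Averages are truncated with an (int) cast, and an empty list yields zeros, which is an arbitrary choice made for this sketch.

```java
import java.util.List;

// Hypothetical minimal CarData; the real class is not shown in the question.
final class CarData {
    private final int bodyWeight;
    private final int shellWeight;

    CarData(int bodyWeight, int shellWeight) {
        this.bodyWeight = bodyWeight;
        this.shellWeight = shellWeight;
    }

    int getBodyWeight() {
        return bodyWeight;
    }

    int getShellWeight() {
        return shellWeight;
    }
}

final class TwoPassAverage {
    // Streams over the in-memory list once per field.
    static CarData average(List<CarData> cars) {
        double body = cars.stream().mapToInt(CarData::getBodyWeight).average().orElse(0);
        double shell = cars.stream().mapToInt(CarData::getShellWeight).average().orElse(0);
        // (int) truncates toward zero; use Math.round if rounding is intended.
        return new CarData((int) body, (int) shell);
    }
}
```

For the three cars listed in the question this gives a body weight of 8 and a shell weight of 9.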
You need to write your own Collector, something like this:
class CarDataAverage {
public static Collector<CarData, CarDataAverage, Optional<CarData>> get() {
return Collector.of(CarDataAverage::new, CarDataAverage::add,
CarDataAverage::combine,CarDataAverage::finish);
}
private long sumBodyWeight;
private long sumShellWeight;
private int count;
private void add(CarData carData) {
this.sumBodyWeight += carData.getBodyWeight();
this.sumShellWeight += carData.getShellWeight();
this.count++;
}
private CarDataAverage combine(CarDataAverage that) {
this.sumBodyWeight += that.sumBodyWeight;
this.sumShellWeight += that.sumShellWeight;
this.count += that.count;
return this;
}
private Optional<CarData> finish() {
if (this.count == 0)
return Optional.empty();
// adjust as needed if averages should be rounded
return Optional.of(new CarData((int) (this.sumBodyWeight / this.count),
(int) (this.sumShellWeight / this.count)));
}
}
You then use it like this:
List<CarData> list = ...
Optional<CarData> averageCarData = list.stream().collect(CarDataAverage.get());

Efficiently search a list of consecutive date periods

I have a structure which contains consecutive time periods (without overlap) and a certain value.
class Record {
private TimeWindow timeWindow;
private String value;
}
interface TimeWindow {
LocalDate getBeginDate();
LocalDate getEndDate(); //Can be null
}
My goal is to implement a function which takes a date and figures out the value.
A naive implementation could be to loop through all records until the date matches the window.
class RecordHistory {
private List<Record> history;
public String getValueForDate(LocalDate date) {
for (Record record : history) {
if (record.dateMatchesWindow(date)){
return record.getValue();
}
}
return null; //or something similar
}
}
class Record {
private TimeWindow timeWindow;
private String value;
public boolean dateMatchesWindow(LocalDate subject) {
return !subject.isBefore(timeWindow.getBeginDate()) && (timeWindow.getEndDate() == null || !subject.isAfter(timeWindow.getEndDate()));
}
public String getValue(){
return value;
}
}
The origin of these values are from database queries (no chance to change the structure of the tables). The list of Records could be small or huge, and the dates vary from the start of the history until the end. However, the same date will not be calculated twice for the same RecordHistory. There will be multiple RecordHistory objects, the values represent different attributes.
Is there an efficient way to search this structure?
You can use binary search to get the matching Record (if such a record exists) in O(log n) time.
Java already has data structures that do that for you, e.g. the TreeMap. You can map every Record to its starting time, then get the floorEntry for a given time, and see whether it's a match.
// create map (done only once, of course)
TreeMap<LocalDate, Record> records = new TreeMap<>();
for (Record r : recordList) {
records.put(r.getTimeWindow().getBeginDate(), r);
}
// find record for a given date
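public String getValueForDate(LocalDate date) {
Map.Entry<LocalDate, Record> floor = records.floorEntry(date);
// floorEntry returns null when the date lies before every begin date
if (floor != null && floor.getValue().dateMatchesWindow(date)) {
return floor.getValue().getValue();
}
return null;
}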
If the entries are non-overlapping, and if the floor entry is not a match, then no other entry will be.
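A self-contained version of this lookup could look like the sketch below. RecordLookup and its nested Window class are hypothetical simplifications of the question's Record/TimeWindow pair (inclusive begin and end dates, a null end meaning "still open"); only TreeMap.floorEntry is taken from the answer.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

final class RecordLookup {
    // Hypothetical flattened record: the window ends at 'end' (inclusive), or never if end is null.
    private static final class Window {
        final LocalDate end;
        final String value;

        Window(LocalDate end, String value) {
            this.end = end;
            this.value = value;
        }
    }

    // Windows keyed by their begin date; built once, queried many times.
    private final TreeMap<LocalDate, Window> byBegin = new TreeMap<>();

    void add(LocalDate begin, LocalDate end, String value) {
        byBegin.put(begin, new Window(end, value));
    }

    // Returns the value whose window contains the date, or null if there is none.
    String valueFor(LocalDate date) {
        // Greatest begin date <= date; null if the date precedes every window.
        Map.Entry<LocalDate, Window> floor = byBegin.floorEntry(date);
        if (floor == null) {
            return null;
        }
        Window window = floor.getValue();
        boolean inside = window.end == null || !date.isAfter(window.end);
        return inside ? window.value : null;
    }
}
```

Each query costs O(log n) after the one-time O(n log n) construction of the map, so this pays off when many dates are looked up in the same history.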

Jenetics: how to find a subset of a set using a GA

I am trying to find the best subset of a set. Imagine that we need to find a subset of objects, and we have a fitness function for this subset. At the beginning we should create a population of subsets, and then use a GA to try to find the best subset.
I would like to use Jenetics.io, but I do not know how to use it in this case. The problem for me is that a chromosome is a very different data structure from a subset.
I would like to have a function (population, fitness function) that does all the needed work.
I tried to understand how exactly Jenetics works. Maybe I am wrong, but I think there is no way to make it work the way I want.
Please give me some advice; maybe there is an option to use Jenetics in this case?
There is a sub-set example in the Jenetics library. Essentially, it has the following form:
class SubsetExample
implements Problem<ISeq<MyObject>, EnumGene<MyObject>, Double>
{
// Define your basic set here.
private final ISeq<MyObject> basicSet = ISeq.empty();
private final int subSetSize = 5;
@Override
public Function<ISeq<MyObject>, Double> fitness() {
return subSet -> {
assert(subSet.size() == subSetSize);
double fitness = 0;
for (MyObject obj : subSet) {
// Do some fitness calculation
}
return fitness;
};
}
@Override
public Codec<ISeq<MyObject>, EnumGene<MyObject>> codec() {
return codecs.ofSubSet(basicSet, subSetSize);
}
public static void main(final String[] args) {
final SubsetExample problem = new SubsetExample();
final Engine<EnumGene<MyObject>, Double> engine = Engine.builder(problem)
.minimizing()
.maximalPhenotypeAge(5)
.alterers(
new PartiallyMatchedCrossover<>(0.4),
new Mutator<>(0.3))
.build();
final Phenotype<EnumGene<MyObject>, Double> result = engine.stream()
.limit(limit.bySteadyFitness(55))
.collect(EvolutionResult.toBestPhenotype());
System.out.print(result);
}
}
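To make the mapping between a chromosome and a subset more concrete, here is a plain-Java sketch that does not use Jenetics at all. It assumes a genotype is simply an array of distinct indices into the basic set, which is decoded into the subset before the fitness function sees it. SubsetEncoding, decode and fitness are hypothetical names for illustration and say nothing about how the library's codec is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class SubsetEncoding {
    // Decodes a genotype (indices into the basic set) into the subset it stands for.
    static <T> List<T> decode(int[] genes, List<T> basicSet) {
        List<T> subset = new ArrayList<>(genes.length);
        for (int gene : genes) {
            subset.add(basicSet.get(gene));
        }
        return subset;
    }

    // The fitness function is written against the subset, never against the raw genes.
    static <T> double fitness(int[] genes, List<T> basicSet,
                              ToDoubleFunction<List<T>> subsetFitness) {
        return subsetFitness.applyAsDouble(decode(genes, basicSet));
    }
}
```

The point of the codec in the answer is the same separation: the evolution engine works on genes, while the fitness function only ever receives the decoded subset.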

non-locking threading code using atomic types when implementing a sliding window class for time

I am trying to understand this code from yammer metrics. The confusion starts with the trim method and the call to trim in both update and getSnapShot. Could someone explain the logic here say for a 15 min sliding window? Why would you want to clear the map before passing it into SnapShot (this is where the stats of the window are calculated).
package com.codahale.metrics;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
public class SlidingTimeWindowReservoir implements Reservoir {
// allow for this many duplicate ticks before overwriting measurements
private static final int COLLISION_BUFFER = 256;
// only trim on updating once every N
private static final int TRIM_THRESHOLD = 256;
private final Clock clock;
private final ConcurrentSkipListMap<Long, Long> measurements;
private final long window;
private final AtomicLong lastTick;
private final AtomicLong count;
public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit) {
this(window, windowUnit, Clock.defaultClock());
}
public SlidingTimeWindowReservoir(long window, TimeUnit windowUnit, Clock clock) {
this.clock = clock;
this.measurements = new ConcurrentSkipListMap<Long, Long>();
this.window = windowUnit.toNanos(window) * COLLISION_BUFFER;
this.lastTick = new AtomicLong();
this.count = new AtomicLong();
}
@Override
public int size() {
trim();
return measurements.size();
}
@Override
public void update(long value) {
if (count.incrementAndGet() % TRIM_THRESHOLD == 0) {
trim();
}
measurements.put(getTick(), value);
}
@Override
public Snapshot getSnapshot() {
trim();
return new Snapshot(measurements.values());
}
private long getTick() {
for (; ; ) {
final long oldTick = lastTick.get();
final long tick = clock.getTick() * COLLISION_BUFFER;
// ensure the tick is strictly incrementing even if there are duplicate ticks
final long newTick = tick > oldTick ? tick : oldTick + 1;
if (lastTick.compareAndSet(oldTick, newTick)) {
return newTick;
}
}
}
private void trim() {
measurements.headMap(getTick() - window).clear();
}
}
Two bits of information from the documentation
ConcurrentSkipListMap is sorted according to the natural ordering of its keys
That is the data structure that holds all measurements. The key is a long that is basically the current time, so measurements indexed by time are sorted by time.
.headMap(K toKey) returns a view of the portion of this map whose keys are strictly less than toKey.
The magic code in getTick makes sure that a time value is never used twice (it simply takes oldTick + 1 if that would happen). COLLISION_BUFFER is a bit tricky to understand, but it basically ensures that even though Clock#getTick() returns the same value, you get new values that don't collide with the next tick from the clock.
E.g.
Clock.getTick() returns 0 -> modified to 0 * 256 = 0
Clock.getTick() returns 1 -> modified to 1 * 256 = 256
-> 256 values room in between.
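The effect of the collision buffer can be traced with a small stand-alone sketch. TickSource is a hypothetical class that copies the logic of getTick but takes the raw clock value as a parameter instead of reading a Clock, so that repeated clock readings can be simulated.

```java
import java.util.concurrent.atomic.AtomicLong;

final class TickSource {
    // Room for this many distinct keys per raw clock tick.
    private static final int COLLISION_BUFFER = 256;

    private final AtomicLong lastTick = new AtomicLong();

    // Returns a strictly increasing key for the given raw clock reading.
    long next(long clockTick) {
        for (;;) {
            long oldTick = lastTick.get();
            long tick = clockTick * COLLISION_BUFFER;
            // A repeated (or older) clock reading falls back to "previous key + 1".
            long newTick = tick > oldTick ? tick : oldTick + 1;
            if (lastTick.compareAndSet(oldTick, newTick)) {
                return newTick;
            }
        }
    }
}
```

On a fresh TickSource, calling next(1) three times returns 256, 257 and 258, and a following next(2) returns 512, so the repeated readings used up only part of the room reserved before the next clock tick.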
Now trim() does
measurements.headMap(getTick() - window).clear();
This calculates the "current time", subtracts the time window and uses that time to get the portion of the map that is older than "window ticks ago". Clearing that portion will also clear it in the original map. It's not clearing the whole map, just that part.
-> trim removes values that are too old.
On update (every TRIM_THRESHOLD-th call in this code) old values are removed, otherwise the map would grow too large. When creating the Snapshot the same thing happens, so those old values are not included.
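That clearing a headMap view only removes the old part of the backing map can be checked in isolation. The following is a minimal sketch with made-up keys and values; TrimDemo and its methods are hypothetical names, and only headMap(...).clear() is taken from the reservoir code.

```java
import java.util.concurrent.ConcurrentSkipListMap;

final class TrimDemo {
    // Removes every entry whose key is strictly less than (now - window).
    static void trim(ConcurrentSkipListMap<Long, Long> measurements, long now, long window) {
        // headMap returns a live view, so clearing it deletes from the backing map.
        measurements.headMap(now - window).clear();
    }

    // Sample data: keys 0..9 play the role of ticks, values are arbitrary.
    static ConcurrentSkipListMap<Long, Long> sample() {
        ConcurrentSkipListMap<Long, Long> measurements = new ConcurrentSkipListMap<>();
        for (long tick = 0; tick < 10; tick++) {
            measurements.put(tick, tick * 100);
        }
        return measurements;
    }
}
```

With the sample map, trim(map, 9, 4) removes keys 0 to 4 and leaves keys 5 to 9 untouched.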
The endless for loop in getTick is another trick: it uses the atomic compare-and-set method to ensure that, once you are ready to update the value, nothing has changed the value in between. If that happens, the whole loop starts over and refreshes its starting value. The basic schema is
for (; ; ) {
long expectedOldValue = atomic.get();
// other threads can change the value of atomic here..
long modified = modify(expectedOldValue);
// we can only set the new value if the old one is still the same
if (atomic.compareAndSet(expectedOldValue, modified)) {
return modified;
}
}
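Since Java 8 the same retry loop is also available as AtomicLong.updateAndGet, which re-applies a side-effect-free function until its compare-and-set succeeds. The sketch below uses it for the tick logic and lets several threads hammer the same AtomicLong; CasDemo and its method names are hypothetical, and the constant tick of 0 simulates a clock that never advances.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class CasDemo {
    // Same schema as the explicit loop: read, modify, compareAndSet, retry on conflict.
    static long nextTick(AtomicLong lastTick, long tick) {
        return lastTick.updateAndGet(old -> tick > old ? tick : old + 1);
    }

    // Calls nextTick from several threads and counts how many distinct ticks came out.
    static int distinctTicks(int threads, int perThread) {
        AtomicLong lastTick = new AtomicLong();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    seen.add(nextTick(lastTick, 0L));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

Every successful compare-and-set moves the value forward by at least one, so distinctTicks(4, 1000) yields 4000 distinct values without any lock.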

How to re-order a List<String>

I have created the following method:
public List<String> listAll() {
List worldCountriesByLocal = new ArrayList();
for (Locale locale : Locale.getAvailableLocales()) {
final String isoCountry = locale.getDisplayCountry();
if (isoCountry.length() > 0) {
worldCountriesByLocal.add(isoCountry);
Collections.sort(worldCountriesByLocal);
}
}
return worldCountriesByLocal;
}
It's pretty simple and it returns a list of world countries in the user's locale. I then sort it to get it in alphabetical order. This all works perfectly (except I seem to occasionally get duplicates of countries!).
Anyway, what I need is to place the US, and UK at the top of the list regardless. The problem I have is that I can't isolate the index or the string that will be returned for the US and UK because that is specific to the locale!
Any ideas would be really appreciated.
It sounds like you should implement your own Comparator<Locale> to compare two locales with the following steps:
If the locales are the same, return 0
If one locale is the US, make that "win"
If one locale is the UK, make that "win"
Otherwise, use o1.getDisplayCountry().compareTo(o2.getDisplayCountry()) (i.e. delegate to existing behaviour)
(This will put the US before the UK.)
Then call Collections.sort with an instance of your custom comparator.
Do all of this before extracting the country names - then extract them from the sorted list.
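The steps above can be sketched as follows. This is one possible reading rather than the answerer's code: CountryOrder, PRIORITY and the display parameter are hypothetical names, priority is decided by ISO country code ("GB" is the code behind Locale.UK), and duplicates are avoided by keeping a single locale per country code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class CountryOrder {
    // ISO country codes that go first, in this order.
    private static final List<String> PRIORITY = Arrays.asList("US", "GB");

    private static int rank(Locale locale) {
        int index = PRIORITY.indexOf(locale.getCountry());
        return index < 0 ? PRIORITY.size() : index;
    }

    // Priority countries "win"; everything else is ordered by display name.
    static Comparator<Locale> priorityFirst(Locale display) {
        return (a, b) -> {
            int byRank = Integer.compare(rank(a), rank(b));
            return byRank != 0
                    ? byRank
                    : a.getDisplayCountry(display).compareTo(b.getDisplayCountry(display));
        };
    }

    static List<String> countryNames(Locale display) {
        // One locale per country code, so en_US and es_US do not both appear.
        Map<String, Locale> byCountry = new TreeMap<>();
        for (Locale locale : Locale.getAvailableLocales()) {
            if (!locale.getCountry().isEmpty()) {
                byCountry.putIfAbsent(locale.getCountry(), locale);
            }
        }
        List<Locale> locales = new ArrayList<>(byCountry.values());
        locales.sort(priorityFirst(display));
        // Extract the names only after sorting, as suggested above.
        List<String> names = new ArrayList<>();
        for (Locale locale : locales) {
            names.add(locale.getDisplayCountry(display));
        }
        return names;
    }
}
```

Because the comparison uses country codes rather than display names, the US and UK entries are found no matter which language the names are rendered in.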
You could also use a TreeSet to eliminate duplicates and your own Comparator to bring US and GB up to the start.
You are getting duplicates (which this will eliminate) because there are often more than one locale per country. There is a US(Spanish) as well as a US(English) and there are three Switzerlands (French, German and Italian) for example.
public class AllLocales {
// Which Locales get priority.
private static final Locale[] priorityLocales = {
Locale.US,
Locale.UK
};
private static class MyLocale implements Comparable<MyLocale> {
// My Locale.
private final Locale me;
public MyLocale(Locale me) {
this.me = me;
}
// Convenience
public String getCountry() {
return me.getCountry();
}
@Override
public int compareTo(MyLocale it) {
// No duplicates in the country field.
if (getCountry().equals(it.getCountry())) {
return 0;
}
// Check for priority ones.
for (int i = 0; i < priorityLocales.length; i++) {
Locale priority = priorityLocales[i];
// I am a priority one.
if (getCountry().equals(priority.getCountry())) {
// I come first.
return -1;
}
// It is a priority one.
if (it.getCountry().equals(priority.getCountry())) {
// It comes first.
return 1;
}
}
// Default to straight comparison.
return getCountry().compareTo(it.getCountry());
}
}
public static List<String> listAll() {
Set<MyLocale> byLocale = new TreeSet<>();
// Gather them all up.
for (Locale locale : Locale.getAvailableLocales()) {
final String isoCountry = locale.getDisplayCountry();
if (isoCountry.length() > 0) {
//System.out.println(locale.getCountry() + ":" + isoCountry + ":" + locale.getDisplayName());
byLocale.add(new MyLocale(locale));
}
}
// Roll them out of the set.
ArrayList<String> list = new ArrayList<>();
for (MyLocale l : byLocale) {
list.add(l.getCountry());
}
return list;
}
public static void main(String[] args) throws InterruptedException {
// Some demo usages.
List<String> locales = listAll();
System.out.println(locales);
}
}
Yes, when you sort, just provide your own comparator:
Collections.sort(worldCountriesByLocal, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
if (o1.equals(TOP_VALUE))
return -1;
if (o2.equals(TOP_VALUE))
return 1;
return o1.compareTo(o2);
}
});
where TOP_VALUE is the value that you always want on top.
I would write my own POJO with a sort token consisting of an integer assigning priority (e.g. 0 for US, 1 for UK, 2 for everyone else), then some delimiter and then the country name. Then I would put the array in a HashMap keyed by that sort ID and the POJO as the val. Then I would sort the keys out of the map and iterate through the sorting and retrieve the plain country name for each sorted key.
E.g.
2.Sweden
2.France
2.Tanzania
0.US
1.UK
sorts
0.US
1.UK
2.France
2.Sweden
2.Tanzania
EDIT: a POJO is needed only if you have more fields other than the country name. If it is just the country name, I would set the sort ID as the hash key and the country name as the val and skip the POJO part.
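A compact sketch of the sort-token idea follows. It deviates from the description in one labeled way: a TreeMap keeps the tokens sorted directly instead of sorting the keys of a HashMap afterwards. SortTokens and sortWithTokens are hypothetical names, and the priority list is passed in as plain country names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class SortTokens {
    // Sorts countries by a "<rank>.<name>" token, e.g. "0.US", "1.UK", "2.France".
    static List<String> sortWithTokens(List<String> countries, List<String> priority) {
        TreeMap<String, String> byToken = new TreeMap<>();
        for (String country : countries) {
            int index = priority.indexOf(country);
            // Everything outside the priority list shares the last rank.
            int rank = index < 0 ? priority.size() : index;
            // Assumes at most nine priority entries: with two-digit ranks the
            // lexicographic token order would no longer match the numeric order.
            byToken.put(rank + "." + country, country);
        }
        // Values come out in token order; the map also drops duplicate names.
        return new ArrayList<>(byToken.values());
    }
}
```

Called with the five countries from the example and the priority list [US, UK], it returns US, UK, France, Sweden, Tanzania.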
