Dynamic sorting with Java 8 stream? - java

Long story short, I am using a JDBI DAO to access data, and it is hard to write a query with a dynamic ORDER BY in it. So I am planning to use a Java 8 stream to sort as a post-processing step after the query results are fetched.
The problem is that the comparator needs a method reference to a getter of the object, which is fixed at compile time:
shepherds = shepherds.stream()
.sorted(Comparator.comparing(Shepherd::getId).reversed())
.collect(Collectors.toList());
How can I do this dynamically, driven by variables like these
orderBy = id
orderDirection = ASC
so that I can parameterize this method call?
e.g.
if(orderDirection.equals("ASC"))
shepherds.stream().sorted(Comparator.comparing(orderBy));
else
shepherds.stream().sorted(Comparator.comparing(orderBy).reversed());

The simplest way (assuming you really can't do the ordering on the DB side, which should be your main focus) is probably to build a Map, where the key would look like:
class Key {
String field;
Direction direction; // enum
// constructor/getters/hashCode/equals
}
And simply create this Map upfront:
Map<Key, Comparator<Shepherd>> map = new HashMap<>();
map.put(new Key("id", Direction.ASC), Comparator.comparing(Shepherd::getId));
map.put(new Key("id", Direction.DESC), Comparator.comparing(Shepherd::getId).reversed());
Alternatively, I think it would be possible to build the comparator fully dynamically via LambdaMetafactory, but that is rather complicated.
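A minimal sketch of the map-based lookup, under a few assumptions: the Shepherd class here is a hypothetical stand-in with only id and name, and the names BY_FIELD, comparatorFor and sort are made up for illustration. It is a variation on the Map described above: the map is keyed by field name only and the direction is applied with reversed(), so each field needs a single entry.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DynamicSortExample {

    // Hypothetical stand-in for the Shepherd entity from the question.
    static class Shepherd {
        private final long id;
        private final String name;

        Shepherd(long id, String name) {
            this.id = id;
            this.name = name;
        }

        long getId() { return id; }
        String getName() { return name; }
    }

    // One ascending comparator per sortable field, registered up front.
    private static final Map<String, Comparator<Shepherd>> BY_FIELD = new HashMap<>();
    static {
        BY_FIELD.put("id", Comparator.comparingLong(Shepherd::getId));
        BY_FIELD.put("name", Comparator.comparing(Shepherd::getName));
    }

    // Looks up the comparator for a field name and reverses it for "DESC".
    static Comparator<Shepherd> comparatorFor(String orderBy, String orderDirection) {
        Comparator<Shepherd> comparator = BY_FIELD.get(orderBy);
        if (comparator == null) {
            throw new IllegalArgumentException("Unsupported sort field: " + orderBy);
        }
        return "DESC".equalsIgnoreCase(orderDirection) ? comparator.reversed() : comparator;
    }

    // Returns a new sorted list; the input list is left untouched.
    static List<Shepherd> sort(List<Shepherd> shepherds, String orderBy, String orderDirection) {
        return shepherds.stream()
                .sorted(comparatorFor(orderBy, orderDirection))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Shepherd> shepherds = Arrays.asList(
                new Shepherd(2, "Bo"), new Shepherd(1, "Al"), new Shepherd(3, "Cy"));
        for (Shepherd s : sort(shepherds, "id", "DESC")) {
            System.out.println(s.getId() + " " + s.getName());
        }
    }
}
```

Rejecting unknown field names up front also keeps arbitrary user input from reaching the sort logic.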

Related

Apache Flink ListState<String> vs ValueState<List<String>>

Looking at the documentation, it seems like I could use either a ListState or a ValueState<List<String>> to store state. For example the code below:
// Use ListState
ListStateDescriptor<String> lDescriptor = new ListStateDescriptor<String>
("testListState", TypeInformation.of(new TypeHint<String>() {}));
ListState<String> testListState = getRuntimeContext().getListState(lDescriptor);
// Use ValueState
ValueStateDescriptor<List<String>> testDescriptor =
new ValueStateDescriptor<List<String>>("testList",
TypeInformation.of(new TypeHint<List<String>>() {}));
ValueState<List<String>> testState = getRuntimeContext().getState(testDescriptor);
If I need to store a unique list of elements tied to each key, would there be a benefit of using one over the other? The downside of using ListState would be first converting the Iterable to a List<> if I need to modify it before saving the list whereas I could just retrieve the list directly if I use ValueState.
I use ValueState only when I want to store a single value per key. You can use it to store a list, but the code will be more verbose.
With ValueState, you must get the value, modify the list, and then write it back with update(). With ListState, you can add elements to the state directly with add().

Guava Collections - filter by array of String values

I have a list of Event objects. Each object exposes a getId() getter. I need to filter the collection to get only items with a specific id, and I can do that this way:
Lists.newArrayList(Iterables.filter(ret, x->x.getCategoryId().equals(category)));
As a result, I get a new list containing only the items whose getCategoryId() equals my specific category.
Fine so far. The problem: what if, instead of a single specific category, I have an array of String values (all the categories to use as a filter)? It could look as follows:
Lists.newArrayList(Iterables.filter(ret, x->x.getCategoryId().equals(categories.get(0)) || x.getCategoryId().equals(categories.get(1)) || ......../*To the end of the list*/));
As my categories list is dynamic, I need a dynamic query that applies all the || criteria. What is the best approach? Can I loop somehow, or pass the array as the criteria to the filter method?
NOTES: I'm on Android, so:
Java 8 lambdas can be used (as you can see above) to simplify syntax.
Java 8 streams can't be used (because the min API is Android Lollipop). This is why Guava is used to perform the filtering. Please don't propose any solutions based on Java 8 streams.
So, do you have any idea?
All you have to do is build a Guava Predicate. Since you need fast look-up, it might pay off to build a Set from that array first:
Set<String> set = new HashSet<>(Arrays.asList(values));
And then simply replace the Predicate:
x -> set.contains(x.getCategoryId())
Using only Java 8, you'd do something like:
final Set<String> categories = new HashSet<>(Arrays.asList("category 1", "category 2"));
ret.stream()
.filter(x -> categories.contains(x.getCategoryId()))
.collect(Collectors.toList());
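For reference, the same set-based membership test can be sketched with a plain loop, using neither Guava nor streams. This is a stand-in rather than the Guava call itself: the Event class below is hypothetical (only id and categoryId), and filterByCategories is a name made up for this example. With Guava, the wanted.contains(...) check would be the predicate passed to Iterables.filter, as in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CategoryFilterExample {

    // Hypothetical stand-in for the Event class from the question.
    static class Event {
        private final String id;
        private final String categoryId;

        Event(String id, String categoryId) {
            this.id = id;
            this.categoryId = categoryId;
        }

        String getId() { return id; }
        String getCategoryId() { return categoryId; }
    }

    // Keeps the events whose category is in the given list, preserving order.
    static List<Event> filterByCategories(List<Event> events, List<String> categories) {
        // Build the set once so each lookup is a hash lookup instead of a list scan.
        Set<String> wanted = new HashSet<>(categories);
        List<Event> result = new ArrayList<>();
        for (Event event : events) {
            if (wanted.contains(event.getCategoryId())) {
                result.add(event);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("e1", "sports"),
                new Event("e2", "music"),
                new Event("e3", "news"));
        for (Event e : filterByCategories(events, Arrays.asList("sports", "news"))) {
            System.out.println(e.getId() + " " + e.getCategoryId());
        }
    }
}
```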

Broadcasting a HashMap in Flink

I am using Flink v.1.4.0.
I am working with the DataSet API and one of the things I want to try is very similar to how broadcast variables are used in Apache Spark.
Practically, I want to apply a map function on a DataSet, go through each of the elements in the DataSet and search for it in a HashMap; if the search element is present in the Map then retrieve the respective value.
The HashMap is very big and I don't know if (since I haven't even built my solution) it needs to be Serializable to be transmitted and used by all workers concurrently.
In general, the solution I have in mind would look like this:
Map<String, T> hashMap = new ... ;
DataSet<Point> points = env.readCsv(...);
points
.map(point -> hashMap.getOrDefault(point.getId(), 0))
...
but I don't know whether this would work or whether it would be efficient in any way. After a bit of searching I found a much better example here, according to which one can use broadcast variables in Flink to broadcast a List as follows:
DataSet<Point> points = env.readCsv(...);
DataSet<Centroid> centroids = ... ; // some computation
points.map(new RichMapFunction<Point, Integer>() {
private List<Centroid> centroids;
@Override
public void open(Configuration parameters) {
this.centroids = getRuntimeContext().getBroadcastVariable("centroids");
}
@Override
public Integer map(Point p) {
return selectCentroid(centroids, p);
}
}).withBroadcastSet("centroids", centroids);
However, .getBroadcastVariable() seems to only work with a List.
Can someone provide an alternative solution with a HashMap?
How would that solution work?
What is the most efficient way to go about solving this?
Could one use a Flink Managed State to do something similar to how broadcast variables are used? How?
Finally, can I attempt multiple mappings with multiple broadcast variables in a pipeline?
Where do the values of hashMap come from? Two other possible solutions:
Recreate hashMap separately in each instance of your filtering/mapping operator, in its open method. This is probably more efficient per record, but it duplicates the initialisation logic.
Create two DataSets, one for the hashMap values and one for the points, and join them using the desired join strategy. As an analogy, what you are trying to do could be expressed by the SQL query SELECT * FROM points p, hashMap h WHERE h.key = p.id.

Is there a possibility to keep a single map store and use for multiple maps in hazelcast

Currently I am using Hazelcast with HBase as the persistence database.
So far I have 10 maps, each with its own map store, so I have 10 classes that implement MapStore. This makes maintenance complex.
So I wrote a generic map store and configured the same class for all the maps. To make it clear, I did something like this:
Map1 - com.test.GenericMapStore
Map2 - com.test.GenericMapStore
Map3 - com.test.GenericMapStore
...
Map10 - com.test.GenericMapStore
The maps are configured as above.
In the store, storeAll, loadAllKeys and loadAll methods I can check the instance type of the object to work out the map name, which is not a good way.
But for methods like delete, deleteAll and load, I have no clue how to find the map name.
Is there any way to use a single MapStore for all the maps?
What I need is a map store implementation where every MapStore method receives a mapName parameter, so that a common implementation can use it in all the methods.
Example :
Store(String key, Object object, String mapName),
StoreAll(Map, String mapName),
delete(String key, String mapName)
delete(Collections keys, String mapName) ...
If such a way is already available, please let me know.
Thanks, Hazelcast team. You are doing a great job, much appreciated.
Thanks and Regards,
Harry
You should be able to achieve this with a MapStoreFactory.
The MapStoreFactory is called with the name of the map, and you can pass that name into the GenericMapStore.
In your MapStoreFactory:
public MapLoader newMapStore(String mapName, Properties props) {
return new GenericMapStore(mapName);
}
Then in GenericMapStore you will have the mapName available for each operation.

Java Arrays: Identify elements of a vector with constants

I have a String array (String[]) containing several String objects representing XPath queries. These queries are predetermined at design time. This array is passed to an object that executes the queries and then returns a Map<String, ArrayList<String>> with the results.
The map is made like this:
{Query that originated the result, Results vector}
Since I have to take these results and then perform some work with them, I need to know the individual queries. e.g.:
ArrayList<String> firstQueryResults = xpathResults.getObject(modelQueries[0]);
... logic pertaining only to the first query results ...
Retrieving the results by an integer (in the case of the first query, "0") doesn't seem nice to me, so I was wondering if there would be the possibility to identify them via enum-like constants, for better clarity:
... = xpathResults.getObject(QueryDictionary.MODEL_PACKAGE);
... = xpathResults.getObject(QueryDictionary.COMPONENT_PACKAGE);
OR
... = xpathResults.getObject(ModelQueries.PACKAGE);
... = xpathResults.getObject(ComponentQueries.PACKAGE);
I thought of using maps (i.e. Map<String, String> as in Map {ID, Query}), but I would still have to reference the queries via a hardcoded string (e.g. "Package").
I also thought of using enums, but I have several query sets (Model, Component, ...) and I also need to get all the queries in a set in String[] form in order to pass them to the object that performs the queries.
You can use a marker interface:
public interface QueryType {
}
Then your enums can implement this interface:
public enum ModelQueries implements QueryType {
...
}
public enum ComponentQueries implements QueryType {
...
}
and so on.
Then your getObject method can accept a parameter of type QueryType. Were you looking for something like this? Let me know if I haven't understood your question properly.
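A rough sketch of how this could cover both needs from the question (enum constants as keys, and a String[] of all queries in a set). It goes one step beyond a pure marker interface by assuming QueryType declares a getQuery() accessor. The XPath strings, the NAME constant and the queriesOf helper are made up for illustration.

```java
import java.util.Arrays;

class QueryEnumExample {

    // Assumption: the interface carries the query text instead of being a pure marker.
    interface QueryType {
        String getQuery();
    }

    // Placeholder XPath expressions; the real ones come from the application.
    enum ModelQueries implements QueryType {
        PACKAGE("/model/package"),
        NAME("/model/name");

        private final String query;

        ModelQueries(String query) { this.query = query; }

        @Override
        public String getQuery() { return query; }
    }

    enum ComponentQueries implements QueryType {
        PACKAGE("/component/package");

        private final String query;

        ComponentQueries(String query) { this.query = query; }

        @Override
        public String getQuery() { return query; }
    }

    // Collects the query strings of one set, in declaration order,
    // in the String[] form the executing object expects.
    static String[] queriesOf(QueryType[] values) {
        String[] queries = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            queries[i] = values[i].getQuery();
        }
        return queries;
    }

    public static void main(String[] args) {
        String[] modelQueries = queriesOf(ModelQueries.values());
        System.out.println(Arrays.toString(modelQueries));
        // A result lookup could then use ModelQueries.PACKAGE.getQuery() as the map key.
        System.out.println(ModelQueries.PACKAGE.getQuery());
    }
}
```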
