Hazelcast: Predicate Performance Issue - java

I am facing a performance issue in Hazelcast when using a Predicate on a Hazelcast map.
I have a model class as shown below:
public class MyAccount {
private String account;
private Date startDate;
private Date endDate;
private MyEnum accountTypeEnum;
// Overrides equals and hashCode using all the attributes
// Has getters and setters for all the attributes
}
Then I create a Hazelcast map of type IMap<MyAccount, String>, in which I save each MyAccount object as the key and an associated string as its value.
Point to note: I am saving these accounts in different maps (let's say local, state, national and international).
Approximately 180,000 MyAccount objects are created and saved in Hazelcast, in different maps depending on the account's geographical position. Apart from these, Hazelcast stores another 50,000 string objects as keys and values in different maps (excluding the maps mentioned above).
Then I have a method which uses predicate filters on the attributes account, startDate and endDate to filter accounts. Let's call this method filter.
public static Predicate filter(String account, Date date) {
EntryObject entryObject = new PredicateBuilder().getEntryObject();
PredicateBuilder accountPredicate = entryObject.key().get(Constants.ACCOUNT).equal(account);
PredicateBuilder startDatePredicate = entryObject.key().get(Constants.START_DATE).isNull().or(entryObject.key().get(Constants.START_DATE).lessEqual(date));
PredicateBuilder endDatePredicate = entryObject.key().get(Constants.END_DATE).isNull().or(entryObject.key().get(Constants.END_DATE).greaterThan(date));
return accountPredicate.and(startDatePredicate.and(endDatePredicate));
}
private void addIndexesToHazelcast() {
Arrays.asList("LOCAL", "STATE", "NATIONAL", "INTERNATIONAL").forEach(location -> {
IMap<Object, Object> map = hazelcastInstance.getMap(location);
map.addIndex("__key." + "startDate", true);
map.addIndex("__key." + "endDate", true);
map.addIndex("__key." + "account", false);
});
}
Issue: For a particular map, say local, which holds around 80,000 objects, when I use the predicate to fetch the values from the map, it takes around 4 - 7 seconds which is unacceptable.
Predicate predicate = filter(account, date);
Collection<String> values = hazelcast.<MyAccount, String>getMap(mapKey).values(predicate); // This line takes 4-7 secs
I am surprised that the cache takes 4 - 7 seconds to fetch the value for one single account, given that I have added indexes to the Hazelcast maps for the same attributes. This is a massive performance blow.
Could anybody please let me know why this is happening?

Related

How to efficiently compare two objects of same Class and check which are the fields that differ?

I want to write a generic function that accepts two objects of the same entity class, compares their fields, and returns a List of all the changes made to particular fields along with the time of the change.
One among the many entity classes would be say Member as follows
public class Member {
String firstName;
String lastName;
String driverLicenseNumber;
Integer age;
LocalDateTime timestamp;
}
In the DB, I have a table called member_audit that is populated by triggers with the old data whenever there is a change in the member table (and similarly for other entities).
The List of resources I would be returning for each entity looks something like this:
public class MemberAuditsResource {
private String field;
private LocalDateTime on;
private String changeType;
private String oldValue;
private String newValue;
}
I can only think of writing a function for each entity separately like this
private List<MemberAuditsResource> memberCompare(Member obj1, Member obj2) {
//Compare every field in both the objects using if else and populate the resource.
}
I would then call the above function to compare every pair of records in the entity_audit table.
The code to compare every field would be very large, and it would be multiplied across the different entities.
Is there a better and efficient way?
If you extend the idea to comparing whole object graphs, it is not a trivial problem. So the efficient way is not to reinvent the wheel but to use an existing library such as JaVers:
Javers javers = JaversBuilder.javers().build();
Member oldMember = new Member("foo", "chan", "AB12", 21, LocalDateTime.now());
Member newMember = new Member("bar", "chan", "AB12", 22, LocalDateTime.now());
Diff diff = javers.compare(oldMember, newMember);
for(Change change: diff.getChanges()) {
System.out.println(change);
}
Then you get something like:
ValueChange{ 'firstName' changed from 'foo' to 'bar' }
ValueChange{ 'age' changed from '21' to '22' }
Convert both objects to a Map using the JSON objectMapper.convertValue method. Then you can easily compare the keys and values of the two maps and create a list of differences.
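The same key/value comparison can also be sketched without a JSON library by reading the fields through reflection. This is only a minimal sketch, not a replacement for JaVers: `FieldDiff`, `diff` and `SampleMember` are hypothetical names introduced here, only the fields declared directly on the class are compared (no superclass fields, no nested object graphs), and values are compared with `Objects.equals`.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper (not part of JaVers or Jackson): describes one changed field.
final class FieldDiff {
    final String field;
    final Object oldValue;
    final Object newValue;

    FieldDiff(String field, Object oldValue, Object newValue) {
        this.field = field;
        this.oldValue = oldValue;
        this.newValue = newValue;
    }

    @Override
    public String toString() {
        return "'" + field + "' changed from '" + oldValue + "' to '" + newValue + "'";
    }

    // Compares the declared, non-static fields of two objects of the same class.
    static <T> List<FieldDiff> diff(T oldObj, T newObj) {
        if (oldObj.getClass() != newObj.getClass()) {
            throw new IllegalArgumentException("Objects must be of the same class");
        }
        List<FieldDiff> changes = new ArrayList<>();
        for (Field f : oldObj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            f.setAccessible(true);
            try {
                Object before = f.get(oldObj);
                Object after = f.get(newObj);
                if (!Objects.equals(before, after)) {
                    changes.add(new FieldDiff(f.getName(), before, after));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + f.getName(), e);
            }
        }
        return changes;
    }
}

// Assumed sample entity, reduced from the Member class in the question.
final class SampleMember {
    String firstName;
    String lastName;
    Integer age;

    SampleMember(String firstName, String lastName, Integer age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }
}
```

Calling `FieldDiff.diff(new SampleMember("foo", "chan", 21), new SampleMember("bar", "chan", 22))` reports the firstName and age fields. Mapping each `FieldDiff` onto a resource such as MemberAuditsResource is left to the caller.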

Choosing right data structure and design for Java multithreading problem

I am implementing a singleton class that must handle multiple threads accessing its data structure at once.
The class has a method that returns true if the data structure already contains myObject and false otherwise. If the object has not been seen, then the object is added to the data structure.
boolean alreadySeen(MyObject myObject){}
MyObject has two member variables Instant expiration and String id where id acts as my key to decide whether the data structure contains myObject. I cannot change MyObject class. I need to periodically check the expiration of myObjects in the data structure and remove them if they have expired.
So I am looking to use one or more data structures that I can quickly add, delete and search by both expiration and id. I will mostly be adding elements and searching if element exists with the periodic cleanup removing expired elements.
A map like ConcurrentHashMap<id,MyObject> gives me the O(1) insert and delete but it would be O(n) to search through for expired objects.
As mentioned above, I cannot change the MyObject class. So I thought about making a wrapper for that class so I can override equals() and hashCode(), and then use an ordered set like ConcurrentSkipListSet<MyObjectWrapper>(new ExpComparator()). This would let me order the set by expiration date so that I could quickly find the expired ones at the top. However, I believe this would be O(log n) for search and delete.
Is there any better structure I could use? And if not am I better off in the long run with the map at O(1) lookup and add but periodic O(n) for delete of expiration? Or better for set with O(log n) of everything?
Your lookup, add and remove operations can all run in O(1), but at some extra cost:
First, it needs double the memory to store the data.
Second, the expiration time cannot be very accurate.
We need two maps. One stores the objects, with the id as key and the object as value, like Map<id, MyObject>. The other stores the relationship between expiration and objects, like Map<Long, List<MyObject>>, where the key has to be calculated from the expiration time.
Code:
To keep the code simple, I modified your MyObject class:
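class MyObject {
    private long expiration; // expiration time in epoch milliseconds
    private String id;
    public long getExpiration() { return expiration; }
    public String getId() { return id; }
}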
The rest of the code:
private Map<String, MyObject> dataSet = new ConcurrentHashMap<>();
private Map<Long, List<MyObject>> obj2Expiration = new ConcurrentHashMap<>();
public boolean alreadySeen(MyObject myObject) {
    // putIfAbsent makes the check and the insert a single atomic step
    boolean exist = dataSet.putIfAbsent(myObject.getId(), myObject) != null;
    if (!exist) {
        // Objects are grouped into 5-second buckets by expiration time
        Long expirateKey = myObject.getExpiration() / 5000;
        // Create the bucket on first use, then always add the object to it
        obj2Expiration.computeIfAbsent(expirateKey, k -> new CopyOnWriteArrayList<>()).add(myObject);
    }
    return exist;
}
@Scheduled(fixedRate = 5000)
public void remove() {
    long now = System.currentTimeMillis();
    // The previous 5-second bucket has fully expired by now
    long needRemove = now / 5000 - 1;
    // remove() rather than get(), so the emptied bucket does not stay in the map
    Optional.ofNullable(obj2Expiration.remove(needRemove))
        .ifPresent(objects -> objects.forEach(o -> dataSet.remove(o.getId())));
}
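The two-map idea above can also be written as a self-contained class without a scheduler annotation. The following is a sketch under stated assumptions, not a definitive implementation: `SeenItem` stands in for MyObject (which the question says cannot be changed), and `SeenCache`, `BUCKET_MILLIS` and `purge` are hypothetical names. Unlike the answer's cleanup, `purge` takes the current time as a parameter and removes every bucket older than the current one, so a delayed cleanup run does not skip a bucket.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for MyObject: an id plus an expiration time in epoch milliseconds.
final class SeenItem {
    final String id;
    final long expirationMillis;

    SeenItem(String id, long expirationMillis) {
        this.id = id;
        this.expirationMillis = expirationMillis;
    }
}

final class SeenCache {
    static final long BUCKET_MILLIS = 5000; // assumed bucket width

    private final Map<String, SeenItem> byId = new ConcurrentHashMap<>();
    private final Map<Long, List<SeenItem>> byBucket = new ConcurrentHashMap<>();

    // Returns true if an item with this id is already stored; otherwise stores it.
    boolean alreadySeen(SeenItem item) {
        if (byId.putIfAbsent(item.id, item) != null) {
            return true;
        }
        // compute() runs atomically per key, so the add cannot race with purge()
        byBucket.compute(item.expirationMillis / BUCKET_MILLIS, (bucket, items) -> {
            List<SeenItem> list = (items == null) ? new ArrayList<>() : items;
            list.add(item);
            return list;
        });
        return false;
    }

    // Drops every bucket that ended before the bucket containing nowMillis.
    void purge(long nowMillis) {
        long currentBucket = nowMillis / BUCKET_MILLIS;
        for (Long bucket : byBucket.keySet()) {
            if (bucket < currentBucket) {
                List<SeenItem> expired = byBucket.remove(bucket);
                if (expired != null) {
                    for (SeenItem item : expired) {
                        byId.remove(item.id, item);
                    }
                }
            }
        }
    }

    int size() {
        return byId.size();
    }
}
```

Lookups and inserts stay O(1) on average, and a cleanup pass touches only the bucket keys and the expired items rather than every stored object. The price is the one named above: an item can outlive its expiration by up to one bucket width.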

Riak 2i - Update deletes secondary indexes

I am using the official Riak Java client v2.0.2. When I update a previously written value (with 2i indexes), the secondary indexes are not preserved.
This is how I do the update:
Location location = new Location(this.namespace, key);
UpdateValue updateOp = new UpdateValue.Builder(location)
.withFetchOption(FetchValue.Option.DELETED_VCLOCK, true)
.withUpdate(new RiakKVUpdateValue(values))
.build();
And this is my update class:
public class RiakKVUpdateValue extends Update<Map<String, String>> {
private final Map<String, String> value;
public RiakKVUpdateValue(HashMap<String, ByteIterator> values) {
this.value = StringByteIterator.getStringMap(values);
}
@Override
public Map<String, String> apply(Map<String, String> original) {
return this.value;
}
}
I haven't found anything in the docs about updating objects with 2i indexes.
Am I doing something wrong?
Should I do manual Read/Modify/Write?
You have to fetch the index and write it back every time you update the value. See 2i Indexing an Object.
I would suggest creating a field to hold the index and annotating it with @RiakIndex. A field annotated this way is populated with 2i values automatically by the Java client when the object is fetched. Then copy its value in RiakKVUpdateValue.apply() to retain it. Alternatively, fetch and then write back in two separate commands, as you already mentioned. This lets you control which metadata you write back. Don't forget to populate the VClocks.
P.S. Retaining 2i automatically can be a bad idea since it's not obvious that a user will want to keep old 2i values. I believe that's why it is left up to the user to decide.

Google cloud datastore runquery get as date variable?

I'm using Java. This is my code:
RunQueryResponse response = dataset.runQuery("project_name", queryrequest).execute();
followed by response.toString(). I get all the results I want, but there are many of them.
How can I get a single value for each field, for example by putting the results in an array or something I can walk through with a for loop or an iterator?
Thanks
----add code--------
Here is some of my code:
Iterator<EntityResult> entity_interator = response.getBatch().getEntityResults().iterator();
Map<String, Property> entity;
while(entity_interator.hasNext()){
entity = entity_interator.next().getEntity().getProperties();
String first = entity.get("First").toString();
String last = entity.get("Last").toString();
String time = entity.get("Time").toString();
System.out.println(first);
System.out.println(last);
System.out.println(time);
}
and the response:
{"values":[{"stringValue":"first name"}]}
{"values":[{"stringValue":"last name"}]}
{"values":[{"dateTimeValue":"2013-08-28T08:21:58.498Z"}]}
How can I get the time as a date-time variable, and the first name and last name without {"values":[{"stringValue":" and all the other junk?
Each Property may contain one or more Value objects, and each Value can contain one of several different value types (each type has its own field). If your properties are single-valued, you can just take the first one from the list:
Iterator<EntityResult> entity_interator = response.getBatch().getEntityResults().iterator();
Map<String, Property> entity;
while(entity_interator.hasNext()){
entity = entity_interator.next().getEntity().getProperties();
String first = entity.get("First").getValues().get(0).getStringValue();
String last = entity.get("Last").getValues().get(0).getStringValue();
DateTime dateTime = entity.get("Time").getValues().get(0).getDateTimeValue();
Date time = new Date(dateTime.getValue());
System.out.println(first);
System.out.println(last);
System.out.println(time);
}
Note that this is an example of using the JSON API (full JavaDoc reference). The samples in the Google Cloud Datastore docs are using the Protocol Buffers API which is structured in the same way but has slightly different syntax.
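If only the serialized timestamp string is at hand, such as the dateTimeValue printed in the question, it can be converted with the standard library alone. This is a small sketch that assumes the string is always in the UTC form shown in the question's output; `TimestampParsing` and `parseUtcTimestamp` are hypothetical names, not part of the Datastore client.

```java
import java.time.Instant;
import java.util.Date;

final class TimestampParsing {
    // Parses a timestamp such as "2013-08-28T08:21:58.498Z" into a java.util.Date.
    // Instant.parse throws DateTimeParseException if the text has another shape.
    static Date parseUtcTimestamp(String text) {
        return Date.from(Instant.parse(text));
    }
}
```

For example, `TimestampParsing.parseUtcTimestamp("2013-08-28T08:21:58.498Z").getTime()` yields 1377678118498, the number of milliseconds since the epoch.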

Objectify app engine - querying embedded entities using a list of values

I am using objectify-appengine framework for querying. Here is my simplified problem: Consider these 2 classes:-
public class Customer {
@Id private String email;
@Embedded private Account account = new Account(); // note embedded annotation
}
and
public class Account {
private String number; //example: 1234
}
The following query works & gives me 1 customer:
Objectify ofy = ObjectifyService.begin();
ofy.query(Customer.class).filter("account.number = ", "1234");
Question:
However, what if I have a List of values (account numbers)? Is there a way to fetch them in one query? I tried passing a list of account numbers like this:
ofy.query(Customer.class).filter("account.number = ", myAccountNumberList);
But it fails, saying:
java.lang.IllegalArgumentException: A collection of values is not allowed.
Thoughts?
filter("account.number IN", theList)
Note that IN just causes the GAE SDK to issue multiple queries for you, merging the results:
The IN operator also performs multiple queries, one for each item in the specified list, with all other filters the same and the IN filter replaced with an EQUAL filter. The results are merged, in the order of the items in the list. If a query has more than one IN filter, it is performed as multiple queries, one for each possible combination of values in the IN lists.
From https://developers.google.com/appengine/docs/java/datastore/queries
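The quoted behaviour can be pictured with a purely in-memory model. The sketch below does not call Objectify or the Datastore API; `CustomerRow`, `InFilterModel`, `equalQuery` and `inQuery` are hypothetical names used only to illustrate that an IN filter behaves like one equality query per listed value, with the results merged in list order. Any de-duplication the real service may apply is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical flattened view of a Customer with its embedded account number.
final class CustomerRow {
    final String email;
    final String accountNumber;

    CustomerRow(String email, String accountNumber) {
        this.email = email;
        this.accountNumber = accountNumber;
    }
}

final class InFilterModel {
    // Models a single query with an EQUAL filter on the account number.
    static List<CustomerRow> equalQuery(List<CustomerRow> table, String accountNumber) {
        List<CustomerRow> result = new ArrayList<>();
        for (CustomerRow row : table) {
            if (row.accountNumber.equals(accountNumber)) {
                result.add(row);
            }
        }
        return result;
    }

    // Models IN: one equality query per value, merged in the order of the list.
    static List<CustomerRow> inQuery(List<CustomerRow> table, List<String> accountNumbers) {
        List<CustomerRow> merged = new ArrayList<>();
        for (String number : accountNumbers) {
            merged.addAll(equalQuery(table, number));
        }
        return merged;
    }
}
```

In this model a list of N account numbers costs N equality queries, which is why a long list passed to IN is not cheaper than issuing the queries yourself.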
