Java cache design question

I need to develop a simple cache (no concurrency or refresh required) to hold different types of objects. These objects may be looked up in different ways. For example, say we are caching a Book object that has an ISBN and an author. A book can be looked up either by ISBN, like
Book lookupBookByISBN(String isbn);
OR it could be a lookupByAuthor like
List lookupBookByAuthor(String authorName);
Put simply, I could have a Cache object with two maps: one storing Book objects by ISBN and another storing the same objects by author name.
But there will be many object types like Book, and I do not want to store the same object in several maps just because it can be looked up in several ways.
One option I was considering is a single Map whose key is a custom Key object and whose value is Object (so that I can store any object or list of objects).
The Key object is an immutable object which might look like this
public class Key {
private final String keyName;
private final String keyValue;
public Key(String name,String value) {
this.keyName= name;
this.keyValue = value;
}
//getters for keyName and value
//hashcode and equals to be put as a key of a map
}
Implementation of lookup method will be
public Book lookupBookByISBN(String isbn) {
Key key = new Key("ISBN",isbn);
return ((Book)map.get(key));
}
public List<Book> lookupBookByAuthor(String authorName) {
Key key = new Key("Author",authorName);
return (List<Book>) map.get(key);
}
The insert into the map needs to be done carefully, as the same object has to be inserted into the map twice.
public void putBook(Book book) {
Key key = new Key("ISBN",book.getISBN());
map.put(key,book);
key = new Key("Author",book.getAuthor());
List<Book> list = (List<Book>) map.get(key);
if (null == list) {
list = new ArrayList<Book>();
map.put(key,list); // store the list, not the book, under the author key
}
list.add(book);
}
I somehow feel this might not be a good idea, since I would need to put the same object into the map N times, once for each of the N dimensions by which I need to look it up.
Is there any other, better way to design this?

When you store an object in a collection (of any kind), you only store a reference to the object. So go ahead and use multiple maps, you will have only one copy of the actual object.
For example
Map<String,MyBigObject> map1 = new HashMap<>();
Map<String,MyBigObject> map2 = new HashMap<>();
MyBigObject mbo = new MyBigObject(...);
map1.put(mbo.getISBN(),mbo);
map2.put(mbo.getAuthor(),mbo);
The single object mbo is now accessible via either map.
EDIT: If you're worried about multiple maps complicating the code, write a class MultiMap that contains all the maps and manages them in whatever way you want. It could have a method add(MyBigObject...) that inserts the object into all the maps, using the various property accessors to set the correct key, and lookup methods such as getByAuthor(...) and getByISBN(...), and whatever else you need. Hide all the complexity behind a simple unified interface.
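A minimal sketch of that facade idea, assuming a hypothetical Book with only an ISBN and an author (the names BookCache, lookupByIsbn and lookupByAuthor are illustrative, not from the question). Both maps hold references to the same Book instances, so nothing is duplicated:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal Book; a real class would carry more fields.
final class Book {
    private final String isbn;
    private final String author;

    Book(String isbn, String author) {
        this.isbn = isbn;
        this.author = author;
    }

    String getIsbn() { return isbn; }
    String getAuthor() { return author; }
}

// One facade, two indexes over the same Book instances.
final class BookCache {
    private final Map<String, Book> byIsbn = new HashMap<>();
    private final Map<String, List<Book>> byAuthor = new HashMap<>();

    void put(Book book) {
        Book previous = byIsbn.put(book.getIsbn(), book);
        if (previous != null) {
            // The ISBN was already cached: drop the stale entry from the author index.
            List<Book> old = byAuthor.get(previous.getAuthor());
            if (old != null) {
                old.remove(previous);
            }
        }
        byAuthor.computeIfAbsent(book.getAuthor(), a -> new ArrayList<>()).add(book);
    }

    Book lookupByIsbn(String isbn) {
        return byIsbn.get(isbn); // null when the ISBN is not cached
    }

    List<Book> lookupByAuthor(String author) {
        // Read-only view; empty when the author is unknown.
        return Collections.unmodifiableList(
                byAuthor.getOrDefault(author, Collections.emptyList()));
    }
}
```

Because every write goes through put, the two indexes cannot drift apart the way two independently managed maps could.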

Related

How I can access and add value from the list which is nested in the hashmap and list in Java

I am trying to add a value to a List that is stored in a HashMap, which is itself an element of a parent List.
When I try to do so I get "The method get in type is not compatible with the List"
I am trying the following code. The logic is:
If tID already has a matching entry in the txnvalues List, I just add to its "Values" List; otherwise I create a new HashMap.
List < HashMap > txnvalues = new ArrayList < HashMap > ();
for (LinkedHashMap < String, Object > linkedHashMap: resultset) {
HashMap data = new HashMap < > ();
HashMap attrData = new HashMap < > ();
List values = new ArrayList < > ();
data.put("values", new ArrayList < > ());
attrData.put("attrID", linkedHashMap.get("ID"));
attrData.put("attrVal", linkedHashMap.get("VAL"));
String txnID = linkedHashMap.get("T_ID").toString();
if (!txnvalues.stream().anyMatch(list -> list.containsValue(txnID))) {
data.put("tID", linkedHashMap.get("T_ID"));
values.add(attrData);
data.put("Values", values);
txnvalues.add(data);
} else {
txnvalues.get("Values").add(attrData); // this Line throws error
}
}
Example :
[{
"tID":123,
"Values":[{attrID:1,attrVal:123}]
}]
//Here If linkedHashmap.get("T_ID") = 123 which matches with tID then I want to add data in the Values
[{
"tID":123,
"Values":[{attrID:1,attrVal:123},{attrID:11,attrVal:467}]
}]
//If it doesn't match then I want to create new Hashmap and update txnValues Like this
[{
"tID":123,
"Values":[{attrID:1,attrVal:123},{attrID:2,attrVal:3435}]
},
{
"tID":456,
"Values":[{attrID:2,attrVal:233}]
}
]

I decided to parameterize all of your various iterables. Below is the parameterized code.
List<HashMap<String, List<HashMap<String, Object>>>> txnvalues = new ArrayList<HashMap<String, List<HashMap<String, Object>>>>();
for (LinkedHashMap<String, Object> linkedHashMap : resultset) {//Error here
HashMap<String, List<HashMap<String, Object>>> data = new HashMap<String, List<HashMap<String, Object>>>();
HashMap<String, Object> attrData = new HashMap<String, Object>();
List<HashMap<String, Object>> values = new ArrayList<HashMap<String, Object>>();
data.put("values", new ArrayList<>());
attrData.put("attrID", linkedHashMap.get("ID"));
attrData.put("attrVal", linkedHashMap.get("VAL"));
String txnID = linkedHashMap.get("T_ID").toString();
if (!txnvalues.stream().anyMatch(list -> list.containsValue(txnID))) {
data.put("tID", linkedHashMap.get("T_ID")); //Error here
values.add(attrData);
data.put("Values", values);
txnvalues.add(data);
} else {
txnvalues.get("Values").add(attrData); //Error here
}
}
First, you have multiple errors in your code such as trying to put a String key and Object value into data, which is a HashMap that only takes a String key and a List(of HashMaps of Strings and Objects) value. Another such is trying to get an item from txnvalues by a String, when txnvalues is a List and therefore requires an integer index parameter.
Second, you have a variable here which is never defined: resultset. We don't know what it is or how it is used, since it's never referenced elsewhere.
Third, there are many, many ways to handle nested collections. This type, List<HashMap<String, List<HashMap<String, Object>>>>, is simply horrible.
Please re-write your code in a way that is readable, parameterized, and can properly compile without errors. Just parameterizing will help you keep track of which iterables take which parameters and will help prevent the problem you had when you came here for help.

I'm probably late with this answer. Nevertheless, I'll introduce a possible remedy accompanied by a detailed explanation.
At first glance, such a deeply nested collection seems contrived and incomprehensible. But the problems you can see in this code aren't unusual; they can be observed in many questions on StackOverflow and in many repositories. The only difference is in concentration.
Let's try to examine it closely. A map is a data structure that is commonly misused by beginners because it allows objects of a different nature to be combined. I am pretty sure that the provided code models something more or less tangible. Did you notice that the OP tries to access an entry that has a string key called "tID"? That's a clear indicator that collections are being used here in place of objects.
If I say that an object graph can be far more complex, that probably won't be news. But how do you reason about code that is written in such a way?
Let's step aside for a moment and consider the following task:
there are a number of sailboats, you need to determine which of them will win the race and return its name as a result;
input provided as a plain text and consists of the following parameters: unique name, displacement, and weight (only these three for simplicity);
the speed of the vessel depends on its displacement and weight (i.e. formula is provided, we need only parse the values);
It is very likely that somebody can come up with such a solution:
create a Map<String, List<Double>>, where the key is a sailboat's name and the value is a list that contains displacement and weight;
then just iterate over the entry set, apply the formula and so find the fastest vessel.
It is only a couple of methods, and it seems that a separate class for a sailboat would just increase the overall complexity and amount of code. That's a common delusion among students. Creating a separate class gives the code a logical structure and pays off as soon as you want to extend or reuse it. Note that not only the attributes of the sailboat belong in this class, but also the methods that compute a sailboat's speed and compare sailboats based on it.
Decomposition is a skill and it has to be exercised. For those of you who didn't realize from the beginning that a sailboat in the previous example has to be represented by an object, I suggest the following exercise: describe a university, a candy shop, a grocery store, a cat, anything you like, but without using objects. First, think about a couple of use cases that entail accessing some properties of the elements of the system you're trying to model. Then draw diagrams and write the code using various collections and arrays. Notice that the more complex your system becomes, the more cumbersome all the nested maps and lists become, which makes you write code like this:
map.get(something).get(something).add(somethingElse);
And then, when you see the problems, you are ready to implement the classes that make sense in your domain model and compare the two approaches.
Disclaimer: understanding decomposition is a crucial thing but class design is a very broad topic, there are lots of things to study in this area like classic principles and design patterns. But before diving into these topics, you have to have a firm understanding of decomposition and OOP. Without this knowledge even with an object-oriented approach, your solution could become convoluted and difficult to manage. But this is a step in the right direction. The fact alone that you are using an object-oriented language doesn't automatically make your solution object-oriented. It's a skill, and it has to be exercised.
It was a very long digression, now let's get to the point.
As I already said, I'm convinced that the post author had some kind of natural use case in mind. Instead of names that describe the system, in this maze of data structures we can see only dumb get() and put() calls. But there's a clue in the usage of the map: an id as a key is a clear indicator that a map is standing in for what should be an object.
That is the start of a journey. I'll try to provide a scenario that makes sense (at least a bit) and pieces of a system that fit into the structure depicted in the scheme provided at the start of this post.
Let's consider an organization that sells something (I'm not trying to guess the author's intention, just providing a use case that makes it possible to reason about the code). There are a bunch of departments, each with a unique identifier.
Each department has a collection of products that it sells. A department gets different products from different suppliers. In turn, each product has a unique id and a collection of suppliers, each represented by a plain string (it looks contrived, but keep in mind it's just an illustration of what the code does).
As a use-case, let's assume that the company launches a new product and it must be accessible in all its departments. The code checks whether the department has this product already, if not, the product will be added with a default set of suppliers, otherwise it merges the existing set of suppliers and the default one.
As you can see the code in the main method is very concise. Note that all the miscellanies of data structures are still there, but we are not accessing them directly. As the information expert principle suggests, this logic is hidden inside the objects. That makes this solution reusable and less error-prone.
public static void main(String[] args) {
// this list is a rough equivalent of the "List<Map<String, List<Map<String, Object>>>> txnvalues"
List<Department> departments =
List.of(new Department("dep11"), new Department("dep12"));
Product newProd = new Product("id123"); // a NEW Product with id = "id123"
newProd.addAllSuppliers(List.of("supplierA", "supplierB"));
for (Department dep: departments) { // launching the new Product
dep.mergeProduct(newProd);
}
}
public class Department {
private final String departmentId;
private final Map<String, Product> idToProduct;
public Department(String departmentName) {
this.departmentId = departmentName;
this.idToProduct = new HashMap<>();
}
public void mergeProduct(Product prod) {
idToProduct.merge(prod.getId(), prod, Product::merge);
}
public void mergeAllProducts(Iterable<Product> products) {
for (Product prod: products) {
mergeProduct(prod);
}
}
public void addProduct(Product prod) {
idToProduct.put(prod.getId(), prod);
}
public void addAllProducts(Iterable<Product> products) {
for (Product prod: products) {
addProduct(prod);
}
}
public String getId() {
return departmentId;
}
public Map<String, Product> getIdToProduct() {
return Collections.unmodifiableMap(idToProduct);
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o instanceof Department other) {
return departmentId.equals(other.departmentId);
} else return false;
}
@Override
public int hashCode() {
return Objects.hash(departmentId);
}
}
public class Product {
private final String productId;
private final Set<String> suppliers;
public Product(String id) {
this.productId = id;
this.suppliers = new HashSet<>();
}
public boolean addSupplier(String newSup) {
return suppliers.add(newSup);
}
public boolean addAllSuppliers(Collection<String> newSup) {
return suppliers.addAll(newSup);
}
public Product merge(Product other) {
if (!this.equals(other)) throw new IllegalArgumentException();
Product merged = new Product(productId);
merged.addAllSuppliers(this.suppliers);
merged.addAllSuppliers(other.suppliers);
return merged;
}
public String getId() {
return productId;
}
public Set<String> getSuppliers() {
return Collections.unmodifiableSet(suppliers);
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o instanceof Product other) {
return this.productId.equals(other.productId);
} else return false;
}
@Override
public int hashCode() {
return Objects.hash(productId);
}
}
Further steps:
First of all make sure that you don't have gaps in the core concepts of OOP: encapsulation, inheritance, and polymorphism.
Draw before you start to code, it's not necessary to create a full-blown UML diagram. Even a rough set of named boxes with arrows will help you understand better how your system is structured and how its parts interact with each other.
Read and apply. Extend your knowledge gradually and try to apply it. High cohesion, Low coupling, SOLID, and lots of helpful reading can be found here, for instance this recent post
Write a bit, test a bit: don't wait until your code becomes a beast. Write a bit and give it a try, add something else and take a look at how these parts fit together.

In the else block, you call the get method of txnvalues, which is a list of HashMaps and thus expects an integer index. I believe you assume that at this point you've got a reference to the HashMap to which you want to add the values. But you don't.
So you need to find the index at which to add the values, which means you have to look through the txnvalues list again.
For this reason, you should use a different approach:
txnvalues.stream()
.filter(m -> m.get("tID").equals(txnID))
.findFirst()
.ifPresentOrElse(
m -> m.get("Values").add(attrData),
() -> {
HashMap data = new HashMap<>();
// Other stuff to fill the data
txnvalues.add(data);
}
);
Here .filter(m -> m.get("tID").equals(txnID)) corresponds to your .anyMatch(list -> list.containsValue(txnID)) (the parameter list is actually an instance of HashMap).
I changed the condition: according to your data sample, you are looking for the Map whose "tID" key has the value txnID, so getting the value of this key is faster than looking through all the values in the HashMap. (Note that m.get("tID") may return null.)
So filter keeps only the entries whose "tID" value matches. Then .findFirst() “returns” the reference to that HashMap. Now .ifPresentOrElse performs the actions you want:
m.get("Values").add(attrData) adds attrData to the list; this corresponds to the one line of code in your else block;
the other code is what you had in the if block: if nothing is found, create a new instance.
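As an alternative to searching the list on every row, the grouping can be done in one pass with a map keyed by transaction id. This is only a sketch: it assumes resultset is a list of maps with the keys "ID", "VAL" and "T_ID" as in the question, and the class name TxnGrouper is made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TxnGrouper {
    // Groups rows by their "T_ID" value and produces the
    // [{tID: ..., Values: [{attrID: ..., attrVal: ...}, ...]}, ...] shape from the question.
    static List<Map<String, Object>> group(List<Map<String, Object>> rows) {
        // LinkedHashMap keeps transactions in first-seen order.
        Map<Object, List<Map<String, Object>>> valuesByTxn = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> attr = new LinkedHashMap<>();
            attr.put("attrID", row.get("ID"));
            attr.put("attrVal", row.get("VAL"));
            // Creates the list on the first row of a transaction, reuses it afterwards.
            valuesByTxn.computeIfAbsent(row.get("T_ID"), k -> new ArrayList<>()).add(attr);
        }
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map.Entry<Object, List<Map<String, Object>>> e : valuesByTxn.entrySet()) {
            Map<String, Object> txn = new LinkedHashMap<>();
            txn.put("tID", e.getKey());
            txn.put("Values", e.getValue());
            result.add(txn);
        }
        return result;
    }
}
```

The computeIfAbsent call replaces the whole if/else from the question: there is no separate "not found" branch to get wrong.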

Choosing right data structure and design for Java multithreading problem

I am implementing a singleton class that must handle multiple threads accessing its data structure at once.
The class has a method that returns true if the data structure already contains myObject and false otherwise. If the object has not been seen then object is added to the data structure.
boolean alreadySeen(MyObject myObject){}
MyObject has two member variables Instant expiration and String id where id acts as my key to decide whether the data structure contains myObject. I cannot change MyObject class. I need to periodically check the expiration of myObjects in the data structure and remove them if they have expired.
So I am looking to use one or more data structures that I can quickly add, delete and search by both expiration and id. I will mostly be adding elements and searching if element exists with the periodic cleanup removing expired elements.
A map like ConcurrentHashMap<id,MyObject> gives me the O(1) insert and delete but it would be O(n) to search through for expired objects.
As mentioned above, I cannot change the MyObject class. So I thought about making a wrapper for that class so I can override equals() and hashCode(), and then use an ordered set like ConcurrentSkipListSet<MyObjectWrapper>(new ExpComparator()). This would let me order the set by expiration date, so I could quickly find the expired ones at the top. However, I believe this would be O(log n) for search and delete.
Is there any better structure I could use? And if not am I better off in the long run with the map at O(1) lookup and add but periodic O(n) for delete of expiration? Or better for set with O(log n) of everything?

Your lookup, add and remove operations can all run in O(1), but at the following cost:
First, it needs double the memory to store the data.
Second, the expiration time cannot be very accurate.
We need two maps. One stores the objects, with the id as key and the object as value, like Map<id,MyObject>. The other stores the relationship between expiration and objects, like Map<Long,List<MyObject>>, where the key has to be calculated from the expiration time.
Code:
To keep the code simple I modified your MyObject class:
class MyObject {
private long expiration;
private String id;
}
The other code
private Map<String, MyObject> dataSet = new ConcurrentHashMap<>();
private Map<Long, List<MyObject>> obj2Expiration = new ConcurrentHashMap<>();
public boolean alreadySeen(MyObject myObject) {
boolean exist = dataSet.containsKey(myObject.getId());
if (!exist) {
dataSet.put(myObject.getId(),myObject);
// bucket key: expiration time rounded down to a 5-second window
Long expirateKey = myObject.getExpiration() / 5000;
List<MyObject> objects = obj2Expiration.get(expirateKey);
if (null == objects) {
objects = new ArrayList<>();
obj2Expiration.put(expirateKey,objects);
}
// add the object whether or not the bucket already existed
objects.add(myObject);
}
return exist;
}
@Scheduled(fixedRate = 5000)
public void remove() {
long now = System.currentTimeMillis();
long needRemove = now /5000 -1;
// remove() also drops the bucket itself so that old buckets do not accumulate
Optional.ofNullable(obj2Expiration.remove(needRemove))
.ifPresent(objects -> objects.forEach(o -> {
dataSet.remove(o.getId());
}));
}
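The check-then-put in alreadySeen above is not atomic, so two threads could both report an id as unseen. Below is a hedged sketch of the same bucket idea using the atomic operations of ConcurrentHashMap. It assumes the caller passes the id and expiration read from MyObject (whose accessors are not shown in the question); SeenRegistry, BUCKET_MILLIS and removeExpired are made-up names.

```java
import java.time.Instant;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

final class SeenRegistry {
    private static final long BUCKET_MILLIS = 5_000; // width of one expiration bucket

    private final ConcurrentMap<String, Instant> expirationById = new ConcurrentHashMap<>();
    private final ConcurrentMap<Long, Queue<String>> idsByBucket = new ConcurrentHashMap<>();

    // Returns true if the id was already registered; otherwise registers it and returns false.
    boolean alreadySeen(String id, Instant expiration) {
        // putIfAbsent is atomic: exactly one thread wins for a given id.
        if (expirationById.putIfAbsent(id, expiration) != null) {
            return true;
        }
        long bucket = expiration.toEpochMilli() / BUCKET_MILLIS;
        // compute runs atomically per key, so a concurrent cleanup cannot
        // detach the bucket between its creation and the add.
        idsByBucket.compute(bucket, (b, ids) -> {
            Queue<String> q = ids;
            if (q == null) {
                q = new ConcurrentLinkedQueue<>();
            }
            q.add(id);
            return q;
        });
        return false;
    }

    // Removes every id whose bucket lies entirely before the bucket containing 'now'.
    void removeExpired(Instant now) {
        long currentBucket = now.toEpochMilli() / BUCKET_MILLIS;
        for (Long bucket : idsByBucket.keySet()) {
            if (bucket < currentBucket) {
                Queue<String> ids = idsByBucket.remove(bucket);
                if (ids != null) {
                    ids.forEach(expirationById::remove);
                }
            }
        }
    }
}
```

The cleanup walks the bucket keys rather than every object, so its cost depends on the number of live buckets plus the number of ids actually removed. As in the answer above, an entry can outlive its expiration by up to one bucket width.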

Maintain java mapping from two different types of values in a single data structure

I have a collection of objects that look something like
class Widget {
String name;
int id;
// Intuitive constructor omitted
}
Sometimes I want to look up an item by name, and sometimes I want to look it up by id. I can obviously do this by
Map<String, Widget> mapByName;
Map<Integer, Widget> mapById;
However, that requires maintaining two maps, and at some point I (or another user who is unfamiliar with the double map) will make a change to the code and only update one of the maps.
The obvious solution is to make a class to manage the two maps. Does such a class already exist, probably in a third party package?
I am looking for something that lets me do something along the lines of
DoubleMap<String, Integer, Widget> map = new DoubleMap<>();
Widget w = new Widget(3, "foo");
map.put(w.id, w.name, w);
map.get1(3); // returns w
map.get2("foo"); // returns w

A simple solution could be to write your own key class that includes both keys.
class WidgetKey {
int id;
String name;
public boolean equals(Object o) {...}
public int hashCode() {...}
}
Map<WidgetKey, Widget> yourMap;
Beware that you have to implement equals and hashCode in the WidgetKey class. Otherwise put/get and other map methods won't work properly.
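Note that a composite key needs both the id and the name at lookup time. If the goal is to look up by either one alone, as in the question, a small class that owns both maps is one option. The sketch below mirrors the DoubleMap usage from the question; it is a hand-rolled illustration, not an existing library class.

```java
import java.util.HashMap;
import java.util.Map;

// Keeps two indexes in step by only allowing writes through a single put method.
final class DoubleMap<K1, K2, V> {
    private final Map<K1, V> byFirst = new HashMap<>();
    private final Map<K2, V> bySecond = new HashMap<>();

    void put(K1 key1, K2 key2, V value) {
        byFirst.put(key1, value);
        bySecond.put(key2, value);
    }

    V get1(K1 key1) {
        return byFirst.get(key1); // null when absent
    }

    V get2(K2 key2) {
        return bySecond.get(key2); // null when absent
    }

    int size() {
        return byFirst.size();
    }
}
```

This sketch assumes each value is stored once under a fixed pair of keys. Re-putting a value under a changed second key would leave a stale entry in the other map, so a fuller version would need a removal path as well.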

Multiple HashCodes for Java Objects

I'm trying to optimize some code, and when I do this I usually end up getting that helping hand from Hash structures.
What I want to do is divide objects into multiple sets based on some attributes in a very fast way. Basically like the SQL GROUP BY statement but for Java.
The thing is that I want to use HashMap<Object, ArrayList<Object>> to do this. I want to use multiple grouping ways but an Object can only have one hashCode().
Is there a way to have multiple hashCodes() in order to be able to group by multiple methods? Are there other structures made to solve this kind of issues? Can I use Java 8 lambda expressions to send a hashCode() in the HashMap parameters? Am I silly and there is a super fast way that isn't this complicated?
Note: The hashCodes I want would use multiple attributes that are not constant. So, for example, creating a String that represents those attributes uniquely won't work because I'd have to refresh the string every time.

Let's say you have a collection of objects and you want to produce different groupings analogous to SQL GROUP BY. Each group-by is defined by a set of common values. Create a group-by-key class for each distinct grouping type, each with an appropriate hashCode() and equals() method (as required by the Map contract).
For the following pseudocode I assume the existence of a MultiMap class that encapsulates the management of your map's List<Object> values. You could use Guava's Multimap (for example HashMultimap) for this.
// One group key
public class GroupKey1 {
...
public GroupKey1(MyObject o) {
// populate key from object
}
public GroupKey1(...) {
// populate from individual values so we can create lookup keys
}
public int hashCode() { ... }
public boolean equals(Object o) { ... }
}
// A second, different group key
public class GroupKey2 {
...
public GroupKey2(MyObject o) {
// populate key from object
}
public GroupKey2(...) {
// populate from individual values so we can create lookup keys
}
...
}
...
MultiMap<GroupKey1,MyObject> group1 = new HashMultiMap<>();
MultiMap<GroupKey2,MyObject> group2 = new HashMultiMap<>();
for (MyObject m : objectCollection)
{
group1.put(new GroupKey1(m), m);
group2.put(new GroupKey2(m), m);
}
...
// Retrieve the list of objects having a certain group-by key
GroupKey2 lookupKey = new GroupKey2(...);
Collection<MyObject> group = group2.get(lookupKey);

What you're describing sounds like a rather convoluted pattern, and possibly a premature optimization. You might have better luck asking a question about how to efficiently replicate GROUP BY-style queries in Java.
That said the easiest way to have multiple hash codes is to have multiple classes. Here's a trivial example:
public class Person {
String firstName;
String lastName;
/** the "real" hashCode() */
public int hashCode() {
return firstName.hashCode() + 1234 * lastName.hashCode();
}
}
public class PersonWrapper1 {
Person person;
public int hashCode() {
return person.firstName.hashCode();
}
}
public class PersonWrapper2 {
Person person;
public int hashCode() {
return person.lastName.hashCode();
}
}
By using wrapper classes you can redefine the notion of equality in a type-safe way. Just be careful about how exactly you let these types interact; you can only compare instances of Person, PersonWrapper1, or PersonWrapper2 with other instances of the same type; each class' .equals() method should return false if a different type is passed in.
You might also look at the hashing utilities in Guava, they provide several different hashing functions, along with a BloomFilter implementation, which is a data structure that relies on being able to use multiple hashing functions.
This is done by abstracting the hashing function into a Funnel class. Funnel-able classes simply pipe the values they use for equality into the Funnel, and callers (like BloomFilter) then actually compute the hash codes.
Your last paragraph is confusing; you cannot hope to store objects in a hash-based data structure and then change the values used to compute the hash code. If you do so, the object will no longer be discoverable in the data structure.

Taking your thoughts into account:
What I want to do is divide objects into multiple sets based on some attributes in a very fast way. Basically like the SQL GROUP BY statement but for Java.
Map<City, Set<String>> lastNamesByCity
= people.stream().collect(groupingBy(Person::getCity,
mapping(Person::getLastName, toSet())));
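For completeness, here is a self-contained version of that snippet. The Person class is a hypothetical stand-in (the original does not show it, and city is modelled as a plain String here), and a second grouping is included to show that different classifiers can be applied to the same collection without touching hashCode().

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical Person with just the fields needed for the two groupings.
final class Person {
    private final String firstName;
    private final String lastName;
    private final String city;

    Person(String firstName, String lastName, String city) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.city = city;
    }

    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
    String getCity() { return city; }
}

final class Groupings {
    // GROUP BY city, collecting the distinct last names per city.
    static Map<String, Set<String>> lastNamesByCity(List<Person> people) {
        return people.stream().collect(
                Collectors.groupingBy(Person::getCity,
                        Collectors.mapping(Person::getLastName, Collectors.toSet())));
    }

    // A second, independent grouping over the same objects: GROUP BY last name.
    static Map<String, List<Person>> byLastName(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(Person::getLastName));
    }
}
```

Each grouping is computed from the current attribute values at the time of the call, which sidesteps the concern about attributes that change after the objects have been stored in a hash-based structure.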

java best data structure for two to many relations

So I have three important factors: filenames, of which there are many (including duplicates); violation types, of which there are 6; and the data relating to them.
I was thinking of using a Map for this, but it only accepts two types. I want to organize the data by filename and, for every entry under a filename, retrieve the violation type, and from that retrieve all the matching data. If it were a map, I could call something like map.get(filename, violation) and it would return all the results that match.
Is there a data structure that allows me to do this? Or am I being lazy, and should I just sort the data myself when it comes to outputting it?

One other way to approach this would be to use a custom Class for holding the needed data. Essentially 'building' your own node that you can iterate over.
For example, you could create the following class: (Node.java)
import java.util.*;
public class Node
{
private String violationType;
private String dataInside;
public Node()
{
this("", "");
}
public Node(String violationType)
{
this(violationType, "");
}
public Node(String violationType, String dataInside)
{
this.violationType = violationType;
this.dataInside = dataInside;
}
public void setViolationType(String violationType)
{
this.violationType = violationType;
}
public void setDataInside(String dataInside)
{
this.dataInside = dataInside;
}
public String getViolationType()
{
return violationType;
}
public String getDataInside()
{
return dataInside;
}
}
ok, great, so we have this 'node' thing with some setters, some getters, and some constructors for ease of use. Cool. Now lets see how to use it:
import java.util.*;
public class main{
public static void main(String[] args){
Map<String, Node> customMap = new HashMap<String, Node>();
customMap.put("MyFilename", new Node("Violation 1", "Some Data"));
System.out.println("This is a test of the custom Node: " + customMap.get("MyFilename").getViolationType());
}
}
Now we have a map that relates all of the data you need. You'll get a lot of people saying "Don't reinvent the wheel" when it comes to things like this, because built-in libraries are far more optimized. That is true! If you can find a data structure built into Java that suits your needs, USE IT. That's always a good policy to follow. That being said, if you have a pretty custom situation, sometimes it calls for a custom approach. Don't be afraid to make your own objects like this; it's easy to do in Java, and it could save you a lot of time and headache!
EDIT
So, after re-reading the OP's question, I realize you want an entire list of associated data for the given violation of a given filename. In that case, you would switch the private String dataInside to something like private ArrayList<String> dataInside; which would allow you to associate as much data as you want, still inside that node, just inside of an ArrayList. Also note, you'd have to adjust the getters/setters a little to accommodate a list, but that's not too bad.

You could use a custom class as the map key, containing the two fields filename and violation type. When doing so you need to implement the equals() and hashCode() methods to ensure instances of that class can be used as keys in a map.
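A sketch of what such a key could look like, together with a thin wrapper that gives the map.get(filename, violation) style of access the question asks for. The names ViolationKey and ViolationIndex are invented for this example, and the data is assumed to be plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Immutable composite key: two keys are equal when both fields are equal.
final class ViolationKey {
    private final String filename;
    private final String violationType;

    ViolationKey(String filename, String violationType) {
        this.filename = Objects.requireNonNull(filename);
        this.violationType = Objects.requireNonNull(violationType);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ViolationKey)) return false;
        ViolationKey other = (ViolationKey) o;
        return filename.equals(other.filename)
                && violationType.equals(other.violationType);
    }

    @Override
    public int hashCode() {
        return Objects.hash(filename, violationType);
    }
}

// Maps (filename, violation type) to every data entry recorded for that pair.
final class ViolationIndex {
    private final Map<ViolationKey, List<String>> entries = new HashMap<>();

    void add(String filename, String violationType, String data) {
        entries.computeIfAbsent(new ViolationKey(filename, violationType),
                k -> new ArrayList<>()).add(data);
    }

    List<String> get(String filename, String violationType) {
        // Empty list when nothing was recorded for this pair.
        return entries.getOrDefault(new ViolationKey(filename, violationType),
                Collections.emptyList());
    }
}
```

Duplicate filenames are handled naturally: every add for the same pair appends to the same list instead of overwriting it.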

You can use TreeMap. TreeMap is sorted according to the natural ordering of its keys.
TreeMap<String, List<String>> map = new TreeMap<String, List<String>>();
