Two-dimensional Map Java - java

I need a data structure that stores my information in a two-dimensional way. For example, imagine a table of user-item ratings. I need to store all the ratings given by each user: the ratings of user u1, of u2, of u3, and so on. The problem is that I also need to store all the ratings for each item, that is, the ratings that all users gave to that item. For users I need something like a map whose key is the user ID and whose value is the set of that user's ratings, which I can do easily. My problem is how to store the ratings per item as well, for example in a map whose key is the item ID and whose value is the set of ratings users gave to that item. I wanted to upload a table, but I don't have enough reputation, so just imagine a two-dimensional matrix whose rows are users and whose columns are items. Is there a data structure that can do that, or should I build two different maps? Maybe there is a better option than a Map, but since I had to choose a title for my question I wrote "map".
Thanks

You can use the Table class from the free Guava library:
Table<Integer, String, Double> table = HashBasedTable.create();
table.put(1, "a", 2.0);
double v = table.get(1, "a"); // getting 2.0

Here is my own version of an appropriate Table object. Mind you, using something provided by an existing library is good. But trying your own implementation will help you understand the issues involved better. So you can try to add "remove" methods etc. to my implementation to complete it.
I prefer keeping the data in a table rather than implementing the maps inside User and Item, because the table can enforce adding each new rating through both row and column. If you keep your maps separate in two independent objects, you won't be able to enforce this.
Note that while I return protective copies of the internal maps in getCol and getRow, I return the reference to the actual values, not copies thereof, so that you can change a user's rating (assuming you chose a mutable object for that) without changing the table structure. Also note that if your user and item objects are mutable and this affects their equals or hashCode, the table will behave unpredictably.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
public class Table<K1, K2, V> {
// Two maps allowing us to retrieve the value through the row or the
// column key.
private Map<K1, Map<K2, V>> rowMap;
private Map<K2, Map<K1, V>> colMap;
public Table() {
rowMap = new HashMap<>();
colMap = new HashMap<>();
}
/**
* Allows us to create a key for the row, and place it in the structure
* while there are still no relations for it.
*
* @param key
* The key for which a new empty row will be created.
*/
public void addEmptyRow(K1 key) {
if (!rowMap.containsKey(key)) {
rowMap.put(key, new HashMap<K2, V>());
}
}
/**
* Allows us to create a key for the column, and place it in the
* structure while there are still no relations for it.
*
* @param key
* The key for which a new empty column will be created.
*/
public void addEmptyCol(K2 key) {
if (!colMap.containsKey(key)) {
colMap.put(key, new HashMap<K1, V>());
}
}
/**
* Insert a value into the table using the two keys.
*
* @param rowKey
* Row key to access this value
* @param colKey
* Column key to access this value
* @param value
* The value to be associated with the above two keys.
*/
public void put(K1 rowKey, K2 colKey, V value) {
Map<K2, V> row;
Map<K1, V> col;
// Find the internal row. If there is no entry, create one.
if (rowMap.containsKey(rowKey)) {
row = rowMap.get(rowKey);
} else {
row = new HashMap<K2, V>();
rowMap.put(rowKey, row);
}
// Find the internal column, If there is no entry, create one.
if (colMap.containsKey(colKey)) {
col = colMap.get(colKey);
} else {
col = new HashMap<K1, V>();
colMap.put(colKey, col);
}
// Add the value to both row and column.
row.put(colKey, value);
col.put(rowKey, value);
}
/**
* Get the value associated with the given row and column.
*
* @param rowKey
* Row key to access the value
* @param colKey
* Column key to access the value
* @return Value in the given row and column. Null if mapping doesn't
* exist
*/
public V get(K1 rowKey, K2 colKey) {
Map<K2, V> row;
row = rowMap.get(rowKey);
if (row != null) {
return row.get(colKey);
}
return null;
}
/**
* Get a map representing the row for the given key. The map contains
* only column keys that actually have values in this row.
*
* @param rowKey
* The key to the row in the table
* @return Map representing the row. Null if there is no row with the
* given key.
*/
public Map<K2, V> getRow(K1 rowKey) {
// Note that we are returning a protective copy of the row. The user
// cannot change the internal structure of the table, but is allowed
// to change the value's state if it is mutable.
if (rowMap.containsKey(rowKey)) {
return new HashMap<>(rowMap.get(rowKey));
}
return null;
}
/**
* Get a map representing the column for the given key. The map contains
* only row keys that actually have values in this column.
*
* @param colKey
* The key to the column in the table.
* @return Map representing the column. Null if there is no column with
* the given key.
*/
public Map<K1, V> getCol(K2 colKey) {
// Note that we are returning a protective copy of the column. The
// user cannot change the internal structure.
if (colMap.containsKey(colKey)) {
return new HashMap<>(colMap.get(colKey));
}
return null;
}
/**
* Get a set of all the existing row keys.
*
* @return A Set containing all the row keys. The set may be empty.
*/
public Set<K1> getRowKeys() {
return new HashSet<>(rowMap.keySet());
}
/**
* Get a set of all the existing column keys.
*
* @return A set containing all the column keys. The set may be empty.
*/
public Set<K2> getColKeys() {
return new HashSet<>(colMap.keySet());
}
}
The reason that I have methods addEmptyRow and addEmptyCol is that I thought it may be redundant to keep a separate data structure for your users and items. Once you add them to the table like this, you can get them through the getRowKeys and getColKeys so there is no need to keep them separately unless you want to structure them in anything other than a Set.
Note that this Table compares keys by value: two keys that are equal according to equals() are treated as the same key, so you should design your User and Item objects accordingly (with consistent equals and hashCode).
With appropriate definitions of User, Item and Rating, you can do something like
Table<User, Item, Rating> table = new Table<>();
table.addEmptyCol(new Item("Television"));
table.addEmptyCol(new Item("Sofa"));
User user = new User("Anakin");
Item item = new Item("Light Sabre");
table.put(user, item, new Rating(5));
Item item1 = new Item("Helmet");
table.put(user, item1, new Rating(7));
Rating rating = table.get(user, item);
rating.setRating(rating.getRating() + 10);
User user1 = new User("Obi-Wan");
table.put(user1, item, new Rating(8));
table.put(user1, new Item("Television"), new Rating(0));
And then query the table for a particular user like so:
Map<Item, Rating> anakinsRatings = table.getRow(user);
for (Map.Entry<Item, Rating> entry : anakinsRatings.entrySet()) {
System.out.println(user + " rated " + entry.getKey() + " as "
+ entry.getValue().getRating());
}
Or display a list of ratings for all items like so:
for (Item currItem : table.getColKeys()) {
Map<User, Rating> itemMap = table.getCol(currItem);
if (itemMap.isEmpty()) {
System.out.println("There are currently no ratings for \""
+ currItem
+ "\"");
} else {
for (Map.Entry<User, Rating> entry : table.getCol(currItem).entrySet()) {
System.out.println("\""
+ currItem
+ "\" has been rated "
+ entry.getValue().getRating() + " by "
+ entry.getKey());
}
}
}
As I said, the implementation is not complete - there is no toString for the table, for example, no remove, removeRow, removeCol, clear, etc.

So, you have two one-many relationships. A user has (gives) many ratings and an item has many ratings; this gives a many-many relationship of users-items which is your problem. Why not simply model it as described:
public class Rating {
private User ratedBy;
private Item itemRated;
public Item getItem() { return itemRated; }
}
public class User {
private Set<Rating> allRatings = new HashSet<>();
public Rating getRatingFor(Item item) {
for(Rating rating: allRatings) {
if (item.equals(rating.getItem())) {
return rating;
}
}
return null;
}
}
public class Item {
private Set<Rating> allRatings = new HashSet<>();
}
... and getters/setters etc as required. You can then get ratings with:
User user1 = new User();
// ... do stuff to populate ratings
Rating itemRatingByUser = user1.getRatingFor(item);

A quick and dirty solution is to use a two dimensional key.
Assuming the IDs of users and items are of the same type, you can create a class that is just a holder for the ID and the type of the key (the type can be the class of User or Item, if available, or just an enum value). Then make the map use keys of this type. Of course, each rating will be referenced at least twice (once for each type).
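To make the idea concrete, here is a minimal sketch of such a typed key. All names in it (KeyType, TypedKey, Rating, RatingIndex) are hypothetical and chosen only for illustration, and it assumes integer IDs for both users and items.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical names: none of these classes come from the answer above.
enum KeyType { USER, ITEM }

// Holder for an ID plus the kind of entity the ID belongs to.
final class TypedKey {
    final KeyType type;
    final int id;

    TypedKey(KeyType type, int id) {
        this.type = type;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TypedKey)) return false;
        TypedKey other = (TypedKey) o;
        return type == other.type && id == other.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(type, id);
    }
}

final class Rating {
    final int userId;
    final int itemId;
    final double value;

    Rating(int userId, int itemId, double value) {
        this.userId = userId;
        this.itemId = itemId;
        this.value = value;
    }
}

final class RatingIndex {
    private final Map<TypedKey, List<Rating>> index = new HashMap<>();

    // Each rating is referenced twice: once under its user, once under its item.
    void add(Rating r) {
        index.computeIfAbsent(new TypedKey(KeyType.USER, r.userId), k -> new ArrayList<>()).add(r);
        index.computeIfAbsent(new TypedKey(KeyType.ITEM, r.itemId), k -> new ArrayList<>()).add(r);
    }

    List<Rating> byUser(int userId) {
        return index.getOrDefault(new TypedKey(KeyType.USER, userId), Collections.emptyList());
    }

    List<Rating> byItem(int itemId) {
        return index.getOrDefault(new TypedKey(KeyType.ITEM, itemId), Collections.emptyList());
    }
}
```

The type field is what keeps user 10 and item 10 from colliding in the same map, which is why equals and hashCode include both fields.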

The data structure you need is called a Table. A table has two keys and a value, and looks, more or less, like an Excel table with the two key sets as columns and rows; hence the name. There are a variety of Table implementations. I think it's normal to use the Guava implementations now. An explanation of Guava's Table interface is here.
The API for Guava's Table is here, and the implementation that you want, HashBasedTable, is here.

Related

Java hashmap with Multiple values

I know it's been asked a hundred times, and the answer is always the same: you cannot store multiple values under a repeated key in a HashMap.
But let's get to the problem. I have an import file with information along the lines of a CustomerID, a ProductID, and units sold (it's a basic receipt format).
What I want to do is take the import, put it into a map, and be able to reference it.
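import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

Map<Integer, DoubleSales> productData = new HashMap<Integer, DoubleSales>();
try {
    Scanner dataFile = new Scanner(new File("./salesData.csv"));
    dataFile.nextLine(); // skip the header line
    while (dataFile.hasNextLine()) {
        String line = dataFile.nextLine();
        String[] lineValues = line.split(",");
        Integer CustomerID = Integer.parseInt(lineValues[0]);
        Integer ProductID = Integer.parseInt(lineValues[1]);
        Integer Units = Integer.parseInt(lineValues[2]);
        DoubleSales sales = new DoubleSales(CustomerID, ProductID, Units);
        // put() replaces any earlier entry stored under the same CustomerID
        productData.put(CustomerID, sales);
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
class DoubleSales {
    int CustomerID;
    int ProductID;
    int Units;
    DoubleSales(int custID, int prodID, int units) {
        CustomerID = custID;
        ProductID = prodID;
        Units = units;
    }
}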
The import file has data in the format of
CustomerID, ProductID, UnitsSold
1,10002,3
1,10004,5
1,10008,2
1,10010,3
1,10010,3
Using the code above, when I look up CustomerID 1 and print it, I get just the last entry, which is 10010,3.
How can I print all the entries for CustomerID 1, with their units sold?
for example:
1,10002,3
10004,5
10008,2
10010,3
10010,3
(will add the two 10010 values later.)
I do not wish to use ArrayLists.
Try MultiValueMap from Apache Commons Collections.
In your case a simple Map won't do the job: everything you write to the value of a given customer will be overwritten. If you want to retain all entries while keeping them easy to reference, try this:
First, create a structured map
Map<Integer,List<DoubleSales>> productData = new HashMap<Integer,List<DoubleSales>>();
Second, add products like this
List<DoubleSales> entries = productData.get(CustomerID);
if (entries == null) {
    // First sale for this customer: create the list and register it.
    entries = new ArrayList<DoubleSales>();
    productData.put(CustomerID, entries);
}
entries.add(sales);
Third, review your products list that you just added
List<DoubleSales> products = productData.get(CustomerID);
if (products != null) {
for(DoubleSales product : products) {
// access your product here.
}
}
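On Java 8 or later the "create the list if it is missing, then add" step can be written with Map.computeIfAbsent. The sketch below is only an illustration under that assumption: SalesGrouping and groupByCustomer are made-up names, and each sale is represented as a plain int[] {customerId, productId, units} instead of the DoubleSales class to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; rows stand in for DoubleSales objects.
class SalesGrouping {
    // Groups rows of {customerId, productId, units} by customer ID, keeping every row.
    static Map<Integer, List<int[]>> groupByCustomer(int[][] rows) {
        Map<Integer, List<int[]>> byCustomer = new HashMap<>();
        for (int[] row : rows) {
            // Creates the list on first sight of a customer, then appends the row.
            byCustomer.computeIfAbsent(row[0], id -> new ArrayList<>()).add(row);
        }
        return byCustomer;
    }
}
```

Repeated lines such as 1,10010,3 are kept as separate list entries, so they can be summed later.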
You have a duplicated CustomerID (every row has 1 as its ID) and you are using it as the key in the HashMap. That is why the value keeps being overwritten when you insert a new record with the same ID. Your product ID is closer to unique (in the sample only 10010 repeats), so try that as the key, or use a unique customer ID.
I think in that case it is better to implement the structure with a matrix. It could be done easily with arrays (or lists), where rows could contain a bean formed by the product id and the units sold, being indexed by the customer id
My first idea was Jerry Chin's solution, but I would like to show you a second approach, just to demonstrate that there are multiple solutions to the same problem.
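You could store your values in a TreeSet<DoubleSales>. Be aware, though, that a TreeSet discards any element its comparator reports as equal to one already present, so a comparator that looks only at the customer ID keeps just one record per customer. To keep every record, including repeated lines such as 1,10010,3, give the comparator a tie-breaker or use a sorted List instead.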
Then, define an ordering (Comparator) on the DoubleSales, to group the orders by CustomerID.
When you print your list, you can check whether the CustomerID of the current record differs from that of the previous record. If it does, this is the first record of a new customer. If not, it belongs to the same customer.
And the code:
SortedSet<DoubleSales> set = new TreeSet<DoubleSales>(new Comparator<DoubleSales>() {
@Override
public int compare(DoubleSales o1, DoubleSales o2) {
return Long.compare(o1.customerId, o2.customerId);
}
});
// ... import data
set.add(new DoubleSales( /*...*/ ));
// iterate through data
DoubleSales prevDS = null;
for (DoubleSales ds : set) {
if (prevDS == null || ds.customerId != prevDS.customerId) {
// first record of a customer
// print CustomerID, ProductID, UnitsSold
} else {
// second or next record of a customer
// print ProductID, UnitsSold only
}
prevDS = ds;
}
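For comparison, here is a sketch of the same "sort, then compare with the previous record" idea using a sorted List, which never drops records that compare as equal. The names SortedSalesList, sortByCustomer and format are hypothetical, and each sale is again a plain int[] {customerId, productId, units} rather than a DoubleSales object.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper; rows stand in for DoubleSales objects.
class SortedSalesList {
    // Returns a copy ordered by customer ID. List.sort is stable, so rows of
    // the same customer keep their original relative order and none are lost.
    static List<int[]> sortByCustomer(List<int[]> rows) {
        List<int[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((int[] r) -> r[0]));
        return sorted;
    }

    // First record of a customer: "customer,product,units".
    // Following records of the same customer: "product,units".
    static List<String> format(List<int[]> sorted) {
        List<String> lines = new ArrayList<>();
        int[] prev = null;
        for (int[] r : sorted) {
            if (prev == null || prev[0] != r[0]) {
                lines.add(r[0] + "," + r[1] + "," + r[2]);
            } else {
                lines.add(r[1] + "," + r[2]);
            }
            prev = r;
        }
        return lines;
    }
}
```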

trouble understanding implementation of hash table with chaining

I'm studying a hash table with chaining in Java through its implementation. My trouble is with the get() method. An index is computed as key.hashCode() % table.length. Assume the table size is 10 and key.hashCode() is 124, so the index is 4. The for-each loop over table[index] then starts from table[4], and AFAIK the index is incremented one by one: 4, 5, 6, 7 and so on. But what about indices 0, 1, 2, 3? Are they checked? (I think not.) Isn't it possible for the key to be at one of those indices? (I think yes.) The other issue is that there are null checks, but initially nothing assigns null to the key and value. So how can the checks work? Is null assigned as soon as private LinkedList<Entry<K, V>>[] table is declared?
// Data Structures: Abstraction and Design Using Java, Koffman, Wolfgang
package KW.CH07;
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
/**
* Hash table implementation using chaining.
* @param <K> The key type
* @param <V> The value type
* @author Koffman and Wolfgang
**/
public class HashtableChain<K, V>
// Insert solution to programming project 7, chapter -1 here
implements KWHashMap<K, V> {
/** The table */
private LinkedList<Entry<K, V>>[] table;
/** The number of keys */
private int numKeys;
/** The capacity */
private static final int CAPACITY = 101;
/** The maximum load factor */
private static final double LOAD_THRESHOLD = 3.0;
// Note this is equivalent to java.util.AbstractMap.SimpleEntry
/** Contains key-value pairs for a hash table.
@param <K> the key type
@param <V> the value type
*/
public static class Entry<K, V>
// Insert solution to programming project 6, chapter -1 here
{
/** The key */
private final K key;
/** The value */
private V value;
/**
* Creates a new key-value pair.
* @param key The key
* @param value The value
*/
public Entry(K key, V value) {
this.key = key;
this.value = value;
}
/**
* Retrieves the key.
* @return The key
*/
@Override
public K getKey() {
return key;
}
/**
* Retrieves the value.
* @return The value
*/
@Override
public V getValue() {
return value;
}
/**
* Sets the value.
* @param val The new value
* @return The old value
*/
@Override
public V setValue(V val) {
V oldVal = value;
value = val;
return oldVal;
}
// Insert solution to programming exercise 3, section 4, chapter 7 here
}
// Constructor
public HashtableChain() {
table = new LinkedList[CAPACITY];
}
// Constructor for test purposes
HashtableChain(int capacity) {
table = new LinkedList[capacity];
}
/**
* Method get for class HashtableChain.
* @param key The key being sought
* @return The value associated with this key if found;
* otherwise, null
*/
@Override
public V get(Object key) {
int index = key.hashCode() % table.length;
if (index < 0) {
index += table.length;
}
if (table[index] == null) {
return null; // key is not in the table.
}
// Search the list at table[index] to find the key.
for (Entry<K, V> nextItem : table[index]) {
if (nextItem.getKey().equals(key)) {
return nextItem.getValue();
}
}
// assert: key is not in the table.
return null;
}
/**
* Method put for class HashtableChain.
* @post This key-value pair is inserted in the
* table and numKeys is incremented. If the key is already
* in the table, its value is changed to the argument
* value and numKeys is not changed.
* @param key The key of item being inserted
* @param value The value for this key
* @return The old value associated with this key if
* found; otherwise, null
*/
@Override
public V put(K key, V value) {
int index = key.hashCode() % table.length;
if (index < 0) {
index += table.length;
}
if (table[index] == null) {
// Create a new linked list at table[index].
table[index] = new LinkedList<>();
}
// Search the list at table[index] to find the key.
for (Entry<K, V> nextItem : table[index]) {
// If the search is successful, replace the old value.
if (nextItem.getKey().equals(key)) {
// Replace value for this key.
V oldVal = nextItem.getValue();
nextItem.setValue(value);
return oldVal;
}
}
// assert: key is not in the table, add new item.
table[index].addFirst(new Entry<>(key, value));
numKeys++;
if (numKeys > (LOAD_THRESHOLD * table.length)) {
rehash();
}
return null;
}
/** Returns true if empty
@return true if empty
*/
@Override
public boolean isEmpty() {
return numKeys == 0;
}
}
"Assume that the table size is 10 and key.hashCode() is 124 so index is found as 4. In for each loop table[index] is started from table[4]"
Correct.
"there are null checks but initially there is no any null assignment for key and value. So how can the checking work?"
When an array of objects is initialized, all values are set to null.
"index is being incremented one by one 4,5,6,7... so on. But what about indices 0,1,2,3? Are they been checked? (I think no) Isn't there any possibility that occurring of key on one of the indices? (I think yes)."
Looks like there's some misunderstanding here. First, think of the data structure like this (with data having already been added to it):
table:
[0] -> null
[1] -> LinkedList -> item 1 -> item 2 -> item 3
[2] -> LinkedList -> item 1
[3] -> null
[4] -> LinkedList -> item 1 -> item 2
[5] -> LinkedList -> item 1 -> item 2 -> item 3 -> item 4
[6] -> null
Another important point is that the hash code for a given key should not change, so it will always map to the same index in the table.
So say we call get with a key whose hash code maps it to 3; then we know that it's not in the table:
if (table[index] == null) {
return null; // key is not in the table.
}
If another key comes in that maps to 1, now we need to iterate over the LinkedList:
// LinkedList<Entry<K, V>> list = table[index]
for (Entry<K, V> nextItem : table[index]) {
// iterate over item 1, item 2, item 3 until we find one that is equal.
if (nextItem.getKey().equals(key)) {
return nextItem.getValue();
}
}
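To check this behaviour yourself, here is a small self-contained sketch that reuses the same index computation but stores bare Integer keys instead of Entry objects. ChainDemo and its method names are made up for this illustration and are not part of the book's class.

```java
import java.util.LinkedList;

// Hypothetical demo class, simplified from the question's HashtableChain.
class ChainDemo {
    // Same index computation as get() and put() in the question.
    static int indexFor(Object key, int tableLength) {
        int index = key.hashCode() % tableLength;
        if (index < 0) {
            index += tableLength;
        }
        return index;
    }

    @SuppressWarnings("unchecked")
    static LinkedList<Integer>[] newTable(int length) {
        return new LinkedList[length]; // every slot starts out as null
    }

    static void add(LinkedList<Integer>[] table, Integer key) {
        int index = indexFor(key, table.length);
        if (table[index] == null) {
            table[index] = new LinkedList<>();
        }
        table[index].addFirst(key);
    }

    // Only the list in the key's own bucket is searched; the for-each loop
    // walks the entries of that one list, not the other array slots.
    static boolean contains(LinkedList<Integer>[] table, Integer key) {
        int index = indexFor(key, table.length);
        if (table[index] == null) {
            return false;
        }
        for (Integer stored : table[index]) {
            if (stored.equals(key)) {
                return true;
            }
        }
        return false;
    }
}
```

With a table of length 10, the keys 124 and 14 both land in bucket 4 and share one list, while every other slot stays null.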
I think you aren't quite visualizing your hash table correctly. There are two equally good simple implementations of a hash table.
Method 1 uses linked lists: An array (well, Vector, actually) of linked lists.
Given a "key", you derive a hash value for that key(*). You take the remainder of that hash value relative to the current size of the vector, let's call that "x". Then you sequentially search the linked list that vector[x] points to for a match to your key.
(*) You hope that the hash values will be reasonably well distributed. There are complex algorithms for doing this. Let's hope the hashCode() implementation you rely on does a good job of this.
Method 2 avoids linked lists: you create a Vector and compute an index into the Vector (as above). Then you look at Vector.get(x). If that's the key you want, you return the corresponding value. Let's assume it's not. Then you look at Vector.get(x+1), Vector.get(x+2), etc., wrapping around to the start when you reach the end. Eventually, one of the following three things will happen:
a) You find the key you are looking for. Then you return the corresponding value.
b) you find an empty entry (key == null). Return null or whatever value you have chosen to mean "this isn't the droid you're looking for".
c) you have examined every entry in the Vector. Again, return null or whatever.
Checking for (c) is a precaution, so that if the hash table happens to be full you won't loop forever. If the hash table is about to become full (you can keep a count of how many entries have been used), you should reallocate a bigger hash table. Ideally, you want to keep the hash table sparse enough that you never get anywhere near searching the whole table: that would defeat the whole purpose of a hash table, which is that you can search it in much less than linear time, ideally in O(1) (that is, the number of comparisons is <= a small constant). I would suggest that you allocate a Vector that is at least 10x the number of entries you expect to put in it.
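A rough sketch of Method 2 might look like the following. It uses plain arrays instead of a Vector, String keys and int values, and never resizes; the class name ProbingTable is invented for this example. The comments mark where cases (a), (b) and (c) show up.

```java
// Hypothetical fixed-size table using linear probing (no resizing, no removal).
class ProbingTable {
    private final String[] keys;
    private final int[] values;

    ProbingTable(int capacity) {
        keys = new String[capacity];   // all slots start as null (empty)
        values = new int[capacity];
    }

    private int indexFor(String key) {
        int index = key.hashCode() % keys.length;
        return index < 0 ? index + keys.length : index;
    }

    // Returns false when the table is full and the key is not already present.
    boolean put(String key, int value) {
        int start = indexFor(key);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length; // wrap around at the end
            if (keys[i] == null) {
                keys[i] = key;
                values[i] = value;
                return true;
            }
            if (keys[i].equals(key)) {
                values[i] = value; // overwrite existing mapping
                return true;
            }
        }
        return false;
    }

    Integer get(String key) {
        int start = indexFor(key);
        for (int step = 0; step < keys.length; step++) {
            int i = (start + step) % keys.length;
            if (keys[i] == null) {
                return null;           // (b) empty slot: key is absent
            }
            if (keys[i].equals(key)) {
                return values[i];      // (a) found the key
            }
        }
        return null;                   // (c) examined every slot
    }
}
```

Bounding the loop by the table length is what guarantees termination on a full table.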
The word "chaining" in your question refers to the first type of hash table (Method 1, with linked lists), which is what the code you posted implements.
Btw, 10 is a poor choice for the size of a hash table; with modulo indexing, a prime number is the usual recommendation.
Hope this helps.

Finding amount of certain item in the hashmap in Java

I have a method, which I use to count the items in a hashmap:
public void getAvailable(final Item item) {
System.out.println("\n" + "Item's \"" + item.getItemName() + "\" stock");
System.out.println("Name\tPrice\tAmount");
for (Map.Entry<Item, Integer> entry : stockItems.entrySet()) {
System.out.println(entry.getKey() + "\t" + entry.getValue());
}
}
But if I have specified the key item, how can I find the amount of all the items with that key in the hashmap? At the moment it returns me all the items with different keys.
Would the following achieve what you're after?
public int getAvailable(final Item item) {
int count = 0;
String itemName = item.getItemName();
for (Map.Entry<Item, Integer> entry : stockItems.entrySet()) {
Item i = entry.getKey();
if(itemName.equals(i.getItemName())) {
count += entry.getValue();
}
}
return count;
}
EDIT: edited count to start at 0
"how can I find the amount of all the items with that key in the hashmap?"
A HashMap key is unique. You will have only one value for any key.
I have to guess what you're trying to achieve so I assume the following:
- your map keys are instances of Item
- you only have the item's key and want to find the corresponding entry in the map
What you could do is:
1. use a separate map to look up the Item instance for the key and use it to access the counts map
2. create a "dummy" (lookup) item which only gets the data that is used in the equals() and hashCode() methods and use that to access the counts map
Example for 1.:
Map<String, Item> items = ...;
Integer quantity = stockItems.get(items.get("item"));
Example for 2.:
class Item {
private String key;
public Item(String key) {
this.key = key;
}
...
//equals() and hashCode() should only use the key field
}
Integer quantity = stockItems.get( new Item("item") );
Update:
If the key is not the only attribute of an item, you'd have to iterate over all entries in the map, check the item's key for a match and create the sum yourself.
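As a runnable illustration of the "dummy" lookup item, here is a sketch in which equals() and hashCode() use only the name. StockItem, StockLookup and amountOf are hypothetical names, and the price field is an assumed extra attribute that is deliberately left out of the comparison.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical item class: identity is the name only.
final class StockItem {
    private final String name;
    private final double price; // ignored by equals() and hashCode()

    StockItem(String name, double price) {
        this.name = name;
        this.price = price;
    }

    // Lookup-only constructor: the price is irrelevant for finding the entry.
    StockItem(String name) {
        this(name, 0.0);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StockItem)) return false;
        return name.equals(((StockItem) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class StockLookup {
    // Returns the stored amount for the named item, or 0 if it is not stocked.
    static int amountOf(Map<StockItem, Integer> stock, String name) {
        return stock.getOrDefault(new StockItem(name), 0);
    }
}
```

This only works while no two distinct stock entries share a name; otherwise the sum over all entries described above is needed.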
A key is unique in a HashMap, so there can be only one value for your specified key.

Java 2D HashMap: can't find the right approach to make a 2D HashMap, I mean a HashMap inside another HashMap

I want to make a board of students' names and subjects, where each student has a grade in each subject (or not: a student can skip the exam, and then that cell stays empty). I want to use just HashMaps. I mean, it will be something like this:
HashMap<String,HashMap<String,String>> bigBoard =
new HashMap<String,HashMap<String,String>>();
but I think I don't have the right idea, because for each subject there will be many grades (values), so that won't work. Do I have to make a map for each student, with their subjects? But then the table in the output won't be aligned. Do you have a suggestion?
I would like a table that looks something like this, for example.
Column key →
Row key ↓        Mathematics   Physics   Finance
Daniel Dolter    1.3           3.7
Micky Mouse                              5
Minnie Mouse                   1.7       n/a
Dagobert Duck    4.0                     1.0
(I would use all the keys/values as Strings, it will be more simple like that.)
After the implementation of our class (for example class-name is String2D), we should use it like that.
public static void main(String[] args) {
String2D map2D = new String2D();
map2D.put("Daniel Doster", "Practical Mathematics", "1.3");
map2D.put("Daniel Doster", "IT Systeme", "3.7");
map2D.put("Micky Mouse", "Finance", "5");
map2D.put("Minnie Mouse", "IT Systeme", "1.7");
map2D.put("Minnie Mouse", "Finance", "n/a");
map2D.put("Dagobert Duck", "Practical Mathematics", "4.0");
map2D.put("Dagobert Duck", "Finance", "1.0");
System.out.println(map2D);
}
No "HashMap" will be seen.. and Arrays aren't allowed
You can use this class:
public class BiHashMap<K1, K2, V> {
private final Map<K1, Map<K2, V>> mMap;
public BiHashMap() {
mMap = new HashMap<K1, Map<K2, V>>();
}
/**
* Associates the specified value with the specified keys in this map (optional operation). If the map previously
* contained a mapping for the key, the old value is replaced by the specified value.
*
* @param key1
* the first key
* @param key2
* the second key
* @param value
* the value to be set
* @return the value previously associated with (key1,key2), or <code>null</code> if none
* @see Map#put(Object, Object)
*/
public V put(K1 key1, K2 key2, V value) {
Map<K2, V> map;
if (mMap.containsKey(key1)) {
map = mMap.get(key1);
} else {
map = new HashMap<K2, V>();
mMap.put(key1, map);
}
return map.put(key2, value);
}
/**
* Returns the value to which the specified key is mapped, or <code>null</code> if this map contains no mapping for
* the key.
*
* @param key1
* the first key whose associated value is to be returned
* @param key2
* the second key whose associated value is to be returned
* @return the value to which the specified key is mapped, or <code>null</code> if this map contains no mapping for
* the key
* @see Map#get(Object)
*/
public V get(K1 key1, K2 key2) {
if (mMap.containsKey(key1)) {
return mMap.get(key1).get(key2);
} else {
return null;
}
}
/**
* Returns <code>true</code> if this map contains a mapping for the specified key
*
* @param key1
* the first key whose presence in this map is to be tested
* @param key2
* the second key whose presence in this map is to be tested
* @return Returns true if this map contains a mapping for the specified key
* @see Map#containsKey(Object)
*/
public boolean containsKeys(K1 key1, K2 key2) {
return mMap.containsKey(key1) && mMap.get(key1).containsKey(key2);
}
public void clear() {
mMap.clear();
}
}
And then create and use it like this:
BiHashMap<String,String,String> bigBoard = new BiHashMap<String,String,String>();
However, for performance you may want to store the grades in an array (assuming that you have a fixed set of courses).
I don't think a nested hashmap is the way to go. Create a Student class and Subject class.
public class Student{
private ArrayList<Subject> SubjectList = new ArrayList<Subject>();
private String name;
public Student(String name){
this.name=name;
}
public void addSubject(Subject s){
SubjectList.add(s);
}
public String getName(){
return this.name;
}
//...add methods for other operations
}
public class Subject{
private ArrayList<Double> GradeList = new ArrayList<Double>();
private String name;
public Subject(String name){
this.name=name;
}
public void addGrade(double s){
GradeList.add(s);
}
//...add methods for other operations
}
Then you can store the Students instances in a hashmap.
public static void main(String[] args){
HashMap<String, Student> hm = new HashMap<String, Student>();
Student s = new Student("Daniel Dolter");
Subject sub = new Subject("Mathematics");
sub.addGrade(1.3);
s.addSubject(sub);
hm.put(s.getName(),s);
}
With Java 8 you can use computeIfAbsent to insert a default value when a key has no mapping yet.
So you can simply use this as the type of the 2D map:
Map<RowType, Map<ColumnType, ValueType>> map = new WhateverMap<>();
Let's say all types are int, that is, Map<Integer, Map<Integer, Integer>>:
int get(int x, int y) {
    return map.computeIfAbsent(x, key -> new WhateverMap<>()).computeIfAbsent(y, key -> 0);
}
void put(int x, int y, int value) {
    map.computeIfAbsent(x, key -> new WhateverMap<>()).put(y, value);
}
Note that the two chained calls are not atomic as a whole, so this is not thread-safe even if WhateverMap is.
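Here is a self-contained version of that idea with HashMap filled in for WhateverMap. The class name Grid and the extra methods peek and cellCount are additions made for this sketch; they are there to show that get() as written above inserts a default entry as a side effect.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical 2D map of ints built on nested HashMaps.
class Grid {
    private final Map<Integer, Map<Integer, Integer>> map = new HashMap<>();

    void put(int x, int y, int value) {
        map.computeIfAbsent(x, key -> new HashMap<>()).put(y, value);
    }

    // Inserts 0 for a missing cell, then returns the stored value.
    int get(int x, int y) {
        return map.computeIfAbsent(x, key -> new HashMap<>()).computeIfAbsent(y, key -> 0);
    }

    // Reads a cell without modifying the map; missing cells read as 0.
    int peek(int x, int y) {
        Map<Integer, Integer> row = map.get(x);
        return row == null ? 0 : row.getOrDefault(y, 0);
    }

    // Number of cells currently stored, including defaults inserted by get().
    int cellCount() {
        int count = 0;
        for (Map<Integer, Integer> row : map.values()) {
            count += row.size();
        }
        return count;
    }
}
```

If the table should stay sparse, a read that does not insert (like peek here) is usually the safer default.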
You can use Google Guava's Table<R, C, V> collection. It is similar to eabraham's answer. A value V is keyed by a row R and a column C. It is a better alternative to using HashMap<R, HashMap<C, V>> which becomes quickly unreadable and difficult to work with.
See their GitHub Wiki for more information.

Memory efficient multivaluemap

Hi I have the following problem:
I'm storing strings and a corresponding list of integer values in an MultiValueMap<String, Integer>
I'm storing about 13 000 000 (13 million) strings, and one string can have up to 500 or more values.
For every single value I will have random access to the map, so the worst case is 13 000 000 * 500 put calls. The speed of the map is good, but the memory overhead gets quite high. A MultiValueMap<String, Integer> is nothing other than a HashMap or TreeMap<String, ArrayList<Integer>>, and both HashMap and TreeMap have quite a lot of memory overhead. I won't be modifying the map once it is built, but I need it to be fast and as small as possible for random access in a program. (I'm storing it on disk and loading it on start; the serialized map file takes up about 600 MB, but in memory it's about 3 GB.)
The most memory-efficient approach would be to store the strings in a sorted String array and keep a corresponding two-dimensional int array for the values. Access would then be a binary search on the string array followed by fetching the corresponding values.
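As a sketch of that read-only layout (the class name CompactMultiMap is made up, and the keys are assumed to be already sorted in natural String order):

```java
import java.util.Arrays;

// Hypothetical read-only multimap backed by two parallel arrays.
class CompactMultiMap {
    private final String[] keys;   // sorted ascending, no duplicates
    private final int[][] values;  // values[i] holds the values of keys[i]

    CompactMultiMap(String[] sortedKeys, int[][] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values must have the same length");
        }
        this.keys = sortedKeys;
        this.values = values;
    }

    // Binary search on the key array; returns an empty array for unknown keys.
    int[] get(String key) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : new int[0];
    }
}
```

A lookup costs O(log n) string comparisons, and the per-entry overhead is one reference in each array plus the int[] header.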
Now I have three ways to get there:
1. I use a sorted MultiValueMap (TreeMap) during the creation phase to store everything. After I have collected all the values, I get the string array by calling map.keySet().toArray(new String[0]), create a two-dimensional int array, and copy all the values out of the MultiValueMap.
Pro: It's easy to implement, and it is still fast during creation.
Con: It takes up even more memory while copying from the map to the arrays.
2. I use arrays or maybe ArrayLists from the start and store everything in there.
Pro: Least memory overhead.
Con: This would be enormously slow, because I would have to sort/copy the array every time I add a new key. I would also need to implement my own (probably even slower) sorting to keep the corresponding int array in the same order as the strings. Hard to implement.
3. I use arrays and a MultiValueMap as a buffer. After the program has finished 10% or 20% of the creation phase, I add the values to the arrays, keep them in order, and then start a new map.
Pro: Probably still fast enough and memory-efficient enough.
Con: Hard to implement.
None of these solutions really feel right to me. Do you know any other solutions to this problem, maybe a memory efficient (MultiValue)Map implementation?
I know I could use a database, so don't bother posting that as an answer. I want to know how I could do this without a database.
If you switched to Guava's Multimap -- I have no idea if that's possible for your application -- you might be able to use Trove and get
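ListMultimap<String, Integer> multimap = Multimaps.newListMultimap(
    new HashMap<String, Collection<Integer>>(),
    new Supplier<List<Integer>>() {
        @Override
        public List<Integer> get() {
            // Wrap a primitive-backed Trove list so it can be used as a List<Integer>.
            return new TIntListDecorator(new TIntArrayList());
        }
    });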
which will make a ListMultimap that uses a HashMap to map to List values backed by int[] arrays, which should be memory-efficient, though you'll pay a small speed penalty because of boxing. You might be able to do something similar for MultiValueMap, though I have no idea what library that's from.
You can use compressed strings to drastically reduce memory usage.
Parameters to configure your JVM
Comparison of its usage between various Java versions
Furthermore, there are other, more drastic solutions (they would require some reimplementation):
A memory/disk-based list implementation, or suggestions about NoSQL databases.
Depending on which Integer values you store in your map, a large amount of your heap memory overhead may be caused by having distinct Integer instances, which take up much more RAM than a primitive int value.
Consider using a Map from String to one of the many IntArrayList implementations floating around (e.g. in Colt or in Primitive Collections for Java), which basically implement a List backed by an int array instead of being backed by an array of Integer instances.
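To make the idea concrete, here is a minimal sketch of what such an int-backed list looks like inside. The class name IntList and its methods are made up for illustration; they are not the API of Colt or any other library, which offer far more complete versions.

```java
import java.util.Arrays;

/** Minimal growable list of primitive ints (hypothetical class, for illustration only). */
class IntList {
    private int[] data = new int[8];
    private int size = 0;

    /** Appends a value, doubling the backing array when it is full. */
    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, size * 2);
        }
        data[size++] = value;
    }

    /** Returns the value at the given position. */
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    /** Returns a copy trimmed to the exact size, which is handy once the map is final. */
    int[] toArray() {
        return Arrays.copyOf(data, size);
    }
}
```

A Map<String, IntList> would then hold 4 bytes per stored value plus some slack in each backing array, rather than one Integer object per value.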
First, consider the memory taken by the integers. You said that the range will be about 0-4000000. 24 bits is enough to represent 16777216 distinct values. If that is acceptable, you could use byte arrays for the integers, with 3 bytes per integer, and save 25%. You would have to index into the array something like this:
int getPackedInt(byte[] array, int index) {
    int i = index * 3;
    return ((array[i] & 0xFF) << 16) + ((array[i + 1] & 0xFF) << 8) + (array[i + 2] & 0xFF);
}

void storePackedInt(byte[] array, int index, int value) {
    // Only values that fit in 24 bits can be stored.
    assert value >= 0 && value <= 0xFFFFFF;
    int i = index * 3;
    array[i] = (byte) ((value >> 16) & 0xFF);
    array[i + 1] = (byte) ((value >> 8) & 0xFF);
    array[i + 2] = (byte) (value & 0xFF);
}
Can you say anything about the distribution of the integers? If many of them will fit in 16 bits, you could use an encoding with a variable number of bytes per number (something like UTF-8 does for representing characters).
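As a rough sketch of such a variable-length encoding, the code below stores 7 data bits per byte and uses the high bit to signal that more bytes follow. This is one common scheme, not necessarily the best fit for your data, and the class and method names are invented for the example. Values below 128 take one byte, values below 16384 take two, and a value around 4000000 takes four.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper showing a 7-bits-per-byte variable-length int encoding. */
class VarInts {
    /** Appends a non-negative value; the high bit of each byte means "more bytes follow". */
    static void write(ByteArrayOutputStream out, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value: " + value);
        }
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    /** Decodes the first 'count' values from the array, in the order they were written. */
    static int[] readAll(byte[] bytes, int count) {
        int[] result = new int[count];
        int pos = 0;
        for (int n = 0; n < count; n++) {
            int value = 0;
            int shift = 0;
            int b;
            do {
                b = bytes[pos++] & 0xFF;
                value |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            result[n] = value;
        }
        return result;
    }
}
```

The trade-off is that you can no longer jump straight to the n-th value of a key; the values for one key have to be decoded in sequence.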
Next, consider whether you can save memory on the Strings. What are the characteristics of the Strings? How long will they typically be? Will many of them share prefixes? A compression scheme tailored to the characteristics of your application could save a lot of space (as falsarella pointed out). Alternatively, if many Strings share prefixes, storing them in some type of search trie could be more efficient. (There is a type of trie called "patricia" which might be suitable for this application.) As a bonus, note that searching for Strings in a trie can be faster than searching a hash map (though you'd have to benchmark to see if that is true in your application).
Will the Strings all be ASCII? If so, 50% of the memory used for Strings will be wasted, as a Java char is 16 bits. Again, in this case, you could consider using byte arrays.
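A small sketch of the byte-array idea, assuming the keys really are pure 7-bit ASCII (the AsciiKeys class is made up for this example):

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that stores ASCII keys as one byte per character. */
class AsciiKeys {
    /** Encodes a key as bytes; rejects anything outside 7-bit ASCII instead of silently mangling it. */
    static byte[] encode(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (key.charAt(i) > 0x7F) {
                throw new IllegalArgumentException("not ASCII: " + key);
            }
        }
        return key.getBytes(StandardCharsets.US_ASCII);
    }

    /** Turns the stored bytes back into a String. */
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}
```

Each byte[] still carries its own object header, so for many short keys it may pay off to concatenate all keys into one large byte[] and keep an int array of start offsets. On JVMs that already store Latin-1 Strings compactly (Java 9 and later) the gain from this will be smaller.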
If you only need to look Strings up, not iterate over the stored Strings, you could also consider something rather unconventional: hash the Strings, and keep only the hash. Since different Strings can hash to the same value, there is a chance that a String which was never stored may still be "found" by a search. But if you use enough bits for the hash value (and a good hash function), you can make that chance so infinitesimally small that it will almost certainly never happen in the estimated lifespan of the universe.
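Here is a minimal sketch of the hash-only approach. The class name is hypothetical, and 64-bit FNV-1a is used only because it fits in a few lines; it is an assumption of this example, not a recommendation, and a wider or stronger hash may be preferable.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical index that keeps only a 64-bit hash per key, never the key itself. */
class HashedKeyIndex {
    private final long[] hashes; // sorted, one entry per stored key

    HashedKeyIndex(String[] keys) {
        hashes = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {
            hashes[i] = fnv1a64(keys[i]);
        }
        Arrays.sort(hashes);
    }

    /** True if the key was stored, or (very rarely) if a stored key happens to share its hash. */
    boolean probablyContains(String key) {
        return Arrays.binarySearch(hashes, fnv1a64(key)) >= 0;
    }

    /** 64-bit FNV-1a over the UTF-8 bytes of the string. */
    static long fnv1a64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x100000001b3L;
        }
        return h;
    }
}
```

To attach values, the value arrays would have to be ordered by hash as well, so that the position returned by the binary search can index them. With 64 bits and about 13 million keys, the birthday bound puts the chance of any two stored keys colliding at roughly a few in a million, assuming a well-distributed hash; 128 bits makes it negligible.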
Finally, there is the memory for the structure itself, which holds the Strings and integers. I already suggested using a trie, but if you decide not to do that, nothing will use less memory than parallel arrays -- one sorted array of Strings (which you can do binary search on, as you said), and a parallel array of arrays of integers. After you do a binary search to find an index into the String array, you can use the same index to access the array-of-integer array.
While you are building the structure, if you do decide that a search trie is a good choice, I would just use that directly. Otherwise, you could do 2 passes: one to build up a set of strings (then put them into an array and sort them), and a second pass to add the arrays of integers.
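The parallel-array layout could look roughly like the sketch below. For brevity it builds the arrays from a TreeMap, as in the first option of the question, rather than in two passes; the lookup side is the same either way. FrozenMultiMap is a made-up name, and the sketch assumes the TreeMap uses the natural String ordering, because Arrays.binarySearch relies on that same ordering.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical read-only multimap stored as a sorted key array plus a parallel int[][] array. */
class FrozenMultiMap {
    private final String[] keys;  // sorted by natural String order
    private final int[][] values; // values[i] holds the integers for keys[i]

    /** Copies the contents of a naturally ordered TreeMap into the two arrays. */
    FrozenMultiMap(TreeMap<String, List<Integer>> source) {
        keys = new String[source.size()];
        values = new int[source.size()][];
        int i = 0;
        for (Map.Entry<String, List<Integer>> e : source.entrySet()) {
            keys[i] = e.getKey();
            List<Integer> list = e.getValue();
            int[] row = new int[list.size()];
            for (int j = 0; j < row.length; j++) {
                row[j] = list.get(j);
            }
            values[i] = row;
            i++;
        }
    }

    /** Returns the values stored for the key, or null if the key is absent. */
    int[] get(String key) {
        int idx = Arrays.binarySearch(keys, key);
        return idx >= 0 ? values[idx] : null;
    }
}
```

Lookup is O(log n) String comparisons, and the only per-key overhead beyond the String itself is one array reference and one int[] header.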
If there are patterns to your key strings, especially common roots, then a Trie could be an effective way of storing significantly less data.
Here's the code for a working TrieMap.
NB: The usual advice on using EntrySet to iterate across Maps does not apply to Tries. Entry sets are exceptionally inefficient in a Trie, so please avoid requesting one if at all possible.
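import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/**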
* Implementation of a Trie structure.
*
* A Trie is a compact form of tree that takes advantage of common prefixes
* to the keys.
*
* A normal HashSet will take the key and compute a hash from it, this hash will
* be used to locate the value through various methods but usually some kind
* of bucket system is used. The memory footprint resulting becomes something
* like O(n).
*
* A Trie structure essentially combines all common prefixes into a single key.
* For example, holding the strings A, AB, ABC and ABCD will only take enough
* space to record the presence of ABCD. The presence of the others will be
* recorded as flags within the record of ABCD structure at zero cost.
*
* This structure is useful for holding similar strings such as product IDs or
* credit card numbers.
*
*/
public class TrieMap<V> extends AbstractMap<String, V> implements Map<String, V> {
/**
* Map each character to a sub-trie.
*
* Could replace this with a 256 entry array of Tries but this will handle
* multibyte character sets and I can discard empty maps.
*
* Maintained at null until needed (for better memory footprint).
*
*/
protected Map<Character, TrieMap<V>> children = null;
/**
* Here we store the map contents.
*/
protected V leaf = null;
/**
* Set the leaf value to a new setting and return the old one.
*
* @param newValue
* @return old value of leaf.
*/
protected V setLeaf(V newValue) {
V old = leaf;
leaf = newValue;
return old;
}
/**
* I've always wanted to name a method something like this.
*/
protected void makeChildren () {
if ( children == null ) {
// Use a TreeMap to ensure sorted iteration.
children = new TreeMap<Character, TrieMap<V>>();
}
}
/**
* Finds the TrieMap that "should" contain the key.
*
* @param key
*
* The key to find.
*
* @param grow
*
* Set to true to grow the Trie to fit the key.
*
* @return
*
* The sub Trie that "should" contain the key or null if key was not found and
* grow was false.
*/
protected TrieMap<V> find(String key, boolean grow) {
if (key.length() == 0) {
// Found it!
return this;
} else {
// Not at end of string.
if (grow) {
// Grow the tree.
makeChildren();
}
if (children != null) {
// Ask the kids.
char ch = key.charAt(0);
TrieMap<V> child = children.get(ch);
if (child == null && grow) {
// Make the child.
child = new TrieMap<V>();
// Store the child.
children.put(ch, child);
}
if (child != null) {
// Find it in the child.
return child.find(tail(key), grow);
}
}
}
return null;
}
/**
* Remove the head (first character) from the string.
*
* @param s
*
* The string.
*
* @return
*
* The same string without the first (head) character.
*
*/
// Suppress warnings over taking a subsequence
private String tail(String s) {
return s.substring(1, s.length());
}
/**
*
* Add a new value to the map.
*
* Time footprint = O(key.length()).
*
* @param key
*
* The key defining the place to add.
*
* @param value
*
* The value to add there.
*
* @return
*
* The value that was there, or null if it wasn't.
*
*/
@Override
public V put(String key, V value) {
V old = null;
// If empty string.
if (key.length() == 0) {
old = setLeaf(value);
} else {
// Find it.
old = find(key, true).put("", value);
}
return old;
}
/**
* Gets the value at the specified key position.
*
* @param o
*
* The key to the location.
*
* @return
*
* The value at that location, or null if there is no value at that location.
*/
@Override
public V get(Object o) {
V got = null;
if (o != null) {
String key = (String) o;
TrieMap<V> it = find(key, false);
if (it != null) {
got = it.leaf;
}
} else {
throw new NullPointerException("Nulls not allowed.");
}
return got;
}
/**
* Remove the value at the specified location.
*
* @param o
*
* The key to the location.
*
* @return
*
* The value that was removed, or null if there was no value at that location.
*/
@Override
public V remove(Object o) {
V old = null;
if (o != null) {
String key = (String) o;
if (key.length() == 0) {
// Its me!
old = leaf;
leaf = null;
} else {
TrieMap<V> it = find(key, false);
if (it != null) {
old = it.remove("");
}
}
} else {
throw new NullPointerException("Nulls not allowed.");
}
return old;
}
/**
* Count the number of values in the structure.
*
* @return
*
* The number of values in the structure.
*/
@Override
public int size() {
// If I am a leaf then size increases by 1.
int size = leaf != null ? 1 : 0;
if (children != null) {
// Add sizes of all my children.
for (Character c : children.keySet()) {
size += children.get(c).size();
}
}
return size;
}
/**
* Is the tree empty?
*
* @return
*
* true if the tree is empty.
* false if there is still at least one value in the tree.
*/
@Override
public boolean isEmpty() {
// I am empty if I am not a leaf and I have no children
// (slightly quicker than the AbstractMap implementation).
return leaf == null && (children == null || children.isEmpty());
}
/**
* Returns all keys as a Set.
*
* @return
*
* A HashSet of all keys.
*
* Note: Although it returns Set<S> it is actually a Set<String> that has been
* home-grown because the original keys are not stored in the structure
* anywhere.
*/
@Override
public Set<String> keySet() {
// Roll them a temporary list and give them a Set from it.
return new HashSet<String>(keyList());
}
/**
* List all my keys.
*
* @return
*
* An ArrayList of all keys in the tree.
*
* Note: Although it returns List<S> it is actually a List<String> that has been
* home-grown because the original keys are not stored in the structure
* anywhere.
*
*/
protected List<String> keyList() {
List<String> contents = new ArrayList<String>();
if (leaf != null) {
// If I am a leaf, a null string is in the set.
contents.add((String) "");
}
// Add all sub-tries.
if (children != null) {
for (Character c : children.keySet()) {
TrieMap<V> child = children.get(c);
List<String> childContents = child.keyList();
for (String subString : childContents) {
// All possible substrings can be prepended with this character.
contents.add((String) (c + subString.toString()));
}
}
}
return contents;
}
/**
* Does the map contain the specified key.
*
* @param key
*
* The key to look for.
*
* @return
*
* true if the key is in the Map.
* false if not.
*/
public boolean containsKey(String key) {
TrieMap<V> it = find(key, false);
if (it != null) {
return it.leaf != null;
}
return false;
}
/**
* Represent me as a list.
*
* @return
*
* A String representation of the tree.
*/
@Override
public String toString() {
List<String> list = keyList();
//Collections.sort((List<String>)list);
StringBuilder sb = new StringBuilder();
// Empty before the first entry, a comma afterwards.
String sep = "";
sb.append("{");
for (String s : list) {
sb.append(sep).append(s).append("=").append(get(s));
sep = ",";
}
sb.append("}");
return sb.toString();
}
/**
* Clear down completely.
*/
@Override
public void clear() {
children = null;
leaf = null;
}
/**
* Return a list of key/value pairs.
*
* @return
*
* The entry set.
*/
public Set<Map.Entry<String, V>> entrySet() {
Set<Map.Entry<String, V>> entries = new HashSet<Map.Entry<String, V>>();
List<String> keys = keyList();
for (String key : keys) {
entries.add(new Entry<String,V>(key, get(key)));
}
return entries;
}
/**
* An entry.
*
* @param <S>
*
* The type of the key.
*
* @param <V>
*
* The type of the value.
*/
private static class Entry<S, V> implements Map.Entry<S, V> {
protected S key;
protected V value;
public Entry(S key, V value) {
this.key = key;
this.value = value;
}
public S getKey() {
return key;
}
public V getValue() {
return value;
}
public V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
@Override
public boolean equals(Object o) {
if (!(o instanceof TrieMap.Entry)) {
return false;
}
Entry e = (Entry) o;
return (key == null ? e.getKey() == null : key.equals(e.getKey()))
&& (value == null ? e.getValue() == null : value.equals(e.getValue()));
}
@Override
public int hashCode() {
int keyHash = (key == null ? 0 : key.hashCode());
int valueHash = (value == null ? 0 : value.hashCode());
return keyHash ^ valueHash;
}
@Override
public String toString() {
return key + "=" + value;
}
}
}
