Java, hashmap inside a hashmap - java

This is a follow-up to my earlier question: How To Access hash maps key when the key is an object
I wanted to try something like this: webSearchHash.put(xfile.getPageTitle(i),outlinks.put(keyphrase.get(i), xfile.getOutLinks(i)));
I wonder why my values are null.
Here is my code:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Set;
import readFile.*;
public class WebSearch {
readFile.ReadFile xfile = new readFile.ReadFile("inputgraph.txt");
HashMap webSearchHash = new HashMap();
ArrayList belongsTo = new ArrayList();
ArrayList keyphrase = new ArrayList();
public WebSearch() {
}
public void createGraph()
{
HashMap <Object, ArrayList<Integer> > outlinks = new HashMap <Object, ArrayList<Integer>>();
for (int i = 0; i < xfile.getNumberOfWebpages(); i++ )
{
keyphrase.add(i,xfile.getKeyPhrases(i));
webSearchHash.put(xfile.getPageTitle(i),outlinks.put(keyphrase.get(i), xfile.getOutLinks(i)));
}
}
}
When I do System.out.print(webSearchHash); the output is {Star-Ledger=null, Apple=null, Microsoft=null, Intel=null, Rutgers=null, Targum=null, Wikipedia=null, New York Times=null}
However, System.out.print(outlinks); gives me: {[education, news, internet]=[0, 3], [power, news]=[1, 4], [computer, internet, device, ipod]=[2]}. Basically, I want a HashMap to be the value for each key.

You really shouldn't use a HashMap (or any mutable object) as your key, since it will destabilize your Map. Depending on what you're intending to accomplish, there may be a number of useful approaches and libraries, but using an unstable object as a Map key is asking for trouble.

So I figured I would just do this, which gives exactly what I want:
for (int i = 0; i < xfile.getNumberOfWebpages(); i++ )
{
HashMap <Object, ArrayList<Integer> > outlinks = new HashMap <Object, ArrayList<Integer>>();
keyphrase.add(i,xfile.getKeyPhrases(i));
outlinks.put(keyphrase.get(i), xfile.getOutLinks(i));
webSearchHash.put(xfile.getPageTitle(i), outlinks);
}

Your problem is that you are putting in null with this statement:
webSearchHash.put(xfile.getPageTitle(i),outlinks.put(keyphrase.get(i), xfile.getOutLinks(i)));
Let's break it down. A put has the form
map.put(key,value)
So for your key you have getPageTitle(i), which is fine.
For your value, you have the return value of
outlinks.put(keyphrase.get(i), xfile.getOutLinks(i))
According to the Javadoc, HashMap.put returns the previous value that was associated with this key (in this case keyphrase.get(i)), or null if no value was previously associated with it.
Since nothing was previously associated with your key, it returns null.
So your statement is effectively saying
webSearchHash.put(xfile.getPageTitle(i),null);
http://docs.oracle.com/javase/6/docs/api/java/util/HashMap.html#put(K, V)
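A minimal sketch of the behaviour described above. The page title and link data are made up to stand in for the asker's ReadFile input, and the class and method names are illustrative assumptions, not the original code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutReturnDemo {
    // Builds title -> (key phrases -> out-links) with one inner map per page.
    static Map<String, Map<List<String>, List<Integer>>> build() {
        Map<String, Map<List<String>, List<Integer>>> pages = new HashMap<>();
        Map<List<String>, List<Integer>> inner = new HashMap<>();
        inner.put(Arrays.asList("power", "news"), new ArrayList<>(Arrays.asList(1, 4)));
        // Store the inner map itself, not the return value of inner.put(...).
        pages.put("Apple", inner);
        return pages;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        System.out.println(m.put("a", 1)); // null: no previous value for "a"
        System.out.println(m.put("a", 2)); // 1: the value that was replaced
        System.out.println(build());
    }
}
```

The two put calls in main show why the nested call in the question stored null: the outer put received the inner put's return value rather than the inner map.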

Related

How can I add a string one at a time to a HashMap<Integer, List<String>>?

This function loops through a dictionary (allWords) and uses the
getKey function to generate a key. wordListMap is a HashMap<Integer, List<String>>, so I need to loop through and put the key and a List. If there is no list for the key I put one; if there is, I just need to append the next dictionary word. This is where I need help. I just can't figure out the syntax to simply append the next word to the list that is already there. Any help would be appreciated.
public static void constructWordListMap() {
wordListMap = new HashMap<>();
for (String w : allWords) {
int key = getKey(w);
if (isValidWord(w) && !wordListMap.containsKey(key)) {
List list = new ArrayList();
list.add(w);
wordListMap.put(key, list);
} else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.put(key, wordListMap.get(key).add(w));
}
}
}
map.get(key).add(value)
Simple as that.
So I've gathered that, given a HashMap<Integer, List<String>>, you'd like to:
create a List object
add String objects to said List
add that List object as a value to be paired with a previously generated key (type Integer)
To do so, you'd want to first generate the key
Integer myKey = getKey(w);
Then, you'd enter a loop and add to a List object
List<String> myList = new ArrayList<String>();
for(int i = 0; i < intendedListLength; i++) {
String myEntry = //wherever you get your string from
myList.add(myEntry);
}
Lastly, you'd add the List to the HashMap
myHash.put(myKey, myList);
Leave any questions in the comments.
else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.put(key, wordListMap.get(key).add(w));
}
If you want to add a new value to your list, you need to retrieve that list first. In the code above, you are putting the return value of add into the table (which is a boolean), and that is not what you want.
Instead, you will want to do as Paul said:
else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.get(key).add(w);
}
The reason this works is because you already added an ArrayList to the table earlier. Here, you are getting that ArrayList, and adding a new value to it.
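As an alternative on Java 8 or later, the "create the list if missing, then append" logic can be sketched with Map.computeIfAbsent. The getKey below is a hypothetical stand-in (word length) for the asker's function, and the isValidWord check is left out for brevity:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordListMapSketch {
    // Hypothetical stand-in for the asker's getKey: here, the word length.
    static int getKey(String w) {
        return w.length();
    }

    static Map<Integer, List<String>> group(List<String> words) {
        Map<Integer, List<String>> map = new HashMap<>();
        for (String w : words) {
            // Creates the list the first time a key is seen, then appends either way.
            map.computeIfAbsent(getKey(w), k -> new ArrayList<>()).add(w);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(group(Arrays.asList("cat", "dog", "bird")));
    }
}
```

computeIfAbsent returns the list that is in the map after the call, so add always operates on the stored list and nothing needs to be put back.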

How to see the distribution of keys in a HashMap?

When using a hash map, it's important to evenly distribute the keys over the buckets.
If all keys end up in the same bucket, you essentially end up with a list.
Is there a way to "audit" a HashMap in Java in order to see how well the keys are distributed?
I tried subtyping it and iterating Entry<K,V>[] table, but it's not visible.
I tried subtyping it and iterating Entry[] table, but it's not visible
Use Reflection API!
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
public class Main {
//This is to simulate instances which are not equal but go to the same bucket.
static class A {
@Override
public boolean equals(Object obj) { return false;}
@Override
public int hashCode() {return 42; }
}
public static void main(String[] args) throws Exception {
//Test data
HashMap<A, String> map = new HashMap<A, String>(4);
map.put(new A(), "abc");
map.put(new A(), "def");
//Access to the internal table
Class clazz = map.getClass();
Field table = clazz.getDeclaredField("table");
table.setAccessible(true);
Map.Entry<Integer, String>[] realTable = (Map.Entry<Integer, String>[]) table.get(map);
//Iterate and do pretty printing
for (int i = 0; i < realTable.length; i++) {
System.out.println(String.format("Bucket : %d, Entry: %s", i, bucketToString(realTable[i])));
}
}
private static String bucketToString(Map.Entry<Integer, String> entry) throws Exception {
if (entry == null) return null;
StringBuilder sb = new StringBuilder();
//Access to the "next" field of HashMap$Node
Class clazz = entry.getClass();
Field next = clazz.getDeclaredField("next");
next.setAccessible(true);
//going through the bucket
while (entry != null) {
sb.append(entry);
entry = (Map.Entry<Integer, String>) next.get(entry);
if (null != entry) sb.append(" -> ");
}
return sb.toString();
}
}
In the end you'll see something like this in STDOUT:
Bucket : 0, Entry: null
Bucket : 1, Entry: null
Bucket : 2, Entry: Main$A@2a=abc -> Main$A@2a=def
Bucket : 3, Entry: null
HashMap uses the hash codes produced by the hashCode() method of your key objects, so I guess you are really asking how evenly distributed those hash code values are. You can get hold of the key objects using Map.keySet().
Now, the OpenJDK and Oracle implementations of HashMap do not use the key hash codes directly, but apply another hashing function to the provided hashes before distributing them over the buckets. You should not rely on or use this implementation detail, so you ought to ignore it and just ensure that the hashCode() methods of your key objects are well distributed.
Examining the actual hash codes of some sample key objects is unlikely to tell you anything useful unless your hash code method is very poor. You would be better off doing a basic theoretical analysis of your hash code method. This is not as scary as it might sound. You may (indeed, you have no choice but to) assume that the hash code methods of the supplied Java classes are well distributed. Then you just need to check that the means you use for combining the hash codes of your data members behaves well for the expected values of those members. Only if your data members have values that are highly correlated in a peculiar way is this likely to be a problem.
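If you still want a rough picture without reflection, the following sketch estimates how a set of keys would spread over a table of a given size. It deliberately copies an OpenJDK 8+ implementation detail (the h ^ (h >>> 16) spreading step and the power-of-two mask), so treat it as a diagnostic estimate under that assumption, not as documented behaviour; the class and method names are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;

class BucketEstimate {
    // Mirrors the hash spreading used by OpenJDK 8+ HashMap (an implementation detail).
    static int spread(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        return h ^ (h >>> 16);
    }

    // Counts how many keys would land in each of tableSize buckets.
    // tableSize is assumed to be a power of two, as HashMap's table is.
    static int[] histogram(Iterable<?> keys, int tableSize) {
        int[] counts = new int[tableSize];
        for (Object key : keys) {
            counts[spread(key) & (tableSize - 1)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(1, 2, 3, 17, 33);
        // 1, 17 and 33 share bucket 1 in a 16-slot table.
        System.out.println(Arrays.toString(histogram(keys, 16)));
    }
}
```

A histogram with a few very large counts and many zeros suggests that the hashCode() of the key class deserves a closer look.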
You can use reflection to access the hidden fields (on Java 16 and later this requires running with --add-opens java.base/java.util=ALL-UNNAMED):
HashMap map = ...;
// get the HashMap#table field
Field tableField = HashMap.class.getDeclaredField("table");
tableField.setAccessible(true);
Object[] table = (Object[]) tableField.get(map);
int[] counts = new int[table.length];
// get the HashMap.Node#next field
Class<?> entryClass = table.getClass().getComponentType();
Field nextField = entryClass.getDeclaredField("next");
nextField.setAccessible(true);
for (int i = 0; i < table.length; i++) {
Object e = table[i];
int count = 0;
if (e != null) {
do {
count++;
} while ((e = nextField.get(e)) != null);
}
counts[i] = count;
}
Now you have an array of the entry counts for each bucket.
Client.java
public class Client{
public static void main(String[] args) {
Map<Example, Number> m = new HashMap<>();
Example e1 = new Example(100); //point 1
Example e2 = new Example(200); //point2
Example e3 = new Example(300); //point3
m.put(e1, 10);
m.put(e2, 20);
m.put(e3, 30);
System.out.println(m);//point4
}
}
Example.java
public class Example {
int s;
Example(int s) {
this.s =s;
}
@Override
public int hashCode() {
// TODO Auto-generated method stub
return 5;
}
}
At points 1, 2 and 3 in Client.java we create three keys of type Example, which are then inserted into the HashMap m. Since hashCode() is overridden in Example.java to return a constant, all three keys e1, e2 and e3 have the same hash code and therefore land in the same bucket.
Now the problem is how to see the distribution of keys.
Approach:
Insert a breakpoint at point 4 in Client.java.
Debug the Java application.
Inspect m.
Inside m, you will find the table array of type HashMap$Node and size 16.
This is literally the hash table. Each index holds a linked list of the Entry objects inserted into the HashMap. Each non-null entry has a hash field that corresponds to the value returned by HashMap's hash() method. In Java 7 and earlier this hash value is then passed to HashMap's indexFor() method to find the index in the table array where the Entry object is stored; Java 8 and later compute the index inline. (Refer to @Rahul's link in the comments on the question to understand hash and indexFor.)
For the case above, if we inspect table, we find that all slots but one are null.
We inserted three keys but can see only one, i.e. all three keys have been inserted into the same bucket, the same index of table.
Inspect that table element (in this case at index 5): its key corresponds to e1 and its value to 10 (point 1).
Its next field points to the next node of the linked list, i.e. the next Entry object, which is (e2, 20) in our case.
In this way you can inspect the HashMap.
I would also recommend reading through the internal implementation of HashMap to understand it thoroughly.
Hope this helps.

Reduce For loop in Map

This is my code:
package vvv;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class test {
public static void main(String args[]){
Map<Integer, String> map = new HashMap<Integer, String>();
map.put(1, "demo");
map.put(20, "fdemo");
map.put(60, "gdemo");
map.put(500, "udemo");
map.put(8000, "odemo");
// etc
int b = 7999;
for(int i =1; i<=8000; i++)
{
if(i == b)
System.out.println(map.get(b));
}
}
}
I don't want to use a big "for loop" just to find a result from map, and in addition, I can't change a key in the map (for example I can't change 500 to 4).
What should I do to reduce my loop condition?
This would iterate over all the keys in the Map :
for (Integer key : map.keySet()) {
...
}
However, I don't see the point in your loop, since you only do something for a specific key (7999), so your loop can be reduced to:
System.out.println(map.get(b));
Changing a key is not something you can do with a single method. You have to first remove the old key, and then insert the new key, using the same value.
if (map.containsKey(500))
map.put(4,map.remove(500));
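The remove-then-put idea can be wrapped in a small helper. This is only a sketch; renameKey is a hypothetical name, not a method of java.util.Map:

```java
import java.util.HashMap;
import java.util.Map;

class RenameKeySketch {
    // Moves the value stored under oldKey to newKey.
    // Returns false and leaves the map untouched if oldKey is absent.
    // Note: any value already stored under newKey is overwritten.
    static <K, V> boolean renameKey(Map<K, V> map, K oldKey, K newKey) {
        if (!map.containsKey(oldKey)) {
            return false;
        }
        map.put(newKey, map.remove(oldKey));
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(500, "udemo");
        System.out.println(renameKey(map, 500, 4)); // true
        System.out.println(map.get(4));             // udemo
    }
}
```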
Use:
for (Integer theKey : map.keySet()) {
String val=map.get(theKey);
....
}
If you are just trying to check whether some key exists, you don't even need a loop; use map.containsKey(some_key).
Also note that if your key is a custom class rather than a built-in type such as Integer or String, you need to override its equals() and hashCode() methods, otherwise lookups with an equal but distinct key object will fail.
Try the code below; it will work fine. There is no need to iterate over the map, just use the map.containsKey(key) method.
{
String tempString=null;
tempString=((map.containsKey(b)?map.get(b):" "));
System.out.println(tempString);
}
or you may also write like this
{
System.out.println((map.containsKey(b)?map.get(b):" "));
}
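On Java 8 or later, the containsKey-then-get pattern above can be collapsed into one call with Map.getOrDefault. A minimal sketch, with a made-up class name and the same " " fallback as in the answer:

```java
import java.util.HashMap;
import java.util.Map;

class LookupSketch {
    static String lookup(Map<Integer, String> map, int key) {
        // Single hash lookup; falls back to " " when the key is absent.
        return map.getOrDefault(key, " ");
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "demo");
        map.put(8000, "odemo");
        System.out.println(lookup(map, 8000)); // odemo
        System.out.println(lookup(map, 7999)); // prints a single space
    }
}
```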
for (Map.Entry<Integer, String> entry : map.entrySet()) {
Integer key = entry.getKey();
if (b == key)
System.out.println(entry.getValue());
}

Comparing Strings and Returning Boolean

I am currently working on one of the usecases where you are given 6 strings which has 3 oldValues and 3 newValues like given below:
String oldFirstName = "Yogend";
String oldLastName = "Jos";
String oldUserName = "YNJos";
String newFirstName = "Yogendra";
String newLastName = "Joshi";
String newUserName = "YNJoshi";
Now what I basically want to do is compare each old value with its corresponding new value and return true if they are not equal, i.e.
if(!oldFirstName.equalsIgnoreCase(newFirstName)) {
return true;
}
Since I have 3 fields now, and in future we might well have more Strings with old and new values, I am looking for a solution that works no matter how many old and new values are added, without gazillions of if/else clauses.
One possibility I thought of was putting the old values in an ArrayList and the new values in another ArrayList, then using removeAll to remove the duplicate values, but that does not work in some cases.
Can anyone help me out with some pointers on the best way to get this done?
Thanks,
Yogendra N Joshi
You can use lambdaj and hamcrest. These libraries are very powerful for managing collections. The following code is very simple and works:
import static ch.lambdaj.Lambda.filter;
import static ch.lambdaj.Lambda.having;
import static ch.lambdaj.Lambda.on;
import static org.hamcrest.Matchers.isIn;
import java.util.Arrays;
import java.util.List;
public class Test {
public static void main(String[] args) {
List<String> oldNames = Arrays.asList("nameA","nameE","nameC","namec","NameC");
List<String> newNames = Arrays.asList("nameB","nameD","nameC","nameE");
List<String> newList = filter(having(on(String.class), isIn(oldNames)),newNames);
System.out.print(newList);
//print nameC, nameE
}
}
With these libraries you can solve your problem in one line. You must add hamcrest-all-1.3.jar and lambdaj-2.4.jar to your project. Hope this helps.
NOTE: This will help you assuming you can have alternatives to your code.
You can use two HashMap<yourFieldName, yourFieldValue> instead of two Arrays / Lists / Sets of Strings (or multiple separate Strings).
Then you need a method to compare each value of both maps by their keys.
The result will be a HashMap<String,Boolean> containing the name of each field as key, with true if the value is equal in both maps and false if it is different.
No matter how many fields you will add in the future, the method won't change, while the result will.
Running Example: https://ideone.com/dIaYsK
Code
private static Map<String,Boolean> scanForDifferences(Map<String,Object> mapOne,
Map<String,Object> mapTwo){
Map<String,Boolean> retMap = new HashMap<String,Boolean>();
for (Map.Entry<String,Object> entry : mapOne.entrySet()) {
// a key that is missing from mapTwo counts as different
Object other = mapTwo.get(entry.getKey());
boolean same = other != null && other.equals(entry.getValue());
// true means the value is equal in both maps
retMap.put(entry.getKey(), Boolean.valueOf(same));
}
return retMap;
}
Test Case Input
Map<String,Object> oldMap = new HashMap<String,Object>();
Map<String,Object> newMap = new HashMap<String,Object>();
oldMap.put("initials","Y. J.");
oldMap.put("firstName","Yogend");
oldMap.put("lastName","Jos");
oldMap.put("userName","YNJos");
oldMap.put("age","33");
newMap.put("initials","Y. J.");
newMap.put("firstName","Yogendra");
newMap.put("lastName","Joshi");
newMap.put("userName","YNJoshi");
newMap.put("age","33");
Test Case Run
Map<String,Boolean> diffMap = Main.scanForDifferences(oldMap, newMap);
Iterator<Map.Entry<String, Boolean>> it = diffMap.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<String,Boolean> entry = (Map.Entry<String,Boolean>)it.next();
System.out.println("Field [" + entry.getKey() +"] is " +
(entry.getValue()?"NOT ":"") + "different" );
}
You should check too if a value is present in one map and not in another one.
You could return an ENUM instead of a Boolean with something like EQUAL, DIFFERENT, NOT PRESENT ...
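The enum suggestion could look roughly like this. It is a sketch under assumptions: the Status names, the String-valued maps and the case-insensitive comparison (taken from the question's equalsIgnoreCase) are illustrative choices, and keys that exist only in newMap are not reported:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class FieldDiff {
    enum Status { EQUAL, DIFFERENT, NOT_PRESENT }

    // Compares each field of oldMap with the same field in newMap, ignoring case.
    static Map<String, Status> diff(Map<String, String> oldMap, Map<String, String> newMap) {
        Map<String, Status> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : oldMap.entrySet()) {
            String other = newMap.get(e.getKey());
            if (other == null) {
                result.put(e.getKey(), Status.NOT_PRESENT);
            } else if (other.equalsIgnoreCase(e.getValue())) {
                result.put(e.getKey(), Status.EQUAL);
            } else {
                result.put(e.getKey(), Status.DIFFERENT);
            }
        }
        return result;
    }

    // True if at least one field changed or is missing, as the question asks.
    static boolean anyChanged(Map<String, String> oldMap, Map<String, String> newMap) {
        for (Status s : diff(oldMap, newMap).values()) {
            if (s != Status.EQUAL) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> oldMap = new HashMap<>();
        Map<String, String> newMap = new HashMap<>();
        oldMap.put("firstName", "Yogend");
        newMap.put("firstName", "Yogendra");
        oldMap.put("age", "33");
        newMap.put("age", "33");
        System.out.println(diff(oldMap, newMap));
        System.out.println(anyChanged(oldMap, newMap)); // true
    }
}
```

Adding a field later only means adding one more entry to each map; neither method changes.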
You could put your Strings into Sets: one Set for the old values and another for the new values. That also handles a varying number of elements.
Because both are Sets, the comparison does not depend on the order in which the values were added.

How to optimize the updating of values in an ArrayList<Integer>

I want to store all values of a certain variable in a dataset and the frequency for each of these values. To do so, I use an ArrayList<String> to store the values and an ArrayList<Integer> to store the frequencies (since I can't use int). The number of different values is unknown, that's why I use ArrayList and not Array.
Example (simplified) dataset:
a,b,c,d,b,d,a,c,b
The ArrayList<String> with values looks like: {a,b,c,d} and the ArrayList<Integer> with frequencies looks like: {2,3,2,2}.
To fill these ArrayLists I iterate over each record in the dataset, using the following code.
public void addObservation(String obs){
if(values.size() == 0){// first value
values.add(obs);
frequencies.add(new Integer(1));
return;//added
}else{
for(int i = 0; i<values.size();i++){
if(values.get(i).equals(obs)){
frequencies.set(i, new Integer((int)frequencies.get(i)+1));
return;//added
}
}
// only gets here if value of obs is not found
values.add(obs);
frequencies.add(new Integer(1));
}
}
However, since the datasets I will use this for can be very big, I want to optimize my code, and using frequencies.set(i, new Integer((int)frequencies.get(i)+1)); does not seem very efficient.
That brings me to my question; how can I optimize the updating of the Integer values in the ArrayList?
Use a HashMap<String,Integer>
Create the HashMap like so
HashMap<String,Integer> hm = new HashMap<String,Integer>();
Then your addObservation method will look like
public void addObservation(String obs) {
if( hm.containsKey(obs) )
hm.put( obs, hm.get(obs)+1 );
else
hm.put( obs, 1 );
}
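On Java 8 or later the same counting can be written with Map.merge, which avoids the separate containsKey and get calls. A minimal sketch; the class name and the frequency helper are assumptions added for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class FrequencyCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    void addObservation(String obs) {
        // Inserts 1 for a new value, otherwise adds 1 to the stored count.
        counts.merge(obs, 1, Integer::sum);
    }

    int frequency(String obs) {
        return counts.getOrDefault(obs, 0);
    }

    public static void main(String[] args) {
        FrequencyCounter fc = new FrequencyCounter();
        for (String s : "a,b,c,d,b,d,a,c,b".split(",")) {
            fc.addObservation(s);
        }
        System.out.println(fc.frequency("b")); // 3
    }
}
```

Each observation now costs one hash lookup on average instead of a scan over all values seen so far.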
I would use a HashMap or a Hashtable as tskzzy suggested. Depending on your needs I would also create an object that has the name, count as well as other metadata that you might need.
So the code would be something like:
Hashtable<String, FrequencyStatistics> statHash = new Hashtable<String, FrequencyStatistics>();
for (String value : values) {
if (statHash.get(value) == null) {
FrequencyStatistics newStat = new FrequencyStatistics(value);
statHash.put(value, newStat);
} else {
statHash.get(value).incrementCount();
}
}
Now, your FrequencyStatistics object's constructor would automatically set its initial count to 1, while the incrementCount() method would increment the count and perform any other statistical calculations that you might require. This should also be more extensible in the future than storing a hash of the String with only its corresponding Integer.
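One possible shape for such a class, as a sketch only: the answer names FrequencyStatistics and incrementCount(), while the fields, the getters and the StatDemo helper below are assumptions added for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class FrequencyStatistics {
    private final String value;
    private int count = 1; // the first observation creates the object

    FrequencyStatistics(String value) {
        this.value = value;
    }

    void incrementCount() {
        count++;
    }

    int getCount() {
        return count;
    }

    String getValue() {
        return value;
    }
}

class StatDemo {
    // Tallies the given values, creating one statistics object per distinct value.
    static Map<String, FrequencyStatistics> tally(String[] values) {
        Map<String, FrequencyStatistics> stats = new HashMap<>();
        for (String v : values) {
            FrequencyStatistics s = stats.get(v);
            if (s == null) {
                stats.put(v, new FrequencyStatistics(v));
            } else {
                s.incrementCount();
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        Map<String, FrequencyStatistics> stats = tally("a,b,c,d,b,d,a,c,b".split(","));
        System.out.println(stats.get("b").getCount()); // 3
    }
}
```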
