LinkedHashMap with values as a vector being overwritten - java

When I first wrote this code, the call to pnValue.clear(); left me with empty values for the keys. I then read that adding the values of one map to another only copies a reference to the original, and that one has to use the clone() method to keep the two separate. The issue I am facing after cloning my map is that when a key has multiple values, they overwrite each other. E.g. the output I expect from processing a goldSentence is:
{PERSON = [James Fisher],ORGANIZATION=[American League, Chicago Bulls]}
but what I get is:
{PERSON = [James Fisher],ORGANIZATION=[Chicago Bulls]}
I wonder where I am going wrong considering I am declaring my values as a Vector<String>
for(WSDSentence goldSentence : goldSentences)
{
for (WSDElement word : goldSentence.getWsdElements()){
if (word.getPN()!=null){
if (word.getPN().equals("group")){
String newPNTag = word.getPN().replace("group", "organization");
pnValue.add(word.getToken().replaceAll("_", " "));
newPNValue = (Vector<String>) pnValue.clone();
annotationMap.put(newPNTag.toUpperCase(),newPNValue);
}
else{
pnValue.add(word.getToken().replaceAll("_", " "));
newPNValue = (Vector<String>) pnValue.clone();
annotationMap.put(word.getPN().toUpperCase(),newPNValue);
}
}
sentenceAnnotationMap = (LinkedHashMap<String, Vector<String>>) annotationMap.clone();
pnValue.clear();
}
EDITED CODE
I replaced Vector with List and removed the cloning. However, this still doesn't solve my problem; it takes me back to square one, where my output is: {PERSON=[], ORGANIZATION=[]}
for(WSDSentence goldSentence : goldSentences)
{
for (WSDElement word : goldSentence.getWsdElements()){
if (word.getPN()!=null){
if (word.getPN().equals("group")){
String newPNTag = word.getPN().replace("group", "organization");
pnValue.add(word.getToken().replaceAll("_", " "));
newPNValue = (List<String>) pnValue;
annotationMap.put(newPNTag.toUpperCase(),newPNValue);
}
else{
pnValue.add(word.getToken().replaceAll("_", " "));
newPNValue = pnValue;
annotationMap.put(word.getPN().toUpperCase(),newPNValue);
}
}
sentenceAnnotationMap = annotationMap;
}
pnValue.clear();

You're trying a bunch of stuff without really thinking through the logic behind it. There's no need to clear or clone anything, you just need to manage separate lists for separate keys. Here's the basic process for each new value:
If the map contains our key, get the list and add our value
Otherwise, create a new list, add our value, and add the list to the map
You've left out most of your variable declarations, so I won't try to show you the exact solution, but here's the general formula:
List<String> list = map.get(key); // try to get the list
if (list == null) {               // list doesn't exist?
    list = new ArrayList<>();     // create an empty list
    map.put(key, list);           // insert it into the map
}
list.add(value);                  // update the list
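To make the formula above concrete, here is a minimal, self-contained sketch. The class name AnnotationGrouping, the helper addValue, and the representation of each word as a {tag, token} array are assumptions made for illustration; they stand in for the asker's WSDElement objects, whose declarations were not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnnotationGrouping {

    // Adds value to the list stored under key, creating the list on first use.
    static void addValue(Map<String, List<String>> map, String key, String value) {
        List<String> list = map.get(key);
        if (list == null) {
            list = new ArrayList<>();
            map.put(key, list);
        }
        list.add(value);
    }

    // Groups hypothetical {tag, token} pairs; "group" is renamed to "organization".
    static Map<String, List<String>> group(String[][] taggedTokens) {
        Map<String, List<String>> annotations = new LinkedHashMap<>();
        for (String[] pair : taggedTokens) {
            String tag = pair[0].equals("group") ? "organization" : pair[0];
            addValue(annotations, tag.toUpperCase(), pair[1].replace('_', ' '));
        }
        return annotations;
    }

    public static void main(String[] args) {
        String[][] sentence = {
            {"person", "James_Fisher"},
            {"group", "American_League"},
            {"group", "Chicago_Bulls"}
        };
        // prints {PERSON=[James Fisher], ORGANIZATION=[American League, Chicago Bulls]}
        System.out.println(group(sentence));
    }
}
```

Because every key owns its own list, nothing has to be cleared or cloned between words; a fresh map per sentence is enough.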

Related

How can I add a string one at a time to a HashMap<Integer, List<String>>?

This function loops through a dictionary (allWords) and uses the getKey function to generate a key. wordListMap is a HashMap<Integer, List<String>>, so I need to loop through and put the key and a List. If there is no list for the key, I put one; if there is, I just need to append the next dictionary word. This is where I need help: I can't figure out the syntax to simply append the next word to the list that is already there. Any help would be appreciated.
public static void constructWordListMap() {
wordListMap = new HashMap<>();
for (String w : allWords) {
int key = getKey(w);
if (isValidWord(w) && !wordListMap.containsKey(key)) {
List list = new ArrayList();
list.add(w);
wordListMap.put(key, list);
} else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.put(key, wordListMap.get(key).add(w));
}
}
}
map.get(key).add(value)
Simple as that.
As I understand it, given a HashMap<Integer, List<String>>, you'd like to:
create a List object
add String objects to said List
add that List object as a value to be paired with a previously generated key (type Integer)
To do so, you'd want to first generate the key
Integer myKey = getKey(w);
Then, you'd enter a loop and add to a List object
List<String> myList = new ArrayList<>(); // List is an interface, so instantiate a concrete class
for(int i = 0; i < intendedListLength; i++) {
String myEntry = //wherever you get your string from
myList.add(myEntry);
}
Lastly, you'd add the List to the HashMap
myHash.put(myKey, myList);
Leave any questions in the comments.
else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.put(key, wordListMap.get(key).add(w));
}
If you want to add a new value to your list, you need to retrieve that list first. In the code above, you are putting the return value of add into the table (which is a boolean), and that is not what you want.
Instead, you will want to do as Paul said:
else if (isValidWord(w) && wordListMap.containsKey(key)) {
wordListMap.get(key).add(w);
}
The reason this works is because you already added an ArrayList to the table earlier. Here, you are getting that ArrayList, and adding a new value to it.
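On Java 8 or later, the "create the list if missing, then append" logic can also be written with Map.computeIfAbsent. The sketch below is only an illustration: getKey and isValidWord are hypothetical stand-ins for the asker's methods (their real implementations were not shown), and the class name WordListMapSketch is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordListMapSketch {

    // Hypothetical stand-in for the asker's getKey: here, simply the word length.
    static int getKey(String w) {
        return w.length();
    }

    // Hypothetical stand-in for the asker's isValidWord: non-empty, letters only.
    static boolean isValidWord(String w) {
        return !w.isEmpty() && w.chars().allMatch(Character::isLetter);
    }

    static Map<Integer, List<String>> constructWordListMap(List<String> allWords) {
        Map<Integer, List<String>> wordListMap = new HashMap<>();
        for (String w : allWords) {
            if (!isValidWord(w)) {
                continue; // invalid words are never stored
            }
            // Creates the list on first use, then appends to whichever list is mapped.
            wordListMap.computeIfAbsent(getKey(w), k -> new ArrayList<>()).add(w);
        }
        return wordListMap;
    }

    public static void main(String[] args) {
        System.out.println(constructWordListMap(Arrays.asList("cat", "dog", "bird", "x1")));
    }
}
```

Note that the return value of add (a boolean) is simply discarded here; the list itself is already in the map, so there is nothing to put back.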

How to compare values in different collections that need different type of iteration loops?

Let's say you have an Iterator which contains values that you need to compare with values located in a separate List.
Iterator<Map.Entry<String, Object>> it = aObj.items();
while (it.hasNext()) {
Map.Entry<String, Object> item = it.next();
nameValue = item.getNameValue();
keyValue = item.getKeyValue();
System.out.println("Name: " + nameValue);
System.out.println("Value: " + keyValue);
}
This outputs:
Name: header
Value: 22222
Let's say you have a separate list (whose values you want to compare the above values with):
List<Items> items = new ArrayList<>();
for (Item item : items) {
itemNameValue = item.getName();
itemKeyValue = item.getKey();
System.out.println("Name: " + itemNameValue);
System.out.println("Value: " + itemKeyValue);
}
This outputs:
Name: header
Value: 44444
Since these are different types of loops (one is a while loop and the other one is a for each loop)
how can you compare for example:
if (nameValue.equals(itemNameValue())) {
// do something?
}
I need to iterate over both collections / data structures at the same time...
Would this be the solution?
String nameValue = "";
Object keyValue = "";
String itemNameValue = "";
String itemKeyValue = "";
Iterator<Map.Entry<String, Object>> it = aObj.items();
while (it.hasNext()) {
Map.Entry<String, Object> item = it.next();
nameValue = item.getNameValue();
keyValue = item.getKeyValue();
for (Item item : items) {
itemNameValue = item.getName();
itemKeyValue = item.getKey();
}
if (nameValue.equals(itemNameValue())) {
// do something?
}
}
Basically, what I am trying to ask (in a very simplified way is this):
(1) The collection that needs to be iterated in a while loop is just test input (sample data)
(2) The array list from the second collection is really a list of data which was returned from a database call (DAO) and placed into the ArrayList.
I am trying to verify whether the input from the Iterator inside the while loop is the same as the values from the ArrayList (which came from a database). Since these are different data structures requiring different looping mechanisms, how can I iterate through both at the same time and compare them? The second data structure (the array list) is the actual set of values that are correct.
I don't know if there's a guarantee that each iteration would be comparing the same items if I use a nested loop?
Thank you for taking the time to read this...
The problem you are facing is a direct result of bad application design.
The underlying incorrect assumption of this question is that the map and the list will hold the objects in the same sequence.
List --> A data structure that is ordered but not sorted
Map --> A data structure that, in general, is neither ordered nor sorted (a HashMap, for instance, guarantees no iteration order)
This is not to say that these two data structures don't work well together. However, storing the same data in both usually points to an awkward program design.
That said, to answer your question, you can use the code below to accomplish this:
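Iterator<Map.Entry<String, Object>> it = aObj.items();
List<Item> items = dbCall.getItems(); // Get the list of Items from the DB
int index = 0;
while (it.hasNext()) {
    Map.Entry<String, Object> itemFromMap = it.next();
    if (index >= items.size()) {
        return false; // more input entries than database items
    }
    Item itemFromList = items.get(index);
    // getNameValue()/getKeyValue() are the accessors used in the question
    // (on a plain java.util.Map.Entry these would be getKey()/getValue())
    if (!(itemFromMap.getNameValue().equals(itemFromList.getName()) &&
            itemFromMap.getKeyValue().equals(itemFromList.getKey()))) {
        // If you prefer a single .equals() method over &&, then you can implement a Comparator<Item>
        return false; // first mismatch: the input differs from the database values
    }
    index++;
}
return true;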
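If the two collections cannot be relied on to share the same order, one option is to compare by name instead of by position. The following is a rough sketch under stated assumptions, not the asker's actual code: the Item class is a hypothetical minimal version, names are assumed to be unique, values are compared as strings, and only the input entries are checked against the database items.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InputVerifier {

    // Hypothetical minimal version of the asker's Item class.
    static final class Item {
        private final String name;
        private final String key;
        Item(String name, String key) { this.name = name; this.key = key; }
        String getName() { return name; }
        String getKey() { return key; }
    }

    // Returns true when every input entry has a database item with the same name and value.
    static boolean matches(Map<String, Object> input, List<Item> itemsFromDb) {
        // Index the database items by name so lookup does not depend on list order.
        Map<String, String> expected = new HashMap<>();
        for (Item item : itemsFromDb) {
            expected.put(item.getName(), item.getKey());
        }
        for (Map.Entry<String, Object> entry : input.entrySet()) {
            String expectedValue = expected.get(entry.getKey());
            if (expectedValue == null || !expectedValue.equals(String.valueOf(entry.getValue()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> input = new HashMap<>();
        input.put("header", "22222");
        List<Item> fromDb = Collections.singletonList(new Item("header", "44444"));
        System.out.println(matches(input, fromDb)); // prints false: 22222 differs from 44444
    }
}
```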

Unexplained Java Hashmap Behavior

In the code below, I am creating a hashmap to store objects called Datums, which contain a String (location) and a count. Unfortunately, the code is giving very strange behavior.
FileSystem fs = FileSystem.get(new Configuration());
Random r = new Random();
FSDataOutputStream fsdos = fs.create(new Path("error/" + r.nextInt(1000000)));
HashMap<String, Datum> datums = new HashMap<String, Datum>();
while (itrtr.hasNext()) {
Datum next = itrtr.next();
synchronized (datums) {
if (!datums.containsKey(next.location)) {
fsdos.writeUTF("INSERTING: " + next + "\n");
datums.put(next.location, next);
} else {
} // skip those that are already indexed
}
}
for (Datum d : datums.values()) {
fsdos.writeUTF("PRINT DATUM VALUES: " + d.toString() + "\n");
}
The hashmap has Strings as keys.
Here is the output I get in the error files (example):
INSERTING: (test.txt,3)
INSERTING: (test2.txt,1)
PRINT DATUM VALUES: (test.txt,3)
PRINT DATUM VALUES: (test.txt,3)
The correct output for the print should be:
INSERTING: (test.txt,3)
INSERTING: (test2.txt,1)
PRINT DATUM VALUES: (test.txt,3)
PRINT DATUM VALUES: (test2.txt,1)
What is happening to the Datum with test2.txt as its location? Why is it getting replaced with test.txt??
Basically, I should never see the same location twice. (that is what the !datums.containsKey is checking for). Unfortunately, I'm getting very strange behavior.
This is on Hadoop, by the way, in a reducer.
I tried putting the synchronized here in case it was running in multiple threads, which, to my knowledge, it isn't. Still, the same thing happens.
According to this answer, Hadoop's iterator always returns the same object instead of creating a new object each time around the loop.
So, holding onto references to the object returned by the iterator is not valid and will produce surprising results. You'll need to copy the data to a new object:
while (itrtr.hasNext()) {
Datum next = itrtr.next();
// copy any values from the Datum to a fresh instance
Datum insert = new Datum(next.location, next.value);
if (!datums.containsKey(insert.location)) {
datums.put(insert.location, insert);
}
}
Here is a reference to the Hadoop Reducer documentation which confirms this:
The framework will reuse the key and value objects that are passed
into the reduce, therefore the application should clone the objects
they want to keep a copy of.
It is not a problem with the map but with the code.
datums.put(next.location, next); inserts a reference as the value, and the referenced object is changed later :)
That is why, at the end, all values in the map are the same: they all equal the last datum processed.
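The effect can be reproduced without Hadoop. The sketch below is an illustration only: Datum is a hypothetical mutable class modeled on the question's description, and reusingIterator imitates a framework iterator that hands back one mutated instance on every call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class ReusedObjectDemo {

    // Hypothetical mutable Datum with a location and a count.
    static final class Datum {
        String location;
        int count;
        Datum(String location, int count) { this.location = location; this.count = count; }
        @Override public String toString() { return "(" + location + "," + count + ")"; }
    }

    // Mimics an iterator that returns the same instance each time, with new field values.
    static Iterator<Datum> reusingIterator(String[] locations, int[] counts) {
        Datum shared = new Datum(null, 0);
        return new Iterator<Datum>() {
            private int i = 0;
            @Override public boolean hasNext() { return i < locations.length; }
            @Override public Datum next() {
                shared.location = locations[i];
                shared.count = counts[i];
                i++;
                return shared;
            }
        };
    }

    // Collects one entry per location; copy=false stores the shared reference (the bug).
    static List<String> collect(boolean copy) {
        Iterator<Datum> it = reusingIterator(new String[] {"test.txt", "test2.txt"}, new int[] {3, 1});
        Map<String, Datum> datums = new HashMap<>();
        while (it.hasNext()) {
            Datum next = it.next();
            Datum toStore = copy ? new Datum(next.location, next.count) : next;
            datums.putIfAbsent(toStore.location, toStore);
        }
        List<String> out = new ArrayList<>();
        for (Datum d : datums.values()) {
            out.add(d.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(false)); // both entries show the same datum
        System.out.println(collect(true));  // each entry keeps its own datum
    }
}
```

In this sketch the uncopied entries all show the last datum read, because both map values point at the one shared object.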

Java Parsing Using Hmap

I am new to Java. I want to parse data that is in this format:
Apple;Mango;Orange:1234;Orange:1244;...;
There could be more than one "Orange" at any point in time. The numbers (1, 2, ...) increase along with the "Orange" entries.
After splitting the input, let's assume I have stored the first two items (Apple, Mango) in variables (via setters) so that the getters can return them. Now I want to store the values (1234, 1244, etc.) attached to "Orange" in a variable and return them later. Before that, I have to check how many Oranges have come in. I know I have to use a for loop for that, but I don't know how to store the values in a variable.
Please help me.
String input = "Apple;Mango;Orange:1234;Orange:1244;...;";
String[] values = input.split(";");
String value1 = values[0];
String value2 = values[1];
Map<String, List<String>> map = new HashMap<String, List<String>>();
for (int i = 2; i < values.length; i++) {
    // each remaining item looks like "Orange:1234"
    String[] parts = values[i].split(":");
    if (parts.length < 2) {
        continue; // no value attached (e.g. the "..." placeholder)
    }
    String key = parts[0];
    String id = parts[1];
    if (map.get(key) == null) {
        map.put(key, new ArrayList<String>());
    }
    map.get(key).add(id);
}
// for any key s:
// get the values of s
map.get(s); // returns a list of all values added
// get the count of s
map.get(s).size(); // returns the total number of values
Let me try to rephrase the question by how I interpreted it and -- more importantly -- how it focuses on the input and output (expectations), not the actual implementation:
I need to parse the string
"Apple;Mango;Orange:1234;Orange:1244;...;"
in a way so I can retrieve the values associated (numbers after ':') with the fruits:
I should receive an empty list for both the Apple and Mango in the example, because they have no value;
I should receive a list of 1234, 1244 for Orange.
Of course your intuition about HashMap is right on the spot, but someone may always present a better solution if you don't get too involved with the specifics.
There are a few open questions left:
Should the fruits without values be given a default value?
Should the fruits without values be in the map at all?
How should input errors be handled?
How should duplicate values be handled?
Given this context, we can start writing code:
import java.util.*;

public class FruitMarker {
    public static void main(String[] args) {
        String input = "Apple;Mango;Orange:1234;Orange:1244";
        // replace with parameter processing from 'args'
        // avoid direct implementations in variable definitions
        // also observe the naming referring to the function of the variable
        Map<String, Collection<Integer>> fruitIds = new HashMap<String, Collection<Integer>>();
        // iterate through items by splitting
        for (String item : input.split(";")) {
            String[] fruitAndId = item.split(":"); // this will return the same item in an array, if separator is not found
            String fruitName = fruitAndId[0];
            boolean hasValue = fruitAndId.length > 1;
            Collection<Integer> values = fruitIds.get(fruitName);
            // if we are accessing the key for the first time, we have to set its value
            if (values == null) {
                values = new ArrayList<Integer>(); // here I can use concrete implementation
                fruitIds.put(fruitName, values); // be sure to put it back in the map
            }
            if (hasValue) {
                int fruitValue = Integer.parseInt(fruitAndId[1]);
                values.add(fruitValue);
            }
        }
        // display the entries in table iteratively
        for (Map.Entry<String, Collection<Integer>> entry : fruitIds.entrySet()) {
            System.out.println(entry.getKey() + " => " + entry.getValue());
        }
    }
}
If you execute this code, you will get the following output:
Mango => []
Apple => []
Orange => [1234, 1244]

how to manipulate list in java

Edit: My list is sorted as it is coming from a DB
I have an ArrayList that has objects of class People. People has two properties: ssn and terminationReason. So my list looks like this
ArrayList:
ssn          TerminationReason
123456789    Reason1
123456789    Reason2
123456789    Reason3
568956899    Reason2
000000001    Reason3
000000001    Reason2
I want to change this list so that there are no duplicate SSNs and the termination reasons are separated by commas.
So the above list would become
New ArrayList:
ssn          TerminationReason
123456789    Reason1, Reason2, Reason3
568956899    Reason2
000000001    Reason3, Reason2
I have something going where I am looping through the original list and matching ssn's but it does not seem to work.
Can someone help?
Code I was using was:
String ssn = "";
Iterator it = results.iterator();
ArrayList newList = new ArrayList();
People ob;
while (it.hasNext())
{
ob = (People) it.next();
if (ssn.equalsIgnoreCase(""))
{
newList.add(ob);
ssn = ob.getSSN();
}
else if (ssn.equalsIgnoreCase(ob.getSSN()))
{
//should I get last object from new list and append this termination reason?
ob.getTerminationReason()
}
}
To me, this seems like a good case for a Multimap, which allows storing multiple values for a single key.
Google Collections (now part of Guava) has a Multimap implementation.
This may mean that the People object's ssn and terminationReason fields have to be pulled out to serve as key and value, respectively. (Those fields are assumed to be Strings.)
Basically, it can be used as follows:
Multimap<String, String> m = HashMultimap.create();
// In reality, the following would probably iterate over the
// People objects returned from the database, calling the
// getSSN and getTerminationReason methods.
m.put("0000001", "Reason1");
m.put("0000001", "Reason2");
m.put("0000001", "Reason3");
m.put("0000002", "Reason1");
m.put("0000002", "Reason2");
m.put("0000002", "Reason3");
for (String ssn : m.keySet())
{
// For each SSN, the termination reasons can be retrieved.
Collection<String> termReasonsList = m.get(ssn);
// Do something with the list of reasons.
}
If necessary, a comma-separated list of a Collection can be produced:
StringBuilder sb = new StringBuilder();
for (String reason : termReasonsList)
{
sb.append(reason);
sb.append(", ");
}
sb.delete(sb.length() - 2, sb.length());
String commaSepList = sb.toString();
This could once again be set to the terminationReason field.
An alternative, as Jonik mentioned in the comments, is to use the StringUtils.join method from Apache Commons Lang to create a comma-separated list.
It should also be noted that the Multimap doesn't specify whether an implementation should or should not allow duplicate key/value pairs, so one should look at which type of Multimap to use.
In this example, the HashMultimap is a good choice, as it does not allow duplicate key/value pairs. This would automatically eliminate any duplicate reasons given for one specific person.
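If adding a library is not an option, a similar grouping can be sketched with the JDK alone. This is an alternative to the Multimap approach, not an equivalent: it uses a LinkedHashMap of LinkedHashSets, so duplicate reasons are dropped and first-seen order is kept. The class name TerminationReasons and the {ssn, reason} row representation are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TerminationReasons {

    // Groups {ssn, reason} rows by SSN and joins each group's reasons with ", ".
    static Map<String, String> joinReasons(String[][] rows) {
        Map<String, Set<String>> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            grouped.computeIfAbsent(row[0], k -> new LinkedHashSet<>()).add(row[1]);
        }
        Map<String, String> joined = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : grouped.entrySet()) {
            // String.join avoids the trailing-separator cleanup needed with StringBuilder.
            joined.put(e.getKey(), String.join(", ", e.getValue()));
        }
        return joined;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"123456789", "Reason1"}, {"123456789", "Reason2"}, {"123456789", "Reason3"},
            {"568956899", "Reason2"},
            {"000000001", "Reason3"}, {"000000001", "Reason2"}
        };
        joinReasons(rows).forEach((ssn, reasons) -> System.out.println(ssn + " " + reasons));
    }
}
```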
What you might need is a hash-based structure; a HashMap may be usable.
Override equals() and hashCode() inside your People class.
Make hashCode() return a value derived from the person's SSN. This way, all People objects with the same SSN end up in the same "bucket".
Keep in mind that Map implementations hold your objects as key/value pairs, so you will have something like myHashMap.put(ssn, peopleObject);
List<People> newlst = new ArrayList<People>();
People last = null;
for (People p : listFromDB) {
if (last == null || !last.ssn.equals(p.ssn)) {
last = new People();
last.ssn = p.ssn;
last.terminationReason = "";
newlst.add(last);
}
if (last.terminationReason.length() > 0) {
last.terminationReason += ", ";
}
last.terminationReason += p.terminationReason;
}
And you get the aggregated list in newlst.
Update: If you are using MySQL, you can use the GROUP_CONCAT function to extract the data in your required format. I don't know whether other DB engines have a similar function.
Update 2: Removed the unnecessary sorting.
Two possible problems:
This won't work if your list isn't sorted
You aren't doing anything with ob.getTerminationReason(). I think you mean to add it to the previous object.
EDIT: Now I see you've edited your question.
As your list is sorted (by ssn, I presume):
String currentSSN = null;
List<People> peopleList = getSortedList(); // gets the sorted list from the DB
/* Uses the for-each construct instead of iterators */
for (People person : peopleList) {
    if (currentSSN == null || !person.getSSN().equals(currentSSN)) {
        // person has changed: start a new row
        currentSSN = person.getSSN();
        System.out.println(" "); // new row
        System.out.print(person.getSSN() + " "); // writes the row header
    }
    // same row: writes the termination reason
    System.out.print(person.getTerminationReason() + " ");
}
If you don't want to display the contents of your list, you could use it to create a Map and then use that as shown below.
If your list is not sorted
Maybe you should try a different approach, using a Map. Here, ssn would be the key of the map, and the values could be lists of People.
Map<String, List<People>> mymap = getMap(); // loads a Map from the input data
for (String ssn : mymap.keySet()) {
    dorow(ssn, mymap.get(ssn));
}
public void dorow(String ssn, List<People> reasons) {
    System.out.print(ssn + " ");
    for (People people : reasons) {
        System.out.print(people.getTerminationReason() + " ");
    }
    System.out.println("-----"); // row separator
}
Last but not least, you should override the hashCode() and equals() methods in the People class.
For example:
@Override
public int hashCode() {
    return 3 * this.terminationReason.hashCode();
}
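For reference, a matching equals()/hashCode() pair might look like the sketch below. It is an assumption-laden illustration rather than the asker's class: PersonRecord is a made-up name, and treating two records as equal when both ssn and terminationReason match is one possible choice among several.

```java
import java.util.Objects;

public class PersonRecord {
    private final String ssn;
    private final String terminationReason;

    public PersonRecord(String ssn, String terminationReason) {
        this.ssn = ssn;
        this.terminationReason = terminationReason;
    }

    // Two records are equal when both fields match (an assumed definition of equality).
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PersonRecord)) {
            return false;
        }
        PersonRecord that = (PersonRecord) other;
        return Objects.equals(ssn, that.ssn)
                && Objects.equals(terminationReason, that.terminationReason);
    }

    // Must use the same fields as equals so that equal objects share a hash code.
    @Override
    public int hashCode() {
        return Objects.hash(ssn, terminationReason);
    }

    public static void main(String[] args) {
        PersonRecord a = new PersonRecord("123456789", "Reason1");
        PersonRecord b = new PersonRecord("123456789", "Reason1");
        System.out.println(a.equals(b));                  // prints true
        System.out.println(a.hashCode() == b.hashCode()); // prints true
    }
}
```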
