Hashmap get function returns null - java

I have a hashmap which is
public HashMap<String, ArrayList<Integer>> invertedList;
I show you my invertedList in watch list during debugging:
invertedList.toString(): "{ryerson=[0, 2, 3], 23=[3], award=[1], andisheh=[0, 2]}"
In the same watch list when I enter:
invertedList.get("ryerson")
I get null as the result, both in the watch list and in the code. As you can see, "ryerson" is already a key in my invertedList, so I should get [0, 2, 3] as a result. What is happening here? I'm so confused!
I suspect the problem is with ArrayList as values, because I tested Integer as values and it worked fine, but I still don't know how to solve it. I am new to Java and used to work with C#.
The complete code of invertedList:
public class InvertedIndex {
public HashMap<String, ArrayList<Integer>> invertedList;
public ArrayList<String> documents;
public InvertedIndex(){
invertedList = new HashMap<String, ArrayList<Integer>>();
documents = new ArrayList<String>();
}
public void buildFromTextFile(String fileName) throws IOException {
FileReader fileReader = new FileReader(fileName);
BufferedReader bufferedReader = new BufferedReader(fileReader);
int documentId = 0;
while(true){
String line = bufferedReader.readLine();
if(line == null){
break;
}
String[] words = line.split("\\W+");
for (String word : words) {
word = word.toLowerCase();
if(!invertedList.containsKey(word))
invertedList.put(word, new ArrayList<Integer>());
invertedList.get(word).add(documentId);
}
documents.add(line);
documentId++;
}
bufferedReader.close();
}
}
The test code:
@Test
public void testBuildFromTextFile() throws IOException {
InvertedIndex invertedIndex = new InvertedIndex();
invertedIndex.buildFromTextFile("input.tsv");
Assert.assertEquals("{ryerson=[0, 2, 3], 23=[3], award=[1], andisheh=[0, 2]}", invertedIndex.invertedList.toString());
ArrayList<Integer> resultIds = invertedList.get("ryerson");
ArrayList<Integer> expectedResult = new ArrayList<Integer>();
expectedResult.add(0);
expectedResult.add(2);
expectedResult.add(3);
Assert.assertEquals(expectedResult, resultIds);
}
The first assert passes; the second one fails because resultIds is null.

If I'm reading this right, and assuming correctly, this test function is inside the InvertedIndex class. I only make that assumption because the line
ArrayList<Integer> resultIds = invertedList.get("ryerson");
should actually be uncompilable as there is no local variable called "invertedList".
That line should read
ArrayList<Integer> resultIds = invertedIndex.invertedList.get("ryerson");

Your first assert tests the value of invertedIndex.invertedList. The second one gets a value from invertedList, and not from invertedIndex.invertedList. You've probably defined a map with the same name in your test, which is different from the one used by invertedIndex.
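To make the diagnosis above concrete, here is a minimal sketch of two maps that share a name. The nested Index class is a stand-in for the question's InvertedIndex, and the static invertedList field is a hypothetical leftover in the test class; neither is taken from the asker's actual project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShadowedMapDemo {
    // Stand-in for the question's InvertedIndex; only the field matters here.
    static class Index {
        final Map<String, List<Integer>> invertedList = new HashMap<>();
    }

    // Hypothetical second map with the same name, e.g. a field left in the test class.
    static final Map<String, List<Integer>> invertedList = new HashMap<>();

    public static void main(String[] args) {
        Index invertedIndex = new Index();
        invertedIndex.invertedList.computeIfAbsent("ryerson", k -> new ArrayList<>()).add(0);

        // Reads the unrelated, empty map: prints null
        System.out.println(invertedList.get("ryerson"));
        // Reads the map that was actually filled: prints [0]
        System.out.println(invertedIndex.invertedList.get("ryerson"));
    }
}
```

Qualifying the access with the object that owns the map (invertedIndex.invertedList) removes the ambiguity.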

Related

Displaying word frequencies of 0 in an ArrayList

I'm looking for some assistance. I've made a program that uses two classes that I also wrote. The first class, CollectionOfWords, reads in text files and stores the words they contain in a HashMap. The second, WordFrequencies, creates a CollectionOfWords object, then reads in another document and checks whether that document's words are in the collection. It outputs an ArrayList with the frequencies counted in the document.
While this works and returns the frequencies of the words found in both the collection and the document, I'd like it to produce zero values for the words that are in the collection but not in the document. For example, test3 returns [1, 1, 1], but I'd like it to return [1, 0, 0, 0, 1, 0, 1], where the zeroes represent the words that are in the collection but are not found in test3.
The test text-files i use can be found here:
https://drive.google.com/open?id=1B1cDpjmZZo01HizxJUSWSVIlHcQke2mU
Cheers
WordFrequencies
public class WordFrequencies {
static HashMap<String, Integer> collection = new HashMap<>();
private static ArrayList<Integer> processDocument(String inFileName) throws IOException {
// Resets the collection's frequency values to zero
collection.clear();
// Reads in the new document file to an ArrayList
Scanner textFile = new Scanner(new File(inFileName));
ArrayList<String> file = new ArrayList<String>();
while(textFile.hasNext()) {
file.add(textFile.next().trim().toLowerCase());
}
/* Iterates the ArrayList of words -and- updates collection with
frequency of words in the document */
for(String word : file) {
Integer dict = collection.get(word);
if (!collection.containsKey(word)) {
collection.put(word, 1);
} else {
collection.put(word, dict + 1);
}
}
textFile.close();
// Stores the frequency values in an ArrayList
ArrayList<Integer> values = new ArrayList<>(collection.values());
return values;
}
public static void main(String[] args) {
// Stores text files for the dictionary (collection of words)
List<String> textFileList = Arrays.asList("Test.txt", "Test2.txt");
// Declares empty ArrayLists for output of processDocument function
ArrayList<Integer> test3 = new ArrayList<Integer>();
ArrayList<Integer> test4 = new ArrayList<Integer>();
// Creates a new CollectionOfWords object called dictionary
CollectionOfWords dictionary = new CollectionOfWords(collection);
// Reads in the ArrayLists text files and processes it
for (String text : textFileList) {
dictionary.scanFile(text);
}
try {
test3 = processDocument("test3.txt");
test4 = processDocument("test4.txt");
} catch(IOException e){
e.printStackTrace();
}
System.out.println(test3);
System.out.println(test4);
}
}
CollectionOfWords
public class CollectionOfWords {
// Declare set in a higher scope (making it a property within the object)
private HashMap<String, Integer> collection = new HashMap<String, Integer>();
// Assigns the value of the parameter to the field of the same name
public CollectionOfWords(HashMap<String, Integer> collection) {
this.collection = collection;
}
// Gets input text file, removes white spaces and adds to dictionary object
public void scanFile(String textFileName) {
try {
Scanner textFile = new Scanner(new File(textFileName));
while (textFile.hasNext()) {
collection.put(textFile.next().trim(), 0);
}
textFile.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
public void printDict(HashMap<String, Integer> dictionary) {
System.out.println(dictionary.keySet());
}
}
I didn't go through the trouble of figuring out your entire code, so sorry if this answer is stupid.
As a solution to your problem, you could initialize the map with every word in the dictionary mapped to zero. Right now you call clear() on the HashMap, which does not set every value to zero but removes all the mappings.
The following code should work; use it instead of collection.clear():
for (Map.Entry<String, Integer> entry : collection.entrySet()) {
entry.setValue(0);
}
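A self-contained sketch of the same idea, in case it helps to see it end to end. The class and method names are made up for illustration, and two things differ from the question's code on purpose: a LinkedHashMap is used so the output order is predictable (a plain HashMap gives no ordering guarantee), and words that are not in the dictionary are simply ignored, which is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ZeroFilledFrequencies {
    // Counts how often each dictionary word occurs in documentWords.
    // Dictionary words that never occur keep the value 0.
    static List<Integer> frequencies(Map<String, Integer> dictionary, List<String> documentWords) {
        // Reset every count to zero instead of removing the keys with clear()
        for (Map.Entry<String, Integer> entry : dictionary.entrySet()) {
            entry.setValue(0);
        }
        for (String word : documentWords) {
            // Only words already in the dictionary are counted (assumption)
            dictionary.computeIfPresent(word, (key, count) -> count + 1);
        }
        return new ArrayList<>(dictionary.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String word : Arrays.asList("a", "b", "c", "d")) {
            dictionary.put(word, 0);
        }
        System.out.println(frequencies(dictionary, Arrays.asList("a", "c", "c", "x"))); // [1, 0, 2, 0]
        System.out.println(frequencies(dictionary, Arrays.asList("d")));                // [0, 0, 0, 1]
    }
}
```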

Why do I need to create an array many times?

This program shuffles a source list in pairs, so that the original list
"1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
transforms to
11^12 19^20 17^18 15^16 1^2 5^6 3^4 13^14 7^8 9^10
This holds only while line A is uncommented. If line A is commented out, all the elements in shuffleList are 19^20.
public class ShuffleService {
public static void shuffleList(List<String> list) {
System.out.println(list);
ArrayList<String[]> shuffleList = new ArrayList<String[]>(10);
String[] arr = new String[2];
boolean flag = false;
int step = 0;
for(String s: list){
if(flag){
arr[1]=s;
} else {
arr[0]=s;
}
flag=!flag;
step++;
if(step==2){
shuffleList.add(arr);
step=0;
//arr = new String[2]; //**line A**
}
}
Collections.shuffle(shuffleList);
for(String[] val: shuffleList){
System.out.print(val[0]);
System.out.print("^");
System.out.println(val[1]);
}
}
public static void main(String[] args) {
String[] a = new String[]{"1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"};
List<String> list1 = Arrays.asList(a);
shuffleList(list1);
}
}
So why do I need to uncomment line A in the program to work properly?
Because when you rewrite the values to arr (without remaking it), you're also going to modify the values already in the list.
Adding an object to the list doesn't stop you from modifying it, it will not make copies on its own. By calling new String[2] in your loop you're effectively building a new string array for each pair that you add to the list, which is what you want.
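The effect can be reproduced with a much smaller program. This is only a sketch: pairUp and its freshArrayPerPair flag are hypothetical names standing in for the loop in the question, with the flag playing the role of line A.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SharedArrayDemo {
    // Groups items into pairs. When freshArrayPerPair is false, the same array
    // object is added to the list every time, as in the question without line A.
    static List<String[]> pairUp(List<String> items, boolean freshArrayPerPair) {
        List<String[]> pairs = new ArrayList<>();
        String[] pair = new String[2];
        for (int i = 0; i + 1 < items.size(); i += 2) {
            pair[0] = items.get(i);
            pair[1] = items.get(i + 1);
            pairs.add(pair); // stores a reference to the array, not a copy
            if (freshArrayPerPair) {
                pair = new String[2]; // equivalent of line A
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("1", "2", "3", "4");
        for (String[] p : pairUp(items, false)) {
            System.out.println(p[0] + "^" + p[1]); // prints 3^4 twice
        }
        for (String[] p : pairUp(items, true)) {
            System.out.println(p[0] + "^" + p[1]); // prints 1^2, then 3^4
        }
    }
}
```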

How to compare String to all elements in an array? [duplicate]

I am looking for a way to compare a string (which in this case will be a line from a text file) to every element in an array and see if there is a match. At a high level overview, I have a string array (about 100 elements) full of strings that are all contained somewhere in the file that need to be deleted. So I am reading a file into a StringBuffer and writing each line, except skipping over all lines that match an element in the array. This is what I have so far:
//Main Class calling the method
public class TestApp {
public static void main(String[] args) {
CompareAndDelete.RemoveDuplicateLines("C:/somelocation", 2Darray);
}
}
public class CompareAndDelete {
static String Line_of_Text;
static StringBuffer localBuff = new StringBuffer();
static FileReader Buffer;
static BufferedReader User_File;
public static void RemoveDuplicateLines(String local, String[][] duplicates) throws IOException
{
//Converting 2D array to one-dimensional array
final String[] finalDups = new String[duplicates.length];
for(int i = 0; i < duplicates.length; i++)
{
finalDups[i] = duplicates[i][0]+" "+duplicates[i][1];
}
int count = 0;
User_File = new BufferedReader(Buffer);
Set<String> Values = new HashSet<String>(Arrays.asList(finalDups));
while((Line_of_Text = User_File.readLine()) != null){
if(!(Values.contains(Line_of_Text))){
localBuff.append(Line_of_Text+"\n");
}else{
count++;
}
}
System.out.println(count);
//Printing StringBuffer to file
BufferedWriter testOutFile = new BufferedWriter(new FileWriter("C:/test.txt"));
testOutFile.write(localBuff.toString());
testOutFile.flush();
testOutFile.close();
}
So I am unsure of the if statement. I know that it does not work properly: it currently only removes the first few elements in the new StringBuffer, because those lines happen to be towards the end of the file, and it does not recheck every line for a match with each element. I know there has to be a better way to do this. Thanks in advance for any help or suggestions.
Update: with the code above, it now throws the following error on this line:
while((Line_of_Text = User_File.readLine()) != null){
Error:
Exception in thread "main" java.io.IOException: Stream closed
at sun.nio.cs.StreamDecoder.ensureOpen(StreamDecoder.java:51)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:204)
at java.io.InputStreamReader.read(InputStreamReader.java:188)
at java.io.BufferedReader.fill(BufferedReader.java:147)
at java.io.BufferedReader.readLine(BufferedReader.java:310)
at java.io.BufferedReader.readLine(BufferedReader.java:373)
at compare.CompareAndDelete.RemoveDuplicateLines(CompareAndDelete.java:48)
at mainPackage.TestApp.main(TestApp.java:326)
This can be accomplished quite efficiently by adding your String array members to a Set, and then checking whether the set contains() the current line. Example:
Set<String> ignoredStrings = new HashSet<String>(Arrays.asList(arr));
String line;
while ((line = file.readLine()) != null) {
if (!ignoredStrings.contains(line)) {
buffer.append(line);
buffer.append("\n");
}
}
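For completeness, a runnable variant of the Set approach. It is a sketch under a few assumptions: the names LineFilterDemo and removeLines are invented here, the input comes from a StringReader rather than a file so the example is self-contained, and IOException is rethrown as UncheckedIOException purely to keep the call site short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LineFilterDemo {
    // Returns the text read from reader without the lines that equal an entry of unwanted.
    static String removeLines(BufferedReader reader, String[] unwanted) {
        Set<String> ignored = new HashSet<>(Arrays.asList(unwanted));
        StringBuilder kept = new StringBuilder();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!ignored.contains(line)) { // hash lookup, no inner loop over the array
                    kept.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return kept.toString();
    }

    public static void main(String[] args) throws IOException {
        String input = "keep me\ndrop me\nkeep me too\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            // Prints "keep me" and "keep me too" on separate lines
            System.out.print(removeLines(reader, new String[] {"drop me"}));
        }
    }
}
```

Note that contains() matches whole lines exactly, so differences in whitespace or case between the array entries and the file lines will prevent a match.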
Here is a method:
public boolean isStringInArray(String str, String[] strarr){
for(String s: strarr){
if(str.equals(s)) return true;
}
return false;
}
isStringInArray("Hello", new String[]{"Hello", "World"}); // True
isStringInArray("Hello", new String[]{"hello", "World"}); // False

How to find all error messages and display them in descending order

Hi, I am trying to sort the error messages in a user-supplied input file in descending order of occurrence.
input_file.txt
23545 debug code_to_debug
43535 error check your code
34243 error check values
32442 run program execute
24525 error check your code
I want to get output as
error check your code
error check values
My code currently:
import java.io.*;
import java.util.*;
public class Sort {
public static void main(String[] args) throws Exception {
BufferedReader reader = new BufferedReader(new FileReader("fileToRead"));
Map<String, String> map=new TreeMap<String, String>();
String line="";
while((line=reader.readLine())!=null){
map.put(getField(line),line);
}
reader.close();
FileWriter writer = new FileWriter("fileToWrite");
for(String val : map.values()){
writer.write(val);
writer.write('\n');
}
writer.close();
}
private static String getField(String line) {
return line.split(" ")[0];//extract value you want to sort on
}
}
Change your mapping from <String, String> to <Integer, String>. Then, use a custom Comparator to compare the Integers from least to greatest.
It appears that your error messages are ranked by an integer value from most severe to least severe. This should allow you to use that fact.
Rather than a Map<String, String> keyed by the integer value, you could use the error message as the key and keep a list of the integer values as the map's value. You can also pass a comparator to the map to control the order of the keys. Reading the file would then look something like this:
Map<String, List<String>> map = new TreeMap<String, List<String>>(new Comparator<String>()
{
@Override
public int compare(String s1, String s2)
{
// Implement a compare to get the order of strings you want;
// natural ordering is used here only as a placeholder
return s1.compareTo(s2);
}
}
);
String line = "";
while((line = reader.readLine()) != null)
{
String lineStr = line.split(" ", 2)[1]; // get the message (everything after the number)
List<String> vals = map.get(lineStr); // get the existing list
if( vals == null)
vals = new ArrayList<String>(); // create a new list if there isn't one
vals.add(getField(line)); // add the int value to the list
map.put(lineStr,vals); // add to map
}
You could then sort the list into numeric order if you wanted. Also this would then require a bit more work to print out the map - but this depends on the format
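If "descending order of occurrence" means "most frequent error message first", one possible sketch is to count each message and sort by the count. This is an interpretation of the question, not something the asker confirmed; the class and method names are invented, and lines are assumed to have the form "<number> <message>" with error messages starting with the word "error".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ErrorFrequencyDemo {
    // Returns the distinct error messages, most frequent first.
    // Messages with equal counts keep the order in which they first appeared.
    static List<String> errorsByFrequency(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(" ", 2); // [number, rest of the line]
            if (parts.length == 2 && parts[1].startsWith("error")) {
                counts.merge(parts[1], 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so ties stay in first-seen order
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<String> messages = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            messages.add(entry.getKey());
        }
        return messages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "23545 debug code_to_debug",
                "43535 error check your code",
                "34243 error check values",
                "32442 run program execute",
                "24525 error check your code");
        for (String message : errorsByFrequency(lines)) {
            System.out.println(message); // "error check your code", then "error check values"
        }
    }
}
```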
If all you want to do is reorder the input so all the error messages appear at the top, a very simple way to do it is like the following:
static String[] errorsToTop(String[] input) {
String[] output = new String[input.length];
int i = 0;
for(String line : input) {
if(line.contains("error"))
output[i++] = line;
}
for(String line : input) {
if(!line.contains("error"))
output[i++] = line;
}
return output;
}
That just copies the array, starting with all error messages, then with all non-error messages.
It's also possible to merge those two loops into a nested loop, though the logic is less obvious.
static String[] errorsToTop(String[] input) {
String[] output = new String[input.length];
int i = 0;
boolean not = false;
do {
for(String line : input) {
if(line.contains("error") ^ not)
output[i++] = line;
}
} while(not = !not);
return output;
}
It's unclear to me whether the numbers appear in your input text file or not. If they don't, you can use startsWith instead of contains:
if(line.startsWith("error"))
You could also use matches with a regex like:
if(line.matches("^\\d+ error[\\s\\S]*"))
which says "starts with any integer followed by a space followed by error followed by anything or nothing".
Since no answer has been marked I'll add 2 cents.
This code below works for exactly what you posted (and maybe nothing else), it assumes that errors have higher numbers than non errors, and that you are grabbing top N of lines based on a time slice or something.
import java.util.NavigableMap;
import java.util.TreeMap;
public class SortDesc {
public static void main(String[] args) {
NavigableMap<Integer, String> descendingMap = new TreeMap<Integer, String>().descendingMap();
descendingMap.put(23545, "debug code_to_debug");
descendingMap.put(43535, "error check your code");
descendingMap.put(34243, "error check values");
descendingMap.put(32442, "run program execute");
descendingMap.put(24525, "error check your code");
System.out.println(descendingMap);
}
}
results look like this
{43535=error check your code, 34243=error check values, 32442=run program execute, 24525=error check your code, 23545=debug code_to_debug}

Java HashMap content seems to change without changing it

I have a problem concerning a HashMap in Java. To explain the problem in detail, I will first post some code you can refer to.
public void BLASTroute(String args[]) throws IOException, InterruptedException{
...
correctMapping CM = new correctMapping();
CM.correctMapping(RB.BLASTresults, exists);
CalculateNewConsensusSequence CNCS =
new CalculateNewConsensusSequence();
char[] consensus = CNCS.calculateNewConsensusSequence(
CM.newSeq, CM.remindGaps, EMBLreaderReference.sequence, exists);
HashMap<Integer, ArrayList<String>> gapsFused =
new HashMap<Integer, ArrayList<String>>();
for (Integer i : CM.remindGaps.keySet()) {
ArrayList<String> newList = CM.remindGaps.get(i);
gapsFused.put(i, newList);
}
GenerateGeneLists GGL = new GenerateGeneLists(
EMBLreaderReference, CM.newSeq, gapsFused, exists,
GQList, allMappedPositions);
System.out.println(CM.remindGaps.hashCode());
gapsFused=GGL.generateGeneListSNP(gapsFused);
System.out.println(CM.remindGaps.hashCode());
System.out.println(gapsFused.hashCode());
GGL.generateGeneListFrameShift(gapsFused);
}
The following occurs:
In my class correctMapping, I fill a global variable called remindGaps. I use it later in some functions, and everything works as expected.
Then I make a copy of the HashMap called gapsFused (I don't know if this has something to do with my problem).
Now comes the interesting part: in the class GenerateGeneLists, I don't do anything with the remindGaps HashMap.
However, after the function generateGeneListSNP has run, remindGaps has changed! I'll post the code as well, so that you can help me better:
public GenerateGeneLists(EMBL_reader EMBLreaderReference,
HashMap<String,ArrayList<String>> newSeq,
HashMap<Integer,ArrayList<String>> gapsFused, File exists,
ArrayList<GeneQualifier> GQlist,
HashMap<Integer,Integer> allMappedPositions)
throws InterruptedException{
this.EMBLreaderReference=EMBLreaderReference;
this.newSeq=newSeq;
//this.gapsFused=gapsFused;
this.exists=exists;
this.GQlist=GQlist;
this.allMappedPositions=allMappedPositions;
for (GeneQualifier GQ : this.GQlist){
startlist.add(GQ.start);
stoplist.add(GQ.stop);
startMap.put(GQ.start,GQ);
}
}
public HashMap<Integer,ArrayList<String>> generateGeneListSNP(
HashMap<Integer,ArrayList<String>> gapsFused)
throws IOException{
File GQSNP = new File (exists+"/GQsnp.txt");
BufferedWriter SNP = new BufferedWriter(new FileWriter(GQSNP));
SNP.write("#Gene_start\tGene_stop\tlocus_tag\tproduct" +
"\tputative_SNP_positions(putative_changes)\n");
HashMap<GeneQualifier,ArrayList<Integer>> GQreminder =
new HashMap<GeneQualifier,ArrayList<Integer>>();
for (String s : newSeq.keySet()){
ArrayList<String> blub = newSeq.get(s);
char[] qrySeq = blub.get(0).toCharArray();
char[] refSeq = blub.get(1).toCharArray();
int start = Integer.valueOf(blub.get(2));
int stop = Integer.valueOf(blub.get(3));
for (int i=0;i<refSeq.length;i++){
if (qrySeq[i]!=refSeq[i]&&qrySeq[i]!='-'&&qrySeq[i]!='.'){
if (mismatchList.containsKey(start+i)){
ArrayList<Character> blah = mismatchList.get(start+i);
blah.add(qrySeq[i]);
mismatchList.put(start+i, blah);
}
else {
ArrayList<Character> blah = new ArrayList<Character>();
blah.add(qrySeq[i]);
mismatchList.put(start+i,blah);
}
}
else if (qrySeq[i]!=refSeq[i]&&(qrySeq[i]=='-'||qrySeq[i]=='.')){
if (!gapsFused.containsKey(start+i)){
ArrayList<String> qwer = new ArrayList<String>();
qwer.add(String.valueOf(qrySeq[i]));
gapsFused.put(start+i,qwer);
}
else {
ArrayList<String> qwer = gapsFused.get(start+i);
qwer.add(String.valueOf(qrySeq[i]));
gapsFused.put(start+i,qwer);
}
if (!deletionPositionsAndCount.containsKey((start+i))){
int count = 1;
deletionPositionsAndCount.put(start+i, count);
}
else {
int count = deletionPositionsAndCount.get(start+i);
count = count+1;
deletionPositionsAndCount.put(start+i, count);
}
}
}
}
for (Integer a : mismatchList.keySet()){
for (int i=0;i<startlist.size();i++){
int start = startlist.get(i);
int stop = stoplist.get(i);
if (a>=start && a<=stop){
GeneQualifier GQ = startMap.get(start);
if (!GQreminder.containsKey(GQ)){
ArrayList save = new ArrayList<Integer>();
save.add(a);
GQreminder.put(GQ,save);
}
else {
ArrayList save = GQreminder.get(GQ);
save.add(a);
GQreminder.put(GQ,save);
}
break;
}
}
}
for (GeneQualifier GQ : GQreminder.keySet()) {
ArrayList<Integer> save = GQreminder.get(GQ);
int start = GQ.start;
int stop = GQ.stop;
String locus_tag =
GQ.geneFeatures.get("locus_tag").get(0).replace("\n", "");
String product =
GQ.geneFeatures.get("product").get(0).replace("\n", "");
SNP.write(start + "\t" + stop + "\t" + locus_tag +
"\t" + product + "\t");
boolean end = false;
for (int i = 0; i < save.size(); i++) {
if (i==save.size()-1) end=true;
int posi = save.get(i);
SNP.write(posi + "(");
ArrayList<Character> mismatches = mismatchList.get(posi);
for (int j = 0; j < mismatches.size(); j++) {
char snipp = mismatches.get(j);
if (j == mismatches.size() - 1) {
SNP.write(snipp + ")");
} else {
SNP.write(snipp + ",");
}
}
if (end == false){
SNP.write(",");
}
}
SNP.write("\n");
}
SNP.close();
return gapsFused;
}
As you can see, remindGaps is not used in this class, but it still changes. Do you have any idea why this is the case?
I tested whether remindGaps changes if I manually change gapsFused (the copy of the first HashMap). It does not, so I don't think the copying went wrong (for example, by only pointing to or referencing the other HashMap).
I would really appreciate your ideas and help in order to solve this problem.
You have to remember that in Java a variable of object type holds a reference, and assigning or passing it copies the reference, not the object. So, when you did:
ArrayList<String> newList = CM.remindGaps.get(i);
you pointed newList to the same list that is contained in the remindGaps map. Now, even though you work with gapsFused, any changes to its values affect the same underlying list in memory, to which both remindGaps and gapsFused are pointing.
Change your copy code to the following and see if it makes a difference:
ArrayList<String> newList = new ArrayList<String>(CM.remindGaps.get(i));
By doing this, you are creating a new list that newList will be pointing to and thus the changes will be encapsulated.
Your code is very long and hard to read (mainly because it doesn't respect Java naming conventions), but my guess is that your problem comes from the fact that your copy of the map simply copies the ArrayList references from one map to another:
HashMap<Integer, ArrayList<String>> gapsFused = new HashMap<Integer, ArrayList<String>>();
for (Integer i : CM.remindGaps.keySet()) {
ArrayList<String> newList = CM.remindGaps.get(i);
gapsFused.put(i, newList);
}
In the above code, you don't create any new list. You just store the same lists in another map. If you need a new list, the code should be:
Map<Integer, List<String>> gapsFused = new HashMap<Integer, List<String>>();
for (Integer i : CM.remindGaps.keySet()) {
List<String> newList = new ArrayList<String>(CM.remindGaps.get(i));
gapsFused.put(i, newList);
}
Without analyzing all your code:
HashMap<Integer, ArrayList<String>> gapsFused = new HashMap<Integer, ArrayList<String>>();
for (Integer i : CM.remindGaps.keySet()) {
ArrayList<String> newList = CM.remindGaps.get(i);
gapsFused.put(i, newList);
}
After this code, gapsFused will contain entries that are copies of the entries of remindGaps, so those entries reference the same objects (keys and values). If you add or remove entries in one map, it has no effect on the other, but if you change a value through one map, you will also see the change through the other map (for example, remindGaps.get(1).add("hello")).
The name "newList" used in your code is confusing because it is not a new list, just a reference to an existing one.
The values of the map are ArrayLists and you are making only a shallow copy (meaning the new map references the same lists as the first map), so changes to the lists in the second map are reflected in the first map. To avoid this, you need to make deep copies of the lists when you create the new map.
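A small sketch that contrasts the two kinds of copy. The helper name copyWithNewLists is invented for illustration, and the map contents are dummy data rather than anything from the question's pipeline. Copying each list is enough here because the elements are Strings, which are immutable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DeepCopyDemo {
    // Copies the map and gives every key its own new list.
    static Map<Integer, List<String>> copyWithNewLists(Map<Integer, List<String>> source) {
        Map<Integer, List<String>> copy = new HashMap<>();
        for (Map.Entry<Integer, List<String>> entry : source.entrySet()) {
            copy.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> remindGaps = new HashMap<>();
        remindGaps.put(1, new ArrayList<>(Arrays.asList("-")));

        // Shallow copy: a new map, but the very same list objects
        Map<Integer, List<String>> shallow = new HashMap<>(remindGaps);
        shallow.get(1).add(".");
        System.out.println(remindGaps.get(1)); // [-, .]  (the original changed too)

        // Copy with new lists: changes stay local to the copy
        Map<Integer, List<String>> deep = copyWithNewLists(remindGaps);
        deep.get(1).add("x");
        System.out.println(remindGaps.get(1)); // [-, .]
        System.out.println(deep.get(1));       // [-, ., x]
    }
}
```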
