Issue when iterating through lists: IndexOutOfBoundsException in Java - java

I'm writing a program that applies many principles of computational linguistics. My problem at the moment is the following piece of code from a method that "flexibilizes" two definitions. That is, it compares two different definitions of the same word and adds blank spaces to each definition, so that I can later work with the altered definitions (with the blank spaces added).
Say we have the following two definitions, defining the term "free fall".
1) Free fall descent of a body subjected only to the action of gravity.
2) Free fall movement of a body in a gravitational field under the influence of gravity
There is a list of words called stoplist, which contains the words: "of", "a", "in", "to", and "under". After the process, each word in the definition that is also contained in the stoplist must correspond to a blank space OR another stoplist word of the other definition. So after executing such process, the previous definitions, represented in two different lists, should look like this:
1) Free fall descent of a body ____ ____ subjected only to the action of gravity.
2) Free fall movement of a body in a gravitational field under the influence of gravity.
The code I wrote to achieve this is the following:
[...]
String[] sList = STOPLIST.split(" "); //this is the stoplist
String[] definition1 = defA1.split(" "); //this is the array of words of the first definition
String[] definition2 = defA2.split(" "); //this is the array of words of the second definition
List<String> def1 = new ArrayList<String>();
List<String> def2 = new ArrayList<String>();
List<String> stopList = new ArrayList<String>();
for(String word : definition1){
def1.add(word); //I transform arrays into lists this way because I used to think that using .asList() was the problem.
}
for(String word : definition2){
def2.add(word);
}
for(String word : sList){
stopList.add(word);
}
int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); //here mdef will have the value of the length of the shortest definition, and we are going to use the value of mdef to iterate later on.
for(int i = 0; i < mdef; i++){
if (stopList.contains(def1.get(i))) { //here I check if the first word of the first definition is also found in the stoplist.
if (!stopList.contains(def2.get(i))) { //If the word of def1 previously checked is in the stoplist, as well as the corresponding word in the second definition, then we won't add a " "(blank) space in the corresponding position of the second definition.
def2.add(i , " "); //here I add that blank space, only if the stoplist word in def1 corresponds to a non-stoplist word in def2. Again, we do this so the stoplist word in def1 corresponds to a blank space OR another stoplist word in def2.
if(mdef == def2.size())
mdef++; //In case the shortest definition is the definition to which we just added spaces, we increment mdef, because the added space increases the length of the shortest definition, and to iterate over this recently extended definition, we have to increase the bound with which we iterate.
}
} else if (stopList.contains(def2.get(i))) { //this else if does the same as the previous one, but checks the second definition instead of the first one, and adds blanks to def1 instead of def2 if necessary.
if (!stopList.contains(def1.get(i))) {
def1.add(i , " ");
if(mdef == def1.size())
mdef++;
}
}
}
[...]
Now, if you analyze the code carefully, you will realize that not all words of the longest list will be checked, given that we iterate over the definitions using the length of the shortest definition as the bound. This is fine: the remaining words of the longest definition don't have to be checked, because they will correspond to null spaces of the other definition (in case the lists don't end up being the same length after the addition of spaces, as the previous example shows).
Now, after the explanation, the problem is the following: after running the main class, which calls the method that contains the previous code, a runtime exception is thrown:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 1, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at main2.main(main2.java:75)
I don't understand why it finds one of the lists to be empty. I have tried to solve it in many ways; I hope I gave a good explanation.
It may help as a clue that if I assign mdef to the lengthiest size instead of the shortest, that is :
int mdef = (def1.size() >= def2.size()) ? def1.size() : def2.size();
the error changes to:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 15, Size: 15
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at asmethods.lcc.turnIntoFlex(lcc.java:55)
at asmethods.lcc.calLcc(lcc.java:99)
at main2.main(main2.java:73)
Where lcc is the class that contains the method turnIntoFlex that contains the piece of code I'm showing. The line 55 of "turnIntoFlex" corresponds to the first line of the loop, that is:
if (stopList.contains(def1.get(i))) { [...]
Comments: The values of defA1 and defA2 are the two definitions, so def1 and def2 are initially lists in which each element is a single word. I can't check whether these lists are being populated by printing them, because the IndexOutOfBoundsException is thrown the moment the loop starts. However, I do print the values of mdef, def1.size() and def2.size(), and they turn out to be 13 or 15, showing that no list is empty before the for loop starts.
The mdef++ was something I added recently, not specifically to solve this problem; the error had been appearing before I added it. As I explained, the intention is to increment mdef when the shortest list is extended (and only then), so that we iterate through all the words of the short list and no more.

One issue with your code is that when you increment mdef you do not check to see if it now exceeds the length of the other list.
For example, suppose def1 had 3 words and def2 had 4 words. mdef would start at 3. But then suppose you successively add two spaces to def1 and increment mdef twice to be 5. This now exceeds the length of def2 and will then cause an index out of bounds exception in the def2 else condition if you keep iterating up to 5.
Added later:
Another serious issue with your code (that I thought of later) is that when you add the space to a list (either def1 or def2) this shifts the indices of all of the subsequent elements up by 1. So, for example, if you add a space at spot 0 in def1 when i is 0, then on the next pass through the loop, having incremented i to 1, you will look at the same word in def1 that you looked at in the previous pass. This is probably the source of some of your exceptions (as it would lead to a continual loop until you exceed the length of the other list: problem #1 above).
To correct both of these issues, you would need to change your code to something like:
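int i = 0;
int j = 0;
while (i < def1.size() && j < def2.size()) {
    if (stopList.contains(def1.get(i)) && !stopList.contains(def2.get(j)))
        def2.add(j, " "); // pad def2; its word at j shifts to j + 1 and is compared on the next pass
    else if (stopList.contains(def2.get(j)) && !stopList.contains(def1.get(i)))
        def1.add(i, " "); // pad def1; its word at i shifts to i + 1 and is compared on the next pass
    // advance both positions together so the two lists stay aligned index by index
    ++i;
    ++j;
}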
Note that you don't need mdef any more in this implementation.
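If inserting into the lists while walking them feels fragile, one alternative is to build two new padded lists and leave the inputs untouched, so no index ever shifts. This is only a sketch: the class name FlexAlign, the method align and the constant BLANK are made up for illustration, and it assumes that a stoplist word facing a non-stoplist word should always be paired with a blank, as in the "free fall" example from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FlexAlign {
    static final String BLANK = " "; // hypothetical marker for an inserted gap

    // Returns two padded lists: element 0 is the padded def1, element 1 the padded def2.
    static List<List<String>> align(List<String> def1, List<String> def2, Set<String> stop) {
        List<String> out1 = new ArrayList<>();
        List<String> out2 = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < def1.size() && j < def2.size()) {
            boolean stop1 = stop.contains(def1.get(i));
            boolean stop2 = stop.contains(def2.get(j));
            if (stop1 && !stop2) {          // stoplist word in def1 faces a blank in def2
                out1.add(def1.get(i++));
                out2.add(BLANK);
            } else if (stop2 && !stop1) {   // stoplist word in def2 faces a blank in def1
                out1.add(BLANK);
                out2.add(def2.get(j++));
            } else {                        // both or neither are stoplist words
                out1.add(def1.get(i++));
                out2.add(def2.get(j++));
            }
        }
        // Copy whatever is left of the longer definition unchanged.
        while (i < def1.size()) out1.add(def1.get(i++));
        while (j < def2.size()) out2.add(def2.get(j++));
        return Arrays.asList(out1, out2);
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("of", "a", "in", "to", "under"));
        List<String> def1 = Arrays.asList(
            "Free fall descent of a body subjected only to the action of gravity".split(" "));
        List<String> def2 = Arrays.asList(
            "Free fall movement of a body in a gravitational field under the influence of gravity".split(" "));
        for (List<String> padded : align(def1, def2, stop)) {
            StringBuilder sb = new StringBuilder();
            for (String word : padded) {
                sb.append(word.equals(BLANK) ? "____" : word).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

Every pass through the loop consumes at least one word from one of the inputs, so the loop always terminates, and a Set is used for the stoplist only because contains() is cheaper on it than on a List.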

Man, I think I got it. I modified the code, but I hope you understand what I did:
static public void main(String[] argv) {
String[] sList = "of a in to under".split(" ");
String[] definition1 = "Free fall descent of a body subjected only to the action of gravity"
.split(" ");
String[] definition2 = "Free fall movement of a body in a gravitational field under the influence of gravity"
.split(" ");
List<String> def1 = new ArrayList<String>();
List<String> def2 = new ArrayList<String>();
List<String> stopList = new ArrayList<String>();
for (String word : definition1) {
def1.add(word);
}
for (String word : definition2) {
def2.add(word);
}
for (String word : sList) {
stopList.add(word);
}
int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // Shortest
// length
for (int i = 0; i < mdef; i++) {
System.out.println(i);
if (!stopList.contains(def1.get(i)) && !stopList.contains(def2.get(i))) {
continue;
}
else if (stopList.contains(def1.get(i)) && stopList.contains(def2.get(i))) {
continue;
}
else if (!stopList.contains(def1.get(i)) && stopList.contains(def2.get(i))) {
def1.add(i, " ");
mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // define mdef again
}
else if (stopList.contains(def1.get(i)) && !stopList.contains(def2.get(i))) {
def2.add(i, " ");
mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); // define mdef again
}
}
for (String word : def1) {
if (word.equals(" "))
System.out.print("_ ");
else
System.out.print(word+" ");
}
System.out.println();
for (String word : def2) {
if (word.equals(" "))
System.out.print("_ ");
else
System.out.print(word+" ");
}
}

Is this the exact code you're using? I just ran it and it worked fine, I used:
import java.util.*;
public class HelloWorld {
public static void main(String []args) {
String stoplist= "of a in to and under";
String defA1 = "Free fall descent of a body subjected only to the action of gravity";
String defA2 = "Free fall movement of a body in a gravitational field under the influence of gravity";
String[] sList = stoplist.split(" "); //this is the stoplist
String[] definition1 = defA1.split(" "); //this is the array of words of the first definition
String[] definition2 = defA2.split(" "); //this is the array of words of the second definition
List<String> def1 = new ArrayList<String>();
List<String> def2 = new ArrayList<String>();
List<String> stopList = new ArrayList<String>();
for (String word : definition1) {
def1.add(word); //I transform arrays into lists this way because I used to think that using .asList() was the problem.
}
for (String word : definition2) {
def2.add(word);
}
for (String word : sList) {
stopList.add(word);
}
int mdef = (def1.size() <= def2.size()) ? def1.size() : def2.size(); //here mdef will have the value of the length of the shortest definition, and we are going to use the value of mdef to iterate later on.
for (int i = 0; i < mdef; i++) {
if (stopList.contains(def1.get(i))) { //here I check if the first word of the first definition is also found in the stoplist.
if (!stopList.contains(def2.get(i))) { //If the word of def1 previously checked is in the stoplist, as well as the corresponding word in the second definition, then we won't add a " "(blank) space in the corresponding position of the second definition.
def2.add(i , " "); //here I add that blank space, only if the stoplist word in def1 corresponds to a non-stoplist word in def2. Again, we do this so the stoplist word in def1 corresponds to a blank space OR another stoplist word in def2.
if (mdef == def2.size())
mdef++; //In case the shortest definition is the definition to which we just added spaces, we increment mdef, because the added space increases the length of the shortest definition, and to iterate over this recently extended definition, we have to increase the bound with which we iterate.
}
} else if (stopList.contains(def2.get(i))) { //this else if does the same as the previous one, but checks the second definition instead of the first one, and adds blanks to def1 instead of def2 if necessary.
if (!stopList.contains(def1.get(i))) {
def1.add(i , " ");
if (mdef == def1.size())
mdef++;
}
}
}
for (String word : def1) {
System.out.print(word+",");
}
System.out.println();
for (String word : def2) {
System.out.print(word+",");
}
}
}

Related

Comparing two Strings if word spacing and capitalization do not matter-Java

I want to create a method that takes two String objects as input. The method should return true if both strings are the same (word spacing and capitalization do not matter). My idea was to split each String into an array of elements, add each element to a List, compare each element with a space and remove it from the List if it matches, and finally use compareToIgnoreCase(). I got stuck on removing the spaces from the List for string2. It works for string1List but not for string2List, and I'm wondering why.
I will be grateful for help; I have spent a lot of time on this and I'm stuck. Maybe someone knows a better solution.
import java.util.ArrayList;
import java.util.List;
public class Strings {
public static void main(String[] args) {
String string1 = "This is a first string";
String string2 = "this is a first string";
String[] arrayOfString1 = string1.split("");
List<String> string1List = new ArrayList<>();
for (int i = 0; i < arrayOfString1.length; ++i) {
string1List.add(arrayOfString1[0 + i]);
}
String[] arrayOfString2 = string2.split("");
List<String> string2List = new ArrayList<>();
for (int i = 0; i < arrayOfString2.length; ++i) {
string2List.add(arrayOfString2[0 + i]);
}
for (int i = 0; i < string1List.size(); ++i) {
String character = string1List.get(0 + i);
if (character.equals(" ")) {
string1List.remove(character);
}
}
for (int i = 0; i < string2List.size(); ++i) {
String character = string2List.get(0 + i);
if (character.equals(" ")) {
string2List.remove(character);
}
}
System.out.println(string2List.size());
}
}
You can try the solution below. As you mentioned, word spacing and capitalization do not matter:
1. Remove capitalization using toLowerCase().
2. Remove all word spacing using replaceAll() with the regex pattern "\\s+", which matches every run of whitespace.
3. Compare the two processed strings.
public class StringChecker {
public static void main(String[] args) {
System.out.println(checkString("This is a first string", "this is a first string"));
}
public static boolean checkString(String string1, String string2){
String processedStr1 = string1.toLowerCase().replaceAll("\\s+", "");
String processedStr2 = string2.toLowerCase().replaceAll("\\s+", "");
System.out.println(" s1 : " + processedStr1);
System.out.println(" s2 : " + processedStr2);
return processedStr1.equals(processedStr2);
}
}
Your problem has nothing to do with spaces. You can replace them with any other character (for example "a") to test this. Therefore, removing spaces with any of the methods given above will not fix the code you wrote.
The source of the problem is removing items while iterating over the list with an index-based for loop. When you remove the i-th element inside the loop, every later element shifts down by one, so the element that used to be at i + 1 is now at i.
On the next repetition of the loop, i is incremented by one, so that shifted element is never examined and you "lose" (at least) one item. Therefore, it is a bad idea to remove items while iterating through the list with an index-based for loop.
However, you can use other mechanisms available for collections, for instance an Iterator, and your program will work fine.
Iterator <String> it = string1List.iterator();
while(it.hasNext())
{
if(it.next().equals("a")) it.remove();
}
Of course there is no need at all to use Lists to compare these two strings.
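To make the skipping effect concrete, here is a small sketch that removes a target element in two ways: with an index-based loop, which misses the second of two adjacent matches, and with List.removeIf (available since Java 8), which does not. The class and method names are invented for the example, and the index loop removes by position rather than by value, which is an assumption about what the question intended.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveWhileIterating {
    // Index-based removal: after remove(i) the next element slides into slot i,
    // and the loop's i++ then steps over it without checking it.
    static List<String> removeWithIndexLoop(List<String> input, String target) {
        List<String> list = new ArrayList<>(input);
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).equals(target)) {
                list.remove(i);
            }
        }
        return list;
    }

    // removeIf visits every element exactly once, so nothing is skipped.
    static List<String> removeWithRemoveIf(List<String> input, String target) {
        List<String> list = new ArrayList<>(input);
        list.removeIf(target::equals);
        return list;
    }

    public static void main(String[] args) {
        List<String> chars = Arrays.asList("a", " ", " ", "b");
        System.out.println(removeWithIndexLoop(chars, " ").size());
        System.out.println(removeWithRemoveIf(chars, " ").size());
    }
}
```

With two adjacent spaces in the input, the index loop leaves one of them behind, while removeIf removes both.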

Java word counts: Error: Cannot find symbol. Java compiler

Please help me get this work :(
(My objective is to count words in a file from input.txt file.)
I get 8 compiling errors in the following code. (All cannot find symbol)
Compiler points out:
The letter "P" in "Paths.get".
The letter "F" in "Files.readAllLines"
The symbol "." in "titleList.add(line)"
The symbol "." in "titleList.get"
The symbol "." in "st.add(filterList);"
The symbol "." in "filterList.removelAll(stopWordsArray);"
The letter "n" in "Map map = new Map< String, Integer>();"
The symbol "." in "for (int i =0; i < map.length && i < 20; i++)"
Lines of code with errors are bold.
Any idea what is wrong with my code? I'm a beginner with Java and help is greatly appreciated.
public String[] process() throws Exception {
String[] ret = new String[20];
//TODO
// Pass user id
initialRandomGenerator(this.userName);
// Get the index into array
Integer[] indexes = this.getIndexes();
// Create Array to store input.txt
String[] titleList = new String[10000];
// Put each line of input.txt in Array
// ERRORS HERE
**for (String line : Files.readAllLines(Paths.get(this.inputFileName))){
titleList.add(line);
}**
// Create array to store list of words to be proccess
String[] filterList = new String[50000];
// Look for words in file
for (int i = 0; i < indexes.length; i++){
// Tokennize, lower case, trim and stop delimiter.
// ERRORS HERE
**StringTokenizer st = new StringTokenizer(titleList.get(indexes[i]).trim().toLowerCase(), this.delimiters);**
// Add word to Filter list array
**st.add(filterList);**
}
// Remove stopWords from filter list array.
// ERRORS HERE
**filterList.removelAll(stopWordsArray);**
// Declaring the map to count
// ERRORS HERE
**Map<String, Integer> map = new Map< String, Integer>();**
// Loop to count
for (int i = 0; i < filterList.length; i++ ){
// Get the word
String word = filterList[i];
//Count the word
if (map.get(word) != null) {
// another occurence of an existing
// word
int count = map.get(word);
count++;
map.put(word, count);
} else {
// first occurence of this word
map.put(word, 1);
}
}
// Sort the list by frequency in a descending order. Use sort function.
// map.collections.sort( list, new Comparator<Map.Entry<word, count>>();
// Display first 20 words.
// ERRORS HERE
**for (int i =0; i < map.length && i < 20; i++){
System.out.println(filterList[i]);**
}
return ret;
}
Ok, so I got it to work. It was a bit more difficult because the file given could not be debugged in the IDE. The issues were pretty simple.
The question listed 8 compile errors (all "cannot find symbol"). For each one, the compiler pointed at:
1- The letter "P" in "Paths.get".
(I needed to import library: import java.nio.file.Paths;)
2- The letter "F" in "Files.readAllLines"
(I needed to import library: import java.nio.file.Files;)
3- The symbol "." in "titleList.add(line)"
(The method .add is only available in collections, e.g. ArrayList
I was trying to use it in an Array. )
4- The symbol "." in "titleList.get"
(The method .get is only available in collections, e.g. ArrayList
I was trying to use it in an Array. )
5- The symbol "." in "st.add(filterList);"
(The method .add is only available in collections, e.g. ArrayList
I was trying to use it in a StringTokenizer )
6- The symbol "." in "filterList.removelAll(stopWordsArray);"
(The method .removeAll is only available in collections, e.g. ArrayList, and I had also misspelled it as "removelAll".
I was trying to use it on an array.)
7- The letter "n" in "Map map = new Map< String, Integer>();"
(Map is an interface and cannot be instantiated directly; it needs to be initialised with a concrete implementation, e.g. HashMap.)
8- The symbol "." in "for (int i =0; i < map.length && i < 20; i++)"
(length is not a property of maps; the size() method must be used instead.)
Kind regards
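For anyone landing here with the same errors, this is roughly what the counting part can look like once those fixes are applied. It is a sketch, not the original assignment code: the class name WordCountSketch and the methods countWords and top are invented, the lines are passed in as a List (in the real program they could come from Files.readAllLines(Paths.get(...))), and ties in the ranking are broken alphabetically, which is my own assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringTokenizer;

public class WordCountSketch {
    // Tokenizes each line, lower-cases it, skips stop words and counts the rest.
    static Map<String, Integer> countWords(List<String> lines, String delimiters, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>(); // a concrete Map implementation
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line.trim().toLowerCase(), delimiters);
            while (st.hasMoreTokens()) {
                String word = st.nextToken();
                if (!stopWords.contains(word)) {
                    counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
                }
            }
        }
        return counts;
    }

    // Returns up to n entries, highest count first; ties are ordered alphabetically.
    static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size())); // size(), not length
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("The cat and the dog", "the Cat");
        Set<String> stopWords = new HashSet<>(Arrays.asList("and"));
        for (Map.Entry<String, Integer> e : top(countWords(lines, " ", stopWords), 20)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```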

getting the words before a specific String value

I asked a question about the following method a while ago and have now come up with a quite different question. Suppose I have a String such as A B -> C, A B -> carry sum, or X Y Cin -> Cout Sum. How can I extract the words before -> without including ->? And then extract the words after ->?
public void parseContactsLine(String line)
{
String[] words = line.split("->");
for(int i = 0; i < words.length; i++)
{
}
}
Your first split on "->" separates the line into two strings, one for the words on each side.
You can then split each side again on whitespace, using .trim().split("\\s+"), to get an array of its words. The trim() matters because the text after "->" starts with a space, which would otherwise produce an empty first element.
You would end up with
String[] wordsAfterLambda = line.split("->")[1].trim().split("\\s+");
for (String s : wordsAfterLambda)
    System.out.println(s);
Notice that I used a for-each instead of a plain for, which I tend to prefer when there is no need to keep the index.
Edit
As per your comment, the [1] indexes into the array returned by split and is not part of the split call itself. It is the same as doing
String[] words = line.split("->");
String[] wordsAfterLambda = words[1].trim().split("\\s+");
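Putting both halves together, a sketch along these lines returns the words on each side of the arrow. The class ArrowSplit and its method names are invented for the example, and it assumes that a line with nothing after "->" (or with no "->" at all) should simply give an empty array for that side.

```java
import java.util.Arrays;

public class ArrowSplit {
    // Words to the left of the first "->".
    static String[] wordsBefore(String line) {
        String[] sides = line.split("->", 2); // at most two pieces
        return splitWords(sides[0]);
    }

    // Words to the right of the first "->", or an empty array if there is none.
    static String[] wordsAfter(String line) {
        String[] sides = line.split("->", 2);
        return sides.length < 2 ? new String[0] : splitWords(sides[1]);
    }

    // Trims first so that leading whitespace does not yield an empty first element.
    private static String[] splitWords(String side) {
        String trimmed = side.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "X Y Cin -> Cout Sum";
        System.out.println(Arrays.toString(wordsBefore(line)));
        System.out.println(Arrays.toString(wordsAfter(line)));
    }
}
```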

how to find the index of an item in an array java

I want to find the index of the first item that starts with a given letter in an array or ArrayList in Java.
Example: I have:
String[] arr={"apple","at","book","bad","car","cat"};
When I input a, I should get index 0; for b, index 2; for c, index 4.
My array has over 20,000 items, so linear search is too slow, and my list is unsorted, so I can't use binary search either.
How can I get the index of the item as in the example above?
You can run some initialization code (before the user starts to type letters in).
// initialize an array that has a cell for each letter with value -1
int[] firstIndexes = new int[26];
for (int i = 0; i < firstIndexes.length; i++) {
    firstIndexes[i] = -1;
}
// loop over original array and look for each letter's first occurrence
// (assumes every word is non-empty and starts with a lowercase letter a-z)
for (int i = 0; i < wordsArray.length; i++) {
    char c = wordsArray[i].charAt(0); // a String cannot be indexed with [0]
    if (firstIndexes[c - 'a'] < 0) {
        firstIndexes[c - 'a'] = i;
    }
}
Then when the user types a letter you just need to find its index in the 'firstIndexes' array.
If you want to get all the indexes of words starting with a certain letter, then try this one:
While adding the words to your array/list (which will hold all your words), you could also record each word's index in a map keyed by first letter.
Map<String, ArrayList<Integer>> myMap = new HashMap<String, ArrayList<Integer>>();
public void yourmethod() {
    // adding all your words to an Array/ArrayList goes here (arr[] in this case);
    // yourword is the word being added and index is its position in arr (both placeholders)
    String firstLetter = yourword.substring(0, 1);
    if (!myMap.containsKey(firstLetter)) {
        myMap.put(firstLetter, new ArrayList<Integer>());
    }
    myMap.get(firstLetter).add(index); // store the index, not the word itself
}
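As a complete, runnable variant of the map idea, the sketch below builds the index once and then answers lookups without scanning the array again. The names FirstLetterIndex, build and firstIndex are made up, and it assumes that lookups should be case-insensitive and that empty or null entries can be ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FirstLetterIndex {
    // One pass over the array: first letter -> positions of all words starting with it.
    static Map<Character, List<Integer>> build(String[] words) {
        Map<Character, List<Integer>> index = new HashMap<>();
        for (int i = 0; i < words.length; i++) {
            if (words[i] == null || words[i].isEmpty()) {
                continue; // nothing to index
            }
            char key = Character.toLowerCase(words[i].charAt(0));
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        return index;
    }

    // Position of the first word starting with the letter, or -1 if there is none.
    static int firstIndex(Map<Character, List<Integer>> index, char letter) {
        List<Integer> hits = index.get(Character.toLowerCase(letter));
        return hits == null ? -1 : hits.get(0);
    }

    public static void main(String[] args) {
        String[] arr = {"apple", "at", "book", "bad", "car", "cat"};
        Map<Character, List<Integer>> index = build(arr);
        System.out.println(firstIndex(index, 'b'));
        System.out.println(index.get('c'));
    }
}
```

Building the map is still a linear pass, but it happens once; each later lookup is a single hash access.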

Remove Null elements from a (String) Array in Java

Hey guys, I'm new to Java (well, 3/4 of a year spent on it).
So I don't know much about it. I can do basic things, but the advanced concepts have not been explained to me, and there is so much to learn! So please go a little bit easy on me...
Ok, so I have this project where I need to read lines of text from a file into an array but only those which meet specific conditions. Now, I read the lines into the array, and then skip out on all of those which don't meet the criteria. I use a for loop for this. This is fine, but then when I print out my array (required) null values crop up all over the place where I skipped out on the words.
How would I remove the null elements specifically? I have tried looking everywhere, but the explanations have gone way over my head!
Here is the code that I have to deal with the arrays specifically: (scanf is the scanner, created a few lines ago):
//create string array and re-open file
scanf = new Scanner(new File ("3letterWords.txt"));//re-open file
String words [] = new String [countLines];//word array
String read = "";//to read file
int consonant=0;//count consonants
int vowel=0;//count vowels
//scan words into array
for (int i=0; i<countLines; i++)
{
read=scanf.nextLine();
if (read.length()!=0)//skip blank lines
{
//add vowels
if (read.charAt(0)=='a'||read.charAt(0)=='e'||read.charAt(0)=='i'||read.charAt(0)=='o'||read.charAt(0)=='u')
{
if (read.charAt(2)=='a'||read.charAt(2)=='e'||read.charAt(2)=='i'||read.charAt(2)=='o'||read.charAt(2)=='u')
{
words[i]=read;
vowel++;
}
}
//add consonants
if (read.charAt(0)!='a'&&read.charAt(0)!='e'&&read.charAt(0)!='i'&&read.charAt(0)!='o'&&read.charAt(0)!='u')
{
if (read.charAt(2)!='a'&&read.charAt(2)!='e'&&read.charAt(2)!='i'&&read.charAt(2)!='o'&&read.charAt(2)!='u')
{
words[i]=read;
consonant++;
}
}
}//end if
//break out of loop when reached EOF
if (scanf.hasNext()==false)
break;
}//end for
//print data
System.out.println("There are "+vowel+" vowel words\nThere are "+consonant+" consonant words\nList of words: ");
for (int i=0; i<words.length; i++)
System.out.println(words[i]);
Thanks so much for any help received!
Just have a different counter for the words array and increment it only when you add a word:
int count = 0;
for (int i=0; i<countLines; i++) {
...
// in place of: words[i] = read;
words[count++] = read;
...
}
When printing the words, just loop from 0 to count.
Also, here's a simpler way of checking for a vowel/consonant. Instead of:
if (read.charAt(0)=='a'||read.charAt(0)=='e'||read.charAt(0)=='i'||read.charAt(0)=='o'||read.charAt(0)=='u')
you can do:
if ("aeiou".indexOf(read.charAt(0)) > -1)
Update: Say read.charAt(0) is some character x. The above line says look for that character in the string "aeiou". indexOf returns the position of the character if found or -1 otherwise. So anything > -1 means that x was one of the characters in "aeiou", in other words, x is a vowel.
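Combining the indexOf check with a growable list avoids the null gaps entirely, because a word is only stored when it qualifies. This is a sketch under a few assumptions: the names VowelFilter, isVowel and keepMatching are invented, the lines are passed in as a List instead of being read with a Scanner, and lines shorter than three characters are skipped rather than causing an exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VowelFilter {
    // True if c is one of a, e, i, o, u (lowercase only, as in the question).
    static boolean isVowel(char c) {
        return "aeiou".indexOf(c) > -1;
    }

    // Keeps words whose first and third letters are both vowels or both consonants.
    static List<String> keepMatching(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.length() < 3) {
                continue; // blank or too short to have a third letter
            }
            if (isVowel(line.charAt(0)) == isVowel(line.charAt(2))) {
                kept.add(line); // appended at the end, so no null slots appear
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ace", "cat", "", "ant", "bee");
        for (String word : keepMatching(lines)) {
            System.out.println(word);
        }
    }
}
```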
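public static String[] removeElements(String[] allElements) {
    String[] _localAllElements = new String[allElements.length];
    int count = 0; // next free slot in the result
    for (int i = 0; i < allElements.length; i++)
        if (allElements[i] != null)
            _localAllElements[count++] = allElements[i]; // pack non-null elements to the front
    return java.util.Arrays.copyOf(_localAllElements, count); // drop the unused tail
}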
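If you are on Java 8 or later, the same null filtering can also be written with a stream. A minimal sketch, with NullStripper and withoutNulls as invented names:

```java
import java.util.Arrays;
import java.util.Objects;

public class NullStripper {
    // Returns a new array containing only the non-null elements, in their original order.
    static String[] withoutNulls(String[] input) {
        return Arrays.stream(input)
                     .filter(Objects::nonNull)
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] words = {"cat", null, "ace", null};
        System.out.println(Arrays.toString(withoutNulls(words)));
    }
}
```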
