Java string Out of bounds foolishness - java

I am reading from a file, and one of the lines in the file reads:
"{a,b,c,d}". I ran split on that line, which returned a String array of 10 elements (not sure why 10 instead of 9), each element being one character long. I then loop over that array with a for-each loop, and in the body of the loop I check whether that "char" is a letter or digit and, if it is, do something else. But on the first iteration of that loop I get a StringIndexOutOfBoundsException, and for the life of me I do not know why. I did not run into this issue while using IntelliJ. This is being run in Eclipse and it's driving me nuts. Please help.
for (int line = 1; line < file.size(); line++) // start at line 1 because the line with the alphabet is useless.
{
    if (line == 1) // Line 1 is always the states... so create the states.
    {
        String[] array = file.get(line).split("");
        for (String str : array) // split returns a String array; each element is one char long
        {
            // we only care about the letters and not the '{' or the ','
            if (Character.isLetterOrDigit(str.charAt(0))) {
                dfa.addState(new State(str.charAt(0)));
            }
        }
    }
}
This is a screenshot of what the String array looks like after the split
This is what the line looks like after the file has been read.

The first string in the array is an empty string: "". It has a length of 0 so str.charAt(0) is out of bounds.

I don't think you need the split function here; you want to read the whole line char by char, just like array access.
Splitting on an empty string can produce an extra empty element, because split has no special handling for that case (it behaves as expected when the delimiter is one character long).
You can rewrite your code along the lines below:
for (int line = 1; line < file.size(); line++) // start at line 1 because the line with the alphabet is useless.
{
    if (line == 1) // Line 1 is always the states... so create the states.
    {
        String lineValue = file.get(line);
        int len = lineValue.length();
        for (int index = 0; index < len; index++) {
            if (Character.isLetterOrDigit(lineValue.charAt(index))) {
                // dfa.addState(new State(lineValue.charAt(index)));
            }
        }
    }
}

From your screenshot, the array element at index 0 has length 0, i.e. it is an empty String. During iteration you try to access the first char of that empty String, so it throws a StringIndexOutOfBoundsException.
To avoid this, check the length (or isEmpty()) before accessing its chars.
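The guard suggested above can be sketched as follows. The class and method names (StateChars, stateChars) are made up for illustration, and the State/dfa types from the question are left out so the snippet runs on its own. As far as I know, split("") returns a leading empty string on Java 7 and earlier but not on Java 8 and later, so the IntelliJ/Eclipse difference may simply come from the two IDEs running different JDKs; that part is a guess about the asker's setup.

```java
import java.util.ArrayList;
import java.util.List;

class StateChars {
    // Collects the letters/digits from a line such as "{a,b,c,d}".
    // Hypothetical helper; the question's dfa.addState(...) call would go where result.add(...) is.
    static List<Character> stateChars(String line) {
        List<Character> result = new ArrayList<>();
        for (String piece : line.split("")) {
            // Guard: a piece may be "" (e.g. the first element on older JDKs),
            // and charAt(0) on an empty String throws StringIndexOutOfBoundsException.
            if (piece.isEmpty()) {
                continue;
            }
            char c = piece.charAt(0);
            if (Character.isLetterOrDigit(c)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stateChars("{a,b,c,d}")); // prints [a, b, c, d]
    }
}
```

With the isEmpty() check in place the loop behaves the same whether or not the array starts with an empty element.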

Related

modify empty characters in a character array

I'm attempting to take in a string from the console of a certain length and set the empty characters in the string to an asterisk.
System.out.println("Enter a string of digits.");
someString = input.next();
if (someString.matches("\\d{0,9}")) {
    charArr = someString.toCharArray();
    for (char digit : charArr) {
        if (!Character.isDefined(charArr[digit])) {
            charArr[digit] = '*';
        }
    }
    System.out.printf("Your string is: %s%n", new String(charArr));
}
This code is throwing an array index out of bounds exception and I'm not sure why.
for ( char digit: charArr) will iterate over each character from charArr.
Thus, digit contains a character value from charArr.
When you access the element from charArr by writing charArr[digit], you are converting digit from datatype char to int value.
For example, you have charArr = new char[]{'a','b','c'}.
charArr['a'] is equivalent to charArr[97], but charArr has a length of only 3.
Thus, the access falls outside the bounds of charArr and throws an ArrayIndexOutOfBoundsException.
Solution: loop through the array by index rather than by element.
for (int i = 0; i < charArr.length; i++) {
    // access using charArr[i] instead of charArr[digit]
    ...
}
Think you could do it in one line with:
newString = someString.replaceAll("\\s", "*");
"\s" is the regex pattern for a whitespace character.
I think you're mixing up your for loops. In your example, you're going over every character in someString.toCharArray(), so !Character.isDefined(charArr[digit]) does not do what you want: digit is a char value taken from the array, not a position in it. Java silently widens the char to an int, so the code compiles, but the character code is then used as the index.
If you're checking purely whether a character is a space, you can simply do one of the following:
if (digit != ' ')
if (!Character.isWhitespace(digit))
if (Character.isDigit(digit))
This loop statement:
for (char digit: charArr) {
iterates the values in the array. The values have type char and can be anything from 0 to 65535. However, this statement
if (!Character.isDefined(charArr[digit])) {
uses digit as an index for the array. For that to "work" (i.e. not throw an exception), the value needs to be in the range 0 to charArr.length - 1. Clearly, for the input string you are using, some of those values are not acceptable as indexes (e.g. value >= charArr.length) and an exception ensues.
But you don't want to fix that by testing whether digit is in the required range. The values of digit are not (from a semantic perspective) array indexes anyway. (If you use them as if they were indexes, you will end up missing some positions in the array.)
If you want to index the values in the array, do this:
for (int i = 0; i < charArr.length; i++) {
and then use i as the index.
Even when you have fixed that, there is still a problem with your code ... for some use cases.
If your input is encoded using UTF-8 (for example), it could include Unicode code points (characters) that are greater than 65535 and are represented in the Java string as two consecutive char values (a so-called surrogate pair). If your string contains surrogate pairs, then isDefined(char) is not a valid test. Instead you should be using isDefined(int) and (more importantly) iterating the Unicode code points in the string, not the char values.
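A minimal sketch of the code point iteration described above, assuming that "empty" in the question means code points for which Character.isDefined(int) returns false. The class and method names (CodePointMask, maskUndefined) are invented for this example.

```java
class CodePointMask {
    // Replaces every undefined code point with '*'.
    // Iterates code points, so a surrogate pair is tested as one character.
    static String maskUndefined(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (Character.isDefined(cp)) {
                out.appendCodePoint(cp); // keeps surrogate pairs intact
            } else {
                out.append('*');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskUndefined("12345"));
    }
}
```

Building a new string avoids indexing into the char array at all, which sidesteps both the index bug and the surrogate pair issue.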

Character occurrence in a txt file java

I'm writing a character occurrence counter in a txt file. I keep getting a result of 0 for my count when I run this:
public double charPercent(String letter) {
    Scanner inputFile = new Scanner(theText);
    int charInText = 0;
    int count = 0;
    // counts all of the user identified character
    while (inputFile.hasNext()) {
        if (inputFile.next() == letter) {
            count += count;
        }
    }
    return count;
}
Anyone see where I am going wrong?
This is because Scanner.next() returns entire words rather than characters. This means that the string returned by next() will rarely be the same as the single-letter parameter (except where the word is a single letter such as 'I' or 'A'). I also don't see the need for this line:
int charInText = 0;
as the variable is not being used.
Instead you could try something like this:
public double charPercent(String letter) {
    Scanner inputFile = new Scanner(theText);
    int totalCount = 0;
    while (inputFile.hasNext()) {
        // Read the word once; calling next() twice would consume two different words
        String word = inputFile.next();
        // Difference in length of the word with and without the given letter
        int occurrencesInWord = word.length() - word.replace(letter, "").length();
        totalCount += occurrencesInWord;
    }
    return totalCount;
}
By using the difference between the length of the word at inputFile.next() with and without the letter, you will know the number of times the letter occurs in that specific word. This is added to the total count and repeated for all words in the txt.
Use inputFile.next().equals(letter) instead of inputFile.next() == letter.
This is because == compares references. You should compare the contents of the String objects, so use String's equals().
And, as said in the comments, change count += count to count += 1 or count++.
Read here for more explanation.
Do you mean to compare the entire next word to your desired letter?
inputFile.next() will return the next String, delimited by whitespace (tab, enter, spacebar). Unless your file only contains singular letters all separated by spaces, your code won't be able to find all the occurrences of letters in those words.
You might want to try calling inputFile.next() to get the next String, and then breaking that String down into a charArray. From there, you can iterate through the charArray (think for loops) to find the desired character. As a commenter mentioned, you don't want to use == to compare two Strings, but you can use it to compare two characters. If the character from the charArray of your String matches your desired character, then try count++ to increment your counter by 1.
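The approach in the last paragraph might look like the sketch below. It scans a String instead of the question's theText so that it runs on its own; the names CharCounter and countChar are made up for illustration.

```java
import java.util.Scanner;

class CharCounter {
    // Counts how often `letter` occurs in `text`, one whitespace-delimited token at a time.
    static int countChar(String text, char letter) {
        int count = 0;
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                // Break the token into chars; == is fine for comparing two chars
                for (char c : in.next().toCharArray()) {
                    if (c == letter) {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChar("banana band", 'a')); // prints 4
    }
}
```

Taking the letter as a char rather than a String removes the temptation to compare Strings with ==.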

Length of an array that was initialized using a split string?

I have the following code to split a String of a list of words separated by spaces. The string is split and used to populate the array. If I use the .length method, will it return the amount of split strings? As in, would it be similar to the String.length method that counts the amount of characters and returns that value?
I want to have the program increment the array position by one each time it's run, but I want it to reset to 0 if it already used the last word, so it circles back and starts with the first word again.
Would this bit of code reset the word position to 0 when it has already used the last word in the array?
String wordsList[] = words.split(" ");
this.currentWord = wordsList[wordPos];
if (wordPos < wordsList.length)
    wordPos++;
else
    wordPos = 0;
If you use array.length on an array doesn't it tell you what the length is?
The answer is yes. Also, the .length is a property and not a method.
Take a look at this going over the .length property with a bit more detail. Cheers.
If I understand correctly, it sounds like you want to use a for loop like this?
String words = "Blah blarg bloob";
String wordsList[] = words.split(" ");
String currentWord = "";
for (int i = 0; i < wordsList.length; i++) {
    if (currentWord != null && currentWord.equals(wordsList[i])) { i = 0; }
    currentWord = wordsList[i];
}
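On the wrap-around part of the question: with the check wordPos < wordsList.length, wordPos is still incremented when it points at the last word, so it ends up equal to the array length and the next wordsList[wordPos] access would be out of bounds. One common alternative is the modulo operator. A minimal sketch, with a hypothetical WordCycler class standing in for the asker's class:

```java
class WordCycler {
    private final String[] words;
    private int pos = 0;

    WordCycler(String text) {
        // Assumes words are separated by single spaces, as in the question
        this.words = text.split(" ");
    }

    // Returns the current word, then advances; wraps back to 0 after the last word.
    String next() {
        String current = words[pos];
        pos = (pos + 1) % words.length;
        return current;
    }

    public static void main(String[] args) {
        WordCycler cycler = new WordCycler("Blah blarg bloob");
        for (int i = 0; i < 4; i++) {
            System.out.println(cycler.next());
        }
    }
}
```

The equivalent fix in the original if/else would be to compare against wordsList.length - 1.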

Split by a comma that is not inside parentheses, skipping anything inside them

I know it might be another topic about regexes, but despite I searched it, I couldn't get the clear answer. So here is my problem- I have a string like this:
{1,2,{3,{4},5},{5,6}}
I'm removing the most outside parentheses (they are there from input, and I don't need them), so now I have this:
1,2,{3,{4},5},{5,6}
And now, I need to split this string into an array of elements, treating everything inside these parentheses as one, "seamless" element:
Arr[0] 1
Arr[1] 2
Arr[2] {3,{4},5}
Arr[3] {5,6}
I have tried doing it using lookahead but so far, I'm failing (miserably). What would be the neatest way of dealing with those things in terms of regex?
You cannot do this if elements like this should be kept together: {{1},{2}}. The reason is that a regex for this is equivalent to parsing the balanced parenthesis language. This language is context-free and cannot be parsed using a regular expression. The best way to handle this is not to use regex but use a for loop with a stack (the stack gives power to parse context-free languages). In pseudo code we could do:
last = 0
for char in input
    if stack is empty and char is ','
        add substring(last, current index) to output array
        last = current index + 1
    if char is '{'
        push '{' on stack
    if char is '}'
        pop from stack
add substring(last, end of input) to output array
This pseudo code will construct the array as desired, note that it's best to loop over the indexes of the chars in the given string as you'll need those to determine the boundaries of the substrings to add to the array.
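A minimal Java sketch of the loop described above. Since there is only one kind of bracket, an integer depth counter stands in for the stack here; that substitution is mine, not the answer's. The names TopLevelSplitter and splitTopLevel are made up, and the input is assumed to have balanced braces.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplitter {
    // Splits on commas that are not nested inside any { ... } group.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        int depth = 0; // number of currently open '{'
        int start = 0; // index where the current element begins
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
            } else if (c == ',' && depth == 0) {
                parts.add(input.substring(start, i));
                start = i + 1; // skip the comma itself
            }
        }
        parts.add(input.substring(start)); // the last element has no trailing comma
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("1,2,{3,{4},5},{5,6}"));
        // prints [1, 2, {3,{4},5}, {5,6}]
    }
}
```

A real stack would only be needed if several bracket types had to be matched against each other.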
Almost near to the requirement. Running out of time. Will complete rest later (A single comma is incorrect).
Regex: ,(?=[^}]*(?:{|$))
To check regex validity: Go to http://regexr.com/
To implement this pattern in Java, there is a slight difference. \ needs to be added before { and }.
Hence, regex for Java Input: ,(?=[^\\}]*(?:\\{|$))
String numbers = "{1,2,{3,{4},5},{5,6}}";
numbers = numbers.substring(1, numbers.length()-1);
String[] separatedValues = numbers.split(",(?=[^\\}]*(?:\\{|$))");
System.out.println(separatedValues[0]);
Could not figure out a regex solution, but here's a non-regex solution. It involves parsing numbers (not in curly braces) before each comma (unless its the last number in the string) and parsing strings (in curly braces) until the closing curly brace of the group is found.
If regex solution is found, I'd love to see it.
// requires: import java.util.ArrayList; import java.util.List;
public static void main(String[] args) throws Exception {
    String data = "1,2,{3,{4},5},{5,6},-7,{7,8},{8,{9},10},11";
    List<String> list = new ArrayList<>();
    for (int i = 0; i < data.length(); i++) {
        if ((Character.isDigit(data.charAt(i))) ||
                // Include negative numbers
                (data.charAt(i) == '-') && (i + 1 < data.length() && Character.isDigit(data.charAt(i + 1)))) {
            // Get the number before the comma, unless it's the last number
            int commaIndex = data.indexOf(",", i);
            String number = commaIndex > -1
                    ? data.substring(i, commaIndex)
                    : data.substring(i);
            list.add(number);
            i += number.length();
        } else if (data.charAt(i) == '{') {
            // Get the group of numbers until you reach the final
            // closing curly brace
            StringBuilder sb = new StringBuilder();
            int openCount = 0;
            int closeCount = 0;
            do {
                if (data.charAt(i) == '{') {
                    openCount++;
                } else if (data.charAt(i) == '}') {
                    closeCount++;
                }
                sb.append(data.charAt(i));
                i++;
            } while (closeCount < openCount);
            list.add(sb.toString());
        }
    }
    for (int i = 0; i < list.size(); i++) {
        System.out.printf("Arr[%d]: %s\r\n", i, list.get(i));
    }
}
Results:
Arr[0]: 1
Arr[1]: 2
Arr[2]: {3,{4},5}
Arr[3]: {5,6}
Arr[4]: -7
Arr[5]: {7,8}
Arr[6]: {8,{9},10}
Arr[7]: 11

How can I compare 2 strings character by character?

For a college project, I am doing a spelling test for children and I need to give 1 mark for a minor spelling error. I am going to count an answer as a minor error if the spelling has 2 characters wrong. How can I compare the saved word to the inputted word?
char wLetter1 = word1.charAt(0);
char iLetter1 = input1.charAt(0);
char wLetter2 = word1.charAt(1);
char iLetter2 = input1.charAt(1);
I have started out with this where word1 is the saved word and input1 is the user input word.
However, if I add lots of these, if the word is 3 characters long but I am trying to compare the 4th character, I will get an error? Is there a way of knowing how many characters are in the string and only finding the characters of those letters?
Just use a for loop. Note that in Java, calling charAt() with an out-of-bounds index throws a StringIndexOutOfBoundsException (in JavaScript it would just return the empty string "").
To avoid an out-of-bounds exception, iterate only up to the shorter of the two lengths:
int errs = Math.abs(word1.length() - input1.length());
int len = Math.min(word1.length(), input1.length());
for (int i = 0; i < len; i++) {
    if (word1.charAt(i) != input1.charAt(i)) errs++;
}
// errs now holds the number of character mismatches
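To connect the mismatch count back to the marking scheme in the question, here is a small sketch. The question only says a minor error (up to 2 wrong characters) earns 1 mark; the 2 marks for an exact match and 0 otherwise are assumptions made for illustration, as are the names SpellingScore, mismatches and score.

```java
class SpellingScore {
    // Number of differing positions, plus the difference in length.
    static int mismatches(String word, String input) {
        int errs = Math.abs(word.length() - input.length());
        int len = Math.min(word.length(), input.length());
        for (int i = 0; i < len; i++) {
            if (word.charAt(i) != input.charAt(i)) {
                errs++;
            }
        }
        return errs;
    }

    // Assumed scheme: 2 marks for exact, 1 mark for at most 2 mismatches, else 0.
    static int score(String word, String input) {
        int errs = mismatches(word, input);
        if (errs == 0) {
            return 2;
        }
        return errs <= 2 ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(score("hello", "hallo")); // prints 1
    }
}
```

Note that a position-by-position comparison counts every character after a missing or extra letter as wrong, so "hello" versus "ello" scores as several mismatches rather than one.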
