ArrayIndexOutOfBoundsException when splitting the line of a text file - java

I'm trying to split this text file:
asdf;asdf;asdf;N/A;N/A;N/A;N/A;N/A;N/A;N/A;N/A
Just so you know, there is no empty line at the bottom of it. That's all there is.
The piece of code that does this job is this:
try{
s = fIN.readLine();
while(s != null){
Parts = s.split(";");
NameFile = Parts[0];
IngredientFile[1] = Parts[1];
QuantityFile[1] = Parts[2];
IngredientFile[2] = Parts[3];
QuantityFile[2] = Parts[4];
IngredientFile[3] = Parts[5];
QuantityFile[3] = Parts[6];
IngredientFile[4] = Parts[7];
QuantityFile[4] = Parts[8];
IngredientFile[5] = Parts[9];
QuantityFile[5] = Parts[10];
list1.add(NameFile + "\n");
for(i=1; i<6; i++){
list1.add(" " + IngredientFile[i] + "" + QuantityFile[i] + "\n");
}
s = fIN.readLine();
}
}catch(IOException e){
list1.add(" ERROR READING FILE. \n");
}
It's throwing ArrayIndexOutOfBoundsException: 1 on line 45, which is this:
IngredientFile[1] = Parts[1];
Apparently it's the Parts array that is causing this, but that can't be right because I declared it with a size of 1000 just for safety.
public String[] Parts = new String[1000];
Anyone have any ideas what's going on?

You should check the length of Parts before making assignments to make sure the line split correctly, or throw an exception if it did not. In your case, the line probably didn't contain any ";" separator. Note that split returns a new array whose length is the number of fields found, so the size of 1000 you declared for Parts has no effect once Parts is reassigned.
Also, you could eliminate the case where you have an empty line or a line that contains only whitespace:
if (!s.trim().isEmpty()) {
//split
}
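The length check suggested above could look like the following minimal sketch. The class name SplitCheck, the helper parseLine, and the expected field count of 11 (counted from the sample line in the question) are assumptions made for illustration, not part of the original code.

```java
import java.util.Arrays;

class SplitCheck {
    // Assumed from the sample line: 1 name + 5 ingredient/quantity pairs.
    static final int EXPECTED_FIELDS = 11;

    /** Returns the fields of one line, or null for a blank line. */
    static String[] parseLine(String s) {
        if (s == null || s.trim().isEmpty()) {
            return null; // nothing to split
        }
        // split() returns a new array sized to the number of fields found
        String[] parts = s.split(";");
        if (parts.length < EXPECTED_FIELDS) {
            throw new IllegalArgumentException(
                "Expected " + EXPECTED_FIELDS + " fields but found "
                    + parts.length + ": " + s);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] ok = parseLine("asdf;asdf;asdf;N/A;N/A;N/A;N/A;N/A;N/A;N/A;N/A");
        System.out.println(ok.length + " fields: " + Arrays.toString(ok));
        System.out.println(parseLine("   ") == null); // blank line is skipped
    }
}
```

Failing fast with a descriptive message makes it easier to see which line of the file is malformed than an ArrayIndexOutOfBoundsException a few statements later.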

Indexing for each word in the TextFile Content Using Java

I am trying to index each word in a text file using Java.
By "indexing" I mean listing, for each word, where it occurs in the file.
This is my sample file https://pastebin.com/hxB8t56p
(the actual file I want to index is much larger)
This is the code I have tried so far
ArrayList<String> ar = new ArrayList<String>();
ArrayList<String> sen = new ArrayList<String>();
ArrayList<String> fin = new ArrayList<String>();
ArrayList<String> word = new ArrayList<String>();
String content = new String(Files.readAllBytes(Paths.get("D:\\folder\\poem.txt")), StandardCharsets.UTF_8);
String[] split = content.split("\\s"); // Split text file content
for(String b:split) {
ar.add(b); // added into the ar arraylist //ar contains every line of poem
}
FileInputStream fstream = null;
String answer = "";fstream=new FileInputStream("D:\\folder\\poemt.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
int count = 1;
int songnum = 0;
while((strLine=br.readLine())!=null) {
String text = strLine.replaceAll("[0-9]", ""); // Replace numbers from txt
String nums = strLine.split("(?=\\D)")[0]; // get digits from strLine
if (nums.matches(".*[0-9].*")) {
songnum = Integer.parseInt(nums); // Parse string to int
}
String regex = ".*\\d+.*";
boolean result = strLine.matches(regex);
if (result == true) { // check if strLine contain digit
count = 1;
}
answer = songnum + "." + count + "(" + text + ")";
count++;
sen.add(answer); // added songnum + line number and text to sen
}
for(int i = 0;i<sen.size();i++) { // loop to match and get word+poem number+line number
for (int j = 0; j < ar.size(); j++) {
if (sen.get(i).contains(ar.get(j))) {
if (!ar.get(j).isEmpty()) {
String x = ar.get(j) + " - " + sen.get(i);
x = x.replaceAll("\\(.*\\)", ""); // replace single line sentence
String[] sp = x.split("\\s+");
word.add(sp[0]); // each word in the poem is added to the word arraylist
fin.add(x); // word+poem number+line number
}
}
}
}
Set<String> listWithoutDuplicates = new LinkedHashSet<String>(fin); // Remove duplicates
fin.clear();fin.addAll(listWithoutDuplicates);
Locale lithuanian = new Locale("ta");
Collator lithuanianCollator = Collator.getInstance(lithuanian); // sort array
Collections.sort(fin,lithuanianCollator);
System.out.println(fin);
(change in blossom. - 0.2,1.2, & the - 0.1,1.2, & then - 0.1,1.2)
I will first copy the intended output for your pasted example, and then go over the code to find how to change it:
Poem.txt
0.And then the day came,
to remain blossom.
1.more painful
then the blossom.
Expected output
[blossom. - 0.2,1.2, came, - 0.1, day - 0.1, painful - 1.1, remain - 0.2, the - 0.1,1.2, then - 0.1,1.2, to - 0.2]
As @Pal Laden notes in the comments, some words (the, and) are not being indexed. It is probable that stopwords are being ignored for indexing purposes.
Current output of code is
[blossom. - 0.2, blossom. - 1.2, came, - 0.1, day - 0.1, painful - 1.1, remain - 0.2, the - 0.1, the - 1.2, then - 0.1, then - 1.2, to - 0.2]
So, assuming you fix your stopwords, you are actually quite close. Your fin array contains word+poem number+line number, but it should contain word+*list* of poem number+line number. There are several ways to fix this. First, we will need to do stopword removal:
// build stopword-removal set "toIgnore"
String[] stopWords = new String[]{ "a", "the", "of", "more", /*others*/ };
Set<String> toIgnore = new HashSet<>();
for (String s: stopWords) toIgnore.add(s);
if ( ! toIgnore.contains(sp[0])) fin.add(x); // only process non-ignored words
// was: fin.add(x);
Now, let's fix the list problem. The easiest (but ugly) way is to fix "fin" at the very end:
List<String> fixed = new ArrayList<>();
String prevWord = "";
String prevLocs = "";
for (String s : fin) {
String[] parts = s.split(" - ");
if (parts[0].equals(prevWord)) {
prevLocs += "," + parts[1];
} else {
if (! prevWord.isEmpty()) fixed.add(prevWord + " - " + prevLocs);
prevWord = parts[0];
prevLocs = parts[1];
}
}
// last iteration
if (! prevWord.isEmpty()) fixed.add(prevWord + " - " + prevLocs);
System.out.println(fixed);
How to do it the right way (TM)
Your code can be much improved. In particular, using flat ArrayLists for everything is not always the best idea. Maps are great for building indices:
// build stopwords
String[] stopWords = new String[]{ "and", "a", "the", "to", "of", "more", /*others*/ };
Set<String> toIgnore = new HashSet<>();
for (String s: stopWords) toIgnore.add(s);
// prepare always-sorted, quick-lookup set of terms
Collator lithuanianCollator = Collator.getInstance(new Locale("ta"));
Map<String, List<String>> terms = new TreeMap<>((o1, o2) -> lithuanianCollator.compare(o1, o2));
// read lines; if line starts with number, store separately
Pattern countPattern = Pattern.compile("([0-9]+)\\.(.*)");
String content = new String(Files.readAllBytes(Paths.get("/tmp/poem.txt")), StandardCharsets.UTF_8);
int poemCount = 0;
int lineCount = 1;
for (String line: content.split("[\n\r]+")) {
line = line.toLowerCase().trim(); // remove spaces on both sides
// update locations
Matcher m = countPattern.matcher(line);
if (m.matches()) {
poemCount = Integer.parseInt(m.group(1));
lineCount = 1;
line = m.group(2); // ignore number for word-finding purposes
} else {
lineCount ++;
}
// read words in line, with locations already taken care of
for (String word: line.split(" ")) {
if ( ! toIgnore.contains(word)) {
if ( ! terms.containsKey(word)) {
terms.put(word, new ArrayList<>());
}
terms.get(word).add(poemCount + "." + lineCount);
}
}
}
// output formatting to match that of your code
List<String> output = new ArrayList<>();
for (Map.Entry<String, List<String>> e: terms.entrySet()) {
output.add(e.getKey() + " - " + String.join(",", e.getValue()));
}
System.out.println(output);
Which gives me [blossom. - 0.2,1.2, came, - 0.1, day - 0.1, painful - 1.1, remain - 0.2, to - 0.2]. I have not fixed the list of stopwords to get a perfect match, but that should be easy to do.
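The map-based approach above can be condensed a little with Map.computeIfAbsent, which replaces the containsKey/put pair. This is only a sketch under stated assumptions: the class name WordIndex and the method build are made up for illustration, the lines are passed in as a list instead of being read from a file, and the TreeMap uses natural String ordering rather than the Collator from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordIndex {
    // A line such as "0.And then the day came," starts a new poem.
    private static final Pattern POEM_START = Pattern.compile("([0-9]+)\\.(.*)");

    static Map<String, List<String>> build(List<String> lines, Set<String> stopWords) {
        // Assumption: natural String order instead of a locale-specific Collator.
        Map<String, List<String>> terms = new TreeMap<>();
        int poem = 0;
        int lineNo = 1;
        for (String raw : lines) {
            String line = raw.toLowerCase().trim();
            Matcher m = POEM_START.matcher(line);
            if (m.matches()) {
                poem = Integer.parseInt(m.group(1));
                lineNo = 1;
                line = m.group(2); // drop the poem number before splitting
            } else {
                lineNo++;
            }
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty() || stopWords.contains(word)) {
                    continue;
                }
                // Create the location list on first sight of the word, then append.
                terms.computeIfAbsent(word, k -> new ArrayList<>())
                     .add(poem + "." + lineNo);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> poemLines = Arrays.asList(
            "0.And then the day came,", "to remain blossom.",
            "1.more painful", "then the blossom.");
        Set<String> stop = new HashSet<>(
            Arrays.asList("and", "a", "the", "to", "of", "more"));
        System.out.println(build(poemLines, stop));
    }
}
```

Punctuation is still attached to the words ("blossom.", "came,"), as in the original code; stripping it would be a separate step.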

Parsing a file and replacing white spaces found within double quotes using Java

I am reading a file and trying to modify it in the following order:
if line is empty trim()
if line ends with \ strip that char and add next line to it.
The complete line contains double quotes and there are white spaces between the quotes, replace the white space with ~.
For example: "This is text within double quotes"
change to : "This~is~text~within~double~quotes"
This code is working but buggy.
The issue shows up when a line that ends with \ is followed by others that don't.
For example:
line 1 and \
line 2
line 3
So instead of having
line 1 and line 2
line 3
I have this:
line 1 and line 2 line 3
Code updated:
public List<String> OpenFile() throws IOException {
try (BufferedReader br = new BufferedReader(new FileReader(path))) {
String line;
//StringBuilder concatenatedLine = new StringBuilder();
List<String> formattedStrings = new ArrayList<>();
//Pattern matcher = Pattern.compile("\"([^\"]+)\"");
while ((line = br.readLine()) != null) {
boolean addToPreviousLine;
if (line.isEmpty()) {
line.trim();
}
if (line.contains("\"")) {
Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
while (matcher.find()) {
String match = matcher.group();
line = line.replace(match, match.replaceAll("\\s+", "~"));
}
}
if (line.endsWith("\\")) {
addToPreviousLine = false;
line = line.substring(0, line.length() - 1);
formattedStrings.add(line);
} else {
addToPreviousLine = true;
}
if (addToPreviousLine) {
int previousLineIndex = formattedStrings.size() - 1;
if (previousLineIndex > -1) {
// Combine the previous line and current line
String previousLine = formattedStrings.remove(previousLineIndex);
line = previousLine + " " + line;
formattedStrings.add(line);
}
}
testScan(formattedStrings);
//concatenatedLine.setLength(0);
}
return formattedStrings;
}
Update
I'm giving you what you need, without trying to write all the code for you. You just need to figure out where to place these snippets.
If line is empty trim()
if (line.matches("\\s+")) {
line = "";
// I don't think you want to add an empty line to your return result. If you do, just omit the continue;
continue;
}
If line contains double quotes and white spaces in them, replace the white space with ~. For example: "This is text within double quotes" change to : "This~is~text~within~double~quotes"
Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
while (matcher.find()) {
String match = matcher.group();
line = line.replace(match, match.replaceAll("\\s+", "~"));
}
If line ends with \ strip that char and add the next line. You need to have flag to track when to do this.
if (line.endsWith("\\")) {
addToPreviousLine = true;
line = line.substring(0, line.length() - 1);
} else {
addToPreviousLine = false;
}
Now, to add the next line to the previous line you'll need something like (Figure out where to place this snippet):
if (addToPreviousLine) {
int previousLineIndex = formattedStrings.size() - 1;
if (previousLineIndex > -1) {
// Combine the previous line and current line
String previousLine = formattedStrings.remove(previousLineIndex);
line = previousLine + " " + line;
}
}
You still do not need the StringBuffer or StringBuilder. Just modify the current line and add the current line to your formattedStrings List.
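One way the snippets above might fit together is sketched below. The class name LineFormatter and the methods format and tildeInsideQuotes are hypothetical, and the input is taken as a list of lines rather than read from a file. Instead of merging into the last element of the result list, this variant keeps the unfinished line in a local String until a line without a trailing backslash arrives, which is a different arrangement from the flag used in the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineFormatter {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    /** Replaces each run of whitespace inside double quotes with a single ~. */
    static String tildeInsideQuotes(String line) {
        Matcher m = QUOTED.matcher(line);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replaced = m.group().replaceAll("\\s+", "~");
            // quoteReplacement guards against '$' or '\' in the quoted text
            m.appendReplacement(sb, Matcher.quoteReplacement(replaced));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Joins backslash-continued lines, skips blank lines, fixes quoted text. */
    static List<String> format(List<String> rawLines) {
        List<String> out = new ArrayList<>();
        String pending = ""; // text carried over from lines ending with '\'
        for (String raw : rawLines) {
            if (raw.trim().isEmpty()) {
                continue; // blank lines are dropped
            }
            if (raw.endsWith("\\")) {
                pending += raw.substring(0, raw.length() - 1);
                continue; // wait for the next line before emitting
            }
            out.add(tildeInsideQuotes(pending + raw));
            pending = "";
        }
        if (!pending.isEmpty()) {
            out.add(tildeInsideQuotes(pending)); // file ended on a continuation
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> in = Arrays.asList("line 1 and \\", "line 2", "line 3");
        System.out.println(format(in));
    }
}
```

Running the quote replacement only after the continuation lines are joined means a quoted string that spans a line break is also handled.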
I'm not very good with regex, so here's a programmatic method to do it:
String string = "He said, \"Hello Mr Nice Guy\"";
// split it along the quotes
String splitString[] = string.split("\"");
// loop through; each odd-indexed item is inside quotations
for(int i = 0; i < splitString.length; i++) {
if(i % 2 > 0) {
splitString[i] = splitString[i].replaceAll(" ", "~");
}
}
String finalString = "";
// re-build the string w/ quotes added back in
for(int i = 0; i < splitString.length; i++) {
if(i % 2 > 0) {
finalString += "\"" + splitString[i] + "\"";
} else {
finalString += splitString[i];
}
}
System.out.println(finalString);
Output: He said, "Hello~Mr~Nice~Guy"
Step 3:
String text;
text = text.replaceAll("\\s", "~");
If you want to replace the spaces that occur within double quotes with ~:
String line = "\"This is a line with spaces\"";
String result = "";
if (line.contains("\"")) {
Pattern p = Pattern.compile("\"([^\"]*)\"");
Matcher m = p.matcher(line);
while (m.find()) {
// result holds the quoted text with spaces replaced by ~
result = m.group(1).replace(" ", "~");
}
}
instead of
if (line.contains("\"")) {
StringBuffer sb = new StringBuffer();
Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
while (matcher.find()) {
matcher.appendReplacement(sb, matcher.group().replaceAll("\\s+", ""));
}
I would do this
if (line.matches("\"([^\"]+)\"")) {
line = line.replaceAll("\\s+", "");
}
How can I add this to what I have above ?
concatenatedLine.append(line);
String fullLine = concatenatedLine.toString();
if (fullLine.contains("\"")) {
StringBuffer sb = new StringBuffer();
Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(fullLine);
while (matcher.find()) {
matcher.appendReplacement(sb, matcher.group().replaceAll("\\s+", ""));
formattedStrings.add(sb.toString());
}else
formattedStrings.add(fullLine);

Creating an ArrayList from data in a text file

I am trying to write a program that uses two classes to find the total $ amount from a text file of retail transactions. The first class must read the file, and the second class must perform the calculations. The problem I am having is that in the first class, the ArrayList only seems to get the price of the last item in the file. Here is the input (which is in a text file):
$69.99 3 Shoes
$79.99 1 Pants
$17.99 1 Belt
And here is my first class:
class ReadInputFile {
static ArrayList<Double> priceArray = new ArrayList<>();
static ArrayList<Double> quantityArray = new ArrayList<>();
static String priceSubstring = new String();
static String quantitySubstring = new String();
public void gatherData () {
String s = "C:\\filepath";
try {
FileReader inputFile = new FileReader(s);
BufferedReader bufferReader = new BufferedReader(inputFile);
String line;
String substring = " ";
while ((line = bufferReader.readLine()) != null)
substring = line.substring(1, line.lastIndexOf(" ") + 1);
priceSubstring = substring.substring(0,substring.indexOf(" "));
quantitySubstring = substring.substring(substring.indexOf(" ") + 1 , substring.lastIndexOf(" ") );
double price = Double.parseDouble(priceSubstring);
double quantity = Double.parseDouble(quantitySubstring);
priceArray.add(price);
quantityArray.add(quantity);
System.out.println(priceArray);
} catch (IOException e) {
e.printStackTrace();
}
}
The output and value of priceArray is [17.99], but the desired output is [69.99,79.99,17.99].
Not sure where the problem is, but thanks in advance for any help!
Basically what you have is:
while ((line = bufferReader.readLine()) != null) {
substring = line.substring(1, line.lastIndexOf(" ") + 1);
}
priceSubstring = substring.substring(0,substring.indexOf(" "));
quantitySubstring = substring.substring(substring.indexOf(" ") + 1 , substring.lastIndexOf(" ") );
double price = Double.parseDouble(priceSubstring);
double quantity = Double.parseDouble(quantitySubstring);
priceArray.add(price);
quantityArray.add(quantity);
System.out.println(priceArray);
So all the loop does is create a substring of the line you just read and then read the next line. As a result, only the substring of the last line gets processed by the remaining code.
Wrap the code that you want executed on each iteration of the loop in {...}.
For example...
while ((line = bufferReader.readLine()) != null) {
substring = line.substring(1, line.lastIndexOf(" ") + 1);
priceSubstring = substring.substring(0,substring.indexOf(" "));
quantitySubstring = substring.substring(substring.indexOf(" ") + 1 , substring.lastIndexOf(" ") );
double price = Double.parseDouble(priceSubstring);
double quantity = Double.parseDouble(quantitySubstring);
priceArray.add(price);
quantityArray.add(quantity);
System.out.println(priceArray);
}
This will execute all the code within the {...} block for each line of the file.
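As an alternative to the nested substring calls, each line could be split on whitespace, which also makes the total easy to compute. This is a minimal sketch: the class name TransactionTotals and the methods parse and total are assumptions for illustration, and the lines are supplied as a list rather than read from the file.

```java
import java.util.Arrays;
import java.util.List;

class TransactionTotals {
    /** Parses a line such as "$69.99 3 Shoes" into {price, quantity}. */
    static double[] parse(String line) {
        String[] fields = line.trim().split("\\s+");
        double price = Double.parseDouble(fields[0].replace("$", ""));
        double quantity = Double.parseDouble(fields[1]);
        return new double[] { price, quantity };
    }

    /** Sums price * quantity over all non-blank lines. */
    static double total(List<String> lines) {
        double sum = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            double[] priceAndQuantity = parse(line);
            sum += priceAndQuantity[0] * priceAndQuantity[1];
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "$69.99 3 Shoes", "$79.99 1 Pants", "$17.99 1 Belt");
        System.out.printf("Total: %.2f%n", total(lines));
    }
}
```

For real currency calculations BigDecimal is usually a safer choice than double, since double arithmetic can introduce small rounding errors.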

(Java + Android) Parsing string to float strange error

This may sound like a trivial question, but I'm having a really hard time trying to figure it out. Basically I'm sending a string from my Android to my PC. The connection is fine, and the string is transferred successfully. This is the Android code (sends string to computer):
try
{
println(scSocket + "");
if (scSocket!=null)
{
SendReceiveBytes sendReceiveBT = new SendReceiveBytes(scSocket);
String red = rotZ + " \n";
byte[] myByte = stringToBytesUTFCustom(red);
sendReceiveBT.write(myByte);
}
}
catch(Exception e)
{
println(e);
}
Where rotZ is what I want to send, it is a "float" value. I need to put the " \n" on the end of the message so that it will be recognized as a full message on the PC. So far so good. Now I want to read this on my PC, which is achieved by:
//BlueTooth
String lineRead = "";
try
{
lineRead = new String(sampleSPPServer.readFromDevice());
if(lineRead != null && !lineRead.isEmpty())
{
String lineTransf = lineRead.replace("\n", "").replace("\r", "").replace(" ", "").replace("\"", "").trim();
println("LineTransf: " + lineTransf);
rotZ += 0.01*(Float.parseFloat(lineTransf));
println("Zrotation: " + rotZ); //Never gets here, throws an error before...
}
else
rotZ += 0;
}
catch(Exception e)
{
println("Exception: " + e);
}
Which gives me the error:
NumberFormatException: invalid float value: "1.1400002"
In my code you can see I check for null, empty, etc. So that's not the problem. I've already tried:
NumberFormat nf = NumberFormat.getInstance(Locale.US);
rotZ += 0.01*(nf.parse(lineTransf).floatValue());
Got the same result. There is a similar question on Stack Overflow.
There is one more strange thing. If I try the code:
for(int i = 0; i < lineTransf.length(); i++)
println(lineTransf.substring(i,1));
I get that the string's length is 19, but it only prints the first two and gives the message:
Exception: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
Even stranger, when I did ctrl-c, ctrl-v on the number "1.1400002" that appears in the console, it only pasted "1" here on Stack Overflow.
I know that the number is right, but somewhere the conversion is not. I think that's because the string is sent as a byte and read as a String, but how do I solve this problem? Thanks in advance!!
Nothing strange; that's the expected behavior of substring. It throws an IndexOutOfBoundsException if the startIndex is negative, if the endIndex is greater than the string's length, or if the startIndex is greater than the endIndex (which is your case). It looks like you want to print the char at each index. Try:
for(int i = 0; i < lineTransf.length(); i++)
println(lineTransf.charAt(i));
I found a work around, but I really, really would like an explanation (if possible), because this is just too ugly... I changed the code to:
//BlueTooth
String lineRead = "";
try
{
lineRead = new String(sampleSPPServer.readFromDevice());
if(lineRead != null && !lineRead.isEmpty())
{
String lineTransf = lineRead.replace("\n", "").replace("\r", "").replace(" ", "").replace("\"", "").trim();
println("LineTransf: " + lineTransf + " " + lineTransf.length());
String lastTry = "";
for(int i = 0; i < lineTransf.length(); i++)
{
if(lineTransf.charAt(i) != ' ' && lineTransf.charAt(i) != '\u0000')
{
println(lineTransf.charAt(i));
lastTry += lineTransf.charAt(i);
}
}
println("LastTry: " + lastTry);
rotZ += 0.01*(Float.parseFloat(lastTry));
println("Zrotation: " + rotZ);
}
else
rotZ += 0;
//System.out.println("Line Read:" + lineRead);
}
catch(Exception e)
{
println("Exception: " + e);
}
I'm basically creating a new String called lastTry and then checking if each of the bluetooth read characters are not empty(?) null(?) (since I'm testing for:)
if(lineTransf.charAt(i) != ' ' && lineTransf.charAt(i) != '\u0000')
And if they pass this test I individually "assemble" the lastTry String. It seems that the bluetooth is sending a null character between each of the characters of the whole string. I don't understand why this happens, and it actually consumes some time while reading the incoming string. I would really love another answer if someone has another idea...
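One possible explanation, which nothing in this thread confirms, is an encoding mismatch: if stringToBytesUTFCustom writes two bytes per character (as UTF-16 does) while the PC decodes the bytes one per character, every ASCII character arrives next to a '\u0000'. Note that new String(bytes) without a charset argument uses the platform default. The sketch below reproduces that effect and shows that naming the same charset on both sides avoids it; the class CharsetMismatchDemo and the helper roundTrip are made up for illustration, and the real transport code is not modelled.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset sender, Charset receiver) {
        byte[] wire = text.getBytes(sender);
        return new String(wire, receiver);
    }

    public static void main(String[] args) {
        String value = "1.1400002";

        // Assumed mismatch: two bytes per char sent, one byte per char read.
        String garbled = roundTrip(value,
            StandardCharsets.UTF_16BE, StandardCharsets.ISO_8859_1);
        System.out.println("garbled length: " + garbled.length());
        try {
            Float.parseFloat(garbled);
        } catch (NumberFormatException e) {
            System.out.println("garbled text cannot be parsed");
        }

        // Same explicit charset on both sides: the text survives intact.
        String clean = roundTrip(value,
            StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(Float.parseFloat(clean));
    }
}
```

If this is what is happening, sending rotZ with getBytes(StandardCharsets.UTF_8) and decoding with new String(bytes, StandardCharsets.UTF_8) would remove the need to filter characters one by one.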

Replacing Strings Java

I have this function to check if some words appear in a specific line, and then surround them with a given char.
The code below works like a charm; however, since the words in the string array "words" are always lowercase, the replaced words in the output are lowercase as well. How can I fix this issue?
The inputs:
BufferedReader in = "Hello, my name is John:";
char c = '*';
String [] words = {"hello","john"};
The desired output:
BufferedWriter out = "*Hello*, my name is *John*:";
The actual output:
BufferedWriter out = "*hello*, my name is *john*";
The code:
public void replaceString(BufferedReader in, BufferedWriter out, char c, String[] words){
String line_in = in.readLine();
while (line_in != null) {
for (int j = 0; j < words.length; j++) {
line_in = line_in.replaceAll("(?i)" + words[j], bold + words[j]
+ bold);
}
out.write(line_in);
out.newLine();
line_in = in.readLine();
}
}
Use
line_in.replaceAll("(?i)(" + words[j] + ")", bold + "$1" + bold);
// \________________/ \/
// capture word reference it
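The capture-group idea above can be tried on a single line as follows. This is a sketch only: the class name Highlighter and the method surround are hypothetical, and Pattern.quote and Matcher.quoteReplacement are added as a precaution in case a word or the surrounding char contains regex metacharacters, which the original code does not do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Highlighter {
    /** Surrounds every case-insensitive match of each word with the char c. */
    static String surround(String line, char c, String[] words) {
        // Escape the marker so that '$' or '\' is treated literally.
        String mark = Matcher.quoteReplacement(String.valueOf(c));
        for (String word : words) {
            // $1 re-inserts the text as it appeared, preserving its case.
            line = line.replaceAll(
                "(?i)(" + Pattern.quote(word) + ")", mark + "$1" + mark);
        }
        return line;
    }

    public static void main(String[] args) {
        String[] words = { "hello", "john" };
        System.out.println(surround("Hello, my name is John:", '*', words));
    }
}
```

Because the replacement refers to the captured text instead of the lowercase search word, "Hello" and "John" keep their original capitalization.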
