Read text file and split each newline into a string array - java

So basically I'm reading a text file that has a bunch of lines. I need to extract certain lines from the text file and add those specific lines into a string array. I've been trying to split on each new line with "\n" and "\r", but this did not work. I also keep getting this error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
at A19010.main(A19010.java:47)
Here is the code:
Path objPath = Paths.get("dirsize.txt");
if (Files.exists(objPath)) {
    File objFile = objPath.toFile();
    try (BufferedReader in = new BufferedReader(
            new FileReader(objFile))) {
        String line = in.readLine();
        while (line != null) {
            String[] linesFile = line.split("\n");
            String line0 = linesFile[0];
            String line1 = linesFile[1];
            String line2 = linesFile[2];
            System.out.println(line0 + "" + line1);
            line = in.readLine();
        }
    }
    catch (IOException e) {
        System.out.println(e);
    }
}
else {
    System.out.println(
        objPath.toAbsolutePath() + " doesn't exist");
}

String[] linesFile = new String[] {line}; // this array is initialized with a single element
String line0 = linesFile[0]; // fine
String line1 = linesFile[1]; // not fine, the array has size 1, so no element at second index
String line2 = linesFile[2];
You're creating a String[] linesFile with one element, line, but then trying to access elements at index 1 and 2. This will give you an ArrayIndexOutOfBoundsException.
You're not actually splitting anything here: in.readLine(), as the method name says, reads a full line from the file.
Edit: You can add lines (Strings) dynamically to a list instead of an array, since you don't know the number of lines in advance.
List<String> lines = new LinkedList<String>(); // create a new list
String line = in.readLine();                   // read a line at a time
while (line != null) {                         // loop till you have no more lines
    lines.add(line);                           // add the line to your list
    line = in.readLine();                      // try to read another line
}
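If you still need a plain array at the end (the question asks for one), the list can be converted once reading is finished; a minimal example:
String[] linesFile = lines.toArray(new String[0]); // copy the list into the array the question wants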

The readLine() method reads an entire line from the input but strips the newline characters from it. When you then split that line on the \n character, you will not find one in the String. Hence, you get the exception.
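For example, a minimal illustration (the literal line content here is made up):
String line = in.readLine();        // e.g. "some text from the file"
String[] parts = line.split("\n");  // the line contains no '\n' ...
System.out.println(parts.length);   // ... so this prints 1, and parts[1] would throw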
Please refer to the answer in this link for more clarity.

You are initializing your String array with 1 element, namely line. linesFile[0] is therefore line and the rest of your array is out of bounds.

Try this:
String[] linesFile = line.split("SPLIT-CHAR-HERE");
if (linesFile.length >= 3)
{
    String line0 = linesFile[0];
    String line1 = linesFile[1];
    String line2 = linesFile[2];
    // further logic here
}
else
{
    // handle invalid lines here
}

You are using an array to store the strings. Use an ArrayList instead, since ArrayLists grow dynamically; after your reading operation completes, convert it into an array.
ArrayList<String> str_list = new ArrayList<String>();
String line = in.readLine();
while (line != null) {
    str_list.add(line);
    line = in.readLine();
}
// at the end of the operation convert the ArrayList to an array
String[] strArr = new String[str_list.size()];
return str_list.toArray(strArr);
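Put together, a minimal sketch of how that fragment could sit in a helper method (the method name and the try-with-resources wrapper are my assumptions, not part of the original answer):
// illustrative helper; needs java.io.* and java.util.ArrayList imports
static String[] readLinesIntoArray(File file) throws IOException {
    ArrayList<String> strList = new ArrayList<String>();
    try (BufferedReader in = new BufferedReader(new FileReader(file))) {
        String line = in.readLine();
        while (line != null) {
            strList.add(line);
            line = in.readLine();
        }
    }
    // at the end of the operation convert the ArrayList to an array
    return strList.toArray(new String[strList.size()]);
}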

The issue here is that you are creating a new String array every time your parser reads in a new line. You then populate only the very first element in that String array with the line that is being read in with:
String[] linesFile = new String[] {line};
Since you create a new String[] with one element every single time your while loop runs from the top, you lose the values it stored from the previous iteration.
The solution is to create the array just once, right before you enter the while loop. If you don't know how to use ArrayList, then I suggest first counting the lines with a while loop like this:
int numberOfLine = 0;
while (in.readLine() != null)
{
    numberOfLine++;
}
String[] linesFile = new String[numberOfLine];
This will let you avoid using a dynamically resized ArrayList, because you know from the loop above how many lines your file contains. Then you would keep an additional counter (or reuse numberOfLine, since we have no use for it anymore) so that you can populate this array:
numberOfLine = 0;
in = new BufferedReader(new FileReader(objFile)); // re-open the reader to start from the beginning
String line;
while ((line = in.readLine()) != null)
{
    linesFile[numberOfLine] = line;
    numberOfLine++;
}
At this point linesFile should be correctly populated with the lines in your file, such that linesFile[i] can be used to access the i'th line in the file.
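As an aside (this is an addition, not part of the answer above): since the question already uses java.nio, the standard library can also do the counting and reading in a single call, for example:
// reads the whole file into memory at once; fine for modestly sized files
List<String> allLines = Files.readAllLines(objPath, StandardCharsets.UTF_8); // objPath from the question
String[] linesFile = allLines.toArray(new String[allLines.size()]);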


Read file using delimiter and add to array

I am trying to read from a text file that is in my project workspace, and then:
Create an object depending on the first element on the first line of the file
Set some variables within the object
Then add it to my arrayList
I seem to be reading the file OK, but I am struggling to create the different objects based on what the first element of each line in the text file is.
The text file is like this:
ul,1,gg,0,33.0
sl,2,hh,0,44.0
My expected result is to create an UltimateLanding object or a StrongLanding object based on the first element of each line in the text file example above.
Disclaimer: I know the .equals is not correct to use in the if statement; I've tried many ways to resolve this.
My Code -
Edited -
It seems the program is now reading the file correctly and adding to the array. However, it is only doing this for the first line in the file. There should be 2 objects created, as there are 2 lines in the text file.
Scanner myFile = new Scanner(fr);
String line;
myFile.useDelimiter(",");
while (myFile.hasNext()) {
    line = myFile.next();
    if (line.equals("sl")) {
        StrongLanding sl = new StrongLanding();
        sl.setLandingId(Integer.parseInt(myFile.next()));
        sl.setLandingDesc(myFile.next());
        sl.setNumLandings(Integer.parseInt(myFile.next()));
        sl.setCost(Double.parseDouble(myFile.next()));
        landings.add(sl);
    } else if (line.equals("ul")) {
        UltimateLanding ul = new UltimateLanding();
        ul.setLandingId(Integer.parseInt(myFile.next()));
        ul.setLandingDesc(myFile.next());
        ul.setNumLandings(Integer.parseInt(myFile.next()));
        ul.setCost(Double.parseDouble(myFile.next()));
        landings.add(ul);
    }
}
TIA
There are multiple issues with your current code.
myFile.equals("sl") compares your Scanner object with a String. You actually want to compare the string you read, not your Scanner object, so use line.equals("sl").
nextLine() will read the whole line, so line will never be equal to "sl". You should split the line using your specified delimiter, then use the split parts to build your object. This way, you will not have to worry about newlines in combination with next().
Your evaluation of the read input is currently outside of the while loop, so you read all the content of the file but only evaluate the last line. You should move the evaluation of the input and the creation of your landing objects inside the while loop.
All suggestions implemented:
...
Scanner myFile = new Scanner(fr);
// no need to specify a delimiter, since you want to read line by line
String line;
String[] splitLine;
while (myFile.hasNextLine()) {
    line = myFile.nextLine();
    splitLine = line.split(","); // split the line by ","
    if (splitLine[0].equals("sl")) {
        StrongLanding sl = new StrongLanding();
        sl.setLandingId(Integer.parseInt(splitLine[1]));
        sl.setLandingDesc(splitLine[2]);
        sl.setNumLandings(Integer.parseInt(splitLine[3]));
        sl.setCost(Double.parseDouble(splitLine[4]));
        landings.add(sl);
    } else if (splitLine[0].equals("ul")) {
        UltimateLanding ul = new UltimateLanding();
        ul.setLandingId(Integer.parseInt(splitLine[1]));
        ul.setLandingDesc(splitLine[2]);
        ul.setNumLandings(Integer.parseInt(splitLine[3]));
        ul.setCost(Double.parseDouble(splitLine[4]));
        landings.add(ul);
    }
}
...
However, if you don't want to read the contents line by line (due to whatever requirement you have), you can keep reading it via next(), but you have to specify the delimiter correctly:
...
Scanner myFile = new Scanner(fr);
String line; // variable naming could be improved, since it's not the line
myFile.useDelimiter(",|\\n"); // comma and newline as delimiters
while (myFile.hasNext()) {
    line = myFile.next();
    if (line.equals("sl")) {
        StrongLanding sl = new StrongLanding();
        sl.setLandingId(Integer.parseInt(myFile.next()));
        sl.setLandingDesc(myFile.next());
        sl.setNumLandings(Integer.parseInt(myFile.next()));
        sl.setCost(Double.parseDouble(myFile.next()));
        landings.add(sl);
    } else if (line.equals("ul")) {
        UltimateLanding ul = new UltimateLanding();
        ul.setLandingId(Integer.parseInt(myFile.next()));
        ul.setLandingDesc(myFile.next());
        ul.setNumLandings(Integer.parseInt(myFile.next()));
        ul.setCost(Double.parseDouble(myFile.next()));
        landings.add(ul);
    }
}
...
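One caveat worth noting here (an addition, not from the original answer): if the input file has Windows line endings, the trailing \r stays glued to the last token of each line, so a delimiter pattern that also consumes it is safer:
myFile.useDelimiter(",|\\r?\\n"); // commas, plus Unix or Windows line breaks, as delimiters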
A solution using the Java 8 Stream API:
List<Landing> landings = Files.lines(Paths.get("LandingsData.txt")).map(line -> {
    String[] split = line.split(",");
    if (split[0].equals("sl")) {
        StrongLanding sl = new StrongLanding();
        sl.setLandingId(Integer.parseInt(split[1]));
        sl.setLandingDesc(split[2]);
        sl.setNumLandings(Integer.parseInt(split[3]));
        sl.setCost(Double.parseDouble(split[4]));
        return sl;
    } else if (split[0].equals("ul")) {
        UltimateLanding ul = new UltimateLanding();
        ul.setLandingId(Integer.parseInt(split[1]));
        ul.setLandingDesc(split[2]);
        ul.setNumLandings(Integer.parseInt(split[3]));
        ul.setCost(Double.parseDouble(split[4]));
        return ul;
    }
    return null;
}).filter(t -> t != null).collect(Collectors.toList());
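A small caveat on this approach (my addition, not from the original answer): Files.lines(...) throws a checked IOException and keeps the file open until the stream is closed, so in practice the pipeline is usually wrapped in try-with-resources, roughly like this (toLanding is a hypothetical helper holding the if/else mapping shown above):
try (Stream<String> fileLines = Files.lines(Paths.get("LandingsData.txt"))) {
    List<Landing> landings = fileLines
            .map(line -> line.split(","))    // split each line into fields
            .map(split -> toLanding(split))  // toLanding: hypothetical helper with the if/else above
            .filter(t -> t != null)          // drop lines that matched neither "sl" nor "ul"
            .collect(Collectors.toList());
    // use landings here
} catch (IOException e) {
    e.printStackTrace();
}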

skipping lines while reading from csv file in java [duplicate]

This question already has answers here:
BufferedReader is skipping every other line when reading my file in java
(3 answers)
Closed 3 years ago.
private static List<Book> readDataFromCSV(String fileName) {
    List<Book> books = new ArrayList<>();
    Path pathToFile = Paths.get(fileName);
    // create an instance of BufferedReader
    // using try with resource, Java 7 feature to close resources
    try (BufferedReader br = Files.newBufferedReader(pathToFile,
            StandardCharsets.US_ASCII)) {
        // read the first line from the text file
        String line = br.readLine();
        // loop until all lines are read
        while ((line = br.readLine()) != null) {
            // use string.split to load a string array with the values from
            // each line of the file, using a comma as the delimiter
            String[] attributes = line.split("\\|");
            Book book = createBook(attributes);
            // adding book into ArrayList
            books.add(book);
            // read next line before looping
            // if end of file reached, line would be null
            line = br.readLine();
        }
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
    return books;
}

private static Book createBook(String[] metadata) {
    String name = metadata[0];
    String author = metadata[1]; // create and return book of this metadata
    return new Book(name, price, author);
}
The above code skips every second line of the text file (a CSV file). It only returns the data of alternate lines, and it uses Java 7 syntax. Please provide some suggestions on what is wrong or how to improve it.
Remove the br.readLine() inside the while condition, i.e.:
// read the first line from the text file
String line = br.readLine();
// loop until all lines are read
while (line != null)
{
    ...
    // read next line before looping
    // if end of file reached, line would be null
    line = br.readLine();
}
You have called the br.readLine() function twice in the loop.
One is in the condition:
while((line = br.readLine()) != null)
and the second one is at end of the loop.
So the loop reads a line at the end of an iteration and then immediately overwrites it with the next line read in the condition, which means the line read at the end is never processed. To avoid this, you can remove the br.readLine() at the end of the loop.
while ((line = br.readLine()) != null)
{
    // use string.split to load a string array with the values from
    // each line of the file, using a comma as the delimiter
    String[] attributes = line.split("\\|");
    Book book = createBook(attributes);
    // adding book into ArrayList
    books.add(book);
}
If you did not get it, the condition:
while((line = br.readLine()) != null)
is actually doing the following: it stores the value returned by br.readLine() in the variable line, and then checks whether that value is null. Therefore, you do not need to call readLine() again inside the loop.
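Spelled out, it is equivalent to something like this (an expanded form for illustration only):
while (true) {
    line = br.readLine();  // read the next line and store it in 'line'
    if (line == null) {    // readLine() returns null once the end of the file is reached
        break;
    }
    // process 'line' here
}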

Remove stop words from file - going over it multiple times causes content duplication and does not remove the words

I am trying to go over a bunch of files, read each of them, and remove all stopwords that appear in a specified list of such words. The result is a disaster: the content of the whole file gets copied over and over again.
What I tried:
- Saving the file as String and trying to look with regex
- Saving the file as String and going over line by line and comparing tokens to the stopwords that are stored in a LinkedHashSet, I can also store them in a file
- tried to twist the logic below in multiple ways, getting more and more ridiculous output.
- tried looking into text / line with the .contains() method, but no luck
My general logic is as follows:
for every word in the stopwords set:
while(file has more lines):
save current line into String
while (current line has more tokens):
assign current token into String
compare token with current stopword:
if(token equals stopword):
write in the output file "" + " "
else: write in the output file the token as is
Tried what's in this question and many other SO questions, but just can't achieve what I need.
Real code below:
private static void removeStopWords(File fileIn) throws IOException {
    File stopWordsTXT = new File("stopwords.txt");
    System.out.println("[Removing StopWords...] FILE: " + fileIn.getName() + "\n");
    // create file reader and go over it to save the stopwords into the Set data structure
    BufferedReader readerSW = new BufferedReader(new FileReader(stopWordsTXT));
    Set<String> stopWords = new LinkedHashSet<String>();
    for (String line; (line = readerSW.readLine()) != null; readerSW.readLine()) {
        // trim() eliminates leading and trailing spaces
        stopWords.add(line.trim());
    }
    File outp = new File(fileIn.getPath().substring(0, fileIn.getPath().lastIndexOf('.')) + "_NoStopWords.txt");
    FileWriter fOut = new FileWriter(outp);
    Scanner readerTxt = new Scanner(new FileInputStream(fileIn), "UTF-8");
    while (readerTxt.hasNextLine()) {
        String line = readerTxt.nextLine();
        System.out.println(line);
        Scanner lineReader = new Scanner(line);
        for (String curSW : stopWords) {
            while (lineReader.hasNext()) {
                String token = lineReader.next();
                if (token.equals(curSW)) {
                    System.out.println("---> Removing SW: " + curSW);
                    fOut.write("" + " ");
                } else {
                    fOut.write(token + " ");
                }
            }
        }
        fOut.write("\n");
    }
    fOut.close();
}
What happens most often is that it looks for the first word from the stopWords set and that's it. The output contains all the other words even if I manage to remove the first one. And the first will be there in the next appended output in the end.
Part of my stopword list
about
above
after
again
against
all
am
and
any
are
as
at
By tokens I mean words, i.e. getting every word from the line and comparing it to the current stopword.
After a while of debugging I believe I have found the solution. This problem is quite tricky, as you have to juggle several different scanners and file readers. Here is what I did:
I changed how you added to your StopWords set, as it wasn't adding them correctly. I used a buffered reader to read each line, then a scanner to read each word, then added it to the set.
Then, for the comparison, I got rid of one of your loops, since you can simply use the .contains() method to check whether a word is a stop word.
I left you to do the part of writing to the file to take out the stop words, as I'm sure you can figure that out now that everything else is working.
-My sample stop words txt file:
Stop words
Words
-My sample input file was exactly the same, so it should catch all three words.
The code:
// create file reader and go over it to save the stopwords into the Set data structure
BufferedReader readerSW = new BufferedReader(new FileReader("stopWords.txt"));
Set<String> stopWords = new LinkedHashSet<String>();
String stopWordsLine = readerSW.readLine();
while (stopWordsLine != null) {
    // trim() eliminates leading and trailing spaces
    Scanner words = new Scanner(stopWordsLine);
    String word = words.next();
    while (word != null) {
        stopWords.add(word.trim()); // add the stop word to the set
        if (words.hasNext()) {
            word = words.next();    // if there's another word on this line, read it
        } else {
            break;                  // else break the inner while loop
        }
    }
    stopWordsLine = readerSW.readLine();
}
BufferedReader outp = new BufferedReader(new FileReader("Words.txt"));
String line = outp.readLine();
while (line != null) {
    Scanner lineReader = new Scanner(line);
    String line2 = lineReader.next();
    while (line2 != null) {
        if (stopWords.contains(line2)) {
            System.out.println("removing " + line2);
        }
        if (lineReader.hasNext()) { // if there's another word on this line, read it
            line2 = lineReader.next();
        } else {
            break;                  // else break the inner while loop
        }
    }
    lineReader.close();
    line = outp.readLine();
}
Output:
removing Stop
removing words
removing Words
Let me know if I can elaborate any more on my code or why I did something!
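For reference, the writing part that was left out could look roughly like this inside the inner loop (a sketch only, reusing the fOut writer from the question; not something tested in the answer above):
if (stopWords.contains(line2)) {
    System.out.println("removing " + line2); // skip stop words entirely
} else {
    fOut.write(line2 + " ");                 // keep every other word
}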

Java - Read and storing in an array

I want to read the contents of a text file, split on a delimiter and then store each part in a separate array.
For example the-file-name.txt contains different string all on a new line:
football/ronaldo
f1/lewis
wwe/cena
So I want to read the contents of the text file, split on the delimiter "/" and store the first part of the string before the delimiter in one array, and the second half after the delimiter in another array. This is what I have tried to do so far:
try {
    File f = new File("the-file-name.txt");
    BufferedReader b = new BufferedReader(new FileReader(f));
    String readLine = "";
    System.out.println("Reading file using Buffered Reader");
    while ((readLine = b.readLine()) != null) {
        String[] parts = readLine.split("/");
    }
} catch (IOException e) {
    e.printStackTrace();
}
This is what I have achieved so far, but I am not sure how to go on from here. Any help in completing the program would be appreciated.
You can create two Lists, one for the first parts and the second for the second parts:
List<String> part1 = new ArrayList<>(); // create a list for the first parts
List<String> part2 = new ArrayList<>(); // create a list for the second parts
while ((readLine = b.readLine()) != null) {
    String[] parts = readLine.split("/"); // you mean to split with '/', not with '-'
    part1.add(parts[0]);                  // put the first part in the list part1
    part2.add(parts[1]);                  // put the second part in the list part2
}
Outputs
[football, f1, wwe]
[ronaldo, lewis, cena]
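If you need actual arrays at the end (as the question asks for), the two lists can be converted once the loop has finished, for example:
String[] firstParts = part1.toArray(new String[0]);  // [football, f1, wwe]
String[] secondParts = part2.toArray(new String[0]); // [ronaldo, lewis, cena]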

Read two lines of CSV at a time (Java)

Is there a way to read two lines of a csv file at a time in Java?
I can read one at a time using Scanner (it has to be done like this):
String line = input.nextLine();
String[] nline = line.split ("[,]");
......
Here is some sample data and a short explanation. I need these read two at a time so I can go about my other processing.
the first line that starts with "Create" creates a person
the second line "action" is the action of the created person
create,Mr. Jones,blah,blah
action,1,3
create,Mrs.Smith,blah,blah
action,4,10
....
Thanks in advance.
If you're looking to parse CSV files in Java, I'd avoid splitting lines via the String.split() method; you can run into issues if a field contains commas. For Java I'd recommend opencsv to parse the data. Similar to using the Scanner, you can read it in line by line, or slurp the entire file if it's not too large, and just iterate over the list two items at a time.
CSVReader reader = ...
String[] firstLine;  // fields from first line
String[] secondLine; // fields from second line
while ((firstLine = reader.readNext()) != null && (secondLine = reader.readNext()) != null) {
    // do something with two lines
}
Or
CSVReader reader = ...
List<String[]> allLines = reader.readAll();
// TODO: validate we have an even number of lines
for (int i = 0; i < allLines.size(); i += 2) {
    String[] firstLine = allLines.get(i);
    String[] secondLine = allLines.get(i + 1);
    // do something with two lines
}
String line = input.nextLine() + input.nextLine();
Cory's answer is good but for your next part to work
String[] nline = line.split ("[,]");
......
You need to add the comma in there
String line = input.nextLine() + "," + input.nextLine();
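With the sample data from the question, the combined line then splits into the fields of both rows, for example:
String line = input.nextLine() + "," + input.nextLine();
String[] nline = line.split("[,]");
// for the first two sample lines this yields:
// ["create", "Mr. Jones", "blah", "blah", "action", "1", "3"]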
