Splitting an input file in Java

I've spent a good couple of hours trying to solve this myself, but I just can't figure it out, so I decided to ask.
I am loading a file into my program and want to split it into three fields. Each line in the text file contains 3 comma-separated values:
double, double, char. I plan on creating an array for each field and then converting each string into its respective type. So first I need to split each line of the input.
I opened the file and split the input like this:
Scanner fileIn = null;
String temp = "";
String[] data;
String fileName = "test.txt";
File textFile = new File(fileName);
fileIn = new Scanner(textFile);
while(fileIn.hasNext()){
temp += fileIn.next();
}
data = temp.split(",");
for(String string: data) {
System.out.println(string);
}
*NOTE: I know this isn't the prettiest way, but this is just one of the many ways I tried to produce my output.
After trying several variations of .split(), such as temp.split(","), temp.split(",|\n"), temp.split(",|\r"), temp.split(",|\r\n") and others, I always get the same output:
0
0
.1
0
.2
0
.3
0
.4
0
.5
0
So basically the last character of a line gets paired with the first character of the next line, and I have no idea how to get one value per output line. Here's a copy of the text file. Thanks in advance for all the help!
EDIT: Text copy of output.

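The problem is your while loop. Scanner.next() returns whitespace-delimited tokens and discards the line breaks, so the last value of each line is glued directly to the first value of the next line. Just after the loop, temp looks like...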
0,0,.1,0,.2,0,.3,0,.4,0,.5,0,.6,0,.7,0,.8,0,.9,0,.10,0,.
You can insert a comma manually after each token, like...
while(fileIn.hasNext()){
temp += fileIn.next() + ",";
}
Then temp looks like...
0,0,.,1,0,.,2,0,.,3,0,.,4,0,.,5,0,.,6,0,.,7,0,.,8,0,.,9,0,.,10,0,.,
which can then be split on ",".
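An alternative to concatenating everything first is to read the file one line at a time and split each line on its own. The following is only a sketch under assumptions: the class name LineSplitSketch, the Row holder, and the inline sample data standing in for test.txt are all made up for illustration, and malformed numbers are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch: read line by line so that line breaks never get lost.
final class LineSplitSketch {

    /** One parsed line: double, double, char (illustrative holder type). */
    static final class Row {
        final double first;
        final double second;
        final char symbol;

        Row(double first, double second, char symbol) {
            this.first = first;
            this.second = second;
            this.symbol = symbol;
        }
    }

    /** Parses every non-blank line of the form "double,double,char". */
    static List<Row> parse(Scanner in) {
        List<Row> rows = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split(",");
            if (parts.length != 3 || parts[2].trim().isEmpty()) {
                continue; // skip lines that do not have exactly three fields
            }
            // Double.parseDouble throws NumberFormatException on bad input;
            // that case is deliberately left unhandled in this sketch.
            rows.add(new Row(Double.parseDouble(parts[0].trim()),
                             Double.parseDouble(parts[1].trim()),
                             parts[2].trim().charAt(0)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Inline sample data; with a real file use new Scanner(new File("test.txt")).
        Scanner in = new Scanner("0,0,.\n1,0,.\n2,0,.\n");
        for (Row r : parse(in)) {
            System.out.println(r.first + " | " + r.second + " | " + r.symbol);
        }
    }
}
```

Because each line is split separately, no extra delimiter has to be inserted between lines.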

Related

Stop printing line of text from a file after a character appears a second time

I am trying to stop printing a line of text once the , character has been read for the second time on that line of a text file. Example: 14, "Stanley #2 Philips Screwdriver", true, 6.95. Reading should stop at the second , and the text before it should be printed, so the output should look like 14, "Stanley #2 Philips Screwdriver". I tried to use the limit argument of split to achieve this, but it just omits all the commas and prints out the entire text. This is what my code looks like so far:
public static void fileReader() throws FileNotFoundException {
File file = new File("/Users/14077/Downloads/inventory.txt");
Scanner scan = new Scanner(file);
String test = "4452";
while (scan.hasNext()) {
String line = scan.nextLine();
String[] itemID = line.split(",", 5); //attempt to use a regex limit
if(itemID[0].equals(test)) {
for(String a : itemID)
System.out.println(a);
}//end if
}//end while
}//end fileReader
I also tried to print just the part of the text up to the first comma, like this:
String itemID[] = line.split(",", 5);
System.out.println(itemID[0]);
But no luck: it just prints 14. Any help would be appreciated.

What about using the String.indexOf and String.substring functions (https://docs.oracle.com/javase/7/docs/api/java/lang/String.html)?
int indexSecondOccurrence = line.indexOf(",", line.indexOf(",") + 1);
System.out.println(line.substring(0, indexSecondOccurrence)); // the end index is exclusive, so the second comma itself is left out

I'd suggest modifying your code as follows.
...
String[] itemID = line.split(",", 3); // limit of 3: the first two fields plus the remaining tail
if(itemID[0].equals(test)) {
System.out.println(String.join (",", itemID[0],itemID[1]));
}
...
The split() call will produce an array with at most 3 elements. The first two are the pieces of the string that you need. The last element is the remaining "tail" of the original string.
Now we only need to merge the first two pieces back together with the join() method.
Hope this helps.
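Both approaches from the answers above can be put side by side. This is a minimal sketch, assuming fields never contain commas inside the quotes; the class and method names (FirstTwoFields, viaSplit, viaIndexOf) are invented for the example. It also guards against lines that have fewer than two commas.

```java
// Hypothetical helper comparing the two techniques for "everything before the second comma".
final class FirstTwoFields {

    /** Uses split with a limit of 3: two fields plus the untouched tail. */
    static String viaSplit(String line) {
        String[] parts = line.split(",", 3);
        if (parts.length < 2) {
            return line; // no comma at all: nothing to cut off
        }
        return parts[0] + "," + parts[1];
    }

    /** Uses indexOf to locate the second comma and substring to cut there. */
    static String viaIndexOf(String line) {
        int first = line.indexOf(',');
        if (first < 0) {
            return line; // no comma at all
        }
        int second = line.indexOf(',', first + 1);
        // substring's end index is exclusive, so the second comma is dropped.
        return second < 0 ? line : line.substring(0, second);
    }

    public static void main(String[] args) {
        String line = "14, \"Stanley #2 Philips Screwdriver\", true, 6.95";
        System.out.println(viaSplit(line));
        System.out.println(viaIndexOf(line));
    }
}
```

Both calls print the line up to, but not including, the second comma.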

Reading int value from text file, and using value to alter file contents to separate text file.(java)

So I'm trying to read input from a text file, store it in variables, and then write an altered version of that text to a different file using a value from the input file. I'm using FileReader, Scanner, and PrintWriter to do this.
I have to store the last line (which is a number) from this text document and use that number to repeat the body of text in a different file, without including the number.
So the text is:
[image: original file text]
And the output is SUPPOSED to be:
[image: desired output]
I'm able to retrieve the number and store it in my multiplier variable, and retrieve the text into my string, BUT it's stored as a single line if I check in the console:
[image: how the text is stored, seen through the console]
so it comes out like this in the new file:
[image: undesired output]
I'm pretty new to Java, so forgive me if there are any questions I can't answer that could help solve any issues with my code.
I've tried adding +"\n" to the file output line, but no dice. I've also tried adding it to words += keys.nextLine() +"\n", and it separates the lines in the CONSOLE but not in the file itself, unfortunately. Am I at least on the right track?
Here's my code:
public class fileRead {
public static void main(String[] args) throws IOException{
String words = "" ; //Stores the text from file
int multiplier = 1; // Stores the number
FileReader fr = new FileReader("hw3q3.txt");
Scanner keys = new Scanner(fr);
//while loop returns true if there is another line of text will repeat and store lines of text until the last line which is an int
while(keys.hasNextLine())
if (keys.hasNextInt()) { //validation that will read lines until it see's an integer and stores that number
multiplier = keys.nextInt(); //Stores the number from text
} else {
words += keys.nextLine();//Stores the lines of text
keys.nextLine();
}
keys.close();
PrintWriter outputFile = new PrintWriter("hw3q3x3.txt");
for(int i = 1; i <= multiplier; i++) {
outputFile.println(words);
}
outputFile.close();
System.out.println("Data Written");
System.out.println(multiplier);//just to see if it read the number
System.out.println(words); //too see what was stored in 'words'
}
}

See the if-statement below:
words += keys.nextLine(); //Stores the lines of text
if(words.endsWith(words.substring(words.lastIndexOf(" ")+1))) { //detect last word in sentence
words += '\n'; //after last word, append newline
}
...
for(int i = 1; i <= multiplier; i++) {
outputFile.print(words); //change this to print instead of println
}
Basically, after each line read from the file we append a newline character, so that the next line starts on a new line in the output.
Note that the condition in the if-statement is always true: words.substring(words.lastIndexOf(" ")+1) is by construction a suffix of words, so words.endsWith(...) cannot fail. The statement therefore behaves the same as appending '\n' unconditionally after every nextLine() call, which is what yields the result that you are expecting.
Breaking down the expression for you:
words.substring(words.lastIndexOf(" ")+1)
Returns the part of the String (substring) that starts one position after the last space in the String (lastIndexOf(" ") + 1), i.e. the word after the last space, so the last word.
Entire while-loop:
while(keys.hasNextLine()) {
if (keys.hasNextInt()) { //validation that will read lines until it see's an integer and stores that number
multiplier = keys.nextInt(); //Stores the number from text
} else {
words += keys.nextLine();//Stores the lines of text
if(words.endsWith(words.substring(words.lastIndexOf(" ")+1))) {
words += '\n';
}
}
}
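Another way to keep the line breaks is to hold the lines in a list instead of one concatenated String. The sketch below assumes the last non-blank line of the input is the multiplier; the class name RepeatBodySketch and the method repeatBody are hypothetical, and the demo data in main is made up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: keep lines separate, repeat the body N times.
final class RepeatBodySketch {

    /** Treats the last non-blank line as the multiplier and repeats the lines before it. */
    static List<String> repeatBody(List<String> lines) {
        List<String> body = new ArrayList<>(lines);
        // Drop trailing blank lines so the number really is the last entry.
        while (!body.isEmpty() && body.get(body.size() - 1).trim().isEmpty()) {
            body.remove(body.size() - 1);
        }
        List<String> out = new ArrayList<>();
        if (body.isEmpty()) {
            return out;
        }
        // Remove the number from the body and parse it.
        int multiplier = Integer.parseInt(body.remove(body.size() - 1).trim());
        for (int i = 0; i < multiplier; i++) {
            out.addAll(body);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // Usage: java RepeatBodySketch <input file> <output file>
            List<String> lines = Files.readAllLines(Paths.get(args[0]));
            // Files.write puts a line separator after every element.
            Files.write(Paths.get(args[1]), repeatBody(lines));
        } else {
            // Made-up demo data when no file names are given.
            for (String line : repeatBody(Arrays.asList("first line", "second line", "2"))) {
                System.out.println(line);
            }
        }
    }
}
```

Since every element of the list is written as its own line, no manual "\n" handling is needed.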

Parse .csv File in java returns outofbounds exception

I have the following issue: I am trying to parse a .csv file in Java and store 3 specific columns of it in a two-dimensional array. The code for the method looks like this:
public static void parseFile(String filename) throws IOException{
FileReader readFile = new FileReader(filename);
BufferedReader buffer = new BufferedReader(readFile);
String line;
String[][] result = new String[10000][3];
String[] b = new String[6];
for(int i = 0; i<10000; i++){
while((line = buffer.readLine()) != null){
b = line.split(";",6);
System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]); // Here is where the outofbounds exception occurs...
result[i][0] = b[0];
result[i][1] = b[3];
result[i][2] = b[4];
}
}
buffer.close();
}
I feel like I have to specify this: the .csv file is HUGE. It has 32 columns, and (almost) 10.000 entries (!).
When Parsing, I keep getting the following:
XXXXX CHUNKS OF SUCCESFULLY EXTRACTED CODE
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException:3
at ParseCSV.parseFile(ParseCSV.java:24)
at ParseCSV.main(ParseCSV.java:41)
However, I realized that some of the content in the file has a strange format: for instance, some of the text fields appear to contain line breaks, although no newline character seems to be involved. If I delete those blank lines manually, the output generated (before the error message appears) fills the array up until the next blank line...
Does anyone have an idea how to fix this? Any help would be greatly appreciated...

Your first problem is that you probably have at least one blank line in your CSV file. You need to replace:
b = line.split(";", 6);
with
b = line.split(";");
if (b.length < 5) { // b[4] is the highest index used afterwards
System.err.println("Warning, line has only " + b.length +
" entries, so skipping it:\n" + line);
continue;
}
If your input can legitimately have new lines or embedded semicolons within your entries, that is a more complex parsing problem, and you are probably better off using a third-party parsing library, as there are several very good ones.
If your input is not supposed to have new lines in it, the problem is probably \r. Windows uses \r\n to represent a new line, while most other systems just use \n. If multiple people or programs edited your text file, it is entirely possible to end up with stray \r characters by themselves, which are not easily handled by most parsers.
An easy way to check whether that's your problem is to do this before you split your line:
line = line.replace("\r", "");
If this is a process you are repeating many times, you might need to consider using a Scanner (or a library) instead to get more efficient text processing. Otherwise, you can make do with this.

When a field in your CSV file contains a line break, then after this line
while((line = buffer.readLine()) != null){
the variable line does not hold a complete CSV record, only a fragment of text that may contain no ; at all.
For example, if you have the file
column1;column2;column
3 value
then after the first iteration line holds
column1;column2;column
and after the second iteration it holds
3 value
Calling "3 value".split(";",6) returns an array with one element, so accessing b[3] afterwards throws the exception.
The CSV format has many small details that take a lot of time to implement correctly. This article covers the possible cases with examples:
http://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules_and_examples
I would recommend a ready-made CSV parser such as
https://commons.apache.org/proper/commons-csv/apidocs/org/apache/commons/csv/CSVParser.html

String's split(pattern, limit) method returns an array sized to the number of tokens found, up to the number specified by the limit parameter. The limit is the maximum, not the minimum, number of array elements returned.
"1,2,3" split with (",", 6) will return an array of 3 elements: "1", "2" and "3".
"1,2,3,4,5,6,7" will return 6 elements: "1", "2", "3", "4", "5" and "6,7". The last element looks odd because the split method stopped splitting after the fifth token and returned the rest of the source string as the sixth element.
An empty line is represented as an empty string (""). Splitting "" will return an array of 1 element, the empty string.
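The three cases described above can be checked directly. This is a small demonstration only; the class name SplitLimitDemo and the helper show are made up for the example.

```java
import java.util.Arrays;

// Hypothetical demo of how the limit argument of String.split behaves.
final class SplitLimitDemo {

    /** Splits s on the given regex with the given limit and renders the resulting array. */
    static String show(String s, String regex, int limit) {
        return Arrays.toString(s.split(regex, limit));
    }

    public static void main(String[] args) {
        System.out.println(show("1,2,3", ",", 6));          // [1, 2, 3]
        System.out.println(show("1,2,3,4,5,6,7", ",", 6));  // [1, 2, 3, 4, 5, 6,7]
        System.out.println("".split(";", 6).length);        // 1
    }
}
```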
In your case, the string array created here
String[] b = new String[6];
and assigned to b is replaced by the array returned by
b = line.split(";",6);
and meets its ultimate fate at the hands of the garbage collector, unseen and unloved.
Worse, in the case of the empty lines, it's replaced by a one element array, so
System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]);
blows up when trying to access b[3].
The suggested solution is to either
while((line = buffer.readLine()) != null){
if (line.length() != 0)
{
b = line.split(";",6);
System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]);
...
}
}
or (better, because the previous version could still trip over a malformed line)
while((line = buffer.readLine()) != null){
b = line.split(";",6);
if (b.length == 6) // length is a field of the array, not a method
{
System.out.println("ID: "+b[0]+" Title: "+b[3]+ "Description: "+b[4]);
...
}
}
You might also want to think about the for loop around the while. I don't think it's doing you any good.
while((line = buffer.readLine()) != null)
is going to read every line in the file, so
for(int i = 0; i<10000; i++){
while((line = buffer.readLine()) != null){
is going to read every line in the file the first time. Then it is going to make 9999 more attempts to read the file, find nothing new, and exit the while loop immediately.
The for loop also does not protect you against a file with more than 10000 lines, because the while loop reads every line regardless. And since i never changes inside the while loop, every line is stored in result[0], overwriting the previous one. Look into replacing the big array with an ArrayList or Vector and adding one entry per line, as they will size to fit your file.
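Putting the length check and the resizable list together could look roughly like this. It is a sketch under assumptions: fields are separated by ; and never contain embedded semicolons or line breaks (a CSV library is the safer choice otherwise), and the names CsvColumnsSketch and parse as well as the sample data are invented for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: collect columns 0, 3 and 4 of each well-formed line.
final class CsvColumnsSketch {

    /** Returns one {id, title, description} triple per line that has enough fields. */
    static List<String[]> parse(Iterable<String> lines) {
        List<String[]> result = new ArrayList<>(); // grows with the file
        for (String line : lines) {
            String[] b = line.split(";", 6);
            if (b.length < 5) {
                continue; // blank or malformed line: b[3] or b[4] would not exist
            }
            result.add(new String[] { b[0], b[3], b[4] });
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample; with a real file wrap a FileReader instead of a StringReader.
        String sample = "1;a;b;Title one;Desc one;rest\n\nbroken;line\n2;x;y;Title two;Desc two\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            List<String> lines = reader.lines().collect(Collectors.toList());
            for (String[] row : parse(lines)) {
                System.out.println("ID: " + row[0] + " Title: " + row[1] + " Description: " + row[2]);
            }
        }
    }
}
```

A single loop over the lines replaces the nested for/while pair, and no fixed row count is needed.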

Please check that b.length is large enough (here at least 5) before accessing b[3] and b[4].

Printing and matching values in java

I have a program that will read a text file starting on line number 29. If the line contains the words "n.a" or "Total" the program will skip those lines.
The program will get the elements [2] and [6] from the array.
I need to get element [6] of the array and print it underneath its corresponding value.
Element[2] of the array is where all the analytes are and element[6] contains the amount of each analyte.
The files that the program will read look like this:
12 9-62-1
Sample Name: 9-62-1 Injection Volume: 25.0
Vial Number: 37 Channel: ECD_1
Sample Type: unknown Wavelength: n.a.
Control Program: Anions Run Bandwidth: n.a.
Quantif. Method: Anions Method Dilution Factor: 1.0000
Recording Time: 10/2/2013 19:55 Sample Weight: 1.0000
Run Time (min): 14.00 Sample Amount: 1.0000
No. Ret.Time Peak Name Height Area Rel.Area Amount Type
min µS µS*min % mG/L
1 2.99 Fluoride 7.341 1.989 0.87 10.458 BMB
2 3.88 Chloride 425.633 108.551 47.72 671.120 BMb
3 4.54 Nitrite 397.537 115.237 50.66 403.430 bMB
4 5.39 n.a. 0.470 0.140 0.06 n.a. BMB
5 11.22 Sulfate 4.232 1.564 0.69 13.064 BMB
Total: 835.213 227.482 100.00 1098.073
The program needs to read files of that type and store element [6] of the array under a heading in a separate file in a folder. That file will have a heading like this:
Fluoride,Chloride,Nitrite,Sulfate,
The amount of fluoride should go under Fluoride, the amount of chloride under Chloride, and so on. If Nitrite or any other analyte is missing, a zero should be written for it.
I just need to know how to do the matching. I know I still have to write to the file, which I will do later, but for now I need help with the matching.
The final output should look like this.
The first line is the heading written to the text file, and the second line holds the values matched to their corresponding analytes:
Sample#,Date,Time,Fluoride,Chloride,Nitrite,Sulfate,
9-62-1,10/2/2013,19:55,10.458,671.120,403.430,13.064,
Again, if an analyte isn't present in the file or is null, a 0 should be written.
Here is my code:
//Get the sample#, Date and time.
String line2;
while ((line2 = br2.readLine()) != null) {
if (--linesToSkip2 > 0) {
continue;
}
if (line2.isEmpty() || line2.trim().equals("") || line2.trim().equals("\n")) {
continue;
}
if (line2.contains("n.a.")) {
continue;
}
if (line2.contains("Total")) {
continue;
}
String[] values2 = line2.split("\t");
String v = values2[2];//Stored element 2 in a string.
String v2 = values2[6];//Stored element 6 in a string.
String analytes = "Fluoride,Chloride,Nitrite,Sulfate";//Stored the analytes into an array.
if (analytes.contains(v)) {
System.out.println(v2);
}
int index2 = 0;
for (String value2 : values2) {
/*System.out.println("values[" + index + "] = " + value);*/
index2++;
}
System.out.print(values2[6] + "\b,");
/*System.out.println(values[6]+"\b,");*/
br.close();
}
Thanks in advance!

So, if I understand your task correctly, every element is on a new line.
There are a lot of ways to solve this, but with your code the simplest way, in my opinion, would be with StringBuffer.
// Your code appears to have two arrays: one with the element names,
// the other with the element amounts.
static StringBuffer firstLine = new StringBuffer();
static StringBuffer secondLine = new StringBuffer();
public static void printResult(String[] name, String[] code){
// Build the first line, starting with the data that comes before the names.
firstLine.append("Stuff before Names");
for(int i = 0; i < name.length; i++){
// Append every name in the list.
// Don't forget the spaces.
firstLine.append(name[i] + " ");
}
// The second line is built the same way: just change the array used in the loop and the data added before the loop.
// In the end this prints out your result.
System.out.println(firstLine + "\n" + secondLine);
}
Call this method after all file reading is done.
Hope it helps!
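For the matching step itself, one option is a map that starts with a 0 for every expected analyte and is overwritten whenever a peak line names one of them. The sketch below assumes tab-separated peak lines with the name in column [2] and the amount in column [6], as in the question's code; the class name AnalyteRowSketch and the method amountsRow are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: line up amounts under a fixed list of analyte headings.
final class AnalyteRowSketch {

    /** Heading order taken from the question. */
    static final List<String> ANALYTES =
            Arrays.asList("Fluoride", "Chloride", "Nitrite", "Sulfate");

    /** Returns the amounts in heading order, with 0 for analytes that never appear. */
    static String amountsRow(Iterable<String> peakLines) {
        // LinkedHashMap keeps insertion order, so values come out in heading order.
        Map<String, String> amounts = new LinkedHashMap<>();
        for (String analyte : ANALYTES) {
            amounts.put(analyte, "0");
        }
        for (String line : peakLines) {
            String[] values = line.split("\t");
            if (values.length < 7) {
                continue; // not a full peak row (for example a blank line)
            }
            String name = values[2].trim();
            // Exact key match; rows such as "n.a." are simply ignored.
            if (amounts.containsKey(name)) {
                amounts.put(name, values[6].trim());
            }
        }
        return String.join(",", amounts.values());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1\t2.99\tFluoride\t7.341\t1.989\t0.87\t10.458\tBMB",
                "4\t5.39\tn.a.\t0.470\t0.140\t0.06\tn.a.\tBMB",
                "5\t11.22\tSulfate\t4.232\t1.564\t0.69\t13.064\tBMB");
        System.out.println(String.join(",", ANALYTES));
        System.out.println(amountsRow(lines));
    }
}
```

Looking the name up as a map key avoids the substring matches that analytes.contains(v) on a single comma-joined String would allow.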

Sorting info from a txt file into two different arrays

For a uni assignment, I have to take input from a text file and sort it into two separate arrays. The text file is a football league table, arranged as such:
Barcelona 34
Real Madrid 32
I have written a piece of code like this:
holdingString = fileInput.readLine ();
StringTokenizer sort = new StringTokenizer (holdingString + " ");
countOfTokens = sort.countTokens();
System.out.println (countOfTokens + " tokens: " + holdingString);
This prints out the number of tokens and what the tokens are for each line, so it gives output of
Two tokens: Barcelona 34
Three tokens: Real Madrid 32
I've then written this piece of code:
for (int i = 0; i < countOfTokens; i++)
{
String temp = sort.nextToken ();
System.out.println(temp);
}
This reads just the next token and prints it out.
However, rather than printing the next token out, I want to check if it is a word or a number, and separate it into a different array accordingly, so it will be like this:
ArrayTeam Zero Element Barcelona
ArrayTeam First Element Real Madrid
ArrayPoints Zero Element 34
ArrayPoints First Element 32
What's the easiest way to do this? I've tried using a try/catch, but didn't get it right. I've also tried using an if statement with \d, but that's not worked either.

Like AmitD, I agree that using split is more appropriate in this case, but if you would still like to use a StringTokenizer, you can do something like:
StringBuilder teamName=new StringBuilder();
for (int i = 0; i < countOfTokens-1; i++)
{
if (i>0) teamName.append(' ');
teamName.append(sort.nextToken());
}
teamNames[k]=teamName.toString(); //add the new team to your teamNames array
points[k]=Integer.parseInt(sort.nextToken()); //if your points array is of int type

You could use the java.util.Scanner class to read data from the file. It has methods such as nextInt() and nextDouble(), which might be useful in your case.
Scanner scan = new Scanner(file);
int number;
if(scan.hasNextInt()){
number = scan.nextInt();
}
Check the Scanner API.

String readLine = "Real Madrid 40";
String[] team = readLine.split( "\\d" ); // split at the digits; team[0] keeps its trailing space
System.out.println(team[0]);
String score = readLine.replace( team[0],"" );
System.out.println(score);
Output (labels added for readability):
team[0] : Real Madrid
score : 40

You can save all that trouble by using split:
String strs[] = holdingString.split("\\s");
E.g.
"Barcelona 34".split("\\s"); will return an array of Strings where
array[0]=Barcelona array[1]=34
From Javadoc of StringTokenizer
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
Update
As #madhairsilence pointed out
A team name such as "Real Madrid" contains a space itself, so you need a different delimiter. You can use = as in property files:
"Real Madrid =34".split("="); // will return an array of Strings where
array[0]=Real Madrid (with a trailing space), array[1]=34
You can also use Scanner, since you are reading from a file.
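If the file format cannot be changed, another option is to cut each line at its last space, since the points are always the final token. This is only a sketch: it assumes the team name and the points are separated by a plain space and that the last token is always an integer, and the class LeagueTableSketch with its addLine method is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: team names may contain spaces, points are the last token.
final class LeagueTableSketch {

    final List<String> teams = new ArrayList<>();
    final List<Integer> points = new ArrayList<>();

    /** Splits one line at its last space into team name and points. */
    void addLine(String line) {
        String trimmed = line.trim();
        int lastSpace = trimmed.lastIndexOf(' ');
        if (lastSpace < 0) {
            return; // no space: not a "name points" line
        }
        // Integer.parseInt throws NumberFormatException if the last token is not a number.
        int value = Integer.parseInt(trimmed.substring(lastSpace + 1));
        teams.add(trimmed.substring(0, lastSpace).trim());
        points.add(value);
    }

    public static void main(String[] args) {
        LeagueTableSketch table = new LeagueTableSketch();
        table.addLine("Barcelona 34");
        table.addLine("Real Madrid 32");
        for (int i = 0; i < table.teams.size(); i++) {
            System.out.println(table.teams.get(i) + " -> " + table.points.get(i));
        }
    }
}
```

The two lists stay index-aligned, so teams.get(i) and points.get(i) always belong to the same line.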
