Read and split a text file (java) - java

I have some text files with time information, like:
46321882696937;46322241663603;358966666
46325844895266;46326074026933;229131667
46417974251902;46418206896898;232644996
46422760835237;46423223321897;462486660
For now, I need the third column of the file, to calculate the average.
How can I do this? Do I need to read every line of text and then take the last column?

You can read the file line by line using a BufferedReader or a Scanner, or some other technique. Using a Scanner is pretty straightforward, like this:
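public void read(File file) throws IOException {
    // try-with-resources closes the Scanner when the loop finishes
    try (Scanner scanner = new Scanner(file)) {
        // hasNextLine() is the check that pairs with nextLine()
        while (scanner.hasNextLine()) {
            System.out.println(scanner.nextLine());
        }
    }
}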
To split a String on a defined separator, you can use the split method, which receives a regular expression as its argument and splits the String around every character sequence that matches that expression. In your case it's pretty simple, just the ;
String[] matches = myString.split(";");
And if you want to get the last item of an array, you can index it using its length, remembering that the last item of an array is always at index length - 1:
String lastItem = matches[matches.length - 1];
And if you join all that together you can get something like this:
public void read(File file) throws IOException {
    // try-with-resources closes the Scanner when the loop finishes
    try (Scanner scanner = new Scanner(file)) {
        while (scanner.hasNextLine()) {
            String[] tokens = scanner.nextLine().split(";");
            String last = tokens[tokens.length - 1];
            System.out.println(last);
        }
    }
}

Yes, you have to read each line of the file, split it on the ";" separator, and read the third element.
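To get from the last column to the average the question asks for, the values can be parsed and summed as they are read. The following is only a minimal sketch under some assumptions: the class name ColumnAverage and the method averageOfLastColumn are made up for illustration, the lines are passed in as a List<String> (for example collected with a Scanner loop like the one above), and every non-blank line is assumed to end in a value that fits in a long.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original answers.
class ColumnAverage {

    // Averages the last ';'-separated field of every non-blank line.
    // Returns 0.0 for empty input (an arbitrary choice for this sketch).
    static double averageOfLastColumn(List<String> lines) {
        long sum = 0;
        int count = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] tokens = trimmed.split(";");
            sum += Long.parseLong(tokens[tokens.length - 1]);
            count++;
        }
        return count == 0 ? 0.0 : (double) sum / count;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "46321882696937;46322241663603;358966666",
                "46325844895266;46326074026933;229131667");
        System.out.println(averageOfLastColumn(lines));
    }
}
```

Summing into a long and dividing once at the end avoids accumulating floating-point error on values of this size.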

Related

Stop printing line of text from a file after a character appears a second time

I am trying to stop printing a line of text once a , character has been read a second time on that line of a text file. For example, given 14, "Stanley #2 Philips Screwdriver", true, 6.95, reading should stop at the second , and the output should be 14, "Stanley #2 Philips Screwdriver". I tried to use a limit on the regex split to achieve this, but it just omits all the commas and prints out the entire text. This is what my code looks like so far:
public static void fileReader() throws FileNotFoundException {
File file = new File("/Users/14077/Downloads/inventory.txt");
Scanner scan = new Scanner(file);
String test = "4452";
while (scan.hasNext()) {
String line = scan.nextLine();
String[] itemID = line.split(",", 5); //attempt to use a regex limit
if(itemID[0].equals(test)) {
for(String a : itemID)
System.out.println(a);
}//end if
}//end while
}//end fileReader
I also tried to print just part of the text up until the first comma like;
String itemID[] = line.split(",", 5);
System.out.println(itemID[0]);
But no luck, it just prints 14. Any help will be appreciated.
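What about using the String.indexOf and String.substring methods (https://docs.oracle.com/javase/7/docs/api/java/lang/String.html)?
// Assumes the line contains at least two commas; otherwise indexOf returns -1
int indexSecondOccurrence = line.indexOf(",", line.indexOf(",") + 1);
// substring's end index is exclusive, so this stops just before the second comma
System.out.println(line.substring(0, indexSecondOccurrence));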
I'd suggest to modify your code as follows.
...
String[] itemID = line.split(",", 3); //attempt to use a regex limit
if(itemID[0].equals(test)) {
System.out.println(String.join (",", itemID[0],itemID[1]));
}
...
The split() call will produce an array with at most 3 elements. The first two are the string pieces that you need; the last element is the remaining "tail" of the original string.
Now we only need to merge the pieces back with the join() method.
Hope this helps.
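For comparison, here are the indexOf approach and the split-with-limit approach side by side as a self-contained sketch. The class and method names are hypothetical, and both methods fall back to returning the whole line when it has fewer than two commas, which is an assumption about the desired behaviour rather than something stated in the question.

```java
// Hypothetical helper class for illustration only.
class BeforeSecondComma {

    // Uses indexOf/substring: cut the line just before the second comma.
    static String withIndexOf(String line) {
        int first = line.indexOf(',');
        if (first < 0) {
            return line; // no comma at all
        }
        int second = line.indexOf(',', first + 1);
        return second < 0 ? line : line.substring(0, second);
    }

    // Uses split with a limit of 3: the third element is the unsplit tail.
    static String withSplit(String line) {
        String[] parts = line.split(",", 3);
        if (parts.length < 3) {
            return line; // fewer than two commas
        }
        return String.join(",", parts[0], parts[1]);
    }

    public static void main(String[] args) {
        String line = "14, \"Stanley #2 Philips Screwdriver\", true, 6.95";
        System.out.println(withIndexOf(line));
        System.out.println(withSplit(line));
    }
}
```

Neither version understands quoting, so a comma inside the quoted description would still count as a separator.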

How to convert scanner to String or List

I have a scanner with many lines of text(representing number) and I want to convert all the text in the scanner to a List.
Example:
Scanner myScanner = new Scanner(new File("input.txt"));
input.txt:
000110100110
010101110111
111100101011
101101001101
011011111110
011100011001
110010011100
000001011100
101110100110
010001011100
011111001010
100111100101
111111000010
My first thought was to convert it to a String by changing the delimiter to something I know is not in the file:
myScanner.useDelimiter("impossible String");
String content = myScanner.next();
and then use
List<String> fullInput = Arrays.asList(content.split("\n"));
However, it gives me problems later on with parsing the numbers on the scanner. I've tried debugging it but I can't seem to understand the problem. For example, I made it print the String to the console before parsing it. It would print a proper number(asString) and then give me NumberFormatException when it is supposed to parse.
Here's the runnable code:
public static void main(String[] args) throws FileNotFoundException {
Scanner myScanner = new Scanner(new File("input.txt"));
myScanner.useDelimiter("impossible String");
String content = myScanner.next();
List<String> fullInput = Arrays.asList(content.split("\n"));
System.out.println(fullInput.get(1));
System.out.println(Long.parseLong(fullInput.get(1)));
}
This is what I ended up using after the first didn't work:
Scanner myScanner = new Scanner(new File("input.txt"));
List<String> fullInput = new ArrayList<>();
while (myScanner.hasNextLine())
fullInput.add(myScanner.nextLine());
Do you know what's wrong with the first method or is there a better way to do this?
Because you are parsing a string that represents a number that's beyond the size of an integer.
int values can range from -2,147,483,648 to 2,147,483,647.
fullInput.get(1) gives you 010101110111 which, read as a decimal number, is greater than 2,147,483,647.
You can use long.
long val = Long.parseLong(fullInput.get(1));
If the string represents binary numbers and you want to convert them to int, then you need to provide the base when parsing the string.
int val = Integer.parseInt(fullInput.get(1), 2);
For what you are trying to do here, Scanner is the wrong solution.
If your goal is simply to read all the lines of the file, you can use the Files.readAllLines(Path, Charset) method (javadoc) to do this. It already returns a List<String>, so no further wrapping with Arrays.asList(...) is needed.
What you are actually doing could work under some circumstances. But one possible problem is that String.split("\n") only works on systems where the line terminator is a single NL character. On Windows, the line terminator is a CR NL sequence. And in that case, String.split("\n") will leave a CR at the end of all but the last string / line. That would be sufficient to cause Long.parseLong(...) to throw a NumberFormatException. (The parseXxx methods do not tolerate extraneous characters such as whitespace in the argument.)
A possible solution to the extraneous whitespace problem is to trim the string; e.g.
System.out.println(Long.parseLong(fullInput.get(1).trim()));
The trim() method (javadoc) returns a string with any leading and/or trailing whitespace removed.
But there is another way to deal with this. If you don't care whether each number in the input file is on a separate line, you could do something like this:
Scanner myScanner = new Scanner(new File("input.txt"));
List<Long> numbers = new ArrayList<>();
while (myScanner.hasNextLong()) {
numbers.add(myScanner.nextLong());
}
Finally, @ChengThao makes a valid point. It looks like these are binary numbers. If they are in fact binary, then it makes more sense to parse them using Long.parseLong(string, radix) with a radix value of 2. However if you parse them as decimal using parseLong (as you are currently doing) the values in your question will fit into a long type.
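Putting the line-terminator and radix points together, a rough sketch could look like the following. It assumes the values really are binary, which the question does not confirm, and the class and method names are invented for the example. The \R regex construct (Java 8 and later) matches any line break, so the same code handles both NL and CR NL files; trim() is kept as an extra guard against stray whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class BinaryLines {

    // Splits the content on any line terminator, trims each line,
    // skips blank lines, and parses the rest as base-2 numbers.
    static List<Long> parseBinaryLines(String content) {
        List<Long> values = new ArrayList<>();
        for (String line : content.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                values.add(Long.parseLong(trimmed, 2)); // radix 2 = binary
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Windows-style line endings, as in the failure case described above
        String content = "000110100110\r\n010101110111\r\n";
        System.out.println(parseBinaryLines(content));
    }
}
```

If the numbers are meant to be decimal after all, dropping the second argument to Long.parseLong is the only change needed.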

How to skip a character when using Scanner

I want to read words from a text file which looks like:
"A","ABILITY","ABLE","ABOUT","ABOVE","ABSENCE","ABSOLUTELY","ACADEMIC","ACCEPT","ACCESS","ACCIDENT","ACCOMPANY", ...
I read the words using split("\",\"") so I have them in a matrix. Unfortunately I cannot skip reading the first quotation mark, which starts my .txt file, so as a result in my console I have:
"A
ABILITY
ABLE
ABOUT
ABOVE
Do you know how I can skip the first quotation mark? I tried both
Scanner in = new Scanner(file).useDelimiter("\"");
and parts[0].replace("\"", "");, but it doesn't work.
package list_1;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class exercise {
public static void main(String[] args) throws FileNotFoundException{
File file = new File("slowa.txt");
Scanner in = new Scanner(file).useDelimiter("\""); //delimiter doesn't work!
String sentence = in.nextLine();
String[] parts = sentence.split("\",\"");
parts[0].replace("\"", ""); //it doesn't work!
for (int i=0; i<10 ; i++){
System.out.println(parts[i]);
}
}
}
Strings are immutable, which means that you can't change their state. Because of that, replace doesn't change the string on which it was invoked; it creates a new one with the replaced data, which you need to store somewhere (probably in the reference that held the original string). So instead of
parts[0].replace("\"", "");
you need to use
parts[0] = parts[0].replace("\"", "");
Also, setting a delimiter and then calling nextLine doesn't make much sense, because nextLine looks for line separators (like \n, \r, or \r\n), not your delimiter. If you want the scanner to use the delimiter, call its next() method.
You can also use a different delimiter that matches either " or ",". You can create one with the following regex: "(,")?
So your code could look like
Scanner in = new Scanner(file).useDelimiter("\"(,\")?");
while(in.hasNext()){
System.out.println(in.next());
}
You can use this regular expression. It works for me:
Scanner in = new Scanner(file).useDelimiter("\"(,\")?");
while(in.hasNext()){
System.out.println(in.next());
}
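To try the delimiter without needing slowa.txt, the same Scanner setup can be run against an in-memory string. This is just a sketch: the class name QuotedWords and the words method are made up for the example, and it assumes the input has no whitespace or line breaks between the quoted entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class for illustration only.
class QuotedWords {

    // Collects the tokens between delimiters that match " or ","
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner in = new Scanner(text).useDelimiter("\"(,\")?")) {
            while (in.hasNext()) {
                result.add(in.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("\"A\",\"ABILITY\",\"ABLE\",\"ABOUT\""));
    }
}
```

Because the leading and trailing quotation marks are themselves delimiters, no separate replace step is needed for the first or last word.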

Empty array after reading text file

It's fixed! Thanks to Edgar Boda.
I created a class that should read a text file and put that into an array:
private static String[] parts;
public static void Start() throws IOException{
InputStream instream = new FileInputStream("Storyline.txt");
InputStreamReader inputreader = new InputStreamReader(instream);
BufferedReader buffreader = new BufferedReader(inputreader);
int numberOfLines=0, numberOfActions;
String line = null, input="";
while((line=buffreader.readLine())!=null){
line=buffreader.readLine();
input+=line;
}
parts=input.split(";");
}
But, when I try and output the array, it only contains one string. The last from the file, that I put in.
Here's the file I read from:
0;0;
Hello!;
Welcome!To this.;
56;56;
So;
I think it's something in the loop; but trying to put parts[number] in there doesn't work... Any suggestions?
Maybe you want to read the whole file into a String first:
String line = null;
String input = "";
while((line=buffreader.readLine())!=null){
input += line;
}
parts = input.split(";");
You are overwriting the string array parts in every iteration of your while loop, so that's why it only contains the last line.
To store the entire file contents, with fields split, you'll need a 2-dimensional array, not a 1-dimensional array. Assuming there are 5 lines in the file:
private static String[][] parts = new String[5][];
Then assign each split array to an element of parts each loop:
parts[i++]=line.split(";"); // Assuming you define "i" for the line number
Also, split by default discards trailing empty tokens. To retain them, use the two-arg overload of split that takes a limit parameter. Pass a negative number to retain all tokens.
parts[i++] = line.split(";", -1);
It will only contain the last line; you are reassigning parts every time:
parts = line.split(";");
This trashes the previous reference and reassigns a reference to a new array to it. A better way might be to use a StringBuilder and append the lines and then split later:
StringBuilder stringBuilder = new StringBuilder();
while((line=buffreader.readLine())!=null){
stringBuilder.append(line);
}
parts = stringBuilder.toString().split(";");
This way you will get everything you want in one array. If you want to split everything such that you have one array per line, you will need parts to be a two-dimensional array. But the drawback is that you will need to know how many lines will be there in the file. Instead, you can use List<String[]> to keep track of your arrays:
List<String[]> lineParts = new ArrayList<String[]>();
while((line=buffreader.readLine())!=null){
lineParts.add(line.split(";"));
}
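The List<String[]> idea combines naturally with the split(";", -1) overload mentioned earlier. The sketch below uses a made-up class name and takes the lines as a List<String> (for instance gathered with a readLine() loop) so that it runs without Storyline.txt; whether you actually want the trailing empty tokens depends on how the fields are used afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration only.
class SplitLines {

    // Splits every line on ';' and keeps trailing empty tokens (limit -1).
    static List<String[]> splitLines(List<String> lines) {
        List<String[]> lineParts = new ArrayList<>();
        for (String line : lines) {
            lineParts.add(line.split(";", -1));
        }
        return lineParts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("0;0;", "Hello!;", "Welcome!To this.;");
        for (String[] parts : splitLines(lines)) {
            // Each line ends with ';', so the last token is an empty string
            System.out.println(parts.length + " " + Arrays.toString(parts));
        }
    }
}
```

With the one-argument split(";"), the line 0;0; would yield two tokens instead of three, because the trailing empty token is dropped.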

My output is assuming the whole file is one line

public static void main(String args[]) throws FileNotFoundException
{
String inputFileName = "textfile.txt";
printFileStats(inputFileName);
}
public static void printFileStats(String fileName) throws FileNotFoundException
{
String outputFileName = "outputtextfile.txt";
File inputFile = new File(fileName);
Scanner in = new Scanner(inputFile);
PrintWriter out = new PrintWriter(outputFileName);
int lines = 0;
int words = 0;
int characters = 0;
while(in.hasNextLine())
{
lines++;
while(in.hasNext())
{
in.next();
words++;
}
}
out.println("Lines: " + lines);
out.println("Words: " + words);
out.println("Characters: " + characters);
in.close();
out.close();
}
I have a text file containing five lines
this is
a text
file
full of stuff
and lines
The code creates an output file
Lines: 1
Words: 10
Characters: 0
However, if I remove the capability for reading the number of words in the file, it correctly states the number of lines (5). Why is this happening?
Your inner while loop is gobbling up the whole file. You want to count the number of words in each line, right? Try this instead:
while (in.hasNextLine())
{
lines++;
String line = in.nextLine();
for (String word : line.split("\\s"))
{
words++;
}
}
Note that splitting on spaces is a very naive approach to tokenization (word-splitting) and will only work for simple examples like the one you have here.
Of course, you could also do words += line.split("\\s").length; instead of that inner loop.
in.hasNext() and in.next() treat all whitespace characters as word separators, including newline characters. Your inner loop is eating all the newlines as it's counting all the words.
This reads next Token, not the line :
in.next();
So it just reads one token after another and doesn't care about line endings. A space and \n are both treated as whitespace, so methods like this one make no distinction between them.
The reason is, that hasNext() does not care about line breaks.
So, you are entering the while(in.hasNextLine()) loop, but then you are consuming the whole file with the while(in.hasNext()) loop, resulting in 1 line and 10 words.
-> Check the token consumed by hasNext() for EOL-Characters, then increase line count.
OR:
Use String line = scanner.nextLine() to obtain exactly ONE line, and then use a second scanner to fetch all tokens of that line: scanner2 = new Scanner(line); while(scanner2.hasNext())
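The second-scanner approach can be sketched as follows. The class and method names are hypothetical, and the text is passed in as a String so the example runs without textfile.txt; a Scanner built from a File would be used the same way.

```java
import java.util.Scanner;

// Hypothetical helper class for illustration only.
class FileStats {

    // Returns {lines, words}: the outer scanner walks lines,
    // the inner scanner walks the whitespace-separated tokens of one line.
    static int[] countLinesAndWords(String text) {
        int lines = 0;
        int words = 0;
        try (Scanner in = new Scanner(text)) {
            while (in.hasNextLine()) {
                String line = in.nextLine();
                lines++;
                try (Scanner lineScanner = new Scanner(line)) {
                    while (lineScanner.hasNext()) {
                        lineScanner.next();
                        words++;
                    }
                }
            }
        }
        return new int[] {lines, words};
    }

    public static void main(String[] args) {
        String text = "this is\na text\nfile\nfull of stuff\nand lines\n";
        int[] stats = countLinesAndWords(text);
        System.out.println("Lines: " + stats[0]);
        System.out.println("Words: " + stats[1]);
    }
}
```

Unlike split("\\s"), the inner scanner skips runs of whitespace and reports zero words for a blank line.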
