How to print specific numbers from txt file? - java

I have a text file with the following contents:
18275440:Annette Nguyen:98
93840989:Mary Rochetta:87
23958632:Antoine Yung:79
23658231:Claire Coin:78
23967548:Emma Chung:69
23921664:Jung Kim:98
23793215:Harry Chiu:98
I want to extract the two-digit number at the end of each line. This is the code I wrote:
for (int i = 3; i < 25; i++) {
line = inFile.nextLine();
String[] split = line.split(":");
System.out.println(split[2]);
}
And I am getting a runtime error.

Update the reading loop. If you are using a Scanner, you can check whether any lines are left before reading the next one:
while(inFile.hasNextLine()) {
line = inFile.nextLine();
String[] split = line.split(":");
System.out.println(split[2]);
}
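The runtime error in the question is most likely a NoSuchElementException: the for loop calls nextLine() 22 times, while the file shown has only seven lines. Here is a minimal sketch of the difference, reading from an in-memory string instead of a file. The class and method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Scanner;

// Hypothetical demo class; the data is a sample in the question's format.
class ScannerLoopDemo {

    // Collects the third colon-separated field of every line, stopping when input ends.
    static List<String> lastFields(String data) {
        List<String> result = new ArrayList<>();
        try (Scanner inFile = new Scanner(data)) {
            while (inFile.hasNextLine()) {
                String[] split = inFile.nextLine().split(":");
                result.add(split[2]);
            }
        }
        return result;
    }

    // Mimics a fixed-count loop: returns true if the scanner ran out of lines early.
    static boolean fixedCountLoopFails(String data, int iterations) {
        try (Scanner inFile = new Scanner(data)) {
            for (int i = 0; i < iterations; i++) {
                inFile.nextLine();
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String data = "18275440:Annette Nguyen:98\n93840989:Mary Rochetta:87\n";
        System.out.println(lastFields(data));
        System.out.println(fixedCountLoopFails(data, 22)); // asks for more lines than exist
    }
}
```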

Why the complexity of the for loop specification? You don't use i, so why bother with all that. Don't you just want to read lines until there aren't any more? If you do that, assuming that inFile will let you read lines from it, your code to actually parse each line and extract the number at the end seems right. Here's a complete (minus the class definition) example that uses your parsing logic:
public static void main(String[] args) throws IOException {
// Open the input data file
BufferedReader inFile = new BufferedReader(new FileReader("/tmp/data.txt"));
while(true) {
// Read the next line
String line = inFile.readLine();
// Break out of our loop if we've run out of lines
if (line == null)
break;
// Strip off any whitespace on the beginning and end of the line
line = line.strip();
// If the line is empty, skip it
if (line.isEmpty())
continue;
// Parse the line, and print out the third component, the two digit number at the end of the line
String[] split = line.strip().split(":");
System.out.println(split[2]);
}
}
If there's a file named /tmp/data.txt with the contents you provide in your question, this is the output you get from this code:
98
87
79
78
69
98
98

Don't be so explicit with your loop criteria. Use a counter to acquire the data you want from the file, for example:
int lineCounter = 0;
String line;
while (inFile.hasNextLine()) {
line = inFile.nextLine();
lineCounter++;
if (lineCounter >=3 && lineCounter <= 24) {
String[] split = line.trim().split(":");
System.out.println(split[2]);
}
}

I don't know why your code gives an error. If there are unwanted lines at the beginning (your code suggests there are 3), just read past them with the scanner and discard them.
Scanner scanner = new Scanner(new File("E:\\file.txt"));
String[] split;
// run an empty scanner
for (int i = 1; i <= 3; i++) scanner.nextLine();
while (scanner.hasNextLine()) {
split = scanner.nextLine().split(":");
System.out.println(split[2]);
}
If you don't know where such lines are, and they don't follow the format of the other lines, you could use try...catch to skip them. I'm relying on a simple built-in exception here, but you could also throw your own exception when a line doesn't meet your conditions.
Suppose your file looks like this:
1
2
3
18275440:Annette Nguyen:98
93840989:Mary Rochetta:87
23958632:Antoine Yung:79
bleh bleh bleh
23658231:Claire Coin:78
23967548:Emma Chung:69
23921664:Jung Kim:98
23793215:Harry Chiu:98
Then your code would be
Scanner scanner = new Scanner(new File("E:\\file.txt"));
String[] split;
// run an empty scanner
// for (int i = 1; i <= 3; i++) scanner.nextLine();
while (scanner.hasNextLine()) {
split = scanner.nextLine().split(":");
try {
System.out.println(split[2]);
} catch (ArrayIndexOutOfBoundsException e) {
}
}
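If you would rather not use an exception for control flow, checking the array length does the same job. This is only a sketch: it assumes a valid record has exactly three colon-separated fields, and RecordFilter is a name invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes a valid record has exactly three colon-separated fields.
class RecordFilter {

    // Returns the third field of every well-formed line and silently skips the rest.
    static List<String> scores(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String[] split = line.trim().split(":");
            if (split.length == 3) { // skips "1", "bleh bleh bleh", blank lines, ...
                result.add(split[2]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1", "2", "3",
                "18275440:Annette Nguyen:98", "bleh bleh bleh", "23658231:Claire Coin:78");
        scores(lines).forEach(System.out::println);
    }
}
```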

Assuming you're using Java 8, you can take a simpler, less imperative approach by using BufferedReader's lines method, which returns a Stream:
BufferedReader reader = new BufferedReader(new FileReader("/tmp/data.txt"));
reader.lines()
.map(line -> line.split(":")[2])
.forEach(System.out::println);
But, come to think of it, you could avoid BufferedReader by using Files from Java's NIO API:
Files.lines(Paths.get("/tmp/data.txt"))
.map(line -> line.split(":")[2])
.forEach(System.out::println);
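Two caveats apply to both stream versions: a line with fewer than three fields makes split(":")[2] throw, and the underlying file stays open until the reader or stream is closed. The sketch below addresses both, assuming a valid line has at least three fields. The class name and the Reader-based signature are choices made for this example, not part of the answer above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration.
class LastFieldStream {

    // Takes any Reader so the same logic works for files and in-memory strings.
    static List<String> lastFields(Reader source) {
        BufferedReader reader = new BufferedReader(source);
        return reader.lines()
                .map(line -> line.split(":"))
                .filter(parts -> parts.length >= 3) // drop lines without a third field
                .map(parts -> parts[2])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the file even if processing fails
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("/tmp/data.txt"))) {
            lastFields(reader).forEach(System.out::println);
        }
    }
}
```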

You can split on \d+:[\p{L}\s]+: and take the second element from the resulting array. The regex pattern, \d+:[\p{L}\s]+: means a string of digits (\d+) followed by a : which in turn is followed by a string of any combinations of letters and space which in turn is followed by a :
public class Main {
public static void main(String[] args) {
String line = "18275440:Annette Nguyen:98";
String[] split = line.split("\\d+:[\\p{L}\\s]+:");
String n = "";
if (split.length == 2) {
n = split[1].trim();
}
System.out.println(n);
}
}
Output:
98
Note that \p{L} specifies a letter.
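An alternative to splitting on the prefix is to match the whole line and capture the trailing number with a group. This is a sketch under the assumption that every valid line is digits, then a name without colons, then digits. ScoreRegex and its pattern are illustrative and not taken from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration.
class ScoreRegex {
    // id ":" name ":" score, with the score captured as group 1
    private static final Pattern LINE = Pattern.compile("\\d+:[^:]+:(\\d+)");

    // Returns the trailing number, or an empty string when the line doesn't match.
    static String score(String line) {
        Matcher m = LINE.matcher(line.trim());
        return m.matches() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(score("18275440:Annette Nguyen:98"));
        System.out.println(score("bleh bleh bleh").isEmpty());
    }
}
```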

Related

Why is the file reading the last line of a row and the first one of the second row

While reading an Excel CSV file using a Scanner with a comma delimiter, it reads the last value of the first row together with the first value of the next row as a single token.
int counter = 0;
String[] u = new String[3];
for (int j = 1; j <= 3; j++) {
String a = in.next();
u[counter] = a;
counter++;
}
}
After stepping through with the debugger, I noticed that when it reached the last element of a row, it combined the two values into something like -14256\r\n-14323
-14256 = Last element of first row
-14323 = First element of next row
The scanner took only the comma as the delimiter. But you want it to accept also the end of a line as another delimiter.
I assume that you instantiate the Scanner like this, using Scanner::useDelimiter:
Scanner s = new Scanner( inputStream ).useDelimiter( "," );
If I get the Pattern definition right, it should be:
Scanner s = new Scanner( inputStream ).useDelimiter( ",|\\R" );
The \R stands for
Linebreak matcher: Any Unicode linebreak sequence, is equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029]
Refer to the documentation for java.util.regex.Pattern for the details.
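To see the effect of the two delimiters side by side, here is a small sketch that tokenizes an in-memory string instead of a file. The sample values other than -14256 and -14323 are invented, and CsvDelimiterDemo is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class for illustration.
class CsvDelimiterDemo {

    // Returns every token the Scanner produces for the given delimiter regex.
    static List<String> tokens(String csv, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner s = new Scanner(csv).useDelimiter(delimiterRegex)) {
            while (s.hasNext()) {
                result.add(s.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String csv = "10,20,-14256\r\n-14323,30,40";
        // Comma only: the line break ends up inside the third token.
        System.out.println(tokens(csv, ",").size());
        // Comma or line break: each value becomes its own token.
        System.out.println(tokens(csv, ",|\\R").size());
    }
}
```

Note that an empty line in the input would still produce an empty token with this delimiter, so real data may need an extra check.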
A CSV file contains lines of text where each line contains values separated by commas. Hence I suggest that you read the file line by line and then split each line on the commas. Something like...
java.io.FileReader fr = new java.io.FileReader("path to file");
java.io.BufferedReader br = new java.io.BufferedReader(fr);
String line = br.readLine();
while (line != null) {
String[] fields = line.split(",");
// Add code here to handle the "fields".
line = br.readLine();
}
Note that the above code is not a complete solution but a starting point. For instance, I haven't closed the BufferedReader.
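A slightly fuller sketch of the same idea, with the reader closed by try-with-resources. It takes a Reader so it can be tried without a file, and it passes -1 as the limit to split so that trailing empty fields are kept. CsvLines is a made-up name, and the code assumes plain comma-separated values with no quoted fields containing commas.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes no quoted fields that contain commas.
class CsvLines {

    // Reads every line and splits it on commas, keeping trailing empty fields.
    static List<String[]> readRows(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                rows.add(line.split(",", -1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] fields : readRows(new StringReader("1,2,3\n4,,\n"))) {
            System.out.println(fields.length + " fields: " + Arrays.toString(fields));
        }
    }
}
```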

Read Strings separated by newline using BufferedReader into a String array

I've read a number of such questions, but they are all about reading input from a txt file. I want to read input from the user, not from a file.
My input looks like this:
6 //number of total Strings to store in array
babu
anand
rani
aarti
nandu
rani
I've tried the following code to take such input in a String array:
int n = in.nextInt(); // n= 6 here
String[] s = new String[n]; //String array of size 6 here
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
try{
s = br.readLine().split("\\s");
}
catch(Exception e){
System.out.println(e);
}
Is the regex passed to split() correct? What am I missing here? If this is not the correct approach, what should I do for this problem?
Regular expressions use backslashes (\), while you used slashes (//s); the correct form is \\s.
But this split is not needed, you just need the readLine, and you will get what you need (assuming you don't want to split words in the line).
You should use a loop to read all the data (and get rid of Scanner, that you appear to have in the in variable):
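String[] s = null;
try (BufferedReader br = new BufferedReader(new InputStreamReader(System.in))) {
// The first line holds the number of strings that follow
int n = Integer.parseInt(br.readLine().trim());
// Allocate the array before filling it
s = new String[n];
for (int line = 0; line < n; line++) {
s[line] = br.readLine();
}
} catch (Exception e) {
System.out.println(e);
}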
Move the third line before the first one.
Then use this in your new second line:
int n = Integer.parseInt(br.readLine());
And, of course, you need a loop to put your input strings into an array.
This should help.

Read the each string text from file in java

I am new to Java. I just want to read each string from a file and print it to the console.
Code:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = new String();
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
System.out.println(""+data);
}
} catch (IOException e) {
// Error
}
}
If file contains:
Add label abc to xyz
Add instance cdd to pqr
I want to read each word from file and print it to a new line, e.g.
Add
label
abc
...
And afterwards, I want to extract the index of a specific string, for instance get the index of abc.
Can anyone please help me?
It sounds like you want to be able to do two things:
Print all words inside the file
Search the index of a specific word
In that case, I would suggest scanning all lines, splitting each on whitespace (space, tab, etc.) and storing the words in a collection so you can search it later on. Now the question is: can you have repeats, and if so, which index would you like to print? The first? The last? All of them?
Assuming words are unique, you can simply do:
public static void main(String[] args) throws Exception {
File file = new File("/Users/OntologyFile.txt");
ArrayList<String> words = new ArrayList<String>();
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(
fstream));
String data = null;
while ((data = infile.readLine()) != null) {
for (String word : data.split("\\s+")) {
words.add(word);
System.out.println(word);
}
}
} catch (IOException e) {
// Error
}
// search for the index of abc:
for (int i = 0; i < words.size(); i++) {
if (words.get(i).equals("abc")) {
System.out.println("abc index is " + i);
break;
}
}
}
If you don't break, it'll print every index of abc (if words are not unique). You could of course optimize it more if the set of words is very large, but for a small amount of data, this should suffice.
Of course, if you know in advance which words' indices you'd like to print, you could forego the extra data structure (the ArrayList) and simply print that as you scan the file, unless you want the printings (of words and specific indices) to be separate in output.
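If words can repeat and you want every position rather than just the first, a small variation collects all matching indices. This sketch works on an in-memory string rather than the file, and WordIndex with its two methods is a name made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration.
class WordIndex {

    // Splits the whole text on any run of whitespace, including line breaks.
    static List<String> words(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Returns the zero-based index of every occurrence of target.
    static List<Integer> indicesOf(List<String> words, String target) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(target)) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> all = words("Add label abc to xyz\nAdd instance cdd to pqr");
        System.out.println("abc: " + indicesOf(all, "abc"));
        System.out.println("Add: " + indicesOf(all, "Add"));
    }
}
```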
Split each line on whitespace with the regex \\s+ and print the resulting words with a for loop.
public static void main(String[] args) { // Don't make main throw an exception
File file = new File("/Users/OntologyFile.txt");
try {
FileInputStream fstream = new FileInputStream(file);
BufferedReader infile = new BufferedReader(new InputStreamReader(fstream));
String data;
while ((data = infile.readLine()) != null) {
String[] words = data.split("\\s+"); // Split on whitespace
for (String word : words) { // Iterate through info
System.out.println(word); // Print it
}
}
} catch (IOException e) {
// Probably best to actually have this on there
System.err.println("Error found.");
e.printStackTrace();
}
}
Just add a for-each loop before printing the output :-
while ((data = infile.readLine()) != null) { // use if for reading just 1 line
for(String temp : data.split(" "))
System.out.println(temp); // no need to concatenate the empty string.
}
This will automatically print the individual strings, obtained from each String line read from the file, in a new line.
And afterwards, I want to extract the index of a specific string, for
instance get the index of abc.
I don't know which index you are actually talking about. But if you want the position within each line being read, add a temporary variable count initialised to 0.
Increment it until temp equals abc. Like,
int count = 0;
for(String temp : data.split(" ")){
count++;
if("abc".equals(temp))
System.out.println("Index of abc is : "+count);
System.out.println(temp);
}
Use the split() method of the String class; you can adapt it to your needs.
or
use length() to iterate over the whole line, and whenever you hit a non-alphabetic character, take the substring() and write it on a new line.
List<String> words = new ArrayList<String>();
while ((data = infile.readLine()) != null) {
for(String d : data.split(" ")) {
System.out.println(""+d);
}
words.addAll(Arrays.asList(data.split(" "))); // add the individual words, not the whole line
}
//words List will hold all the words. Do words.indexOf("abc") to get index
if(words.indexOf("abc") < 0) {
System.out.println("word not present");
} else {
System.out.println("word present at index " + words.indexOf("abc"));
}

converting one line string into individual integers

If I have this line in a file: 2 18 4 3
and I want to read it as individual integers, how can I do that?
I'm using a BufferedReader:
BufferedReader(new FileReader("mp1.data.txt"));
I have tried to use:
BufferedReader(new RandomAccessFile("mp1.data.txt"));
so that I can use the method
.readChar();
but I got an error.
If I use
int w = in.read();
it reads the ASCII code, and I want the value as it is (in decimal).
I was thinking of reading it as a string first, but then how could I separate each number?
I was also thinking of putting each number on its own line, but the file I have is long and full of numbers.
Consider using a Scanner:
Scanner scan = new Scanner(new File("mp1.data.txt"));
You can then use scan.nextInt() (which returns an int, not a String) so long as scan.hasNextInt().
No need for that ugly splitting and parsing :)
However, note that this approach will continue reading integers past the first line (if that's not what you want, you should probably follow the suggestions outlined in the other answers for reading and handling only a single line).
Furthermore, hasNextInt() will return false as soon as a non-integer is encountered in the file. If you require a way to detect and handle invalid data, you should again consider the other answers.
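One way to keep the Scanner approach but limit it to a single line is to read the line first (for example with BufferedReader.readLine()) and hand only that string to a Scanner. A minimal sketch, with LineInts as a made-up name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class for illustration.
class LineInts {

    // Reads the leading integers of one line; stops at the first token that is not an int.
    static List<Integer> parse(String line) {
        List<Integer> numbers = new ArrayList<>();
        try (Scanner scan = new Scanner(line)) {
            while (scan.hasNextInt()) {
                numbers.add(scan.nextInt());
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(parse("2 18 4 3"));
        System.out.println(parse("2 18 x 3")); // stops before "x"
    }
}
```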
It's important to approach larger problems in software engineering by breaking them into smaller ones. In this case, you've got three tasks:
Read a line from the file
Break it into individual parts (still strings)
Convert each part into an integer
Java makes each of these simple:
Use BufferedReader.readLine() to read the line as a string first
It looks like the splitting is as simple as splitting by a space with String.split():
String[] bits = line.split(" ");
If that's not good enough, you can use a more complicated regular expression in the split call.
Parse each part using Integer.parseInt().
Another option for the splitting part is to use the Splitter class from Guava. Personally I prefer that, but it's a matter of taste.
You can split() the String and then use the Integer.parseInt() method in order to convert all the elements to Integer objects.
try {
BufferedReader br = new BufferedReader(new FileReader("mp1.data.txt"));
String line = null;
while ((line = br.readLine()) != null) {
String[] split = line.split("\\s");
for (String element : split) {
Integer parsedInteger = Integer.parseInt(element);
System.out.println(parsedInteger);
}
}
}
catch (IOException e) {
System.err.println("Error: " + e);
}
Once you read the line using BufferedReader, you can use String.split(regex) method to split the string by space ("\\s").
for(String s : "2 18 4 3".split("\\s")) {
int i = Integer.parseInt(s);
System.out.println(i);
}
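One thing to watch with "\\s": it splits on every single whitespace character, so two spaces in a row produce an empty string and Integer.parseInt("") throws a NumberFormatException. Trimming the line and splitting on "\\s+" avoids that. A sketch, where IntLine is a hypothetical name and every token is assumed to be a valid integer:

```java
import java.util.Arrays;

// Hypothetical helper; assumes every token on the line is a valid integer.
class IntLine {

    // Converts a whitespace-separated line into an int array.
    static int[] parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        return Arrays.stream(trimmed.split("\\s+"))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("  2  18 4\t3 ")));
    }
}
```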
If you use Java 7+, you can use this utility method:
List<String> lines = Files.readAllLines(file, Charset.forName("UTF-8"));
for (String line: lines) {
String[] numbers = line.split("\\s+");
int firstNumber = Integer.parseInt(numbers[0]);
//etc
}
Try this;
try{
// Open the file
FileInputStream fstream = new FileInputStream("textfile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
//split line by whitespace
String[] ints = strLine.split(" ");
int[] integers = new int[ints.length];
// to convert from string to integers - Integer.parseInt ("123")
for ( int i = 0; i < ints.length; i++) {
integers[i] = Integer.parseInt(ints[i]);
}
// now do what you want with your integer
// ...
}
br.close();
} catch (Exception e) {//Catch exception if any
System.err.println("Error: " + e.getMessage());
}

Read String line by line

Given a string that isn't too long, what is the best way to read it line by line?
I know you can do:
BufferedReader reader = new BufferedReader(new StringReader(<string>));
reader.readLine();
Another way would be to take the substring on the eol:
final String eol = System.getProperty("line.separator");
output = output.substring(output.indexOf(eol) + eol.length());
Any other maybe simpler ways of doing it? I have no problems with the above approaches, just interested to know if any of you know something that may look simpler and more efficient?
There is also Scanner. You can use it just like the BufferedReader:
Scanner scanner = new Scanner(myString);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
// process the line
}
scanner.close();
I think that this is a slightly cleaner approach than both of the suggested ones.
You can also use the split method of String:
String[] lines = myString.split(System.getProperty("line.separator"));
This gives you all lines in a handy array.
I don't know about the performance of split. It uses regular expressions.
Since I was especially interested in the efficiency angle, I created a little test class (below). Outcome for 5,000,000 lines:
Comparing line breaking performance of different solutions
Testing 5000000 lines
Split (all): 14665 ms
Split (CR only): 3752 ms
Scanner: 10005 ms
Reader: 2060 ms
As usual, exact times may vary, but the ratio holds true however often I've run it.
Conclusion: the "simpler" and "more efficient" requirements of the OP can't be satisfied simultaneously. The split solution (in either incarnation) is simpler, but the Reader implementation beats the others hands down.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
/**
* Test class for splitting a string into lines at linebreaks
*/
public class LineBreakTest {
/** Main method: pass in desired line count as first parameter (default = 10000). */
public static void main(String[] args) {
int lineCount = args.length == 0 ? 10000 : Integer.parseInt(args[0]);
System.out.println("Comparing line breaking performance of different solutions");
System.out.printf("Testing %d lines%n", lineCount);
String text = createText(lineCount);
testSplitAllPlatforms(text);
testSplitWindowsOnly(text);
testScanner(text);
testReader(text);
}
private static void testSplitAllPlatforms(String text) {
long start = System.currentTimeMillis();
text.split("\n\r|\r");
System.out.printf("Split (regexp): %d%n", System.currentTimeMillis() - start);
}
private static void testSplitWindowsOnly(String text) {
long start = System.currentTimeMillis();
text.split("\n");
System.out.printf("Split (CR only): %d%n", System.currentTimeMillis() - start);
}
private static void testScanner(String text) {
long start = System.currentTimeMillis();
List<String> result = new ArrayList<>();
try (Scanner scanner = new Scanner(text)) {
while (scanner.hasNextLine()) {
result.add(scanner.nextLine());
}
}
System.out.printf("Scanner: %d%n", System.currentTimeMillis() - start);
}
private static void testReader(String text) {
long start = System.currentTimeMillis();
List<String> result = new ArrayList<>();
try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
String line = reader.readLine();
while (line != null) {
result.add(line);
line = reader.readLine();
}
} catch (IOException exc) {
// quit
}
System.out.printf("Reader: %d%n", System.currentTimeMillis() - start);
}
private static String createText(int lineCount) {
StringBuilder result = new StringBuilder();
StringBuilder lineBuilder = new StringBuilder();
for (int i = 0; i < 20; i++) {
lineBuilder.append("word ");
}
String line = lineBuilder.toString();
for (int i = 0; i < lineCount; i++) {
result.append(line);
result.append("\n");
}
return result.toString();
}
}
Using Apache Commons IOUtils you can do this nicely via
List<String> lines = IOUtils.readLines(new StringReader(string));
It's not doing anything clever, but it's nice and compact. It'll handle streams as well, and you can get a LineIterator too if you prefer.
Since Java 11, there is a new method String.lines:
/**
* Returns a stream of lines extracted from this string,
* separated by line terminators.
* ...
*/
public Stream<String> lines() { ... }
Usage:
"line1\nline2\nlines3"
.lines()
.forEach(System.out::println);
Solution using Java 8 features such as Stream API and Method references
new BufferedReader(new StringReader(myString))
.lines().forEach(System.out::println);
or
public void someMethod(String myLongString) {
new BufferedReader(new StringReader(myLongString))
.lines().forEach(this::parseString);
}
private void parseString(String data) {
//do something
}
You can also use:
String[] lines = someString.split("\n");
If that doesn't work try replacing \n with \r\n.
You can use the Stream API with a StringReader wrapped in a BufferedReader, which has had a lines() method returning a stream since Java 8:
import java.util.stream.*;
import java.io.*;
class test {
public static void main(String... a) {
String s = "this is a \nmultiline\rstring\r\nusing different newline styles";
new BufferedReader(new StringReader(s)).lines().forEach(
(line) -> System.out.println("one line of the string: " + line)
);
}
}
Gives
one line of the string: this is a
one line of the string: multiline
one line of the string: string
one line of the string: using different newline styles
Just like in BufferedReader's readLine, the newline character(s) themselves are not included. All kinds of newline separators are supported (in the same string even).
Or use a try-with-resources statement combined with Scanner:
try (Scanner scanner = new Scanner(value)) {
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
// process the line
}
}
You can try the following regular expression:
\r?\n
Code:
String input = "\nab\n\n \n\ncd\nef\n\n\n\n\n";
String[] lines = input.split("\\r?\\n", -1);
int n = 1;
for(String line : lines) {
System.out.printf("\tLine %02d \"%s\"%n", n++, line);
}
Output:
Line 01 ""
Line 02 "ab"
Line 03 ""
Line 04 " "
Line 05 ""
Line 06 "cd"
Line 07 "ef"
Line 08 ""
Line 09 ""
Line 10 ""
Line 11 ""
Line 12 ""
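The -1 limit is what keeps the trailing empty lines in that output. With the default limit, split discards trailing empty strings. A small sketch comparing the two on the same input (SplitLimitDemo is a made-up name):

```java
// Hypothetical demo class for illustration.
class SplitLimitDemo {

    // Counts the pieces produced by splitting on \r?\n with the given limit.
    static int count(String input, int limit) {
        return input.split("\\r?\\n", limit).length;
    }

    public static void main(String[] args) {
        String input = "\nab\n\n \n\ncd\nef\n\n\n\n\n";
        System.out.println("limit -1: " + count(input, -1)); // keeps trailing empty lines
        System.out.println("limit  0: " + count(input, 0));  // same as split(regex)
    }
}
```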
The easiest and most universal approach would be to just use the regex linebreak matcher \R, which matches any Unicode linebreak sequence:
Pattern NEWLINE = Pattern.compile("\\R");
String[] lines = NEWLINE.split(input);
// see https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/regex/Pattern.html
