Can I peek on a BufferedReader? - java

Is there a way to check whether a BufferedReader object has something left to read? Something like C++'s cin.peek(). Thanks.

You can use a PushbackReader. With it you can read a character and then unread it, which pushes it back onto the stream.
PushbackReader pr = new PushbackReader(reader);
int c = pr.read(); // -1 means end of stream
// do something to look at c
if (c != -1) {
    pr.unread(c); // pushes the character back into the buffer
}

You can try the "boolean ready()" method.
From the Java 6 API doc: "A buffered character stream is ready if the buffer is not empty, or if the underlying character stream is ready."
BufferedReader r = new BufferedReader(reader);
if(r.ready())
{
r.read();
}

The following code will look at the first character in the stream. It should act as a peek for you.
BufferedReader bReader = new BufferedReader(new InputStreamReader(inputStream));
bReader.mark(1);
int byte1 = bReader.read();
bReader.reset();
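The mark/reset approach above can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the class name MarkResetPeek and the method name peek are made up for illustration, and a StringReader stands in for the real input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class MarkResetPeek {
    /** Returns the next character without consuming it, or -1 at end of stream. */
    static int peek(BufferedReader reader) throws IOException {
        reader.mark(1);           // allow one character of read-ahead before the mark is invalidated
        int next = reader.read(); // -1 signals end of stream
        reader.reset();           // rewind to the marked position
        return next;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("abc"));
        System.out.println((char) peek(reader));  // a
        System.out.println((char) reader.read()); // a (the peek did not consume it)
    }
}
```

Note that read() returns an int, so the result should only be cast to char after checking that it is not -1.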

The normal idiom is to check in a loop if BufferedReader#readLine() doesn't return null. If end of stream is reached (e.g. end of file, socket closed, etc), then it returns null.
E.g.
BufferedReader reader = new BufferedReader(someReaderSource);
String line = null;
while ((line = reader.readLine()) != null) {
// ...
}
If you don't want to read in lines (which is, by the way, the major reason a BufferedReader is chosen), then use BufferedReader#ready() instead:
BufferedReader reader = new BufferedReader(someReaderSource);
while (reader.ready()) {
int data = reader.read();
// ...
}

BufferedReader br = new BufferedReader(reader);
br.mark(1);
int firstByte = br.read();
br.reset();

You could use a PushbackReader to read a character, and then "push it back". That way you know for sure that something was there, without affecting its overall state - a "peek".

The answer from pgmura (relying on the ready() method) is simple and works.
But bear in mind that it works only because of Sun's implementation of the method, which does not really agree with the documentation. I would not rely on it if this behaviour is critical.
See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4090471
I'd rather go with the PushbackReader option.
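A minimal sketch of the PushbackReader option as a reusable peek helper. The class and method names are hypothetical; the point is that the character is only pushed back when the stream has not ended.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

class PushbackPeek {
    /** Returns the next character without consuming it, or -1 at end of stream. */
    static int peek(PushbackReader reader) throws IOException {
        int c = reader.read();
        if (c != -1) {
            reader.unread(c); // push the character back so the next read() sees it again
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        PushbackReader reader = new PushbackReader(new StringReader("x"));
        System.out.println(peek(reader) == reader.read()); // true
    }
}
```

A PushbackReader created with the single-argument constructor has a pushback buffer of one character, which is enough for a one-character peek.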

My solution was to extend BufferedReader and use a queue as the buffer; then you can use the queue's peek method.
public class PeekBufferedReader extends BufferedReader{
private Queue<String> buf;
private int bufSize;
public PeekBufferedReader(Reader reader, int bufSize) throws IOException {
super(reader);
this.bufSize = bufSize;
buf = Queues.newArrayBlockingQueue(bufSize);
}
/**
* readAheadLimit is set to 1048576. Line which has length over readAheadLimit
* will cause IOException.
* @throws IOException
**/
//public String peekLine() throws IOException {
// super.mark(1048576);
// String peekedLine = super.readLine();
// super.reset();
// return peekedLine;
//}
/**
* This method can be implemented by mark and reset methods. But performance of
* this implementation is better ( about 2times) than using mark and reset
**/
public String peekLine() throws IOException {
if (buf.isEmpty()) {
while (buf.size() < bufSize) {
String readLine = super.readLine();
if (readLine == null) {
break;
} else {
buf.add(readLine);
}
}
} else {
return buf.peek();
}
if (buf.isEmpty()) {
return null;
} else {
return buf.peek();
}
}
public String readLine() throws IOException {
if (buf.isEmpty()) {
while (buf.size() < bufSize) {
String readLine = super.readLine();
if (readLine == null) {
break;
} else {
buf.add(readLine);
}
}
} else {
return buf.poll();
}
if (buf.isEmpty()) {
return null;
} else {
return buf.poll();
}
}
public boolean isEmpty() throws IOException {
if (buf.isEmpty()) {
while (buf.size() < bufSize) {
String readLine = super.readLine();
if (readLine == null) {
break;
} else {
buf.add(readLine);
}
}
} else {
return false;
}
if (buf.isEmpty()) {
return true;
} else {
return false;
}
}
}
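If only a single line of lookahead is needed, the same idea can be sketched without a queue or a third-party collection. This is an illustrative alternative, not the class above: PeekableLineReader is a hypothetical name, and it wraps a BufferedReader instead of extending it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class PeekableLineReader {
    private final BufferedReader in;
    private String cached;     // the line that was peeked but not yet consumed
    private boolean hasCached; // distinguishes "nothing cached" from a cached null (end of stream)

    PeekableLineReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    /** Returns the next line without consuming it, or null at end of stream. */
    String peekLine() throws IOException {
        if (!hasCached) {
            cached = in.readLine();
            hasCached = true;
        }
        return cached;
    }

    /** Returns the next line and consumes it, or null at end of stream. */
    String readLine() throws IOException {
        String line = peekLine();
        hasCached = false;
        cached = null;
        return line;
    }

    public static void main(String[] args) throws IOException {
        PeekableLineReader reader = new PeekableLineReader(new StringReader("first\nsecond"));
        System.out.println(reader.peekLine()); // first
        System.out.println(reader.readLine()); // first
        System.out.println(reader.readLine()); // second
    }
}
```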

Related

java program to conditionally read lines from file

I'm new to coding in Java. I put together this piece of code to read all lines between the "Start" and "End" tag in the following text file.
Start
hi
hello
how
are
you
doing?
End
My program is as follows....
package test;
import java.io.*;
public class ReadSecurities {
public static int countLines(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean empty = true;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return (count == 0 && !empty) ? 1 : count;
} finally {
is.close();
}
}
public static void main(String[] args) {
// TODO Auto-generated method stub
try {
FileInputStream in = new FileInputStream("U:\\Read101.txt");
FileOutputStream out = new FileOutputStream("U:\\write101.txt");
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(out));
BufferedReader br = new BufferedReader(new InputStreamReader(in));
for (int i=1; i<=countLines("U:\\Read101.txt"); i++) {
String line=br.readLine();
while (line.contains("Start")) {
for (int j=i; j<=countLines("U:\\Read101.txt"); j++) {
String line2=br.readLine();
System.out.println(line2);
if(line2.contains("End")) break;
else {
bw.write(line2);
bw.newLine();
}
bw.close();
} break;
}
}
br.close();
}
catch (Exception e) { }
finally { }
}
}
The program reads only the first two lines "hi hello" as though the if condition does not exist. I have a feeling the mistake is very basic, but please correct me.
String line;
do { line = br.readLine(); }
while (line != null && !line.equals("Start"));
if (line != null) { // in case of EOF before "Start" we have to skip the rest!
    while ((line = br.readLine()) != null) {
        if (line.equals("End")) break;
        // TODO write to other file
    }
}
Should be as easy as that. I left out creation / destruction of resources and proper Exception handling for brevity.
But please do at least log exceptions!
EDIT:
If EOF is encountered before Start, you have to skip the copy step!
You make one crucial mistake in your code: you don't handle the exceptions correctly. Two things:
never catch Exception. Either catch just one type of Exception or specify a list of exceptions you want to catch. In your case, a simple IOException would suffice.
Never leave a catch-block empty. Either throw a new exception, return a value or - in your case - print the exception with e.printStackTrace().
When you do these two things, you will notice that your code throws an IOException, because you close your bw-Stream too early. Move the bw.close() down to where br.close() is.
Now, when you have done that, your code is almost working. The only thing is - you now get a NullPointerException. This is because you don't change your line after all entries are read. The easy fix to this is change from
while(line.equals("Start")) { ...
to
if(line.equals("Start")) { ...
Also, there are some other not-so-neat things in your code, but I will leave it for now - experience comes with time.
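Putting the advice from these answers together, the Start/End extraction might look like the sketch below. The class name BetweenMarkers and the method linesBetween are hypothetical, and a StringReader replaces the asker's file so the example is self-contained; writing the lines to a file is left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class BetweenMarkers {
    /** Collects the lines strictly between the first start marker and the following end marker. */
    static List<String> linesBetween(BufferedReader reader, String start, String end) throws IOException {
        List<String> result = new ArrayList<>();
        String line;
        // Skip everything up to and including the start marker.
        while ((line = reader.readLine()) != null && !line.equals(start)) {
            // nothing to do while skipping
        }
        if (line == null) {
            return result; // end of input reached before the start marker
        }
        // Copy lines until the end marker or the end of input.
        while ((line = reader.readLine()) != null && !line.equals(end)) {
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String text = "Start\nhi\nhello\nEnd\nignored\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            System.out.println(linesBetween(reader, "Start", "End")); // [hi, hello]
        }
    }
}
```

The try-with-resources block closes the reader exactly once, after all reading is done, which avoids the premature close() described above.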
For Java 8:
List<String> stopWords = Arrays.asList("Start", "End");
try (BufferedReader reader = new BufferedReader(input)) {
List<String> lines = reader.lines()
.map(String::trim)
.filter(s -> !StringUtils.isEmpty(s) && !stopWords.contains(s))
.collect(Collectors.toList());
}

How to know bytes read(offset) of BufferedReader?

I want to read file line by line.
BufferedReader is much faster than RandomAccessFile or BufferedInputStream.
But the problem is that I don't know how many bytes I read.
How to know bytes read(offset)?
I tried.
String buffer;
int offset = 0;
while ((buffer = br.readLine()) != null)
offset += buffer.getBytes().length + 1; // 1 is for line separator
It works if the file is small.
But, when the file becomes large, offset becomes smaller than actual value.
How can I get offset?
There is no simple way to do this with BufferedReader because of two effects: character encoding and line endings. On Windows, the line ending is \r\n, which is two bytes. On Unix, the line separator is a single byte. BufferedReader will handle both cases without you noticing, so after readLine(), you won't know how many bytes were skipped.
Also buffer.getBytes() only returns the correct result when your default encoding and the encoding of the data in the file accidentally happens to be the same. When using byte[] <-> String conversion of any kind, you should always specify exactly which encoding should be used.
You also can't use a counting InputStream because the buffered readers read data in large chunks. So after reading the first line with, say, 5 bytes, the counter in the inner InputStream would return 4096 because the reader always reads that many bytes into its internal buffer.
You can have a look at NIO for this. You can use a low level ByteBuffer to keep track of the offset and wrap that in a CharBuffer to convert the input into lines.
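The two effects described in this answer can be shown with a small example. This is only an illustration under stated assumptions: ISO-8859-1 stands in for a default encoding that differs from the file's, the file is assumed to be UTF-8 with Windows line endings, and the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;

class ByteLengthDemo {
    /** Estimate in the style of the question: encoded line length plus one byte for the separator. */
    static int naiveLength(String line) {
        // Assumption: the platform default encoding is ISO-8859-1.
        return line.getBytes(StandardCharsets.ISO_8859_1).length + 1;
    }

    /** Number of bytes the line and its separator actually occupy in a UTF-8 file. */
    static int actualLength(String line, String separator) {
        return (line + separator).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String line = "h\u00e9llo"; // "hello" with an accented e
        System.out.println(naiveLength(line));          // 6
        System.out.println(actualLength(line, "\r\n")); // 8
    }
}
```

Each such line makes the naive offset fall two bytes behind, which matches the observation that the computed offset ends up smaller than the real one on larger files.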
Here's something that should work. It assumes UTF-8, but you can easily change that.
import java.io.*;
class main {
public static void main(final String[] args) throws Exception {
ByteCountingLineReader r = new ByteCountingLineReader(new ByteArrayInputStream(toUtf8("Hello\r\nWorld\n")));
String line = null;
do {
long count = r.byteCount();
line = r.readLine();
System.out.println("Line at byte " + count + ": " + line);
} while (line != null);
r.close();
}
static class ByteCountingLineReader implements Closeable {
InputStream in;
long _byteCount;
int bufferedByte = -1;
boolean ended;
// in should be a buffered stream!
ByteCountingLineReader(InputStream in) {
this.in = in;
}
ByteCountingLineReader(File f) throws IOException {
in = new BufferedInputStream(new FileInputStream(f), 65536);
}
String readLine() throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
if (ended) return null;
while (true) {
int c = read();
if (ended && baos.size() == 0) return null;
if (ended || c == '\n') break;
if (c == '\r') {
c = read();
if (c != '\n' && !ended)
bufferedByte = c;
break;
}
baos.write(c);
}
return fromUtf8(baos.toByteArray());
}
int read() throws IOException {
if (bufferedByte >= 0) {
int b = bufferedByte;
bufferedByte = -1;
return b;
}
int c = in.read();
if (c < 0) ended = true; else ++_byteCount;
return c;
}
long byteCount() {
return bufferedByte >= 0 ? _byteCount - 1 : _byteCount;
}
public void close() throws IOException {
if (in != null) try {
in.close();
} finally {
in = null;
}
}
boolean ended() {
return ended;
}
}
static byte[] toUtf8(String s) {
try {
return s.getBytes("UTF-8");
} catch (Exception __e) {
throw rethrow(__e);
}
}
static String fromUtf8(byte[] bytes) {
try {
return new String(bytes, "UTF-8");
} catch (Exception __e) {
throw rethrow(__e);
}
}
static RuntimeException rethrow(Throwable t) {
throw t instanceof RuntimeException ? (RuntimeException) t : new RuntimeException(t);
}
}
Try using RandomAccessFile:
RandomAccessFile raf = new RandomAccessFile(filePath, "r");
String curLine;
while ((curLine = raf.readLine()) != null) {
    System.out.println(curLine);
    // offset of the byte just after the line that was read
    long offset = raf.getFilePointer();
}
To seek to an offset:
raf.seek(offset);
I am curious what your final solution was. That said, I think using long instead of int for the offset in your code above would cover most situations.
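A self-contained sketch of the RandomAccessFile idea, recording the byte offset at which each line starts. LineOffsets and lineStartOffsets are hypothetical names. Keep in mind that RandomAccessFile.readLine() is unbuffered and does not decode the full Unicode character set, so this is best used for finding offsets, not for reading text.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class LineOffsets {
    /** Returns the byte offset of the first byte of every line in the file. */
    static List<Long> lineStartOffsets(Path file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long start = raf.getFilePointer(); // 0 for a freshly opened file
            while (raf.readLine() != null) {
                offsets.add(start);
                start = raf.getFilePointer();  // position just past the line terminator
            }
        }
        return offsets;
    }
}
```

Any of the returned offsets can later be passed to raf.seek(offset) to jump straight to that line.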
If you want to read a file line by line, I would recommend this code:
import java.io.*;
class FileRead
{
public static void main(String args[])
{
try{
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream("textfile.txt");
// Use DataInputStream to read binary NOT text.
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println (strLine);
}
//Close the reader (this also closes the underlying stream)
br.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
I have always used that method in the past, and it works great!

Java - NullPointerException while simply reading a file and returning an int

I am trying to read a file and figure out how many times a String occurs in that file. Depending on how many times, it will display a different dialogue to the player (it's a game). Here is my code
/**
* Selects which dialogue to send depending on how many times the player has been jailed
* @return The dialogue ID
*/
public static int selectChat() {
System.err.println("Got to selectChar()");
FileUtil.stringOccurrences(c.playerName, // NULLPOINTER HAPPENS HERE
"./data/restrictions/TimesJailed.txt");
if (FileUtil.stringCount == 1)
return 495;
if (FileUtil.stringCount == 2)
return 496;
if (FileUtil.stringCount >= 3) {
return 497;
}
return 0;
}
And then this is the actual file reading method
public static int stringCount;
/**
* @param string
* @param filePath
* @return How many occurrences of the string there are in the file
*/
public static int stringOccurrences(String string, String filePath) {
int count = 0;
try {
FileInputStream fstream = new FileInputStream(filePath);
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
while ((strLine = br.readLine()) != null) {
if (strLine.contains(string))
count++;
}
in.close();
}
catch (Exception e) { // Catch exception if any
System.err.println("Error: " + e.getMessage());
}
System.err.println(count);
stringCount = count;
return count;
}
Here is all I do with c
Client c;
public Jail(Client c) {
this.c = c;
}
Could someone please help me work out the problem.
It seems to me that c in c.playerName is null.
Also depending on what FileUtil you use, that might cause a NullPointerException as well, when playerName is null.
Your stringOccurrences method cannot throw an NPE as far as I can tell -- it makes no dereferences outside the try/catch block.
Where do you assign a value to c? If it is null, then your attempt to read c.playerName will produce a NullPointerException. To see for sure if this is happening, change your code to:
String myPlayerName = c.playerName; //now does NPE happen here ...
FileUtil.stringOccurrences(myPlayerName, // or here?
"./data/restrictions/TimesJailed.txt");
Please write your constructor as below:
Client c;
public Jail(Client c) {
    if (c == null) {
        c = new Client();
    }
    this.c = c;
}
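An alternative to substituting a default Client is to reject null at construction time, so the failure shows up where the bad value is passed in and not later in selectChat(). This is a hedged sketch: the Client class here is a hypothetical stand-in for the asker's class, and the playerName() accessor is made up for illustration.

```java
import java.util.Objects;

class Client { // hypothetical stand-in for the asker's Client class
    String playerName = "player";
}

class Jail {
    private final Client c;

    Jail(Client c) {
        // Fail immediately, with a clear message, if the caller passes null.
        this.c = Objects.requireNonNull(c, "c must not be null");
    }

    String playerName() {
        return c.playerName; // safe: c can never be null here
    }
}
```

With this constructor, the stack trace of the NullPointerException points at the code that created the Jail with a null client.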

Read in N Lines of an Input Stream and print in reverse order without using array or list type structure?

Using the readLine() method of BufferedReader, can you print the first N lines of a stream in reverse order without using a list or an array?
I think you can do it through recursion with something like:
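void printReversed(int n) throws IOException
{
    if (n <= 0)
        return;
    String line = reader.readLine(); // reader is a BufferedReader field
    if (line == null)
        return; // the stream has fewer than n lines
    printReversed(n - 1);
    System.out.println(line);
}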
How about recursion to reverse the order?
Pseudo code:
reverse(int linesLeft)
if (linesLeft == 0)
return;
String line = readLine();
reverse(linesLeft - 1);
System.out.println(line);
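The pseudo code above can be turned into runnable Java roughly as follows. This is a sketch with assumed names (RecursiveReverse, and a Consumer parameter added so the output can go somewhere other than System.out). The lines are held on the call stack, so a very large N can cause a StackOverflowError.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Consumer;

class RecursiveReverse {
    /** Reads up to linesLeft lines and passes them to out in reverse order. */
    static void printReversed(BufferedReader reader, int linesLeft, Consumer<String> out) throws IOException {
        if (linesLeft <= 0) {
            return;
        }
        String line = reader.readLine();
        if (line == null) {
            return; // the stream has fewer lines than requested
        }
        printReversed(reader, linesLeft - 1, out); // deeper calls emit the later lines first
        out.accept(line);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("one\ntwo\nthree\nfour"));
        printReversed(reader, 3, System.out::println); // prints three, two, one on separate lines
    }
}
```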
Nice question. Here is one solution based on coordinated threads. Although it's heavy on resources (one thread per line of the buffer), it solves your problem within the given constraints. I'm curious to see other solutions.
public class ReversedBufferPrinter {
class Worker implements Runnable {
private final CountDownLatch trigger;
private final CountDownLatch release;
private final String line;
Worker(String line, CountDownLatch release) {
this.trigger = new CountDownLatch(1);
this.release = release;
this.line = line;
}
public CountDownLatch getTriggerLatch() {
return trigger;
}
public void run() {
try {
trigger.await();
} catch (InterruptedException ex) { } // handle
work();
release.countDown();
}
void work() {
System.out.println(line);
}
}
public void reversePrint(BufferedReader reader, int lines) throws IOException {
CountDownLatch initialLatch = new CountDownLatch(1);
CountDownLatch triggerLatch = initialLatch;
int count=0;
String line;
while (count++<lines && (line = reader.readLine())!=null) {
Worker worker = new Worker(line, triggerLatch);
triggerLatch = worker.getTriggerLatch();
new Thread(worker).start();
}
triggerLatch.countDown();
try {
initialLatch.await();
} catch (InterruptedException iex) {
// handle
}
}
public static void main(String [] params) throws Exception {
if (params.length<2) {
System.out.println("usage: ReversedBufferPrinter <file to reverse> <#lines>");
return;
}
String filename = params[0];
int lines = Integer.parseInt(params[1]);
File file = new File(filename);
BufferedReader reader = new BufferedReader(new FileReader(file));
ReversedBufferPrinter printer = new ReversedBufferPrinter();
printer.reversePrint(reader, lines);
}
}
Here you have another alternative, based on BufferedReader & StringBuilder manipulations. More manageable in terms of computer resources needed.
public void reversePrint(BufferedReader bufReader, int lines) throws IOException {
BufferedReader resultBufferReader = null;
{
String line;
StringBuilder sb = new StringBuilder();
int count = 0;
while (count++<lines && (line = bufReader.readLine())!=null) {
sb.append('\n'); // restore new line marker for BufferedReader to consume.
sb.append(new StringBuilder(line).reverse());
}
resultBufferReader = new BufferedReader(new StringReader(sb.reverse().toString()));
}
{
String line;
while ((line = resultBufferReader.readLine())!=null) {
System.out.println(line);
}
}
}
This will also require implicit data structures, but you can spawn threads, start them in order, and make each thread read a line and then wait a decreasing amount of time. The result: the last thread prints first and the first one prints last, each printing its own line. (The intervals between them have to be large enough to ensure generous "safety margins".)
I have no idea how, if at all, this can be done with no explicit or implicit data storage.
Prepend each line you read to a string, and print the string. If you run out of lines to read, you just print what you have.
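Alternatively, if you are certain of the number of lines you have, and you do not wish to use a string:
void printReversed(int n, BufferedReader reader) throws IOException
{
    LineNumberReader lineReader = new LineNumberReader(reader);
    int i = n;
    while (--i >= 0)
    {
        // Note: setLineNumber() only changes the value reported by getLineNumber();
        // it does not move the read position, so the lines still come out in forward order.
        lineReader.setLineNumber(i);
        System.out.println(lineReader.readLine());
    }
}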
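The "prepend each line to a string" idea from this answer can be sketched as follows. The names are hypothetical, and "\n" is used as the separator for simplicity. Each prepend copies the whole accumulated string, so this is only reasonable for a small N.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class PrependReverse {
    /** Returns the first n lines of the reader in reverse order, each followed by a newline. */
    static String firstLinesReversed(BufferedReader reader, int n) throws IOException {
        String result = "";
        String line;
        for (int i = 0; i < n && (line = reader.readLine()) != null; i++) {
            result = line + "\n" + result; // the newest line goes in front
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc"));
        System.out.print(firstLinesReversed(reader, 2)); // prints b, then a
    }
}
```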

Number of lines in a file in Java

I use huge data files, and sometimes I only need to know the number of lines in them. Usually I open them and read them line by line until I reach the end of the file.
I was wondering if there is a smarter way to do that.
This is the fastest version I have found so far, about 6 times faster than readLines. On a 150MB log file this takes 0.35 seconds, versus 2.40 seconds when using readLines(). Just for fun, Linux's wc -l command takes 0.15 seconds.
public static int countLinesOld(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean empty = true;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return (count == 0 && !empty) ? 1 : count;
} finally {
is.close();
}
}
EDIT, 9 1/2 years later: I have practically no java experience, but anyways I have tried to benchmark this code against the LineNumberReader solution below since it bothered me that nobody did it. It seems that especially for large files my solution is faster. Although it seems to take a few runs until the optimizer does a decent job. I've played a bit with the code, and have produced a new version that is consistently fastest:
public static int countLinesNew(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int readChars = is.read(c);
if (readChars == -1) {
// bail out if nothing to read
return 0;
}
// make it easy for the optimizer to tune this loop
int count = 0;
while (readChars == 1024) {
for (int i=0; i<1024;) {
if (c[i++] == '\n') {
++count;
}
}
readChars = is.read(c);
}
// count remaining characters
while (readChars != -1) {
for (int i=0; i<readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
readChars = is.read(c);
}
return count == 0 ? 1 : count;
} finally {
is.close();
}
}
Benchmark results for a 1.3GB text file, y axis in seconds. I've performed 100 runs with the same file, and measured each run with System.nanoTime(). You can see that countLinesOld has a few outliers, and countLinesNew has none, and while it's only a bit faster, the difference is statistically significant. LineNumberReader is clearly slower.
I have implemented another solution to the problem, I found it more efficient in counting rows:
try
(
FileReader input = new FileReader("input.txt");
LineNumberReader count = new LineNumberReader(input);
)
{
while (count.skip(Long.MAX_VALUE) > 0)
{
// Loop just in case the file is > Long.MAX_VALUE or skip() decides to not read the entire file
}
result = count.getLineNumber() + 1; // +1 because line index starts at 0
}
The accepted answer has an off by one error for multi line files which don't end in newline. A one line file ending without a newline would return 1, but a two line file ending without a newline would return 1 too. Here's an implementation of the accepted solution which fixes this. The endsWithoutNewLine checks are wasteful for everything but the final read, but should be trivial time wise compared to the overall function.
public int count(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean endsWithoutNewLine = false;
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n')
++count;
}
endsWithoutNewLine = (c[readChars - 1] != '\n');
}
if(endsWithoutNewLine) {
++count;
}
return count;
} finally {
is.close();
}
}
With Java 8, you can use streams:
try (Stream<String> lines = Files.lines(path, Charset.defaultCharset())) {
long numOfLines = lines.count();
...
}
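A complete version of the stream approach might look like this. It is a sketch that assumes the file is UTF-8 encoded; the class name StreamLineCount is made up. The try-with-resources block matters because the stream returned by Files.lines keeps the file open until it is closed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class StreamLineCount {
    /** Counts the lines of a UTF-8 text file; a final line without a newline is counted too. */
    static long countLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.count();
        }
    }
}
```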
The answer with the method count() above gave me line miscounts if a file didn't have a newline at the end of the file - it failed to count the last line in the file.
This method works better for me:
public int countLines(String filename) throws IOException {
LineNumberReader reader = new LineNumberReader(new FileReader(filename));
int cnt = 0;
String lineRead = "";
while ((lineRead = reader.readLine()) != null) {}
cnt = reader.getLineNumber();
reader.close();
return cnt;
}
I tested the above methods for counting lines and here are my observations for Different methods as tested on my system
File Size : 1.6 Gb
Methods:
Using Scanner : 35s approx
Using BufferedReader : 5s approx
Using Java 8 : 5s approx
Using LineNumberReader : 5s approx
Moreover Java8 Approach seems quite handy :
Files.lines(Paths.get(filePath), Charset.defaultCharset()).count()
[Return type : long]
I know this is an old question, but the accepted solution didn't quite match what I needed it to do. So, I refined it to accept various line terminators (rather than just line feed) and to use a specified character encoding (rather than ISO-8859-n). All in one method (refactor as appropriate):
public static long getLinesCount(String fileName, String encodingName) throws IOException {
long linesCount = 0;
File file = new File(fileName);
FileInputStream fileIn = new FileInputStream(file);
try {
Charset encoding = Charset.forName(encodingName);
Reader fileReader = new InputStreamReader(fileIn, encoding);
int bufferSize = 4096;
Reader reader = new BufferedReader(fileReader, bufferSize);
char[] buffer = new char[bufferSize];
int prevChar = -1;
int readCount = reader.read(buffer);
while (readCount != -1) {
for (int i = 0; i < readCount; i++) {
int nextChar = buffer[i];
switch (nextChar) {
case '\r': {
// The current line is terminated by a carriage return or by a carriage return immediately followed by a line feed.
linesCount++;
break;
}
case '\n': {
if (prevChar == '\r') {
// The current line is terminated by a carriage return immediately followed by a line feed.
// The line has already been counted.
} else {
// The current line is terminated by a line feed.
linesCount++;
}
break;
}
}
prevChar = nextChar;
}
readCount = reader.read(buffer);
}
if (prevChar != -1) {
switch (prevChar) {
case '\r':
case '\n': {
// The last line is terminated by a line terminator.
// The last line has already been counted.
break;
}
default: {
// The last line is terminated by end-of-file.
linesCount++;
}
}
}
} finally {
fileIn.close();
}
return linesCount;
}
This solution is comparable in speed to the accepted solution, about 4% slower in my tests (though timing tests in Java are notoriously unreliable).
/**
* Count file rows.
*
* @param file file
* @return file row count
* @throws IOException
*/
public static long getLineCount(File file) throws IOException {
try (Stream<String> lines = Files.lines(file.toPath())) {
return lines.count();
}
}
Tested on JDK8_u31. But indeed performance is slow compared to this method:
/**
* Count file rows.
*
* @param file file
* @return file row count
* @throws IOException
*/
public static long getLineCount(File file) throws IOException {
try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(file), 1024)) {
byte[] c = new byte[1024];
boolean empty = true,
lastEmpty = false;
long count = 0;
int read;
while ((read = is.read(c)) != -1) {
for (int i = 0; i < read; i++) {
if (c[i] == '\n') {
count++;
lastEmpty = true;
} else if (lastEmpty) {
lastEmpty = false;
}
}
empty = false;
}
if (!empty) {
if (count == 0) {
count = 1;
} else if (!lastEmpty) {
count++;
}
}
return count;
}
}
Tested and very fast.
A straight-forward way using Scanner
static void lineCounter (String path) throws IOException {
int lineCount = 0, commentsCount = 0;
Scanner input = new Scanner(new File(path));
while (input.hasNextLine()) {
String data = input.nextLine();
if (data.startsWith("//")) commentsCount++;
lineCount++;
}
System.out.println("Line Count: " + lineCount + "\t Comments Count: " + commentsCount);
}
I concluded that wc -l's method of counting newlines is fine but returns non-intuitive results on files where the last line doesn't end with a newline.
And @er.vikas's solution, based on LineNumberReader but adding one to the line count, returned non-intuitive results on files where the last line does end with a newline.
I therefore made an algorithm which behaves as follows:
@Test
public void empty() throws IOException {
assertEquals(0, count(""));
}
@Test
public void singleNewline() throws IOException {
assertEquals(1, count("\n"));
}
@Test
public void dataWithoutNewline() throws IOException {
assertEquals(1, count("one"));
}
@Test
public void oneCompleteLine() throws IOException {
assertEquals(1, count("one\n"));
}
@Test
public void twoCompleteLines() throws IOException {
assertEquals(2, count("one\ntwo\n"));
}
@Test
public void twoLinesWithoutNewlineAtEnd() throws IOException {
assertEquals(2, count("one\ntwo"));
}
@Test
public void aFewLines() throws IOException {
assertEquals(5, count("one\ntwo\nthree\nfour\nfive\n"));
}
And it looks like this:
static long countLines(InputStream is) throws IOException {
try(LineNumberReader lnr = new LineNumberReader(new InputStreamReader(is))) {
char[] buf = new char[8192];
int n, previousN = -1;
//Read will return at least one byte, no need to buffer more
while((n = lnr.read(buf)) != -1) {
previousN = n;
}
int ln = lnr.getLineNumber();
if (previousN == -1) {
//No data read at all, i.e file was empty
return 0;
} else {
char lastChar = buf[previousN - 1];
if (lastChar == '\n' || lastChar == '\r') {
//Ending with newline, deduct one
return ln;
}
}
//normal case, return line number + 1
return ln + 1;
}
}
If you want intuitive results, you may use this. If you just want wc -l compatibility, simply use @er.vikas's solution, but don't add one to the result, and retry the skip:
try(LineNumberReader lnr = new LineNumberReader(new FileReader(new File("File1")))) {
while(lnr.skip(Long.MAX_VALUE) > 0){};
return lnr.getLineNumber();
}
How about using the Process class from within Java code, and then reading the output of the command?
Process p = Runtime.getRuntime().exec(new String[] {"wc", "-l", yourfilename});
p.waitFor();
BufferedReader b = new BufferedReader(new InputStreamReader(p.getInputStream()));
String line = "";
int lineCount = 0;
while ((line = b.readLine()) != null) {
    System.out.println(line);
    // wc prints "<count> <filename>", so parse only the first token
    lineCount = Integer.parseInt(line.trim().split("\\s+")[0]);
}
Need to try it though. Will post the results.
It seems that there are a few different approaches you can take with LineNumberReader.
I did this:
int lines = 0;
FileReader input = new FileReader(fileLocation);
LineNumberReader count = new LineNumberReader(input);
String line = count.readLine();
if(count.ready())
{
while(line != null) {
lines = count.getLineNumber();
line = count.readLine();
}
lines+=1;
}
count.close();
System.out.println(lines);
Even more simply, you can use the BufferedReader lines() method to get a stream of the lines, and then use the Stream count() method to count them. The stream already includes a final line that has no trailing newline, so nothing needs to be added to the result.
As example:
FileReader input = new FileReader(fileLocation);
LineNumberReader count = new LineNumberReader(input);
int lines = (int) count.lines().count();
count.close();
System.out.println(lines);
This funny solution actually works really well!
public static int countLines(File input) throws IOException {
try (InputStream is = new FileInputStream(input)) {
int count = 1;
for (int aChar = 0; aChar != -1;aChar = is.read())
count += aChar == '\n' ? 1 : 0;
return count;
}
}
On Unix-based systems, use the wc command on the command-line.
Only way to know how many lines there are in file is to count them. You can of course create a metric from your data giving you an average length of one line and then get the file size and divide that with avg. length but that won't be accurate.
If you don't have any index structures, you'll not get around the reading of the complete file. But you can optimize it by avoiding to read it line by line and use a regex to match all line terminators.
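The regex idea could be sketched like this for text that is already in memory. This is an assumption-laden illustration, not the answerer's code: RegexLineCount is a made-up name, and the pattern \R (available since Java 8) matches \r\n as one unit but also matches a few less common line breaks such as form feed and U+2028. A huge file would have to be processed in chunks instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLineCount {
    // \R matches any line break sequence, treating \r\n as a single break.
    private static final Pattern LINE_BREAK = Pattern.compile("\\R");

    /** Counts lines in text; a trailing fragment without a terminator counts as a line. */
    static int countLines(CharSequence text) {
        if (text.length() == 0) {
            return 0;
        }
        Matcher matcher = LINE_BREAK.matcher(text);
        int breaks = 0;
        int lastEnd = 0;
        while (matcher.find()) {
            breaks++;
            lastEnd = matcher.end();
        }
        // If characters remain after the last terminator, they form one more line.
        return lastEnd < text.length() ? breaks + 1 : breaks;
    }

    public static void main(String[] args) {
        System.out.println(countLines("one\r\ntwo\nthree")); // 3
    }
}
```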
Best Optimized code for multi line files having no newline('\n') character at EOF.
/**
*
* @param filename
* @return
* @throws IOException
*/
public static int countLines(String filename) throws IOException {
int count = 0;
boolean empty = true;
FileInputStream fis = null;
InputStream is = null;
try {
fis = new FileInputStream(filename);
is = new BufferedInputStream(fis);
byte[] c = new byte[1024];
int readChars = 0;
boolean isLine = false;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if ( c[i] == '\n' ) {
isLine = false;
++count;
}else if(!isLine && c[i] != '\n' && c[i] != '\r'){ //Case to handle line count where no New Line character present at EOF
isLine = true;
}
}
}
if(isLine){
++count;
}
}catch(IOException e){
e.printStackTrace();
}finally {
if(is != null){
is.close();
}
if(fis != null){
fis.close();
}
}
LOG.info("count: "+count);
return (count == 0 && !empty) ? 1 : count;
}
Scanner with regex:
public int getLineCount() {
Scanner fileScanner = null;
int lineCount = 0;
Pattern lineEndPattern = Pattern.compile("(?m)$");
try {
fileScanner = new Scanner(new File(filename)).useDelimiter(lineEndPattern);
while (fileScanner.hasNext()) {
fileScanner.next();
++lineCount;
}
}catch(FileNotFoundException e) {
e.printStackTrace();
return lineCount;
}
fileScanner.close();
return lineCount;
}
Haven't clocked it.
If you use this
public int countLines(String filename) throws IOException {
LineNumberReader reader = new LineNumberReader(new FileReader(filename));
int cnt = 0;
String lineRead = "";
while ((lineRead = reader.readLine()) != null) {}
cnt = reader.getLineNumber();
reader.close();
return cnt;
}
you cannot count more than Integer.MAX_VALUE (about 2.1 billion) rows, because reader.getLineNumber() returns an int. You need a long counter to handle more rows than that.
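A minimal sketch of counting with a long, as suggested above. It uses a plain BufferedReader and its own counter, so the int returned by getLineNumber() is not involved. The class name LongLineCount is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class LongLineCount {
    /** Counts lines using a long counter; closes the source when done. */
    static long countLines(Reader source) throws IOException {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            while (reader.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(new StringReader("a\nb\nc"))); // 3
    }
}
```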
