Use BufferedReader and InputStream together - java

I use a BufferedReader to read lines from an InputStream. When I read something directly from the InputStream, the BufferedReader ignores my read and continues reading at the same location. Is it possible to prevent this behavior? If not, what is good practice for doing this?
PS: Here's my code:
byte[] ba = new byte[1024*1024];
int off = 0;
int len = 0;
do {
len = Integer.parseInt(br.readLine());
in.read(ba, off, len);
br.readLine();
off += len;
} while(len > 0);
Here in is my InputStream and br is my BufferedReader.

If not what is a good practice to do this?
Reading the same source through two streams at once is not a good approach. Use just one stream.
BufferedReader is for character streams, whereas InputStream is for binary streams.
InputStream has no readLine() method; that is only available on character-stream classes such as BufferedReader.
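One way to follow the "just one stream" advice is to read both the length lines and the binary payload from the same InputStream, so that nothing is buffered behind your back. The following is only a minimal sketch. It assumes the framing from the question's first snippet (a decimal length on its own line, then that many payload bytes, then a line terminator, with a length of 0 ending the data); the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class FramedReader {

    /** Reads one '\n'-terminated ASCII line straight from the stream, without read-ahead. */
    static String readAsciiLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {          // tolerate "\r\n" line endings
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("stream ended before a line was read");
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    /** Reads records of the assumed form "<length>\n<payload bytes>\n" until a length of 0. */
    static byte[] readAll(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int len;
        while ((len = Integer.parseInt(readAsciiLine(in).trim())) > 0) {
            byte[] chunk = new byte[len];
            in.readFully(chunk);      // unlike read(), blocks until all len bytes have arrived
            out.write(chunk);
            readAsciiLine(in);        // consume the line terminator after the payload
        }
        return out.toByteArray();
    }
}
```

If the source is a file or socket, wrapping it once in a BufferedInputStream and passing that to readAll keeps the byte-by-byte line reads cheap, since all reads then go through the same buffer.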

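Reading from a BufferedReader and from its underlying InputStream at the same time does not work reliably, because the BufferedReader reads ahead into its own buffer. My workaround is to read the payload with multiple readLine() calls as well; note that this only works if the payload is text that survives the conversion to a String and back.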
Here's my new code:
byte[] ba = new byte[1024*1024];
int off = 0;
int len = 0;
do {
len = Integer.parseInt(br.readLine().split(";" , 2)[0],16);
for (int cur = 0; cur < len;) {
byte[] line0 = br.readLine().getBytes();
for (int i = 0; i < line0.length; i++) {
ba[off+cur+i] = line0[i];
}
cur += line0.length;
if(cur < len) {
ba[off+cur] = '\n';
cur++;
}
}
off += len;
} while(len > 0);

BufferedReader bufferedReader = null;
try
{
bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
String line = null;
while((line = bufferedReader.readLine()) != null)
{
//process lines here
}
}
catch(IOException e)
{
e.printStackTrace();
}
finally
{
if(bufferedReader != null)
{
try
{
bufferedReader.close();
}
catch(IOException e)
{
}
}
}

Related

Java very basic encrypting a file

I want to encrypt a file in Java in a very basic way: read the file line by line and change the value of each char with "char += key", where key is an integer.
The problem is that if I use a key greater than or equal to 2, it doesn't work anymore.
public void encryptData(int key) {
System.out.println("Encrypt");
try {
BufferedReader br = new BufferedReader(new FileReader("encrypted.data"));
BufferedWriter out = new BufferedWriter(new FileWriter("temp_encrypted.data"));
String str;
while ((str = br.readLine()) != null) {
char[] str_array = str.toCharArray();
// Encrypt one line
for (int i = 0; i < str.length(); i++) {
str_array[i] += key;
}
// Put the line in temp file
str = String.valueOf(str_array);
out.write(str_array);
}
br.close();
out.close();
} catch (IOException e) {
System.out.println(e.getMessage());
}
}
The decrypt function is the same, but with the input and output files swapped and the key subtracted instead of added.
I checked char by char, and indeed the header gets messed up when I use a key value > 1. Any ideas? Is it because the maximum value of a char is being exceeded?
You're basically implementing a general-purpose Caesar cipher.
Adding a number to a character can turn it into a newline (or another control character), which will not work if a BufferedReader is used to read the data back in.
It is best to process the text as a byte stream, which handles newlines and any non-ASCII characters correctly.
public void encryptData(int key) {
System.out.println("Encrypt");
try {
BufferedInputStream in = new BufferedInputStream(new FileInputStream("raw-text.data"));
BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream("temp_encrypted.data"));
int ch;
while((ch = in.read()) != -1) {
// NOTE: write(int) method casts int to byte
out.write(ch + key);
}
out.close();
in.close();
} catch (IOException e) {
System.out.println(e.getMessage());
}
}
public void decryptData(int key) {
System.out.println("Decrypt");
try {
BufferedInputStream in = new BufferedInputStream(new FileInputStream("temp_encrypted.data"));
BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream("decrypted.data"));
int ch;
while((ch = in.read()) != -1) {
out.write(ch - key);
}
out.close();
in.close();
} catch (IOException e) {
System.out.println(e.getMessage());
}
}
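To see why the byte-based version is reversible, here is a small in-memory sketch of the same shift (the class name ByteShift and its methods are hypothetical, not part of the answer above). OutputStream.write(int) keeps only the low 8 bits of its argument, so the addition wraps around modulo 256 and subtracting the same key restores every byte, including newlines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ByteShift {

    /** Copies in to out, adding key to every byte; write(int) keeps only the low 8 bits. */
    static void shift(InputStream in, OutputStream out, int key) throws IOException {
        int b;
        while ((b = in.read()) != -1) {
            out.write(b + key);
        }
    }

    /** Convenience wrapper for data that is already in memory. */
    static byte[] shift(byte[] data, int key) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        shift(new ByteArrayInputStream(data), out, key);
        return out.toByteArray();
    }
}
```

Calling shift(shift(data, key), -key) should give back the original bytes for any key, which is the property the character-based version loses once a shifted char happens to be a line terminator.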

Reading ascii file line by line - Java

I am trying to read an ASCII file and find the position of each newline character "\n", so that I know which and how many characters I have on every line. The file size is 538 MB. When I run the code below, it never prints anything.
I searched a lot but didn't find anything for ASCII files. I use NetBeans and Java 8. Any ideas?
Below is my code.
String inputFile = "C:\\myfile.txt";
FileInputStream in = new FileInputStream(inputFile);
FileChannel ch = in.getChannel();
int BUFSIZE = 512;
ByteBuffer buf = ByteBuffer.allocateDirect(BUFSIZE);
Charset cs = Charset.forName("ASCII");
while ( (rd = ch.read( buf )) != -1 ) {
buf.rewind();
CharBuffer chbuf = cs.decode(buf);
for ( int i = 0; i < chbuf.length(); i++ ) {
if (chbuf.get() == '\n'){
System.out.println("PRINT SOMETHING");
}
}
}
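A note on the loop above: rewind() only resets the position, and the buffer is never cleared, so after the first full read the channel has no room to write into and read() keeps returning 0 rather than -1. The usual pattern is flip() before consuming and clear() afterwards. A minimal sketch of that pattern follows; it counts newlines instead of printing, takes any ReadableByteChannel, and the class name is hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

final class ChannelNewlines {

    /** Counts '\n' characters read from an ASCII channel using a small reusable buffer. */
    static long countNewlines(ReadableByteChannel ch) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(512);
        long newlines = 0;
        while (ch.read(buf) != -1) {
            buf.flip();                       // limit = bytes just read, position = 0
            CharBuffer chars = StandardCharsets.US_ASCII.decode(buf);
            while (chars.hasRemaining()) {
                if (chars.get() == '\n') {
                    newlines++;
                }
            }
            buf.clear();                      // make the whole buffer writable again
        }
        return newlines;
    }
}
```

Decoding chunk by chunk like this is only safe for single-byte charsets such as ASCII; with a multi-byte encoding a character could be split across two reads.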
Method to store the contents of a file to a string:
static String readFile(String path, Charset encoding) throws IOException
{
byte[] encoded = Files.readAllBytes(Paths.get(path));
return new String(encoded, encoding);
}
Here's a way to find the occurrences of a character in the entire string:
public static void main(String [] args) throws IOException
{
List<Integer> indexes = new ArrayList<Integer>();
String content = readFile("filetest", StandardCharsets.UTF_8);
int index = content.indexOf('\n');
while (index >= 0)
{
indexes.add(index);
index = content.indexOf('\n', index + 1);
}
}
Found here and here.
The number of characters in a line is the length of the string read by a readLine call:
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
int iLine = 0;
String line;
while ((line = br.readLine()) != null) {
System.out.println( "Line " + iLine + " has " +
line.length() + " characters." );
iLine++;
}
} catch( IOException ioe ){
// ...
}
Note that the (system-dependent) line end marker has been stripped from the string by readLine.
If a very large file contains no newlines, it is indeed possible to run out of memory. Reading character by character will avoid this.
File file = new File( "Z.java" );
Reader reader = new FileReader(file);
int len = 0;
int c;
int iLine = 0;
while( (c = reader.read()) != -1) {
if( c == '\n' ){
iLine++;
System.out.println( "line " + iLine + " contains " +
len + " characters" );
len = 0;
} else {
len++;
}
}
reader.close();
You should use FileReader, which is a convenience class for reading character files.
The FileInputStream Javadoc clearly states:
FileInputStream is meant for reading streams of raw bytes such as
image data. For reading streams of characters, consider using
FileReader.
Try the code below:
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line;
while ((line = br.readLine()) != null) {
for (int pos = line.indexOf("\n"); pos != -1; pos = line.indexOf("\n", pos + 1)) {
System.out.println("\\n at " + pos);
}
}
}
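Keep in mind that readLine() strips the line terminator, so searching the returned String for "\n" will not find anything. If the actual offsets of the newlines are needed, one option is to read character by character and track the position yourself. This is only a sketch with hypothetical names; for a file with many millions of lines the list of offsets itself becomes large, so you may prefer to process each offset as it is found.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

final class NewlineOffsets {

    /** Returns the zero-based character offset of every '\n' in the input. */
    static List<Long> find(Reader source) throws IOException {
        List<Long> offsets = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        long position = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                offsets.add(position);
            }
            position++;
        }
        return offsets;
    }
}
```

The length of a line is then the difference between two consecutive offsets minus one.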

Using BufferedReader or Scanner to read a txt file. in my example only the Scanner works. WHY?

I am trying to use a BufferedReader instead of a Scanner to read a txt file containing all prime numbers between 2 and 10000.
I want to put the integers in an array, tall.
With the Scanner it works, but not with the BufferedReader equivalent.
The try block works with both approaches, but when I close the BufferedReader in the finally block, I am told that my variables don't exist.
If I use the Scanner, it works.
Scanner (works):
public class Main {
public static void main(String[] arg) throws IOException {
File file = new File ("C:/Users/Victor/workspace/Bvijfb/Priem.txt");
file.getParentFile().mkdirs();
FileWriter xw = new FileWriter("Priem.txt");
PrintWriter yw = new PrintWriter(xw);
int i;
boolean isPriem;
int[] tall = new int[1240];
for (i = 2; i < 10000; i++) {
isPriem = true;
for (int j = 2; j < i; j++) {
if (i % j == 0) {
isPriem = false;
break;
}
}
if (isPriem == true)
yw.println("" + i);
}
yw.close();
Scanner sc = new Scanner(file);
int a = 0;
try {
while (sc.hasNextLine()) {
if(sc.hasNextInt()){
tall[a] = sc.nextInt();
a++;
}
else {
break;
}
}
} finally {
sc.close();
}
}
The try statement using the BufferedReader approach:
try{
InputStream is = new FileInputStream(file);
InputStreamReader isr = new InputStreamReader(is);
BufferedReader br = new BufferedReader(isr);
int a = 0;
int value = 0;
while ((value = br.read()) != -1) {
tall[a] = value;
a++;
}
}catch(Exception e){
e.printStackTrace();
}finally{
is.close();
isr.close();
br.close();
}
Your scanner code is looking for an int by looking for characters that define an int. So for instance, the three characters '1', '2', and '3', in a row, are read as the int value 123 by your Scanner code.
But your BufferedReader code is working at a lower level than that; it works at the individual character level. So when you do br.read() for the first time on my 123 example, you get the value 49 ('1' == 49).
BufferedReader doesn't implement the parsing functionality that Scanner does. If you want it, use Scanner (or implement your own parsing if necessary).
Unrelated side note: On any vaguely-recent version of Java, your try in the BufferedReader example would be much better off as a try-with-resources:
try (
InputStream is = new FileInputStream(file);
InputStreamReader isr = new InputStreamReader(is);
BufferedReader br = new BufferedReader(isr)
) {
int a = 0;
int value = 0;
while ((value = br.read()) != -1) {
tall[a] = value;
a++;
}
} catch (Exception e){
e.printStackTrace(); // This is also generally poor practice, but for quick-and-dirty code...
}
Note that I never call close. I don't have to: try-with-resources handles it for me, and handles it properly regardless of where or when an exception is thrown. In your code, for instance, if for some reason new InputStreamReader(is) threw, you'd leave is open.
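To make the closing behavior visible, here is a tiny sketch (the class name and log messages are made up for illustration) that records what happens when a try-with-resources block with two resources finishes. Resources are closed automatically, in the reverse order of their declaration.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {

    /** Runs a try-with-resources block and returns a log of what happened, in order. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (AutoCloseable first = () -> log.add("close first");
             AutoCloseable second = () -> log.add("close second")) {
            log.add("body");
        } catch (Exception e) {
            // AutoCloseable.close() is declared to throw Exception
            log.add("caught " + e);
        }
        return log;
    }
}
```

CloseOrderDemo.run() returns [body, close second, close first], which is the same order in which br, isr and is would be closed in the snippet above.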
For starters, when using a BufferedReader, you probably want to use readLine() instead of read(). read() returns a single character at a time; readLine() returns a whole line of text as a String, and (assuming these files have one number per line) that String will be the representation of a prime number.
If there is more than one number per line of text in the file, you'll need to split the String.
In order to convert that String to an int, you'll need to run Integer.parseInt() on it.
Also, you can remove some of the setup ceremony as follows:
FileReader fr = new FileReader(fileName);
BufferedReader br = new BufferedReader(fr);
Or even just
BufferedReader br = new BufferedReader(new FileReader(fileName));
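Putting the readLine() and Integer.parseInt() advice together, a minimal sketch could look like this. It assumes one integer per line and skips blank lines; the class and method names are hypothetical, and it takes any Reader so that a FileReader can be passed in.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

final class IntLines {

    /** Parses one integer per non-blank line and returns them in file order. */
    static int[] readInts(Reader source) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    values.add(Integer.parseInt(line));
                }
            }
        }
        int[] result = new int[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }
}
```

For the question's file this would be called as IntLines.readInts(new FileReader(file)). Collecting into a list first avoids having to guess the array size up front.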

How to read first five character from buffered reader?

I have this code
Process p =Runtime.getRuntime().exec("busybox");
InputStream a = p.getInputStream();
InputStreamReader read = new InputStreamReader(a);
BufferedReader in = new BufferedReader(read);
When I run it from the terminal, the first lines of output show the BusyBox version. How would I take, for example, just the first 5 characters?
While the other answers should work well too, the following will exit and close the stream after reading five characters:
Process p = Runtime.getRuntime().exec("busybox");
InputStream a = p.getInputStream();
InputStreamReader read = new InputStreamReader(a);
StringBuilder firstFiveChars = new StringBuilder();
int ch = read.read();
while (ch != -1 && firstFiveChars.length() < 5) {
firstFiveChars.append((char)ch);
ch = read.read();
}
read.close();
a.close();
System.out.println(firstFiveChars);
Try:
String line = in.readLine();
if(line!=null && line.length() >5)
line = line.substring(0, 5);
Do it this way:
Process p;
try {
p = Runtime.getRuntime().exec("busybox");
InputStream a = p.getInputStream();
InputStreamReader read = new InputStreamReader(a);
BufferedReader in = new BufferedReader(read);
StringBuilder buffer = new StringBuilder();
String line = null;
try {
while ((line = in.readLine()) != null) {
buffer.append(line);
}
} finally {
read.close();
in.close();
}
String result = buffer.toString().substring(0, 15);
System.out.println("Result : " + result);
} catch (Exception e) {
e.printStackTrace();
}
Output
Result : BusyBox v1.13.3
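If only a fixed number of leading characters is needed, there is no need to read the whole output first. A small sketch (hypothetical helper name) that works with any Reader, including the InputStreamReader or BufferedReader from the question:

```java
import java.io.IOException;
import java.io.Reader;

final class Prefix {

    /** Reads up to n characters from the reader; returns fewer only at end of stream. */
    static String readPrefix(Reader in, int n) throws IOException {
        char[] buf = new char[n];
        int filled = 0;
        while (filled < n) {
            // read() may return fewer characters than requested, so keep asking
            int read = in.read(buf, filled, n - filled);
            if (read == -1) {
                break;
            }
            filled += read;
        }
        return new String(buf, 0, filled);
    }
}
```

The loop matters because a single read(char[], int, int) call is allowed to return fewer characters than asked for, for instance when the process has not produced all of its output yet.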

Number of lines in a file in Java

I work with huge data files, and sometimes I only need to know the number of lines in them. Usually I open them and read them line by line until I reach the end of the file.
I was wondering if there is a smarter way to do that.
This is the fastest version I have found so far, about 6 times faster than readLines. On a 150MB log file this takes 0.35 seconds, versus 2.40 seconds when using readLines(). Just for fun, linux' wc -l command takes 0.15 seconds.
public static int countLinesOld(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean empty = true;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
}
return (count == 0 && !empty) ? 1 : count;
} finally {
is.close();
}
}
EDIT, 9 1/2 years later: I have practically no java experience, but anyways I have tried to benchmark this code against the LineNumberReader solution below since it bothered me that nobody did it. It seems that especially for large files my solution is faster. Although it seems to take a few runs until the optimizer does a decent job. I've played a bit with the code, and have produced a new version that is consistently fastest:
public static int countLinesNew(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int readChars = is.read(c);
if (readChars == -1) {
// bail out if nothing to read
return 0;
}
// make it easy for the optimizer to tune this loop
int count = 0;
while (readChars == 1024) {
for (int i=0; i<1024;) {
if (c[i++] == '\n') {
++count;
}
}
readChars = is.read(c);
}
// count remaining characters
while (readChars != -1) {
for (int i=0; i<readChars; ++i) {
if (c[i] == '\n') {
++count;
}
}
readChars = is.read(c);
}
return count == 0 ? 1 : count;
} finally {
is.close();
}
}
Benchmark results for a 1.3GB text file, y axis in seconds. I've performed 100 runs with the same file, and measured each run with System.nanoTime(). You can see that countLinesOld has a few outliers, and countLinesNew has none and while it's only a bit faster, the difference is statistically significant. LineNumberReader is clearly slower.
I have implemented another solution to the problem, I found it more efficient in counting rows:
try
(
FileReader input = new FileReader("input.txt");
LineNumberReader count = new LineNumberReader(input);
)
{
while (count.skip(Long.MAX_VALUE) > 0)
{
// Loop just in case the file is > Long.MAX_VALUE or skip() decides to not read the entire file
}
result = count.getLineNumber() + 1; // +1 because line index starts at 0
}
The accepted answer has an off by one error for multi line files which don't end in newline. A one line file ending without a newline would return 1, but a two line file ending without a newline would return 1 too. Here's an implementation of the accepted solution which fixes this. The endsWithoutNewLine checks are wasteful for everything but the final read, but should be trivial time wise compared to the overall function.
public int count(String filename) throws IOException {
InputStream is = new BufferedInputStream(new FileInputStream(filename));
try {
byte[] c = new byte[1024];
int count = 0;
int readChars = 0;
boolean endsWithoutNewLine = false;
while ((readChars = is.read(c)) != -1) {
for (int i = 0; i < readChars; ++i) {
if (c[i] == '\n')
++count;
}
endsWithoutNewLine = (c[readChars - 1] != '\n');
}
if(endsWithoutNewLine) {
++count;
}
return count;
} finally {
is.close();
}
}
With java-8, you can use streams:
try (Stream<String> lines = Files.lines(path, Charset.defaultCharset())) {
long numOfLines = lines.count();
...
}
The answer with the method count() above gave me line miscounts if a file didn't have a newline at the end of the file - it failed to count the last line in the file.
This method works better for me:
public int countLines(String filename) throws IOException {
LineNumberReader reader = new LineNumberReader(new FileReader(filename));
int cnt = 0;
String lineRead = "";
while ((lineRead = reader.readLine()) != null) {}
cnt = reader.getLineNumber();
reader.close();
return cnt;
}
I tested the above methods for counting lines; here are my observations for the different methods on my system.
File size: 1.6 GB
Methods:
Using Scanner: approx. 35 s
Using BufferedReader: approx. 5 s
Using Java 8: approx. 5 s
Using LineNumberReader: approx. 5 s
Moreover, the Java 8 approach seems quite handy:
Files.lines(Paths.get(filePath), Charset.defaultCharset()).count()
[Return type: long]
I know this is an old question, but the accepted solution didn't quite match what I needed it to do. So, I refined it to accept various line terminators (rather than just line feed) and to use a specified character encoding (rather than ISO-8859-n). All in one method (refactor as appropriate):
public static long getLinesCount(String fileName, String encodingName) throws IOException {
long linesCount = 0;
File file = new File(fileName);
FileInputStream fileIn = new FileInputStream(file);
try {
Charset encoding = Charset.forName(encodingName);
Reader fileReader = new InputStreamReader(fileIn, encoding);
int bufferSize = 4096;
Reader reader = new BufferedReader(fileReader, bufferSize);
char[] buffer = new char[bufferSize];
int prevChar = -1;
int readCount = reader.read(buffer);
while (readCount != -1) {
for (int i = 0; i < readCount; i++) {
int nextChar = buffer[i];
switch (nextChar) {
case '\r': {
// The current line is terminated by a carriage return or by a carriage return immediately followed by a line feed.
linesCount++;
break;
}
case '\n': {
if (prevChar == '\r') {
// The current line is terminated by a carriage return immediately followed by a line feed.
// The line has already been counted.
} else {
// The current line is terminated by a line feed.
linesCount++;
}
break;
}
}
prevChar = nextChar;
}
readCount = reader.read(buffer);
}
if (prevChar != -1) {
switch (prevChar) {
case '\r':
case '\n': {
// The last line is terminated by a line terminator.
// The last line has already been counted.
break;
}
default: {
// The last line is terminated by end-of-file.
linesCount++;
}
}
}
} finally {
fileIn.close();
}
return linesCount;
}
This solution is comparable in speed to the accepted solution, about 4% slower in my tests (though timing tests in Java are notoriously unreliable).
/**
* Count file rows.
*
* @param file file
* @return file row count
* @throws IOException
*/
public static long getLineCount(File file) throws IOException {
try (Stream<String> lines = Files.lines(file.toPath())) {
return lines.count();
}
}
Tested on JDK8_u31. But indeed performance is slow compared to this method:
/**
* Count file rows.
*
* @param file file
* @return file row count
* @throws IOException
*/
public static long getLineCount(File file) throws IOException {
try (BufferedInputStream is = new BufferedInputStream(new FileInputStream(file), 1024)) {
byte[] c = new byte[1024];
boolean empty = true,
lastEmpty = false;
long count = 0;
int read;
while ((read = is.read(c)) != -1) {
for (int i = 0; i < read; i++) {
if (c[i] == '\n') {
count++;
lastEmpty = true;
} else if (lastEmpty) {
lastEmpty = false;
}
}
empty = false;
}
if (!empty) {
if (count == 0) {
count = 1;
} else if (!lastEmpty) {
count++;
}
}
return count;
}
}
Tested and very fast.
A straight-forward way using Scanner
static void lineCounter (String path) throws IOException {
int lineCount = 0, commentsCount = 0;
Scanner input = new Scanner(new File(path));
while (input.hasNextLine()) {
String data = input.nextLine();
if (data.startsWith("//")) commentsCount++;
lineCount++;
}
System.out.println("Line Count: " + lineCount + "\t Comments Count: " + commentsCount);
}
I concluded that wc -l's method of counting newlines is fine, but it returns non-intuitive results on files where the last line doesn't end with a newline.
And @er.vikas's solution, based on LineNumberReader but adding one to the line count, returns non-intuitive results on files where the last line does end with a newline.
I therefore wrote an algorithm that behaves as follows:
@Test
public void empty() throws IOException {
assertEquals(0, count(""));
}
@Test
public void singleNewline() throws IOException {
assertEquals(1, count("\n"));
}
@Test
public void dataWithoutNewline() throws IOException {
assertEquals(1, count("one"));
}
@Test
public void oneCompleteLine() throws IOException {
assertEquals(1, count("one\n"));
}
@Test
public void twoCompleteLines() throws IOException {
assertEquals(2, count("one\ntwo\n"));
}
@Test
public void twoLinesWithoutNewlineAtEnd() throws IOException {
assertEquals(2, count("one\ntwo"));
}
@Test
public void aFewLines() throws IOException {
assertEquals(5, count("one\ntwo\nthree\nfour\nfive\n"));
}
And it looks like this:
static long countLines(InputStream is) throws IOException {
try(LineNumberReader lnr = new LineNumberReader(new InputStreamReader(is))) {
char[] buf = new char[8192];
int n, previousN = -1;
//Read will return at least one byte, no need to buffer more
while((n = lnr.read(buf)) != -1) {
previousN = n;
}
int ln = lnr.getLineNumber();
if (previousN == -1) {
//No data read at all, i.e file was empty
return 0;
} else {
char lastChar = buf[previousN - 1];
if (lastChar == '\n' || lastChar == '\r') {
//Ending with newline, deduct one
return ln;
}
}
//normal case, return line number + 1
return ln + 1;
}
}
If you want intuitive results, you may use this. If you just want wc -l compatibility, simply use @er.vikas's solution, but don't add one to the result and retry the skip:
try(LineNumberReader lnr = new LineNumberReader(new FileReader(new File("File1")))) {
while(lnr.skip(Long.MAX_VALUE) > 0){};
return lnr.getLineNumber();
}
How about using the Process class from within Java code? And then reading the output of the command.
Process p = Runtime.getRuntime().exec("wc -l " + yourfilename);
p.waitFor();
BufferedReader b = new BufferedReader(new InputStreamReader(p.getInputStream()));
String line = "";
int lineCount = 0;
while ((line = b.readLine()) != null) {
System.out.println(line);
// wc -l prints "<count> <filename>", so parse only the first token
lineCount = Integer.parseInt(line.trim().split("\\s+")[0]);
}
Need to try it though. Will post the results.
It seems that there are a few different approaches you can take with LineNumberReader.
I did this:
int lines = 0;
FileReader input = new FileReader(fileLocation);
LineNumberReader count = new LineNumberReader(input);
String line = count.readLine();
if(count.ready())
{
while(line != null) {
lines = count.getLineNumber();
line = count.readLine();
}
lines+=1;
}
count.close();
System.out.println(lines);
Even more simply, you can use the BufferedReader lines() method to get a stream of the lines, and then use the Stream count() method to count them. No adjustment is needed, because lines() also yields a final line that has no trailing newline.
For example:
FileReader input = new FileReader(fileLocation);
LineNumberReader count = new LineNumberReader(input);
int lines = (int) count.lines().count();
count.close();
System.out.println(lines);
This funny solution actually works really well!
public static int countLines(File input) throws IOException {
try (InputStream is = new FileInputStream(input)) {
int count = 1;
for (int aChar = 0; aChar != -1;aChar = is.read())
count += aChar == '\n' ? 1 : 0;
return count;
}
}
On Unix-based systems, use the wc command on the command-line.
The only way to know how many lines there are in a file is to count them. You can of course derive a metric from your data giving the average length of one line, and then divide the file size by that average, but the result won't be accurate.
If you don't have any index structures, you can't avoid reading the complete file. But you can optimize it by not reading it line by line and instead using a regex to match all line terminators.
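As an illustration of the regex idea, here is a minimal sketch that counts line terminators in text that is already in memory (the class name is hypothetical). For a really large file you would have to apply it chunk by chunk and take care of a "\r\n" pair that is split across two chunks, which this sketch does not attempt.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TerminatorCount {

    // "\r\n" is listed first so that a Windows line ending counts once, not twice.
    private static final Pattern TERMINATOR = Pattern.compile("\r\n|\r|\n");

    /** Counts line terminators in the given text. */
    static int countTerminators(CharSequence text) {
        Matcher m = TERMINATOR.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }
}
```

Note that this counts terminators, not lines, so a last line without a trailing newline still has to be added by the caller if that is the desired convention.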
Optimized code for multi-line files that have no newline ('\n') character at EOF:
/**
*
* @param filename
* @return
* @throws IOException
*/
public static int countLines(String filename) throws IOException {
int count = 0;
boolean empty = true;
FileInputStream fis = null;
InputStream is = null;
try {
fis = new FileInputStream(filename);
is = new BufferedInputStream(fis);
byte[] c = new byte[1024];
int readChars = 0;
boolean isLine = false;
while ((readChars = is.read(c)) != -1) {
empty = false;
for (int i = 0; i < readChars; ++i) {
if ( c[i] == '\n' ) {
isLine = false;
++count;
}else if(!isLine && c[i] != '\n' && c[i] != '\r'){ //Case to handle line count where no New Line character present at EOF
isLine = true;
}
}
}
if(isLine){
++count;
}
}catch(IOException e){
e.printStackTrace();
}finally {
if(is != null){
is.close();
}
if(fis != null){
fis.close();
}
}
LOG.info("count: "+count);
return (count == 0 && !empty) ? 1 : count;
}
Scanner with regex:
public int getLineCount() {
Scanner fileScanner = null;
int lineCount = 0;
Pattern lineEndPattern = Pattern.compile("(?m)$");
try {
fileScanner = new Scanner(new File(filename)).useDelimiter(lineEndPattern);
while (fileScanner.hasNext()) {
fileScanner.next();
++lineCount;
}
}catch(FileNotFoundException e) {
e.printStackTrace();
return lineCount;
}
fileScanner.close();
return lineCount;
}
Haven't clocked it.
If you use this:
public int countLines(String filename) throws IOException {
LineNumberReader reader = new LineNumberReader(new FileReader(filename));
int cnt = 0;
String lineRead = "";
while ((lineRead = reader.readLine()) != null) {}
cnt = reader.getLineNumber();
reader.close();
return cnt;
}
you can't count beyond Integer.MAX_VALUE (about 2.1 billion) lines, because reader.getLineNumber() returns an int. For files with more lines than that, you need a long counter.
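Since LineNumberReader.getLineNumber() returns an int, a counter of type long has to be maintained by hand. A minimal sketch with a hypothetical class name, using plain readLine() (so it will not be the fastest option in this thread):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

final class LongLineCount {

    /** Counts lines with a long counter; a final line without a newline is counted too. */
    static long count(Reader source) throws IOException {
        long lines = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            while (br.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }
}
```

It would be called as LongLineCount.count(new FileReader(filename)); an empty input gives 0, and "one\ntwo" gives 2.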
