I need to do two processes on a file: first, count the number of lines and compare the count with a value.
The second is to read through the file line by line and do validations.
Only if the first one passes do I need to do the second process.
I read the file using FTP.
When I try to create a different input stream, FTP is busy reading the current file,
like this:
(is1 = ftp.getFile(feedFileName);)
Below is the rest of the code:
InputStream is = null;
LineNumberReader lin = null;
LineNumberReader lin1 = null;
is = ftp.getFile(feedFileName);
lin = new LineNumberReader(new InputStreamReader(is));
So can I just do the following:
is1 = is;
Will both streams have the file contents from start to finish, or will the second one have nothing to read as soon as the first stream has been consumed?
Or is the only option left to create a new FTP object and read a separate stream?
It can work, but you would need to "rewind" the InputStream: first call the mark() method on it, and later call reset(). Here are the docs: http://docs.oracle.com/javase/6/docs/api/java/io/InputStream.html#reset()
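For example, a rough sketch of that approach, assuming you wrap the FTP stream in a BufferedInputStream (plain streams often do not support mark/reset) and that the whole file fits within the read-ahead limit passed to mark():

InputStream raw = ftp.getFile(feedFileName);             // same ftp object as in the question
BufferedInputStream is = new BufferedInputStream(raw);
is.mark(10 * 1024 * 1024);                               // read limit: must be >= the file size in bytes
LineNumberReader counter = new LineNumberReader(new InputStreamReader(is));
while (counter.readLine() != null) { /* just counting */ }
int lineCount = counter.getLineNumber();

is.reset();                                              // rewind the buffered stream to the mark
LineNumberReader lin = new LineNumberReader(new InputStreamReader(is));
// ... second pass: line-by-line validation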
After you are done with the LineNumberReader, close the InputStream is. Then re-request the file from FTP; it will no longer be busy. You cannot 'just' read from the same InputStream, as that one is probably exhausted by the time the LineNumberReader is done. Furthermore, not all InputStreams support the mark() and reset() methods.
However, I'd suggest that doing the second process only when the first one succeeds might not require reading the file twice at all. Since you're streaming the data anyway, why not stream it into a temporary data structure, count the lines there, and then operate on the same structure?
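A rough sketch of that idea, assuming the file fits in memory; expectedLineCount and validateLine are placeholders, not part of the original code:

List<String> lines = new ArrayList<>();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(ftp.getFile(feedFileName)))) {
    String line;
    while ((line = reader.readLine()) != null) {
        lines.add(line);
    }
}
if (lines.size() == expectedLineCount) {   // first process: check the line count
    for (String line : lines) {            // second process: validate line by line
        validateLine(line);                // hypothetical validation method
    }
}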
If your file is not big, you can save the data to a String,
like:
StringBuilder sb = new StringBuilder();
Reader reader = new InputStreamReader(is);   // add an explicit charset if you know the file's encoding
char[] buffer = new char[1024];
int len;
while ((len = reader.read(buffer)) != -1)
    sb.append(buffer, 0, len);
String data = sb.toString();
Then you can do further things with the String, like:
int lineNumber = data.split("\n").length;
Related
How to read a file twice, either using a buffered reader or using the stream twice?
I need to manipulate large amounts of data in the code, so performance needs to be considered.
Sample code 1 below gives the exception "stream closed":
URL url = new URL("http://www.google.com");
InputStream in = url.openStream();
BufferedReader br = new BufferedReader(new InputStreamReader(in));
Stream<String> ss = br.lines();                     // read all the lines
List<String> ll = ss.collect(Collectors.toList());
br.close();
br = new BufferedReader(new InputStreamReader(in)); // exception occurs: stream closed
Sample code 2 below gives the exception "stream closed/being used":
URL url = new URL("http://www.google.com");
InputStream in = url.openStream();
BufferedReader br = new BufferedReader(new InputStreamReader(in));
Stream<String> ss = br.lines();                     // read all the lines
List<String> ll = ss.collect(Collectors.toList());
List<String> xx = ss.collect(Collectors.toList()); // exception occurs: the stream has already been operated upon
Please ignore the syntax; it's just draft code.
Kindly suggest.
Here is an example. You can use it to read the stored data as many times as you wish.
BufferedReader br = new BufferedReader(new FileReader("users/desktop/xxx.txt"));
String strLine;
List<String> ans = new ArrayList<String>();
// Read rows
while ((strLine = br.readLine()) != null) {
    System.out.println(strLine);
    ans.add(strLine);
}
br.close();
// Read the stored lines again, as many times as you like
for (String result : ans) {
    System.out.println(result);
}
Reference:
https://www.dreamincode.net/forums/topic/272652-reading-from-same-file-twice/
You cannot. A stream is just like its real-life watery counterpart. You can observe the water going under the bridge you're standing on, but you can't instruct the water to go back to the top of the hill so that you can observe it again.
Either have each consumer process each line before moving on to the next line, or if that is not possible then you will need to create your own "buffer" of the entire thing: i.e. store each line in a Collection<String>, which the second (and third, and fourth...) consumer can iterate over. The potential problem with this is the bigger memory overhead, but the HTML of most websites is not likely to be much of a problem in this regard.
Your last example can be trivially fixed by copying the list.
List<String> ll = ss.collect(Collectors.toList());
List<String> xx = new ArrayList<>(ll);
In terms of use, a stream is somewhat analogous to an iterator in that it can only be consumed once.
If you want to use the contents of the same stream again you need to create a new stream as you did the first.
As of Java 12, you can pass values of the same stream into two branches by using the Collectors.teeing() method.
list.stream().collect(Collectors.teeing(
        collector1,                                      // do something with the stream
        collector2,                                      // do something else with the stream
        (result1, result2) -> merge(result1, result2))); // BiFunction used to merge the two results
You can also wrap the stream creation in a Supplier<Stream<String>>, but this only helps if the supplier opens a fresh source each time it is called; wrapping the same BufferedReader's lines() twice will not give you two independent streams, because both would be consuming the same reader.
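A minimal sketch of that Supplier pattern, assuming the data lives in a file so that every call can open a fresh stream (the path is a placeholder):

Supplier<Stream<String>> lines = () -> {
    try {
        return Files.lines(Paths.get("data.txt"));   // hypothetical file; a fresh stream on every call
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
};

long count;
try (Stream<String> first = lines.get()) {
    count = first.count();                           // first, independent pass
}
try (Stream<String> second = lines.get()) {
    second.forEach(System.out::println);             // second, independent pass
}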
I need to pipe data into another process. The data is an array of strings that I concatenated into one large string. The external process accepts a text file. Currently, I am writing the string into a ByteArrayOutputStream but is there a better way to do this?
public OutputStream generateBoxFile() throws IOException {
OutputStream boxStream = new ByteArrayOutputStream();
for (String boxLine : boxLines) {
boxLine += "\n";
boxStream.write(boxLine.getBytes(Charset.forName("UTF-8")));
}
return boxStream;
}
EDIT: For further clarifications, I am launching a program called trainer which accepts a text file. So I would invoke this program like this in the shell ./trainer textfile. However, I want to do everything in memory, so I'm looking for a good way to write data into a temporary file that is not on disk and then feed this into trainer.
The simplest way to write a collection of Strings to a file is to use a PrintWriter:
public static void writeToFile(String filename, Iterable<String> strings) throws FileNotFoundException {
try (PrintWriter pw = new PrintWriter(filename)) {
for(String str : strings)
pw.println(str);
}
}
If you need to write UTF-8 you can change the encoding with
try (PrintWriter pw = new PrintWriter(
        new OutputStreamWriter(new FileOutputStream(filename), "UTF-8"))) {
You can easily pipe data to a process you've launched through its standard input stream. In the parent process, you can access the child's standard input stream through Process.getOutputStream().
This does require your child process to accept data through standard input rather than a file. Your child process currently gets its input from a file. Fortunately, you note in a comment that you own the code of the child process.
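A rough sketch of that approach, assuming trainer is changed to read its text from standard input; the "-" argument is only an assumption about how you might signal that, while boxLines comes from the question:

ProcessBuilder pb = new ProcessBuilder("./trainer", "-");   // "-" assumed to mean "read from stdin"
pb.redirectErrorStream(true);
Process trainer = pb.start();

// The child's standard input appears in the parent as an OutputStream.
try (Writer toChild = new OutputStreamWriter(trainer.getOutputStream(), StandardCharsets.UTF_8)) {
    for (String boxLine : boxLines) {
        toChild.write(boxLine);
        toChild.write('\n');
    }
}                                                           // closing the stream signals end-of-input

int exitCode = trainer.waitFor();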
Can I share an InputStream or OutputStream?
For example, let's say I first have:
DataInputStream incoming = new DataInputStream(socket.getInputStream()));
...incoming being an object variable. Later on I temporarily do:
BufferedReader dataReader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
I understand that the stream is concrete and reading from it will consume its input, no matter where the reading is done from. But after doing the above, can I still access both incoming and dataReader simultaneously, or is the InputStream connected to just ONE object, so that incoming loses its input once I declare dataReader? I understand that if I close dataReader then I will close the socket as well, so I will refrain from that, but I'm wondering whether I need to "reclaim" the InputStream somehow for incoming after having "transferred" it to dataReader. Do I have to do:
incoming = new DataInputStream(socket.getInputStream());
again after this whole operation?
You are using a teaspoon and a shovel to move dirt from a hole.
I understand that the stream is concrete and reading from it will
consume its input, no matter from where it's done
Correct. The teaspoon and shovel both move dirt from the hole. If you are removing dirt asynchronously (i.e. concurrently) you could get into fights about who has what dirt, so use a concurrency construct to provide mutually exclusive access. If access is not concurrent, in other words ...
1) move one or more teaspoons of dirt from the hole
2) move one or more shovels of dirt from the hole
3) move one or more teaspoons of dirt from the hole
...
No problem. Teaspoon and shovel both remove dirt. But once dirt gets removed, it's removed, they do not get the same dirt. Hope this helps. Let's start shovelling, I'll use the teaspoon. :)
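If the two consumers really do run concurrently, a minimal sketch of the mutual exclusion mentioned above might look like this (streamLock is a made-up name, and the buffering caveat in the next answer still applies):

final Object streamLock = new Object();          // hypothetical shared lock

// "teaspoon" consumer:
synchronized (streamLock) {
    int b = incoming.read();                     // only one consumer touches the stream at a time
    // ... handle the byte
}

// "shovel" consumer, elsewhere:
synchronized (streamLock) {
    String line = dataReader.readLine();
    // ... handle the line
}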
As fast-reflexes found, be very careful about sharing streams, particularly buffered readers since they can gobble up a lot more bytes off the stream than they need, so when you go back to your other input stream (or reader) it may look like a whole bunch of bytes have been skipped.
Proof you can read from same input stream:
import java.io.*;
public class w {
public static void main(String[] args) throws Exception {
InputStream input = new FileInputStream("myfile.txt");
DataInputStream b = new DataInputStream(input);
int data, count = 0;
// read first 20 characters with DataInputStream
while ((data = b.read()) != -1 && ++count < 20) {
System.out.print((char) data);
}
// if prematurely interrupted because of count
// then spit out last char grabbed
if (data != -1)
System.out.print((char) data);
// read remainder of file with underlying InputStream
while ((data = input.read()) != -1) {
System.out.print((char) data);
}
b.close();
}
}
Input file:
hello OP
this is
a file
with some basic text
to see how this
works when moving dirt
from a hole with a teaspoon
and a shovel
Output:
hello OP
this is
a file
with some basic text
to see how this
works when moving dirt
from a hole with a teaspoon
and a shovel
Proof to show BufferedReader is NOT guaranteed to work, as it gobbles up lots of chars from the stream:
import java.io.*;
public class w {
public static void main(String[] args) throws Exception {
InputStream input = new FileInputStream("myfile.txt");
BufferedReader b = new BufferedReader(new InputStreamReader(input));
// read three lines with BufferedReader
String line;
for (int i = 0; (line = b.readLine()) != null && i < 3; ++i) {
System.out.println(line);
}
// read remainder of file with underlying InputStream
int data;
while ((data = input.read()) != -1) {
System.out.print((char) data);
}
b.close();
}
}
Input file (same as above):
hello OP
this is
a file
with some basic text
to see how this
works when moving dirt
from a hole with a teaspoon
and a shovel
Output:
hello OP
this is
a file
This will be disastrous. Both streams will have corrupted data. How could Java possibly know which data to send to which Stream?
If you need to do two different things with the same data, you're better off storing it somewhere (possibly copying it into two Queue<String>), and then reading it that way.
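A small sketch of that buffering idea, assuming line-oriented data that ends when the peer closes the connection; the queue names are made up:

BufferedReader reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
Queue<String> forParsing = new ArrayDeque<>();
Queue<String> forLogging = new ArrayDeque<>();

String line;
while ((line = reader.readLine()) != null) {     // read the data once
    forParsing.add(line);                        // ...and give each consumer its own copy
    forLogging.add(line);
}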
OK, I solved this myself. Interesting links:
http://www.coderanch.com/t/276168//java/InputStream-multiple-Readers
Multiple readers for InputStream in Java
Basically, the InputStream can be connected to multiple objects reading from it and consuming it. However, a BufferedReader reads ahead, so when one of those is involved it might be a good idea to implement some sort of signal for when you're switching from, say, a BufferedReader to a DataInputStream (that is, when you suddenly want the DataInputStream, rather than the BufferedReader, to process the InputStream). So I stop sending data to the InputStream once I know that all the data intended for the BufferedReader has been sent. After this, I wait for the other side to process what it should with the BufferedReader. It then sends a signal to show that it's ready for new input. The sending part blocks until it receives that signal and then starts sending data again. If I don't use the BufferedReader after this point, it won't have a chance to buffer up all the input and "steal" it from the DataInputStream, and everything works very well :) But be careful: one read operation from the BufferedReader and you will be back in the same situation... Good to know!
I need to read lines of an HTML page at a URL, starting from a specific line.
For now, I have the following code:
u = new URL("http://s.ll/message/" + counter);
is = u.openStream(); // throws an IOException
dis = new DataInputStream(new BufferedInputStream(is));
while ((s = dis.readLine()) != null) {
    if (s.contains("%")) {
        ...
    }
}
I know that this content will not be before the 50th line.
How can I read just from this line?
And is it the quickest way to read URLs?
How can I read just from this line?
Count the lines and ignore the line when the count is below 50. There's no magic way to go straight to line 50 other than just reading the stream and counting the lines. The stream has to be read in anyway.
And is it the quickest way to read URLs?
Depends. However, a more common approach is BufferedReader + InputStreamReader wherein you specify the charset the webpage is encoded in to avoid mojibake.
You're on the right track. To read data from URLs, the simplest way is to just use the URL object. For more complicated HTTP communication tasks you might consider HTTPClient.
The method you're using, DataInputStream.readLine(), is deprecated since you can't provide the character set used when converting from bytes to strings.
I'd do like this:
u = new URL("http://s.ll/message/" + counter);
is = u.openStream(); // throws an IOException
// XXX notice the charset set to utf-8 here.
BufferedReader reader = new BufferedReader(new InputStreamReader(is, "utf-8"));
while ((s = reader.readLine()) != null) {
    if (s.contains("%")) {
        ...
    }
}
Finding the 50th line requires you to skip to it. Since you can't know at which byte offset into the stream the 50th '\n' (or '\r' or '\r\n', depending on Unix, Mac or Windows line breaks) is, you simply have to count from the beginning.
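A small sketch of that counting approach, building on the reader from the answer above; the cutoff of 50 comes from the question, and the loop structure is just one way to lay it out:

BufferedReader reader = new BufferedReader(new InputStreamReader(is, "utf-8"));
String s;
int lineNo = 0;
while ((s = reader.readLine()) != null) {
    lineNo++;
    if (lineNo < 50) {
        continue;                    // lines 1-49 can't contain what we're looking for, so skip them
    }
    if (s.contains("%")) {
        // ... process the matching line
    }
}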
I currently use the following function to do a simple HTTP GET.
public static String download(String url) throws java.io.IOException {
java.io.InputStream s = null;
java.io.InputStreamReader r = null;
//java.io.BufferedReader b = null;
StringBuilder content = new StringBuilder();
try {
s = (java.io.InputStream)new URL(url).getContent();
r = new java.io.InputStreamReader(s);
//b = new java.io.BufferedReader(r);
char[] buffer = new char[4*1024];
int n = 0;
while (n >= 0) {
n = r.read(buffer, 0, buffer.length);
if (n > 0) {
content.append(buffer, 0, n);
}
}
}
finally {
//if (b != null) b.close();
if (r != null) r.close();
if (s != null) s.close();
}
return content.toString();
}
I see no reason to use the BufferedReader since I am just going to download everything in sequence. Am I right in thinking there is no use for the BufferedReader in this case?
In this case, I would do as you are doing (buffer with your own char array rather than one of the stream buffers).
There are exceptions, though. One place you see buffers (output this time) is in the servlet API. Data isn't written to the underlying stream until flush() is called, allowing you to buffer output but then dump the buffer if an error occurs and write an error page instead. You might buffer input if you needed to reset the stream for rereading using mark(int) and reset(). For example, maybe you'd inspect the file header before deciding on which content handler to pass the stream to.
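For instance, a rough sketch of that header-inspection idea, assuming a BufferedInputStream (which supports mark/reset) and a hypothetical PNG magic-number check; the file path and handler methods are made up:

BufferedInputStream in = new BufferedInputStream(new FileInputStream("upload.bin")); // hypothetical file
byte[] header = new byte[4];
in.mark(header.length);                 // remember this position for up to 4 bytes
int read = in.read(header);
in.reset();                             // rewind so the chosen handler sees the full stream
if (read == 4 && header[0] == (byte) 0x89 && header[1] == 'P' && header[2] == 'N' && header[3] == 'G') {
    handlePng(in);                      // hypothetical handler for PNG content
} else {
    handleRaw(in);                      // hypothetical fallback handler
}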
Unrelated, but I think you should rewrite your stream handling. This pattern works best to avoid resource leaks:
InputStream stream = new FileInputStream("in");
try { //no operations between open stream and try block
//work
} finally { //do nothing but close this one stream in the finally
stream.close();
}
If you are opening multiple streams, nest try/finally blocks.
Another thing your code is doing is making the assumption that the returned content is encoded in your VM's default character set (though that might be adequate, depending on the use case).
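If you want to pin the encoding down, a small change to the method above would be, for example (UTF-8 is only an assumption here; ideally you would take the charset from the response's Content-Type header):

r = new java.io.InputStreamReader(s, java.nio.charset.StandardCharsets.UTF_8);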
You are correct: rather than a BufferedReader for reading HTTP content and headers, you will want an InputStreamReader so you can read byte for byte.
BufferedReader in this scenario sometimes does weird things... especially when it comes to reading HTTP POST headers; sometimes you will be unable to read the POST data. If you use the InputStreamReader you can read the content length and then read exactly that many bytes...
Each invocation of one of an InputStreamReader's read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
My gut tells me that since you're already performing buffering by using the byte array, it's redundant to use the BufferedReader.