I am trying to read text from a web document using a BufferedReader over an InputStreamReader on a URL (to the file on some Apache server).
String result = "";
URL url = new URL("http://someserver.domain/somefile");
BufferedReader in = null;
in = new BufferedReader(new InputStreamReader(url.openStream(), "iso-8859-1"));
result += in.readLine();
Now this works just fine. But obviously I'd like the reader to read not just one line, but as many as there are in the file.
Looking at the BufferedReader API the following code should do just that:
while (in.ready()) {
result += in.readLine();
}
I.e. read all lines while there are more lines, stop when no more lines are there. This code does not work however - the reader just never reports ready() = true!
I can even print the ready() value right before reading a line (which reads the correct string from the file) but the reader will report 'false'.
Am I doing something wrong? Why does the BufferedReader return 'false' on ready when there is actually stuff to read?
ready() != has more
ready() does not indicate that there is more data to be read. It only tells you whether a read could block the thread. It is likely to return false before you have read all the data.
To find out if there is no more data check if readLine() returns null.
String line = in.readLine();
while(line != null){
...
line = in.readLine();
}
Another way you can do this that bypasses the in.ready() is something like:
while ((nextLine = in.readLine()) != null) {
result += nextLine;
}
You will just continue reading until you are done. This way you do not need to worry about the problem with in.ready().
I think the standard way to write this is to just attempt to read the line and verify that it returned something. Something like this:
String nextLine;
while ((nextLine = in.readLine()) != null) {
//System.out.println(nextLine);
result += nextLine;
}
So you just continue to go until you get null returned from the stream. See here for extra information:
http://download.oracle.com/javase/1.5.0/docs/api/java/io/BufferedReader.html#readLine()
The BufferedReader.ready() method is behaving as specified:
The Reader.ready() javadoc says the following:
[Returns] true if the next read() is guaranteed not to block for input, false otherwise. Note that returning false does not guarantee that the next read will block.
Then the BufferedReader.ready() javadoc says the following:
Tells whether this stream is ready to be read. A buffered character stream is ready if the buffer is not empty, or if the underlying character stream is ready.
If you put these two together, it is clear that BufferedReader.ready() can return false in situations where there are characters available. In short, you shouldn't rely on ready() to test for logical end-of-file or end-of-stream.
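To make the point above concrete, here is a minimal sketch. The class and helper names (ReadyVersusReadLine, neverReady, readAllLines) are made up for illustration, and the wrapper that always answers false from ready() is only a stand-in for a slow source such as a network stream. It shows ready() reporting false while readLine() still delivers every line.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReadyVersusReadLine {

    /** Hypothetical wrapper: ready() always answers false, as a slow source might. */
    static Reader neverReady(Reader source) {
        return new FilterReader(source) {
            @Override
            public boolean ready() {
                return false; // means "cannot promise a non-blocking read", not "no data"
            }
        };
    }

    /** Reads every line until readLine() returns null, the actual end-of-stream signal. */
    static String readAllLines(BufferedReader in) throws IOException {
        StringBuilder result = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            result.append(line).append('\n'); // readLine() strips the line terminator
        }
        return result.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in =
                new BufferedReader(neverReady(new StringReader("first\nsecond\n")));
        System.out.println("ready before reading: " + in.ready());
        System.out.print(readAllLines(in));
    }
}
```

A loop guarded by in.ready() would never run its body here, while the null check reads the whole input.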
This is what we have been using consistently for years - not sure if it is the "standard" method. I'd like to hear comments about the pros and cons of using URL.openStream() directly, and whether that is causing the OP's problems. This code works for both HTTP and HTTPS connections.
URL getURL = new URL (servletURL.toString() + identifier+"?"+key+"="+value);
URLConnection uConn = getURL.openConnection();
BufferedReader br = new BufferedReader (new
InputStreamReader (uConn.getInputStream()));
for (String s = br.readLine() ; s != null ; s = br.readLine()) {
System.out.println ("[ServletOut] " + s);
// do stuff with s
}
br.close();
The BufferedReader.ready() method can be used to check whether the underlying stream is ready to provide data to the caller; if it is not, the thread can wait for a while until it becomes ready.
But the real problem is that ready() also returns false once the stream has been read completely,
so we cannot tell whether the stream has been fully read or the underlying stream is just busy.
If you want to use in.ready(), the following worked for me well:
for (int i = 0; i < 10; i++) {
System.out.println("is InputStreamReader ready: " + in.ready());
if (!in.ready()) {
Thread.sleep(1000);
} else {
break;
}
}
Related
I use this code snippet to read text from a webpage and save it to a string.
I would like the readLine() function to start from the beginning, so that it reads the content of the webpage again. How can I do that?
if (response == httpURLConnection.HTTP_OK) {
in = httpURLConnection.getInputStream();
isr = new InputStreamReader(in);
br = new BufferedReader(isr);
while ((line = br.readLine()) != null) {
fullText += line;
}
// I want to go through a webpage source again, but
// I can't because br.readLine() = null. How can I put
// put a marker on the beginning of the page?
while ((line1 = br.readLine()) != null) {
fullText1 += line1;
// It will not go into this loop
}
You can only mark a position for a Reader (and return to it with reset()) if markSupported returns true, and I very much doubt that the stream returned by httpURLConnection.getInputStream() supports marks.
The best option, I think, is to read the response into a buffer and then you can create as many readers as you like over that buffer. You will need to include the line termination characters (which you are currently discarding) to preserve the line structure. (Alternatively, you can read the response into a List<String> rather than into a single String.)
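A rough sketch of the buffering approach, assuming the whole response fits comfortably in memory. The names BufferedResponse, readFully and countLines are invented for this example, and a StringReader stands in for the one-shot HTTP response stream so the snippet runs without a network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class BufferedResponse {

    /** Drains the reader into a String, keeping the line terminators intact. */
    static String readFully(Reader source) throws IOException {
        StringBuilder buffer = new StringBuilder();
        char[] chunk = new char[4096];
        int n;
        while ((n = source.read(chunk)) != -1) {
            buffer.append(chunk, 0, n);
        }
        return buffer.toString();
    }

    /** Counts lines by opening a fresh reader over the buffered text. */
    static int countLines(String text) throws IOException {
        int count = 0;
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            while (br.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for new InputStreamReader(httpURLConnection.getInputStream(), charset).
        String page = readFully(new StringReader("<html>\n<body/>\n</html>\n"));
        System.out.println("first pass:  " + countLines(page));
        System.out.println("second pass: " + countLines(page));
    }
}
```

Because each pass opens its own StringReader over the same String, the second loop starts from the beginning again without needing mark() or reset().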
From "InputStream will not reset to beginning":
Check with the markSupported() method whether your InputStream actually supports mark. According to the API the InputStream class doesn't, but the java.io.BufferedInputStream class does. Maybe you should embed your stream inside a BufferedInputStream object like:
InputStream data = new BufferedInputStream(realResponse.getEntity().getContent());
// data.markSupported() should return "true" now
data.mark(some_size);
// work with "data" now
...
data.reset();
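For a version of this that runs on its own, here is a sketch under a few assumptions: the class and method names are made up, the "unmarkable" stream is a hypothetical stand-in for an HTTP response stream, and the read limit passed to mark() is simply the number of bytes that will be re-read.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MarkResetSketch {

    /** Hypothetical stand-in for a response stream that reports no mark support. */
    static InputStream withoutMarkSupport(byte[] content) {
        return new FilterInputStream(new ByteArrayInputStream(content)) {
            @Override
            public boolean markSupported() {
                return false;
            }
        };
    }

    /** Reads up to `length` bytes, looping because read() may return fewer. */
    static String readPrefix(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        int total = 0;
        while (total < length) {
            int n = in.read(buf, total, length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        return new String(buf, 0, total, StandardCharsets.ISO_8859_1);
    }

    /** Marks, reads a prefix, resets, and reads the same prefix again. */
    static String[] readPrefixTwice(InputStream raw, int length) throws IOException {
        InputStream data = new BufferedInputStream(raw);
        data.mark(length); // the mark stays valid while at most `length` bytes are read
        String first = readPrefix(data, length);
        data.reset();
        String second = readPrefix(data, length);
        return new String[] {first, second};
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "hello world".getBytes(StandardCharsets.ISO_8859_1);
        String[] passes = readPrefixTwice(withoutMarkSupport(body), 5);
        System.out.println(passes[0] + " / " + passes[1]);
    }
}
```

Note that reset() only works if no more than the read limit has been consumed since mark(), so this suits re-reading a bounded prefix rather than an arbitrarily large page.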
I am trying to parse HTML from a website to get very specific data. The following method reads the source and outputs it as a string to be processed by other methods.
StringBuilder source = new StringBuilder();
URL url = new URL(urlIn);
URLConnection spoof;
spoof = url.openConnection();
spoof.setRequestProperty( "User-Agent", "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0; H010818)" );
BufferedReader in = new BufferedReader(new InputStreamReader(spoof.getInputStream()));
String strLine = "";
while ((strLine = in.readLine()) != null){
source.append(strLine);
}
return source.toString();
The problem that I'm having is that since I call this method multiple times with a different urlIn argument each time, sometimes the method gets stuck at the readLine command. I read that this is because readLine looks for a line break and if the BufferedReader object does not contain one for whatever reason, it will be stuck indefinitely.
Is there a way to check whether my BufferedReader object contains a line break before I run the readLine command? I tried using an if (in.toString().contains("\n")) but that always returns false. Alternatively, could I add a "\n" at the end of my BufferedReader "in" object every time just so that the while loop would break and not hang up indefinitely?
Any help would be appreciated.
Okay, this here should be what you are looking for.
fis = new FileInputStream("C:/sample.txt");
reader = new BufferedReader(new InputStreamReader(fis));
System.out.println("Reading File line by line using BufferedReader");
String line = reader.readLine();
while(line != null){
System.out.println(line);
line = reader.readLine();
}
Read more: http://javarevisited.blogspot.com/2012/07/read-file-line-by-line-java-example-scanner.html
Edit: in your case, since it seems like you are doing webapp testing, I believe WebDriverWait may work for your needs.
This is not true. BufferedReader.readLine() will not block if the underlying stream has reached the end of input. It will return null. See http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html#readLine().
If your method is getting stuck there is another explanation.
Carefully check all of your exception handling and stream closing logic.
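A small sketch that illustrates the point about end of input. The class name LastLineWithoutNewline and the sample text are made up; the input deliberately ends without a line terminator. readLine() still returns the final line and then null, rather than waiting for a line break.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class LastLineWithoutNewline {

    /** Collects every line; the loop ends when readLine() returns null. */
    static List<String> lines(String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Note: no "\n" after the last line.
        System.out.println(lines("first\nlast line has no terminator"));
    }
}
```

So a missing trailing line break does not by itself explain a hang; a stream that stays open without delivering data is a different situation.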
I added Findbugs plugin to my project and I suddenly started getting the following bug: Dereference of the result of readLine() without nullcheck
I have the following code which reads the http request line by line:
InputStream input = clientSocket.getInputStream();
String line;
while (!(line = in.readLine()).equals("")) {
...
}
I tried rewriting this into another form with a null check:
String line = "";
while (line != null) {
line = in.readLine();
if (line.equals("")) return;
}
But this gets stuck forever (so it is not rewritten correctly). I am sorry for such a basic question but I can't seem to get it right...
Another thing that is marked as bug is Found reliance on default encoding in ..InputStream...
How can I specify encoding in InputStreamReader?
The fixed loop looks like so:
InputStream input = clientSocket.getInputStream();
String line;
while (null != (line = in.readLine())) {
if("".equals(line)) break;
...
}
Why? First of all, if the remote side (the client) closes the connection, readLine() will return null. That's what the outer check guards against.
readLine() won't return at all if the client just stops sending data. So as long as the client keeps the connection open, your "fixed" loop hangs.
When comparing against string literals, I always put the literal first:
"".equals(line)
never fails, even when line is null. It's also often more readable since you often want to know what you're comparing against; the variable which you want to check is less "informative".
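A tiny sketch of the literal-first comparison, with a made-up helper name (isBlankLine), just to show that a null line is handled without an exception:

```java
final class LiteralFirstEquals {

    /** True only for the empty string; a null line simply yields false. */
    static boolean isBlankLine(String line) {
        return "".equals(line);
    }

    public static void main(String[] args) {
        System.out.println(isBlankLine(""));
        System.out.println(isBlankLine("Host: example"));
        System.out.println(isBlankLine(null)); // line.equals("") would throw here
    }
}
```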
Apparently readLine can return null, so you have to check it after the line = in.readLine();
Your updated code could still throw a NullPointerException, if readLine returned null.
I doubt that your change will work, since the check is being made on the previous value of line. If your previous line was valid (but you were reading the last line), any subsequent call can yield a NullPointerException.
To get around this, the following pattern is usually applied:
InputStream input = clientSocket.getInputStream();
String line = "";
while ((line = in.readLine()) != null) {
...
}
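On the encoding warning, which the answers above do not address: InputStreamReader has a constructor that takes a Charset, so the default encoding never comes into play. Below is a sketch that combines it with the null-safe loop. The class name RequestHeadReader and the choice of ISO-8859-1 are assumptions for illustration; pick whatever charset your protocol actually specifies.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class RequestHeadReader {

    /** Reads lines up to the first blank line or end of stream, whichever comes first. */
    static List<String> readHead(InputStream input) throws IOException {
        // Explicit charset (an assumption here) instead of the platform default.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(input, StandardCharsets.ISO_8859_1));
        List<String> head = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) { // null: the peer closed the connection
            if (line.isEmpty()) {
                break; // blank line: end of the request head
            }
            head.add(line);
        }
        return head; // the reader is left open on purpose; closing it would close the socket
    }

    public static void main(String[] args) throws IOException {
        byte[] request = "GET / HTTP/1.1\r\nHost: example\r\n\r\nbody"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readHead(new ByteArrayInputStream(request)));
    }
}
```

In real use the argument would be clientSocket.getInputStream(); the byte array only keeps the example self-contained.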
I have following piece of code :
fis = new FileInputStream(new File(st[0]));
br = new BufferedReader(new InputStreamReader(fis));
while(fis.available()!=-1)
{
System.out.println(br.readLine());
System.out.println(fis.available());
}
The first println statement prints whole of my file but alongside second println statement always shows 0. Why is it showing 0 when there is actual content to read?
And what should I use as the end condition here?
You want to stop when readLine() returns null, something like this:
String sCurrentLine;
br = new BufferedReader(new FileReader("C:\\testing.txt"));
while ((sCurrentLine = br.readLine()) != null) {
System.out.println(sCurrentLine);
}
The first println statement prints whole of my file but alongside second println statement always shows 0.
You're checking available() twice. After you've read some data, it's no longer available to read, so the available() value printed is different to the one used for the loop condition above.
Secondly, you're reading from the BufferedReader, which does its own buffering of the data from the input stream. That means it's wrong to then sneak around the reader's back to call the available method of the underlying input stream!
Try this:
for (;;) {
String line = br.readLine();
if (line == null) break;
System.out.println(line);
}
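available() returns an estimate of the number of bytes that can be read from that InputStream without blocking. Your readLine() call goes through the BufferedReader, which has most likely already pulled the file's remaining bytes into its own buffer, so the underlying stream reports 0.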
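The buffering effect can be observed with a short sketch. The class and method names are invented, a ByteArrayInputStream stands in for the FileInputStream, and the input is assumed to be much smaller than the reader's internal buffers; with larger inputs the second number would not necessarily be zero.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class AvailableAfterBuffering {

    /** Returns raw.available() before and after a single readLine() call. */
    static int[] availableAroundFirstLine(byte[] content) throws IOException {
        InputStream raw = new ByteArrayInputStream(content);
        BufferedReader br = new BufferedReader(
                new InputStreamReader(raw, StandardCharsets.US_ASCII));
        int before = raw.available();
        br.readLine(); // the reader pulls far more than one line from `raw`
        int after = raw.available();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "one\ntwo\nthree\n".getBytes(StandardCharsets.US_ASCII);
        int[] counts = availableAroundFirstLine(content);
        System.out.println("before: " + counts[0] + ", after first line: " + counts[1]);
    }
}
```

The remaining lines are still there, just inside the reader's buffer, which is why the loop condition should test the result of readLine() rather than the stream underneath.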
I tried to ask this question earlier, but I was unclear in my question. Java BufferedReader action on character?
Here is my problem.. I have a BufferedReader set to read from a device. It is reading well. I have it set to
if (Status.reader.ready()) {
Lines = Status.reader.readLine();
}
if (Lines.contains(">")) {
log.level1("ready to send data");
}
Buffered reader does not report the > until I've sent more data to the device. The problem is that when reader contains > it is not reporting ready. It holds onto the > until I input more data.
I tried the following and it returns nothing. It does not even return the log.level0()
Lines = "";
try {
Lines = Status.reader.readLine();
} catch (IOException e) {
Log.level0("Attempted to read blank line");
}
Here is the actual data sent:
^M^M01 02 F3^M00 01 F3 3E^M>
But BufferedReader ignores the > until more data has been sent then get a result like this:
>0102
When I check the actual data from the device from the command prompt, it returns what I'd expect, the > is present.
BufferedReader will not give me the >. Is there some way I can check for this char otherwise?
The BufferedReader.readLine() method reads data a line at a time. That is, it will attempt to read characters until it sees an end-of-line sequence (e.g. "\n", "\r" or "\r\n") or the end of stream.
If your input data is not line oriented, then you should not be using readLine() to read it. I suggest that you do your own record / message extraction; e.g.
BufferedReader br = ...
StringBuilder sb = new StringBuilder(...);
int ch = br.read();
while (ch != -1 && ch != '>') {
sb.append((char) ch);
ch = br.read();
}
String record = sb.toString();
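A self-contained variant of the same idea, as a sketch only. PromptReader and readUntil are made-up names, the '>' prompt is passed in as the delimiter, and a StringReader carrying the sample data from the question (with ^M written as \r) stands in for the device.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class PromptReader {

    /**
     * Reads characters up to, but not including, the delimiter.
     * Returns the partial text if the stream ends first, or null if nothing was read.
     */
    static String readUntil(Reader in, char delimiter) throws IOException {
        StringBuilder sb = new StringBuilder();
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == delimiter) {
                return sb.toString(); // returns as soon as the prompt arrives
            }
            sb.append((char) ch);
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        Reader device = new StringReader("\r\r01 02 F3\r00 01 F3 3E\r>0102");
        String record = readUntil(device, '>');
        System.out.println(record.replace('\r', '|'));
        System.out.println(readUntil(device, '>'));
    }
}
```

Unlike readLine(), this does not wait for a line terminator after the prompt, so the record is available as soon as the '>' has been received.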
Check this:
http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/api/java/io/BufferedReader.html
I recommend that you use the function public int read() instead.
On Google you can find a lot of examples.
With those F3s in there it looks to me like your data isn't even character-oriented let alone line-oriented. Is your device really Unicode-compliant?
I would use a BufferedInputStream.