InputStreamReader don't limit returned length - java

I am working on learning Java and am going through the examples on the Android website. I am getting remote contents of an XML file. I am able to get the contents of the file, but then I need to convert the InputStream into a String.
public String readIt(InputStream stream, int len) throws IOException, UnsupportedEncodingException {
InputStreamReader reader = null;
reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[len];
reader.read(buffer);
return new String(buffer);
}
The issue I am having is that I don't want the string to be limited by the len variable, but I don't know Java well enough to know how to change this.
How can I create the char array without a fixed length?

Generally speaking it's bad practice to not have a max length on input strings like that due to the possibility of running out of available memory to store it.
That said, you could ignore the len variable and just loop on reader.read(...) and append the buffer to your string until you've read the entire InputStream like so:
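public String readIt(InputStream stream, int len) throws IOException, UnsupportedEncodingException {
    // len is now only the chunk size, not a limit on the result
    StringBuilder result = new StringBuilder();
    InputStreamReader reader = new InputStreamReader(stream, "UTF-8");
    char[] buffer = new char[len];
    int count;
    // read() returns the number of chars it stored, or -1 at end of stream
    while ((count = reader.read(buffer)) >= 0) {
        // append only the chars that were actually read
        result.append(buffer, 0, count);
    }
    return result.toString();
}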
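If you do want to keep an upper bound, as the first paragraph of this answer recommends, a minimal sketch could look like the following. The class name BoundedStreamReader, the method readAtMost and the maxChars parameter are made up for illustration; the idea is only that the loop stops with an error once the input grows past a limit you choose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original answer.
final class BoundedStreamReader {
    // Reads the whole stream as UTF-8 text, but fails as soon as more than
    // maxChars characters have arrived. maxChars is an assumed safety limit.
    static String readAtMost(InputStream stream, int maxChars) throws IOException {
        Reader reader = new InputStreamReader(stream, StandardCharsets.UTF_8);
        StringBuilder result = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            if (result.length() + n > maxChars) {
                throw new IOException("Input exceeds " + maxChars + " characters");
            }
            // Append only the n characters that this call delivered.
            result.append(buffer, 0, n);
        }
        return result.toString();
    }
}
```

A call such as BoundedStreamReader.readAtMost(stream, 1_000_000) would then return the full text for ordinary responses and throw for unexpectedly large ones.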

Related

URLConnection doesn't read whole page

In my app I need to download some web page. I do it in a way like this
URL url = new URL(myUrl);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setReadTimeout(5000); // 5 seconds to download (the value is in milliseconds)
conn.setConnectTimeout(5000); // 5 seconds to connect
conn.setRequestMethod("GET");
conn.setDoInput(true);
conn.connect();
int response = conn.getResponseCode();
is = conn.getInputStream();
String s = readIt(is);
System.out.println("got: " + s);
My readIt function is:
public String readIt(InputStream stream) throws IOException {
int len = 10000;
Reader reader;
reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[len];
reader.read(buffer);
return new String(buffer);
}
The problem is that it doesn't download the whole page. For example, if myUrl is "https://wikipedia.org", then the output is cut off partway through the page.
How can I download the whole page?
Update
The second answer to "Read/convert an InputStream to a String" solved my problem. The problem is in the readIt function. You should read the response from the InputStream like this:
static String convertStreamToString(java.io.InputStream is) {
java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
return s.hasNext() ? s.next() : "";
}
There are a number of mistakes in your code:
You are reading into a character buffer with a fixed size.
You are ignoring the result of the read(char[]) method. It returns the number of characters actually read ... and you need to use that.
You are assuming that read(char[]) will read all of the data. In fact, it is only guaranteed to return at least one character ... or -1 to indicate that you have reached the end of stream. When you read from a network connection, you are liable to only get the data that has already been sent by the other end and buffered locally.
When you create the String from the char[] you are assuming that every position in the character array contains a character from your stream.
There are multiple ways to do it correctly, and this is one way:
public String readIt(InputStream stream) throws IOException {
Reader reader = new InputStreamReader(stream, "UTF-8");
char[] buffer = new char[4096];
StringBuilder builder = new StringBuilder();
int len;
while ((len = reader.read(buffer)) != -1) {
builder.append(buffer, 0, len);
}
return builder.toString();
}
Another way to do it is to look for an existing 3rd-party library method with a readFully(Reader) method.
You need to read in a loop until there are no more bytes left in the InputStream.
while (-1 != (len = in.read(buffer))) {
    // do stuff here
}
Your code makes a single read call into a 10000-character buffer, so it gets at most 10000 characters from the input stream.
Use a BufferedReader to make your life easier.
public String readIt(InputStream stream) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
StringBuilder out = new StringBuilder();
String newLine = System.getProperty("line.separator");
String line;
while ((line = reader.readLine()) != null) {
out.append(line);
out.append(newLine);
}
return out.toString();
}
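If the code runs on Java 9 or later, the hand-written loops above can be replaced by InputStream.readAllBytes(). The sketch below assumes such a runtime (on Android the method is only available on recent API levels) and assumes the page is UTF-8 encoded; the class name ReadAllExample is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; requires Java 9+ for InputStream.readAllBytes().
final class ReadAllExample {
    // Reads every remaining byte of the stream and decodes it as UTF-8.
    // Decoding the complete byte array at once avoids splitting a
    // multi-byte character across two chunks.
    static String readIt(InputStream stream) throws IOException {
        return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
    }
}
```

Like the Scanner version, this holds the whole response in memory, so it is only suitable when the response size is reasonable.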

can't work with BufferedInputStream and BufferedReader together

I'm trying to read from a socket stream using both a BufferedReader and the BufferedInputStream it wraps. The data has three parts: the first line (1) holds the size of the content that follows (2), and that content in turn holds the size of another block of content (3).
(1) reads correctly (with the BufferedReader, _bin.readLine()).
(2) reads correctly too (with _in.read(byte[] b)).
(3) won't read; it seems there is more content than the size I read in (2).
I think the problem is that I'm reading with the BufferedReader and then with the BufferedInputStream. Can anyone help me?
public HashMap<String, byte[]> readHead() throws IOException {
JSONObject json;
try {
HashMap<String, byte[]> map = new HashMap<>();
System.out.println("reading header");
int headersize = Integer.parseInt(_bin.readLine());
byte[] parsable = new byte[headersize];
_in.read(parsable);
json = new JSONObject(new String(parsable));
map.put("id", lTob(json.getLong(SagConstants.KEY_ID)));
map.put("length", iTob(json.getInt(SagConstants.KEY_SIZE)));
map.put("type", new byte[]{(byte)json.getInt(SagConstants.KEY_TYPE)});
return map;
} catch(SocketException | JSONException e) {
_exception = e.getMessage();
_error_code = SagConstants.ERROR_OCCOURED_EXCEPTION;
return null;
}
}
Sorry for my bad English and the poor explanation; I hope you can understand the problem.
The file format is:
size1
{json, length is given size1, there is size2 given}
{second json, length is size2}
_in is a BufferedInputStream.
_bin is a BufferedReader wrapped around _in.
With _bin, I read the first line (size1) and convert it to an integer.
With _in, I read the next block of data, which is size1 bytes long and contains size2.
Then I try to read the last block of data, whose size is size2, with something like this:
byte[] b = new byte[secondSize];
_in.read(b);
Nothing happens here; the program just hangs.
can't work with BufferedInputStream and BufferedReader together
That's correct. If you use any buffered stream or reader on a socket [or indeed any data source], you can't use any other stream or reader with it whatsoever. Data will get 'lost', that is to say read-ahead, in the buffer of the buffered stream or reader, and will not be available to the other stream/reader.
You need to rethink your design.
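To see the read-ahead effect in isolation, here is a small sketch that uses an in-memory stream instead of a socket. The class and method names are made up for illustration. It asks a BufferedReader for one line only and then checks how many bytes the underlying stream still has.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the question.
final class ReadAheadDemo {
    // Returns how many bytes remain in the underlying stream after a
    // BufferedReader layered on top of it has returned the first line.
    static int bytesLeftAfterFirstLine(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        BufferedReader lines =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
        lines.readLine();      // we only ask for one line ...
        return in.available(); // ... but the reader has pulled in much more
    }
}
```

For a short input such as "5\nhello", the reader drains the whole stream into its own buffer while fetching the first line, so nothing is left for a second consumer of the same stream.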
You create a BufferedReader (_bin) and a BufferedInputStream (_in) over the same source and read from both, but each keeps its own position. When _bin reads the first line it also reads ahead and buffers the data that follows, so a later read from _in does not continue where the line ended. You should read size1 with _in too.
int headersize = Integer.parseInt(readLine(_in, "").trim());
byte[] parsable = new byte[headersize];
_in.read(parsable);
Use the readLine below to read the line from the BufferedInputStream, so that no other reader buffers data ahead of it.
private static final int NL = 10;   // new line
private static final int EOF = -1;  // read() returns -1 at end of stream
private static final int EOL = 0;   // NUL byte, also treated as end of line
// Reads single bytes up to the end of the line and appends them to accumulator.
private static String readLine(BufferedInputStream reader,
        String accumulator) throws IOException {
    StringBuilder line = new StringBuilder(accumulator);
    int byteRead;
    // read() yields each byte as an int in 0..255, so EOF (-1) cannot be
    // confused with a data byte
    while ((byteRead = reader.read()) != EOF && byteRead != NL && byteRead != EOL) {
        line.append((char) byteRead);
    }
    return line.toString();
}
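Putting the pieces together, the format from the question (a size line followed by blocks of known length) can be read through one stream object only. The sketch below is an assumption about how that could look, not the asker's actual code: FramedReader, readLine and readBlock are made-up names, and it relies on DataInputStream.readFully to block until the requested number of bytes has arrived, which a single read(byte[]) call does not guarantee.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: every read goes through the same DataInputStream.
final class FramedReader {
    private final DataInputStream in;

    FramedReader(InputStream source) {
        this.in = new DataInputStream(source);
    }

    // Reads bytes up to (not including) the next '\n' and consumes nothing
    // beyond it. A '\r' before the line feed is dropped.
    String readLine() throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("stream ended before a line was read");
        }
        return new String(line.toByteArray(), StandardCharsets.US_ASCII);
    }

    // Reads exactly size bytes, or throws EOFException if the stream ends first.
    byte[] readBlock(int size) throws IOException {
        byte[] block = new byte[size];
        in.readFully(block);
        return block;
    }
}
```

With a socket you would typically pass new BufferedInputStream(socket.getInputStream()) to the constructor, so that the byte-by-byte line read stays cheap while there is still only one buffer in play.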

How to choose the buffer size when reading from a URL

Aim: to read a URL that returns information in JSON.
Question: I found the code below for reading a URL. I understand what the code does, but I have no idea why the size of the char array is 1024 rather than 2048 or something else. How do I decide what size of char array is good when reading from a URL?
private static String readUrl(String urlString) throws Exception {
BufferedReader reader = null;
try {
URL url = new URL(urlString);
reader = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuffer buffer = new StringBuffer();
int read;
char[] chars = new char[1024]; // why 1024?
while ((read = reader.read(chars)) != -1)
buffer.append(chars, 0, read);
return buffer.toString();
} finally {
if (reader != null)
reader.close();
}
}
Since the BufferedReader already has an internal buffer (8192 characters by default in OpenJDK, though this is implementation-dependent), and the socket already has a considerably larger receive buffer, it really doesn't make much difference what value you choose. The returns on buffering diminish geometrically with size.
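What the array size does change is how many times the loop runs, not what ends up in the result. The sketch below counts the read calls needed to drain the same input with different array sizes. BufferSizeDemo and countReads are made-up names, and a StringReader stands in for the network so that the numbers are repeatable; with a real connection the counts also depend on how the data arrives.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class for comparing buffer sizes.
final class BufferSizeDemo {
    // Drains the reader using a char[] of the given size and returns how
    // many read() calls delivered data (the final -1 call is not counted).
    static int countReads(Reader reader, int bufferSize) throws IOException {
        char[] chars = new char[bufferSize];
        int calls = 0;
        while (reader.read(chars) != -1) {
            calls++;
        }
        return calls;
    }

    // Convenience overload for in-memory text.
    static int countReads(String text, int bufferSize) throws IOException {
        return countReads(new StringReader(text), bufferSize);
    }
}
```

For 10000 characters of in-memory text, a 1024-character array needs 10 calls and a 4096-character array needs 3. Because each call on a BufferedReader is served from memory, that difference is small next to the cost of the network transfer itself.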

How to use ByteStream to read 1Mb of a file into a string

What I have now is using FileInputStream
int length = 1024*1024;
FileInputStream fs = new FileInputStream(new File("foo"));
fs.skip(offset);
byte[] buf = new byte[length];
int bufferSize = fs.read(buf, 0, length);
String s = new String(buf, 0, bufferSize);
I'm wondering how I can achieve the same result using ByteStreams in the Guava library.
Thanks a lot!
Here's how you could do it with Guava:
byte[] bytes = Files.asByteSource(new File("foo"))
.slice(offset, length)
.read();
String s = new String(bytes, Charsets.US_ASCII);
There are a couple of problems with your code (though it may work fine for files, it won't necessarily for any type of stream):
fs.skip(offset);
This doesn't necessarily skip all offset bytes. You have to either check the number of bytes it skipped in the return value until you've skipped the full amount or use something that does that for you, such as ByteStreams.skipFully.
int bufferSize = fs.read(buf, 0, length);
Again, this won't necessarily read all length bytes, and the number of bytes it does read can be an arbitrary amount--you can't rely on it in general.
String s = new String(buf, 0, bufferSize);
This implicitly uses the system default Charset, which usually isn't a good idea--and when you do want it, it's best to make it explicit with Charset.defaultCharset().
Also note that in general, a certain number of bytes may not translate to a legal sequence of characters depending on the Charset being used (e.g. with ASCII you're fine, but with a multi-byte encoding such as UTF-8 the slice may end in the middle of a character).
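If you would rather stay with plain java.io, the two loops described above can be written by hand. This is a minimal sketch with made-up names (SliceReader, skipFully, readSlice); it works on any InputStream and takes the charset as an explicit argument instead of relying on the default.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper, written against java.io only.
final class SliceReader {
    // Skips exactly offset bytes, looping because skip() may skip fewer.
    static void skipFully(InputStream in, long offset) throws IOException {
        long remaining = offset;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and a probe read hit end of stream
                throw new EOFException("stream ended with " + remaining + " bytes left to skip");
            } else {
                remaining--; // the probe read consumed one byte
            }
        }
    }

    // Skips offset bytes, then reads up to length bytes, looping until the
    // buffer is full or the stream ends, and decodes them with charset.
    static String readSlice(InputStream in, long offset, int length, Charset charset)
            throws IOException {
        skipFully(in, offset);
        byte[] buf = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = in.read(buf, filled, length - filled);
            if (n == -1) {
                break; // fewer than length bytes were available
            }
            filled += n;
        }
        return new String(buf, 0, filled, charset);
    }
}
```

The caveat about character boundaries still applies: with a multi-byte charset the last character of the slice may be incomplete.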
Why try to use Guava when it's not necessary ?
In this case, it looks like you're looking exactly for a RandomAccessFile.
File file = new File("foo");
long offset = ... ;
try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
byte[] buffer = new byte[1024*1024];
raf.seek(offset);
raf.readFully(buffer);
return new String(buffer, Charset.defaultCharset());
}
I'm not aware of a more elegant solution:
public static void main(String[] args) throws IOException {
final int offset = 20;
StringBuilder to = new StringBuilder();
CharStreams.copy(CharStreams.newReaderSupplier(new InputSupplier<InputStream>() {
@Override
public InputStream getInput() throws IOException {
FileInputStream fs = new FileInputStream(new File("pom.xml"));
ByteStreams.skipFully(fs, offset);
return fs;
}
}, Charset.defaultCharset()), to);
System.out.println(to);
}
The only advantage is that you can save some GC time when your String is really big by avoiding conversion into String.

Convert InputStream to String with encoding given in stream data

My input is an InputStream which contains an XML document. The encoding used in the XML is unknown; it is declared in the first line of the XML document.
From this InputStream, I want to get the whole document as a String.
To do this, I use a BufferedInputStream to mark the beginning of the file and start reading the first line. I read this first line to get the encoding and then I use an InputStreamReader to generate a String with the correct encoding.
This does not seem to be the best way to achieve the goal, because it produces an OutOfMemoryError.
Any idea how to do it?
public static String streamToString(final InputStream is) {
String result = null;
if (is != null) {
BufferedInputStream bis = new BufferedInputStream(is);
bis.mark(Integer.MAX_VALUE);
final StringBuilder stringBuilder = new StringBuilder();
try {
// stream reader that handle encoding
final InputStreamReader readerForEncoding = new InputStreamReader(bis, "UTF-8");
final BufferedReader bufferedReaderForEncoding = new BufferedReader(readerForEncoding);
String encoding = extractEncodingFromStream(bufferedReaderForEncoding);
if (encoding == null) {
encoding = DEFAULT_ENCODING;
}
// stream reader that handle encoding
bis.reset();
final InputStreamReader readerForContent = new InputStreamReader(bis, encoding);
final BufferedReader bufferedReaderForContent = new BufferedReader(readerForContent);
String line = bufferedReaderForContent.readLine();
while (line != null) {
stringBuilder.append(line);
line = bufferedReaderForContent.readLine();
}
bufferedReaderForContent.close();
bufferedReaderForEncoding.close();
} catch (IOException e) {
// reset string builder
stringBuilder.delete(0, stringBuilder.length());
}
result = stringBuilder.toString();
}else {
result = null;
}
return result;
}
The call to mark(Integer.MAX_VALUE) is causing the OutOfMemoryError: it allows the BufferedInputStream to buffer up to 2 GB so that reset() stays possible, and the buffer then grows with everything you read.
You can solve this by using an iterative approach. Set the mark readLimit to a reasonable value, say 8K. In 99% of cases this will work, but in pathological cases, e.g. 16K spaces between the attributes in the declaration, you will need to try again. Thus, have a loop that tries to find the encoding, but if it doesn't find it within the given mark region, it tries again, doubling the requested mark readLimit size.
To be sure you don't advance the input stream past the mark limit, you should read the InputStream yourself, up to the mark limit, into a byte array. You then wrap the byte array in a ByteArrayInputStream and pass that to the constructor of the InputStreamReader assigned to 'readerForEncoding'.
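A rough sketch of that loop is shown below. It is a variation rather than the exact steps above: instead of wrapping the byte array in a ByteArrayInputStream, it decodes the bytes as ISO-8859-1 and searches the declaration with a regular expression, which assumes an ASCII-compatible encoding (a UTF-16 document would need extra handling). XmlEncodingSniffer, sniff, fallback and maxLimit are made-up names.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; assumes the declaration is in an ASCII-compatible encoding.
final class XmlEncodingSniffer {
    private static final Pattern ENCODING =
            Pattern.compile("encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    // Returns the encoding named in the XML declaration, or fallback if no
    // declaration is found within maxLimit bytes. The stream is always
    // reset to where it was when the method was called.
    static String sniff(BufferedInputStream in, String fallback, int maxLimit)
            throws IOException {
        for (int limit = 8 * 1024; ; limit = Math.min(limit * 2, maxLimit)) {
            in.mark(limit);
            byte[] head = new byte[limit];
            int filled = 0;
            // Read at most limit bytes ourselves so the mark stays valid.
            while (filled < limit) {
                int n = in.read(head, filled, limit - filled);
                if (n == -1) {
                    break;
                }
                filled += n;
            }
            in.reset();
            String text = new String(head, 0, filled, StandardCharsets.ISO_8859_1);
            int end = text.indexOf("?>");
            // Stop when the declaration is complete, the stream is exhausted,
            // or the upper bound is reached; otherwise retry with a doubled limit.
            if (end >= 0 || filled < limit || limit >= maxLimit) {
                Matcher m = ENCODING.matcher(end >= 0 ? text.substring(0, end) : "");
                return m.find() ? m.group(1) : fallback;
            }
        }
    }
}
```

After the call, the same BufferedInputStream can be handed to new InputStreamReader(in, encoding) to read the document from the beginning, without the mark ever covering more than maxLimit bytes.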
You can use this method to convert an InputStream to a String. Note that it uses the platform default charset and drops the line terminators.
private String convertStreamToString(InputStream input) throws Exception{
BufferedReader reader = new BufferedReader(new InputStreamReader(input));
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = reader.readLine()) != null) {
sb.append(line);
}
input.close();
return sb.toString();
}
