Response size limit when using Apache HttpComponents - java

I am converting some code from the HttpClient 3.x library over to the HttpComponents 4.x library. The old code contains a check to make sure that the response is not over a certain size. This is fairly easy to do in HttpClient 3.x, since you can get back a stream from the response using the getResponseBodyAsStream() method and determine when the size has been exceeded. I can't find a similar way in HttpComponents.
Here's the old code as an example of what I'm trying to do:
private static final long RESPONSE_SIZE_LIMIT = 1024 * 1024 * 10;
private static final int READ_BUFFER_SIZE = 16384;

private static ByteArrayOutputStream readResponseBody(HttpMethodBase method)
        throws IOException {
    int len;
    byte[] buff = new byte[READ_BUFFER_SIZE];
    long byteCount = 0;
    InputStream in = method.getResponseBodyAsStream();
    ByteArrayOutputStream out = new ByteArrayOutputStream(READ_BUFFER_SIZE);
    while ((len = in.read(buff)) != -1 && byteCount <= RESPONSE_SIZE_LIMIT) {
        byteCount += len;
        out.write(buff, 0, len);
    }
    if (byteCount > RESPONSE_SIZE_LIMIT) {
        throw new IOException(
                "Size limit exceeded reading from HTTP input stream");
    }
    return out;
}

You can use HttpEntity.getContent() to get an InputStream to read from yourself.
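For example, a minimal sketch of the same size check against the 4.x API (the constants are reused from the question; error handling and entity cleanup omitted):

// imports: java.io.*, org.apache.http.HttpEntity, org.apache.http.HttpResponse
private static ByteArrayOutputStream readResponseBody(HttpResponse response)
        throws IOException {
    HttpEntity entity = response.getEntity();
    InputStream in = entity.getContent(); // the 4.x counterpart of getResponseBodyAsStream()
    ByteArrayOutputStream out = new ByteArrayOutputStream(READ_BUFFER_SIZE);
    byte[] buff = new byte[READ_BUFFER_SIZE];
    long byteCount = 0;
    int len;
    while ((len = in.read(buff)) != -1) {
        byteCount += len;
        if (byteCount > RESPONSE_SIZE_LIMIT) {
            throw new IOException("Size limit exceeded reading from HTTP input stream");
        }
        out.write(buff, 0, len);
    }
    return out;
}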

Related

OOM while uploading large file

I need to upload a very large file (a few GB) from my machine to a server.
Currently, I am trying the approach below, but I keep getting:
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3236)
I can increase the memory, but that is not something I want to do, because I'm not sure where my code will run. I want to read a few MB/KB, send them to the server, release the memory, and repeat. I tried other approaches, like the Files utilities and IOUtils.copyLarge, but I get the same problem.
URL serverUrl = new URL(url);
HttpURLConnection urlConnection = (HttpURLConnection) serverUrl.openConnection();
urlConnection.setConnectTimeout(Configs.TIMEOUT);
urlConnection.setReadTimeout(Configs.TIMEOUT);
File fileToUpload = new File(file);
urlConnection.setDoOutput(true);
urlConnection.setRequestMethod("POST");
urlConnection.addRequestProperty("Content-Type", "application/octet-stream");
urlConnection.connect();
OutputStream output = urlConnection.getOutputStream();
FileInputStream input = new FileInputStream(fileToUpload);
upload(input, output);
//..close streams
private static long upload(InputStream input, OutputStream output) throws IOException {
    try (
        ReadableByteChannel inputChannel = Channels.newChannel(input);
        WritableByteChannel outputChannel = Channels.newChannel(output)
    ) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(10240);
        long size = 0;
        while (inputChannel.read(buffer) != -1) {
            buffer.flip();
            size += outputChannel.write(buffer);
            buffer.clear();
        }
        return size;
    }
}
I think it has something to do with this, but I can't figure out what I am doing wrong.
Another approach I tried, with the same issue:
private static long copy(InputStream source, OutputStream sink)
        throws IOException {
    long nread = 0L;
    byte[] buf = new byte[10240];
    int n;
    int i = 0;
    while ((n = source.read(buf)) > 0) {
        sink.write(buf, 0, n);
        nread += n;
        i++;
        if (i % 10 == 0) {
            log.info("flush");
            sink.flush();
        }
    }
    return nread;
}
Use setFixedLengthStreamingMode as per this answer on the duplicate question Denis Tulskiy linked to:
conn.setFixedLengthStreamingMode((int) fileToUpload.length());
From the docs:
This method is used to enable streaming of a HTTP request body without internal buffering, when the content length is known in advance.
At the moment, your code is buffering the entire file in Java's heap memory in order to compute the Content-Length header for the HTTP request.
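Applied to the question's code, the key change is one call before connect() (a sketch reusing the question's urlConnection and fileToUpload names; the long overload is Java 7+):

// Tell the connection the exact body length up front, so it can stream the body
// instead of buffering the whole file in the heap to compute Content-Length.
urlConnection.setFixedLengthStreamingMode(fileToUpload.length()); // must be set before connect()
urlConnection.connect();

// When the length is not known in advance, chunked streaming avoids buffering too:
// urlConnection.setChunkedStreamingMode(8192);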

How to read http request properly?

How do I read an HTTP request using an InputStream? I used to read it like this:
InputStream in = address.openStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder result = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    result.append(line);
}
System.out.println(result.toString());
But reader.readLine() can block, because there is no guarantee that the null line will ever be reached. Of course, I can read the Content-Length header and then read the request in a loop:
for (int i = 0; i < contentLength; i++) {
    int a = br.read();
    body.append((char) a);
}
But if Content-Length is set too big (I guess it could be set that way on purpose), br.read() will block.
I tried to read bytes directly from the InputStream like this:
byte[] bytes = getBytes(is);

public static byte[] getBytes(InputStream is) throws IOException {
    int len;
    int size = 1024;
    byte[] buf;
    if (is instanceof ByteArrayInputStream) {
        size = is.available();
        buf = new byte[size];
        len = is.read(buf, 0, size);
    } else {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        buf = new byte[size];
        while ((len = is.read(buf, 0, size)) != -1) {
            bos.write(buf, 0, len);
        }
        buf = bos.toByteArray();
    }
    return buf;
}
But it waits forever. What should I do?
If you are implementing an HTTP server, you should detect the end of the request according to the HTTP specification. Wiki: https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
First of all, read the request line; it is always a single line.
Then read all the request headers. You read them until you hit an empty line, i.e. two consecutive line endings (<CR><LF><CR><LF>).
Once you have the request line and the headers, decide whether you need to read a body at all, because not all requests have one (see the summary table in the specification).
Then, if a body is expected, parse the headers you already read and get Content-Length. If it is present, just read exactly that many bytes from the stream, as sketched below.
When Content-Length is missing, the length is determined in other ways: chunked transfer encoding uses a chunk size of 0 to mark the end of the content, and identity encoding without Content-Length is read until the socket is closed.
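A minimal sketch of that flow (a hypothetical helper, not production-grade parsing; it assumes ISO-8859-1 header bytes and ignores chunked bodies):

// imports: java.io.*, java.util.*
static byte[] readRequestBody(InputStream in) throws IOException {
    DataInputStream din = new DataInputStream(in);
    // 1. Request line, then headers, each terminated by CRLF;
    //    an empty line marks the end of the header section.
    String requestLine = readLine(din); // e.g. "GET /index.html HTTP/1.1"
    Map<String, String> headers = new HashMap<>();
    String line;
    while (!(line = readLine(din)).isEmpty()) {
        int colon = line.indexOf(':');
        headers.put(line.substring(0, colon).trim().toLowerCase(),
                    line.substring(colon + 1).trim());
    }
    // 2. Body: read exactly Content-Length bytes, no more, no less.
    int contentLength = Integer.parseInt(headers.getOrDefault("content-length", "0"));
    byte[] body = new byte[contentLength];
    din.readFully(body);
    return body;
}

// Reads a single CRLF-terminated line as ISO-8859-1 bytes.
private static String readLine(DataInputStream in) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    int b;
    while ((b = in.read()) != -1 && b != '\n') {
        if (b != '\r') buf.write(b);
    }
    return buf.toString("ISO-8859-1");
}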
Create a request wrapper that extends HttpServletRequestWrapper and overrides getInputStream() to return a ServletInputStream with a safe read method. Try that.
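A rough sketch of that idea, assuming the pre-3.1 Servlet API (where only read() must be implemented) and a hypothetical MAX_BODY limit:

// imports: java.io.IOException, javax.servlet.ServletInputStream,
//          javax.servlet.http.HttpServletRequest, javax.servlet.http.HttpServletRequestWrapper
public class LimitedRequestWrapper extends HttpServletRequestWrapper {
    private static final long MAX_BODY = 10 * 1024 * 1024; // hypothetical 10MB cap

    public LimitedRequestWrapper(HttpServletRequest request) {
        super(request);
    }

    @Override
    public ServletInputStream getInputStream() throws IOException {
        final ServletInputStream wrapped = super.getInputStream();
        return new ServletInputStream() {
            private long count = 0;

            @Override
            public int read() throws IOException {
                int b = wrapped.read();
                // Fail fast instead of blocking forever on an oversized body.
                if (b != -1 && ++count > MAX_BODY) {
                    throw new IOException("Request body exceeds " + MAX_BODY + " bytes");
                }
                return b;
            }
        };
    }
}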

Netty 4.0 HTTP Chunks memory leaks?

I'm trying to make HTTP chunked transfer encoding work with Netty 4.0.
I had some success with it so far: it works well with small payloads.
Then I tried it with large data and it started to hang.
I suspect there might be a problem with my code, or maybe a leak with ByteBuf.copy().
I stripped my code down to the bare minimum, to be sure I had no other source of leaks or side effects, and ended up writing this test. The complete code is here.
Basically, it sends 1GB of 0x0 bytes when you connect to port 8888 with wget. I reproduce the problem when I connect with:
wget http://127.0.0.1:8888 -O /dev/null
Here's the handler:
protected void channelRead0(ChannelHandlerContext ctx, FullHttpMessage msg) throws Exception {
    DefaultHttpResponse response = new DefaultHttpResponse(HTTP_1_1, OK);
    HttpHeaders.setTransferEncodingChunked(response);
    response.headers().set(CONTENT_TYPE, "application/octet-stream");
    ctx.write(response);
    ByteBuf buf = Unpooled.buffer();
    int GIGABYTE = (4 * 1024 * 1024); // 4M iterations * 256B = 1GB
    for (int i = 0; i < GIGABYTE; i++) {
        buf.writeBytes(CONTENT_256BYTES_ZEROED);
        ctx.writeAndFlush(new DefaultHttpContent(buf.copy()));
        buf.clear();
    }
    ctx.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT).addListener(ChannelFutureListener.CLOSE);
}
Is there anything wrong with my approach?
EDIT:
With VisualVM I've found that there is a memory leak in the ChannelOutboundBuffer.
The Entry[] buffer keeps growing; addCapacity() is called multiple times. The Entry array seems to contain copies of the buffers that are (or should be) written to the wire.
I can see data coming in with Wireshark...
Here's a Dropbox link to the heapdump
I have found what I was doing wrong.
The for loop that called writeAndFlush() was not working well and was likely the cause of the leak.
I tried various things (see the many revisions in the gist link). See the gist version at the time of writing.
I found that the best way to achieve what I wanted, without memory leaks, was to extend InputStream and write the InputStream, wrapped in an io.netty.handler.stream.ChunkedStream, to the context (not using writeAndFlush()).
DefaultHttpResponse response = new DefaultHttpResponse(HTTP_1_1, OK);
HttpHeaders.setTransferEncodingChunked(response);
response.headers().set(CONTENT_TYPE, "application/octet-stream");
ctx.write(response);

InputStream is = new InputStream() {
    int offset = -1;
    byte[] buffer = null;

    // ONE GB (max size for the test)
    int sz = 1024 * 1024 * 1024;

    @Override
    public int read() throws IOException {
        if (offset == -1 || (buffer != null && offset == buffer.length)) {
            fillBuffer();
        }
        if (buffer == null || offset == -1) {
            return -1;
        }
        if (offset < buffer.length) {
            return buffer[offset++] & 0xFF; // mask to 0..255 per the InputStream contract
        }
        return -1;
    }

    // This method simulates an application that would write to the buffer.
    private void fillBuffer() {
        offset = 0;
        if (sz <= 0) { // LIMIT TO ONE GB
            buffer = null;
            return;
        }
        buffer = new byte[1024];
        System.arraycopy(CONTENT_1KB_ZEROED, 0, buffer, 0, CONTENT_1KB_ZEROED.length);
        sz -= 1024;
    }
};

ctx.write(new ChunkedStream(new BufferedInputStream(is), 8192));
ctx.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT).addListener(ChannelFutureListener.CLOSE);
The code writes 1GB of data to the client in 8K chunks. I was able to run 30 simultaneous connections without memory or hanging problems.
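One detail worth noting for readers trying this: ChunkedStream is a ChunkedInput, and Netty only drains a ChunkedInput incrementally when a ChunkedWriteHandler sits in the pipeline. A sketch of the relevant setup inside a ChannelInitializer (handler names are illustrative; MyHttpHandler stands for the handler above):

// imports: io.netty.channel.ChannelPipeline, io.netty.handler.codec.http.HttpServerCodec,
//          io.netty.handler.stream.ChunkedWriteHandler
ChannelPipeline p = ch.pipeline();
p.addLast("codec", new HttpServerCodec());
// Pulls the next chunk from the ChunkedStream only when the channel is
// writable, so the outbound buffer never holds the whole payload.
p.addLast("chunkedWriter", new ChunkedWriteHandler());
p.addLast("handler", new MyHttpHandler());

That backpressure is what prevents the ChannelOutboundBuffer from growing the way it did in the writeAndFlush() loop.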

Reading char[] from web-service by HttpGet — strange behavior

I am developing an Android application that fetches a big chunk of JSON data as a stream. Calling the web service is OK, but I have a little problem. In my old version I was using Gson to read the stream and then inserting the data into the database; it worked without any problem except performance. So I am trying to change the loading approach: read the data into a char[] first, then insert it into the database.
This is my new code:
HttpEntity responseEntity = response.getEntity();
final int contentLength = (int) responseEntity.getContentLength();
InputStream stream = responseEntity.getContent();
InputStreamReader reader = new InputStreamReader(stream);

int readCount = 10 * 1024;
int hasread = 0;
char[] buffer = new char[contentLength];
int mustWrite = 0;
int hasread2 = 0;

while (hasread < contentLength) {
    // problem is here
    hasread += reader.read(buffer, hasread, contentLength - hasread);
}
Reader reader2 = new CharArrayReader(buffer);
The problem is that the reader starts reading correctly, but near the end of the stream the hasread variable decreases (by 1) instead of increasing. That is very strange to me, and the while loop then never finishes. What's wrong with this code?
You should use a fixed size for the buffer, not the size of the whole data (the contentLength). And an important note: the length of a char[] array is different from a byte[] array's. The char data type is a single 16-bit Unicode character, while the byte data type is an 8-bit signed two's complement integer, so contentLength (a byte count) is not the number of chars the reader will produce; once the stream is exhausted, read() returns -1, which is exactly what makes hasread decrease by 1.
Also, your while loop is wrong; you can fix it like this:
import java.io.BufferedInputStream;

private static final int BUF_SIZE = 10 * 1024;

// ...
HttpEntity responseEntity = response.getEntity();
final int contentLength = (int) responseEntity.getContentLength();
InputStream stream = responseEntity.getContent();
BufferedInputStream reader = new BufferedInputStream(stream);

int hasread = 0;
byte[] buffer = new byte[BUF_SIZE];
while ((hasread = reader.read(buffer, 0, BUF_SIZE)) > 0) {
    // For example, convert the buffer to a String
    String data = new String(buffer, 0, hasread, "UTF-8");
}
Make sure to use your own charset ("UTF-8", "UTF-16"…).
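One caveat worth adding: decoding each byte chunk separately with new String(...) can split a multi-byte UTF-8 character across two chunks and corrupt it. A safer sketch wraps the stream in an InputStreamReader, which keeps decoder state across reads (StandardCharsets is Java 7+):

// imports: java.io.*, java.nio.charset.StandardCharsets
InputStreamReader reader = new InputStreamReader(stream, StandardCharsets.UTF_8);
char[] cbuf = new char[BUF_SIZE];
int n;
while ((n = reader.read(cbuf, 0, BUF_SIZE)) > 0) {
    String data = new String(cbuf, 0, n); // whole chars, so no split multi-byte sequences
}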

How to measure upload bitrate using Java+Google Data API

I'm writing a Java client application that uses the Google Data API to upload videos to YouTube. I'm wondering how I would go about tracking the progress of an upload; using the Google Data API library, I simply call service.insert to insert a new video, and that blocks until it is complete.
Has anyone else come up with a solution to monitor the status of the upload and count the bytes as they are sent?
Thanks for any ideas
Link:
http://code.google.com/apis/youtube/2.0/developers_guide_java.html#Direct_Upload
Extend com.google.gdata.data.media.MediaSource's writeTo() to include a counter of bytes read:
public static void writeTo(MediaSource source, OutputStream outputStream)
        throws IOException {
    InputStream sourceStream = source.getInputStream();
    BufferedOutputStream bos = new BufferedOutputStream(outputStream);
    BufferedInputStream bis = new BufferedInputStream(sourceStream);
    long byteCounter = 0L;
    try {
        byte[] buf = new byte[2048]; // Transfer in 2k chunks
        int bytesRead = 0;
        while ((bytesRead = bis.read(buf, 0, buf.length)) >= 0) {
            byteCounter += bytesRead; // count bytes as they are sent
            bos.write(buf, 0, bytesRead);
        }
        bos.flush();
    } finally {
        bis.close();
    }
}
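To get from a byte count to a bitrate, one option (a sketch; it assumes writeTo() is changed to return its byteCounter) is to time the copy:

long start = System.nanoTime();
long bytesSent = writeTo(source, outputStream); // assumes writeTo() returns byteCounter
double seconds = (System.nanoTime() - start) / 1e9;
double bitsPerSecond = bytesSent * 8 / seconds; // upload bitrate
System.out.printf("uploaded %d bytes at %.0f bit/s%n", bytesSent, bitsPerSecond);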
