BufferedInputStream to ByteArrayOutputStream very slow - java

I have a problem very similar to the link below:
PDF to byte array and vice versa
The main difference is that I am reading binary data from a Socket connection (accepted via a ServerSocket), rather than from a file.
This works as expected.
However, the problem I am having is that this process takes quite a long time to read into memory: about 1 minute 30 seconds for 500 bytes (although the size of each stream will vary massively).
Here's my code:
BufferedInputStream input = new BufferedInputStream(theSocket.getInputStream());
byte[] buffer = new byte[8192];
int bytesRead;
ByteArrayOutputStream output = new ByteArrayOutputStream();
while ((bytesRead = input.read(buffer)) != -1)
{
output.write(buffer, 0, bytesRead);
}
byte[] outputBytes = output.toByteArray();
//Continue ... and eventually close inputstream
If I log its progress inside the while loop to the terminal, it seems to log all the bytes quite quickly (i.e. it reaches the end of the stream), but then it pauses for a while before breaking out of the while loop and continuing.
Hope that makes sense.

Well you're reading until the socket is closed, basically - that's when read will return -1.
So my guess is that the other end of the connection is holding it open for 90 seconds before closing it. Fix that, and you'll fix your problem.
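For illustration, a minimal sketch of what the sending side might do, assuming you control it and it is also plain Java (the host, port, and payload below are placeholders): shutting down the output as soon as the data has been written lets the reader's read() return -1 immediately instead of waiting for a timeout.
try (Socket socket = new Socket("example.host", 12345)) {  // placeholder host/port
    OutputStream out = socket.getOutputStream();
    out.write(payload);       // payload: the bytes to send (placeholder)
    out.flush();
    socket.shutdownOutput();  // sends FIN; the reading side's read() now returns -1
    // ... optionally read a response here before the socket closes
}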

ByteArrayOutputStream(int size);
By default the internal buffer is 32 bytes and it doubles each time it fills: 32 -> 64 -> 128 -> 256 -> ..., copying the contents on every resize.
So initialize it with a bigger capacity.
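For example, a minimal sketch (the 64 KB figure is just an assumed estimate of a typical message size; use whatever fits your data):
// Pre-sizing avoids the repeated grow-and-copy cycles of the default 32-byte buffer.
ByteArrayOutputStream output = new ByteArrayOutputStream(64 * 1024);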

You can time how long it takes to copy data between a BufferedInputStream and a ByteArrayOutputStream.
int size = 256 << 20; // 256 MB
ByteArrayInputStream bais = new ByteArrayInputStream(new byte[size]);
long start = System.nanoTime();
BufferedInputStream input = new BufferedInputStream(bais);
byte[] buffer = new byte[8192];
int bytesRead;
ByteArrayOutputStream output = new ByteArrayOutputStream();
while ((bytesRead = input.read(buffer)) != -1) {
output.write(buffer, 0, bytesRead);
}
byte[] outputBytes = output.toByteArray();
long time = System.nanoTime() - start;
System.out.printf("Took %.3f seconds to copy %,d MB %n", time / 1e9, size >> 20);
prints
Took 0.365 seconds to copy 256 MB
It will be much faster for smaller messages i.e. << 256 MB.

Related

Fastest way to write multiple files in java

I have a requirement where I need to write multiple input streams to a temp file in Java. I have the below code snippet for the logic. Is there a more efficient way to do this?
final String tempZipFileName = "log" + "_" + System.currentTimeMillis();
File tempFile = File.createTempFile(tempZipFileName, "zip");
final FileOutputStream oswriter = new FileOutputStream(tempFile);
for (final InputStream inputStream : readerSuppliers) {
byte[] buffer = new byte[102400];
int bytesRead = 0;
while ((bytesRead = inputStream.read(buffer)) > 0) {
oswriter.write(buffer, 0, bytesRead);
}
buffer = null;
oswriter.write(System.getProperty("line.separator").getBytes());
inputStream.close();
}
I have multiple files ranging in size from 45 MB to 400 MB; for typical 45 MB and 360 MB files this method takes around 3 minutes on average. Can this be improved further?
You could try a BufferedInputStream
As #StephenC replied, it is irrelevant in this case to use a BufferedInputStream because the buffer is already big enough.
I reproduced the behaviour on my computer (with an SSD drive) using a 100 MB file.
It took 110 ms to create the new file with this example.
With a BufferedInputStream and a plain OutputStream: 120 ms.
With a plain InputStream and a BufferedOutputStream: 120 ms.
With a BufferedInputStream and a BufferedOutputStream: 110 ms.
I don't see anything like the execution times you are getting.
Maybe the problem comes from your readerSuppliers?
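For reference, a minimal sketch of the kind of timing run described above (the file names are placeholders, not the poster's actual setup):
long start = System.nanoTime();
try (InputStream in = new BufferedInputStream(new FileInputStream("input.bin"));       // placeholder input
     OutputStream out = new BufferedOutputStream(new FileOutputStream("copy.bin"))) {  // placeholder output
    byte[] buffer = new byte[102400];  // same 100 KB buffer as the question
    int n;
    while ((n = in.read(buffer)) > 0) {
        out.write(buffer, 0, n);
    }
}
System.out.printf("Copy took %d ms%n", (System.nanoTime() - start) / 1_000_000);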

Stop HtmlUnit download after specified file size is reached

I'm stuck trying to stop a download initiated with HtmlUnit after a certain size is reached. The InputStream
InputStream input = button.click().getWebResponse().getContentAsStream();
downloads the complete file correctly. However, seems like using
OutputStream output = new FileOutputStream(fileName);
int bytesRead;
int total = 0;
while ((bytesRead = input.read(buffer)) != -1 && total < MAX_SIZE) {
output.write(buffer, 0, bytesRead);
total += bytesRead;
System.out.print(total + "\n");
}
output.flush();
output.close();
input.close();
somehow downloads the file to a different location (unknown to me) and, once finished, copies the first MAX_SIZE bytes into the file "fileName". No System.out is printed during this process. Interestingly, while running the debugger in NetBeans and stepping through slowly, the total is printed and I get the MAX_SIZE file.
Varying the buffer size in a range between 1024 and 102400 didn't make any difference.
I also tried Commons'
BoundedInputStream b = new BoundedInputStream(button.click().getWebResponse().getContentAsStream(), MAX_SIZE);
without success.
There's this 2.5-year-old post, but I couldn't figure out how to implement the proposed solution.
Is there something I'm missing in order to stop the download at MAX_SIZE?
(Exceptions handling and other etcetera omitted for brevity)
There is no need to use HtmlUnit for this. Actually, using it for such a simple task is overkill and will make things slow. The best approach I can think of is the following:
final String url = "http://yoururl.com";
final String file = "/path/to/your/outputfile.zip";
final int MAX_BYTES = 1024 * 1024 * 5; // 5 MB
URLConnection connection = new URL(url).openConnection();
InputStream input = connection.getInputStream();
byte[] buffer = new byte[4096];
int pendingRead = MAX_BYTES;
int n;
OutputStream output = new FileOutputStream(new File(file));
while ((n = input.read(buffer)) >= 0 && (pendingRead > 0)) {
output.write(buffer, 0, Math.min(pendingRead, n));
pendingRead -= n;
}
input.close();
output.close();
In this case I've set a maximum download size of 5 MB and a buffer of 4 KB. The file will be written to disk in every iteration of the while loop, which seems to be what you're looking for.
Of course, make sure you handle all the needed exceptions (eg: FileNotFoundException).
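If you'd rather keep the Commons IO BoundedInputStream you already tried, a minimal sketch under the same assumptions (commons-io on the classpath, and the plain URLConnection from the snippet above rather than HtmlUnit's stream) would be to let the wrapper enforce the limit and copy with IOUtils:
try (InputStream limited = new BoundedInputStream(connection.getInputStream(), MAX_BYTES);
     OutputStream output = new FileOutputStream(file)) {
    IOUtils.copy(limited, output);  // reads at most MAX_BYTES bytes before reporting end of stream
}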

Buffered Input Stream does not load file correctly

I have the following code to download a list of files. After downloading, I compare the md5 of the online file with that of the downloaded one.
They match when the file is smaller than 1024 bytes. For anything over 1024 bytes, the md5 sums differ.
I don't know the reason. I suspect it is related to the 1024-byte array size: maybe it writes the full 1024 bytes to the file on every iteration. But then the question is, why does it work for files smaller than 1 KB?
String fileUrl= url_str;
URL url = new URL(fileUrl);
BufferedInputStream bufferedInputStream = new BufferedInputStream(url.openStream());
FileOutputStream fileOutputStream =new FileOutputStream(target);
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(fileOutputStream, 1024);
byte data[] = new byte[1024];
while(bufferedInputStream.read(data, 0, 1024) >0 )
{
bufferedOutputStream.write(data);
}
bufferedOutputStream.close();
bufferedInputStream.close();
This is broken:
while(bufferedInputStream.read(data, 0, 1024) >0 )
{
bufferedOutputStream.write(data);
}
You're assuming that every read call fills up the entire buffer. You should use the return value of read:
int bytesRead;
while((bytesRead = bufferedInputStream.read(data, 0, 1024)) >0 )
{
bufferedOutputStream.write(data, 0, bytesRead);
}
(Additionally, you should be closing all your streams in finally blocks, but that's another matter.)
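For instance, a minimal sketch of the fixed copy using try-with-resources (Java 7+), reusing the url and target variables from the question, so both streams are closed even if the copy throws:
try (BufferedInputStream in = new BufferedInputStream(url.openStream());
     BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(target), 1024)) {
    byte[] data = new byte[1024];
    int bytesRead;
    while ((bytesRead = in.read(data, 0, 1024)) > 0) {
        out.write(data, 0, bytesRead);
    }
}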
After the first read, data[] will contain bytes from that read. During the last read, the array will hold the final n bytes plus leftover bytes from the previous read. So you should check the return value of read: it tells you how many bytes have been read into the array, and you should write out exactly that many.

Calculating Download Speed

I am downloading a file but trying to also determine the download speed in KBps. I came up with an equation, but it is giving strange results.
try (BufferedInputStream in = new BufferedInputStream(url.openStream());
FileOutputStream out = new FileOutputStream(file)) {
byte[] buffer = new byte[4096];
int read = 0;
while (true) {
long start = System.nanoTime();
if ((read = in.read(buffer)) != -1) {
out.write(buffer, 0, read);
} else {
break;
}
int speed = (int) ((read * 1000000000.0) / ((System.nanoTime() - start) * 1024.0));
}
}
It's giving me anywhere between 100 and 300,000. How can I make this give the correct download speed? Thanks
You are not tracking the currentAmount and previousAmount of the file as it downloads.
Example:
int currentAmount = 0;//set this during each loop of the download
/***/
int previousAmount = 0;
int firingTime = 1000;//in milliseconds, here fire every second
public synchronized void run() {
int bytesPerSecond = (currentAmount-previousAmount)/(firingTime/1000);
//update GUI using bytesPerSecond
previousAmount = currentAmount;
}
First, you are measuring read() time + write() time over very short intervals, and the result will vary depending on (disk cache) flushing of the writes.
Put the calculation right after the read().
Second, your buffer size (4096) probably does not match the TCP buffer size (yours is likely smaller), and because of that some reads will be very fast (they are served from the local TCP buffer). Use Socket.getReceiveBufferSize()
and set the size of your buffer accordingly (say, 2x the TCP receive buffer size), and fill it in a nested loop until it is full before calculating.
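A minimal sketch of that idea (purely illustrative; in and out are the streams from the question, and the buffer size is an assumed value rather than one queried from the socket): time only the read() calls and average over the whole transfer.
byte[] buffer = new byte[64 * 1024];  // assumed size; ideally ~2x Socket.getReceiveBufferSize()
long totalBytes = 0;
long readNanos = 0;
int n;
while (true) {
    long t0 = System.nanoTime();
    n = in.read(buffer);
    readNanos += System.nanoTime() - t0;  // measure only the read, not the write
    if (n == -1) {
        break;
    }
    out.write(buffer, 0, n);
    totalBytes += n;
}
double kbPerSecond = (totalBytes / 1024.0) / (readNanos / 1e9);
System.out.printf("Average download speed: %.1f KB/s%n", kbPerSecond);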

java servlet serving a file over HTTP connection

I have the following code (server is Tomcat/Linux).
// Send the local file over the current HTTP connection
FileInputStream fin = new FileInputStream(sendFile);
int readBlockSize;
int totalBytes=0;
while ((readBlockSize=fin.available())>0) {
byte[] buffer = new byte[readBlockSize];
fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, readBlockSize);
totalBytes+=readBlockSize;
}
With some files of type 3gp,
when I attach the debugger, at the line:
outStream.write(buffer, 0, readBlockSize);
it breaks out of the while loop with the following error:
ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) line:299
And the file is not served.
Any clues?
Thanks
A.K.
You can't guarantee that InputStream.read(byte[], int, int) will actually read the desired number of bytes: it may read less. Even your call to available() will not provide that guarantee. You should use the return value from fin.read to find out how many bytes were actually read and only write that many to the output.
I would guess that the problem you see could be related to this. If the block read is less than the available size then your buffer will be partially filled and that will cause problems when you write too many bytes to the output.
Also, don't allocate a new array every time through the loop! That will result in a huge number of needless memory allocations that will slow your code down, and will potentially cause an OutOfMemoryError if available() returns a large number.
Try this:
int size;
int totalBytes = 0;
byte[] buffer = new byte[BUFFER_SIZE];
while ((size = fin.read(buffer, 0, BUFFER_SIZE)) != -1) {
outStream.write(buffer, 0, size);
totalBytes += size;
}
Avoiding these types of problems is why I start with Commons IO. If that's an option, your code would be as follows.
FileInputStream fin = new FileInputStream(sendFile);
int totalBytes = IOUtils.copy(fin, outStream);
No need to reinvent the wheel.
It is possible that the .read() call returns fewer bytes than you requested. This means you need to use the return value of .read() as the argument to the .write() call:
int bytesRead = fin.read(buffer, 0, readBlockSize);
outStream.write(buffer, 0, bytesRead);
Apart from this, it is better to pre-allocate a buffer and reuse it (otherwise you could end up trying to allocate a 2 GB buffer if your file is large :-)):
byte[] buffer = new byte[4096]; // define a constant for this max length
while ((readBlockSize = fin.available()) > 0) {
    if (readBlockSize > 4096) {
        readBlockSize = 4096;
    }
    int bytesRead = fin.read(buffer, 0, readBlockSize);
    outStream.write(buffer, 0, bytesRead);
}
