Fastest way to download files from a site using Java

I am writing Java code to download a large number of zip files from a site over HTTP; each file is around 1 MB (1024 KB) in size.
I know there are a lot of ways of doing this. I am just wondering which is the fastest, and I would like to show the progress of each download, like a percentage number or something.
I am just giving my version of the code; any ideas on how to improve it?
Thanks all.
public static void downloadFile(String downloadUrl, String fileName) throws Exception {
    URL url = new URL(downloadUrl);
    URLConnection connection = url.openConnection();
    int fileSize = connection.getContentLength(); // -1 if the server sends no Content-Length
    float totalDataRead = 0;
    java.io.BufferedInputStream in = new java.io.BufferedInputStream(connection.getInputStream());
    java.io.FileOutputStream fos = new java.io.FileOutputStream(fileName);
    java.io.BufferedOutputStream bout = new java.io.BufferedOutputStream(fos, 1024);
    byte[] data = new byte[1024];
    int i;
    while ((i = in.read(data, 0, 1024)) >= 0) {
        totalDataRead += i;
        bout.write(data, 0, i);
        float percent = (totalDataRead * 100) / fileSize;
        System.out.println((int) percent);
    }
    bout.close();
    in.close();
}

You are optimizing prematurely. The network bandwidth bottleneck is likely going to far outweigh any processing you are doing.
You don't need to wrap the InputStream in a BufferedInputStream. You may want to favor larger read buffer sizes, but that may have minimal effect depending on the underlying implementation of the InputStream returned by the connection, kernel level buffering, etc.
For a progress bar, take what you've read so far and divide it by connection.getContentLength(), but note that getContentLength() may return -1 if the length is unknown (it simply gives you the value of the Content-length header). As you're reading the data, pass the progress info along to whatever you choose to do to display it to the user.
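As a rough sketch of that advice (the 64 KB buffer size and console output are illustrative choices, not part of the original code), progress reporting that tolerates an unknown Content-Length might look like this:
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

public class ProgressDownload {
    // Copies the response body to `out`, printing percent-complete when the
    // server supplied a Content-Length and a raw byte count otherwise.
    static void download(URL url, OutputStream out) throws Exception {
        URLConnection connection = url.openConnection();
        long total = connection.getContentLengthLong(); // -1 when unknown
        try (InputStream in = connection.getInputStream()) {
            byte[] buffer = new byte[64 * 1024];
            long read = 0;
            int n;
            while ((n = in.read(buffer)) > 0) {
                out.write(buffer, 0, n);
                read += n;
                if (total > 0) {
                    System.out.println((100 * read) / total + "%");
                } else {
                    System.out.println(read + " bytes so far");
                }
            }
        }
    }
}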

I don't know, mine took 8 hours. To reduce it from 24 hours I cancelled all other downloads, didn't use the internet, and killed all other background tasks.

Related

Send a stream of images using ImageIO?

I have a ServerSocket and a Socket set up so that the ServerSocket sends a stream of images using ImageIO.write(...) and the Socket tries to read them and update a JFrame with them. So I wondered whether ImageIO could detect the end of an image. (I have absolutely no knowledge of the JPEG format, so I tested it instead.)
Apparently not.
On the server side, I sent images continuously by calling ImageIO.write(...) in a loop with some sleeping in between. On the client side, ImageIO read the first image no problem, but on the next one it returned null. This is confusing. I was expecting it to either block on reading the first image (because it thinks the next image is still part of the same one) or succeed at reading all of them. What is going on? It looks like ImageIO detects the end of the first image, but not the second one. (The images, by the way, are roughly similar to each other.) Is there an easy way to stream images like this, or do I have to make my own mechanism that reads bytes into a buffer until it reaches a specified byte or sequence of bytes, at which point it reads the image out of the buffer?
This is the useful part of my server code:
while (true) {
    Socket sock = s.accept();
    System.out.println("Connection");
    OutputStream out = sock.getOutputStream();
    while (!sock.isClosed()) { // was socket.isClosed(), a typo for sock
        BufferedImage img = // get image
        ImageIO.write(img, "jpg", out);
        Thread.sleep(100);
    }
    System.out.println("Closed");
}
And my client code:
Socket s = new Socket(InetAddress.getByName("localhost"), 1998);
InputStream in = s.getInputStream();
while (!s.isClosed()) {
    BufferedImage img = ImageIO.read(in);
    if (img == null) {
        // this is what happens on the SECOND image
    } else {
        // do something useful with the image
    }
}
ImageIO.read(InputStream) creates an ImageInputStream and calls read(ImageInputStream) internally. That latter method is documented to close the stream when it's done reading the image.
So, in theory, you can just get the ImageReader, create an ImageInputStream yourself, and have the ImageReader read from the ImageInputStream repeatedly.
Except, it appears an ImageInputStream is designed to work with one and only one image (which may or may not contain multiple frames). If you call ImageReader.read(0) more than once, it will rewind to the beginning of the (cached) stream data each time, giving you the same image over and over. ImageReader.read(1) will look for a second frame in a multi-frame image, which of course makes no sense with a JPEG.
So, maybe we can create an ImageInputStream, have the ImageReader read from it, and then create a new ImageInputStream to handle subsequent image data in the stream, right? Except, it appears ImageInputStream does all sorts of caching, read-ahead and pushback, which makes it quite difficult to know the read position of the wrapped InputStream. The next ImageInputStream will start reading data from somewhere, but it's not at the end of the first image's data like we would expect.
The only way to be certain of your underlying stream's position is with mark and reset. Since images can be large, you'll probably need a BufferedInputStream to allow a large readLimit.
This worked for me:
import java.awt.image.BufferedImage;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

private static final Logger logger = Logger.getLogger("ImageStreamReader");
private static final int MAX_IMAGE_SIZE = 50 * 1024 * 1024;

static void readImages(InputStream stream) throws IOException {
    stream = new BufferedInputStream(stream);
    while (true) {
        // Mark the underlying stream so we can rewind to the exact start
        // of this image after the ImageInputStream has read (ahead of) it.
        stream.mark(MAX_IMAGE_SIZE);
        ImageInputStream imgStream = ImageIO.createImageInputStream(stream);
        Iterator<ImageReader> i = ImageIO.getImageReaders(imgStream);
        if (!i.hasNext()) {
            logger.log(Level.FINE, "No ImageReaders found, exiting.");
            break;
        }
        ImageReader reader = i.next();
        reader.setInput(imgStream);
        BufferedImage image = reader.read(0);
        if (image == null) {
            logger.log(Level.FINE, "No more images to read, exiting.");
            break;
        }
        logger.log(Level.INFO,
            "Read {0,number}\u00d7{1,number} image",
            new Object[] { image.getWidth(), image.getHeight() });
        // Rewind to the mark, then skip exactly the bytes the reader
        // consumed, leaving the stream at the start of the next image.
        long bytesRead = imgStream.getStreamPosition();
        stream.reset();
        stream.skip(bytesRead);
    }
}
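For completeness, a hypothetical call site (the host and port are placeholders matching the question) just hands the socket's stream to the method above:
// Hypothetical usage: read every image the server pushes over the socket.
try (Socket s = new Socket("localhost", 1998)) {
    readImages(s.getInputStream());
}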
While perhaps not the optimal way to do this, the following code will get you past the issue you're having. As a previous answer noted, ImageIO does not leave the stream positioned at the end of the image; this will find its way to the next image.
int imageCount = in.read(); // the sender writes the number of images first
for (int i = 0; i < imageCount; i++) {
    BufferedImage img = ImageIO.read(in);
    while (img == null) {
        img = ImageIO.read(in); // retry until the reader lands on the next image
    }
    // do whatever with img
}
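Note this assumes the sending side cooperates by writing the image count before the images. A hedged sketch of that server-side change (the images list is illustrative) could be:
// Hypothetical sender: announce the number of images as a single byte,
// then write them back to back.
out.write(images.size()); // assumes fewer than 256 images
for (BufferedImage img : images) {
    ImageIO.write(img, "jpg", out);
}
out.flush();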
I hit the same problem and found this post. The comment by @VGR inspired me to dig into the problem, and eventually I realized that ImageIO cannot deal with a set of images in the same stream. So I've created a solution (in Scala, sorry) and wrote a blog post with some details and internals:
http://blog.animatron.com/post/80779366767/a-fix-for-imageio-making-animated-gifs-from-streaming
Perhaps it will help somebody as well.

How to efficiently download large csv file using java

I need to provide a feature where a user can download reports in Excel/CSV format in my web application. I once made a module in a web application that created an Excel file, then read it and sent it to the browser, and it worked correctly. This time I don't want to generate an Excel file on disk, as I don't have that level of control over the file system. I guess one way is to build the output in a StringBuffer and set the correct content type (I am not sure about this approach). Another team also has this feature, but they are struggling when the data is very large. What is the best way to provide this feature considering the size of the data could be very large? Is it possible to send data in chunks without the client noticing (except for a delay in downloading)?
One issue I forgot to add: when there is very large data, it also creates problems on the server side (CPU utilization and memory consumption). Is it possible to read a fixed number of records, like 500, send them to the client, then read another 500 until completed?
You can also generate HTML instead of CSV and still set the content type to Excel. This is nice for colouring and styled text.
You can also use gzip compression when the client accepts that compression. Normally there are standard means, like a servlet filter.
Never build the whole report in a StringBuffer (or the better StringBuilder); stream it out instead. If you do not (or cannot) call setContentLength, the output is sent chunked (without a predictable download progress on the client).
URL url = new URL("http://localhost:8080/Works/images/address.csv");
response.setHeader("Content-Type", "text/csv");
response.setHeader("Content-Disposition", "attachment; filename=myFile.csv");
URLConnection connection = url.openConnection();
InputStream stream = connection.getInputStream();
BufferedOutputStream outs = new BufferedOutputStream(response.getOutputStream());
int len;
byte[] buf = new byte[1024];
while ((len = stream.read(buf)) > 0) {
    outs.write(buf, 0, len);
}
stream.close();
outs.close();
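On the follow-up about sending 500 records at a time: below is a minimal sketch inside a servlet, assuming a hypothetical fetchRows(offset, limit) pagination helper (not a real API), that streams the report in batches so the full result set never sits in memory:
// Hedged sketch: write CSV rows batch by batch, flushing between batches.
response.setContentType("text/csv");
response.setHeader("Content-Disposition", "attachment; filename=report.csv");
PrintWriter writer = response.getWriter();
int offset = 0;
final int BATCH = 500;
List<String[]> rows;
while (!(rows = fetchRows(offset, BATCH)).isEmpty()) {
    for (String[] row : rows) {
        writer.println(String.join(",", row)); // naive CSV; real code must quote fields
    }
    writer.flush(); // push this batch to the client before fetching the next
    offset += BATCH;
}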

FileOutputStream is really slow

I am downloading databases from the network, which are between 100 KB and 500 KB in size. Here is my code (irrelevant code removed):
URLConnection uConnection = downloadUrl.openConnection();
InputStream iS = uConnection.getInputStream();
BufferedInputStream bIS = new BufferedInputStream(iS);
byte[] buffer = new byte[1024];
FileOutputStream fOS = new FileOutputStream(db);
int bufferLength = 0;
while ((bufferLength = bIS.read(buffer)) > 0) {
    fOS.write(buffer, 0, bufferLength);
}
fOS.close();
My problem is that it takes a long time to finish the while loop. Have I messed up the code somewhere? It shouldn't take that long for such small files, should it? I'm talking about 1 minute for three files not larger than 1 MB altogether... Thanks in advance!
"Slow" is really rather ambiguous. That being said, considering what you're trying to do you shouldn't be using a BufferedInputStream and your buffer is way too small.
The buffered wrappers are for optimizing small reads/writes. Since all you're doing is trying to read a ton of data as fast as you can, you should just read directly from the InputStream, and use a large buffer (Say, 64k since the underlying native code is probably going to chunk at that size anyway).
byte[] buffer = new byte[65536];
...
while ((bufferLength = iS.read(buffer, 0, buffer.length)) > 0) {
    ...
I've found the real solution in JDK 1.7, which is reliable, fast, and simple, and almost definitively draws a veil over the older java.io solutions. Although the web is still full of examples of copying files in Java using in/out streams, I'd warmly suggest everyone use a single method: java.nio.file.Files.copy(Path source, Path target), with optional parameters for replacing the destination, migrating file-attribute metadata, and even attempting a transactional move of files (where permitted by the underlying OS). That's a really good job, and one we waited for so long! You can easily convert code from copy(File file1, File file2) by appending .toPath() to each File instance (e.g. file1.toPath(), file2.toPath()). Note also that the boolean method isSameFile(file1.toPath(), file2.toPath()) is already used inside the copy method above, but is easily usable in any case you want. If you can't upgrade to 1.7, using the community libraries from Apache (commons-io) or Google (Guava) is still suggested.
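For instance, a minimal sketch (paths and URL are placeholders):
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class NioCopy {
    public static void main(String[] args) throws Exception {
        // Copy one local file to another, replacing the target if it exists.
        Path source = Paths.get("input.zip");
        Path target = Paths.get("output.zip");
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);

        // Files.copy also has an InputStream overload, so a download can be
        // written straight to disk with the same one-liner.
        try (InputStream in = new URL("http://example.com/file.zip").openStream()) {
            Files.copy(in, Paths.get("download.zip"), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}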

Java - Fastest way, and best code to load a URL and get a response from the server

I was curious as to the best and FASTEST way to get a response from the server, say if I used a for loop to load a URL that returns an XML file. Which way could I use to load the URL and get the response 10 times in a row? Speed is the most important thing. I know it can only go as fast as your internet connection, but I need a way to load the URL as fast as my internet will allow and then put the whole output of the URL in a string so I can append it to a JTextArea. This is the code I've been using, but I'm seeking faster alternatives if possible:
int times = Integer.parseInt(jTextField3.getText());
for (int abc = 0; abc != times; abc++) {
    try {
        URL gameHeader = new URL(jTextField2.getText());
        InputStream in = gameHeader.openStream();
        byte[] buffer = new byte[1024];
        try {
            for (int cwb; (cwb = in.read(buffer)) != -1;) {
                jTextArea1.append(new String(buffer, 0, cwb));
            }
        } catch (IOException e) {}
    } catch (MalformedURLException e) {} catch (IOException e) {}
}
Is there anything that would be faster than this?
Thanks
-CLUEL3SS
This seems like a job for Java NIO (non-blocking I/O). This article is from the Java 1.4 era but will still give you a good understanding of how to set up NIO. Since then NIO has evolved a lot, and you may need to look up the API docs for Java 6 or Java 7 to find out what's new.
This solution is probably best as an async option. Basically, it will allow you to load 10 URLs without waiting for each one to complete before moving on and loading another.
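NIO aside, a plainer way to get the same overlap is a thread pool. This is a hedged sketch using ExecutorService instead of the NIO approach described above (the URL and count are placeholders):
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelFetch {
    // Fetches the same URL ten times concurrently and collects the bodies.
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            results.add(pool.submit(() -> fetch(new URL("http://example.com/data.xml"))));
        }
        for (Future<String> f : results) {
            System.out.println(f.get().length() + " chars"); // append to the JTextArea on the EDT instead
        }
        pool.shutdown();
    }

    static String fetch(URL url) throws Exception {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] b = new byte[8192];
            int n;
            while ((n = in.read(b)) != -1) {
                buf.write(b, 0, n);
            }
            return buf.toString("UTF-8"); // assumes UTF-8 content
        }
    }
}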
You can't load text this way, as the 1024-byte boundary could break an encoded character in two.
Copy all the data into a ByteArrayOutputStream and call toString() on it, or read the text as text using a BufferedReader.
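A minimal sketch of the second suggestion, reusing the question's gameHeader URL (the charset is an assumption; ideally take it from the response headers):
// Read the whole response as text without splitting multi-byte characters.
BufferedReader reader = new BufferedReader(
        new InputStreamReader(gameHeader.openStream(), "UTF-8"));
StringBuilder sb = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    sb.append(line).append('\n');
}
reader.close();
jTextArea1.append(sb.toString()); // one big append instead of many small ones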
Use a BufferedReader; use a much larger buffer size than 1024; don't swallow exceptions. You could also try re-using the same URL object instead of creating a new one each time, which might help with connection pooling.
But why would you want to read the same URL 10 times in a row?

Copy binary data from URL to file in Java without intermediate copy

I'm updating some old code to grab some binary data from a URL instead of from a database (the data is about to be moved out of the database and will be accessible by HTTP instead). The database API seemed to provide the data as a raw byte array directly, and the code in question wrote this array to a file using a BufferedOutputStream.
I'm not at all familiar with Java, but a bit of googling led me to this code:
URL u = new URL("my-url-string");
URLConnection uc = u.openConnection();
uc.connect();
InputStream in = uc.getInputStream();
ByteArrayOutputStream out = new ByteArrayOutputStream();
final int BUF_SIZE = 1 << 8;
byte[] buffer = new byte[BUF_SIZE];
int bytesRead = -1;
while ((bytesRead = in.read(buffer)) > -1) {
    out.write(buffer, 0, bytesRead);
}
in.close();
fileBytes = out.toByteArray();
That seems to work most of the time, but I have a problem when the data being copied is large - I'm getting an OutOfMemoryError for data items that worked fine with the old code.
I'm guessing that's because this version of the code has multiple copies of the data in memory at the same time, whereas the original code didn't.
Is there a simple way to grab binary data from a URL and save it in a file without incurring the cost of multiple copies in memory?
Instead of writing the data to a byte array and then dumping it to a file, you can directly write it to a file by replacing the following:
ByteArrayOutputStream out = new ByteArrayOutputStream();
With:
FileOutputStream out = new FileOutputStream("filename");
If you do so, there is no need for the call out.toByteArray() at the end. Just make sure you close the FileOutputStream object when done, like this:
out.close();
See the documentation of FileOutputStream for more details.
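Putting the pieces together, a minimal sketch of the streaming version (the URL string and filename are placeholders from the snippets above):
// Stream the URL straight to disk; only one small buffer lives in memory.
URL u = new URL("my-url-string");
URLConnection uc = u.openConnection();
uc.connect();
InputStream in = uc.getInputStream();
FileOutputStream out = new FileOutputStream("filename");
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = in.read(buffer)) > -1) {
    out.write(buffer, 0, bytesRead);
}
in.close();
out.close();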
I don't know what you mean by "large" data, but try using the JVM parameter
java -Xmx256m ...
which sets the maximum heap size to 256 MB (or any value you like).
If you need the Content-Length and your web server is somewhat standards-conforming, then it should provide you a "Content-Length" header.
URLConnection#getContentLength() should give you that information up front so that you are able to create your file. (Be aware that if your HTTP server is misconfigured or under the control of an evil entity, that header may not match the number of bytes received. In that case, why don't you stream to a temp file first and copy that file later?)
In addition to that: a ByteArrayOutputStream is a horrible memory allocator. It always doubles the buffer size, so if you read a 32 MB + 1 byte file, you end up with a 64 MB buffer. It might be better to implement your own, smarter byte-array stream, like this one:
http://source.pentaho.org/pentaho-reporting/engines/classic/trunk/core/source/org/pentaho/reporting/engine/classic/core/util/MemoryByteArrayOutputStream.java
Subclassing ByteArrayOutputStream gives you access to the buffer and the number of bytes in it.
But of course, if all you want to do is store the data in a file, you are better off using a FileOutputStream.
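As a tiny illustration of that subclassing trick (the class name is made up), buf and count are protected fields of ByteArrayOutputStream, so a subclass can expose the backing array without the defensive copy that toByteArray() makes:
import java.io.ByteArrayOutputStream;

// Hypothetical subclass: hands out the internal buffer directly.
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {
    byte[] rawBuffer() { return buf; } // backing array; may be larger than size()
    // the inherited size() already reports how many bytes in buf are valid
}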
