How to efficiently download large csv file using java

I need to provide a feature that lets users download reports in Excel/CSV format from my web application. I once built a module that created an Excel file on the server, read it back and sent it to the browser, and that worked correctly. This time I don't want to generate a file on disk, as I don't have that level of control over the file system. I guess one way is to build the content in a StringBuffer and set the correct content type (I am not sure about this approach). Another team has this feature too, but they struggle when the data is very large. What is the best way to provide this feature, considering that the data could be huge? Is it possible to send the data in chunks without the client noticing (apart from the download taking longer)?
One issue I forgot to mention: very large data also causes problems on the server side (CPU utilization and memory consumption). Is it possible to read a fixed number of records, say 500, send them to the client, then read the next 500, and so on until done?

You can also generate HTML instead of CSV and still set the content type to Excel. This is nice for colouring and styled text.
You can also use gzip compression when the client accepts it. Normally there are standard means for that, such as a servlet filter.
Never build the output in a StringBuffer (or the better StringBuilder); stream it out instead. If you do not (or cannot) call setContentLength, the response is sent chunked, so the client cannot show predictive progress.
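The streaming advice above can be sketched roughly as follows. This is only an illustration under stated assumptions: CsvStreamer, its method names and the batchSize parameter are hypothetical, and the row iterator stands in for whatever pages through the database (for example 500 records at a time).

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: class, method and parameter names are illustrative only.
class CsvStreamer {

    /** Writes rows as CSV, flushing after every batchSize rows; returns the row count. */
    static long stream(Iterator<List<String>> rows, Writer out, int batchSize) throws IOException {
        long written = 0;
        while (rows.hasNext()) {
            List<String> row = rows.next();
            for (int i = 0; i < row.size(); i++) {
                if (i > 0) {
                    out.write(',');
                }
                out.write(escape(row.get(i)));
            }
            out.write("\r\n");
            if (++written % batchSize == 0) {
                out.flush(); // push the finished batch towards the client
            }
        }
        out.flush();
        return written;
    }

    /** Quotes a field when it contains a comma, a double quote or a line break. */
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        return needsQuotes ? "\"" + field.replace("\"", "\"\"") + "\"" : field;
    }
}
```

In a servlet you would presumably set the content type and Content-Disposition headers first and then pass response.getWriter() as the Writer. Only one row is held in memory at a time, so server memory use no longer grows with the size of the report.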

// Requires java.io.*, java.net.URL and java.net.URLConnection.
// Set the headers before writing any of the body.
response.setContentType("text/csv");
response.setHeader("Content-Disposition", "attachment; filename=myFile.csv");
URL url = new URL("http://localhost:8080/Works/images/address.csv");
URLConnection connection = url.openConnection();
// try-with-resources closes both streams even if the copy fails.
try (InputStream stream = connection.getInputStream();
     BufferedOutputStream outs = new BufferedOutputStream(response.getOutputStream())) {
    byte[] buf = new byte[8192];
    int len;
    // read() returns -1 at the end of the stream.
    while ((len = stream.read(buf)) != -1) {
        outs.write(buf, 0, len);
    }
}

Related

Is there any way to start reading from specific position of an URL byte stream?

My idea is to divide a big response text into small parts and load them concurrently.
The following code opens a stream from a URL, but I want to load the whole content from multiple threads to improve performance and then merge the parts into a single result. However, the method returns a ReadableByteChannel, which cannot be given a start position, so I have to transfer the data linearly:
URL url = new URL("link");
InputStream fromStream = url.openStream();
ReadableByteChannel fromChannel = Channels.newChannel(fromStream);
Is there any way to specify the position, as with SeekableByteChannel (it seems this interface only works with files)? Thank you :D

If you can manipulate the request before it becomes a stream, then yes: use the Range HTTP header to specify the chunk of data you want.
See Downloading a portion of a File using HTTP Requests
If not, you will have to manually read past the data you don't need.
See Given a Java InputStream, how can I determine the current offset in the stream?
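As a rough sketch of the Range approach: the helper below only computes Range header values for consecutive parts of a resource. ByteRanges and its method names are hypothetical, and it assumes you already know the total length (for example from a Content-Length header) and that the server honours range requests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: class and method names are illustrative only.
class ByteRanges {

    /** Formats an inclusive byte range as an HTTP Range header value. */
    static String header(long first, long last) {
        return "bytes=" + first + "-" + last;
    }

    /** Splits totalLength bytes into consecutive ranges of at most partSize bytes. */
    static List<String> split(long totalLength, long partSize) {
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < totalLength; start += partSize) {
            long end = Math.min(start + partSize, totalLength) - 1;
            ranges.add(header(start, end));
        }
        return ranges;
    }
}
```

Each worker thread would then open its own connection and call connection.setRequestProperty("Range", range) before reading. A server that supports ranges answers with status 206 (Partial Content); a plain 200 means it ignored the header and is sending the whole body.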

Not able to generate multiple documents using ServletOutputStream in Java [duplicate]

For example, I would like to download one zip file and one CSV file in one response. Is there any way to do this other than compressing the two files into one zip file?

Although ServletResponse is not meant to do this, we can programmatically tweak it to send multiple files, which all client browsers except IE seem to handle properly. A sample code snippet is given below.
response.setContentType("multipart/x-mixed-replace;boundary=END");
ServletOutputStream out = response.getOutputStream();
for (File f : files) {
    // Each part: boundary line, part headers, blank line, then the file bytes.
    out.println("--END");
    out.println("Content-Type: application/octet-stream");
    out.println();
    try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(f))) {
        int data;
        while ((data = in.read()) != -1) {
            out.write(data);
        }
    }
    out.println();
    out.flush();
}
// The closing boundary ends the multipart body.
out.println("--END--");
out.flush();
out.close();
This will not work in IE browsers.
N.B. Exception handling is not included.

Code developed by Jason Hunter to handle servlet requests and responses with multiple parts has been the de facto standard for years. You can find it at servlets.com

No, you cannot do that. The reason is that whenever you want to send data in a request, you use the stream available in the request and retrieve the data using request.getRequestParameter("streamParamName").getInputStream(). Also note that once you have consumed this stream, you will not be able to read it again.
The example mentioned above is a tweak that Google also uses when sending multipart email with multiple attachments. To achieve that, they define boundaries for each attachment, and the client has to take care of these boundaries when retrieving and rendering the information.

Sending images through a socket using qt and read it using java

I'm trying to send an image from a Qt server through a socket and display it in a client written in Java. Until now I have only transferred strings between the two sides; I tried different examples for sending images, but with no results.
The code I use to send the image in Qt is:
QImage image;
image.load("../punton.png");
qDebug()<<"Image loaded";
QByteArray ban; // Construct a QByteArray object
QBuffer buffer(&ban); // Construct a QBuffer object using the QbyteArray
image.save(&buffer, "PNG"); // Save the QImage data into the QBuffer
socket->write(ban);
On the other end, the Java code that reads it is:
BufferedInputStream in = new BufferedInputStream(socket.getInputStream(),1);
File f = new File("C:\\Users\\CLOUDMOTO\\Desktop\\JAVA\\image.png");
System.out.println("Receiving...");
FileOutputStream fout = new FileOutputStream(f);
byte[] by = new byte[1];
for(int len; (len = in.read(by)) > 0;){
fout.write(by, 0, len);
System.out.println("Done!");
}
The Java process gets stuck until I close the Qt server, and after that the generated file is corrupt.
I'd appreciate any help, because I need to get this working and I'm new to programming in both languages.
I have also tried the following commands; with them the receiving process now ends and shows a message, but the file is still corrupt.
In Qt:
socket->write(ban+"-1");
socket->close();
And in Java:
System.out.println(by);
String received = new String(by, 0, by.length, "ISO8859_1");
System.out.println(received);
System.out.println("Done!");

You cannot transport a file over a socket in such a simple way. You are not giving the receiver any clue about how many bytes are coming. Read the Javadoc for InputStream.read() carefully: your receiver loops endlessly because read() keeps waiting for the next byte until the stream is closed. You partially fixed that by calling socket->close() on the sender side. Ideally, you should write the length of ban to the socket before the buffer itself, read that length on the receiver side, and then receive exactly that many bytes. Also flush and close the receiver's output stream before trying to read the received file.
I have absolutely no idea what you wanted to achieve with socket->write(ban+"-1"). Your logged output starts with %PNG, which is correct. I can see "-1" at the end, which means you appended characters to the binary image data and thereby corrupted it. Why?
And no, a 1x1 PNG is not 1 byte in size. It is not even 4 bytes (red, green, blue, alpha). PNG needs things like a header and checksums. Have a look at the size of the file on the filesystem: that is the size your by array needs.
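The length-prefix idea could look roughly like this on the Java side. It is a sketch under an assumption that is not in the original code: the Qt sender first writes the payload length as a 4-byte big-endian integer and then the image bytes. FramedReceiver and readFrame are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: assumes the sender writes a 4-byte big-endian length before the payload.
class FramedReceiver {

    /** Reads one frame: a 4-byte length followed by exactly that many payload bytes. */
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt(); // big-endian, matching the assumed sender format
        if (length < 0) {
            throw new IOException("Negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        // Blocks until all bytes have arrived; throws EOFException if the stream ends early.
        data.readFully(payload);
        return payload;
    }
}
```

The receiver no longer depends on the socket being closed to know when the image is complete, and the returned array can be written to the file in a single call.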

Copy binary data from URL to file in Java without intermediate copy

I'm updating some old code to grab some binary data from a URL instead of from a database (the data is about to be moved out of the database and will be accessible by HTTP instead). The database API seemed to provide the data as a raw byte array directly, and the code in question wrote this array to a file using a BufferedOutputStream.
I'm not at all familiar with Java, but a bit of googling led me to this code:
URL u = new URL("my-url-string");
URLConnection uc = u.openConnection();
uc.connect();
InputStream in = uc.getInputStream();
ByteArrayOutputStream out = new ByteArrayOutputStream();
final int BUF_SIZE = 1 << 8;
byte[] buffer = new byte[BUF_SIZE];
int bytesRead = -1;
while((bytesRead = in.read(buffer)) > -1) {
out.write(buffer, 0, bytesRead);
}
in.close();
fileBytes = out.toByteArray();
That seems to work most of the time, but I have a problem when the data being copied is large - I'm getting an OutOfMemoryError for data items that worked fine with the old code.
I'm guessing that's because this version of the code has multiple copies of the data in memory at the same time, whereas the original code didn't.
Is there a simple way to grab binary data from a URL and save it in a file without incurring the cost of multiple copies in memory?

Instead of writing the data to a byte array and then dumping it to a file, you can directly write it to a file by replacing the following:
ByteArrayOutputStream out = new ByteArrayOutputStream();
With:
FileOutputStream out = new FileOutputStream("filename");
If you do so, there is no need for the call out.toByteArray() at the end. Just make sure you close the FileOutputStream object when done, like this:
out.close();
See the documentation of FileOutputStream for more details.
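On Java 7 or later, the same copy can also be delegated to java.nio.file.Files, which streams through a small internal buffer instead of holding the whole payload in memory. The sketch below is one possible shape; StreamToFile and save are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: class and method names are illustrative only.
class StreamToFile {

    /** Copies the stream to target, replacing any existing file; returns the bytes written. */
    static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) { // closes the stream when the copy finishes or fails
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call might look like StreamToFile.save(new URL("my-url-string").openStream(), Paths.get("filename")).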

I don't know what you mean by "large" data, but try using the JVM parameter
java -Xmx256m ...
which sets the maximum heap size to 256 MB (or any value you like).

If you need the content length and your web server is reasonably standards-conforming, it should provide a "Content-Length" header.
URLConnection#getContentLength() should give you that information up front, so that you can create your file. (Be aware that if the HTTP server is misconfigured or under the control of an evil entity, that header may not match the number of bytes received. In that case, why not stream to a temp file first and copy that file later?)
In addition: a ByteArrayOutputStream is a horrible memory allocator. It always doubles the buffer size, so if you read a 32 MB + 1 byte file, you end up with a 64 MB buffer. It might be better to implement your own, smarter byte-array stream, like this one:
http://source.pentaho.org/pentaho-reporting/engines/classic/trunk/core/source/org/pentaho/reporting/engine/classic/core/util/MemoryByteArrayOutputStream.java

Subclassing ByteArrayOutputStream gives you access to the buffer and the number of bytes in it.
But of course, if all you want to do is store the data in a file, you are better off using a FileOutputStream.
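A minimal sketch of that subclassing idea is shown below. It relies on the protected buf and count fields of ByteArrayOutputStream; the class name ExposedByteArrayOutputStream and its accessor names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical subclass: exposes the internal buffer so callers can avoid the copy made by toByteArray().
class ExposedByteArrayOutputStream extends ByteArrayOutputStream {

    ExposedByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    /** Returns the internal buffer without copying; only the first length() bytes are valid. */
    byte[] buffer() {
        return buf;
    }

    /** Returns the number of valid bytes currently in the buffer. */
    int length() {
        return count;
    }
}
```

Passing a realistic initial capacity (for instance the Content-Length, when it is known) avoids most of the buffer doubling, and a caller can write the data out with something like out.write(stream.buffer(), 0, stream.length()) without creating a second array.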
