Broken PDF document when downloading from web application - java

I have a "simple" problem downloading an auto-generated PDF, ZIP or XLSX file in Spring MVC.
I have an Exporter component that can export a dataset into CSV, PDF or XLSX format (plain or compressed) directly to a generic OutputStream.
Running it in JUnit with temporary files as the target OutputStreams succeeds. However, if I run that component using the response OutputStream provided by Spring MVC, the response gets corrupted:
@RequestMapping("/export")
public void export(@RequestBody ..., HttpServletResponse response)
throws IOException, ExportException
{
webExportService.export(MyEntity.class, exportRequest, sessionFactory, response.getOutputStream(), response);
}
public <T> void export(Class<T> clazz, RequestBean exportInfo, SessionFactory sessionFactory, OutputStream outputStream,
HttpServletResponse httpResponse) {
httpResponse.setContentType(getMimeType());
httpResponse.setHeader("Content-Disposition", "attachment; filename=" + getFileName() + "");
httpResponse.setHeader("Pragma", "public");
httpResponse.setHeader("Cache-control", "must-revalidate");
export(clazz, exportInfo, sessionFactory, outputStream); //this will write to the outputStream
}
[Edit] MAJOR REPHRASE REQUIRED. Sorry, now that I have re-read the question I see that I phrased it badly.
When I try to open files of ZIP, PDF and XLSX types, those files cannot be opened by their default editors. Opening them with Notepad++, compared to a temporary folder copy, shows something interesting. The binary characters are different in the two files, like in the example below:
[Images omitted: Notepad++ views of the good file and the bad file]
Output stream is type org.apache.catalina.connector.CoyoteOutputStream.
It looks like it is a content encoding problem. How do I find out the correct encoding for those files, if leaving to default is not working?
Side note: I initially posted that the files were "truncated", because opening them (with Notepad++, which I did not mention in the earlier version of the question) showed just a few dozen lines. That turned out to be correct: I was misled by the small size of the file and thought that a double newline was cutting off the response entity.
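One way to narrow this down is to compare the two files byte by byte rather than by eye. The following is only a sketch: the names `BinaryDiff`, `firstMismatch` and `roundTripThroughString` are made up for illustration. The second method shows one common way binary output gets damaged, namely bytes passing through a character encoding (for example via a `Writer` or a `String`). Whether that is what happens in the export above is an assumption; the question does not confirm it.

```java
import java.nio.charset.StandardCharsets;

final class BinaryDiff {

    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    static int firstMismatch(byte[] good, byte[] bad) {
        int common = Math.min(good.length, bad.length);
        for (int i = 0; i < common; i++) {
            if (good[i] != bad[i]) {
                return i;
            }
        }
        // One array is a prefix of the other: they differ where the shorter one ends.
        return good.length == bad.length ? -1 : common;
    }

    /** Simulates binary data being decoded to text and encoded again as UTF-8. */
    static byte[] roundTripThroughString(byte[] binary) {
        return new String(binary, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }
}
```

Reading both files with `java.nio.file.Files.readAllBytes` and passing them to `firstMismatch` gives the offset where the corruption starts. A byte sequence such as `EF BF BD` at that offset hints at a text conversion somewhere between the exporter and the client.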

Related

Tomcat HttpServletResponse writing to getOutputStream() always causes OutOfMemory error for large file downloads [duplicate]

I am using this code to download an existing file from the server on Liferay (6.2) into a local pc:
`
File file = getFile(diskImage.getImageType(), diskImage.getId());
HttpServletRequest httpReq = PortalUtil.getHttpServletRequest(request);
HttpServletResponse httpResp = PortalUtil.getHttpServletResponse(response);
httpResp.setContentType("application/octet-stream");
httpResp.setHeader("Content-Transfer-Encoding", "binary");
httpResp.setHeader("Content-Length", String.valueOf(file.length()));
httpResp.setHeader("Content-Disposition", "attachment; filename=" + file.getName());
try (InputStream input = new FileInputStream(file)) {
ServletResponseUtil.sendFile(httpReq, httpResp, file.getName(), input, "application/octet-stream");
} catch (Exception e) {
throw new FilesManagerException(e);
}
}
`
This code works fine only for small files. Downloading large files (circa 2 GB) throws javax.portlet.PortletException: Error occurred during request processing: Java heap space.
How can I fix this code so that it also works for larger files?
I guess a suitable approach would be to use some kind of buffer for large files. I tried that, but afterwards it did not work even for the smaller files.
First of all: I'm assuming you're doing this in a render method - and this is just plain wrong. Sooner or later this will break, because you don't have control over the output stream: It might already be committed and transmit data to the browser when your portlet starts to render. In render you always must generate a portlet's HTML code.
Instead, you should go to the resource serving phase of a portlet. With the ResourceRequest and ResourceResponse, you have a very similar support for setting mimetypes as with HttpServletResponse.
And for exactly that reason, ServletResponseUtil is indeed the wrong place to look. If you use anything from Liferay, you should look at PortletResponseUtil. It has various sendFile methods: some accept a byte[], others accept a stream or a file. I'd recommend trying these. If they still fail, look at the implementation you end up in. In the worst case, just don't use any of the Util methods. Copying content from one stream to another is not too bad. (Actually, you give no clue about the static type of your variable input in the question: if that's a byte[], there's your problem.)
You might want to file an issue with Liferay, if indeed the pure stream-transfer does read the whole file into memory, but your quickfix (in case this is indeed a bug) would be to copy the data yourself.
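For reference, copying the data yourself can be sketched as below. The class and method names (`StreamCopy`, `copy`) are invented for this example, and the 8 KB buffer size is an arbitrary choice. Only one buffer's worth of data is held in memory at any time, regardless of the file size.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamCopy {

    /** Copies everything from in to out using a fixed 8 KB buffer; returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

In a serveResource method this would presumably be called with a FileInputStream for the file and the stream returned by ResourceResponse.getPortletOutputStream(), after the content type and headers have been set.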
Thanks for the thoughts. In the end I used the PortletResponseUtil.sendFile(...) method and changed actionURL to resourceURL in the .jsp file, so that I could implement serveResource() with the above-mentioned method. Everything seems to be working fine.
ServletResponseUtil.sendFile(httpReq, httpResp, file.getName(), input, "application/octet-stream"); what's this?
Don't read the whole file at once. Use a buffer.
response.reset();
response.setContentType("application/octet-stream");
response.addHeader("Content-Disposition", "attachment;filename=" + new String(filename.getBytes(), "utf-8"));
response.addHeader("Content-Length", "" + file.length());
// Copy the file in 4 MB chunks so that it is never held in memory as a whole.
try (InputStream fis = new FileInputStream(file);
     OutputStream toClient = new BufferedOutputStream(response.getOutputStream())) {
    byte[] buffer = new byte[1024 * 1024 * 4];
    int i;
    while ((i = fis.read(buffer)) != -1) {
        toClient.write(buffer, 0, i);
    }
    toClient.flush();
}

Why does Apache POI XSSF fail to write to servlet response outputStream?

I've created a servlet which creates an XSSFWorkbook and writes it to the response's outputStream. Strangely enough, when I try to test the functionality in the browser (Chrome v54.0.2840.98) I'm only able to get the xlsx file once (the file opens up without any formatting issues and has the expected content as well) but if I navigate away from the page where this feature is available with the 'back' button in the browser and go immediately back to same page and try to get the same file again I'm not getting anything in the response. Additionally, my other servlets stop working too until I open a new tab. I've given it a shot in a different browser (Safari v9.1.2 (11601.7.7)) and everything is working as expected, no issues whatsoever.
Here's the code that I use:
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
DateTime now = new DateTime();
Workbook workbook = createWorkbook(); //creates an XSSFWorkbook
response.setContentType("application/vnd.ms-excel");
response.setHeader(
"Content-Disposition",
"attachment; filename=\"excel-export-" + now.toString("yyyy-MMM-dd") + ".xlsx\""
);
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(response.getOutputStream());
workbook.write(bufferedOutputStream);
}
When I run the code in the development environment I don't get any exception and the status is 200, but still nothing gets downloaded. Occasionally I get a
org.apache.poi.openxml4j.exceptions.OpenXML4JRuntimeException:Fail to save: an error occurs while saving the package : The part /docProps/core.xml fail to be saved in the stream with marshaller org.apache.poi.openxml4j.opc.internal.marshallers.ZipPackagePropertiesMarshaller
Which, after extensive debugging, I can reproduce by passing a null to the workbook.write() function:
workbook.write(null);
Any help is appreciated, thank you for reading!
Javax Servlet API v2.5
Apache-POI v3.15
Java 8 JDK(1.8.0_111)
UPDATE
If I get an exception it looks like this(stacktrace):
https://gist.githubusercontent.com/darkstar85/b151e53b64498e1fb476d0f6f8ea4eaf/raw/ffb078c54b850922fcd4e467a6ebf9695aeb7354/gistfile1.txt
Looking at the code of Apache POI, this can only happen if StreamHelper.saveXmlInStream(xmlDoc, out) returns false. That in turn only returns false if the XML transformation fails at the line trans.transform(xmlSource, outputTarget);.
However, it only performs an identity transformation (i.e. a simple copy) here, so it can only fail if the XML parser available in your application somehow does not work correctly.
Therefore I would check which JDK you are using and whether any additional XML parsers, e.g. Xerces, have been added to your application, and see if you can remove them.
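A quick way to see which XML implementations the application actually picks up is to print the concrete factory classes and where they were loaded from. This is just a diagnostic sketch: `XmlImplProbe` and `describe` are names invented here, and the code would need to run inside the web application (for instance temporarily from the servlet) to reflect its class path.

```java
import java.security.CodeSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;

final class XmlImplProbe {

    /** Describes a class as "name (loaded from location)". */
    static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes bundled with the JDK usually have no code source.
        String location = (source == null || source.getLocation() == null)
                ? "JDK / bootstrap class path"
                : source.getLocation().toString();
        return type.getName() + " (loaded from " + location + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(TransformerFactory.newInstance().getClass()));
        System.out.println(describe(DocumentBuilderFactory.newInstance().getClass()));
    }
}
```

If one of the printed locations points to a jar inside WEB-INF/lib rather than the JDK, that jar is a candidate for removal.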

Java file download code design problem

I have a Java project which is used as a component in a webapp. This java code writes an xls file in a specific folder. I want to provide a download functionality for this file which should be triggered as soon as file writing is done.
The problem is: without a server environment, how can I write the download functionality?
Don't write to file in a specific folder. Just write to the HTTP response body immediately. The downloading job should just be done in the webapp's code. I assume that you're using Servlets. If you set the HTTP response Content-Disposition header to attachment, then the browser will pop a Save as dialogue. If you also set the Content-Type header, then the browser will understand what to do with it (e.g. it will then be able to ask Do you want to open it in Excel or to save? and so on).
response.setHeader("Content-Type", "application/vnd.ms-excel");
response.setHeader("Content-Disposition", "attachment;filename=\"" + filename + "\"");
// Now write xls to response.getOutputStream() instead of FileOutputStream.
If the API of that Java project is well designed, then you should have a method something like this:
public void writeXls(OutputStream output) throws IOException {
// Do your job to write xls to output. E.g. if you were using JExcelAPI:
// WritableWorkbook workBook = Workbook.createWorkbook(output);
// ...
}
This way you can call it in the servlet as follows after setting the aforementioned headers:
yourClass.writeXls(response.getOutputStream());
What is more, it can easily be reused and tested in a plain vanilla Java application as follows:
yourClass.writeXls(new FileOutputStream("/path/to/foo.xls"));
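A side note on the Content-Disposition header used above: file names containing spaces or non-ASCII characters need some care. The helper below is a sketch, not part of the original answer, and the names `ContentDisposition` and `attachment` are made up. It follows the `filename*=UTF-8''...` form from RFC 6266 and adds a plain ASCII fallback, on the assumption that replacing unusual characters with underscores is acceptable for older clients.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class ContentDisposition {

    /** Builds an attachment header value with an ASCII fallback and a UTF-8 encoded name. */
    static String attachment(String filename) throws UnsupportedEncodingException {
        // Fallback for old clients: replace anything outside a safe ASCII set.
        String ascii = filename.replaceAll("[^A-Za-z0-9._-]", "_");
        // URLEncoder targets HTML forms, so adjust the characters it treats differently.
        String encoded = URLEncoder.encode(filename, "UTF-8")
                .replace("+", "%20")
                .replace("*", "%2A");
        return "attachment; filename=\"" + ascii + "\"; filename*=UTF-8''" + encoded;
    }
}
```

The result would be passed to response.setHeader("Content-Disposition", ...) in place of the hand-built string.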
This is how I do it. I offer an SQL file for download on my page.
response.setHeader("Content-Disposition", "attachment; " +
"filename=ContactPurge.sql");
response.setContentType("application/x-sql-data");
response.getWriter().write(procsql);
response.getWriter().write(sql);
response.flushBuffer();

How to extract a single file from a remote archive file?

Given
URL of an archive (e.g. a zip file)
Full name (including path) of a file inside that archive
I'm looking for a way (preferably in Java) to create a local copy of that file, without downloading the entire archive first.
From my (limited) understanding it should be possible, though I have no idea how to do that. I've been using TrueZip, since it seems to support a large variety of archive types, but I have doubts about its ability to work in such a way. Does anyone have any experience with that sort of thing?
EDIT: being able to also do that with tarballs and zipped tarballs is also important for me.
Well, at a minimum, you have to download the portion of the archive up to and including the compressed data of the file you want to extract. That suggests the following solution: open a URLConnection to the archive, get its input stream, wrap it in a ZipInputStream, and repeatedly call getNextEntry() and closeEntry() to iterate through all the entries in the file until you reach the one you want. Then you can read its data using ZipInputStream.read(...).
The Java code would look something like this:
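URL url = new URL("http://example.com/path/to/archive");
try (ZipInputStream zin = new ZipInputStream(url.openStream())) {
    ZipEntry ze = zin.getNextEntry(); // getNextEntry() also closes the previous entry
    while (ze != null && !ze.getName().equals(pathToFile)) {
        ze = zin.getNextEntry();
    }
    if (ze == null) throw new FileNotFoundException(pathToFile);
    // ze.getSize() can be -1, so read until the end of the entry instead
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[8192];
    for (int n; (n = zin.read(buffer)) != -1; ) {
        out.write(buffer, 0, n);
    }
    byte[] bytes = out.toByteArray();
}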
This is, of course, untested.
Contrary to the other answers here, I'd like to point out that ZIP entries are compressed individually, so (in theory) you don't need to download anything more than the directory and the entry itself. The server would need to support the Range HTTP header for this to work.
The standard Java API only supports reading ZIP files from local files and input streams. As far as I know there's no provision for reading from random access remote files.
Since you're using TrueZip, I recommend implementing de.schlichtherle.io.rof.ReadOnlyFile using Apache HTTP Client and creating a de.schlichtherle.util.zip.ZipFile with that.
This won't provide any advantage for compressed TAR archives since the entire archive is compressed together (beyond just using an InputStream and killing it when you have your entry).
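To give an idea of what the Range-based approach involves for plain ZIP files: the client first requests only the tail of the archive, locates the end-of-central-directory record there, and from it learns where the central directory (the list of entries) sits. The sketch below covers just that first step. The names `ZipTail`, `rangeHeader`, `findEocd` and `centralDirectory` are invented here, ZIP64 archives are ignored, and it is assumed that the server honours Range requests (it should answer with status 206).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ZipTail {

    /** Value for an HTTP Range request header covering bytes start..end inclusive. */
    static String rangeHeader(long start, long end) {
        return "bytes=" + start + "-" + end;
    }

    /** Finds the end-of-central-directory record in the tail of a ZIP, or returns -1. */
    static int findEocd(byte[] tail) {
        // The record is at least 22 bytes long and starts with the signature "PK\5\6".
        for (int i = tail.length - 22; i >= 0; i--) {
            if (tail[i] == 0x50 && tail[i + 1] == 0x4B && tail[i + 2] == 0x05 && tail[i + 3] == 0x06) {
                return i;
            }
        }
        return -1;
    }

    /** Returns {offset, size} of the central directory as stored in the record at eocd. */
    static long[] centralDirectory(byte[] tail, int eocd) {
        ByteBuffer buf = ByteBuffer.wrap(tail).order(ByteOrder.LITTLE_ENDIAN);
        long size = buf.getInt(eocd + 12) & 0xFFFFFFFFL;
        long offset = buf.getInt(eocd + 16) & 0xFFFFFFFFL;
        return new long[] {offset, size};
    }
}
```

With an HttpURLConnection the header would be set via setRequestProperty("Range", ZipTail.rangeHeader(start, end)). A second ranged request would then fetch the central directory, and a third the local header and compressed data of the wanted entry.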
Since TrueZIP 7.2, there is a new client API in the module TrueZIP Path. This is an implementation of an NIO.2 FileSystemProvider for JSE 7. Using this API, you can access HTTP URI as follows:
Path path = new TPath(new URI("http://acme.com/download/everything.tar.gz/README.TXT"));
try (InputStream in = Files.newInputStream(path)) {
// Read archive entry contents here.
...
}
I'm not sure if there's a way to pull out a single file from a ZIP without downloading the whole thing first. But, if you're the one hosting the ZIP file, you could create a Java servlet which reads the ZIP file and returns the requested file in the response:
public class GetFileFromZIPServlet extends HttpServlet{
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException{
String pathToFile = request.getParameter("pathToFile");
byte fileBytes[];
//get the bytes of the file from the ZIP
//set the appropriate content type, maybe based on the file extension
response.setContentType("...");
//write file to the response
response.getOutputStream().write(fileBytes);
}
}

Java output a file to the screen

I know this is a little broad, but here's the situation:
I am using JSP and Java. I have a file located on my server. I would like to add a link to the screen that, when clicked, would open the file for the user to view. The file can either appear in a window in the web browser, or pop up the program needed to open the file (similar to when you are outputting with iText to the screen, where Adobe opens to display the file). I know my output stream already, but how can I write the file to the output stream? Most of what I have read has only dealt with text files, but I might be dealing with image files, etc., as well.
Any help is appreciated!
Thanks!
You need to add certain fields to the response. For a text/csv, you'd do:
response.setContentType("text/csv"); // set MIME type
response.setHeader("Content-Disposition", "attachment; filename=\"" + strExportFileName + "\"");
Here's a forum on sun about it.
Here's a simple implementation on how to achieve it:
protected void doPost(final HttpServletRequest request,
final HttpServletResponse response) throws ServletException,
IOException {
// extract filename from request
// TODO use a whitelist to avoid path traversal
File file = new File(getFileName(request));
InputStream input = new FileInputStream(file);
response.setContentLength((int) file.length());
// TODO map your file to the appropriate MIME
response.setContentType(getMimeType(file));
OutputStream output = response.getOutputStream();
byte[] bytes = new byte[BUFFER_LENGTH];
int read = 0;
while (read != -1) {
read = input.read(bytes, 0, BUFFER_LENGTH);
if (read != -1) {
output.write(bytes, 0, read);
output.flush();
}
}
input.close();
output.close();
}
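Regarding the path traversal TODO in the code above, one possible guard is sketched below. Note that it is a containment check rather than a true whitelist: it only makes sure that the requested name, once resolved, still lies inside a fixed download directory. `SafePaths` and `resolveInside` are hypothetical names, and symbolic links are not taken into account.

```java
import java.nio.file.Path;

final class SafePaths {

    /** Resolves a requested name inside baseDir, rejecting anything that escapes it. */
    static Path resolveInside(Path baseDir, String requested) {
        Path base = baseDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the containment check.
        Path target = base.resolve(requested).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes download directory: " + requested);
        }
        return target;
    }
}
```

The servlet would then build its File from the returned path instead of from the raw request parameter.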
You need to create a 'download' servlet which writes the file to the response output stream with correct mime types. You can not reliably do this from within a .jsp file.
We usually do it with a 'download servlet' which we set the servletmapping to /downloads, then append path info to identify the asset to serve. The servlet verifies the request is valid, sets the mime header then delivers the file to the output stream. It's straightforward, but keep the J2EE javadocs handy while doing it.
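For the "sets the mime header" step, a minimal sketch using only the JDK could look like this. The names `MimeTypes` and `forFileName` are made up, and the fallback to application/octet-stream for unknown extensions is an assumption about what is wanted. Inside a servlet container, ServletContext.getMimeType(fileName) is an alternative that uses the container's own mappings.

```java
import java.net.URLConnection;

final class MimeTypes {

    /** Guesses a MIME type from the file name, defaulting to a generic binary type. */
    static String forFileName(String fileName) {
        String guessed = URLConnection.guessContentTypeFromName(fileName);
        return guessed != null ? guessed : "application/octet-stream";
    }
}
```

The returned value would go into response.setContentType(...) before anything is written to the output stream.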
