Blank PDF while downloading - java

I am facing a very strange issue. I am trying to send a PDF file as an attachment from my Struts application using the code below:
JasperReport jrReport = (JasperReport) JRLoader.loadObject(jasperReport);
JasperPrint jasperPrint = JasperFillManager.fillReport(jrReport, parameters, dataSource);
jasperPrint.setName(fileNameTobeGivenToExportedReport);
response.reset();
response.setContentType("application/pdf");
response.setHeader("Content-Disposition", "attachment; filename=\"" + fileNameTobeGivenToExportedReport + ".pdf" + "\"");
response.setHeader("Cache-Control", "private");
JasperExportManager.exportReportToPdfStream(jasperPrint, response.getOutputStream());
But the downloaded PDF contains no data; it shows a blank page.
When I added the lines below to the code above, to save the PDF file to my D: drive:
File pdf = new File("D:\\sample22.pdf");
JasperExportManager.exportReportToPdfStream(jasperPrint, new FileOutputStream(pdf));
The generated file is correct, with all the data. One thing I noticed is that the file downloaded from the browser and "sample22.pdf" have the same size.
I read an article, Creating PDF from Servlet, which suggests that this might be a server configuration issue: our server might be corrupting the output stream.
The article says:
This can happen when your server flattens all bytes with a value higher than 127. Consult your web (or application) server manual to find out how to make sure binary data is sent correctly to the browser.
I am using Struts 1.x, JBoss 6, iReport 1.2.

Suppose that you have a simple "Hello World" PDF document:
When you open this document, you see that the file structure uses ASCII characters, but that the actual content of the page is compressed to a binary stream:
You don't see the words "Hello World" anywhere; they are compressed, along with the PDF syntax that contains the info needed to draw these words on the page, into this stream:
xœ+är
á26S°00SIá2PÐ5´ 1ôÝBÒ¸4<RsròÂó‹rR5C²€j#*\C¸¹ Çq°
Now suppose that a process flattens all the non-ASCII characters into ASCII. I've done this manually, as you can see in the next screenshot:
I can still open the document, because I didn't change anything in the file structure: there is still a /Pages tree with a single /Page dictionary. From a syntactical point of view, the file looks OK, so I can open it in Adobe Reader:
As you can see, the words "Hello World" are gone. The stream containing the syntax to render these words was corrupted (in my case manually, in your case by the server, or by Struts, or by whatever process you are using that thinks you are creating plain text instead of a binary file).
What you need to do is find the place where this happens. Maybe Struts is the culprit. Maybe you are (unintentionally) using Struts as if you were creating a plain text file. It is hard to tell remotely. This is a typical problem caused by a configuration issue. Only somebody with access to your configuration can solve this.
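The observation that the blank download has the same size as the good file fits this explanation. Below is a minimal Java sketch, assuming the corruption is a bytes-to-text-to-bytes conversion with an ASCII charset somewhere in the response path (that is an assumption for illustration, and the class and method names are hypothetical). Each byte above 127 is replaced by a single '?', so the length stays the same while the compressed stream is destroyed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BinaryCorruptionDemo {

    /** Simulates a process that treats binary data as ASCII text and writes it back out. */
    static byte[] roundTripThroughAscii(byte[] binary) {
        // Bytes above 127 cannot be decoded as US-ASCII, so they become U+FFFD,
        // which is then encoded back as a single '?' byte.
        String asText = new String(binary, StandardCharsets.US_ASCII);
        return asText.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // A few bytes resembling the start of a compressed PDF stream (made-up sample data).
        byte[] original = {0x78, (byte) 0x9c, 0x2b, (byte) 0xe4, 0x72, (byte) 0xe1};
        byte[] damaged = roundTripThroughAscii(original);

        System.out.println("same length:  " + (original.length == damaged.length));
        System.out.println("same content: " + Arrays.equals(original, damaged));
    }
}
```

Comparing the downloaded file with the one saved on disk byte by byte should show whether the differences are limited to bytes above 127, which would point to this kind of conversion.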

Related

File download error only in file name with Comma

My file download API fails with an error like this:
org.apache.catalina.connector.ClientAbortException: java.io.IOException: Broken pipe
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:380)
at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:420)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:345)
at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:405)
at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:393)
at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:96)
at org.springframework.util.StreamUtils.copy(StreamUtils.java:128)
at org.springframework.util.FileCopyUtils.copy(FileCopyUtils.java:109)
at
I noticed that the error only occurs when downloading a file whose name contains a comma (,); otherwise it works perfectly.
In my API I set the response like this:
response.setContentType("application/octet-stream");
response.setHeader(Constants.CONTENT_DISPOSITION, "attachment; filename= " + fileSeedName);
System.out.println(file.exists());
FileCopyUtils.copy(new BufferedInputStream(new FileInputStream(file)), response.getOutputStream());
response.flushBuffer();
Can anyone please help me?
Wrap the file name in double quotes.
The filename needs double quotes to work:
header('Content-Disposition:attachment;filename="' . $fileName . '.pdf"');
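For the Java code in the question, the same fix could look like the sketch below. The helper name is hypothetical, and escaping backslashes and double quotes inside the quoted string is an added precaution that the answers do not mention.

```java
public class ContentDispositionDemo {

    /** Builds an attachment header value with the file name as a quoted string. */
    static String attachment(String fileName) {
        // Escape characters that would otherwise end or distort the quoted string.
        String escaped = fileName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "attachment; filename=\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // A name containing a comma, as in the failing case described in the question.
        System.out.println(attachment("report, final.pdf"));
    }
}
```

In the question's handler this would be used as response.setHeader(Constants.CONTENT_DISPOSITION, attachment(fileSeedName)).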
This is a known issue in Google Chrome related to the Content-Disposition header. According to numerous references (just Google "Chrome content-disposition comma"), it is caused by the fact that Chrome doesn't properly handle escaping of commas, while Firefox, IE, etc. do. According to a few sites, this was introduced relatively recently and Google doesn't plan on fixing it.

How to get only the name of my PDF file

I'm developing a college project which consists of reading a CSV file and converting it to a PDF file. That part is fine; I have already done it.
In the end I need to show the name of the PDF file without the full path of where it was created. In other words, I just want to show the name.
I searched a lot to see if there is a simple method that shows only the name, like the one Java has for a File:
file.getName();
Whenever you use iText to create a PDF file, your code sets the target which usually is an OutputStream. If you use a FileOutputStream there, you know the file it writes to.
Thus, all you have to do to show the name of the PDF file is to inspect your own code and check which target it sets.
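As a minimal sketch of that idea, assuming the target is a FileOutputStream built from a File: keep the File object around and ask it for its name. The class name, method name and file name below are hypothetical, and the actual PDF writing is left out.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class PdfTargetName {

    /** Returns only the last path element, e.g. "converted.pdf". */
    static String displayName(File target) {
        return target.getName();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical output location; use whatever path your converter writes to.
        File target = new File(System.getProperty("java.io.tmpdir"), "converted.pdf");
        try (OutputStream out = new FileOutputStream(target)) {
            // Hand 'out' to the PDF library here; omitted in this sketch.
        }
        System.out.println(displayName(target));
    }
}
```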
Use getBaseName in Apache Commons IO.
getBaseName
public static String getBaseName(String filename)
Gets the base name, minus the full path and extension, from a full
filename.
This method will handle a file in either Unix or Windows format. The
text after the last forward or backslash and before the last dot is
returned.
a/b/c.txt --> c
a.txt --> a
a/b/c --> c
a/b/c/ --> ""
The output will be the same irrespective of the machine that the code
is running on.
Parameters:
filename - the filename to query, null returns null
Returns:
the name of the file without the path, or an empty string if none exists. Null bytes inside string will be removed
Source: https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/FilenameUtils.html#getBaseName(java.lang.String)
If you also need the extension, use getExtension. It would probably always be .pdf, but you know, it's perfectly valid to have a PDF file without the .pdf filename extension. No sane person would do that, but it is better to be prepared for insane users.

Importing file to Alfresco programmatically (through Java-backed webscript)

I am having a problem importing a document (PDF) into an Alfresco repository inside a Java-backed webscript. I am using the writer from ContentService.
If I use
ContentWriter writer = ContentService.getWriter(nodeRef, ContentModel.PROP_CONTENT, true);
writer.setEncoding("UTF-8");
writer.setMimetype("application/pdf");
writer.putContent(new String(byte []) );
or
writer.putContent(new String(byte [], "UTF-8") );
my document is not previewable (I see a blank PDF file; I tried with a few small PDF files and don't know what would happen with other or larger files).
But if I use the other putContent method, which takes a File as its argument, the document is imported successfully.
writer.setEncoding("UTF-8");
writer.setMimetype("application/pdf");
writer.putContent(File);
I don't want to import the file from disk, since I receive it as a Base64-encoded String, but I don't know what I am missing.
You could pass an InputStream to ContentWriter::putContent. That avoids the byte array to String conversion (and back), which is what causes the encoding difficulties.
writer.putContent(new ByteArrayInputStream(Base64.decodeBase64("yourBase64EncodedString")));
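The decoding step can be sketched without any Alfresco classes. The version below uses the JDK's java.util.Base64 (Java 8+) instead of the decodeBase64 helper shown in the answer, which appears to come from Apache Commons Codec. The class and method names are hypothetical, and the sample bytes are made up.

```java
import java.io.ByteArrayInputStream;
import java.util.Base64;

public class Base64ToStream {

    /** Decodes Base64 text straight to bytes and wraps them, never building a String from binary data. */
    static ByteArrayInputStream toStream(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new ByteArrayInputStream(raw);
    }

    public static void main(String[] args) {
        // Stand-in payload with bytes above 127, like a compressed PDF stream would have.
        byte[] sample = {0x25, 0x50, 0x44, 0x46, (byte) 0xe2, (byte) 0xff};
        String encoded = Base64.getEncoder().encodeToString(sample);

        ByteArrayInputStream in = toStream(encoded);
        System.out.println("bytes available: " + in.available());
    }
}
```

The resulting stream is what would be handed to writer.putContent(...) in the webscript.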

How to detect the encoding of a PPTX file?

My question is, how can I get the encoding of a pptx file in Java?
(I'm using apache poi)
File f = new File(filename);
XMLSlideShow ppt = new XMLSlideShow(new FileInputStream(f));
The reason I need to know the encoding is that later on I post some data from the file, which I have saved in a JSON string, and it is at this stage that my problem occurs.
When doing an HTTP POST the encoding is changed, and I figured this problem could be solved if I knew the encoding of the data in my JSON string. Then I could set this encoding in my HTTP POST.
EDIT/CLARIFICATION:
The problem is the swedish letters å,ä and ö.
å becomes Ã¥
ä becomes Ã¤
ö becomes Ã¶
Java and POI aside, to get to the encoding of a PowerPoint PPTX file, you have to examine the underlying XML for the slides:
Unzip the pptx file (for manually looking, any zip utility like 7-zip will do).
Under the zip root, find the ppt/slides directory.
Typically each slide is slide#.xml; open the one you want to examine.
Read the first line: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
In most cases, I would expect the encoding to be the same across all slides (meaning that you could probably use the root-level "[Content_Types].xml" file as a proxy for encoding of the entire archive).
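The manual steps above can be sketched in Java with the JDK's zip support. This is only an illustration: the class and method names are hypothetical, the slide part name is passed in by the caller, and the sketch assumes the XML declaration is ASCII-compatible (true for UTF-8, not for UTF-16).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class PptxEncodingProbe {

    private static final Pattern ENCODING =
            Pattern.compile("encoding\\s*=\\s*[\"']([^\"']+)[\"']");

    /** Returns the encoding named in an XML declaration, or null if none is declared. */
    static String declaredEncoding(String xmlDeclaration) {
        Matcher m = ENCODING.matcher(xmlDeclaration);
        return m.find() ? m.group(1) : null;
    }

    /** Reads the first line of one part (e.g. "ppt/slides/slide1.xml") inside the .pptx archive. */
    static String partEncoding(String pptxPath, String partName) throws IOException {
        try (ZipFile zip = new ZipFile(pptxPath)) {
            ZipEntry entry = zip.getEntry(partName);
            if (entry == null) {
                return null; // no such part in this archive
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(zip.getInputStream(entry), StandardCharsets.US_ASCII))) {
                String firstLine = reader.readLine();
                return firstLine == null ? null : declaredEncoding(firstLine);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PptxEncodingProbe deck.pptx ppt/slides/slide1.xml
        System.out.println(partEncoding(args[0], args[1]));
    }
}
```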

Running a JavaScript command from MATLAB to fetch a PDF file

I'm currently writing some MATLAB code to interact with my company's internal reports database. So far I can access the HTML abstract page using code which looks like this:
import com.mathworks.mde.desk.*;
wb=com.mathworks.mde.webbrowser.WebBrowser.createBrowser;
wb.setCurrentLocation(ReportURL(8:end));
pause(1);
s={};
while isempty(s)
s=char(wb.getHtmlText);
pause(.1);
end
desk=MLDesktop.getInstance;
desk.removeClient(wb);
I can extract out various bits of information from the HTML text which ends up in the variable s, however the PDF of the report is accessed via what I believe is a JavaScript command (onClick="gotoFulltext('','[Report Number]')").
Any ideas as to how I execute this JavaScript command and get the contents of the PDF file into a MATLAB variable?
(MATLAB sits on top of Java, so I believe a Java solution would work...)
I think you should take a look at the JavaScript that is being called and see what the final request to the webserver looks like.
You can do this quite easily in Firefox using the FireBug plugin.
https://addons.mozilla.org/en-US/firefox/addon/1843
Once you have found the real server request then you can just request this URL or post to this URL instead of trying to run the JavaScript.
Once you have gotten the correct URL (a la the answer from pjp), your next problem is to "get the contents of the PDF file into a MATLAB variable". Whether or not this is possible may depend on what you mean by "contents"...
If you want to get the raw data in the PDF file, I don't think there is a way currently to do this in MATLAB. The URLREAD function was the first thing I thought of to read content from a URL into a string, but it has this note in the documentation:
s = urlread('url') reads the content
at a URL into the string s. If the
server returns binary data, s will
be unreadable.
Indeed, if you try to read a PDF as in the following example, s contains some text intermingled with mostly garbage:
s = urlread('http://samplepdf.com/sample.pdf');
If you want to get the text from the PDF file, you have some options. First, you can use URLWRITE to save the contents of the URL to a file:
urlwrite('http://samplepdf.com/sample.pdf','temp.pdf');
Then you should be able to use one of two submissions on The MathWorks File Exchange to extract the text from the PDF:
Extract text from a PDF document by Dimitri Shvorob
PDF Reader by Tom Gaudette
If you simply want to view the PDF, you can just open it in Adobe Acrobat with the OPEN function:
open('temp.pdf');
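Since the question mentions that a Java solution would be acceptable, here is a hedged sketch of reading a URL as raw bytes rather than as text, which avoids the unreadable-string problem described for urlread. The class and method names are hypothetical, and whether the report server needs authentication or session cookies is unknown, so this only covers the plain case of an openly accessible URL.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;

public class RawFetch {

    /** Copies a stream into a byte array without any text decoding. */
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Downloads the resource at the given URL as raw bytes. */
    static byte[] fetch(String url) {
        try (InputStream in = new URL(url).openStream()) {
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Usage: java RawFetch <url-of-the-pdf>
        byte[] pdf = fetch(args[0]);
        System.out.println("downloaded " + pdf.length + " bytes");
    }
}
```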
wb=com.mathworks.mde.webbrowser.WebBrowser.createBrowser;
wb.executeScript('javascript:alert(''Some code from a link'')');
desk=com.mathworks.mde.desk.MLDesktop.getInstance;
desk.removeClient(wb);
