Fop - streaming PDF - java

I want to stream a PDF. It used to work before I tried to add special characters, but with the following part added the PDF is no longer streamed; it stops working from the third line:
DefaultConfigurationBuilder cfgBuilder = new DefaultConfigurationBuilder();
File file1 = new File("fonts.xsl");
Configuration cfg = cfgBuilder.buildFromFile(file1);
fopFactory.setUserConfig(cfg);
Here is a link to this fonts.xsl
And the class containing that code: FOPdpf

Related

Add OCR layer to existing PDF without the need to write to file system

I'm trying to take a scanned PDF document and add an OCR layer on top. I can achieve this with the following code:
public void ocrFile(PDDocument pdDocument, File file) throws TesseractException, IOException {
PDFTextStripper pdfStripper = new PDFTextStripper();
String text = pdfStripper.getText(pdDocument);
Tesseract instance = new Tesseract(); // JNA Interface Mapping
File tessDataFolder = LoadLibs.extractTessResources("tessdata");
instance.setDatapath(tessDataFolder.getAbsolutePath());
List<RenderedFormat> list = new ArrayList<RenderedFormat>();
list.add(RenderedFormat.PDF);
String outputFileName = FilenameUtils.removeExtension(file.getAbsolutePath());
instance.createDocuments(file.getAbsolutePath(), outputFileName, list);
}
This will output the PDF with the OCR layer in place to a specific location on disk. I'm trying to change this so the application does not need to write any files to disk. I'm not sure if this can be done?
Ideally I'd like to replace the File input of ocrFile with a MultipartFile and have the result returned from this method, removing the need to involve the file system. Is this achievable?
No, it cannot be done. Tesseract's TessResultRenderer API outputs to physical files, hence the required outputbase input parameter to specify the name of output file.
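If files cannot be avoided entirely, one possible compromise is to confine them to a private temporary directory and expose only bytes to the caller. The following is a minimal sketch under that assumption, not the Tess4J API: FileProcessor and run are hypothetical names, and the lambda in main merely copies the input where the real OCR call (for example instance.createDocuments from the question) would go.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

class TempFileRoundTrip {

    // Hypothetical stand-in for a file-only API: reads the input file
    // and writes its result to the output path.
    interface FileProcessor {
        void process(Path input, Path output) throws Exception;
    }

    // Accepts and returns bytes; disk is touched only inside a temp directory
    // that is removed again before the method returns.
    static byte[] run(byte[] inputBytes, FileProcessor processor) throws Exception {
        Path workDir = Files.createTempDirectory("ocr-");
        try {
            Path input = workDir.resolve("input.pdf");
            Path output = workDir.resolve("output.pdf");
            Files.write(input, inputBytes);
            processor.process(input, output);
            return Files.readAllBytes(output);
        } finally {
            // Walk the directory and delete children before the directory itself.
            try (Stream<Path> paths = Files.walk(workDir)) {
                paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] scanned = "scan".getBytes(StandardCharsets.US_ASCII);
        // Placeholder processor that copies the input; the OCR step would go here.
        byte[] result = run(scanned, (in, out) -> Files.copy(in, out));
        System.out.println(result.length + " bytes returned");
    }
}
```

The application still writes to disk briefly, so this does not contradict the answer above; it only keeps the file handling out of the method's signature.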

How to attach csv file(s) to a newly created PDF in Java using iText 5?

After reading through web pages and posts for days, I am still baffled about how to add a “csv” file attachment to a PDF file that has been created with “iText 5.3.1” in Java.
In my inherited Java executable, multiple PDF files get created and then concatenated together into one PDF file.
Now, a “csv” file(s) needs to be attached to this single PDF document. An example, in chapter 16 of the book “iText in Action” (listing 16.6), uses PdfFileSpecification class and the fileEmbedded method to attach an “xml” file.
PdfFileSpecification fs = PdfFileSpecification.fileEmbedded(writer, null, "Kubrick.xml",
txt.toByteArray());
writer.addFileAttachment(fs);
I understand the parameters to “fileEmbedded” except the “writer” parameter that the author does not define in the code snippet.
The question: using the “PdfFileSpecification” class, how do you declare the “writer” in order to attach a “csv” file to the already created PDF file, or is there a better way?
Here is the section of the code that concatenates several PDF files together and now needs to attach the “csv” file(s), I believe as a “document-level” attachment.
I tried using the “PdfFileSpecification” class and the “fileEmbedded” method, but I do not know how to define the parameters correctly to attach csv file(s) to the newly created PDF, especially the "writer" parameter.
aPDFFiles = (String[])vFileList.toArray(aPDFFiles);
Document document = new Document();
PdfCopy copy = new PdfCopy(document, new FileOutputStream(sFinalFile));
document.open();
PdfReader reader;
int n;
// loop over the documents you want to concatenate
for (int i = 0; i < aPDFFiles.length; i++) {
reader = new PdfReader(aPDFFiles[i]);
// loop over the pages in that document
n = reader.getNumberOfPages();
for (int page = 0; page < n; ) {
copy.addPage(copy.getImportedPage(reader, ++page));
}
copy.freeReader(reader);
}
document.close();
// Add a "csv" file(s) as an attachment?
PdfFileSpecification fs = PdfFileSpecification.fileEmbedded(writer,
sMainDir + "TestAttachment.csv", "TestAttachment.csv", null);
It turned out to be quite straightforward in this instance. I have the "PdfCopy" object of the final "pdf" file (after concatenating several PDF files into one), and the "PdfCopy" class has an "addFileAttachment" method that does just that. It has to be called while the document is still open, that is, before document.close().
copy.addFileAttachment("TestAttachment.csv", null, sMainDir +
"TestAttachment.csv", "TestAttachment.csv");

Java - Convert a docx to a pdf document

I am trying to convert a docx document containing a logo to a pdf document.
I have tried this :
FileInputStream in=new FileInputStream(fileInput);
XWPFDocument document=new XWPFDocument(in);
File outFile=new File(fileOutput);
OutputStream out=new FileOutputStream(outFile);
PdfOptions options=null;
PdfConverter.getInstance().convert(document,out,options);
But in the pdf document the logo is not at the right place.
Is there a way to create a PDF that is exactly the same as the docx document?
Could documents4j be an option? It delegates the conversion to the native application.
This is achieved by delegating the conversion to any native application which understands the conversion of the given file into the desired target format.
File wordFile = new File( ... );
File target = new File( ... );
IConverter converter = ... ;
Future<Boolean> conversion = converter
.convert(wordFile).as(DocumentType.MS_WORD)
.to(target).as(DocumentType.PDF)
.prioritizeWith(1000) // optional
.schedule();
You can quickly test whether the conversion fits your requirements with the "Local demo" on a Windows machine with Word and Excel installed:
git clone https://github.com/documents4j/documents4j.git
cd documents4j
cd documents4j-local-demo
mvn jetty:run
Then go to http://localhost:8080
See the full documentation here :
http://documents4j.com/#/

Apache Commons Configuration2 how to read data from InputStream

How can I read the data from InputStream by using Apache Commons Configuration2?
FileBasedConfigurationBuilder<XMLConfiguration> builder =
new FileBasedConfigurationBuilder<XMLConfiguration>(XMLConfiguration.class)
.configure(
new Parameters()
.xml()
.setFileName("")
.setExpressionEngine(new XPathExpressionEngine())
);
XMLConfiguration config = builder.getConfiguration();
config.read(sourceJarFile.getInputStream(sourcePropertiesEntry));
Given the above code, I get the exception below if setFileName is given an empty string.
org.apache.commons.configuration2.ex.ConfigurationException: Could not locate: org.apache.commons.configuration2.io.FileLocator#61dc03ce[fileName=tmp.xml,basePath=<null>,sourceURL=,encoding=<null>,fileSystem=<null>,locationStrategy=<null>]
at org.apache.commons.configuration2.io.FileLocatorUtils.locateOrThrow(FileLocatorUtils.java:346)
at org.apache.commons.configuration2.io.FileHandler.load(FileHandler.java:972)
at org.apache.commons.configuration2.io.FileHandler.load(FileHandler.java:702)
at org.apache.commons.configuration2.builder.FileBasedConfigurationBuilder.initFileHandler(FileBasedConfigurationBuilder.java:312)
at org.apache.commons.configuration2.builder.FileBasedConfigurationBuilder.initResultInstance(FileBasedConfigurationBuilder.java:291)
at org.apache.commons.configuration2.builder.FileBasedConfigurationBuilder.initResultInstance(FileBasedConfigurationBuilder.java:60)
at org.apache.commons.configuration2.builder.BasicConfigurationBuilder.createResult(BasicConfigurationBuilder.java:421)
at org.apache.commons.configuration2.builder.BasicConfigurationBuilder.getConfiguration(BasicConfigurationBuilder.java:285)
at com.test.installer.App.getXMLConfigurationProperties(App.java:185)
If I pass null, or do not call setFileName() at all, I get an "Unable to load the configuration" exception at the read() line.
org.apache.commons.configuration2.ex.ConfigurationException: Unable to load the configuration
at org.apache.commons.configuration2.XMLConfiguration.load(XMLConfiguration.java:986)
at org.apache.commons.configuration2.XMLConfiguration.read(XMLConfiguration.java:954)
at com.test.installer.App.updateExistedProperties(App.java:84)
From the example in the API documentation:
Set up your file parameters (encoding and such):
FileBasedBuilderParameters fileparams = ...
FileBasedConfigurationBuilder<PropertiesConfiguration> builder =
new FileBasedConfigurationBuilder<>(PropertiesConfiguration.class).configure(fileparams);
and then:
FileBasedConfiguration config = builder.getConfiguration();
FileHandler fileHandler = new FileHandler(config);
InputStream istream = ...
fileHandler.load(istream);
Note that you cannot use autosave with this. To save, you would probably need to provide an OutputStream, something like:
fileHandler.save(ostream);
The proper way to load XML configuration data from an InputStream (in commons-configuration 2.x) is as follows:
XMLConfiguration cfg = new BasicConfigurationBuilder<>(XMLConfiguration.class).configure(new Parameters().xml()).getConfiguration();
FileHandler fh = new FileHandler(cfg);
fh.load(inputStream);
After calling load() cfg will contain loaded configuration.
Also note that the XMLConfiguration.read() method should not be used, as it is designed for internal use and will probably be renamed to _read() in the future (see: https://issues.apache.org/jira/browse/CONFIGURATION-641).
You can use XMLConfiguration.read(InputStream in), but as far as I know you need to have an XML file somewhere. The reason is that when you either get the configuration from the builder or call the read method above, there are a few checks in the private load method (line 963 of XMLConfiguration.java in the source files).
Parameters params = new Parameters();
FileBasedConfigurationBuilder<XMLConfiguration> fileBuilder =
new FileBasedConfigurationBuilder<>(XMLConfiguration.class)
.configure(params.fileBased().setFileName("/tmp/dummy.xml"));
XMLConfiguration xmlConfiguration = fileBuilder.getConfiguration();
xmlConfiguration.read(inputStream);
The dummy file can be anything as long as it is well-formed; it doesn't need to be valid. In my case, /tmp/dummy.xml just contains <_/>.
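To illustrate the difference between well-formed and valid, here is a small sketch that uses only the JDK's built-in XML parser, not Commons Configuration. WellFormedCheck and isWellFormed are names made up for this example. It parses without any DTD or schema validation, so it only tells you whether the text is well-formed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    // Returns true if the text parses as XML. No DTD or schema validation is done.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // parse error: not well-formed
        } catch (Exception e) {
            throw new IllegalStateException(e); // parser setup or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<_/>"));       // single empty element
        System.out.println(isWellFormed("<a><b></a>")); // <b> is never closed
    }
}
```

The first call reports true because an underscore is a legal XML element name; the second reports false because of the unclosed element.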

Uploading a PDF file to Google Docs generated by pdfjet on GAE/J

I need to upload a PDF file, generated by pdfjet on Google App Engine, to a user's Google Docs. I figured out how to generate the PDF using pdfjet for GAE/J. pdfjet uses streams to create the PDF. Is there any way to convert the stream to a file so I can upload it to the user's Google Docs? I tried gaevfs but couldn't make it work. I can use another PDF generation solution if needed, or another virtual file system, etc.
PDF generation code
ByteArrayOutputStream os = new ByteArrayOutputStream();
PDF pdf = new PDF(os);
Google Docs API code
DocumentListEntry newEntry = new PdfEntry();
newEntry.setTitle(new PlainTextConstruct("Some Report"));
The line I couldn't get to work: setFile(File, String)
newEntry.setFile(pdf, "application/pdf");
Thanks.
Couldn't you simply write from ByteArrayOutputStream to FileOutputStream using the ByteArrayOutputStream.writeTo(OutputStream) method? Like this:
ByteArrayOutputStream os = new ByteArrayOutputStream();
PDF pdf = new PDF(os);
FileOutputStream fos = new FileOutputStream("myPDF.pdf");
os.writeTo(fos);
fos.close();
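As a self-contained illustration of the writeTo approach, here is a minimal sketch using only the JDK. It assumes the runtime allows writing to the local file system, which may not hold in every hosted environment. BufferToFile and dump are hypothetical names, and the placeholder bytes stand in for whatever the PDF library writes into the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferToFile {

    // Copies everything buffered in memory to the target file and returns the file.
    static Path dump(ByteArrayOutputStream buffer, Path target) throws IOException {
        // try-with-resources closes the file stream even if writeTo fails.
        try (OutputStream fileOut = Files.newOutputStream(target)) {
            buffer.writeTo(fileOut);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // Stand-in for the bytes a PDF library would write into the stream.
        os.write("%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII));

        Path target = Files.createTempFile("myPDF", ".pdf");
        dump(os, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
        Files.delete(target);
    }
}
```

If only the bytes are needed and no File object is required, os.toByteArray() returns the same content without touching disk.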
