Render docx file in a browser - java

I'm using docx4j (http://www.docx4java.org/trac/docx4j) to convert a Microsoft Word document to PDF and then display it in a browser, and it works well for a preview. The problem is that the conversion loses most of the Word formatting. Page breaks and fonts don't carry over to the PDF properly, and even though I'm using standard fonts, docx4j doesn't ship with them. On a Linux-hosted Tomcat the fonts are not found, and exceptions are thrown as it falls back to sans serif or other generic types.
I have found this Microsoft tool to make documents render online, but I'm behind a firewall so I cannot include this tool as an option: https://products.office.com/en-us/office-online/view-office-documents-online
I'm open to suggestions for displaying a docx file as a preview, with a print option, from within a browser. PDF conversion appears to be the most promising, but I run into formatting issues.
Any ideas are welcome!

Have a play with http://converter-eval.plutext.com/viewer.html
Consider it an alpha level preview. We haven't quite released it yet, but you will be able to host it behind a firewall.
It isn't open source, I'm afraid, and we're still working out pricing (and whether/how there could be a free edition).

If you only need to render a docx document in a browser, you can use Google Documents Viewer for this:
<iframe src="http://docs.google.com/gview?url=pathOfDocx&embedded=true"></iframe>
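A minimal sketch of building that iframe from Java follows. It assumes the document URL has to be percent-encoded because it travels as a query parameter, and that the viewer can fetch the document itself, so a file that is only reachable behind a firewall would likely not render. The class name `GviewEmbed`, the method `iframeFor`, and the example.com URL are hypothetical; `URLEncoder.encode(String, Charset)` needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class GviewEmbed {
    // Viewer endpoint as given in the answer above.
    private static final String VIEWER = "http://docs.google.com/gview";

    /** Builds the iframe markup for a document URL the viewer can reach. */
    static String iframeFor(String documentUrl) {
        // Percent-encode the document URL so its own ':' '/' and spaces
        // do not break the viewer's query string.
        String encoded = URLEncoder.encode(documentUrl, StandardCharsets.UTF_8);
        return "<iframe src=\"" + VIEWER + "?url=" + encoded
                + "&embedded=true\"></iframe>";
    }

    public static void main(String[] args) {
        // Placeholder URL for illustration only.
        System.out.println(iframeFor("http://example.com/files/report 2015.docx"));
    }
}
```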

Related

PDFBox - show icon for embedded files in pdf

I developed a Java PDF viewer using Apache PDFBox. The problem is that when rendering a page of a PDF that has file attachments, the PDFBox rendering shows no icon, whereas Adobe PDF Reader shows a paper clip icon when such a file is opened.
Is it possible to have such icons appear automatically in the PDFBox rendering? I think I saw code for this some time ago, something like a single line that switches the behavior on and off, but I can't find it. Thanks.
This was fixed in PDFBOX-5394 and will be in version 2.0.26. However, only a single symbol will be shown at this time: a paperclip in a fixed size.

Convert html+css+js to PDF

I want to create something like this (code is here):
in PDF format. I'm using Google Charts and, according to this forum, converting a chart to PDF is impossible. I've already tried iText+XMLWorker, but it has problems with CSS and, I think, doesn't support JS at all.
So, the questions are: How can I convert HTML+CSS+JS to a .pdf file? Or are there other ways to approach the problem?
As promised in the comment, I've asked Raf. This was his answer:
One way to use XML Worker for HTML+CSS+JS is to use a browser engine to preprocess the HTML. Examples of such a browser engine are WebKit (Chrome, Safari) and Gecko (Firefox). These can interpret the CSS and JS and give you HTML that is ready to be parsed by XML Worker.
Examples of competing products are:
wkhtmltopdf, a command line tool that uses WebKit as its rendering engine.
Prince XML supports HTML+CSS+JS to PDF using their own engine.
Maybe there are others, but this is what Raf told me. I hope this helps.
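As a rough sketch of the wkhtmltopdf route, the snippet below assembles the command line and optionally runs it with `ProcessBuilder`. It assumes `wkhtmltopdf` is installed and on the `PATH`; the `--javascript-delay` option gives scripts such as chart libraries time to finish before the page is captured. The class name, file names, and the 1000 ms delay are illustrative assumptions, not values from the answer.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper around the wkhtmltopdf command line tool.
final class WkhtmltopdfCommand {

    /** Builds: wkhtmltopdf --javascript-delay <ms> <input> <output>. */
    static List<String> build(String inputHtml, String outputPdf, int jsDelayMillis) {
        return List.of(
                "wkhtmltopdf",
                "--javascript-delay", String.valueOf(jsDelayMillis),
                inputHtml,
                outputPdf);
    }

    /** Runs the command and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // File names and delay are placeholders.
        List<String> command = build("chart.html", "chart.pdf", 1000);
        System.out.println(String.join(" ", command));
        // Only invoke the external tool when explicitly requested.
        if (args.length > 0 && args[0].equals("--run")) {
            System.exit(run(command));
        }
    }
}
```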

File previewers for PDF and Docx

I would like to have a preview of a .pdf, .docx or .doc file inside a JDialog, but I'm unable to find previewers that can be embedded inside a Swing application. Alternatively, are there any previewers that can transform such files into .html and then display them in a TextPane?
Fidelity isn't that much of an issue as is embedding and ease of use. Also I don't require one tool to be able to preview all types of files.
That's a tough one because of the formats you're dealing with. You might want to try ImageMagick for PDF -> image format for display in your TextPane. If that works well enough for PDFs, then you could use JODConverter or Docmosis to get from Doc -> PDF, then ImageMagick again for a display image. JODConverter and Docmosis are based on OpenOffice, which can also produce pretty rough HTML/XHTML output as another option for display. The latest version of OpenOffice can read docx too, meaning all your bases are covered, and if fidelity is not too big a deal, as you've indicated, then JODConverter/Docmosis and ImageMagick might be a combo you can use.
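The Doc -> PDF -> image chain can be sketched with plain command-line calls. This is a swapped-in approach rather than the JODConverter or Docmosis API: it assumes LibreOffice's `soffice` binary (an OpenOffice descendant) and ImageMagick's `convert` are on the `PATH`, and that ImageMagick has Ghostscript available for reading PDFs. The class name, file names, and the 150 dpi density are illustrative.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper that assembles the two external commands.
final class PreviewPipeline {

    /** Headless office conversion of a document to PDF in outDir. */
    static List<String> toPdf(String inputDoc, String outDir) {
        return List.of("soffice", "--headless",
                "--convert-to", "pdf", "--outdir", outDir, inputDoc);
    }

    /** ImageMagick rasterisation of one PDF page (zero-based index) to PNG. */
    static List<String> pageToPng(String pdf, int pageIndex, String png) {
        return List.of("convert", "-density", "150",
                pdf + "[" + pageIndex + "]", png);
    }

    /** Runs a command and fails loudly on a non-zero exit code. */
    static void run(List<String> command) throws IOException, InterruptedException {
        int exit = new ProcessBuilder(command).inheritIO().start().waitFor();
        if (exit != 0) {
            throw new IOException("Command failed (" + exit + "): " + command);
        }
    }

    public static void main(String[] args) {
        // Prints the commands only; call run(...) to execute them.
        System.out.println(String.join(" ", toPdf("letter.docx", "out")));
        System.out.println(String.join(" ", pageToPng("out/letter.pdf", 0, "out/letter.png")));
    }
}
```

The resulting PNG could then be shown inside the dialog with something like `new JLabel(new ImageIcon("out/letter.png"))`.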

Java generate PDF from RTF

I want to generate a PDF file from an RTF file.
I have tried the following.
iText
It's already outdated and the new version doesn't support RTF.
JDocConverter
It uses OpenOffice in the background. It works fine, with one problem: OpenOffice doesn't support drawing objects in RTF.
Are there any other possible and reliable solutions?
Note: I would prefer not to use any commercial software.
Windows can natively convert RTF to PDF from the command line, but with limitations: text and images are converted directly, while support for drawn objects depends on the RTF syntax used. WordArt drawing objects need MS Word to print.
The output looks reasonable but here is the source in MSWord where the art was clearly not handled by the non-word printout.
Under Windows you could print to CutePDF Writer. This freeware uses Ghostscript as a back end.
You may try Aspose.Words for Java to convert RTF file to PDF format. You can load a file in RTF format into Aspose.Words for Java and then save it to PDF format. Please note that while loading specify RTF as LoadFormat value and pass PDF as SaveFormat value while saving the document. This doesn't require OpenOffice or any other software to be installed for the conversion to work.
Disclosure: I work as developer evangelist at Aspose.
The best way to do it is to use MS Office, which can save files in PDF format (you may need to install an add-on, I think).

View PDF files in IFrame with Named Destinations

We've got an application that displays PDF files in an IFrame at specific Named Destinations. This works well on Windows systems but not Mac. In Safari, with Acrobat, the Named Destination is ignored and the document is displayed at the start.
Does anyone have any suggestions on how we might accomplish the task of displaying this information? Our initial thoughts are to:
Convert the PDF to HTML on the fly and display the HTML version in the IFrame
Convert the PDF on the page referenced to another format such as PNG etc. and display that in the IFrame
Utilize some kind of Java app that allowed us to render the PDF while honouring the Named Destination (not sure if this exists)
Any other ideas on a potential method of better displaying PDF files at Named Destination points that is a little more cross platform?
EDIT: I guess another option is to store the data in XSL/XSLT type format and convert to HTML for viewing or PDF for saving to the desktop.
Not much help, but I found that alternative ways to display PDF files (other than the Acrobat Reader client) are few and far between. As you say, the commonly accepted way to render PDFs in something that doesn't natively support them seems to be converting them to "something else" that is supported (even Acrobat.com does it this way in their Flex client, if I remember correctly).
Even converting the PDF document to other formats may be disappointing - especially if you expect a certain level of quality. It may also introduce server-side performance issues.
I realise this doesn't help anyone much but I'm interested to see if any other suggestions come up. We've dealt with this problem before in the same way, using IFrame controls (but without named destinations) but I'm very much interested in other suggestions/ideas as well.
