Edit SVG xml text content after converted from PDF - java

I am using Inkscape to convert my PDF to an SVG file, and I would like to change the text content by editing the SVG's XML. However, the changed text appears in a very different font, and its alignment is far off from the original position.
How can I edit the text content of the SVG? Is there another tool that can convert the PDF to SVG and let me edit the text content?

There are differences between the two formats that can cause issues when converting from PDF to SVG; take a read over this guide. It suggests trying pdf2svg if you don't mind getting your hands dirty.
Excerpt below:
Conversion with Inkscape
Download Inkscape from www.inkscape.org (version 0.46 and above)
Download the PDF you want to convert
Run Inkscape
Open the PDF file you want to convert in Inkscape (not Acrobat)
Uncheck Embed images on the box that comes up and click OK
Wait a little while as Inkscape converts it
Click File>Save As..
Under Save as type:, choose "Plain SVG (*.svg)"
Click Save in the bottom right corner
Done! You now have an SVG file with the same name as the PDF, but with the .svg extension
Before uploading, you may want to check its W3C validity with the SVG-check tool
For checking that it displays properly, upload it first to Test.svg
Upload the SVG to Wikimedia Commons and tag it with {{Extracted with Inkscape|v}}
Conversion with PDF2SVG
Some versions of Inkscape do not have PDF support compiled in; also, text importing does not always produce satisfactory results in Inkscape. In that case, you might try performing the conversion with the PDF2SVG command line tool. (It requires that Poppler, Cairo, and X are installed on your system.)
Get PDF2SVG from http://www.cityinthesky.co.uk/opensource/pdf2svg/ and compile it. If you are using Linux or FreeBSD or MacPorts, PDF2SVG might also be installable via the package installer.
Convert the PDF with pdf2svg file.pdf file.svg
If necessary use Inkscape to edit the resulting SVG.
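Once the SVG exists, the text itself can be changed with the standard Java XML APIs. The following is only a minimal sketch: the class name `SvgTextEditor` and the method `replaceTspanText` are made up for illustration, and it assumes the converter emitted the text as `<tspan>` elements in the SVG namespace. It replaces the text node and leaves the element's attributes (`x`, `y`, `style`) untouched, so the original font and start position are kept.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Inkscape or pdf2svg.
class SvgTextEditor {
    static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Replaces the content of every <tspan> whose text equals oldText and returns the edited SVG. */
    static String replaceTspanText(String svg, String oldText, String newText) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // SVG elements live in a namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(svg)));

            NodeList spans = doc.getElementsByTagNameNS(SVG_NS, "tspan");
            for (int i = 0; i < spans.getLength(); i++) {
                Node span = spans.item(i);
                if (oldText.equals(span.getTextContent())) {
                    // Only the text changes; x, y and style attributes stay as they were.
                    span.setTextContent(newText);
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not edit SVG", e);
        }
    }
}
```

Note that converted PDFs may position glyphs individually or reference fonts that are not installed, so a replacement string of a different length can still look misaligned; that is a limitation of the conversion, not of the DOM edit.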

Related

How to differentiate text and images from a PDF file using java?

I have to make an Android app in Java that reads a PDF file and displays it on screen without using other programs (such as a PDF reader). How can I distinguish between text and images in that file? In other words, if there is text with an image in between, how do I determine where the text is and where the image is?
PDF files don't work like that.
It is a complex format, and there is a lot more data in the files than just text and images, such as metadata and formatting.
If you want to handle PDF files in your app, you should use a PDF library, such as the ones listed here:
https://camposha.info/android-examples/android-pdf-libraries/#gsc.tab=0
How exactly to load text will depend on the specific library you choose, and you should check the relevant documentation.

Preview LaTeX output with Java

Right now I'm working on displaying LaTeX generated document with Java.
Strictly speaking, LaTeX source can be used to directly generate two formats:
DVI using latex, the first one to be supported;
PDF using pdflatex, more recent.
However, as far as I know, Java has no built-in way to render DVI or PDF.
Is there any way to handle those formats? Or are there other formats that would make sense?
There are not enough details about how you wish to "render" DVI or PDF from a LaTeX document. However, you could always generate the PDF with pdflatex and the DVI with latex, then use ICEpdf to view PDFs and javaDVI to view DVIs.
Another neat hack to display pdf in a panel is to pass the file path to an embedded web component in the application, and the web component will use whatever pdf rendering tool is available on your machine (Acrobat, Foxit, Preview, etc.)
I remember there was a post about this a long time ago.
I don't think there's a generic way to preview the rendered output without generating the file itself. You can write your own LaTeX engine which caches the output every few seconds and displays that but regardless of the storage, you have to output it somewhere physically and then render the output separately using any of the steps mentioned above.
Another approach is to convert the DVI output to an SVG image file and render that with SVGGraphics2D. That will produce nice scalable results. DVI files can be converted to SVG on the command line (or in a script) using:
dvisvgm --no-fonts input.dvi -o output.svg
For more conversion options see this thread on how to convert pdf to clean svg.
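The dvisvgm call above can also be launched from the Java application itself. This is a rough sketch that assumes `dvisvgm` is installed and on the `PATH`; the class `DviToSvg` and its method names are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the dvisvgm command line shown above.
class DviToSvg {
    /** Builds the argument list: dvisvgm --no-fonts <dvi> -o <svg>. */
    static List<String> buildCommand(Path dvi, Path svg) {
        return Arrays.asList("dvisvgm", "--no-fonts", dvi.toString(), "-o", svg.toString());
    }

    /** Runs the conversion and returns the process exit code (0 normally means success). */
    static int convert(Path dvi, Path svg) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(dvi, svg))
                .inheritIO() // forward dvisvgm's messages to this program's console
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids quoting problems when file names contain spaces.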

Java DOCX file Viewer

Currently I'm developing an application that allows users to create a template and generate it into a DOCX file. The application needs to be able to display to users the changes in the template as the user is creating it.
The approach I tried was to use the DOCX4J library (which allows manipulation of DOCX files) together with ICEPDF, which displays the DOCX in a Swing component after it has first been converted to a PDF file. The problem with this approach is that it loads quite slowly, and some of the changes made to the DOCX file are not reflected in the PDF conversion (for example, dashed underlines and font changes). When I open the DOCX output in MS Word, the file is displayed correctly, so I know the changes do occur, but it seems that ICEPDF just can't show them properly.
So I was wondering if anyone knows a java library that allows DOCX files to be viewed directly from a Swing Component instead of converting it first into a PDF file.
You can try docx4all or DocxEditorKit. Both of these are built around docx4j.

File previewers for PDF and Docx

I would like to have a preview of a .pdf, .docx or .doc file inside a JDialog, but I'm unable to find previewers that allow embedding such previews inside a Swing application. Alternatively, are there any previewers that can transform such files into .html and then display them in a TextPane?
Fidelity isn't that much of an issue as is embedding and ease of use. Also I don't require one tool to be able to preview all types of files.
That's a tough one because of the formats you're dealing with. You might want to try ImageMagick to convert the PDF to an image format for display in your TextPane. If that works well enough for PDFs, then you could use JODConverter or Docmosis to get from Doc to PDF, then ImageMagick again for a display image. JODConverter and Docmosis are based on OpenOffice, which can produce fairly rough HTML/XHTML output as another option for display. The latest version of OpenOffice can also read docx, meaning all your bases are covered, and if fidelity is not too big a deal, as you've indicated, then JODConverter/Docmosis and ImageMagick might be a combination you can use.

Generate Save Convert to TIFF PDF

Please point us in right direction
We have a requirement to
1. Generate a PDF
2. Edit/Enter some fields on it
3. Save/Print the information
4. Should have a button on the PDF, "Convert to TIFF", that generates a TIFF image of that PDF
I am sure we can do 1 and 2 very easily; we are planning to use the iText API.
We don't have any clue about 4.
Experts, if you have any idea, please let us know.
We are using Java.
There are lots of programs which do PDF to image conversion (both Open Source and Commercial). You can also use icepdf, Jpedal, Qoppa and PDFRenderer
You can create, edit and fill PDF form fields using Gnostice PDFOne. PDFOne can also print PDF documents and forms, and it can export the pages of existing documents to image formats. For exporting to TIFF, you will also need the Advanced Imaging IO library from Oracle (Sun). Disclaimer: I work for this company.
If you want a button on the PDF to export the document to TIFF, then that is not possible, as PDF specification does not describe such a feature. As mentioned earlier, any PDF document can be converted to image formats including TIFF.
DISCLAIMER: I work for Gnostice.
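Once a PDF library has rendered a page to a `BufferedImage`, writing the TIFF needs no extra dependency on recent JDKs. The sketch below assumes Java 9 or later, where `javax.imageio` includes a TIFF writer; on older versions the Advanced Imaging IO library mentioned above would be required. The class `PageToTiff` and the `placeholderPage` helper are illustrative only, and the placeholder stands in for whatever image the chosen PDF library produces.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper; the page rendering itself is left to the PDF library.
class PageToTiff {
    /** Encodes an already rendered page image as TIFF bytes. */
    static byte[] toTiff(BufferedImage page) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // ImageIO.write returns false when no writer for the format is installed.
            if (!ImageIO.write(page, "tiff", out)) {
                throw new IllegalStateException("No TIFF writer available (Java 9+ or a plugin is needed)");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for a rendered PDF page: a blank white image with a black border. */
    static BufferedImage placeholderPage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);
        g.drawRect(0, 0, width - 1, height - 1);
        g.dispose();
        return image;
    }
}
```

The "Convert to TIFF" action would then live in the application that shows the PDF, not inside the PDF file.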
