Pdf Text Coordinate / Font - java

I have a project in which I need to extract the title and author information from the content of a PDF file (not from its metadata). My plan is to read text from the PDF at given coordinates and to look at the fonts of that text.
Is there a way to do that? Can anyone give advice, or suggest another approach for this project?
Thanks for any help and thoughts you can share.

There are several PDF libraries for Java that can extract text; my favourite is iText. For examples of text parsing, have a look at ExtractPageContentArea and the other examples from chapter 15 of iText in Action, 2nd edition.
Currently there is no example that makes use of the font information, but that information is available to the RenderListeners.
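
Once a listener has collected text fragments together with their font and position, picking a likely title is plain Java. The following is a minimal sketch, not an iText example: TextChunk and TitleGuesser are hypothetical names, the chunks are assumed to have been filled in by whatever PDF library you use, and "largest font on the first page" is only a heuristic that will not fit every document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for one text fragment reported by a PDF library. */
final class TextChunk {
    final String text;
    final String fontName;
    final float fontSize;
    final float x; // left edge of the baseline
    final float y; // baseline height; in PDF user space y grows upward

    TextChunk(String text, String fontName, float fontSize, float x, float y) {
        this.text = text;
        this.fontName = fontName;
        this.fontSize = fontSize;
        this.x = x;
        this.y = y;
    }
}

final class TitleGuesser {
    /** Joins all chunks set in the largest font size, top to bottom, then left to right. */
    static String guessTitle(List<TextChunk> firstPageChunks) {
        float max = 0f;
        for (TextChunk c : firstPageChunks) {
            max = Math.max(max, c.fontSize);
        }
        List<TextChunk> titleParts = new ArrayList<>();
        for (TextChunk c : firstPageChunks) {
            if (Math.abs(c.fontSize - max) < 0.01f) {
                titleParts.add(c);
            }
        }
        // Higher y comes first because the PDF origin is at the bottom of the page.
        titleParts.sort((a, b) -> {
            int byLine = Float.compare(b.y, a.y);
            return byLine != 0 ? byLine : Float.compare(a.x, b.x);
        });
        StringBuilder title = new StringBuilder();
        for (TextChunk c : titleParts) {
            if (title.length() > 0) {
                title.append(' ');
            }
            title.append(c.text.trim());
        }
        return title.toString();
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for what a listener would collect.
        List<TextChunk> chunks = Arrays.asList(
                new TextChunk("Jane Doe", "Helvetica", 11f, 72f, 650f),
                new TextChunk("PDF Parsing", "Helvetica-Bold", 18f, 72f, 678f),
                new TextChunk("A Study of", "Helvetica-Bold", 18f, 72f, 700f));
        System.out.println(guessTitle(chunks));
    }
}
```

The same list can be filtered by fontName or by a coordinate range to look for the author line, for example the first chunk directly below the title.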

Related

Convert html string to pptx in java

In my application, users enter notes in the browser. These notes can be formatted (font, size, color, etc.) and are saved in the database as strings containing HTML tags.
Now I want to export this formatted text to PPTX. Is there a solution for this? So far I have tried Apache POI, which supports formatted text but does not accept an HTML string as input.
I am looking for an open source library, so using Aspose is difficult. Somehow I need to render the HTML text and then copy it as-is into the PPTX.
Any solution or approach will be helpful.
EDIT: I am thinking of custom-parsing the HTML string: using JAXB to convert the tags into objects and then using some Java logic to integrate them with POI. Any help on achieving this will be appreciated.
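
The custom-parsing idea from the EDIT can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a full HTML parser: StyledRun and HtmlRunParser are hypothetical names, only b/strong and i/em tags without attributes are recognized, and entities and every other tag are left in the text untouched. It uses a small regex scanner rather than JAXB. Each resulting run could then be mapped to one text run in the presentation library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value object: a piece of text plus the styles active for it. */
final class StyledRun {
    final String text;
    final boolean bold;
    final boolean italic;

    StyledRun(String text, boolean bold, boolean italic) {
        this.text = text;
        this.bold = bold;
        this.italic = italic;
    }

    @Override
    public String toString() {
        return "[" + text + " bold=" + bold + " italic=" + italic + "]";
    }
}

final class HtmlRunParser {
    // Only attribute-free bold/italic tags are handled in this sketch.
    private static final Pattern TAG =
            Pattern.compile("<(/?)(b|strong|i|em)>", Pattern.CASE_INSENSITIVE);

    static List<StyledRun> parse(String html) {
        List<StyledRun> runs = new ArrayList<>();
        int boldDepth = 0;
        int italicDepth = 0;
        int pos = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            // Text before this tag belongs to the styles that were active so far.
            if (m.start() > pos) {
                runs.add(new StyledRun(html.substring(pos, m.start()),
                        boldDepth > 0, italicDepth > 0));
            }
            int delta = m.group(1).isEmpty() ? 1 : -1;
            String name = m.group(2).toLowerCase();
            if (name.equals("b") || name.equals("strong")) {
                boldDepth = Math.max(0, boldDepth + delta);
            } else {
                italicDepth = Math.max(0, italicDepth + delta);
            }
            pos = m.end();
        }
        if (pos < html.length()) {
            runs.add(new StyledRun(html.substring(pos), boldDepth > 0, italicDepth > 0));
        }
        return runs;
    }

    public static void main(String[] args) {
        for (StyledRun run : parse("Plain <b>bold <i>both</i></b> end")) {
            System.out.println(run);
        }
    }
}
```

For real user input, a proper HTML parser would be more robust than a regex; the run list is the part that carries over.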
Aspose.Slides lets you import HTML text into a presentation and also export a presentation to HTML. I suggest visiting the following documentation link for this purpose. You are right that Aspose.Slide
I work as a developer evangelist at Aspose.

Using qoppa to create pdf with table structure just like we do using itext pdf library

A sample PDF is shown in the image. We need to create a two-column structure that can contain text, images, figures, etc. We also need to control text formatting such as font, size, and automatic wrapping. The text content is dynamic, since we don't know it at compile time, so the layout should flow on its own after each paragraph ends, without hard-coded heights. Giving absolute positions for components in Qoppa is therefore not feasible for us.
We have already explored the Qoppa library and couldn't figure out how to produce a PDF like the one shown in the image. If anyone has worked with Qoppa, please share any useful resources available online, and please let me know whether it is possible to create a PDF like this with Qoppa.

How can I start a RTF editor from scratch(using Java)

I know this may sound silly to some of you experienced guys out there, but it's really important for me and my group at school. We need to create software that lets the user create a new RTF document from scratch (an editor where you can center text, change font size and style, save, and insert pictures). It also needs to be able to read a .docx document, including its images and formatting, and save it as an RTF document.
What we have done so far is open the .docx document, extract the text without formatting, and write it out to an RTF document. In other words, using the docx4j library we have been able to transform the text of a .docx document to .rtf: no pictures, no formatting, just plain text surrounded by [ ].
We made some progress today but can't figure out the next steps. Since the delivery date is in 72 hours, I thought it would be a good idea to ask for help from people more experienced than us.
Please leave your answers or ask for more info about the project; we'll be glad to learn from you guys.
To convert a .docx to .rtf use a library like https://code.google.com/p/jodconverter/. It will do all the heavy lifting for you.
Anyway, now about your editor itself. If I had to do it as fast as I could, I would use JavaFX to build the interface. It has a rich text editor control, HTMLEditor (http://docs.oracle.com/javafx/2/ui_controls/editor.htm), which you can just drop into your application.
The trick here is that you can extract the editor's content as HTML using getHtmlText(), and then convert the HTML to RTF using... yes, a library. I suspect that jodconverter can do this too, but if not, you can look at this question: Convert HTML to RTF in java?.
This should give you a better idea of how to do your project. There are Java libraries to handle conversion between HTML and RTF, so you can use an HTML editor (provided by JavaFX). And of course, a .docx can be converted to HTML too. Let libraries do all the dirty work :).
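
As a rough illustration of the HTML-to-RTF step, here is a minimal sketch that uses the Swing editor kits bundled with the JDK instead of jodconverter. HtmlToRtf is a hypothetical helper name, and the input string stands in for what getHtmlText() would return. Swing's RTF writer is quite limited, so expect some formatting to be lost; treat this as a starting point rather than a finished converter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.rtf.RTFEditorKit;

final class HtmlToRtf {
    /** Parses the HTML into a Swing document and serializes that document as RTF. */
    static String convert(String html) {
        try {
            HTMLEditorKit htmlKit = new HTMLEditorKit();
            Document doc = htmlKit.createDefaultDocument();
            // Avoids a charset-change exception if the HTML carries a meta charset tag.
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            htmlKit.read(new StringReader(html), doc, 0);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            new RTFEditorKit().write(out, doc, 0, doc.getLength());
            return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException | BadLocationException e) {
            throw new IllegalStateException("HTML to RTF conversion failed", e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>world</b></p></body></html>";
        System.out.println(convert(html));
    }
}
```

If the result is not faithful enough for your documents, a dedicated conversion library is the next thing to try.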

Finding the Coordinates of the selected text in a pdf using any pdf api for java

I want to extract selected text from a PDF using any PDF API for Java. I have been unable to do this with iText or PDFBox, so can anyone help me out? I want to extract the text that I select by dragging the mouse over it in the PDF, and then attach a comment to that selected text.
As far as I know, this kind of feature can be implemented with the excellent JPedal library.
Please refer to the following URL: http://www.jpedal.org/
HTH
Jerome

Changing/Replacing text inside a PDF using Java

Any clue about a good library to programmatically produce a PDF in Java, using a PDF as a template?
Try iText. It has many, many goodies. There is also a book about it from Manning, iText in Action.
If you want to be able to edit the text in the template, you should set up the template carefully and use forms for the text content. You can't easily replace text in PDFs because they do not contain text structure.
There is a blog article highlighting some of the issues at http://pdf.jpedal.org/java-pdf-blog/bid/17370/Problems-editing-PDF-files
