Generating Table of Contents using XMLWorker

I am generating a PDF using iText and XMLWorker. The problem is that we need to generate a TOC for the PDF with page numbers. I have my section headings in a list. With this list I can generate the TOC without page numbers, but our requirement is that page numbers be included as well. Below is my list containing the section details.
List<String> sectionList=new ArrayList<String>();
sectionList.add("Section1");
sectionList.add("Section2");
sectionList.add("Section3");
sectionList.add("Section4");
sectionList.add("Section5");
My CLOB object is
String pdfString="<h1>Section1</h1><p>Some content for section1</p>" +
"<h1>Section2</h1><p>Some content for section2</p>" +
"<h1>Section3</h1><p>Some content for section3</p>" +
"<h1>Section4</h1><p>Some content for section4</p>" +
"<h1>Section5</h1><p>Some content for section5</p>";
Section contents will span more than one page, so we need the page numbers in the TOC. Is there any way to achieve this?
NOTE: This is a sample; we have many sections and subsections.

As of the XML Worker 5.5.4 source, it doesn't seem to create "Chapters" anywhere, which is what is required for creating the table of contents. You can create your own tag and build into XML Worker how to process it. Some browsers may ignore an unknown tag and not display it, so be careful.
How to generate a Table of Contents “TOC” with iText?
JavaDoc method for telling XML Worker how to process a new Tag
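Whichever way the headings end up being detected, the bookkeeping behind a TOC with page numbers is small: note the page each heading lands on, then write the entries out. A minimal, library-agnostic sketch follows. The TocSketch class and its methods are hypothetical, and wiring record(...) to iText (for instance with the value of PdfWriter.getPageNumber() at the moment a heading is written) is an assumption that is left open here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects (title, page) pairs and renders TOC lines.
// It does not call iText; the page numbers must be supplied by the caller.
final class TocSketch {
    private final List<String> titles = new ArrayList<>();
    private final List<Integer> pages = new ArrayList<>();

    /** Remember that the heading `title` was written on page `pageNumber`. */
    void record(String title, int pageNumber) {
        titles.add(title);
        pages.add(pageNumber);
    }

    /** Render one line per entry: title, dot leader, page number, padded to `width` chars. */
    List<String> render(int width) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < titles.size(); i++) {
            String title = titles.get(i);
            String page = String.valueOf(pages.get(i));
            // Two spaces separate the leader from the title and the page number;
            // always keep at least three dots even if the title is long.
            int dots = Math.max(3, width - title.length() - page.length() - 2);
            StringBuilder line = new StringBuilder(title).append(' ');
            for (int d = 0; d < dots; d++) {
                line.append('.');
            }
            lines.add(line.append(' ').append(page).toString());
        }
        return lines;
    }
}
```

Since a TOC normally sits at the front, the page numbers are only known once the body has been laid out, so this usually implies a second pass or moving the TOC pages afterwards.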

Related

Search inside a pdf without opening the contents

I would like to create a search view in Android that searches inside PDF files without opening their contents; if a PDF contains the searched word, only the title(s) of that PDF should be shown.

It is not possible to search for text in a PDF file without reading its content. What you may find without decoding are strings and names (field names, document info, metadata, etc.), and that works only if the document is not encrypted.
Streams in a PDF document are usually compressed (mostly using the FlateDecode filter).
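To see why a plain byte search comes up empty, here is a small sketch using only java.util.zip, which implements the same deflate algorithm that FlateDecode relies on. The class name FlateSearchDemo and the sample content are made up for illustration; no actual PDF is parsed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: compresses bytes the way a Flate-encoded stream would be.
final class FlateSearchDemo {

    /** Compress with zlib/deflate. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompress zlib/deflate data; malformed input is reported as IllegalStateException. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input or preset dictionary required: stop instead of spinning
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Naive byte search; ISO-8859-1 maps every byte to exactly one char. */
    static boolean containsText(byte[] data, String needle) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(needle);
    }
}
```

Searching the compressed bytes for a word finds nothing, while the same search on the inflated bytes succeeds. In a real PDF, inflating is only the first step, since the text in content streams is further shaped by font encodings, which is why a PDF library is normally used for extraction.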

How to add page header with some information and footer with page number in .docx file using docx4j with Java?

I have a Word document with the .docx extension. I want to add a header with some information and a footer with the page number on each page.
I do not know how to add a header and footer to a Word document.
I am using Docx4j open source edition with Java.

Start by looking at samples/HeaderFooterCreate.java
Basically you create the header and footer parts, and add them as rels of the MainDocumentPart. Then you reference these rels appropriately from the sectPr element.
For the actual content of your header/footer parts, I'd suggest you create a docx containing what you want in Word, then use the docx4j webapp or Helper AddIn to generate corresponding code.

POI enable different header/footer for the first page in word docx file

I'm generating a docx file using Apache POI 3.13 and I'm stuck with headers/footers for the first page.
I create the XWPFParagraph[] arrays without any problem. Next I create the headers and footers like this (I've tried different orders):
policy.createHeader(XWPFHeaderFooterPolicy.DEFAULT, defaultHeader);
policy.createFooter(XWPFHeaderFooterPolicy.DEFAULT, defaultFooter);
policy.createHeader(XWPFHeaderFooterPolicy.FIRST, firstHeader);
policy.createFooter(XWPFHeaderFooterPolicy.FIRST, firstFooter);
Once I generate my docx file, I see my default header/footer on every page, including the first one. But if I select the option to use a different header/footer for the first page, my first-page header and footer appear correctly.
How can I make this happen automatically via code? And is there any good documentation with examples for POI?

If you want to set a first-page header in a section, you must add a title page tag to the section properties tag (w:sectPr). The title page tag can be empty, but it is necessary. In your case, you only need to add two lines of code:
CTSectPr sect = document.getDocument().getBody().getSectPr();
sect.addNewTitlePg();
Best regards!

Creating a dynamic PDF in Java

This is not a duplicate question. I had searched and tried many options before posting this question.
We have a web page, in which user should be able to input data in text boxes, text areas, images and also Rich Text editors. This data has to be filled in an existing report, like filling the blanks.
I was able to achieve the functionality using Apache FOP when the user input is simple text. But Apache FOP doesn't work if the user input is rich text (HTML format). FOP will not render HTML; it just pushes the HTML code (e.g. <strong> XYZ </strong>) into the PDF.
I tried using iText, but the setback is that even though iText supports rendering HTML to PDF, it is not able to place the images included in <img> tags into the PDF file.
I can try to create a PDF using the iText API block by block, but the problem is that rich text data entered by the user cannot be embedded between the code, since building a PDF block by block and HTML-to-PDF conversion cannot be done together in iText. Or at least that is what I think from my experience.
Is there any other way to create a pdf file from java with images, rich text rendering as it is, headers and footers?

iText provides the capability to convert HTML data to PDF. Below is a snippet that does it:
Let's assume the HTML data is available as an InputStream (if it's a String, we can convert it to an InputStream using Apache Commons IOUtils).
InputStream htmlData; // HTML data that needs to be converted to PDF
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
Document document = new Document();
PdfWriter pdfWriter = PdfWriter.getInstance(document, outputStream);
document.open();
// convert the HTML with the built-in convenience method
XMLWorkerHelper.getInstance().parseXHtml(pdfWriter, document, htmlData);
document.close();
// outputStream now has the required pdf data
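If pulling in Apache Commons just for the String conversion is not desirable, the JDK alone is enough. A minimal sketch, where the class and method names (HtmlStreams, toUtf8Stream, readUtf8) are made up for illustration and UTF-8 is an assumed encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for turning an HTML String into an InputStream and back.
final class HtmlStreams {

    /** Wrap the HTML in an in-memory stream, encoding it explicitly as UTF-8. */
    static InputStream toUtf8Stream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    /** Read a stream fully and decode it as UTF-8 (handy for checking the round trip). */
    static String readUtf8(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With this, htmlData in the snippet above could be obtained as HtmlStreams.toUtf8Stream(someHtmlString). Naming the charset explicitly avoids depending on the platform default; whether the parser then assumes the same encoding is worth checking for the XML Worker version in use.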

I work as a Social Media Developer at Aspose. To add rich text to a form field in a PDF file, you can try our Aspose.Pdf for Java API. Check the following sample code:
// Open a PDF document
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("c:\\data\\input.pdf");
//Find Rich TextBox field using Field Name
RichTextBoxField textBoxField1 = (RichTextBoxField)pdfDocument.getForm().get("textbox1");
//Set the field value
textBoxField1.setValue("<strong> XYZ </strong>");
// Save the modified PDF
pdfDocument.save("c:\\data\\output2.pdf");

I am not trying to market or promote this product. This API actually solved our problem, so I thought of mentioning it as it might help fellow developers. Please let me know if this is against your policy.
I finally realized that my requirement cannot be met with any of FOP, iText, Aspose, Flying Saucer, or JODConverter.
I found a paid API, Sferyx. It can render very complex HTML to PDF while almost preserving the original style. It also renders the images included in the HTML. We are still exploring this API and will post what other features it provides.

using JSF PrimeFaces' text editor, how to add text in PDF using iText

We are using JSF PrimeFaces' text editor. When we receive the String from the text editor in the backing bean, it also includes HTML tags. The following images might help in understanding the problem.
Following is what we wrote:
Following is what we received:
The next thing we want to do is write what was entered in the text editor, as it is, into a PDF using iText. But we do not know how to convert this string (with HTML tags) into plain data.
Following was the code:

You can use XMLWorker in iText. The code below renders the heading in orange:
document.open();
String finall= "<style>h1{color:orange;} </style><body><h1>This is a Demo</h1></body>";
InputStream is = new ByteArrayInputStream(finall.getBytes());
XMLWorkerHelper.getInstance().parseXHtml(pdfWriter,document, is);
document.close();
Whatever HTML content we provide will be rendered into the PDF. The only thing to take care of is that it works with XHTML, meaning every opening tag must have a matching end tag.
For example, in HTML we would write <br> for a line break, but here it should be <br/>. Hope this helps.
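One way to catch unclosed tags before handing editor output to XMLWorker is to run it through a strict XML parser first. The sketch below uses only the JDK's javax.xml.parsers; the XhtmlCheck class is hypothetical, and a strict parser may well reject input that XMLWorker itself would tolerate (HTML entities such as &nbsp;, or more than one top-level element), so treat it as a conservative pre-check rather than a statement about what XMLWorker accepts.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether markup is well-formed XML.
final class XhtmlCheck {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser from printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Thrown for well-formedness problems such as an unclosed <br>.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here isWellFormed("<body><p>one<br/>two</p></body>") passes, while the same markup with a bare <br> is rejected.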

Use Jsoup to achieve this.
Then in your code:
Jsoup.parse(textRecievedFromEditor).text();
This will return the text without HTML tags.
For example, given the HTML <p>Hello <b>there</b> now!</p>, p.text() returns "Hello there now!".
