iText create PDF/A_1A with images


I am trying to create a PDF/A-1a document with iText 5.5.2.
I can create a simple PDF/A with "Hello World", but I cannot add an image to the document without getting this error:
com.itextpdf.text.pdf.PdfAConformanceException: Alt entry should specify alternate description for /Figure element.
Below is my attempt. I don't know how to add the Alt entry for the image's /Figure element. I tried using a PdfDictionary, but that did not work, and I am out of ideas. Does anyone have a tip for me?
final float MARGIN_OF_ONE_CM = 28.8f;
final com.itextpdf.text.Document document = new com.itextpdf.text.Document(PageSize.A4
, MARGIN_OF_ONE_CM
, MARGIN_OF_ONE_CM
, MARGIN_OF_ONE_CM
, MARGIN_OF_ONE_CM);
ByteArrayOutputStream pdfAsStream = new ByteArrayOutputStream();
PdfAWriter writer = PdfAWriter.getInstance(document,
new FileOutputStream("D:\\tmp\\pdf\\test.pdf"), PdfAConformanceLevel.PDF_A_1A);
document.addAuthor("Author");
document.addSubject("Subject");
document.addLanguage("nl-nl");
document.addCreationDate();
document.addCreator("Creator");
document.addTitle("title");
writer.setPdfVersion(PdfName.VERSION);
writer.setTagged();
writer.createXmpMetadata();
document.open();
final String FONT = "./src/main/resources/fonts/arial.ttf";
Font font = FontFactory.getFont(FONT, BaseFont.CP1252, BaseFont.EMBEDDED);
final Paragraph element = new Paragraph("Hello World", font);
document.add(element);
final InputStream logo = this.getClass().getResourceAsStream("/logos/logo.jpg");
final byte[] bytes = IOUtils.toByteArray(logo);
Image logoImage = Image.getInstance(bytes);
document.add(logoImage);
final String colorProfile = "/color/sRGB Color Space Profile.icm";
final InputStream resourceAsStream = this.getClass().getResourceAsStream(colorProfile);
ICC_Profile icc = ICC_Profile.getInstance(resourceAsStream);
writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
document.close();
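A side note on the margin constant in the code above: PDF user space defaults to 72 units ("points") per inch, so one centimetre is about 28.35 points, slightly less than the 28.8f used for MARGIN_OF_ONE_CM. This is unrelated to the conformance error. The class and method names below are made up for illustration; it is only a minimal sketch of the conversion.

```java
final class PointUnits {
    // Default PDF user space: 72 units ("points") per inch; one inch is 2.54 cm.
    static final double POINTS_PER_INCH = 72.0;
    static final double CM_PER_INCH = 2.54;

    // Converts a length in centimetres to PDF points.
    static float cmToPoints(double centimetres) {
        return (float) (centimetres / CM_PER_INCH * POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // One centimetre comes out at roughly 28.35 points.
        System.out.println(cmToPoints(1.0));
    }
}
```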

Normally, you need to add the alternate description like this:
logoImage.setAccessibleAttribute(PdfName.ALT, new PdfString("Logo"));
This works for PDF/A-2a and PDF/A-3a, but not for PDF/A-1a. I have tested this, and it turns out you have discovered a bug. The bug has now been fixed, and the fix will be in the next release.

Related

Problem about font encoding in PDF/A generation

Here is my problem:
I'm currently working on a Java application that archives documents as PDF/A-1. I'm using PDFBox for PDF generation, but I can't generate a valid PDF/A-1 file because of the font. The font is embedded in the PDF file, but this website, https://www.pdf-online.com/osa/validate.aspx, tells me that the file is not valid PDF/A because:
The key Encoding has a value Identity-H which is prohibited.
I looked up what this Identity-H encoding is, and it seems to be the way the font is encoded, like the ANSI encoding.
I've already tried different fonts, such as Helvetica and Arial Unicode MS, but nothing works; the Identity-H encoding is always there. I'm a bit lost with all this encoding mess, so it would be great if someone could explain it to me. Here is the code I wrote to embed a font in the PDF:
// load the font as this needs to be embedded
PDFont font = PDType0Font.load(doc, getClass().getClassLoader().getResourceAsStream(fontfile), true);
if (!font.isEmbedded())
{
throw new IllegalStateException("PDF/A compliance requires that all fonts used for"
+ " text rendering in rendering modes other than rendering mode 3 are embedded.");
}
Thanks for your help :)
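As background on the Identity-H value from the validator message: as far as I understand it, Identity-H is a predefined CMap for composite (Type 0) fonts in which every character in a text string is a two-byte, big-endian code used directly as the character/glyph identifier, rather than a one-byte code looked up in a table such as WinAnsi. The sketch below only illustrates that byte layout; the class name and the glyph IDs are hypothetical and no PDF library is involved.

```java
final class IdentityHDemo {
    // Writes each glyph ID as a two-byte, big-endian code inside a PDF hex string,
    // which is the layout of text shown with an Identity-H encoded font.
    static String toHexString(int[] glyphIds) {
        StringBuilder sb = new StringBuilder("<");
        for (int gid : glyphIds) {
            if (gid < 0 || gid > 0xFFFF) {
                throw new IllegalArgumentException("glyph ID out of two-byte range: " + gid);
            }
            sb.append(String.format("%04X", gid));
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        // Hypothetical glyph IDs; real values depend on the font file.
        int[] glyphIds = {0x0037, 0x0045, 0x0051};
        System.out.println(toHexString(glyphIds)); // prints <003700450051>
    }
}
```

Because the codes are glyph identifiers rather than character codes, swapping one TrueType file for another does not change the encoding name that the validator reports.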

Problem solved:
I used the Apache example CreatePDFA (I have no clue why it works and my code does not). The example is in examples/src/main/java/org/apache/pdfbox/examples
To meet the PDF/A-3 requirements, I added:
doc.getDocumentCatalog().setLanguage("en-US");
PDMarkInfo mark = new PDMarkInfo(); // new PDMarkInfo(page.getCOSObject());
PDStructureTreeRoot treeRoot = new PDStructureTreeRoot();
doc.getDocumentCatalog().setMarkInfo(mark);
doc.getDocumentCatalog().setStructureTreeRoot(treeRoot);
doc.getDocumentCatalog().getMarkInfo().setMarked(true);
PDDocumentInformation info = doc.getDocumentInformation();
info.setCreationDate(date);
info.setModificationDate(date);
info.setAuthor("KairosPDF");
info.setProducer("KairosPDF");
info.setCreator("KairosPDF");
info.setTitle("Generated PDf");
info.setSubject("PDF/A3-A");
Here is my code to embed a file in the PDF:
private final PDDocument doc = new PDDocument();
private final PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
private final PDDocumentNameDictionary names = new PDDocumentNameDictionary(doc.getDocumentCatalog());
private final Map<String, PDComplexFileSpecification> efMap = new HashMap<>();
public void addFile(PDDocument doc, File child) throws IOException {
File file = new File(child.getPath());
Calendar date = Calendar.getInstance();
//first create the file specification, which holds the embedded file
PDComplexFileSpecification fs = new PDComplexFileSpecification();
fs.setFileUnicode(child.getName());
fs.setFile(child.getName());
InputStream is = new FileInputStream(file);
PDEmbeddedFile ef = new PDEmbeddedFile(doc, is);
//Setting
ef.setSubtype("application/octet-stream");
ef.setSize((int) file.length() + 1);
ef.setCreationDate(date);
ef.setModDate(date);
COSDictionary dictionary = fs.getCOSObject();
dictionary.setItem(COSName.getPDFName("AFRelationship"), COSName.getPDFName("Data"));
fs.setEmbeddedFile(ef);
efMap.put(child.getName(), fs);
efTree.setNames(efMap);
names.setEmbeddedFiles(efTree);
doc.getDocumentCatalog().setNames(names);
is.close();
}
The only problem left is this validation error:
File specification 'Test.txt' not associated with an object.
I hope this helps someone.

Gap between text and barcode with iText

I am generating a Barcode128 with the iText 2.1.3 library. This is the code I am using:
private void createBarcode(String kodDokumentu, String idSadowka, String projekt) throws IOException, DocumentException
{
File barcodePdf = new File(pathToPdf);
Files.deleteIfExists(barcodePdf.toPath());
Document document = new Document();
Rectangle size = new Rectangle(151,60);
document.setMargins(5, 1, -6, 0);
document.setPageSize(size);
FileOutputStream fos = new FileOutputStream(pathToPdf);
PdfWriter writer = PdfWriter.getInstance(document, fos);
document.open();
PdfContentByte cb = writer.getDirectContent();
Barcode barcode128 = new Barcode128();
barcode128.setBarHeight(40);
barcode128.setX(1.04f);
barcode128.setCode("VL#"+kodDokumentu.toUpperCase());
barcode128.setCodeType(Barcode.CODE128);
Image code128Image = barcode128.createImageWithBarcode(cb, null, null);
Font code = new Font(FontFamily.TIMES_ROMAN, 8, Font.NORMAL, BaseColor.BLACK);
Paragraph p = new Paragraph("ID: "+idSadowka+", Projekt: "+projekt.substring(0, 2), code);
document.add(p);
document.add(code128Image);
document.close();
fos.close();
}
I want to make h (see the image) as small as possible, ideally h=0.01, because I want to leave more room for the "ID: xxxxx, Projekt: xx" text so that it can be bigger and easier for a human to read.
At first (bottom barcode) I used font size 8; then (upper barcode) I changed the font size to 10, but h became bigger than before. I know the font size is connected to the gap between the barcode and the text above it, but is it possible to use a bigger font size and still make this gap really small?

I'm not sure whether it's available in the version of iText you use, but in iText 5 at least you can set the "spacing after" on a Paragraph. That should be exactly what you need: you specify a fixed space under the paragraph, and any elements you add to the document afterwards go below that space.
p.setSpacingAfter(x)
where x is the space you need, in user units ("points").
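One plausible reason the gap grows with the font size, offered as an assumption rather than something verified against iText 2.1.3: when no leading is set explicitly, iText 5 derives a paragraph's leading as 1.5 times the font size, so the line box grows along with the font. The sketch below is plain arithmetic with made-up names, not an iText call.

```java
final class LeadingEstimate {
    // Assumption: the default leading is 1.5 times the font size when none is set.
    static final float DEFAULT_LEADING_FACTOR = 1.5f;

    // Estimated default line leading, in points, for a given font size.
    static float defaultLeading(float fontSize) {
        return DEFAULT_LEADING_FACTOR * fontSize;
    }

    public static void main(String[] args) {
        System.out.println(defaultLeading(8f));  // prints 12.0
        System.out.println(defaultLeading(10f)); // prints 15.0
    }
}
```

If that is what drives h, giving the paragraph a fixed leading (in iText 5, p.setLeading(...)) together with a small setSpacingAfter value may be worth trying; this has not been tested here.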

PDFBox incorrect text appearance after copy/paste

I'm using PDFBox 2.0.4 to create PDF documents with AcroForms. Here is my test code:
PDDocument document = new PDDocument();
PDPage page = new PDPage(PDRectangle.A4);
document.addPage(page);
PDAcroForm acroForm = new PDAcroForm(document);
document.getDocumentCatalog().setAcroForm(acroForm);
String dir = "../testPdfBox/src/main/resources/fonts/";
PDType0Font font = PDType0Font.load(document, new File(dir + "Roboto-Regular.ttf"));
PDResources resources = new PDResources();
String fontName = resources.add(font).getName();
acroForm.setDefaultResources(resources);
String defaultAppearanceString = format("/%s 12 Tf 0 g", fontName);
acroForm.setDefaultAppearance(defaultAppearanceString);
PDTextField field = new PDTextField(acroForm);
field.setPartialName("SampleField");
field.setDefaultAppearance(defaultAppearanceString);
acroForm.getFields().add(field);
PDAnnotationWidget widget = field.getWidgets().get(0);
PDRectangle rect = new PDRectangle(50, 750, 200, 50);
widget.setRectangle(rect);
widget.setPage(page);
widget.setPrinted(true);
page.getAnnotations().add(widget);
field.setValue("Sample field 123456");
acroForm.flatten();
document.save("target/SimpleForm.pdf");
document.close();
Everything works fine, but when I copy text from the created document and paste it into Notepad or Word, it turns into squares:
􀀷􀁅􀁑􀁔􀁐􀁉􀀄􀁊􀁍􀁉􀁐􀁈􀀄􀀕􀀖􀀗􀀘􀀙􀀚
I searched a lot about this problem. The most common answer is that the created PDF has no toUnicode CMap. So I explored my document with CanOpener for Acrobat:
Indeed, there is no toUnicode CMap, but everything works properly if I don't call acroForm.flatten(). When the form fields are not flattened, I can copy and paste text from the document and it looks correct. Nevertheless, I need all fields to be flattened.
So, I have two questions:
Why is there a problem with copying and pasting text in the flattened form, while everything is fine in the non-flattened one?
What can I do to avoid the copy/paste problem?
Is the only solution to create the toUnicode CMap on my own, like in this example?
My test PDF files are available here.

Please replace
PDType0Font font = PDType0Font.load(document, new File(dir + "Roboto-Regular.ttf"));
with
PDType0Font font = PDType0Font.load(document, new FileInputStream(dir + "Roboto-Regular.ttf"), false);
This makes sure that the font is embedded in full and not just as a subset.
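For readers wondering what the toUnicode CMap mentioned in the question actually contains: it is a small table that maps the character codes used in the content stream back to Unicode, which is what a viewer relies on for copy and paste. The sketch below builds only the "bfchar" part of such a table as text. The class name and the code-to-character pairs are hypothetical, it handles only two-byte codes and BMP characters, and it does not produce a complete CMap stream or attach anything to a PDF.

```java
import java.util.Map;
import java.util.TreeMap;

final class ToUnicodeSketch {
    // Builds one "beginbfchar ... endbfchar" block mapping two-byte character
    // codes to UTF-16BE values. A single block holds at most 100 entries.
    static String bfcharSection(Map<Integer, Integer> codeToUnicode) {
        if (codeToUnicode.size() > 100) {
            throw new IllegalArgumentException("split into blocks of at most 100 entries");
        }
        StringBuilder sb = new StringBuilder();
        sb.append(codeToUnicode.size()).append(" beginbfchar\n");
        // Sort by character code so the output is deterministic.
        for (Map.Entry<Integer, Integer> e : new TreeMap<>(codeToUnicode).entrySet()) {
            int code = e.getKey();
            int unicode = e.getValue();
            if (code < 0 || code > 0xFFFF || unicode < 0 || unicode > 0xFFFF) {
                throw new IllegalArgumentException("this sketch handles two-byte values only");
            }
            sb.append(String.format("<%04X> <%04X>\n", code, unicode));
        }
        return sb.append("endbfchar\n").toString();
    }

    public static void main(String[] args) {
        // Hypothetical mapping: character code (glyph ID) to Unicode code point.
        Map<Integer, Integer> mapping = new TreeMap<>();
        mapping.put(0x0037, (int) 'S');
        mapping.put(0x0045, (int) 'a');
        System.out.print(bfcharSection(mapping));
    }
}
```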

How to copy/move AcroForm fields from one document to new blank one using IText5 or IText7?

I need to copy the whole AcroForm, including field positions and values, from a template PDF to a new blank PDF file. How can I do that?
In short, I need to get rid of the "background" from the template and keep only the form fields.
The whole point is to create a PDF whose content will be printed on pre-printed templates.
I am using iText 5, but I can switch to iText 7 if useful examples are provided.

After a lot of trial and error I found a solution to "How to copy AcroForm fields into another PDF". It is an iText 7 version. I hope it helps somebody someday.
private byte[] copyFormElements(byte[] sourceTemplate) throws IOException {
PdfReader completeReader = new PdfReader(new ByteArrayInputStream(sourceTemplate));
PdfDocument completeDoc = new PdfDocument(completeReader);
ByteArrayOutputStream out = new ByteArrayOutputStream();
PdfWriter offsetWriter = new PdfWriter(out);
PdfDocument offsetDoc = new PdfDocument(offsetWriter);
offsetDoc.initializeOutlines();
PdfPage blank = offsetDoc.addNewPage();
PdfAcroForm originalForm = PdfAcroForm.getAcroForm(completeDoc, false);
// originalForm.getPdfObject().copyTo(offsetDoc,false);
PdfAcroForm offsetForm = PdfAcroForm.getAcroForm(offsetDoc, true);
for (String name : originalForm.getFormFields().keySet()) {
PdfFormField field = originalForm.getField(name);
PdfDictionary copied = field.getPdfObject().copyTo(offsetDoc, false);
PdfFormField copiedField = PdfFormField.makeFormField(copied, offsetDoc);
offsetForm.addField(copiedField, blank);
}
offsetDoc.close();
completeDoc.close();
return out.toByteArray();
}

Did you check the PdfCopyForms class?
Allows you to add one (or more) existing PDF document(s) to create a new PDF and add the form of another PDF document to this new PDF.
I didn't find an example, but you could try something like this:
PdfReader reader1 = new PdfReader(src1); // a document with a form
PdfReader reader2 = new PdfReader(src2); // a document without a form
PdfCopyForms copy = new PdfCopyForms(new FileOutputStream(dest));
copy.addDocument(reader2); // add the document without the form
copy.copyDocumentFields(reader1); // add the fields of the document with the form
copy.close();
reader1.close();
reader2.close();
I see that the class is deprecated. I'm not sure whether that's because iText 7 makes this much easier, or because there were technical problems with the class.

Not getting font effect in PDF when converting HTML to PDF file

Using the code below, I am able to convert HTML text to PDF, and my code generates the PDF file at a particular location. The problem is that I set a font style on the body tag, but the generated PDF does not show this font style. For example:
// Here On Body Tag I have given a Zurich BT font style
StyleSheet styles = new StyleSheet();
//styles.loadTagStyle("body", "font-family", "Zurich BT");
styles.loadTagStyle("body", "font", "Zurich BT");
So my font style is Zurich BT, but I just get plain text in the generated PDF, with no font effect applied.
I am using itextpdf 5.1.1, and my code is:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Document pdfDocument = new Document();
Reader htmlreader = new StringReader("<html><head></head><body>"
+ " <font> HELLO MY NAME IS JIMIT TANK </font> </body></html>");
PdfWriter.getInstance(pdfDocument, baos);
pdfDocument.open();
// Here On Body Tag I am giving a Zurich BT font style
StyleSheet styles = new StyleSheet();
//styles.loadTagStyle("body", "font-family", "Zurich BT");
styles.loadTagStyle("body", "font", "Zurich BT");
ArrayList arrayElementList = HTMLWorker.parseToList(htmlreader,styles);
for (int i = 0; i < arrayElementList.size(); ++i) {
Element e = (Element) arrayElementList.get(i);
pdfDocument.add(e);
}
pdfDocument.close();
byte[] bs = baos.toByteArray();
String pdfBase64 = Base64.encodeBytes(bs); //output
File pdfFile = new File("c:/pdfExample.pdf");
FileOutputStream out = new FileOutputStream(pdfFile);
out.write(bs);
out.close();

Use the iTextSharp Paragraph class and set the font and style on it like this:
Document doc = new Document(PageSize.A4);
Paragraph paraReportTitle = new Paragraph();
//paraReportTitle.Font = new Font(Font.FontFamily.HELVETICA, 13f, Font.BOLD);
paraReportTitle.Font = new Font(Font.FontFamily.HELVETICA, 8f, Font.NORMAL);
doc.Add(paraReportTitle);
Setting the style in the HTML will not work.
You can also use the BaseFont class in iTextSharp.
