I am using PDFBox to display PDF files inside a JInternalFrame. When opening a PDF, I get many warnings like this:
Changing font on <m> from <Tahoma Negrita> to the default font
I am aware that the fonts being reported are not part of the standard set of 14 fonts. So I decided to check whether those fonts are embedded in the PDF file (thinking that there shouldn't be a problem loading embedded fonts, right?).
So I opened the file in different readers and checked properties/fonts. I am not sure whether this section reports fonts required by the document or fonts actually embedded in the document.
The information that I get is as follows:
BAAAA+Tahoma-Bold (embedded Subset), type:TrueType, Encoding:
CAAAA+Tahoma (Embedded Subset), type:TrueType, Encoding:
Confused by this, I researched how to embed fonts from OpenOffice and found that the PDF/A-1a option should be checked. So I made another PDF using this option (in case it was not used when making the original PDF file), yet I got the same results.
I would like your guidance understanding how this works. I would like to be able to open PDF files just as PDF readers do. I also read about the PDFBox_External_Fonts.properties but I am guessing this file shouldn't be modified since I am dealing with embedded fonts.
Thanks.
PDFBox is not able to parse embedded subsets of TrueType fonts.
As far as I understand it, embedded TrueType subsets are missing some of the font file metadata that PDFBox needs.
The bug is known but not easy to solve. Right now I can only advise using embedded Type 1 fonts if possible, since PDFBox can deal with them.
You can also try setting the path to your complete font files in your pdfbox.jar under org/apache/pdfbox/resources/PDFBox_External_Fonts.properties, so that if PDFBox cannot parse the subset, it can at least find a full path to the original font file. Maybe that works, but I have not tested it.
Good Luck!
Related
I am looking for a way to embed an external font (one that is not available by default) for use on a PDF page with Apache PDFBox in Java. Does anyone know how to do it?
They have a way to use external fonts as described at
Ref - https://pdfbox.apache.org/1.8/cookbook/workingwithfonts.html :
...PDFBox will load Resources/PDFBox_External_Fonts.properties off of the classpath to map font names to TTF font files. The UNKNOWN_FONT property in that file will tell PDFBox which font to use when no mapping exists.
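The lookup the cookbook describes (a font name maps to a TTF file, and UNKNOWN_FONT is used when no mapping exists) can be sketched with plain java.util.Properties. This is only an illustration of that behaviour, not PDFBox's own code: the class name FontMappingSketch, the method resolve, and the sample font names and paths are all hypothetical, and the key format of the real properties file may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class FontMappingSketch {

    /**
     * Returns the font file mapped to fontName, or the UNKNOWN_FONT entry
     * when no mapping exists (null if that entry is missing as well).
     */
    public static String resolve(Properties mapping, String fontName) {
        String path = mapping.getProperty(fontName);
        return path != null ? path : mapping.getProperty("UNKNOWN_FONT");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical entries; names and paths are for illustration only.
        String sample = "Tahoma=/fonts/tahoma.ttf\n"
                      + "UNKNOWN_FONT=/fonts/fallback.ttf\n";

        Properties mapping = new Properties();
        mapping.load(new StringReader(sample));

        System.out.println(resolve(mapping, "Tahoma"));   // mapped entry
        System.out.println(resolve(mapping, "Verdana"));  // falls back to UNKNOWN_FONT
    }
}
```

In a real setup the properties would be read from the classpath resource named in the cookbook rather than from an inline string.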
If you ran into a specific issue or error, please share it so I can offer a more actionable response.
I use the PDFBox 1.8.3 jar to print a PDF file on a hardware printer. I printed the file both ways: the normal way and from my program. When I print the PDF the normal way, the printed document matches the original PDF. When I print it from my code, it does not: the alignment, fonts, and ink differ from the original document.
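// ReadPDF is the poster's own helper class; loadPdf(path) returns a PDDocument.
ReadPDF readPDF = new ReadPDF();
PDDocument document = readPDF.loadPdf(path);
// Note: this appends a blank page to the document before printing.
document.addPage(new PDPage());
// In PDFBox 1.8.x, PDDocument implements java.awt.print.Pageable.
PrinterJob printerJob = PrinterJob.getPrinterJob();
printerJob.setPageable(document);
PrintRequestAttributeSet printRequestAttributeSet = new HashPrintRequestAttributeSet();
printRequestAttributeSet.add(new PageRanges(1, 3));
printerJob.print(printRequestAttributeSet);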
I also tried to upgrade from the PDFBox 1.8.3 jar to the upcoming 2.0.0 jar, but I ran into a few difficulties (for example, in PDFBox 2.0.0 I am unable to use printerJob.setPageable(document);). Could you please help me solve this issue?
This is sometimes related to the printer as well. Please try a different printer, just to check.
You can have a look at the answer to the StackOverflow question below and make use of extracts from the explanation.
How to determine artificial bold style, artificial italic style and artificial outline style of a text using PDFBOX
Also, verify that the fonts used by the original PDF are present in the container in which the application is running.
Shishir
We are currently working with a selection of publishers to generate online books from their PDFs. Our legacy app uses Flex, so we are converting the PDFs to SWF files using PDF2SWF from SWFTools.
The problem that we are having is that the text within the SWF document is not being highlighted by our flex reader when the user performs a search. After a quick investigation we found that when extracting text we need to embed the fonts that are used by the PDF document:
http://wiki.swftools.org/wiki/How_do_I_highlight_text_in_the_SWF%3F
pdf2swf -F $YOUR_FONTS_DIR$ -f input.pdf -o output.swf
As you can see from the command above, we need a path to a font directory containing the fonts found within that PDF.
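For a Java app, the command above could be assembled per file roughly as follows. This is a minimal sketch that only reuses the flags shown in the command; the class name Pdf2SwfCommand, the method build, and the example paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class Pdf2SwfCommand {

    /** Builds the pdf2swf argument list in the same order as the command above. */
    public static List<String> build(String fontsDir, String inputPdf, String outputSwf) {
        return Arrays.asList("pdf2swf", "-F", fontsDir, "-f", inputPdf, "-o", outputSwf);
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only.
        List<String> command = build("/path/to/fonts", "input.pdf", "output.swf");
        System.out.println(String.join(" ", command));

        // To actually run it (requires pdf2swf on the PATH):
        // new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list to ProcessBuilder avoids shell quoting problems when file names contain spaces.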
Since we will be converting a large number of PDFs, is it possible to access the font files directly through the PDF rather than storing a lot of fonts within our app?
Additional Information
Our app is written in Java.
We are currently using PDFBox and Ghostscript within the app, so solutions that use these libraries would be preferred, but we are open to all ideas.
PDF files don't contain font 'files'. They may not even contain any fonts at all, though this is rare. The embedded font data can be in a bewildering variety of formats:
Type 1 PostScript fonts
Type 3 PostScript fonts
TrueType fonts
PostScript CFF fonts
CIDFonts with type 1 PostScript outlines
CIDFonts with type 3 PostScript outlines
CIDFonts with TrueType outlines
CIDFonts with CFF outlines
CIDFonts with bitmap images
Will your application be able to read all these font formats? If you want to use them, you must use the fonts embedded in the PDF file, because these are very often subset fonts supplied with a custom Encoding. That means that even if you have the original font, you can't use it, because the Encoding will not be correct.
Of course it may be that these PDF files are all created in a consistent way and do not use embedded fonts, but I have my doubts....
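Subset fonts are usually recognisable by name: readers list them as an uppercase tag, a plus sign, and the base font name (for example BAAAA+Tahoma-Bold). The PDF specification describes the tag as six uppercase letters; the sketch below accepts any run of uppercase letters so that it also handles names as some viewers display them. The class and method names are hypothetical, and a matching name is only a hint, not proof that the font program is embedded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubsetFontName {

    // Uppercase tag, then '+', then the base font name.
    private static final Pattern SUBSET = Pattern.compile("^([A-Z]+)\\+(.+)$");

    /** True if the name carries a subset tag such as "BAAAA+". */
    public static boolean isSubset(String fontName) {
        return SUBSET.matcher(fontName).matches();
    }

    /** Strips the subset tag if present; otherwise returns the name unchanged. */
    public static String baseName(String fontName) {
        Matcher m = SUBSET.matcher(fontName);
        return m.matches() ? m.group(2) : fontName;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"BAAAA+Tahoma-Bold", "CAAAA+Tahoma", "Helvetica"}) {
            System.out.println(name + " -> subset=" + isSubset(name)
                    + ", base=" + baseName(name));
        }
    }
}
```

Even with the base name recovered, substituting the original font is unreliable for the reason given above: the subset carries its own Encoding.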
I would like to have a preview of a .pdf, .docx or .doc file inside a JDialog, but I'm unable to find previewers that allow nesting such previews inside a Swing application. Alternatively, are there any previewers that can transform such files into .html and then display them in a TextPane?
Fidelity isn't as much of an issue as embedding and ease of use. Also, I don't require one tool to be able to preview all types of files.
That's a tough one because of the formats you're dealing with. You might want to try ImageMagick to convert PDF to an image format for display in your TextPane. If that works well enough for PDFs, then you could use JODConverter or Docmosis to get from Doc to PDF, then ImageMagick again for a display image. JODConverter and Docmosis are based on OpenOffice, which can produce fairly rough HTML/XHTML output as another option for display. The latest version of OpenOffice can also read docx, meaning all your bases are covered, and if fidelity is not too big a deal, as you've indicated, then JODConverter/Docmosis and ImageMagick might be a combination you can use.
I was wondering what steps are needed to render Asian characters using the Java-based xhtmlrenderer (Flying Saucer) library.
I want to render the following:
<html>
<body>同名の映画のモデ</body>
</html>
Without any font settings added to the HTML, this renders fine in normal browsers, but I can't find any way to render it to PDF using the iTextRenderer portion of xhtmlrenderer.
After following various threads on the mailing list, I see lots of posts talking about adding .TTF files from the c:\windows\fonts directory, and I have modified the examples to run on Linux ( https://gist.github.com/643173745182c9becc57 ), which shows me various fonts being displayed, but I don't see any Asian glyphs.
Does anyone have any decent pointers, or clean solutions to this problem? Or am I looking at the wrong problem with a really simple solution elsewhere?
You can also add the font style information in css.
@font-face {
font-family: 'your_font_face_name';
src: url('your_font.ttf');
-fs-pdf-font-embed: embed;
-fs-pdf-font-encoding: Identity-H;
}
To support a large character set, you need to specify a font file that contains all of those characters. Once you've picked a font file, your application needs to point to that file. I've found that just putting the font files in your fonts directory doesn't work.
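When picking a font file, it can help to first see which scripts the text actually uses. A minimal sketch using only the JDK's Character.UnicodeScript follows; the class name ScriptsInText and the method scriptsOf are hypothetical, and the sample string is the one from the question, written with Unicode escapes so the source compiles regardless of file encoding.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

public class ScriptsInText {

    /** Returns the set of Unicode scripts that occur in the given text. */
    public static Set<UnicodeScript> scriptsOf(String text) {
        Set<UnicodeScript> scripts = EnumSet.noneOf(UnicodeScript.class);
        // Iterate by code point so characters outside the BMP are handled too.
        text.codePoints().forEach(cp -> scripts.add(UnicodeScript.of(cp)));
        return scripts;
    }

    public static void main(String[] args) {
        // The sample text from the question, as Unicode escapes.
        String sample = "\u540c\u540d\u306e\u6620\u753b\u306e\u30e2\u30c7";
        System.out.println(scriptsOf(sample));
    }
}
```

For the sample text this reports Han, Hiragana and Katakana, so the chosen font needs glyphs for all three scripts.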
Try embedding the font too, e.g.:
renderer.getFontResolver().addFont("your_font_file.ttf", BaseFont.EMBEDDED);
This link has quite a few font files.