Apache PDFBox cannot find class 'Loader'. Why?

I'm using either pdfbox-app-2.0.18.jar or pdfbox-app-2.0.17.jar.
Based on the example here, I have the following code:
try (FileOutputStream fos = new FileOutputStream(signedFile);
PDDocument doc = Loader.loadPDF(inputFile)) {
// code
}
When I run this code, I get the following error:
org.apache.pdfbox.Loader is not found
How can I resolve this issue?

The Loader class does not exist in PDFBox 2.x or earlier (it was introduced in 3.0), so you can't use it with the 2.0.x jars.
Instead, use the static load() method of the PDDocument class to load PDF files.
Change the code to this:
try (FileOutputStream fos = new FileOutputStream(signedFile);
PDDocument document = PDDocument.load(inputFile)) {
// code
}
See also: https://pdfbox.apache.org/2.0/migration.html

The Loader class was added on January 25, 2020 (see the SVN log).
It is not part of version 2.0.18, as it does not appear in the source archive:
pdfbox-2.0.18-src.zip
The class is simply too new, which is why you cannot use it.
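To confirm which API the jar on your classpath actually provides, you can probe for the class by name at runtime. The following is only a minimal sketch using plain JDK reflection; the class name PdfBoxApiProbe and the method isClassAvailable are hypothetical helpers, not part of PDFBox.

```java
final class PdfBoxApiProbe {

    // Returns true if the named class can be found by this class's class loader.
    // The class is looked up without being initialized.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, PdfBoxApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not loadable.
            return false;
        }
    }
}
```

Calling PdfBoxApiProbe.isClassAvailable("org.apache.pdfbox.Loader") should return false with a 2.0.x jar, in which case PDDocument.load() is the method to use.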

The PDDocument class represents the PDF document being processed. Its load() method loads the PDF file specified by a File object:
PDDocument document = PDDocument.load(new File("path/to/pdf"));

Related

Java - Convert a docx to a pdf document

I am trying to convert a docx document containing a logo to a pdf document.
I have tried this :
FileInputStream in=new FileInputStream(fileInput);
XWPFDocument document=new XWPFDocument(in);
File outFile=new File(fileOutput);
OutputStream out=new FileOutputStream(outFile);
PdfOptions options=null;
PdfConverter.getInstance().convert(document,out,options);
But in the pdf document the logo is not at the right place.
Is there a way to create a PDF that is exactly the same as the docx document?

Could documents4j be an option? It delegates the conversion to a native application.
This is achieved by delegating the conversion to any native application which understands the conversion of the given file into the desired target format.
File wordFile = new File( ... );
File target = new File( ... );
IConverter converter = ... ;
Future<Boolean> conversion = converter
.convert(wordFile).as(DocumentType.MS_WORD)
.to(target).as(DocumentType.PDF)
.prioritizeWith(1000) // optional
.schedule();
You can quickly test whether the conversion fits your requirements with the "Local demo" on a Windows machine with Word and Excel installed:
git clone https://github.com/documents4j/documents4j.git
cd documents4j
cd documents4j-local-demo
mvn jetty:run
Then go to http://localhost:8080
See the full documentation here :
http://documents4j.com/#/

org.apache.poi.xwpf.converter.xhtml.XHTMLConverter not generating images

I am using the org.apache.poi.xwpf.converter.xhtml.XHTMLConverter class to convert docx to HTML. Below is my Groovy code:
public Map convert(String wordDocPath, String htmlPath,
Map conversionParams)
{
log.info("Converting word file "+wordDocPath)
try
{
...
String notificationWorkingFolder = "C:\\tomcats\\Notification\\store\\Notification1234"
FileInputStream fis = new FileInputStream(wordDocPath);
XWPFDocument document = new XWPFDocument(fis);
XHTMLOptions options = XHTMLOptions.create().URIResolver(new FileURIResolver(new File(notificationWorkingFolder)));
File htmlFile = new File(htmlPath);
OutputStream out = new FileOutputStream(htmlFile)
XHTMLConverter.getInstance().convert(document, out, options);
log.info("Converted to HTML file "+htmlPath)
return [success:true,htmlFileName:getFileName(htmlPath)]
}
catch(Exception e)
{
log.error("Exception :"+e.getMessage(),e)
return [success:false]
}
}
The above code converts the docx to HTML successfully, but if the docx contains any images it emits <img src="C:\tomcats\Notification\store\Notification1234\word\media\image1.png"> and does not copy the image to that folder. As a result, when I open the HTML file, all images appear empty. Am I missing something in the code? Also, is there a way to generate an image source link instead of an absolute path, like <img src="http://localhost:8080/webapp/image1.png">?

I found the answer to the first question at lychaox.com/java/poi/Word07toHtml.html. I had to add one line of code, options.setExtractor(new FileImageExtractor(imageFolderFile));, to generate the images.
I resolved the second question with a pattern search and replacement on the generated HTML.
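The answer does not show the search and replacement itself. One possible way to do it, as a minimal sketch using only java.util.regex, is below. The class ImgSrcRewriter, the method rewrite, and the baseUrl parameter are hypothetical names for illustration, and the pattern assumes double-quoted src attributes like the one in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImgSrcRewriter {

    // Group 1: everything from "<img" up to and including src="
    // Group 2: the file name after the last '/' or '\' in the src value
    private static final Pattern IMG_SRC =
            Pattern.compile("(<img\\b[^>]*?\\bsrc=\")[^\"]*[\\\\/]([^\"\\\\/]+)\"");

    // Replaces the directory part of each matching img src with baseUrl.
    // A src value that contains no path separator is left unchanged.
    static String rewrite(String html, String baseUrl) {
        String replacement = "$1" + Matcher.quoteReplacement(baseUrl) + "$2\"";
        return IMG_SRC.matcher(html).replaceAll(replacement);
    }
}
```

For example, ImgSrcRewriter.rewrite(html, "http://localhost:8080/webapp/") would turn the absolute Windows path from the question into a link ending in image1.png under that base URL.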

Even with proper usage, it's worth noting that XHTMLConverter uses XHTMLMapper, which does not process headers, footers, or VML Images. Any images falling into those categories will be lost.
The PDFConverter is more fully featured, but also uses the GPL licensed library, iText.

How to not hard-code a file path

I'd like to calculate the path of a file placed into Source Packages using this implementation:
URL pathSource = this.getClass().getResource("saveItem.xml");
When I try to create a new File like the code below:
File xmlFile = new File(pathSource.toString());
And I try to use it to create a document like this:
Document document = builder.parse(xmlFile);
This gives me a java.io.FileNotFoundException.
How can I calculate the file path without hard-coding?
PS: I already used pathSource.getPath() but it doesn't work either.
I would like to use a similar implementation:
FXMLLoader loader = new FXMLLoader(getClass().getResource("FXMLDocument.fxml"));
PPS: The structure is the following:

You can't access a resource that is inside a JAR file as a File instance. You can only get an InputStream to it.
As such, the following line
File xmlFile = new File(pathSource.toString());
won't work properly and when an attempt is made to read it later, a FileNotFoundException will be thrown.
Assuming you're trying to parse an XML file using DocumentBuilder, you can use the parse(InputStream) method:
try (InputStream stream = this.getClass().getResourceAsStream("saveItem.xml")) {
Document document = builder.parse(stream);
}
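One detail worth handling: getResourceAsStream() returns null when the resource is not found, so a null check gives a clearer error than a failure inside the parser. A self-contained sketch of the same approach follows; the class XmlResourceReader and its method names are hypothetical.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class XmlResourceReader {

    // Parses XML from any InputStream (classpath resource, file, byte array, ...).
    static Document parse(InputStream stream)
            throws IOException, SAXException, ParserConfigurationException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(stream);
    }

    // Looks up 'name' relative to the package of 'anchor' and parses it.
    // Throws FileNotFoundException if the resource is not on the classpath.
    static Document parseResource(Class<?> anchor, String name)
            throws IOException, SAXException, ParserConfigurationException {
        try (InputStream stream = anchor.getResourceAsStream(name)) {
            if (stream == null) {
                throw new FileNotFoundException("Resource not on classpath: " + name);
            }
            return parse(stream);
        }
    }
}
```

In the question's setup this would be called as XmlResourceReader.parseResource(getClass(), "saveItem.xml").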

Short answer: saveItem.xml is not on the classpath.
If it is a web application, the file can be placed in the WEB-INF/classes folder.
Edit:
Try this.getClass().getResourceAsStream() too.

getClass().getResource("saveItem.xml");
looks for the file in the same package as the class that getClass() returns (packages are directories when you look at the file system).
Make sure the file is in there. Also make sure it's really there when you run your code: there's a difference between your source folder and the target or bin folder where the compiled class files are placed.
Also check what pathSource.toString() contains.
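For a resource on disk, pathSource.toString() returns a URL string that starts with "file:", which new File(...) treats as a literal path, hence the FileNotFoundException. If you really need a File and the resource is not packed inside a JAR, converting through a URI is one option. This is a sketch only; ResourcePaths and toFile are hypothetical names.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;

final class ResourcePaths {

    // Converts a file: URL to a File. Resources inside a JAR have a jar: URL
    // and cannot be represented as a File, so they are rejected here.
    static File toFile(URL url) throws URISyntaxException {
        if (!"file".equals(url.getProtocol())) {
            throw new IllegalArgumentException("Not a file: URL: " + url);
        }
        return Paths.get(url.toURI()).toFile();
    }
}
```

Reading through getResourceAsStream() remains the safer choice, because it works the same way whether the resource is on disk or inside a JAR.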

How to create PS file from PDF file using Java?

I wrote an application that loads a PDF file into a PDDocument using the PDFBox library, and it works fine:
PDDocument pdfDoc = PDDocument.load(pdfFile);
Now I want to create a PS (PostScript) file from the PDF file. Is there any way to do this in Java? I can use any free API.
Many thanks.

Adobe seems to have a library. Here are some instructions. Please note, I have not tried this myself: http://help.adobe.com/en_US/livecycle/9.0/programLC/help/index.htm?content=000761.html
This link has a more detailed solution:
http://help.adobe.com/en_US/livecycle/9.0/programLC/help/index.htm?content=000074.html

You can use PDFDocument to load your PDF then use PSConverter to convert the PDF document into an OutputStream.
The library I'm using is called ghost4j:
import org.ghost4j.converter.PSConverter;
import org.ghost4j.document.PDFDocument;
Here's a small snippet:
private ByteArrayOutputStream convertPDFtoPS(){
ByteArrayOutputStream outstreamFile = new ByteArrayOutputStream();
try{
PDFDocument document = new PDFDocument();
//getPDFFile just returns an InputStream of the PDF file
document.load(getPDFFile());
PSConverter converter = new PSConverter();
converter.convert(document, outstreamFile);
outstreamFile.close();
}
catch(Exception e)
{
e.printStackTrace();
}
return outstreamFile;
}

FileInputStream vs ClassPathResource vs getResourceAsStream and file integrity

I have a weird problem:
In src/main/resources I have a "template.xlsx" file.
If I do this:
InputStream is = new ClassPathResource("template.xlsx").getInputStream();
Or this :
InputStream is = ClassLoader.getSystemResourceAsStream("template.xlsx");
Or this :
InputStream is = getClass().getResourceAsStream("/template.xlsx");
and then try to create a workbook:
Workbook wb = new XSSFWorkbook(is);
I get this error :
java.util.zip.ZipException: invalid block type
But when I get my file like this:
InputStream is = new FileInputStream("C:/.../src/main/resources/template.xlsx");
It works !
What is wrong? I can't hard-code the full path to the file.
Can someone help me with this?
Thanks

I had the same issue. You probably have a problem with Maven resource filtering.
This code loads the file from the source directory, unfiltered:
InputStream is = new FileInputStream("C:/.../src/main/resources/template.xlsx");
This code loads the file from the target directory, after Maven has filtered its content:
InputStream is = getClass().getResourceAsStream("/template.xlsx");
You should not filter binary files such as Excel workbooks. Instead, use two mutually exclusive resource sets, as described at the bottom of the Maven Resources Plugin page.
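To check whether the build really altered the file, you can compare a checksum of the source copy with a checksum of the copy on the classpath. The sketch below uses only the JDK; StreamChecksum and sha256 are hypothetical names, and this is a diagnostic aid rather than a fix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StreamChecksum {

    // Reads the stream to the end and returns its SHA-256 digest as lowercase hex.
    static String sha256(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

If sha256 of the FileInputStream on src/main/resources/template.xlsx differs from sha256 of getClass().getResourceAsStream("/template.xlsx"), the file was modified on its way into the target directory, which is consistent with filtering having corrupted it.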

Have you tried accessing it like this?
InputStream is = new FileInputStream("/main/resources/template.xlsx");
