I am trying to create a Word document with Apache POI that contains a JPEG picture. I found code for this in another Stack Overflow question. However, when I run the code, a docx is created and, judging by its size, it seems to contain the JPEG image, but I can't open it.
My code is the following:
import org.apache.poi.util.Units;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.BreakType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
public class SimpleImages {
public static void main(String[] args) throws Exception {
XWPFDocument doc = new XWPFDocument();
XWPFParagraph p = doc.createParagraph();
XWPFRun r = p.createRun();
//for(String imgFile : args) {
String imgFile = "mosaic.jpg";
int format =XWPFDocument.PICTURE_TYPE_JPEG;
r.setText(imgFile);
r.addBreak();
r.addPicture(new FileInputStream(imgFile), format, imgFile, Units.toEMU(200), Units.toEMU(200)); // 200x200 pixels
r.addBreak(BreakType.PAGE);
//}
FileOutputStream out = new FileOutputStream("images.docx");
doc.write(out);
out.close();
}
}
When I try to open the docx, I get:
The file file.docx cannot be opened because there are problems with the contents.
I had the same problem, and it is now resolved. I was previously using POI 3.10, and that version was the culprit. I updated to 3.12 and the issue went away.
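A side note on the size arguments in the question: the comment says "200x200 pixels", but as far as I know Units.toEMU takes points, not pixels, so the picture ends up larger than intended. The sketch below only reproduces the arithmetic with plain Java; the class and method names are made up for illustration, and the pixel conversion assumes 96 dpi.

```java
// Minimal sketch of EMU (English Metric Unit) arithmetic, independent of POI.
// EmuSketch, pointsToEmu and pixelsToEmu are illustrative names, not POI API.
class EmuSketch {
    static final int EMU_PER_INCH = 914400;
    static final int EMU_PER_POINT = EMU_PER_INCH / 72; // 12700
    static final int EMU_PER_PIXEL = EMU_PER_INCH / 96; // 9525, assuming 96 dpi

    // Convert a length in points to EMU.
    static int pointsToEmu(double points) {
        return (int) Math.rint(points * EMU_PER_POINT);
    }

    // Convert a length in pixels (at 96 dpi) to EMU.
    static int pixelsToEmu(int pixels) {
        return pixels * EMU_PER_PIXEL;
    }

    public static void main(String[] args) {
        System.out.println("200 pt = " + pointsToEmu(200) + " EMU");
        System.out.println("200 px = " + pixelsToEmu(200) + " EMU");
    }
}
```

If the picture really should be 200x200 pixels, the pixel-based value is the one to pass as width and height.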
Can anyone help? I've been trying to load MS Word format XML data into an XWPFDocument with Apache POI, but it gives me an error.
Here is the code:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
public class WordMerger {
public static void main(String[] args) {
try {
// Open the first document
XWPFDocument doc1 = new XWPFDocument(new FileInputStream("document1.xml"));
// Open the second document
XWPFDocument doc2 = new XWPFDocument(new FileInputStream("document2.xml"));
// Iterate through the paragraphs of the second document
for (XWPFParagraph p : doc2.getParagraphs()) {
// Create a new paragraph in the first document
XWPFParagraph newParagraph = doc1.createParagraph();
// Iterate through the runs of the current paragraph in the second document
for (XWPFRun r : p.getRuns()) {
// Create a new run in the new paragraph of the first document
XWPFRun newRun = newParagraph.createRun();
// Copy the text and formatting of the current run to the new run
newRun.setText(r.getText(0));
newRun.setBold(r.isBold());
newRun.setItalic(r.isItalic());
newRun.setUnderline(r.getUnderline());
newRun.setColor(r.getColor());
newRun.setFontFamily(r.getFontFamily());
newRun.setFontSize(r.getFontSize());
}
}
// Save the merged document
doc1.write(new FileOutputStream("merged_document.xml"));
// Close the documents
doc1.close();
doc2.close();
System.out.println("Documents merged successfully!");
} catch (IOException e) {
e.printStackTrace();
}
}
}
Is there any way I can read the file and store it in an XWPFDocument?
I keep getting an error with some PDF files. The code works perfectly for some PDFs but fails with an error on others.
Jars used:
forms-7.1.4.jar
io-7.1.4.jar
layout-7.1.4.jar
kernel-7.1.4.jar
package test;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.util.*;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.forms.PdfAcroForm;
public class test5 {
public static final String DATASHEET
= "2.pdf";
public static void main(String[] args) throws Exception {
PdfReader reader = new PdfReader(DATASHEET);
PdfDocument pdfDoc = new PdfDocument(reader);
PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, true);
Set<String> fields = form.getFormFields().keySet();
for (String key : fields) {
PdfName type = form.getField(key).getFormType();
if(type!= null && 0 == PdfName.Btn.compareTo(type) )
{
String[] states = form.getField(key).getAppearanceStates();
for (int i = 0; i < states.length; i++) {
System.out.println(states[i]);
}
}
}
}
}
PDF FILE
This program finds the radio button values in the PDF.
You open the PdfDocument with only a PdfReader, no PdfWriter:
PdfDocument pdfDoc = new PdfDocument(reader);
Thus, you cannot (deeply) change the document. On the other hand you retrieve the AcroForm with the second argument true:
PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, true);
This signals to iText that you want it to add a new AcroForm structure to the document if it does not have one yet. This is a deep change.
Thus, your code works for PDFs that already have an AcroForm structure and fails for PDFs that don't.
So either use a writable PdfDocument (created with a PdfWriter as well as the PdfReader) or don't tell iText to create AcroForm structures (pass false as the second argument). For the latter option you may have to add a null check.
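The second option could look roughly like the sketch below. It is an illustration rather than a drop-in fix: the ButtonStates class and buttonStates method are names made up here, and it assumes the same iText 7.1.x jars listed in the question. It asks for the AcroForm with false and returns an empty list when the PDF has none.

```java
import com.itextpdf.forms.PdfAcroForm;
import com.itextpdf.forms.fields.PdfFormField;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfReader;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative helper (hypothetical name): lists the appearance states of
// button fields without asking iText to create a missing AcroForm.
class ButtonStates {
    static List<String> buttonStates(PdfReader reader) {
        List<String> result = new ArrayList<>();
        PdfDocument pdfDoc = new PdfDocument(reader); // read-only: no PdfWriter
        try {
            // false = do not create an AcroForm if the document has none
            PdfAcroForm form = PdfAcroForm.getAcroForm(pdfDoc, false);
            if (form == null) {
                return result; // PDF without a form: nothing to report
            }
            for (Map.Entry<String, PdfFormField> entry : form.getFormFields().entrySet()) {
                PdfName type = entry.getValue().getFormType();
                if (PdfName.Btn.equals(type)) {
                    for (String state : entry.getValue().getAppearanceStates()) {
                        result.add(entry.getKey() + ": " + state);
                    }
                }
            }
        } finally {
            pdfDoc.close();
        }
        return result;
    }
}
```

Calling buttonStates(new PdfReader("2.pdf")) would then print nothing for a PDF without a form instead of throwing.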
I'm getting an error while converting a document to PDF using the docx4j library in Java. The error is:
NOT IMPLEMENTED support for w:pict without v:imagedata
and it shows up in the converted PDF instead of in my Java terminal.
I have gone through some articles and questions and found this: converting docx to pdf. However, I am not sure how to apply it to my code. This is my code:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.Map;
import org.docx4j.convert.out.pdf.viaXSLFO.PdfSettings;
import org.docx4j.fonts.PhysicalFont;
import org.docx4j.fonts.PhysicalFonts;
import org.docx4j.model.structure.SectionWrapper;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
public class docTopdf {
public static void main(String[] args) {
try {
InputStream is = new FileInputStream(
new File(
"test.docx"));
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
.load(is);
List<SectionWrapper> sections = wordMLPackage.getDocumentModel().getSections();
for (int i = 0; i < sections.size(); i++) {
wordMLPackage.getDocumentModel().getSections().get(i)
.getPageDimensions();
}
PhysicalFonts.discoverPhysicalFonts();
// @Deprecated
Map<String, PhysicalFont> physicalFonts = PhysicalFonts.getPhysicalFonts();
// 2) Prepare Pdf settings
// @Deprecated
PdfSettings pdfSettings = new PdfSettings();
// 3) Convert WordprocessingMLPackage to Pdf
// @Deprecated
org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(
wordMLPackage);
// @Deprecated
OutputStream out = new FileOutputStream(
new File(
"test.pdf"));
conversion.output(out, pdfSettings);
} catch (Throwable e) {
e.printStackTrace();
}
}
}
And my pom.xml:
<dependency>
<groupId>org.docx4j</groupId>
<artifactId>docx4j</artifactId>
<version>3.2.1</version>
</dependency>
Any help would be appreciated, as I am new to this kind of conversion. Thanks in advance.
Creating a PDF via XSL FO doesn't support w:pict without v:imagedata (i.e. a graphic which isn't a simple image).
Whilst you could suppress the message by configuring logging appropriately, your PDF output would be lossy.
Your options are to correct the input docx (i.e. use an image instead of whatever you currently have), or to use a PDF converter with appropriate support. For one option, see https://www.docx4java.org/blog/2020/03/documents4j-for-pdf-output/
Can somebody help me merge one MS Word document into another?
I can open, edit and save, but only with a single MS Word document.
My simple code only creates, edits and saves a .docx:
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.*;
public class SimpleDocument {
public void SimpleDocument() throws Exception {
XWPFDocument doc = new XWPFDocument();
XWPFParagraph p1 = doc.createParagraph();
p1.setAlignment(ParagraphAlignment.CENTER);
p1.setAlignment(ParagraphAlignment.LEFT);//setVerticalAlignment(TextAlignment.TOP);
XWPFRun r1 = p1.createRun();
r1.setBold(true);
r1.setText("The quick brown fox");
r1.setFontFamily("Courier");
r1.setUnderline(UnderlinePatterns.DOT_DOT_DASH);
XWPFParagraph p2 = doc.createParagraph();
p2.setAlignment(ParagraphAlignment.RIGHT);
XWPFRun r2 = p2.createRun();
r2.setText("jumped over the lazy dog");
FileOutputStream out = new FileOutputStream("C:/simple.docx");
doc.write(out);
out.close();
}
}
How can I combine two pieces of formatted text (Range, Paragraph)?
Try the following code:
import java.io.*;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.*;
public class test {
public static void main(String[] args) throws Exception {
// POI apparently can't create a document from scratch,
// so we need an existing empty dummy document
HWPFDocument doc = new HWPFDocument(new FileInputStream("D:\\src.doc"));
Range range = doc.getRange();
CharacterRun run = range
.insertAfter("Text After copied file contents!");
run.setBold(true);
OutputStream out = new FileOutputStream("D:\\result.doc");
doc.write(out);
out.flush();
out.close();
}
}
I am trying to read a Microsoft word file through Java. I have included all the .jar files from Apache poi-3.8-beta1 to my classpath. However, when I try running this, I get the following exception:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:131)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:104)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:138)
at readingmsword07.Main.main(Main.java:27)
Following is my code:
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.*;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
public class Main {
public static void main(String[] args) {
try {
FileInputStream fis = new FileInputStream("C:\\TrialDoc.docx");
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
org.apache.poi.xwpf.extractor.XWPFWordExtractor oleTextExtractor =
new XWPFWordExtractor(new XWPFDocument(fis));
System.out.print(oleTextExtractor.getText());
} catch (Exception e) {
e.printStackTrace();
}
}
}
I am using the XWPFWordExtractor since I am trying to read a Word 2007 document, but for some reason I am unable to figure out the right part of POI to use for this.
Any help is much appreciated. Thanks in advance!
~ Woods
Remove this line:
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
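The exception message says why: POIFSFileSystem handles the older OLE2 format, while a .docx is Office 2007+ XML, which XWPFDocument reads on its own. If you don't know in advance which format a file is in, one approach is to look at its first bytes before choosing the POI class. The sketch below uses only the JDK; WordFormatSniffer and its methods are names made up for illustration, and the signatures are the standard OLE2 and ZIP magic numbers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Illustrative helper (hypothetical name): tells OLE2 (.doc) from OOXML (.docx)
// by the leading bytes of the file.
class WordFormatSniffer {
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04}; // "PK\3\4"

    // Classify a buffer holding the first bytes of a file.
    static String classify(byte[] head) {
        if (head.length >= 8 && Arrays.equals(Arrays.copyOf(head, 8), OLE2_MAGIC)) {
            return "OLE2";   // old binary format: HWPF side of POI
        }
        if (head.length >= 4 && Arrays.equals(Arrays.copyOf(head, 4), ZIP_MAGIC)) {
            return "OOXML";  // zip-based format: XWPF side of POI
        }
        return "UNKNOWN";
    }

    // Read up to 8 bytes from the stream. The stream is consumed, so open a
    // fresh stream before handing the file to POI.
    static byte[] readHead(InputStream in) throws IOException {
        byte[] head = new byte[8];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) break;
            n += r;
        }
        return Arrays.copyOf(head, n);
    }
}
```

The note about consuming the stream applies to the question's code as well: the same FileInputStream was passed to two constructors, so even without the exception the second one would not start reading at the beginning of the file.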