API to read, write and manipulate msword files - java

I want to write some java code to read, write, edit and do other things with sets of Microsoft Word files (different Word versions). What is the best API for this, and how do I get started?
P.S. I searched StackOverflow and found that this question has been asked before, but several years ago. I would like to know the best available API today. Thanks.

Check out Apache POI: https://poi.apache.org/
As quoted on their website: "Apache POI - the Java API for Microsoft Documents"
Personally I have only used it for Excel spreadsheet manipulation, but it supports Word documents as well.
The best part is it's open source and completely free!

Related

How to parse and edit MS Visio file in java code

I have checked multiple links and two options were shown for editing MS visio file in Java code.
Apache POI - HDGF and XDGF - Java API To Access Microsoft Visio Format Files
Aspose.diagram APIs
Has anyone done any coding in Java using either of the above options?
I am using the Eclipse IDE.
Also, please suggest if there is a third, better way to edit MS Visio files using Java code.
If you are talking about libraries, these are basically the two. AFAIK Apache POI can't create diagrams, only read them, if I am not mistaken. Please verify, though; something may have changed since I last looked at it ten years ago.
So this basically leaves you with a single choice. Or you can always spend a few years and write it all yourself. Well, one does not simply walk into Mordor, I mean, create Visio files with Java.
Maybe you could consider using SVG instead, that can be generated and consumed by basically anything? Visio can also read and write SVG out of the box.
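To make the SVG suggestion concrete, here is a minimal sketch that generates a two-box diagram with nothing but the JDK. The class name SvgDiagramSketch, its methods and the output file name are made up for illustration, and how faithfully Visio imports the result is an assumption you would need to check yourself.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: builds a tiny two-box diagram as SVG text.
class SvgDiagramSketch {

    // Escapes the characters that are not allowed in XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Returns one labelled rectangle as SVG markup.
    static String box(int x, int y, int w, int h, String label) {
        return String.format(
            "  <rect x=\"%d\" y=\"%d\" width=\"%d\" height=\"%d\" fill=\"white\" stroke=\"black\"/>%n"
          + "  <text x=\"%d\" y=\"%d\" text-anchor=\"middle\">%s</text>%n",
            x, y, w, h, x + w / 2, y + h / 2, escape(label));
    }

    // Assembles the whole document: two boxes joined by a straight line.
    static String diagram() {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"400\" height=\"120\">\n");
        sb.append(box(20, 30, 120, 60, "Client"));
        sb.append(box(260, 30, 120, 60, "Server"));
        sb.append("  <line x1=\"140\" y1=\"60\" x2=\"260\" y2=\"60\" stroke=\"black\"/>\n");
        sb.append("</svg>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Path out = Paths.get("diagram.svg");
        Files.write(out, diagram().getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

Running main writes diagram.svg to the working directory; any browser can display it, which is a quick way to check the output before trying it in Visio.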

Java libraries that work with Microsoft Office documents but do not depend on automation

By "not depend on automation", I mean that it should not require a Microsoft Office installation to work; let alone interact with a live instance of a Microsoft Office component. One such library is Aspose.Total for Java. Are there any more out there?
Another solution I'm considering is to use OpenOffice.org. However, I'm not sure if I'm going to run into the same problems as with Microsoft Office as detailed here.
For Office Documents: http://poi.apache.org/
I have not tried this myself, but Apache usually deliver good libraries
For just Excel: JExcel API for Java
I use this for one application, and it works quite well. May use a fair bit of RAM for larger documents.
One designed specifically to work with the newer XML formats is docx4j: http://dev.plutext.org/trac/docx4j
There are two further answers to this question, depending on your application:
You can borrow from the OpenOffice library code that deals with opening and saving MS Office files. (See: http://www.artofsolving.com/opensource/jodconverter or jOpenDocument.)
You might just use OpenOffice itself by scripting or automating it.
I faced this question a while back with a Ruby app, and because I was in control of the source document, I got the originator to save things in HTML format and used Tidy to filter out the junk. Another option is to find a tool that converts the Office files to RTF, which is more generic.
Another to consider ...
LibreOffice looks useful.
jExcelAPI if you just want excel.
Finally there are some opportunities on sourceForge, try this search: http://sourceforge.net/search/?q=java+ms+office
You may find spreadsheets BIG unless you use OpenOffice or MS Office, because you need a fancy-schmancy virtual sparse matrix to do what they do well.
ODF Toolkit - http://odftoolkit.org

Apache POI. Setup data filters in Excel

I have been using Apache POI for quite some time and it works great, but I have not been able to find a reliable answer about filter support in the library.
For reference, I mean the filter option available in the Data tab in Excel, which shows all unique values of a column as a combo box in the column header.
I know there is already a question, "generate excel in java", where this was asked.
And I see that Apache POI people checked in something recently for this
https://issues.apache.org/bugzilla/show_bug.cgi?id=35125
Has anyone used the new version of POI to try the filter option?
As waiting for the final release of POI with this feature may not be possible for us, can anyone point out another Excel Java API which allows this option? (JExcel does not, as far as I could find out.) I do see many websites allowing export to Excel with filtering available. If there is no good API that provides it, is there any other way, or any post-processing on the Excel file, that I can use to add this option?
Sorry for the repeated question, but I could not see any other way to resolve my issue other than approaching stackoverflow community
It's already enabled in Apache POI 3.7. How? Take a look:
sheet.setAutoFilter(CellRangeAddress.valueOf("A1:C200"));

How to create an excel file in google app engine (java)?

This question seems to have quite a few options for Python, but none for Java after two days of googling. I really could use some help. All I have found so far is a recommendation to use gaeVFS to build an Excel file from the XML components and then zip it all together, which sounds like a slap in the face. And yes, if you were wondering, I am questioning my use of Java rather than Python, but at 5,000 lines of code it would be insane to turn back now...
Other things you might find useful:
Client: GWT
Server: Servlets running on Google App Engine, storing data in the Google datastore
Excel file: mandatory, CSV isn't good enough. No need to save the file, just to be able to "serve" it to the client, i.e. open a "Save As" box.
Have you checked out this api already: Java Excel API ?
You could also take a look at the Apache POI project. You can read and write MS Excel documents with this library.
Take a look at this post.
It's a step by step tutorial on how to generate excel files on google app engine.
Try this :
http://code.google.com/p/gwt-table-to-excel/
Google App Engine does not let you write to the local file system (file-writing classes such as FileOutputStream are restricted), so you need to use a virtual file system for Google App Engine.
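For what it's worth, the "build the XML components and zip them together" approach mentioned in the question can be done entirely in memory, which fits the requirement that the file only needs to be served, not saved. The following is a rough sketch using only java.util.zip; the class MinimalXlsxSketch and its methods are hypothetical names, it writes a single row of inline-string cells, and it assumes in-memory streams such as ByteArrayOutputStream are permitted on your App Engine runtime (worth verifying). It has not been tested against every Excel version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: assembles a one-row .xlsx package in memory.
class MinimalXlsxSketch {
    static final String PKG = "http://schemas.openxmlformats.org/package/2006/";
    static final String DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    static final String MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    static final String TYPE = "application/vnd.openxmlformats-officedocument.spreadsheetml.";

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Writes one XML part into the zip under the given entry name.
    static void put(ZipOutputStream zip, String name, String xml) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(("<?xml version='1.0' encoding='UTF-8'?>" + xml).getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    // Builds the workbook bytes; supports at most 26 cells (columns A..Z) in row 1.
    static byte[] build(String... cells) {
        if (cells.length > 26) throw new IllegalArgumentException("at most 26 cells");
        StringBuilder row = new StringBuilder("<row r='1'>");
        for (int i = 0; i < cells.length; i++) {
            row.append("<c r='").append((char) ('A' + i)).append("1' t='inlineStr'><is><t>")
               .append(escape(cells[i])).append("</t></is></c>");
        }
        row.append("</row>");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            put(zip, "[Content_Types].xml",
                "<Types xmlns='" + PKG + "content-types'>"
              + "<Default Extension='rels' ContentType='application/vnd.openxmlformats-package.relationships+xml'/>"
              + "<Default Extension='xml' ContentType='application/xml'/>"
              + "<Override PartName='/xl/workbook.xml' ContentType='" + TYPE + "sheet.main+xml'/>"
              + "<Override PartName='/xl/worksheets/sheet1.xml' ContentType='" + TYPE + "worksheet+xml'/>"
              + "</Types>");
            put(zip, "_rels/.rels",
                "<Relationships xmlns='" + PKG + "relationships'>"
              + "<Relationship Id='rId1' Type='" + DOC_REL + "/officeDocument' Target='xl/workbook.xml'/>"
              + "</Relationships>");
            put(zip, "xl/workbook.xml",
                "<workbook xmlns='" + MAIN + "' xmlns:r='" + DOC_REL + "'>"
              + "<sheets><sheet name='Sheet1' sheetId='1' r:id='rId1'/></sheets></workbook>");
            put(zip, "xl/_rels/workbook.xml.rels",
                "<Relationships xmlns='" + PKG + "relationships'>"
              + "<Relationship Id='rId1' Type='" + DOC_REL + "/worksheet' Target='worksheets/sheet1.xml'/>"
              + "</Relationships>");
            put(zip, "xl/worksheets/sheet1.xml",
                "<worksheet xmlns='" + MAIN + "'><sheetData>" + row + "</sheetData></worksheet>");
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    // Lists the entry names of a zip held in memory, in stored order.
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry entry; (entry = in.getNextEntry()) != null; ) names.add(entry.getName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] xlsx = build("Name", "Quantity");
        System.out.println(xlsx.length + " bytes, parts: " + entryNames(xlsx));
    }
}
```

In a servlet you would write the returned bytes to the response output stream, with the content type set to application/vnd.openxmlformats-officedocument.spreadsheetml.sheet and a Content-Disposition: attachment header so the browser opens a "Save As" box. For anything beyond a toy sheet, a library such as Apache POI or the Java Excel API is likely the saner choice.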

Can Java POI write image to word document?

Anyone know if it is possible?
And got any sample code for this?
Or any other java API that can do this?
The Office 2007 format is based on XML and so can probably be written to using XML tools. However there is this library which claims to be able to write DocX format word documents.
The only other alternative is to use a Java-COM Bridge and use COM to manipulate word. This is probably not a good idea though - I would suggest finding a simpler way.
For example, Word can easily read RTF documents and you can generate .rtf documents from within Java. You don't have to use the Microsoft Word format!
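As a rough illustration of the RTF route, the sketch below builds a small document as a plain string using only the JDK. The class RtfSketch and its method names are invented for this example, and it covers only a bold title plus body text; images and richer formatting need more of the RTF specification than is shown here.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: produces a very small RTF document as text.
class RtfSketch {

    // Escapes RTF control characters; non-ASCII characters become \uN? escapes
    // (N is the signed 16-bit code unit, '?' is the fallback for old readers).
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\\' || c == '{' || c == '}') {
                sb.append('\\').append(c);
            } else if (c == '\n') {
                sb.append("\\par ");
            } else if (c < 128) {
                sb.append(c);
            } else {
                sb.append("\\u").append((int) (short) c).append('?');
            }
        }
        return sb.toString();
    }

    // Builds a document with a bold 16pt title and a 12pt body paragraph.
    static String document(String title, String body) {
        return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}"
             + "\\f0\\fs32\\b " + escape(title) + "\\b0\\par "
             + "\\fs24 " + escape(body) + "\\par}";
    }

    public static void main(String[] args) throws Exception {
        String rtf = document("Report", "Generated from Java.\nOpens in Word.");
        Files.write(Paths.get("report.rtf"), rtf.getBytes(StandardCharsets.US_ASCII));
        System.out.println("Wrote report.rtf");
    }
}
```

Because the escaped output is pure ASCII, the file can be written with US_ASCII regardless of the characters in the input.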
As others have said POI isn't going to allow you to do anything really fancy - plus it doesn't support Office 2007+ formats. Treating MS Word as a component that provides this type of functionality via COM is most likely the best approach here (unless you are running on a non-Windows OS or just can't guarantee that Word will be installed on the machine).
If you do go the COM route, I recommend that you look into the JACOB project. You do need to be somewhat familiar with COM (which has a very steep learning curve), but the library works quite well and is easier than trying to do it in native code with a JNI wrapper.
If you are using docx, you could try docx4j.
See the AddImage sample
Surely:
Take a look at this: http://code.google.com/p/java2word
Word 2004+ is XML based. The above framework takes the image, converts it to a Base64 representation and adds it to the XML.
When you open your Word Document, there will be your image.
Simple like this:
IDocument myDoc = new Document2004();
myDoc.getBody().addEle("path/myImage.png");
Java2Word is an API for generating Word documents from Java code. J2W takes care of all the implementation and XML generation behind the scenes.
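The Base64 step described above can be sketched with the JDK alone. This is not Java2Word's actual implementation; the class Base64ImageSketch is made up, and the binData element used as a wrapper here is only an illustrative placeholder, not a claim about the exact markup Word expects.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper: turns image bytes into text that can sit inside XML.
class Base64ImageSketch {

    // Encodes raw bytes as a single-line Base64 string.
    static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    // Wraps the encoded image in a placeholder element (name is illustrative only).
    static String wrap(String name, byte[] imageBytes) {
        return "<binData name=\"" + name + "\">" + encode(imageBytes) + "</binData>";
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java Base64ImageSketch <image file>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(wrap(Paths.get(args[0]).getFileName().toString(), bytes));
    }
}
```

Base64 output only contains letters, digits, '+', '/' and '=', so it needs no further XML escaping once embedded.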
As far as can be gathered from the project website: no.
POI's HWPF can extract an MS Word document's text and perform simple modifications (basically deleting and inserting text).
AFAIK it can't do much more than that.
Also keep in mind that HWPF works only with the older MS Word (97) format, not the latest ones.
Not sure if Java can do it directly out of the box. But I've read about a component that can do pretty much anything in terms of automating Word document generation without having Word installed: Aspose.Words.
JasperReports uses this API alternatively to POI, because it supports images:
JExcelAPI
I didn't try it yet and don't know how good/bad it is.
