I would like to create a Word document using a template, replace some variables (fields) and save it as a new Word document.
I was thinking of using Apache POI (http://poi.apache.org/). Is it the best choice for this purpose?
Can you share your impressions of it?
I've worked with POI before and it's certainly able to generate Word documents. But the devil is in the details.
Word has thousands of features: You can put numbered lists starting at #13 with negative indents into two joined cells of a table included in another table that is itself part of a bullet list... you get the idea. When the POI documentation says they are a work in progress, that reflects what will probably be an eternal state of trying to catch up to the (to us, undocumented) specification of Word.
Documents with a reasonably "normal" set of used features are well supported by POI, whose interfaces and methods are reasonable and consistent but sometimes require a bit of work. But as Pascal says, documents with a not too exorbitant set of features are also supported by RTF.
I have almost no experience "doing" RTF but it's probably a bit simpler than working with POI.
If you're working in an environment or for a customer who insists that your produced documents be .DOC rather than .RTF, then POI is pretty much your only choice, unless you can introduce a step where you use a bit of Office automation to convert RTF into DOC.
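For a rough idea of what hand-written RTF involves, here is a minimal sketch in plain Java with no library at all. The class and method names are made up for illustration, and it covers only a tiny part of the format: backslashes and braces are escaped, non-ASCII characters are written as \uN escapes, and line breaks become \par.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of POI or any RTF library.
final class MiniRtf {

    // Escapes the characters that have a special meaning in RTF.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\\' || c == '{' || c == '}') {
                sb.append('\\').append(c);
            } else if (c == '\n') {
                sb.append("\\par ");
            } else if (c > 127) {
                // The Unicode control word takes a signed 16-bit decimal value,
                // followed by a fallback character for old readers.
                sb.append("\\u").append((int) (short) c).append('?');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps the escaped body in a minimal header: one font, 12 pt text.
    static String document(String body) {
        return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24 "
                + escape(body) + "}";
    }

    // Writes the document; after escaping, the content is pure ASCII.
    static void write(Path target, String body) throws IOException {
        Files.write(target, document(body).getBytes(StandardCharsets.US_ASCII));
    }
}
```

Calling MiniRtf.write(Paths.get("letter.rtf"), "Dear customer,\nthank you.") should give a file that Word opens directly. Anything beyond plain paragraphs (tables, lists, styles) needs much more of the RTF specification than this.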
Update: I've had a couple more ideas in the meantime.
Using POI or creating RTF documents is something that you could do on practically any platform. At my job, all servers doing processing like this happen to be running Linux, for example.
However, in the likely case that your programs will run under Windows, there is another alternative: Jacob http://www.land-of-kain.de/docs/jacob/
Jacob is a COM interface for Java; it essentially allows you to "remote control" Windows programs such as Word and Excel. The document I linked to above is not to Jacob's own site but to a single page with "cookie cutter" recipes for using Jacob. The project itself is on SourceForge: http://sourceforge.net/projects/jacob-project/ But people claim, and rightly so, that the documentation is a bit lacking.
Jacob has the advantage over all other solutions that you're dealing with the "real" Word and therefore all capabilities of Word are available to you. This would be an alternative if there are detail aspects of your document that just can't be handled with POI or via the RTF format.
This is obviously way too late, but since 2013 there has been a much better, more flexible solution for Word document creation.
http://www.docx4java.org/trac/docx4j
I have had much more luck with docx4j than I ever did with POI.
I'm not sure of the exact status of Word document support in POI but, according to the POI website, work is still in progress (I can't say exactly what this means). So, at this time, I would not use POI but would rather try to generate an RTF document. For this, you could:
Use RTFTemplate, which is an RTF-to-RTF engine that generates an RTF document by merging an RTF model with data.
Use iText, which is primarily a PDF generator but can also generate RTF.
Build your own custom solution (but I wouldn't do that).
I'd go for iText.
If you use a template and do not want to create the Word document from scratch, then as far as I know POI is a pretty good solution. You open the template and select the zones you want to replace.
They say POI is still in development, but I've been using it in a production environment and it works pretty well at the moment.
I know this question is a bit old, but I think many people still find it through search engines, so I'm posting another way to do what you want right here:
If the one and only goal is to have a Word template and to replace some values in it, you might consider saving the Word template as a single XML file (not docx) and then processing it with plain Java, without any framework. If you want to do more (e.g. create lists or tables), you might also consider learning the XML format and writing your own helpers before pulling in a framework like POI.
Here is an example of how to do that:
http://dev-notes.com/code.php?q=10
This is the quick version; if you want a nicer version, you could try using an XML processor.
PS: users might notice that the file extension is not doc but xml, and they may blame you for that, but that's OK... just rename it to doc; Word will recognize the format and everyone is happy again ;)
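A minimal sketch of the quick version in plain Java follows. It makes several assumptions that are not from the linked example: placeholders in the template are written as ${name}, the XML file is UTF-8 encoded, and Word has not split a placeholder across several text runs (which can happen if you edit or format it piece by piece). The class name is made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper for illustration; placeholder syntax ${name} is an assumption.
final class XmlTemplateFiller {

    // Values end up inside XML text nodes, so the markup characters must be escaped.
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
    }

    // Replaces every ${key} in the template with the escaped value for that key.
    static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("${" + entry.getKey() + "}", escapeXml(entry.getValue()));
        }
        return result;
    }

    // Reads the template file, fills it and writes the result (UTF-8 assumed).
    static void fillFile(Path template, Path output, Map<String, String> values) throws IOException {
        String xml = new String(Files.readAllBytes(template), StandardCharsets.UTF_8);
        Files.write(output, fill(xml, values).getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the replacements run one after another, a value that itself contains something like ${other} could be replaced again by a later entry; for untrusted input a single-pass replacement or a real XML processor would be safer.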
You should look into the Aspose.Words components. They have recently begun providing a Java version of the component.
See the following link: Aspose.Word for Java
This supports Word automation, creation and advanced features such as mail merging without the need for an instance of Microsoft Word on the machine. The real benefits are that you are able to work within the context of an actual Word document and do not have to compromise by creating RTFs etc.
The Java version is not currently as fully featured as the .Net version but the main core functionality is there and they are pushing very hard to have a feature equivalent version soon.
Also, if you purchase the Java version you get a year's free upgrades / support as the new releases are created.
If you are working with docx documents, docx4j is an option. Like POI, it's open source.
I created and use this: http://code.google.com/p/java2word
How do I generate a report as an Excel file from Java? Is there any link describing this topic? I am using Spring 3. Please suggest some examples.
You will likely need to use some 3rd party libraries. One such option is Java Excel API library as illustrated in this post by Lars Vogel.
You can check out the sample here
Disclaimer: I haven't used it before, but the article seems pretty descriptive. Hope it helps.
I've used Apache POI. It seems to be good enough for Excel file generation (though its Word document generator is not mature enough, by the way). I'm not sure it's very easy but it's quite flexible.
There are many libraries for generating reports. I have worked with JasperReports and Apache POI.
I think POI is a good choice for you. It's very easy.
Jxls is a useful option. It integrates with Apache POI to allow you to have report templates that your Java code fills in with data.
See http://jxls.sourceforge.net/
I used previous versions, but it looks like it has come quite a ways since then.
I use the simreport library. I find it simple enough for making Excel reports in Java. It builds the report based on the report you want to make, so it is very easy to understand, edit and customize. It took me only 15 minutes to start making my first one. Give it a try; it's at http://www.jsimreport.com
I know some similar questions already exist, but I haven't found a satisfying answer yet.
I found several libraries such as Apache POI and JExcelAPI. However, as I don't have any previous experience with any Java Excel API, perhaps those of you who have used them before can enlighten me regarding the advantages and disadvantages of each API. My requirements are reliability and ease of use, because I have to parse and write numerous Excel reports with ~10,000 lines in each file.
I'm also considering JXLS, which can parse and write documents using a template to minimize coding effort, but based on my tests, we have to hard-code the startRow and endRow when parsing (the startRow and endRow are different for each of my files).
Actually, even the old versions of POI will support 10,000 rows: the old .xls format allows up to 65,536 rows per sheet.
But the latest POI supports the XML-based file formats introduced with Excel 2007 (.xlsx), which allow far more rows, and therefore I'm sure memory will be your only limitation.
I use POI in a corporate application, and I've never had a problem with it.
Aspose.Cells for Java allows you to create or parse large Excel files in Java applications. The API is simple along with complete documentation and support. A large number of users have already incorporated it in their applications. It is easy to learn and any questions can be answered quickly through our support forum. You may try this and see if it helps in your scenario.
Disclosure: I work as developer evangelist at Aspose.
I've used JExcel with great success, although I can't say that any of those files were on the order of 10,000 rows per file.
I'd wonder if you'd be better off with a relational database with this volume of data. Excel might have been a fine way to start, but maybe it's time to ask yourself if you've outgrown spreadsheets.
I'm looking for 2 or 3 of the most common, industry-wide libraries for the Java platform for creating PDFs on the fly.
The one requirement I'm focusing on is the ability to use specific formatting such as page layout and font sizes and typefaces (this will be a dynamically created legal document with frustratingly specific type standards).
I'm not actually going to be the one implementing this (I'm not a Java developer), but I am trying to get the ball rolling and need to pass some things along so our dev team can start investigating.
I'm investigating iText at the moment, which seems to be a well established option. I'm not yet sure how robust/flexible the templating abilities are, though.
EDIT: I just realized that there's probably no one 'right' answer for this question so maybe this is better as part of the Wiki.
iText is probably the best all around free tool.
PDFLib is another choice if you are willing to pay for the license. It has a few more features and has a native implementation backing the Java API.
There is always FOP (from Apache) if you are willing to deal with XSLT and XSL-FO, but I believe they haven't updated those engines in a while.
I agree that iText is a great tool. However, the current version of iText is not free if you intend to use it in a closed source project. See Wikipedia:
In the end of 2009, iText version 5 is released under Affero GPL license. This license is drastically different from the previous license that iText had been distributed under, in that it requires anyone using iText 5 under a free license to provide the users with the full source of their application. Projects that do not want to provide their source code are required to purchase a commercial license for a non-disclosed price or they cannot upgrade to iText 5.
However, you may still use iText 4 under the LGPL license.
Take a look at Apache FOP. Very powerful.
iText will probably serve most of your purposes. However, if you are looking to convert from RTF or DOC to PDF, you can use a Java plugin for open source tools like OpenOffice (openoffice.org).
Hope this is helpful,
R
iText is probably your #1 standard in this area. You might also consider JODReports or Docmosis since they can do template-based reporting using standard word processor documents as templates.
Have you considered http://jasperforge.org/?
I have a document that needs to be filled in (it is a Microsoft Word doc), and I have no idea how to fill it in or integrate it with my current web app.
Is there any good Java API / lib that could be used? Preferably a free one.
Here is an example of the doc that needs to be filled in.
http://drop.io/callmeblessed/asset/debt-agremeent-certificate-doc
Apache POI - the Java API for Microsoft Documents
If Leniel's suggestion doesn't work (I would suggest trying POI first, as well), there's the OpenOffice.org Java UNO API, which has a different implementation. It introduces a significant runtime dependency, but if POI doesn't cut it, it's the obvious second choice.
Docmosis can do this as long as you have control of the source document and can specify placeholders etc. It's free and makes use of OpenOffice to do the format conversions. The Docmosis engine can do document manipulation (population, repetition, deletion etc). Load balancing and scalability features are paid for though.
I'm looking for a simple (free) way to convert an arbitrary document to a PDF from within a program. There are any number of free PDF printers, but I need to be able to call the conversion within a program without human intervention. The program is being developed in Java, but will run exclusively in a Windows environment so calling an exe seems like a good solution if such a conversion program exists.
I have had some success with JodConverter, which is a Java-based wrapper around the OpenOffice.org API. Basically, you can run OpenOffice as a server and automate the action of opening a document in OpenOffice (which supports many many types) and saving it as PDF. JodConverter makes that a lot easier and has built-in support for running as a web service if you're interested in that.
Downsides: 1) Like OpenOffice itself, the conversion for certain complicated proprietary documents is not perfect; some of your Word documents may not look exactly identical as PDFs. 2) OpenOffice as a server is not entirely stable; if you hit it with a bunch of requests it will crash. One (somewhat expensive -- I think a few thousand dollars US) alternative is Sun's StarOffice Server, which does exactly the same thing as JodConverter (wrap OpenOffice) but adds pooling of OpenOffice instances and other stability support.
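If a burst of requests is what brings the server down, one inexpensive mitigation is to limit on the caller side how many conversions run at the same time. The sketch below uses only java.util.concurrent; the Conversion interface and the ConversionThrottle class are hypothetical stand-ins for whatever code actually talks to OpenOffice, not part of JodConverter, and this is not the instance pooling that StarOffice Server provides.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for the code that actually calls OpenOffice.
interface Conversion<T> {
    T run() throws Exception;
}

// Allows at most maxConcurrent conversions to run at once; other callers wait.
final class ConversionThrottle {
    private final Semaphore permits;

    ConversionThrottle(int maxConcurrent) {
        // Fair semaphore: waiting callers are served in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    <T> T submit(Conversion<T> conversion) throws Exception {
        permits.acquire();
        try {
            return conversion.run();
        } finally {
            // Always give the permit back, even if the conversion failed.
            permits.release();
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With new ConversionThrottle(1), conversions are effectively serialized, which trades throughput for not overloading a single OpenOffice instance.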
The most accurate PDF conversion tools are made by Adobe (and they do have server-based converters with API support), but they are very expensive - tens of thousands of dollars US.
simple... free... pdf... arbitrary input... At least the requirements are easy and reasonable.
Seriously, those requirements just aren't going to be met. If you are willing to pay money for a library that does some of this, you can check out Amyuni - It's a great library, but the type of stuff you are asking for is squarely in native win32 land - not something that's going to happen in Java. And even with that in place, it's not going to be simple.
I suppose you could do something with Ghostscript as well (many of the free PDF converters use it). But even then, you still have to deal with the conversion from arbitrary input issue.
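As a rough sketch of the Ghostscript route, called as an external program from Java: the code below builds the command line for Ghostscript's pdfwrite device and runs it with ProcessBuilder. It only helps for input Ghostscript itself understands (PostScript or PDF), so the arbitrary-input problem remains. The class name is made up, and the executable name you pass in (for example gswin64c.exe on Windows) is an assumption about your installation.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; requires Ghostscript to be installed.
final class GhostscriptPdf {

    // Builds the command line for a non-interactive conversion to PDF.
    static List<String> command(String gsExecutable, Path input, Path output) {
        return Arrays.asList(
                gsExecutable,
                "-dBATCH",            // exit after processing the input
                "-dNOPAUSE",          // do not wait for a key press between pages
                "-sDEVICE=pdfwrite",  // use the PDF output device
                "-sOutputFile=" + output,
                input.toString());
    }

    // Runs Ghostscript and returns its exit code (0 means success).
    static int convert(String gsExecutable, Path input, Path output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(gsExecutable, input, output))
                .inheritIO() // show Ghostscript's own messages on our console
                .start();
        return process.waitFor();
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids quoting problems with paths that contain spaces.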
There are other libraries available that can display lots of different file formats (even without the native application available) - perhaps something like that would work. Here's one (owned by Oracle now, so you know it's gotta be good ;-) : Outside In.
(BTW - iText is most definitely not going to do what you are asking about. I love iText, I use iText - heck, I'm a developer for part of iText - but it's most definitely not a PDF print driver, which is more in line with what you are going for).
For Java, the most commonly recommended library is iText.