I want to create an application deployed to Google App Engine that uses the docx4j library. The application will read a preexisting docx file (perhaps from a database), parse the document and replace some content, then output the final version as a PDF (also stored in a database).
I know App Engine has quite a few restrictions on what can and cannot run, and I will be using Java. I know it does not allow writing to the filesystem, hence my comment about reading the input file from, and writing the output file to, a database.
Does anyone know if the docx4j library, and its dependencies, will be allowed to run in the App Engine environment?
Thanks!
Java is supported in both Google App Engine environments: standard and flexible. See the GCP docs for the differences between them and choose the environment that suits you. You can use Maven to handle your dependencies (the docx4j library is available in the Maven repository).
When it comes to storing your files, you have a few options that can be used with a Google App Engine app: Google Cloud Datastore, Google Cloud SQL and Google Cloud Storage. The GCP docs compare them.
I'm trying to create a spreadsheet on Google Drive using the Drive API in Java. The documentation on the Google site is very confusing. Can somebody please point me to a sample that demonstrates creating documents on Google Drive in Java using the latest Drive API?
According to this SO question, the Drive API only lets you create new, empty files with the spreadsheet MIME type. The Drive API is only concerned with operations at the whole-file level, so the only other option in Google Drive is to upload a spreadsheet that has already been created.
You can use the Google Sheets API (formerly called the Google Spreadsheets API), which lets you develop client applications that create, read and modify worksheets and data in Google Sheets. This API can manage the worksheets in a Google Sheets file. You should strongly consider using a GData client library to interact with the API. Follow the steps here to set up a development environment for working with the Sheets API.
You can create a spreadsheet via Drive API by calling files.create() with mimeType=application/vnd.google-apps.spreadsheet and uploading a CSV file. See Importing to Google Docs Types where there's a Java sample.
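As a rough illustration of the CSV import described above, here is a minimal sketch that builds the multipart upload request by hand with the JDK's own HTTP classes (Java 11+) rather than the Google client library. The class name `DriveCsvImport`, the fixed boundary string and the way the OAuth access token is obtained are assumptions made for this example and are not taken from the linked sample.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Sketch: build a Drive v3 multipart upload that imports a CSV as a Google Sheet. */
final class DriveCsvImport {
    static final String UPLOAD_URL =
            "https://www.googleapis.com/upload/drive/v3/files?uploadType=multipart";
    static final String SHEET_MIME = "application/vnd.google-apps.spreadsheet";

    /** Builds the multipart/related body: a JSON metadata part followed by the CSV part. */
    static String buildBody(String boundary, String name, String csv) {
        // Minimal JSON escaping for the file name (backslash and double quote only).
        String safeName = name.replace("\\", "\\\\").replace("\"", "\\\"");
        String metadata = "{\"name\":\"" + safeName + "\",\"mimeType\":\"" + SHEET_MIME + "\"}";
        return "--" + boundary + "\r\n"
                + "Content-Type: application/json; charset=UTF-8\r\n\r\n"
                + metadata + "\r\n"
                + "--" + boundary + "\r\n"
                + "Content-Type: text/csv\r\n\r\n"
                + csv + "\r\n"
                + "--" + boundary + "--";
    }

    /** Builds, but does not send, the request. accessToken is an OAuth 2.0 bearer token. */
    static HttpRequest buildRequest(String accessToken, String name, String csv) {
        // Hypothetical fixed boundary; it must not occur anywhere inside the CSV data.
        String boundary = "csv_import_boundary";
        return HttpRequest.newBuilder(URI.create(UPLOAD_URL))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "multipart/related; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofString(
                        buildBody(boundary, name, csv), StandardCharsets.UTF_8))
                .build();
    }
}
```

Sending the request with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` should return the new file's metadata as JSON, assuming the token carries a Drive scope that permits file creation.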
I am developing an app and I need it to be able to post data into a spreadsheet. I hear the Google API has something for this?
Does it handle creating spreadsheets and then adding data to tables etc?
Put simply, the process is:
- Create spreadsheet
- Upload it to the web (google docs spreadsheet)
- It gives you a URL to the spreadsheet
- Send data to it via JSON requests
To learn how to implement it, see this article on how to use a Google Spreadsheet as your JSON backend, or search Google for that term for more links.
Also, don't see this as an easy way to avoid dealing with a database. If you are designing an app that needs to be secure, you are better off learning to use SQLite or the like.
This question seems to have quite a few answers for Python, but I have found none for Java after googling for two days. I could really use some help. All I have found so far is a recommendation to use gaeVFS to build an Excel file from its XML components and then zip it all together, which sounds like a slap in the face. And yes, if you were wondering, I am questioning my use of Java rather than Python, but at 5,000 lines of code it would be insane to turn back now...
Other things you might find useful:
Client: GWT
Server: Servlets running on Google App Engine, storing data in the Google datastore
Excel file: mandatory, CSV isn't good enough. No need to save the file, just to be able to "serve" it to the client, i.e. open a "Save As" box.
Have you already checked out the Java Excel API?
You could also take a look at the Apache POI project. You can read and write MS Excel documents with this library.
Take a look at this post.
It's a step by step tutorial on how to generate excel files on google app engine.
Try this:
http://code.google.com/p/gwt-table-to-excel/
Google App Engine does not allow writing to the local filesystem (classes such as FileOutputStream are blocked), so you need to use the Google App Engine virtual file system (gaeVFS) or keep the file in memory.
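In-memory streams do not touch the filesystem, so one option is to assemble the workbook as a byte array and hand that to the servlet response. The sketch below is a minimal, hand-rolled version of the "XML parts zipped together" approach mentioned in the question, using only JDK classes. The class name `InMemoryXlsx` and the single-column layout are assumptions for illustration, and a library such as Apache POI or the Java Excel API is likely the better choice for anything beyond a trivial sheet.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Sketch: assemble a minimal one-sheet .xlsx entirely in memory. */
final class InMemoryXlsx {
    private static final String XML = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>";
    private static final String PKG = "http://schemas.openxmlformats.org/package/2006/";
    private static final String DOC_RELS =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
    private static final String MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    private static final String CT = "application/vnd.openxmlformats-officedocument.spreadsheetml.";

    /** Returns workbook bytes with one string value per row in column A. */
    static byte[] build(String... values) {
        StringBuilder rows = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            int r = i + 1; // spreadsheet rows are 1-based
            rows.append("<row r=\"").append(r).append("\"><c r=\"A").append(r)
                .append("\" t=\"inlineStr\"><is><t>").append(escape(values[i]))
                .append("</t></is></c></row>");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            put(zip, "[Content_Types].xml", "<Types xmlns=\"" + PKG + "content-types\">"
                + "<Default Extension=\"rels\" ContentType=\"application/vnd.openxmlformats-package.relationships+xml\"/>"
                + "<Default Extension=\"xml\" ContentType=\"application/xml\"/>"
                + "<Override PartName=\"/xl/workbook.xml\" ContentType=\"" + CT + "sheet.main+xml\"/>"
                + "<Override PartName=\"/xl/worksheets/sheet1.xml\" ContentType=\"" + CT + "worksheet+xml\"/>"
                + "</Types>");
            put(zip, "_rels/.rels", "<Relationships xmlns=\"" + PKG + "relationships\">"
                + "<Relationship Id=\"rId1\" Type=\"" + DOC_RELS + "/officeDocument\" Target=\"xl/workbook.xml\"/>"
                + "</Relationships>");
            put(zip, "xl/workbook.xml", "<workbook xmlns=\"" + MAIN + "\" xmlns:r=\"" + DOC_RELS + "\">"
                + "<sheets><sheet name=\"Sheet1\" sheetId=\"1\" r:id=\"rId1\"/></sheets></workbook>");
            put(zip, "xl/_rels/workbook.xml.rels", "<Relationships xmlns=\"" + PKG + "relationships\">"
                + "<Relationship Id=\"rId1\" Type=\"" + DOC_RELS + "/worksheet\" Target=\"worksheets/sheet1.xml\"/>"
                + "</Relationships>");
            put(zip, "xl/worksheets/sheet1.xml", "<worksheet xmlns=\"" + MAIN + "\"><sheetData>"
                + rows + "</sheetData></worksheet>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray(); // complete only after the zip stream has been closed
    }

    /** Writes one XML part into the archive. */
    private static void put(ZipOutputStream zip, String name, String body) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write((XML + body).getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    /** Escapes the characters that are not allowed in XML text content. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

In a servlet, the resulting bytes would be written to the response output stream with the content type `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet` and a `Content-Disposition: attachment; filename="..."` header, which is what makes the browser open a "Save As" box.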
The interop library is slow and needs MS Office installed.
Many times you don't want to install MS Office on servers.
I'd like to use Apache POI, but I'm on .NET.
I only need to extract the text portion of the files, not create or store information in Office files.
I should mention that I've got a very large document library, and I can't convert it to the newer XML formats.
I don't want to write a parser for the binary files.
A library like Apache POI does this for us. Unfortunately, it is only for the Java platform. Maybe I should consider writing this application in Java.
I am still not finding an open source alternative to POI in .NET, I think I'll write my own application in Java.
For all MS Office versions:
You could use third-party components like TX Text Control for Word and TMS FlexCel Studio for Excel.
For the new Office (2007):
You could do some basic stuff using the .NET functionality in System.IO.Packaging. See how at http://msdn.microsoft.com/en-us/library/bb332058.aspx
For the old Office (before 2007):
The old Office formats are now documented: http://www.microsoft.com/interop/docs/officebinaryformats.mspx. If you want to do something really easy you might consider trying it. But be aware that these formats are VERY complex.
Check out the Aspose components. They are designed to mimic the Interop functionality without requiring a full Office install on a server.
As the new docx formats are inherently XML based files, you can create and manipulate them programmatically with standard XML DOM techniques, once you know the structure.
The files are basically zip archives with an alternate file extension. Use the System.IO.Packaging namespace to get access to the internal elements of the file, then open them in an XmlDocument to perform the manipulation.
There are examples available for doing this, and the Office Open XML project on SourceForge may be worth looking at for inspiration.
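Since the question mentions possibly writing the application in Java, here is a minimal sketch of the same zip-plus-XML idea using only JDK classes instead of System.IO.Packaging. The class name `DocxText` is an assumption for illustration. It reads only the main part, `word/document.xml`, and collects the text runs paragraph by paragraph, so headers, footers, footnotes and text boxes would need extra handling.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Sketch: pull plain text out of a .docx by treating it as a zip of XML parts. */
final class DocxText {
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the text of word/document.xml, one line per paragraph. */
    static String extract(byte[] docx) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docx))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                if (!entry.getName().equals("word/document.xml")) {
                    continue; // skip styles, relationships, media, and so on
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true); // needed for the w: namespace lookups below
                // readAllBytes() stops at the end of the current zip entry.
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(zip.readAllBytes()));
                StringBuilder text = new StringBuilder();
                NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
                for (int i = 0; i < paragraphs.getLength(); i++) {
                    // Each w:t element inside a paragraph holds one run of text.
                    NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
                    for (int j = 0; j < runs.getLength(); j++) {
                        text.append(runs.item(j).getTextContent());
                    }
                    text.append('\n');
                }
                return text.toString();
            }
            throw new IllegalArgumentException("no word/document.xml part found");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not parse word/document.xml", e);
        }
    }
}
```

Calling `DocxText.extract(Files.readAllBytes(Path.of("report.docx")))` (with a hypothetical file name) would return the document body as plain text. This only works for the XML-based 2007+ formats, not the older binary .doc files.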
As for the older binary formats, these were proprietary to MS, and the only way you're likely to get at the content from within is through the Office object model (requires an Office install), or a third party file converter/parser.
Unfortunately there's nothing first party and native to the .NET platform to work with these files.
What do you need to do with those files? If you just want to stream them to the user, then the basic file streams are fine. If you want to create new files (perhaps based on a template) that the user can open in Office, there are a variety of workarounds.
If you're actually keeping data in Office documents for use by your web site, you're doing it wrong. Office documents, even Excel spreadsheets and Access databases, are not really an appropriate choice for use with an interactive web site.
If the document is in Word 2007 format, you can use the System.IO.Packaging library to interact with it programmatically.
RWendi
In the Java world, there is also JExcelApi. From what I was able to see, it is very clearly written and much cleaner than POI. So maybe even a port of that code to .NET is not out of the question, depending of course on whether you have enough time on your hands.
OpenOffice.
You can program against it and have it do a lot for you, without spending money on an Office license for the server or exposing the server to the vulnerabilities associated with Office.
Microsoft Excel workbooks can be read using an ODBC driver (or is it an OLE DB driver? can't remember) that makes the workbook look like a database table. But I don't know whether that driver is available without the Office Suite itself.
You can use OpenOffice. It has a command-line conversion tool:
Conversion Howto
In short, you define a macro in OpenOffice and you call that macro with a command-line argument to OpenOffice. In that argument the name of the local file (the Office file) is encoded.
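To make the shape of that command line concrete, here is a small sketch that builds the invocation from Java. The macro name passed in is hypothetical and would have to match a Basic macro you have actually defined in OpenOffice. The `-invisible` switch and the `macro:///` argument follow the older OpenOffice.org conventions, and newer LibreOffice builds spell the switch `--invisible`.

```java
import java.util.List;

/** Sketch: build the command line that asks OpenOffice to run a macro on one file. */
final class OpenOfficeMacro {
    /**
     * macroName is Library.Module.Sub, for example a user-defined
     * "Standard.MyConversions.SaveAsPDF" (hypothetical name).
     */
    static List<String> command(String soffice, String macroName, String inputFile) {
        // The file name is encoded inside the macro:/// argument itself.
        return List.of(soffice, "-invisible", "macro:///" + macroName + "(" + inputFile + ")");
    }

    /** Wraps the command in a ProcessBuilder without starting it. */
    static ProcessBuilder process(String soffice, String macroName, String inputFile) {
        return new ProcessBuilder(command(soffice, macroName, inputFile)).inheritIO();
    }
}
```

Calling `process(...).start().waitFor()` would then launch the conversion and block until OpenOffice exits. File names containing commas or parentheses may need extra care, since they end up inside the macro argument list.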
It's not a great solution, but it should be workable.