Using iText, I want to open a PDF file, add some more pages with text to it, and then close it. I have found some questions like this on here, but all of them require creating a new PDF file. Is there any way to read in the PDF file, modify it, and then overwrite the original?
Of course, you can create a new PDF file and afterwards overwrite the old file with the new one.
Using FileUtils from Apache Commons IO:
FileUtils.forceDelete(oldPdf);
FileUtils.moveFile(newPdf, oldPdf);
Of course, you can always overwrite a file (if it is not locked by the OS) by writing the whole content to a FileOutputStream. You cannot insert data into the middle of a file; the only way to add data without rewriting the file is to append it at the end. This is a limitation of the operating system itself, so there is nothing you can do about it.
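As a minimal sketch of the two cases described above (replacing the whole content versus appending at the end), the following uses only the standard library. The class and method names are made up for illustration, and plain text stands in for the PDF bytes.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class OverwriteVsAppend {

    // Replaces the entire file content: FileOutputStream truncates the file on open.
    static void overwrite(Path file, String content) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Adds data after the existing content: the second constructor argument enables append mode.
    static void append(Path file, String content) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile(), true)) {
            out.write(content.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".txt");
        overwrite(file, "first version");
        overwrite(file, "second");   // the earlier content is gone
        append(file, " + tail");     // the existing content is kept
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

Note that appending raw bytes to a PDF does not add pages; the PDF library still has to produce the complete new document, which is then written over the old one.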
Related
I have a web application using iText v2.1.7 to create PDFs; before anyone tries to move me to a different library, let me point out that, like most programmers, I don't choose the libraries my company uses for things, or I certainly would not use this one.
I have code that generates these PDFs; now I am to add code that takes the contents of an existing PDF and inserts it into the PDF I'm creating.
I've found examples of how to do this, but they all use files. Except for the one I'm reading, I don't have files; I'm in a web application where I don't have easy access to a place to write a file.
Can't I open the existing PDF and somehow insert its entire content into the document I'm creating, without having to write to a file?
After I do this, I will have more content to add to the document, either from another file, dynamically created content, or both, so it isn't a simple merge of my content with one existing file. I also haven't created the existing file as its own entity, to be merged with another file, though I suppose I can do that IF it's necessary.
But I was hoping there was a way (or were ways) to do this without having to reorganize my existing code. It's possible the answer is implied in one of these examples, but they don't explain the concepts behind things, so I don't know where I can put input Streams instead of file input streams, output streams instead of file output streams, etc.
I need to create a Jar file which will read an Excel file and display, as output, both the existing data and the updated data.
This program needs to keep running and displaying the Excel data as output. Any recent update to the Excel file needs to be reflected in the output, along with the previous data.
I know how to create a Jar file, and I am also able to read an Excel file using Apache POI.
I just need an idea of how, on every run, the updated values can be displayed if the Excel file has been updated.
Do we need to implement threading or synchronization? If so, then how?
Synchronization only works inside your Java process. Assuming that an external process creates or updates the Excel file, synchronization will not help you.
The best chance you have is to listen for file-system changes of the Excel file (see WatchService class) and access the file after it has been changed.
To avoid (or at least minimize) file access conflicts, I would open the file, copy the data to memory, and then close the file immediately.
Alternatively you could copy the file and then operate on the copied file. In both cases conflicts can still occur if the program writing the Excel file tries to perform changes while you are accessing the file.
Potential errors include a blocked (locked) file or inconsistent data.
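A rough sketch of the WatchService approach, combined with copying the file into memory, might look like this. The class name, method names, and callback are illustrative assumptions; parsing the bytes (for example by handing a ByteArrayInputStream to Apache POI) is left out.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.function.Consumer;

// Hypothetical helper class, for illustration only.
class ExcelWatcher {

    // Reads the whole file into memory so the file handle is released right away.
    static byte[] snapshot(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    // Blocks, and calls onChange with a fresh in-memory copy whenever the file changes.
    static void watch(Path file, Consumer<byte[]> onChange)
            throws IOException, InterruptedException {
        Path dir = file.toAbsolutePath().getParent();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            // A WatchService observes directories, so register the parent directory.
            dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take(); // waits for the next batch of events
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue;
                    }
                    Path changed = (Path) event.context(); // relative to dir
                    if (changed.equals(file.getFileName())) {
                        try {
                            onChange.accept(snapshot(file));
                        } catch (IOException e) {
                            // The writer may still hold a lock; wait for the next event.
                            System.err.println("Could not read yet: " + e.getMessage());
                        }
                    }
                }
                if (!key.reset()) {
                    break; // the directory is no longer accessible
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ExcelWatcher <path-to-excel-file>");
            return;
        }
        watch(Paths.get(args[0]),
                bytes -> System.out.println("Reloaded " + bytes.length + " bytes"));
    }
}
```

A single save can trigger several modify events, and a snapshot taken while the writer is mid-save can still be incomplete, so the caller should be prepared to discard a copy that fails to parse.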
Using BufferedWriter.write() when is a file created?
I know from the docs that when the buffer is filled it will flush to file, does this mean that:
every time the buffer is filled an incomplete file will appear on my file system?
or that the file is only created when the BufferedWriter is closed?
My concern is that I am writing files to a directory using a BufferedWriter and another process is polling the directory for new files and reading them. I do not want an incomplete file to be created and be read by the other process.
Using BufferedWriter.write() when is a file created?
Never. BufferedWriter itself just writes to another Writer. Now if you're using a FileOutputStream or a FileWriter (where the first would probably be wrapped in an OutputStreamWriter) the file is created (or opened for write if it already exists) when you construct the object, i.e. before you've actually written any data.
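The timing can be observed with a small experiment. This is only a sketch with made-up names: it records the file size right after the FileWriter is constructed, after a small write that stays in the buffer, and after the writer is closed.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class BufferedWriterTiming {

    // Returns the file size at three points:
    // [0] after construction, [1] after a small unflushed write, [2] after close.
    // A value of -1 at index 0 would mean the file did not exist yet.
    static long[] observe(Path file) throws IOException {
        long[] sizes = new long[3];
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(file.toFile()))) {
            sizes[0] = Files.exists(file) ? Files.size(file) : -1;
            writer.write("hello"); // far smaller than the default buffer, so nothing is flushed
            sizes[1] = Files.size(file);
        }
        sizes[2] = Files.size(file); // close() flushed the buffered characters
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("bw-demo").resolve("out.txt");
        long[] sizes = observe(file);
        System.out.println("after construction: " + sizes[0]);
        System.out.println("after small write:  " + sizes[1]);
        System.out.println("after close:        " + sizes[2]);
    }
}
```

So a polling process can see an empty or partially written file at any point between construction and close.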
My concern is that I am writing files to a directory using a BufferedWriter and another process is polling the directory for new files and reading them. I do not want an incomplete file to be created and be read by the other process.
One typical way of handling this is to write to a staging area and then rename the file into the correct place, which is usually an atomic operation. Or even write the file into the correct directory, but with a file extension which the polling process won't spot - and then rename the file to the final filename afterwards.
BufferedWriter doesn't create a file as Jon Skeet said. And you cannot guarantee that another process won't read an incomplete file when it is being written to disk. But there are two things you can do:
Lock the file so that the other process cannot read it before writing is complete. There are several questions concerning file locking in Java on this site (search for "[java] lock file").
Create the file with another filename (i.e. use an extension that is not being looked for by the other process) and rename it when writing is finished.
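The write-then-rename approach could be sketched as follows. The `.part` extension and the class and method names are assumptions chosen for the example; the polling process must be configured to ignore whatever temporary extension is used.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, for illustration only.
class AtomicPublish {

    // Writes the content under a temporary name, then renames it to its final name.
    static Path publish(Path targetDir, String fileName, byte[] content) throws IOException {
        Path staging = targetDir.resolve(fileName + ".part"); // assumed "ignored" extension
        Path finalPath = targetDir.resolve(fileName);
        Files.write(staging, content); // the complete content is on disk before the rename
        try {
            Files.move(staging, finalPath, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback for file systems without atomic rename; no longer guaranteed atomic.
            Files.move(staging, finalPath, StandardCopyOption.REPLACE_EXISTING);
        }
        return finalPath;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("inbox");
        Path result = publish(dir, "report.txt", "done".getBytes(StandardCharsets.UTF_8));
        System.out.println("published " + result);
    }
}
```

Keeping the temporary file in the same directory as the final file makes it likely that both are on the same file system, which is usually what allows the rename to be atomic.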
I need to duplicate various kinds of file types, change them a bit so that the original's md5 hash won't match the modified one, but keep them readable and not corrupted.
TXT files - that's obvious. I just add a random string to the end of the file.
PDF file - well, I started looking for a Java library to edit PDF files, but then I accidentally opened a PDF file in Notepad++ and thought: why don't I try adding a random string to the end of the unreadable content that I see there? To my surprise it worked, and the file wasn't corrupted.
ZIP file - I tried the same thing that I did with the PDF, and it also worked.
DOCX - the same method stopped working here. Appending just a space (" ") to the end of the binary content of a docx file opened in a text editor corrupts the file.
So what I need is:
java libraries for modifying office documents :doc, docx, xls, xlsx, ppt, pptx.
There are still file types whose MD5 hash I need to change, but I don't think they are modifiable in Java - media files, executables, etc.
So, nevertheless, how can I perform what I want on these files? Is there a way to just "touch" the file, change a header or something, and make it nonidentical to an untouched one?
edit:
Ok, here's the motivation - I want to generate massive amount of data as I asked here: How to produce massive amount of data?
At the time of that question, the answers I got there were enough, but now they aren't.
I need the data to be nonidentical. Pairs of files must fail an MD5 hash comparison.
I can't just generate random strings, because I need to simulate real files and documents.
I can't use existing data dumps, because I need various sizes of these data sets that include various file types. I need something that I'll give as an input the size, and it will generate the data for me.
So I figured that I should use a starting data set of all the file types that I eventually need, and just duplicate this data set.
java libraries for modifying office documents :doc, docx, xls, xlsx, ppt, pptx.
Apache POI is used to modify MS Office files. Note that newer formats (xlsx, docx, etc.) are simply ZIP files containing XML. Unzipping them and modifying plain text XML might work as well.
The same advice goes to ZIP files: try unzipping and modifying the easiest file.
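For plain ZIP archives, one way to sketch this idea with java.util.zip is to copy every entry into a new archive and add one small extra entry, which changes the bytes (and therefore the hash) while keeping the archive valid. The names below are made up, and whether an Office application tolerates an unexpected extra part inside a docx or xlsx is an untested assumption; for those formats, editing an existing XML part is probably the safer variant.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class ZipTouch {

    // Copies all entries from src to dst and appends one additional entry.
    static void copyWithExtraEntry(Path src, Path dst, String extraName, byte[] extraData)
            throws IOException {
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(src));
             ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(dst))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // Create a fresh entry so sizes and CRC are recomputed on write.
                out.putNextEntry(new ZipEntry(entry.getName()));
                int read;
                while ((read = in.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
                out.closeEntry();
            }
            // The extra entry makes the copy differ from the original.
            out.putNextEntry(new ZipEntry(extraName));
            out.write(extraData);
            out.closeEntry();
        }
    }
}
```

Calling `copyWithExtraEntry(original, copy, "marker.txt", randomBytes)` with different random bytes per copy should give each duplicate a different hash.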
But what are you actually trying to achieve? Note that appending a random string to the end of a file works only by chance. On another computer, or with another version of the software, the file might be considered corrupted...
I would advise you either to store some metadata external to the file rather than comparing MD5 hashes, or to look deeper into the file formats. There are almost always headers and various pieces of metadata hidden in the file (ID3 tags in MP3, EXIF in images, etc.). It is much safer to modify those instead.
Also look for reserved or unused bytes; they are quite common. But again, why are you doing this in the first place?
I'm facing a problem: we have a .zip file that contains some text files, and I'm using Java to access those files. If a file is not inside the .zip file, I can easily read it and print it to my console using FileInputStream.
But how do I read a file from inside a .zip file? I use J2SE only.
You should try a ZipInputStream. The interface is a little obtuse, but you can use getNextEntry() to iterate through the items in the .zip file.
As a side note, the Java class-loader does exactly this to load classes from .jar files without extracting them first.
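A minimal sketch of the getNextEntry() loop, assuming the entries are UTF-8 text files (the class and method names are made up for the example):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper class, for illustration only.
class ZipStreamReader {

    // Reads every non-directory entry and returns its text, keyed by entry name.
    static Map<String, String> readTextEntries(InputStream zipped) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(zipped)) {
            byte[] buffer = new byte[4096];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // read() returns -1 at the end of the current entry, not of the whole archive.
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                int read;
                while ((read = zin.read(buffer)) != -1) {
                    bytes.write(buffer, 0, read);
                }
                result.put(entry.getName(),
                        new String(bytes.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return result;
    }
}
```

For a file on disk, the method can be called with `new FileInputStream("archive.zip")`, where the file name is a placeholder.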
Everything you need is in ZipFile: https://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipFile.html. Google for examples on the web, and if you have specific problems then come back to SO for help.
(The link will eventually break; when it does simply websearch java zipfile.)
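For reference, reading a single named entry with ZipFile might look like the sketch below. The class and method names are invented for the example, and the entry is assumed to be UTF-8 text.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, for illustration only.
class ZipFileReader {

    // Returns the lines of one text entry inside the given .zip file.
    static List<String> readLines(File zip, String entryName) throws IOException {
        try (ZipFile zipFile = new ZipFile(zip)) {
            ZipEntry entry = zipFile.getEntry(entryName); // null if there is no such entry
            if (entry == null) {
                throw new FileNotFoundException(entryName + " not found in " + zip);
            }
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    zipFile.getInputStream(entry), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return lines;
        }
    }
}
```

Unlike ZipInputStream, ZipFile needs a real file on disk, but it allows direct access to an entry by name without iterating over the whole archive.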