Hey guys, so I am brand new to the world of Java XML parsing and found that the StAX API is probably my best bet, as I need to both read and write XML files. I have a very short (and what should be a very simple) program that should create an XMLInputFactory and use it to create an XMLStreamReader. The XMLStreamReader is created using a FileInputStream attached to an XML file in the same directory as the source file. However, even though the FileInputStream compiled properly, the XMLInputFactory cannot access it, and without the FileInputStream it cannot create the XMLStreamReader. Please help, as I have no idea what to do and am frustrated to the point of giving up!
import javax.xml.stream.*;
import java.io.*;
public class xml {
    static String status;

    public static void main(String[] args) {
        status = "Program has started";
        printStatus();
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        status = "XMLInputFactory (inFactory) defined"; printStatus();
        try { FileInputStream fIS = new FileInputStream("stax.xml"); }
        catch (FileNotFoundException na) { System.out.println("FileNotFound"); }
        status = "InputStream (fIS) declared"; printStatus();
        try { XMLStreamReader xmlReader = inFactory.createXMLStreamReader(fIS); }
        catch (XMLStreamException xmle) { System.out.println(xmle); }
        status = "XMLStreamReader (xmlReader) created by 'inFactory'"; printStatus();
    }

    public static void printStatus() { // this is a little code that sends notifications when something has been done
        System.out.println("Status: " + status);
    }
}
Also, here is the XML file if you need it:
<?xml version="1.0"?>
<dennis>
<hair>brown</hair>
<pants>blue</pants>
<gender>male</gender>
</dennis>
Your problem has to do with basic Java programming, nothing to do with StAX. Your FileInputStream is scoped within a try block (some decent code formatting would help) and is therefore not visible to the code where you are attempting to create the XMLStreamReader. With formatting:
XMLInputFactory inFactory = XMLInputFactory.newInstance();

try {
    // fIS is only visible within this try{} block
    FileInputStream fIS = new FileInputStream("stax.xml");
} catch (FileNotFoundException na) {
    System.out.println("FileNotFound");
}

try {
    // fIS is not visible here
    XMLStreamReader xmlReader = inFactory.createXMLStreamReader(fIS);
} catch (XMLStreamException xmle) {
    System.out.println(xmle);
}
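For reference, a minimal corrected sketch (not your exact code) that keeps both the stream and the reader in the same scope:

XMLInputFactory inFactory = XMLInputFactory.newInstance();

try {
    // fIS and xmlReader live in the same try block, so the reader can see the stream
    FileInputStream fIS = new FileInputStream("stax.xml");
    XMLStreamReader xmlReader = inFactory.createXMLStreamReader(fIS);
    // ... read from xmlReader here ...
} catch (FileNotFoundException na) {
    System.out.println(na);
} catch (XMLStreamException xmle) {
    System.out.println(xmle);
}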
On a secondary note, StAX is a nice API, and a great one for high-performance XML processing in Java. However, it is not the simplest XML API. You would probably be better off starting with the DOM-based APIs, and only using StAX if you run into performance issues with DOM. If you do stay with StAX, I'd advise using XMLEventReader instead of XMLStreamReader (again, an easier API).
Lastly, do not hide exception details (e.g. catch them and print out something which does not include the exception itself) or ignore them (e.g. continue processing after the exception is thrown without attempting to deal with the problem).
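Putting those last two points together, here is a minimal, untested XMLEventReader sketch that reads the stax.xml file from the question and lets exceptions propagate instead of swallowing them (the class name is just for the example):

import javax.xml.stream.*;
import javax.xml.stream.events.XMLEvent;
import java.io.*;

public class StaxEventDemo {
    public static void main(String[] args) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try (FileInputStream in = new FileInputStream("stax.xml")) {
            XMLEventReader reader = factory.createXMLEventReader(in);
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    // prints dennis, hair, pants, gender for the sample file
                    System.out.println(event.asStartElement().getName().getLocalPart());
                }
            }
        }
    }
}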
Related
Is it not possible to append to an ObjectOutputStream?
I am trying to append to a list of objects. The following snippet is a function that is called whenever a job is finished.
FileOutputStream fos = new FileOutputStream(
        preferences.getAppDataLocation() + "history", true);
ObjectOutputStream out = new ObjectOutputStream(fos);
out.writeObject( new Stuff(stuff) );
out.close();
But when I try to read it I only get the first object in the file. Then I get a java.io.StreamCorruptedException.
To read I am using
FileInputStream fis = new FileInputStream(
        preferences.getAppDataLocation() + "history");
ObjectInputStream in = new ObjectInputStream(fis);
try {
    while (true)
        history.add((Stuff) in.readObject());
} catch (Exception e) {
    System.out.println(e.toString());
}
I do not know how many objects will be present, so I keep reading until an exception is thrown. From what Google says, this is not possible. I was wondering if anyone knows a way?
Here's the trick: subclass ObjectOutputStream and override the writeStreamHeader method:
public class AppendingObjectOutputStream extends ObjectOutputStream {

    public AppendingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void writeStreamHeader() throws IOException {
        // do not write a header, but reset:
        // this line added after another question
        // showed a problem with the original
        reset();
    }
}
To use it, just check whether the history file exists or not and instantiate either this appendable stream (in case the file exists = we append = we don't want a header) or the original stream (in case the file does not exist = we need a header).
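A minimal sketch of that check, reusing the preferences and Stuff names from the question:

File storageFile = new File(preferences.getAppDataLocation() + "history");
boolean fileExists = storageFile.exists();

FileOutputStream fos = new FileOutputStream(storageFile, true);
ObjectOutputStream out = fileExists
        ? new AppendingObjectOutputStream(fos)  // file exists: append, do not write another header
        : new ObjectOutputStream(fos);          // new file: write the normal header
out.writeObject(new Stuff(stuff));
out.close();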
Edit
I wasn't happy with the first naming of the class. This one's better: it describes the 'what it's for' rather than the 'how it's done'.
Edit
Changed the name once more, to clarify, that this stream is only for appending to an existing file. It can't be used to create a new file with object data.
Edit
Added a call to reset() after this question showed that the original version that just overrode writeStreamHeader to be a no-op could under some conditions create a stream that couldn't be read.
As the API says, the ObjectOutputStream constructor writes the serialization stream header to the underlying stream, and this header is expected to appear only once, at the beginning of the file. So calling
new ObjectOutputStream(fos);
multiple times on the FileOutputStream that refers to the same file will write the header multiple times and corrupt the file.
Because of the precise format of the serialized file, appending will indeed corrupt it. You have to write all objects to the file as part of the same stream, or else it will crash when it reads the stream metadata when it's expecting an object.
You could read the Serialization Specification for more details, or (easier) read this thread where Roedy Green says basically what I just said.
The easiest way to avoid this problem is to keep the OutputStream open when you write the data, instead of closing it after each object. Calling reset() might be advisable to avoid a memory leak.
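A rough sketch of that idea, reusing the Stuff type from the question (a hypothetical wrapper class, not part of any library):

import java.io.*;

class History implements Closeable {
    private final ObjectOutputStream out;

    History(String path) throws IOException {
        // opened once and kept open for the lifetime of the application
        out = new ObjectOutputStream(new FileOutputStream(path));
    }

    void add(Stuff stuff) throws IOException {
        out.writeObject(stuff);
        out.reset();  // drop internal back-references so memory does not keep growing
        out.flush();
    }

    @Override
    public void close() throws IOException {
        out.close();  // closed once, when the application shuts down
    }
}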
The alternative would be to read the file as a series of consecutive ObjectInputStreams as well. But this requires you to keep count of how many bytes you read (this can be implemented with a FilterInputStream), then close the InputStream, open it again, skip that many bytes and only then wrap it in an ObjectInputStream.
I have extended the accepted solution to create a class that can be used both for appending to an existing file and for creating a new one.
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
public class AppendableObjectOutputStream extends ObjectOutputStream {

    private boolean append;
    private boolean initialized;
    private DataOutputStream dout;

    protected AppendableObjectOutputStream(boolean append) throws IOException, SecurityException {
        super();
        this.append = append;
        this.initialized = true;
    }

    public AppendableObjectOutputStream(OutputStream out, boolean append) throws IOException {
        super(out);
        this.append = append;
        this.initialized = true;
        this.dout = new DataOutputStream(out);
        this.writeStreamHeader();
    }

    @Override
    protected void writeStreamHeader() throws IOException {
        if (!this.initialized || this.append) return;
        if (dout != null) {
            dout.writeShort(STREAM_MAGIC);
            dout.writeShort(STREAM_VERSION);
        }
    }
}
This class extends ObjectOutputStream and can be used as a direct replacement for it.
We can use the class as follows:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
public class ObjectWriter {

    public static void main(String[] args) {
        File file = new File("file.dat");
        boolean append = file.exists(); // if file exists then append, otherwise create new
        try (
            FileOutputStream fout = new FileOutputStream(file, append);
            AppendableObjectOutputStream oout = new AppendableObjectOutputStream(fout, append);
        ) {
            oout.writeObject(...); // replace "..." with serializable object to be written
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
How about reading and copying all the current data in the file before each append, and then overwriting the file with everything together?
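A rough sketch of that idea, reusing the names from the question (simple, but it rewrites the whole file on every append, so it gets slow for large histories):

List<Stuff> all = new ArrayList<>();
File file = new File(preferences.getAppDataLocation() + "history");

// read everything that is already in the file
if (file.exists()) {
    try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
        while (true) {
            all.add((Stuff) in.readObject());
        }
    } catch (EOFException end) {
        // reached the end of the existing data
    } catch (ClassNotFoundException e) {
        throw new IOException(e);
    }
}

// add the new object and overwrite the file with a single fresh stream
all.add(new Stuff(stuff));
try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
    for (Stuff s : all) {
        out.writeObject(s);
    }
}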
I want to save data with Jackson to an existing file (update it), but it won't work when I run my project from a jar.
I need to use JSON as a "database" (I know it's pretty stupid, but that's for a school project), and to do that I load and save all the data whenever I do any of the CRUD operations. It works fine when I run it from the IDE, but when I tried it as a jar it had a problem reading the file from a ClassPathResource.
So I have this method to save changes to file:
private List<Item> items;
private ObjectMapper mapper;
private ObjectWriter writer;
public void saveData() {
    mapper = new ObjectMapper();
    writer = mapper.writer(new DefaultPrettyPrinter());
    try {
        writer.writeValue(new ClassPathResource("items.json").getFile(), items);
    } catch (IOException e) {
        e.printStackTrace();
    }
}
And it works just fine when I run this through IntelliJ, but it won't work when I run it as a jar.
I found a solution to loading the data by using InputStream from this question and method looks like this:
public void loadData() {
    mapper = new ObjectMapper();
    try {
        ClassPathResource classPathResource = new ClassPathResource("items.json");
        InputStream inputStream = classPathResource.getInputStream();
        File tempFile = File.createTempFile("test", ".json");
        FileUtils.copyInputStreamToFile(inputStream, tempFile);
        System.out.println(tempFile);
        System.out.println(ItemDao.class.getProtectionDomain().getCodeSource().getLocation().getPath().toString());
        items = mapper.readValue(tempFile, new TypeReference<List<Item>>() {
        });
    } catch (IOException e) {
        items = null;
        e.printStackTrace();
    }
}
But I still have no idea how to actually save changes. I was thinking about making use of FileOutputStream, but I achieved nothing.
So I want to get this working in a jar file and be able to save changes to the same file. Thanks for your help in advance!
When you want to do read/write operations, it is better to keep the file outside of the project. When running the jar, pass the file name with its path as an argument, like -DfileName=/Users/chappa/Documents/items.json. This way you have an absolute path, and you can perform read/write operations on it.
If you are using Java 1.7 or above, use the approach below to write data. To read the data, you can use the Jackson API to load the JSON file as is.
Path wipPath = Paths.get("/Users/chappa/Documents/items.json");
try (BufferedWriter writer = Files.newBufferedWriter(wipPath)) {
    for (String record : nosRecords) {
        writer.write(record);
    }
}
Just in case you want to read the JSON using IO streams, you can use the code below:
Path wipPath = Paths.get("/Users/chappa/Documents/items.json");
try (BufferedReader reader = Files.newBufferedReader(wipPath)) {
    String line = null;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}
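And if you prefer to stay with Jackson for both reading and writing, a rough sketch against that external file might look like this (the path, the -DfileName property and the Item type are just the ones used above; this is not tested code):

ObjectMapper mapper = new ObjectMapper();
File file = new File(System.getProperty("fileName", "/Users/chappa/Documents/items.json"));

// read: load the whole list from the external file
List<Item> items = mapper.readValue(file, new TypeReference<List<Item>>() {});

// ... modify the list ...

// write: overwrite the same external file with the updated list
mapper.writerWithDefaultPrettyPrinter().writeValue(file, items);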
I tried to use PDFBox on regular .pdf files and it worked fine.
However, when I encountered a corrupted .pdf, the code would "freeze": it does not throw an error or anything, the load or parse function simply takes forever to execute.
Here is the corrupted file (I have zipped it so that everybody can download it); it is probably not a native PDF file, but it was saved with a .pdf extension and it is only 4 KB.
I am not an expert at all, but I think that this is a bug in PDFBox. According to the documentation, both the load() and parse() methods are supposed to throw exceptions if they fail. However, in the case of my file, the code takes forever to execute and does not throw an exception.
I tried using only load(); one can try parse() as well, the result is the same.
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

public class TestTest {

    public static void main(String[] args) throws FileNotFoundException, IOException {
        System.out.println(pdfToText("C:\\..............MYFILE.pdf"));
        System.out.println("done ! ! !");
    }

    private static String pdfToText(String fileName) throws IOException {
        PDDocument document = null;
        document = PDDocument.load(new File(fileName)); // THIS TAKES FOREVER
        PDFTextStripper stripper = new PDFTextStripper();
        document.close();
        return stripper.getText(document);
    }
}
How can I force this code to throw an exception or stop executing if the .pdf file is corrupted?
Thanks
Try this solution:
private static String pdfToText(String fileName) {
    PDDocument document = null;
    try {
        document = PDDocument.load(fileName);
        PDFTextStripper stripper = new PDFTextStripper();
        return stripper.getText(document);
    } catch (IOException e) {
        System.err.println("Unable to open PDF Parser. " + e.getMessage());
        return null;
    } finally {
        if (document != null) {
            try {
                document.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
For implementing simple timeouts for 3rd party libs I often use an implementation like Apache Commons ThreadMonitor:
long timeoutInMillis = 1000;

try {
    Thread monitor = ThreadMonitor.start(timeoutInMillis);
    // do some work here
    ThreadMonitor.stop(monitor);
} catch (InterruptedException e) {
    // timed amount was reached
}
Example code is from Apache's ThreadMonitor Javadoc.
I only use this when the 3rd party API does not provide some timeout mechanism, of course.
However, I was forced to tweak this a bit some weeks ago, because this solution does not work well with (3rd party) code that uses exception masking. In particular, we ran into problems with c3p0, which masks all exceptions (and in particular InterruptedExceptions). Our solution was to tweak the implementation to also check the exception's cause chain for InterruptedExceptions.
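A rough sketch of that cause-chain check (purely illustrative, not the actual code we used):

static boolean causedByInterrupt(Throwable t) {
    // walk the cause chain, since some libraries wrap or mask InterruptedException
    for (Throwable cause = t; cause != null; cause = cause.getCause()) {
        if (cause instanceof InterruptedException) {
            return true;
        }
    }
    return false;
}

// around the monitored block:
try {
    Thread monitor = ThreadMonitor.start(timeoutInMillis);
    // do some work here
    ThreadMonitor.stop(monitor);
} catch (Exception e) {
    if (causedByInterrupt(e)) {
        // the timed amount was reached (possibly masked by the 3rd party code)
    } else {
        throw e;
    }
}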
I have a Java desktop application that is using iText to generate PDFs from a resultset. The first time you generate a PDF, it works fine. The problem comes when you try to generate a second one. It throws a DocumentException saying that the document is closed. I have tried to find other examples of people having this problem, and I come up with very little, which leads me to believe that I have made a very simple mistake and I cannot find it.
The code below is a snippet of the event handler that calls the report class:
RptPotReport report = new RptPotReport();
try {
    report.rptPot();
} catch (DocumentException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
And here is the code for the report class itself. The error occurs on the second run through this code:
public class RptPotReport {

    public static void main(String[] args) throws IOException, DocumentException, SQLException {
        new RptPotReport().rptPot();
    }

    String fileOutput = "Potting Report.pdf";

    public void rptPot() throws DocumentException, IOException {
        File f = new File("Potting Report.pdf");
        if (f.exists()) {
            f.delete();
        }

        Document document = new Document();
        document = pdfSizes.getPdfLetter();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(fileOutput));
        document.open();

        Phrase title = new Phrase();
        title.add(new Chunk("Potting Report"));
        document.add(title); // ******* DocumentException here: "The document has been closed. You can't add any Elements."
        document.close();

        try {
            File pdfFile = new File(fileOutput);
            if (pdfFile.exists()) {
                if (Desktop.isDesktopSupported()) {
                    Desktop.getDesktop().open(pdfFile);
                } else {
                    System.out.println("Awt Desktop is not supported!");
                }
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
EDIT: At someone's suggestion, I tried calling the RptPotReport from a second thread, but that did not change anything. Looking into it further, the Document class of iText creates a new thread when it's instantiated. So I'm right back where I started, still stuck.
What does this line do exactly in your application:
document = pdfSizes.getPdfLetter();
Without the code for it, and going by your explanation, it seems like this line sets the document variable to a reference you receive from pdfSizes.getPdfLetter(), which is reused between runs, so you no longer have the reference from the new Document() statement.
I tend to think the pdfSizes.getPdfLetter() method is bugged.
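If that is what is happening, a minimal fix sketch would be to build a fresh Document for each run instead of reusing the shared instance; assuming getPdfLetter() only encodes the letter page size, something like:

// create a brand new Document every time the report runs,
// instead of reusing the instance returned by pdfSizes.getPdfLetter()
Document document = new Document(PageSize.LETTER);
PdfWriter.getInstance(document, new FileOutputStream(fileOutput));
document.open();

Phrase title = new Phrase();
title.add(new Chunk("Potting Report"));
document.add(title);
document.close();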