JAXB builds incorrect XML - java

I have some problems writing my java objects to an XML file using JAXB.
My method looks like this:
public void printToXml(PNLExport export, String outputPath, boolean syso) throws Exception
{
    FileOutputStream fos = null;
    try {
        fos = new FileOutputStream(outputPath);
        JAXBContext contxt = JAXBContext.newInstance(PNLExport.class);
        Marshaller m = contxt.createMarshaller();
        m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
        m.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
        if (syso) {
            System.out.println();
            m.marshal(export, System.out);
        }
        m.marshal(export, fos);
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (fos != null) {
                fos.close();
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}//printToXml
After the closing tag of the root element, the output shows some strange behaviour:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PNLExport version="">
<Header>
<RecType>0</RecType>
<DateFormat>DD.MM.YY</DateFormat>
<TimeFormat>HH:MM</TimeFormat>
<TimeMode>L</TimeMode>
<GenDate>25.06.12</GenDate>
<GenTime>09:45</GenTime>
</Header>
<Records>
<Record>
<FlightRecord>
<RecType>21</RecType>
<Carrier>HG</Carrier>
<FlightNumber>8332</FlightNumber>
<FlightDate>30.06.12</FlightDate>
<Departure>VIE</Departure>
<Destination>OLB</Destination>
<DepTime>09:40</DepTime>
<DesTime>11:30</DesTime>
</FlightRecord>
<PaxRecord>
<RecType>32</RecType>
<BookingNumber>11632</BookingNumber>
<PaxNumber>1</PaxNumber>
<Name>SCHABAUER,Franz</Name>
<Salutation>MR</Salutation>
<BookingState>OK</BookingState>
<TicketType>T</TicketType>
</PaxRecord>
<PaxRecord>
<RecType>32</RecType>
<BookingNumber>11632</BookingNumber>
<PaxNumber>2</PaxNumber>
<Name>SCHABAUER,Vera</Name>
<Salutation>MRS</Salutation>
<BookingState>OK</BookingState>
<TicketType>T</TicketType>
</PaxRecord>
</Record>
</Records>
.
.
.
</PNLExport>
rrier>
<FlightNumber>8332</FlightNumber>
<FlightDate>02.07.12</FlightDate>
<Departure>VIE</Departure>
<Destination>OLB</Destination>
<DepTime>09:15</DepTime>
<DesTime>10:55</DesTime>
</FlightRecord>
<FlightRecord>
<RecType>21</RecType>
<Carrier>HG</Carrier>
<FlightNumber>8333</FlightNumber>
<FlightDate>02.07.12</FlightDate>
<Departure>OLB</Departure>
<Destination>VIE</Destination>
<DepTime>11:40</DepTime>
<DesTime>13:20</DesTime>
</FlightRecord>
<FlightRecord>
<RecType>21</RecType>
<Carrier>HG</Carrier>
<FlightNumber>8333</FlightNumber>
<FlightDate>29.06.12</FlightDate>
<Departure>OLB</Departure>
<Destination>VIE</Destination>
<DepTime>14:00</DepTime>
<DesTime>15:40</DesTime>
</FlightRecord>
</Record>
</Records>
</PNLExport>
What's going wrong here?
It's also weird that sometimes the XML is perfectly correct...

Since the extra data that appears in the file after the desired content has the same structure but different data, you most likely have two threads calling this method and occasionally both write to the same File.

Another thing to check is how the file is opened. Note that new FileOutputStream(outputPath) truncates an existing file when it opens it, so leftover content from an earlier, longer run should not survive a single writer on its own. The tail of older content does linger, however, when the file is written without truncation or when a second writer still has it open, and that matches what is shown here.
A simple safeguard is to delete the file prior to writing to it (only if it exists, obviously).
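If overlapping writes to the same path are the cause, one defensive pattern is to serialize the writers and write to a temporary file that is then moved over the target. The following is only a sketch under assumptions: SafeXmlWriter, StreamBody and writeViaTempFile are hypothetical names that do not appear in the question, and the callback stands in for the m.marshal(export, out) call.

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeXmlWriter {
    /** Hypothetical callback; in the question it would wrap m.marshal(export, out). */
    interface StreamBody {
        void writeTo(OutputStream out) throws Exception;
    }

    // One lock for all writers in this JVM, so two threads never write at once.
    private static final Object LOCK = new Object();

    static void writeViaTempFile(Path target, StreamBody body) throws Exception {
        synchronized (LOCK) {
            Path dir = target.toAbsolutePath().getParent();
            // The temp file is always new, so it cannot contain stale content.
            Path tmp = Files.createTempFile(dir, "export", ".tmp");
            try {
                try (OutputStream out = Files.newOutputStream(tmp)) {
                    body.writeTo(out);
                }
                // Swap the finished file into place only after it is fully written.
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            } finally {
                // No-op after a successful move; cleans up if writing failed.
                Files.deleteIfExists(tmp);
            }
        }
    }
}
```

With JAXB this might be called as SafeXmlWriter.writeViaTempFile(Paths.get(outputPath), out -> m.marshal(export, out)). The lock only protects writers inside one JVM; separate processes writing the same file would need a different mechanism.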

Related

Java stax: The reference to entity "R" must end with the ';' delimiter

I am trying to parse a xml using stax but the error I get is:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[414,47]
Message: The reference to entity "R" must end with the ';' delimiter.
The parser stops at line 414, which contains P&R in the XML file. The code I use to parse it is:
public List<Vild> getVildData(File file) {
    XMLInputFactory factory = XMLInputFactory.newFactory();
    try {
        ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(Files.readAllBytes(file.toPath()));
        XMLStreamReader reader = factory.createXMLStreamReader(byteArrayInputStream, "iso8859-1");
        List<Vild> vild = saveVild(reader);
        reader.close();
        return vild;
    } catch (IOException e) {
        e.printStackTrace();
    } catch (XMLStreamException e) {
        e.printStackTrace();
    }
    return Collections.emptyList();
}
private List<Vild> saveVild(XMLStreamReader streamReader) {
    List<Vild> vildList = new ArrayList<>();
    try {
        Vild vild = new Vild();
        while (streamReader.hasNext()) {
            streamReader.next();
            // Creating list with data
        }
    } catch (XMLStreamException | IllegalStateException ex) {
        ex.printStackTrace();
    }
    // Return the list that was filled above, not an always-empty one
    return vildList;
}
I read online that the & is invalid xml code but I don't know how to change it before it throws this error inside the saveVild method. Does someone know how to do this efficiently?
Change the question: you're not trying to parse an XML file, you're trying to parse a non-XML file. For that, you need a non-XML parser, and to write such a parser you need to start with a specification of the language you are trying to parse, and you'll need to agree the specification of this language with the other partners to the data interchange.
How much work you could all save by conforming to standards!
Treat broken XML arriving in your shop the way you would treat any other broken goods coming from a supplier: return it to sender marked "unfit for purpose".
The problem here, as you mention, is that the parser finds the & and then expects a terminating ;
This is fixed by escaping the character, so that the parser finds &amp; instead.
Take a look here for further reference
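If the producer of the file really cannot be fixed, the escaping described above can be applied to the text before it reaches the parser. The following is a rough sketch, not a general repair tool: AmpersandRepair and its methods are hypothetical names, the regular expression treats anything shaped like &name; or &#123; as an existing reference (even an undefined one), and it would also rewrite ampersands inside CDATA sections.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class AmpersandRepair {
    // An '&' that does not start a named or numeric character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    /** Replaces every bare '&' with "&amp;" and leaves existing references alone. */
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    /** Parses the document with StAX and concatenates all character data. */
    static String textContent(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }
}
```

In getVildData, the file content would be decoded to a String, passed through escapeBareAmpersands, and only then handed to createXMLStreamReader. This is a workaround; getting the sender to produce well-formed XML remains the cleaner fix.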

Wrong characters in Java XML?

I have the following problem:
For a project I created my own logger, which produces an xml file with custom tags.
The problem is that both DOM and JAXB seem to have encoding problems when creating the XML, since the "content" field always comes out with incorrect characters.
I have already tried changing the encoding to UTF-8 and windows-1252.
I found that the project I run the logger on actually uses ISO-8859-1. I tried switching to that too, but nothing changed: the content field still contains these incomprehensible characters.
Can anyone help me?
My Code:
if (OS.contains("Window")) {
try {
fh = new FileHandler(userDir+s+logF+s+jade+s+nameAgent+"-receive(Logger Java).xml" );
logger.addHandler(fh);
XMLFormatter formatter = new XMLFormatter();
fh.setFormatter(formatter);
logger.info(" ");
}
catch (SecurityException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
XmlCreator xmlcreator = new XmlCreator();
xmlcreator.setOntology(onto);
xmlcreator.setPerformative(perf);
xmlcreator.settimeStamp(ts);
xmlcreator.setProtocol(pro);
xmlcreator.setReceiver(rec);
xmlcreator.setContent(con);
try {
File file = new File("C:\\Users\\Francesco\\Desktop\\writereceiver.xml");
JAXBContext jaxbContext = JAXBContext.newInstance(XmlCreator.class);
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
// output pretty printed
jaxbMarshaller.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
jaxbMarshaller.marshal(xmlcreator, file);
jaxbMarshaller.marshal(xmlcreator, System.out);
} catch (JAXBException e) {
e.printStackTrace();
}
Output XML (problem in content tag) :
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xmlCreator>
<content>’ sr pojo.SongRequestInfoÃÃcÀWCë</content>
<performative>ACCEPT-PROPOSAL</performative>
<receiver>jade.util.leap.ArrayList$1@445c4a59</receiver>
<timeStamp>1583849551513</timeStamp>
</xmlCreator>
I agree with @VCR. In all likelihood the output XML is a correctly encoded UTF-8 XML document, and it only looks odd because you are viewing it with software that doesn't know how to display UTF-8.
The prevalence of character pairs starting with Ã is symptomatic of what happens when you display UTF-8 data using software that thinks it is displaying ISO-8859-1.
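The effect is easy to reproduce. The small sketch below (MojibakeDemo is a hypothetical name used only for illustration) encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is what a misconfigured viewer effectively does:

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    /** Encodes as UTF-8, then decodes the bytes as if they were ISO-8859-1. */
    static String misreadUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the misreading: recovers the original text from the garbled form. */
    static String repair(String garbled) {
        byte[] bytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8,
        // which ISO-8859-1 shows as U+00C3 followed by U+00A9.
        String garbled = misreadUtf8AsLatin1("\u00e9");
        System.out.println(garbled.equals("\u00c3\u00a9")); // prints "true"
    }
}
```

If the file on disk is valid UTF-8, nothing needs to change in the marshalling code; the viewer or editor just has to be told to read the file as UTF-8.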

Is encoding Cp1252 invalid in an XML file?

Some XML file I ran across is failing a well-formed XML check, even though it looks well-formed to me (I might be wrong.)
I have reduced it to a trivial example:
<?xml version="1.0" encoding="Cp1252"?>
<jnlp/>
The method being used to do the check works like this:
public static boolean isWellFormedXml(InputStream inputStream) {
    try {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        // Use the method parameter (the original referenced an undefined "stream")
        XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
        try {
            // Scan through all the reader tokens to ensure everything is well formed
            while (reader.hasNext()) {
                reader.next();
            }
        } finally {
            reader.close();
        }
    } catch (XMLStreamException e) {
        // Ignore the exception
        return false;
    }
    return true;
}
The error I'm seeing is:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,40]
Message: Invalid encoding name "Cp1252".
Only problem is - I can breakpoint at the catch and confirm that this encoding name does resolve. So what's the deal here? Does XML also restrict which encodings you're allowed to use in the prologue?
Check the IANA character set registry:
http://www.iana.org/assignments/character-sets/character-sets.xml
I guess the encoding you're looking for could be windows-1252. Cp1252 might be a valid charset name in Java, but in XML you're not supposed to use it (by that name).
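One way to find the name to put in the XML declaration is to ask Java for the canonical name of the charset, which for java.nio charsets is usually the IANA-registered one. The sketch below is illustrative only; EncodingNameCheck and its methods are hypothetical names, and parses mirrors the well-formedness loop from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class EncodingNameCheck {
    /** Maps a Java alias such as "Cp1252" to the charset's canonical name. */
    static String canonicalName(String javaName) {
        return Charset.forName(javaName).name();
    }

    /** Returns true if a trivial document declaring this encoding parses with StAX. */
    static boolean parses(String encodingInDeclaration) {
        String doc = "<?xml version=\"1.0\" encoding=\"" + encodingInDeclaration + "\"?>\n<jnlp/>";
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(
                    new ByteArrayInputStream(doc.getBytes(StandardCharsets.US_ASCII)));
            try {
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

Here canonicalName("Cp1252") yields "windows-1252", and a document declaring encoding="windows-1252" is expected to pass the same check that rejected "Cp1252".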

SAXException: Content is not allowed in trailing section

This is driving me crazy. I have used this bit of code for lots of different projects but this is the first time it's given me this type of error. This is the whole XML file:
<layers>
<layer name="Layer 1" h="400" w="272" z="0" y="98" x="268"/>
<layer name="Layer 0" h="355" w="600" z="0" y="287" x="631"/>
</layers>
Here is the operative bit of code in my homebrew Xml class which uses the DocumentBuilderFactory to parse the Xml fed into it:
public static Xml parse(String xmlString)
{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Document doc = null;
//System.out.print(xmlString);
try
{
doc = dbf.newDocumentBuilder().parse(
new InputSource(new StringReader(xmlString)));
// Get root element...
Node rootNode = (Element) doc.getDocumentElement();
return getXmlFromNode(rootNode);
} catch (ParserConfigurationException e)
{
System.out.println("ParserConfigurationException in Xml.parse");
e.printStackTrace();
} catch (SAXException e)
{
System.out.println("SAXException in Xml.parse ");
e.printStackTrace();
} catch (IOException e)
{
System.out.println("IOException in Xml.parse");
e.printStackTrace();
}
return null;
}
The context in which I am using it: a school project to produce a Photoshop-type image manipulation application. The file is saved as a .zip containing the layers as .png files plus this XML file with the position, etc. of the layers. I don't know whether the zipping is adding some mysterious extra characters or not.
I appreciate your feedback.
If you look at that file in an editor, you'll see content (perhaps whitespace) following the end element e.g.
</layers> <-- after here
It's worth dumping this out using a tool that will highlight whitespace chars e.g.
$ cat -v -e my.xml
will dump 'unprintable' characters.
Hopefully this can be helpful to someone at some point. The fix that worked was just to use lastIndexOf() with substring. Here's the code in situ:
public void loadFile(File m_imageFile)
{
try
{
ZipFile zipFile = new ZipFile(m_imageFile);
ZipEntry xmlZipFile = zipFile.getEntry("xml");
byte[] buffer = new byte[10000];
zipFile.getInputStream(xmlZipFile).read(buffer);
String xmlString = new String(buffer);
Xml xmlRoot = Xml.parse(xmlString.substring(0, xmlString.lastIndexOf('>')+1));
for(List<Xml> iter = xmlRoot.getNestedXml(); iter != null; iter = iter.next())
{
String layerName = iter.element().getAttributes().getValueByName("name");
m_view.getCanvasPanel().getLayers().add(
new Layer(ImageIO.read(zipFile.getInputStream(zipFile.getEntry(layerName))),
Integer.valueOf(iter.element().getAttributes().getValueByName("x")),
Integer.valueOf(iter.element().getAttributes().getValueByName("y")),
Integer.valueOf(iter.element().getAttributes().getValueByName("w")),
Integer.valueOf(iter.element().getAttributes().getValueByName("h")),
Integer.valueOf(iter.element().getAttributes().getValueByName("z")),
iter.element().getAttributes().getValueByName("name"))
);
}
zipFile.close();
} catch (FileNotFoundException e)
{
System.out.println("FileNotFoundException in MainController.loadFile()");
e.printStackTrace();
} catch (IOException e)
{
System.out.println("IOException in MainController.loadFile()");
e.printStackTrace();
}
}
Thanks to everyone who contributed. I suspect the error was introduced either by the zip process or by using the byte[] buffer. Any further feedback is appreciated.
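The byte[] buffer is the more likely culprit: new String(buffer) converts all 10000 bytes, so every byte beyond the end of the XML becomes a NUL character after the closing tag, and a single read() call is also not guaranteed to return the whole entry. A sketch of reading exactly the bytes of the entry instead is shown below. ZipEntryText and readEntryAsString are hypothetical names, UTF-8 is an assumption about how the XML was written, and InputStream.readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class ZipEntryText {
    /** Reads one zip entry completely and decodes it, with no padding bytes. */
    static String readEntryAsString(ZipFile zipFile, String entryName) throws IOException {
        ZipEntry entry = zipFile.getEntry(entryName);
        if (entry == null) {
            throw new IOException("No such entry: " + entryName);
        }
        try (InputStream in = zipFile.getInputStream(entry)) {
            // readAllBytes returns exactly the entry's content, however long it is.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

With this, Xml.parse(ZipEntryText.readEntryAsString(zipFile, "xml")) should no longer need the lastIndexOf('>') trimming.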
I had some extra characters at the end of the XML. Check the XML carefully, or run it through an online XML formatter, which will report an error if the XML is not well formed. I used
Online XML Formatter
I just had this error in Sterling Integrator. When I looked at the file in a hex editor,
it had about 5 extra lines of char(0), not spaces. No idea where they came from, but this was precisely the issue, especially as I was basically doing a unity transform, so the XSL engine (Xalan in this case) was obviously passing them through into the result. I removed the extra rows after the last closing angle bracket and the problem was solved.

JAXB: Marshal output XML with indentation create empty line break on the first line

When I marshal XML with these properties
marshal.setProperty(Marshaller.JAXB_FRAGMENT, Boolean.TRUE);
marshal.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
it generates an empty line at the very top
//Generate empty line break here
<XX>
<YY>
<PDF>pdf name</PDF>
<ZIP>zip name</ZIP>
<RECEIVED_DT>received date time</RECEIVED_DT>
</YY>
</XX>
I think the reason is that marshal.setProperty(Marshaller.JAXB_FRAGMENT, Boolean.TRUE), which removes <?xml version="1.0" encoding="UTF-8" standalone="yes"?>, leaves a line break at the beginning of the output XML. Is there a way to fix this? I use the JAXB that comes with JDK 6; does MOXy suffer from this problem?
As you point out EclipseLink JAXB (MOXy) does not have this problem so you could use that (I'm the MOXy lead):
http://blog.bdoughan.com/2011/05/specifying-eclipselink-moxy-as-your.html
Option #1
One option would be to use a java.io.FilterWriter or java.io.FilterOutputStream and customize it to ignore the leading new line.
Option #2
Another option would be to marshal to StAX, and use a StAX implementation that supports formatting the output. I haven't tried this myself but the answer linked below suggests using com.sun.xml.txw2.output.IndentingXMLStreamWriter.
https://stackoverflow.com/a/3625359/383861
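Option #1 could look roughly like the sketch below: a java.io.FilterWriter that drops whitespace only until the first non-whitespace character has been written, so indentation later in the document is left untouched. LeadingWhitespaceSkippingWriter is a hypothetical name, and the class has not been run against a JAXB marshaller here.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.Writer;

final class LeadingWhitespaceSkippingWriter extends FilterWriter {
    // Becomes true once the first non-whitespace character has been written.
    private boolean started = false;

    LeadingWhitespaceSkippingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        if (!started && Character.isWhitespace(c)) {
            return; // still in the leading whitespace, drop it
        }
        started = true;
        super.write(c);
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        // Skip leading whitespace in this chunk if nothing has been written yet.
        while (!started && len > 0 && Character.isWhitespace(cbuf[off])) {
            off++;
            len--;
        }
        if (len > 0) {
            started = true;
            super.write(cbuf, off, len);
        }
    }

    @Override
    public void write(String str, int off, int len) throws IOException {
        while (!started && len > 0 && Character.isWhitespace(str.charAt(off))) {
            off++;
            len--;
        }
        if (len > 0) {
            started = true;
            super.write(str, off, len);
        }
    }
}
```

It would be used by wrapping the target writer, for example marshaller.marshal(object, new LeadingWhitespaceSkippingWriter(writer)).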
Inspired by the first option in bdoughan's answer to this post, I've written a custom writer that removes blank lines from the XML file:
public class XmlWriter extends FileWriter {
    public XmlWriter(File file) throws IOException {
        super(file);
    }
    public void write(String str) throws IOException {
        if (org.apache.commons.lang3.StringUtils.isNotBlank(str)) {
            super.write(str);
        }
    }
}
To detect an empty line I've used the org.apache.commons.lang3.StringUtils.isNotBlank() method; you can use your own condition instead.
Then pass this writer to the marshal method, as in the following Java 8 code:
// skip other code
File file = new File("test.xml");
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
try (FileWriter writer = new XmlWriter(file)) {
    marshaller.marshal(object, writer);
}
This removes the <?xml version="1.0" encoding="UTF-8" standalone="yes"?> declaration and also does not print the blank line.
Since I was marshalling to a File object, I decided to remove this line afterwards:
public static void removeEmptyLines(File file) throws IOException {
    long fileTimestamp = file.lastModified();
    List<String> lines = Files.readAllLines(file.toPath());
    try (Writer writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                writer.write(line + "\n");
            }
        }
    }
    file.setLastModified(fileTimestamp);
}
