JDOM - SaxBuilder - Content is not allowed in prolog - java

I am having trouble parsing an XML file into a JDOM Document instance using the SAXBuilder.
It throws the following exception:
[Fatal Error] :1:1: Content is not allowed in prolog.
I have read the related threads on Stack Exchange and elsewhere on the Internet and tried various things to debug the error.
I have ended up with the following code snippet, which throws the same exception.
String template = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<server></server>";
InputStream in = new StringBufferInputStream(template);
return saxBuilder.build(in);
What's wrong with it?
I am ashamed to admit it, but it turned out that the error wasn't produced by the snippet shown here. It occurred at a later point, where I was comparing the parsed XML against another document using the XMLUnit library.
What made me believe that the error was in the lines above was the content of the error message.
I believe it would be appropriate to close (and delete, if that's possible) this question, as it does not add any value.

This error usually means you have text before your xml declaration.
In your snippet the xml seems fine. The issue may not be in your document though. If you have a schema or other referenced xml file, the error could in fact refer to one of them.
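To see where the parser gives up, it can help to catch the SAXParseException and print its position. The following is a minimal sketch that uses the JDK's built-in DocumentBuilder rather than JDOM; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class PrologErrorDemo {

    /**
     * Parses the given text and returns "ok" on success, or "line:column"
     * of the first fatal error reported by the parser.
     */
    static String firstErrorPosition(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return "ok";
        } catch (SAXParseException e) {
            // The exception carries the position the parser had reached
            return e.getLineNumber() + ":" + e.getColumnNumber();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling firstErrorPosition with the template from the question should return "ok", while the same text with stray characters in front of the declaration should report a position on line 1.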

I suspect the problem is somewhere else. The following code (using dom4j) works for me:
public static void main(String[] args) throws DocumentException {
String template = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<server></server>";
SAXReader saxReader = new SAXReader();
InputStream is = new StringBufferInputStream(template);
Document document = saxReader.read(is);
System.out.println(document.asXML());
}
Note also that StringBufferInputStream is deprecated. An alternative is
StringReader sr = new StringReader(template);
Document document = saxReader.read(sr);
So, the problem is not in your XML snippet, but probably in saxBuilder.build(...)
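If you need an InputStream rather than a Reader, a ByteArrayInputStream over explicitly encoded bytes is a non-deprecated replacement for StringBufferInputStream. This is a sketch using the JDK's DocumentBuilder instead of JDOM or dom4j; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class StringToStreamDemo {

    /** Parses the XML text from a byte stream and returns the root element's name. */
    static String rootName(String xml) {
        // Encode explicitly as UTF-8 so the bytes match the encoding named in the declaration
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            return doc.getDocumentElement().getNodeName();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

StringBufferInputStream keeps only the low byte of each character, so it is only safe for plain ASCII content; encoding the string yourself avoids that limitation.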

Related

Problems with JAXB and UTF-16 encoding

Hi, I have a small app that reads content from an XML file and puts it into a corresponding Java object.
Here is the XML:
<?xml version="1.0" encoding="UTF-16"?>
<Marker>
<TimePosition>2700</TimePosition>
<SamplePosition>119070</SamplePosition>
</Marker>
Here is the corresponding Java code:
JAXBContext jaxbContext = JAXBContext.newInstance(MarkerDto.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
InputStream inputStream = new FileInputStream("D:/marker.xml");
Reader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_16.toString());
MarkerDto markerDto = (MarkerDto) jaxbUnmarshaller.unmarshal(reader);
If I run this code I get a "Content is not allowed in prolog." exception. If I run the same with UTF-8 everything works fine. Does anyone have a clue what might be the problem?
There are several things wrong here (ranging from slightly suboptimal to potentially very wrong). In increasing order of likelihood of causing the problem:
When constructing an InputStreamReader, there's no need to call toString() on the Charset, because that class has a constructor that takes a Charset, so simply remove the .toString():
Reader reader = new InputStreamReader(inputStream, StandardCharsets.UTF_16);
This is a tiny nitpick and has no effect on functionality.
Don't construct a Reader at all! XML is a format that's self-describing when it comes to encoding: Valid XML files can be parsed without knowing the encoding up-front. So instead of creating a Reader, simply pass the InputStream directly into your XML-handling code. Delete the line that creates the Reader and change the next one to this:
MarkerDto markerDto = (MarkerDto) jaxbUnmarshaller.unmarshal(inputStream);
This may or may not fix your problem, depending on whether the input is well-formed.
Your XML file might have encoding="UTF-16" in the header and not actually be UTF-16 encoded. If that's the case, then it is malformed and a conforming parser will decline to parse it. Verify this by opening the file with the advanced text editor of your choice (I suggest Notepad++ on Windows, Linux users probably know what their preference is) and check if it shows "UTF-16" as encoding (and the content is readable).
You wrote: "If I run the same with UTF-8 everything works fine."
This line suggests that that's what's actually happening here: the XML file is mis-labeling itself. This needs to be fixed at the point where the XML file is created.
Notably, this demo code provides exactly the same Content is not allowed in prolog. exception message that is reported in the question:
String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<foo />";
JAXBContext jaxbContext = JAXBContext.newInstance();
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
InputStream inputStream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
jaxbUnmarshaller.unmarshal(inputStream);
Note that the XML encoding attribute claims UTF-16, but the actual data handed to the XML parser is UTF-8 encoded.
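The same contrast can be reproduced without JAXB. The sketch below feeds the document from the question to the JDK's DocumentBuilder twice: once encoded as real UTF-16 and once as UTF-8 while still declaring UTF-16. The class and method names are illustrative, and the exact error message may differ between parser versions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class EncodingMismatchDemo {

    // The declaration always claims UTF-16; what varies is how the bytes are produced
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n"
            + "<Marker>\n"
            + "<TimePosition>2700</TimePosition>\n"
            + "<SamplePosition>119070</SamplePosition>\n"
            + "</Marker>";

    /** Returns true if the parser accepts the bytes, false if it rejects them. */
    static boolean parses(byte[] bytes) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (SAXException | IOException e) {
            // Malformed input or undecodable bytes: the document is rejected
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With XML.getBytes(StandardCharsets.UTF_16) the bytes start with a byte order mark and match the declaration, so parsing should succeed; with XML.getBytes(StandardCharsets.UTF_8) the declaration and the bytes disagree and the parser should reject the input.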

getDocument() constantly returns a null value

I am trying to parse an XML file that lives on a network drive using Java. I have reviewed lots of XML parsing info here but cannot find the answer I need. The problem is that the getDocument() routine constantly returns a null value even though the parser gets an accurate location and file name.
Here is the code...
String ThisXMLFile = XMLFileData.getPath();
DOMParser myXMLParser = new DOMParser();
myXMLParser.parse(ThisXMLFile);
Document doc = myXMLParser.getDocument();
Some notes:
I had to use getPath() as the getName() function did not return the fully qualified file name and path - the XML file lives on a network directory and that directory is mapped on my PC to the 'V' drive
I have imported all the required class header files for DOM objects
The variable names given above are real and accurate so if I have inadvertently used a reserved keyword in a variable declaration then please offer correction.
I have extensive programming experience in a few languages but this is my first real Java app.
All the lines of code and the variables above work until I reach the last line, where getDocument() sets the doc variable to null, which makes the rest of the program break.
I believe you are calling the wrong method. According to your code, you're executing DOMParser.parse(systemId) when you need to call DOMParser.parse(InputSource).
To create an InputSource you can do this:
InputSource source = new InputSource(new FileInputStream(ThisXMLFile));
myXMLParser.parse(source);
Document doc = myXMLParser.getDocument();
NOTE: remember to close the opened FileInputStream!!!
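One way to make sure the stream is always closed is try-with-resources. The sketch below uses the JDK's DocumentBuilder in place of the Xerces DOMParser from the question, so it is an analogous example rather than a drop-in replacement; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class InputSourceFromFileDemo {

    /** Parses the file through an InputSource and closes the stream afterwards. */
    static Document parseFile(Path xmlFile) {
        // try-with-resources closes the stream even if parsing fails
        try (InputStream in = Files.newInputStream(xmlFile)) {
            InputSource source = new InputSource(in);
            // The system ID lets the parser resolve relative references and name the file in errors
            source.setSystemId(xmlFile.toUri().toString());
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A call such as parseFile(Paths.get(ThisXMLFile)) would then return the parsed Document or fail with an exception instead of silently yielding null.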
XMLInputFactory XMLFactory = XMLInputFactory.newInstance();
XMLStreamReader XMLReader = XMLFactory.createXMLStreamReader(myXMLStream);
while(XMLReader.hasNext())
{
if (XMLReader.getEventType() == XMLStreamReader.START_ELEMENT)
{
String XMLTag = XMLReader.getLocalName();
if(XMLTag.equals("value"))
{
String idValue = XMLReader.getAttributeValue(null, "id");
if (ElementName.equals(idValue)) // null-safe: <value> elements without an id attribute are skipped
{
System.out.println(idValue);
XMLReader.nextTag();
System.out.println(XMLReader.getElementText());
}
}
}
XMLReader.next();
}
So this is the code I finally arrived at. It works and solves the issue of retrieving specific XML data from an XML file. At first I wanted to use NodeLists, Elements, Documents, etc., but those functions never worked for me; this one did. Thanks to all for the answers given, as they helped me think this one through.
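For anyone who wants to try the StAX approach in isolation, here is a self-contained sketch of the same lookup. The document shape it expects (a value element with an id attribute whose first child element holds the text) is an assumption inferred from the loop above, and the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxLookupDemo {

    /**
     * Returns the text of the first child element of <value id="wantedId">,
     * or null if no such element exists.
     */
    static String findValueText(String xml, String wantedId) {
        XMLStreamReader reader = null;
        try {
            reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "value".equals(reader.getLocalName())
                        && wantedId.equals(reader.getAttributeValue(null, "id"))) {
                    reader.nextTag();               // move to the first child element
                    return reader.getElementText(); // text content of that child
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        } finally {
            if (reader != null) {
                try {
                    reader.close(); // XMLStreamReader is not AutoCloseable
                } catch (XMLStreamException ignored) {
                    // nothing useful to do if closing fails
                }
            }
        }
    }
}
```

Returning the text instead of printing it makes the lookup easier to reuse, and comparing wantedId against the attribute value avoids a NullPointerException for value elements that have no id.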

premature end of file while parsing an xml file on android

I'm trying to read an XML file from an Android app using XOM as the XML library. I'm trying this:
Builder parser = new Builder();
Document doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
But I'm getting nu.xom.ParsingException: Premature end of file. even when the file is empty.
I need to parse a very simple XML file, and I'm ready to use another library instead of XOM, so let me know if there's a better one, or just a solution to the problem using XOM.
In case it helps, I'm using Xerces to get the parser.
------Edit-----
PS: The purpose of this wasn't to parse an empty file, the file just happened to be empty on the first run which showed this error.
If you follow this post to the end, it seems that this has to do with Xerces and the fact that it's an empty file, and they didn't reach a solution on the Xerces side.
So I handled the issue as follows:
Document doc = null;
try {
Builder parser = new Builder();
doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
}catch (ParsingException ex) { //other catch blocks are required for other exceptions.
//fails to open the file with a parsing error.
//I create a new root element and a new document.
//I fill them with xml data (else where in the code) and save them.
Element root = new Element("root");
doc = new Document(root);
}
And then I can do whatever I want with doc. You can add extra checks to make sure that the cause really is an empty file (for example, checking the file size, as indicated by one of sam's comments on the question).
An empty file is not a well-formed XML document. Throwing a ParsingException is the right thing to do here.
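Checking the file size up front avoids using the exception for control flow. The following is a sketch with the JDK's DocumentBuilder and java.nio.file instead of XOM and Android's openFileInput; the class and method names are hypothetical, and the "root" element name is taken from the workaround above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class EmptyFileFallbackDemo {

    /** Parses the file, or returns a fresh document with a <root> element if the file is empty. */
    static Document loadOrCreate(Path xmlFile) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            if (Files.size(xmlFile) == 0) {
                // Nothing to parse: start from a new document instead of provoking a parse error
                Document doc = builder.newDocument();
                doc.appendChild(doc.createElement("root"));
                return doc;
            }
            try (InputStream in = Files.newInputStream(xmlFile)) {
                return builder.parse(in);
            }
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A genuinely malformed, non-empty file still fails loudly here, which keeps real parsing problems visible.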

Parsing xml without namespace

I have a parsing problem that appears when I try to parse a String containing XML into an org.w3c.dom.Document.
Here is an example of an XML String that I'm trying to parse:
<enviNFe xmlns="http://www.portalfiscal.inf.br/nfe" versao="2.00">
<idLote>123</idLote>
<NFe xmlns="http://www.portalfiscal.inf.br/nfe">
...
</NFe>
</enviNFe>
The problem is that after the String has been parsed by the following code:
private Document documentFactory(String xml) throws SAXException,
IOException, ParserConfigurationException, DocumentException, TransformerException {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
Document document = factory.newDocumentBuilder().parse(
new ByteArrayInputStream(xml.getBytes()));
return document;
}
The tag NFe loads without the namespace (xmlns="http://www.portalfiscal.inf.br/nfe")
I want to know why this happens, and what I could do to solve this.
Any help will be great.
Thanks, and sorry for my english.
------EDIT----
For better understanding:
This XML will be signed right after the parsing, and will be sent to a government server (Brazil).
After this, I do another request to this server, to verify if it was processed or not. If it was, I will get a positive response in case of any error.
The first problem I had was that the XML was rejected as malformed. This happened because I was sending the XML without that namespace in the NFe tag.
To solve this, I added the namespace directly in the file, after the XML had been signed.
That problem was in fact solved, but another occurred: a mismatch in the signature,
because I sign the XML without the namespace and send it with the namespace.
From what I can put together from your various comments, I think you are misunderstanding how XML works. You indicate that you manually added the namespace to the NFe element. However, in your XML example, the NFe node already has that namespace.
In this xml:
<enviNFe xmlns="http://www.portalfiscal.inf.br/nfe" versao="2.00">
<idLote>123</idLote>
<NFe>
...
</NFe>
</enviNFe>
all of the elements have the "http://www.portalfiscal.inf.br/nfe" namespace. By putting the xmlns="..." attribute on the parent element, the namespace is applied to that element and to all of its descendant elements with the same prefix (in this case, no prefix).
It is returning the correct document. To test it you can just walk through your document.
doc.getFirstChild().getFirstChild().getNextSibling().getNextSibling().getNextSibling().getNamespaceURI();
Or try to get the tag by its name:
NodeList tags = doc.getElementsByTagNameNS("http://www.portalfiscal.inf.br/nfe", "NFe");
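The inheritance of the default namespace can be checked directly. This is a small sketch with a namespace-aware DocumentBuilder; the class and method names are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DefaultNamespaceDemo {

    static final String NFE_NS = "http://www.portalfiscal.inf.br/nfe";

    /** Returns the namespace URI of the first NFe element, or null if none is found. */
    static String namespaceOfNFe(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, getNamespaceURI() returns null
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The lookup only matches if NFe really is in the NFe namespace
            Node nfe = doc.getElementsByTagNameNS(NFE_NS, "NFe").item(0);
            return nfe == null ? null : nfe.getNamespaceURI();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a document where only enviNFe carries the xmlns attribute, the method should still return the NFe namespace URI, because the unprefixed NFe child inherits the default namespace from its parent.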

How to determine whether a given string is an .xml file

I have an issue where I get some response as a String.
This String could be a normal string, a number, etc., or an .xml file.
When I get an XML file, I want to treat it differently.
I am not able to distinguish between a plain string and an .xml file.
Also, this XML file could have some syntactic errors.
Please suggest how I should proceed.
Code is like this:
Document document = reader.read(new StringReader(xml));
where xml can be a plain string or the content of an XML file.
If xml is a plain string, that is fine, but if it is an XML file with a syntax error then an exception should be thrown.
A proper XML document should begin with an XML declaration. If the declaration is there, the string is intended to be an XML document. Strictly speaking, the declaration is optional in XML 1.0, so a string without one can still be well-formed XML; the reliable test is to run it through a parser.
If you are using a language like C#, you can use XmlDocument.LoadXml:
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.loadxml.aspx
This will throw an error if the string is not in the correct XML format.
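In Java, the equivalent is to attempt a parse and catch the SAXException. The sketch below separates the three cases from the question. The rule that only text starting with '<' is meant as XML is a heuristic assumed for this example, and the class, enum and method names are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ResponseClassifier {

    enum Kind { PLAIN_TEXT, XML, MALFORMED_XML }

    static Kind classify(String text) {
        // Heuristic (an assumption of this sketch): only text starting with '<' is meant as XML
        if (!text.trim().startsWith("<")) {
            return Kind.PLAIN_TEXT;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations, a common precaution when parsing untrusted input
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return Kind.XML;
        } catch (SAXException e) {
            return Kind.MALFORMED_XML;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The caller can then pass PLAIN_TEXT responses through unchanged, process XML responses, and throw its own exception for MALFORMED_XML.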
