XML Parse error - java

I am getting the error below. Please help.
"parse error:
Error on line 1 of document :
The markup in the document preceding the root element must be well-formed.
Nested exception: The markup in the document preceding the root element must be well-formed.
XML is below
<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<'env:Envelope' xmlns>:env=\"http://www.w3.org/2003/05/soap-envelope\" xmlns:ns1=\"urn:zimbraAdmin\">
xmlns:ns2=\"urn:zimbraAdmin\"><env:Header><ns2:context/></env:Header><env:Body>
<ModifyAccountRequest xmlns=\"urn:zimbraAdmin\"><id>4d41ec71-d898-42b8-b522-3c3cdc5583a0</id>
<a n=\"zimbraIsAdminAccount\">TRUE</a>
</ModifyAccountRequest></env:Body></env:Envelope>

That was terribly malformed. Issues are highlighted below:
1. Every instance of \" should be replaced with a plain ". The backslash escapes the quote inside a Java string literal; it is not part of XML and must not appear in the document itself.
2. There should be no single quotes around the element name in <'env:Envelope', and I honestly have no idea where they came from.
3. The closing angle bracket at xmlns>:env= should be removed, as should the one at the end of the physical line xmlns:ns1=\"urn:zimbraAdmin\">. Removing that bracket brings the next namespace declaration (which seems unnecessarily identical to ns1) into the envelope tag.
I have no idea what caused the envelope to become so malformed, but you should read up on the purpose of the xmlns attributes and namespace prefixes you were setting, so that next time you at least understand what all the parts of the XML request do. This will help you troubleshoot your own documents in the future.
In the meantime, since you seem to be at a total loss, here is the XML with the errors above corrected.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:ns1="urn:zimbraAdmin" xmlns:ns2="urn:zimbraAdmin">
<env:Header>
<ns2:context/>
</env:Header>
<env:Body>
<ModifyAccountRequest xmlns="urn:zimbraAdmin">
<id>4d41ec71-d898-42b8-b522-3c3cdc5583a0</id>
<a n="zimbraIsAdminAccount">TRUE</a>
</ModifyAccountRequest>
</env:Body>
</env:Envelope>
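A quick way to catch this kind of problem before sending the request is to run the text through a parser. The following is a minimal sketch using the JDK's built-in DOM parser; the class and method names (WellFormedCheck, isWellFormed) are made up for illustration, and the shortened envelope is not the full request from the question. It also shows the difference between \" as a Java escape (which produces a plain quote) and a backslash that is really part of the data.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Returns true if the given text parses as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // In a Java string literal, \" is how a double quote is written;
        // the resulting String contains a plain " with no backslash.
        String good = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<env:Envelope xmlns:env=\"http://www.w3.org/2003/05/soap-envelope\">"
                + "<env:Body/></env:Envelope>";
        // Here the backslashes are part of the data itself, as in the question.
        String bad = "<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?><a/>";
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

If the second string is what actually reaches the server, the escaping is probably being applied twice somewhere upstream, for example by logging or serializing the string and then reusing that output.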

Related

XmlPullParser skipping START_TAG?

So I'm trying to parse a GPX file using the XmlPullParser.
For the most part, I have it working, but noticed that I'm not getting what I'm expecting.
A snippet of the file:
<?xml version="1.0" encoding="utf-8"?>
<gpx xmlns="http://www.topografix.com/GPX/1/1">
<wpt lat="34.767778" lon="-88.078889">
<name>EG1325</name>
<type>Waypoint</type>
<extensions>
<groundspeak:cache>
<groundspeak:country>United States</groundspeak:country>
</groundspeak:cache>
</extensions>
</wpt>
</gpx>
I trimmed the unimportant tags here, for the purpose of this question, assuming that the file passes validation with all namespaces represented. (Because the full file does.)
The issue comes when I get past the <type> tag.
Using EITHER next() or nextToken(), I will get the END_TAG event for the <type> tag. Then my next event will be a TEXT event, and the text will contain \n. The event after that will be a START_TAG, but for the <groundspeak:cache> tag and NOT the <extensions> tag.
I seem to get this both for using the nextToken() and next() calls. Is this expected?
Edit to add: The only setting I am setting in code for the XmlPullParser is:
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(false);
Check your XML file. Some XML files start with a few extra bytes, specifically "EF BB BF". This is called a BOM (byte order mark). When the XML contains these extra bytes, XmlPullParser doesn't work properly: it behaves as if there is no START_TAG event and goes to END_DOCUMENT.
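If a BOM does turn out to be the cause, one option is to drop it before the bytes reach the parser. This is only a sketch: BomSkipper and stripUtf8Bom are hypothetical names, and it works on a byte array for simplicity (for large files, wrapping the stream in a PushbackInputStream would avoid the copy).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomSkipper {
    // Returns the data without a leading UTF-8 BOM (EF BB BF), or unchanged if there is none.
    public static byte[] stripUtf8Bom(byte[] data) {
        boolean hasBom = data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] xml = "<gpx/>".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        // The cleaned stream is what would be handed to the parser's setInput(...).
        InputStream cleaned = new ByteArrayInputStream(stripUtf8Bom(withBom));
        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8));
    }
}
```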

How to parse SOAP XML with prefixes with jsoup?

This is a sample XML.
<env:Envelope xmlns:env='http://schemas.xmlsoap.org/soap/envelope/'>
<env:Header/>
<env:Body>
<ns0:NotifyRequest xmlns:ns3='http://dummyurl.com'>
<PartTotal>10</PartTotal>
<PartNo>2</PartNo>
</ns0:NotifyRequest>
</env:Body>
My server accepts these requests and this is parsed via Jsoup. I get the element by tag "ns0:NotifyRequest" then look for sub elements.
My problem is that when the prefix changes, my parser fails, because the element tag "ns0:NotifyRequest" is hard-coded. It gives an error when the received XML contains, for example, "ns3:NotifyRequest".
Is there a way to ignore this prefix and get the NotifyRequest element? I know I can get the inner elements without going through their direct parent. (I mean I can use BodyElement.getElementsByTag("PartTotal") instead of NotifyRequestElement.getElementsByTag("PartTotal"); they do the same job.) But I want to use a regex or something similar to ignore that random prefix and get the NotifyRequest element.
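One possible approach is to compare only the part of the tag name after the colon. The sketch below deliberately uses the JDK's built-in DOM parser rather than jsoup, so it is a swapped-in technique, not the asker's setup; PrefixAgnosticLookup, localPart, and findByLocalName are hypothetical names. The parser is left namespace-unaware on purpose, because the sample uses a prefix (ns0) that is never declared.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PrefixAgnosticLookup {
    // Strips an optional "prefix:" from a qualified name.
    public static String localPart(String qualifiedName) {
        int colon = qualifiedName.indexOf(':');
        return colon < 0 ? qualifiedName : qualifiedName.substring(colon + 1);
    }

    // Parses without namespace awareness, so undeclared prefixes are tolerated.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Returns every element whose name, ignoring any prefix, equals localName.
    public static List<Element> findByLocalName(Document doc, String localName) {
        List<Element> matches = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (localPart(e.getNodeName()).equals(localName)) {
                matches.add(e);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String xml = "<env:Envelope><env:Body><ns3:NotifyRequest>"
                + "<PartTotal>10</PartTotal><PartNo>2</PartNo>"
                + "</ns3:NotifyRequest></env:Body></env:Envelope>";
        Element request = findByLocalName(parse(xml), "NotifyRequest").get(0);
        System.out.println(request.getElementsByTagName("PartTotal").item(0).getTextContent());
    }
}
```

The same suffix comparison could be applied to the tag names jsoup returns. If the sender always declares its prefixes, matching on the namespace URI with a namespace-aware parser would be more robust than matching on names.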

Missing NameSpace Information In XML file using EXIficient

I am using EXIficient to convert XML data to EXI and back to XML. For this I use their EXIficientDemo class. Sample code:
EXIficientDemo sample = new EXIficientDemo();
sample.parseAndProofFileLocations("FilePath");
sample.codeSchemaLess();
It first converts the XML file to EXI and then back to XML. When it generates XML from the previously generated EXI file, it loses some namespace information.
Actual XML File:
<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="ja" xmlns="http://www.w3.org/ns/ttml"
xmlns:tts="http://www.w3.org/ns/ttml#styling"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<body>
<div>
<p xml:id="s1">
<span tts:origin="somethings">somethings</span>
</p>
</div>
</body>
Generated XML File By EXIficient
<?xml version="1.0" encoding="UTF-8"?>
<ns3:tt xmlns:ns3="http://www.w3.org/ns/ttml"
xml:lang="ja"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns3:body><ns3:div>
<ns3:p xml:id="s1">
<ns3:span xmlns:ns4="http://www.w3.org/ns/ttml#styling"
ns4:origin="somethings">somethings</ns3:span>
</ns3:p>
</ns3:div></ns3:body>
The generated XML file is missing xmlns:tts="http://www.w3.org/ns/ttml#styling".
How can I fix this problem? If you can, please help me.
EXIficient may be suppressing unused namespaces. Your example doesn't show any use of the ttm namespace.
As you can see, it didn't retain the namespace prefix for the ttml namespace either (changed to ns3). The generated XML is perfectly valid if the ttml#metadata namespace is unused.
Update
With the updated question, where namespace ttml#styling is used by the origin attribute of the span element, the namespace is retained in the rebuilt XML, but it has been moved to the span element.
This is still a very valid XML document.
Namespace declarations (xmlns) can appear on any element in an XML document, and each one applies to the element on which it appears and to all of its subelements (unless overridden, which is very unusual).
The same namespace can be declared many times on different elements. For simplicity and/or optimization, it is common to declare all namespaces up front, on the root element, using different prefixes, but it is not required to do so.
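To see that the two documents are equivalent to a namespace-aware consumer, one can look elements and attributes up by namespace URI and local name instead of by prefix. The following is a minimal sketch with the JDK's DOM parser; NamespaceEquivalence and originOfFirstSpan are hypothetical names, and the two strings are trimmed versions of the original and regenerated files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class NamespaceEquivalence {
    static final String TTML = "http://www.w3.org/ns/ttml";
    static final String STYLING = "http://www.w3.org/ns/ttml#styling";

    // Parses with namespace awareness so prefixes are resolved to URIs.
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Reads the styling "origin" attribute of the first TTML span, ignoring prefixes.
    public static String originOfFirstSpan(String xml) {
        Element span = (Element) parse(xml).getElementsByTagNameNS(TTML, "span").item(0);
        return span.getAttributeNS(STYLING, "origin");
    }

    public static void main(String[] args) {
        String original = "<tt xmlns=\"http://www.w3.org/ns/ttml\""
                + " xmlns:tts=\"http://www.w3.org/ns/ttml#styling\">"
                + "<body><div><p><span tts:origin=\"somethings\">somethings</span></p></div></body></tt>";
        String rebuilt = "<ns3:tt xmlns:ns3=\"http://www.w3.org/ns/ttml\">"
                + "<ns3:body><ns3:div><ns3:p>"
                + "<ns3:span xmlns:ns4=\"http://www.w3.org/ns/ttml#styling\""
                + " ns4:origin=\"somethings\">somethings</ns3:span>"
                + "</ns3:p></ns3:div></ns3:body></ns3:tt>";
        // Both lookups succeed even though prefixes and declaration sites differ.
        System.out.println(originOfFirstSpan(original));
        System.out.println(originOfFirstSpan(rebuilt));
    }
}
```

Code that depends on the literal prefix (for example, matching the string "tts:origin") is the kind that breaks after such a round trip; code that goes through the namespace URI should not.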
I read this question by accident and rather late unfortunately.
Just in case people are still struggling with this and are wondering what they can do.
As it was pointed out EXIficient behaves just fine with regards to namespace handling.
Having said that, the EXI specification allows one to preserve prefixes and namespaces (see Preserve Options).
In EXIficient one can set these options accordingly on an EXIFactory instance (called exiFactory here),
e.g.,
exiFactory.getFidelityOptions().setFidelity(FidelityOptions.FEATURE_PREFIX, true);

Parsing data inside CDATA element

I need to parse an XML file that looks like this:
1.<?xml version="1.0" encoding="UTF-8"?>
2.<Root>
3.<Record>
4.<in><![CDATA[<?xml version="1.0" encoding="UTF-8"?><XML><Attribute AttrID="A">Test</Attribute>-<Attribute AttrID="B"> <![CDATA[Aap Noot Mies]]> </Attribute>]]></XML></in>
5.<out><![CDATA[]]></out>
6.</Record>
7.</Root>
I am getting an error while parsing line number 4. Is there any way to escape a CDATA end token ( ]]> ) within a CDATA section in an XML document?
Your input is not well formed; there are several errors. I think you need to fix whatever generated it so that it produces something more like:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<Record>
<in><![CDATA[<?xml version="1.0" encoding="UTF-8"?><!-- - --><XML><Attribute AttrID="A">Test</Attribute>-<Attribute AttrID="B"> <![CDATA[Aap Noot Mies]]]]><![CDATA[> </Attribute></XML>]]></in>
<out><![CDATA[]]></out>
</Record>
</Root>
Note that the outer CDATA section must open with <![CDATA[, not <!CDATA[. The first occurrence of ]]> needs to be quoted (for example by stopping and restarting the outer CDATA section, as here). The outer ]]> needs to be moved after the </XML> so that the end tag as well as the start tag of the element is quoted.
That makes the file technically well formed, although names starting with xml (in upper or lower case), such as the element name XML, are reserved by the W3C for use in XML-related specifications. They should not be used in user XML files unless they are specifically defined by the W3C (such as xmlns).
In addition, I added a (quoted) comment around the dash after the XML declaration. If that CDATA section were extracted and made into an XML document, the dash would make the resulting document non-well-formed, as only white space, comments, and PIs are allowed before the first element.
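The stop-and-restart trick can be applied mechanically by whatever generates the file. This is a small sketch, assuming the generator is Java; CdataEscaper, toCdata, and roundTrip are hypothetical names. Every ]]> in the payload is split so that ]] closes one section and > opens the next, and the round trip through the JDK's DOM parser checks that the original text comes back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataEscaper {
    // Wraps text in CDATA, splitting each "]]>" across two adjacent sections.
    public static String toCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Embeds the text in an <in> element, parses it, and returns the text the parser sees.
    public static String roundTrip(String text) {
        try {
            String xml = "<in>" + toCdata(text) + "</in>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // getTextContent concatenates the adjacent CDATA sections.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String inner = "<XML><Attribute AttrID=\"B\"><![CDATA[Aap Noot Mies]]></Attribute></XML>";
        System.out.println(toCdata(inner));
        System.out.println(roundTrip(inner).equals(inner));
    }
}
```

An alternative that avoids nested CDATA entirely is to escape the inner document as ordinary text (&lt;, &amp;) instead of wrapping it in a CDATA section.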

XML Illegal Attribute Value

I am using the SAX parser in java to read some XML. The XML I am giving it has problems and is causing the parse to fail. Here is the error message:
11-18 10:25:37.290: W/System.err(3712): org.xml.sax.SAXParseException: Illegal: "<" inside attribute value (position:START_TAG <question text='null'>#1:23 in java.io.InputStreamReader#4074c678)
I have a feeling that it does not like the fact that I have some HTML tags inside of a string in the XML. I would think that anything inside the quotes gets ignored from a syntax standpoint. Also, is it valid to use single quotes here? Here is an example:
<quiz>
<question text="<img src='//files/alex/hilltf.PNG' alt='hill' style='max-width:400px' /> is represented on map by cut. ">
<answer text="1"/>
<answer text="2" correct="true"/>
</question>
</quiz>
You need to escape the < inside the text attribute value as &lt;. Since XML uses < to open tags, a literal < is illegal in an attribute value, and in element content it must be escaped or enclosed in a CDATA section (which isn't an option for an attribute value).
The error message is correct. A < must be the start of a tag and cannot appear inside an attribute value; it must be written as &lt; instead. The single quotes are not a problem: they are allowed inside a double-quoted attribute value.
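If the quiz XML is generated by code, the escaping can be done in one place when the attribute is written. A minimal sketch follows; AttributeEscaper, escapeAttribute, and readBack are hypothetical names, and it assumes attribute values are always written inside double quotes. The round trip through the JDK's DOM parser shows that the reader gets the original HTML string back.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class AttributeEscaper {
    // Escapes a value for use inside a double-quoted XML attribute.
    // The ampersand must be replaced first so later entities are not double-escaped.
    public static String escapeAttribute(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Writes the value into a <question text="..."/> element and reads it back.
    public static String readBack(String value) {
        try {
            String xml = "<question text=\"" + escapeAttribute(value) + "\"/>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("text");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String html = "<img src='//files/alex/hilltf.PNG' alt='hill' /> is represented on map by cut.";
        System.out.println(escapeAttribute(html));
        System.out.println(readBack(html).equals(html));
    }
}
```

Single quotes are left alone because the attribute is delimited by double quotes; if the writer used single-quoted attributes instead, ' would need to become &apos;.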
