XML parsing problem - java

I'm having this strange XML parsing problem.
I have this XML string I'm trying to parse
<?xml version="1.0"?>
<response status="success">
<lot>32342</lot>
</response>
I'm using XPath with Java to do this. I'm using the XPath expression "/response/@status" to find the text "success". However, whenever I evaluate this expression I get an empty string.
However, I am able to parse this string successfully using "/response/@type":
<?xml version="1.0"?>
<response type="success">
<lot>32342</lot>
</response>
So why would simply changing the name of the attribute change the return string to nothing?
is = new InputSource(new StringReader(testWOcreateStrGood));
xPathexpressionSuccess = xPath.compile("/response/@status");
responseStr = xPathexpressionSuccess.evaluate(is);
responseStr is the string I posted earlier with the "status" attribute.
Also, I declared testWOcreateStrGood as below:
private String testWOcreateStrGood = "<?xml version=\"1.0\"?>\n" +
"<response status=\"success\">\n" +
"<lot>32342</lot>\n" +
"</response>";

So why would simply changing the name of the attribute change the return string to nothing?
It shouldn't. You must be doing something else wrong, e.g. accessing the wrong XML document or not actually using the XPath expression you believe to be using.
To your code example:
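Check the API documentation for InputSource. You cannot pass an XML document as a string directly to the constructor: InputSource(String) interprets its argument as a system identifier (a URI), so XML text has to be wrapped in a Reader such as StringReader.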
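To check the claim that the attribute name should not matter, here is a minimal, self-contained sketch using only the JDK's javax.xml.xpath API. The class name AttributeXPathDemo and the helper evaluate are made up for illustration; attributes are selected with the "@" prefix in XPath.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question.
class AttributeXPathDemo {

    // Evaluates an XPath expression against XML held in a String.
    static String evaluate(String xml, String expression) throws Exception {
        XPath xPath = XPathFactory.newInstance().newXPath();
        XPathExpression compiled = xPath.compile(expression);
        // Wrap the text in a Reader; InputSource(String) would treat it as a URI.
        InputSource source = new InputSource(new StringReader(xml));
        return compiled.evaluate(source);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<response status=\"success\">\n"
                + "<lot>32342</lot>\n"
                + "</response>";
        System.out.println(evaluate(xml, "/response/@status")); // attribute value
        System.out.println(evaluate(xml, "/response/lot"));     // element text
    }
}
```

An expression that selects nothing, such as "/response/@type" against this document, evaluates to an empty string rather than throwing, so an empty result usually points at a mismatch between the expression and the document actually being parsed.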

Related

How to stop Jackson from parsing an element?

I have a XML Document where there are nested tags that should not be interpreted as XML tags
For example something like this
<something>cbaabc</something> should be parsed as a plain String "cbaabc" (it should be mentioned that the document has other elements as well that get parsed just fine). Jackson, though, tries to interpret it as an Object and I don't know how to prevent this. I tried using @JacksonXmlText, turning off wrapping, and a custom Deserializer, but I didn't get it to work.
The <a should be escaped as &lt;a. This back-and-forth conversion normally happens with every XML API: setting and getting text will use those entities (&...;).
Another option is to use an additional CDATA section: <![CDATA[ ... ]]>.
<something><![CDATA[cbaabc]]></something>
If you cannot correct that, and have to live with an already corrupted XML text, you must do your own hack:
Load the wrong XML in a String
Repair the XML
Pass the XML string to jackson
Repairing:
String xml = ...
xml = xml.replaceAll("<(/?a\\b[^>]*)>", "&lt;$1&gt;"); // Escape link tags so they become plain text
StringReader in = new StringReader(xml);
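The repair step can be tried out in isolation. The sketch below assumes the replacement is meant to escape the angle brackets of the a tags as &lt; and &gt;. The sample input with an href attribute is hypothetical, and the JDK's DOM parser stands in for Jackson purely to show that the escaped markup comes back as plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; the JDK DOM parser is used here instead of Jackson.
class EscapeNestedTagsDemo {

    // Escapes <a ...> and </a> so that they are treated as text, not markup.
    static String repair(String xml) {
        return xml.replaceAll("<(/?a\\b[^>]*)>", "&lt;$1&gt;");
    }

    // Parses the XML and returns the text content of the root element.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed shape of the problematic input: a link nested in <something>.
        String broken = "<something>cba<a href=\"x\">abc</a></something>";
        String repaired = repair(broken);
        System.out.println(repaired);
        System.out.println(rootText(repaired));
    }
}
```

The regular expression only targets a tags; any other nested tag names would need their own alternative in the pattern.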

Convert XML to JSON using org.apache.commons.json.utils.XML toJson - Changes empty element to "true"

I'm trying to convert an XML string to JSON in Java.
Here is a sample code:
import org.apache.commons.json.utils.XML;
String test = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?><a><b>val1</b><d/></a>";
InputStream is = new ByteArrayInputStream(test.getBytes());
String jsonString = XML.toJson(is);
The result is:
{"a":{"b":"val1","d":true}}
I don't understand why d's value is set to true.
Also, is there any way to get this result:
{"a":{"b":"val1","d":""}}
I did a little investigation. The org.apache.wink.json4j.utils.XML.toJson method uses a SAXParser. I couldn't debug it (the debugger warned me about missing line number attributes, possibly because of the decompiler), but I think it sets true for an empty tag.
Then I debugged apache.sling.commons.xml.XML.toJSONObject; it has its own XMLTokenizer. In my estimation, the empty tag comes out as true because of the SAXParser.

java convert string to xml and parse node [duplicate]

Hello, I am getting back a string from a web service.
I need to parse this string and get the text in the error message.
My string looks like this:
<response>
<returnCode>-2</returnCode>
<error>
<errorCode>100</errorCode>
<errorMessage>ERROR HERE!!!</errorMessage>
</error>
</response>
Is it better to just parse the string or convert to xml then parse?
I'd use Java's XML document libraries. It's a bit of a mess, but works.
String xml = "<response>\n" +
"<returnCode>-2</returnCode>\n" +
"<error>\n" +
"<errorCode>100</errorCode>\n" +
"<errorMessage>ERROR HERE!!!</errorMessage>\n" +
"</error>\n" +
"</response>";
Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
NodeList errNodes = doc.getElementsByTagName("error");
if (errNodes.getLength() > 0) {
    Element err = (Element) errNodes.item(0);
    System.out.println(err.getElementsByTagName("errorMessage")
            .item(0)
            .getTextContent());
} else {
    // success
}
I would probably use an XML parser to convert it into XML using DOM, then get the text. This has the advantage of being robust and coping with any unusual situations such as a line like this, where something has been commented out:
<!-- commented out <errorMessage>ERROR HERE!!!</errorMessage> -->
If you try to parse it yourself, you might fall foul of things like this. It also has the advantage that if the requirements expand, it's really easy to change your code.
http://docs.oracle.com/cd/B28359_01/appdev.111/b28394/adx_j_parser.htm
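The commented-out case can be illustrated with a short sketch. The class and method names are made up, and the "OLD MESSAGE" text inside the comment is an invented sample; the naive method is only there as a contrast to the DOM lookup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class comparing string searching with DOM parsing.
class CommentedOutDemo {

    // Naive approach: text between the first pair of errorMessage tags.
    static String naiveErrorMessage(String xml) {
        String open = "<errorMessage>";
        int start = xml.indexOf(open) + open.length();
        int end = xml.indexOf("</errorMessage>", start);
        return xml.substring(start, end);
    }

    // DOM approach: comments are not elements, so they are never matched.
    static String domErrorMessage(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList messages = doc.getElementsByTagName("errorMessage");
        return messages.getLength() > 0 ? messages.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<response>\n"
                + "<error>\n"
                + "<!-- commented out <errorMessage>OLD MESSAGE</errorMessage> -->\n"
                + "<errorMessage>ERROR HERE!!!</errorMessage>\n"
                + "</error>\n"
                + "</response>";
        System.out.println(naiveErrorMessage(xml)); // picks up the commented-out text
        System.out.println(domErrorMessage(xml));   // picks up the real element
    }
}
```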
It's an XML document. Use an XML parser.
You could tease it apart using string operations. But you have to worry about entity decoding, character encodings, CDATA sections etc. An XML parser will do all of this for you.
Check out JDOM for a simpler XML parsing approach than using raw DOM/SAX implementations.

how to pass all parameters of input XML document into output XML document

The general task is XML document processing. It would be nice to have all input XML parameters, usually given as <p:MyDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:p="urn:where/to/look/4/schema" schemaVersion="1.3">,
passed to the output XML.
More specifically, how do I get them from the input XML?
I can pass them (if I know them in advance) as
Element rootElement = document.createElementNS("urn:hard/coded", root);
rootElement.setAttribute("schemaVersion", "myWish");
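You can get all the attribute details of the root element using document.getDocumentElement().getAttributes(). Calling getAttributes() on the Document node itself returns null, because only Element nodes carry attributes.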
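As a rough sketch of how that could look with the JDK's DOM API: the code below reads the root element of the input, then creates an output root with the same name, namespace, and attributes, including the xmlns declarations. The class name CopyRootAttributes and the method copyRoot are hypothetical, and namespace-aware parsing is an assumption made so that the namespace URIs are available.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from the original answer.
class CopyRootAttributes {

    // Builds a new document whose root mirrors the input root's name and attributes.
    static Document copyRoot(String inputXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to read namespace URIs
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document input = builder.parse(new InputSource(new StringReader(inputXml)));
        Element inRoot = input.getDocumentElement();

        Document output = builder.newDocument();
        Element outRoot = output.createElementNS(
                inRoot.getNamespaceURI(), inRoot.getNodeName());

        // Copy every attribute, including xmlns:* declarations.
        NamedNodeMap attrs = inRoot.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            outRoot.setAttributeNS(
                    attr.getNamespaceURI(), attr.getNodeName(), attr.getValue());
        }
        output.appendChild(outRoot);
        return output;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<p:MyDocument"
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xmlns:p=\"urn:where/to/look/4/schema\""
                + " schemaVersion=\"1.3\"/>";
        Element root = copyRoot(xml).getDocumentElement();
        System.out.println(root.getNodeName());
        System.out.println(root.getAttribute("schemaVersion"));
    }
}
```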

Parsing 'pseudo' XML (that is, not well formed) in java?

I have some xml that looks like this:
<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>
The tags change and are variable, so there won't always be a 'name' tag.
I've tried 3 or 4 parses and they all seem to choke on it. Any hints?
Just because it doesn't have a defined schema, doesn't mean it isn't "valid" XML - your sample XML is "well formed".
The dom4j library will do it for you. Once parsed (your XML will parse OK) you can iterate through child elements, no matter what their tag name, and work with your data.
Here's an example of how to use it:
import java.util.Iterator;
import org.dom4j.*;
String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
Document document = DocumentHelper.parseText(text);
Element root = document.getRootElement();
// Visit every child element, whatever its tag name
for (Iterator i = root.elementIterator(); i.hasNext(); ) {
    Element element = (Element) i.next();
    String tagName = element.getQualifiedName(); // getQName() returns a QName, not a String
    String contents = element.getText();
    // do something
}
This is valid xml; try adding an XML Schema that allows for optional elements. If you can write an xml schema, you can use JAXB to parse it. XML allows for having optional elements; it isn't too "strict" about it.
Your XML sample is well-formed XML, and if anything "chokes" on it then it would be useful for us to know exactly what the symptoms of the "choking" are.
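If adding a third-party library is not an option, the same idea (walk the child elements without knowing their names in advance) can be sketched with the JDK's built-in DOM parser. The class name ChildElementsDemo and the method childValues are made up for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class using only the JDK.
class ChildElementsDemo {

    // Maps each child element's tag name to its text, in document order.
    static Map<String, String> childValues(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Map<String, String> values = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and comments; keep elements only.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
        System.out.println(childValues(text));
    }
}
```

A map keyed by tag name keeps only the last value when a tag repeats, so a list of name and value pairs would be a better fit if duplicates are expected.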
