I'm trying to translate a Java method that uses XPath to parse XML into one that uses JsonPath instead, and I'm having trouble working out what the XPath expression is doing so I can replicate it with JsonPath.
Here is the code that currently parses "String body".
public static String parseXMLBody(String body, String searchToken) {
    String xPathExpression;
    try {
        // we use xPath to parse the XML formatted response body
        xPathExpression = String.format("//*[1]/*[local-name()='%s']", searchToken);
        XPath xPath = XPathFactory.newInstance().newXPath();
        return (xPath.evaluate(xPathExpression, new InputSource(new StringReader(body))));
    } catch (Exception e) {
        throw new RuntimeException(e); // simple exception handling, please review it
    }
}
Can anyone help translate this into a method that uses JsonPath or something similar?
Thanks
I can explain the XPath for you
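//*[1] is short for /descendant-or-self::node()/*[1], so it selects every element that is the first element child of its parent. That includes the document element (there can be only one), but also the first child element at every deeper level, so it is a little strange. /* returns the document element alone.
//*[1]/* returns the element children of all those first-child elements; /*/* returns only the element children of the document element.
[local-name()='tagname'] filters nodes by their local name (the tag name without the namespace prefix).
The full expression //*[1]/*[local-name()='tagname'] fetches every element with the provided tag name whose parent is such a first-child element, ignoring namespaces, and evaluate() then returns the text of the first match in document order. For a flat document, where the fields sit directly under the document element, it could be simplified to /*/*[local-name()='tagname'].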
Without knowing the JSON, there is no way to say what the JsonPath should look like. I would not expect the JSON to have a root element, and I expect the items to be structured differently, because in JSON you cannot have multiple siblings with the same key (in XML you can have multiple siblings with the same node name).
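One way to check what the expression really matches is to run it next to /*/* on a small document. This is only a sketch using the JDK's javax.xml.xpath; the class name FirstChildXPathDemo and the sample XML are made up for illustration and are not from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class FirstChildXPathDemo {

    // Evaluates an XPath expression against an XML string and returns the
    // string value of the first matching node ("" when nothing matches).
    static String eval(String xml, String expression) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical response: <token> appears both nested and at the top level.
        String xml = "<response><session><token>nested</token></session>"
                   + "<token>top</token></response>";

        // <session> is a first child, so its <token> also matches and comes first.
        System.out.println(eval(xml, "//*[1]/*[local-name()='token']")); // prints "nested"

        // Only direct children of the document element are considered here.
        System.out.println(eval(xml, "/*/*[local-name()='token']"));     // prints "top"
    }
}
```

If your real responses are flat, both expressions return the same value, and the JSON equivalent is probably just a lookup of a top-level key.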
Related
I am trying to figure out how to go about getting the value of jxdm:ID from the following XML file:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<My:Message
xmlns:Abcd="http://...."
xmlns:box-1="http://...."
xmlns:bulb="http://...."
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://....stores.xsd">
<Abcd:StoreDataSection>
<Abcd:DataSection>
<Abcd:FirstStore>
<box-1:Response>
<box-1:DataSection>
<box-1:Release>
<box-1:Activity>
<bulb:Date>2017-04-29</bulb:Date>
<bulb:Store xsi:type="TPIR:Organization">
<bulb:StoreID>
<bulb:ID>D79G2102</bulb:ID>
</bulb:StoreID>
</bulb:Store>
</box-1:Activity>
</box-1:Release>
</box-1:DataSection>
</box-1:Response>
</Abcd:FirstStore>
</Abcd:DataSection>
</Abcd:StoreDataSection>
</My:Message>
I keep getting "null" as the value of node
Node node = (Node) xPath.evaluate(expression, document, XPathConstants.NODE);
This is my current Java code:
try {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new File("c:/temp/testingNamespace.xml"));
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//My/Message//Abcd/StoreDataSection/DataSection/FirstStore//box-1/Response/DataSection/Release/Activity//bulb/Store/StoreID/ID";
Node node = (Node) xPath.evaluate(expression, document, XPathConstants.NODE);
node.setTextContent("changed ID");
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(document), new StreamResult(new File("C:/temp/test-updated.xml")));
} catch (Exception e) {
System.out.println(e.getMessage());
}
How would the correct XPath be formatted in order for me to get that value and change it?
Update 1
So something like this?
String expression = "/My:Message/Abcd:StoreDataSection/Abcd:DataSection/Abcd:FirstStore/box-1:Response/box-1:DataSection/box-1:Release/box-1:Activity/bulb:Store/bulb:StoreID/bulb:ID";
The problem is that you have to address each node by its prefix and local name joined with a colon, for example //bulb:StoreID if you want to access StoreID.
Even then it would still not work, because you need to tell XPath how to resolve the namespace prefixes.
You should check this answer : How to query XML using namespaces in Java with XPath?
for details on how to implement and use a NamespaceContext.
The bottom line is that you need to implement a javax.xml.namespace.NamespaceContext and set it to the XPath.
XPath xpath = XPathFactory.newInstance().newXPath();
NamespaceContext context = new MyNamespaceContext();
xpath.setNamespaceContext(context);
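MyNamespaceContext above is left for you to write. A minimal sketch of what such a class could look like, backed by a plain map, is below. The class name MapNamespaceContext, the evaluate helper, the urn:example:store URI and the sample XML are all assumptions made for the example. Note that the document has to be parsed with a namespace-aware DocumentBuilderFactory, otherwise the prefixes are never resolved.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    MapNamespaceContext(Map<String, String> prefixToUri) {
        this.prefixToUri = prefixToUri;
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    // XPath evaluation only looks up prefix -> URI, so the reverse
    // lookups are kept minimal in this sketch.
    @Override
    public String getPrefix(String namespaceURI) {
        return null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        return Collections.emptyIterator();
    }

    // Parses the XML namespace-aware and evaluates the expression with the given bindings.
    static String evaluate(String xml, String expression, Map<String, String> namespaces) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed XPath steps to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new MapNamespaceContext(namespaces));
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; the URI is a placeholder, not the real one.
        String xml = "<s:store xmlns:s=\"urn:example:store\"><s:id>D79G2102</s:id></s:store>";
        // The prefix used in the XPath ("b") need not match the one in the document ("s");
        // only the namespace URI has to be the same.
        Map<String, String> ns = Collections.singletonMap("b", "urn:example:store");
        System.out.println(evaluate(xml, "/b:store/b:id", ns)); // prints "D79G2102"
    }
}
```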
Two things wrong here:
Your XML is not namespace-well-formed; it does not declare the used namespace prefixes.
Once namespace prefixes are properly declared in the XML and in your Java code, you use them in XPath via : not via /. So, it'd be not /Abcd/StoreDataSection but rather /Abcd:StoreDataSection (and so on for the rest of the steps in your XPath).
See also How does XPath deal with XML namespaces?
I am unable to change anything in the XML so I have to go with it as-is sadly.
Technically you might be able to use some XML tools with undeclared namespace prefixes, because this omission only renders the XML not namespace-well-formed. Many tools, however, expect XML that is not only well-formed but also namespace-well-formed. (See Namespace-Well-Formed for the difference.)
Otherwise, see How to parse invalid (bad / not well-formed) XML? to repair your XML.
I have the XPath of an element and need to write Java code that gives me exactly that element as an object. I believe I need to use SAX or DOM? I'm a total newbie.
xpath :
/*[local-name(.)='feed']/*[local-name(.)='entry']/*[local-name(.)='title']
Your comment suggests you want to use DOM4J, which supports XPath out of the box:
SAXReader reader = new SAXReader();
Document doc = reader.read(new File(....)); // or URL, or wherever the XML comes from
Node selectedNode = doc.selectSingleNode("/*[local-name(.)='feed']/*[local-name(.)='entry']/*[local-name(.)='title']");
(or there's also selectNodes which returns a List, if there might be more than one node matching that XPath expression - quite likely if this is an Atom feed).
But rather than using the local-name hack like this, if you know the namespace URI of the elements in your XML you can declare a prefix for this namespace and select the nodes by their fully qualified name:
SAXReader reader = new SAXReader();
Map<String, String> namespaces = new HashMap<>();
namespaces.put("atom", "http://www.w3.org/2005/Atom");
reader.getDocumentFactory().setXPathNamespaceURIs(namespaces);
Document doc = reader.read(new File(....)); // or URL, or wherever the XML comes from
List selectedNodes = doc.selectNodes("/atom:feed/atom:entry/atom:title");
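If adding DOM4J is not an option, the local-name() expression from the question should also work with the XPath support built into the JDK, which hands back an org.w3c.dom.Node. A rough sketch, where the class name JdkXPathNodeLookup and the tiny Atom-like sample are assumptions for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class JdkXPathNodeLookup {

    // Returns the first node matching the expression, or null when nothing matches.
    static Node selectSingleNode(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so local-name() sees names without prefixes
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical, heavily shortened feed.
        String xml = "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
                   + "<entry><title>First post</title></entry></feed>";
        Node title = selectSingleNode(xml,
                "/*[local-name(.)='feed']/*[local-name(.)='entry']/*[local-name(.)='title']");
        System.out.println(title.getTextContent()); // prints "First post"
    }
}
```

Passing XPathConstants.NODESET instead and casting to NodeList gives you all matches rather than the first one.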
Read here:
https://howtodoinjava.com/java/xml/java-xpath-tutorial-example/
I found it while searching for how to convert an XPath PMD rule to a Java rule. I did not find what I needed in it,
but maybe you will find what you need.
I have some xml that looks like this:
<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>
The tags change and are variable, so there won't always be a 'name' tag.
I've tried 3 or 4 parsers and they all seem to choke on it. Any hints?
Just because it doesn't have a defined schema, doesn't mean it isn't "valid" XML - your sample XML is "well formed".
The dom4j library will do it for you. Once parsed (your XML will parse OK) you can iterate through child elements, no matter what their tag name, and work with your data.
Here's an example of how to use it:
import java.util.Iterator;
import org.dom4j.*;

String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
Document document = DocumentHelper.parseText(text); // throws DocumentException if the text is not well-formed
Element root = document.getRootElement();
for (Iterator i = root.elementIterator(); i.hasNext(); ) {
    Element element = (Element) i.next();
    String tagName = element.getName(); // getQName() returns a QName object, not a String
    String contents = element.getText();
    // do something
}
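If you would rather not add a library, roughly the same loop can be written with the DOM parser that ships with the JDK. A minimal sketch; the class name ChildElementLister and the choice of a LinkedHashMap (to keep document order) are mine:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildElementLister {

    // Maps each child element of the root to its text, whatever the tag names are.
    // Repeated tag names would overwrite each other in this simple sketch.
    static Map<String, String> childValues(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Map<String, String> values = new LinkedHashMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes, comments and so on.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    values.put(child.getNodeName(), child.getTextContent());
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
        System.out.println(childValues(text)); // prints {name=oscar, race=puppet, class=grouch}
    }
}
```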
This is valid xml; try adding an XML Schema that allows for optional elements. If you can write an xml schema, you can use JAXB to parse it. XML allows for having optional elements; it isn't too "strict" about it.
Your XML sample is well-formed XML, and if anything "chokes" on it then it would be useful for us to know exactly what the symptoms of the "choking" are.
I am getting a StackOverflowError while converting org.w3c.dom.Document to org.dom4j.Document
Code :
public static org.dom4j.Document getDom4jDocument(Document w3cDocument)
{
//System.out.println("XMLUtility : Inside getDom4jDocument()");
org.dom4j.Document dom4jDocument = null;
DOMReader xmlReader = null;
try{
//System.out.println("Before conversion of w3cdoc to dom4jdoc");
xmlReader = new DOMReader();
dom4jDocument = xmlReader.read(w3cDocument);
//System.out.println("Conversion complete");
}catch(Exception e){
System.out.println("General Exception :- "+e.getMessage());
}
//System.out.println("XMLUtility : getDom4jDocument() Finished");
return dom4jDocument;
}
log :
java.lang.StackOverflowError
at java.lang.String.indexOf(String.java:1564)
at java.lang.String.indexOf(String.java:1546)
at org.dom4j.tree.NamespaceStack.getQName(NamespaceStack.java:158)
at org.dom4j.io.DOMReader.readElement(DOMReader.java:184)
at org.dom4j.io.DOMReader.readTree(DOMReader.java:93)
at org.dom4j.io.DOMReader.readElement(DOMReader.java:226)
at org.dom4j.io.DOMReader.readTree(DOMReader.java:93)
at org.dom4j.io.DOMReader.readElement(DOMReader.java:226)
Actually I want to convert the XML to a string by using org.dom4j.Document's asXML method. Is this possible without converting org.w3c.dom.Document to org.dom4j.Document? How?
When handling a heavy file, you shouldn't use a DOM reader, but a SAX one. I assume your goal is to output your document to a string.
Here you can find some differences between SAX and DOM (source) :
SAX
Parses node by node
Doesn’t store the XML in memory
We can't insert or delete a node
SAX is an event based parser
SAX is a Simple API for XML
Doesn't preserve comments
SAX generally runs a little faster than DOM
DOM
Stores the entire XML document into memory before processing
Occupies more memory
We can insert or delete nodes
Traverse in any direction.
DOM is a tree model parser
Document Object Model (DOM) API
Preserves comments
DOM generally runs a little slower than SAX
You don't need to produce a model which will need a lot of memory space. You only need to crawl through nodes to output them one by one.
Here, you will find some code to start with ; then you should implement a tree traversal algorithm.
Regards
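To give an idea of the event-based style described in the SAX list, here is a minimal sketch that visits elements one by one without building a tree. The class name SaxElementLister is made up, and collecting names into a list only stands in for whatever per-node output you actually need.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxElementLister {

    // Returns the qualified names of all elements in the order their start tags occur.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per start tag as the parser streams through the input.
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c><d/></c></a>")); // prints [a, b, c, d]
    }
}
```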
Take a look at java.lang.StackOverflowError in dom parser. Apparently trying to load a huge XML file into a String can result in a StackOverflowError. I think it's because the parser uses regexes to find the start and end of tags, which involves recursive calls for long Strings, as described in java.lang.StackOverflowError while using a RegEx to Parse big strings.
You can try and split up the XML document and parse the sections separately and see if that helps.
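For the original goal of getting the XML as a String, one option that skips the dom4j conversion entirely is the JDK's own Transformer, which can serialize an org.w3c.dom.Document directly. This is a sketch only: the class and method names are invented, and I have not checked how it behaves on the very large document from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class W3cDocumentToString {

    // Serializes a W3C DOM document with an identity transform.
    static String toXmlString(Document w3cDocument) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> header; remove this line if you want it.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(w3cDocument), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Helper used only to build a sample document for the demo.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<a><b>text</b></a>");
        System.out.println(toXmlString(doc));
    }
}
```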
I have an XML document as a string without any namespace, and I want to parse it using Java, JDOM and XPath and create an object tree. Since XPath always requires a prefix and a namespace to query, I added a namespace and a prefix to the root and then later to the node I want to get, but I see XPath requires the namespace on every node in the document, not only on the root.
So, is there a way to add the namespace to all of the elements in the document object at the start, so that my XPath query works correctly?
There are probably other mistakes and bad approaches in the code as well. I'll be glad for any ideas.
String response = "myXmlString";
ByteArrayInputStream stream = new ByteArrayInputStream(
response.getBytes());
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(stream);
org.jdom.Element request=(org.jdom.Element) doc.getRootElement();
request.setNamespace(Namespace.getNamespace("myNamespace"));
createRequest(request);
And then
public Request createRequest(Element requestXML) {
Request request = new Request();
requestXML.detach();
Document doc = new Document(requestXML);
XPath xpath = XPath.newInstance(myExpression);
xpath.addNamespace("m", doc.getRootElement().getNamespaceURI());
xpath.selectSingleNode(doc);
}
This last line returns empty: it is not null, but it throws a JDOM exception inside.
XPath and XML do NOT require namespaces. Go back to your original XML and remove any namespace/prefix hackery in your code.
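As an illustration that unprefixed steps are enough when the document has no namespace, here is a small sketch. It uses the JDK's javax.xml.xpath rather than JDOM, and the class name and sample XML are invented, so treat it as a demonstration of the XPath behaviour only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NoNamespaceXPathDemo {

    // Returns the text of every node matched by the expression, in document order.
    static List<String> selectTexts(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)),
                              XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No xmlns declarations anywhere, so plain element names match.
        String xml = "<request><item>a</item><item>b</item></request>";
        System.out.println(selectTexts(xml, "/request/item")); // prints [a, b]
    }
}
```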