XPath, Java and serialized xml - java

Assuming some xml like
<foo>
<bar>test</bar>
</foo>
Evaluating an expression with returnType = String like
/foo/bar
will return "test". However, I'd like to get the serialized xml instead, so something like
<bar>test</bar>
should be returned instead. As I can not check for the returnType in java's xpath implementation (xerces), I cannot simply get an object as result and if it indeed is a node, convert it to serialized xml.
Note: I don't know whether the expression will actually return a node, a string, a number or whatever so I cannot provide an appropriate return type to the eval function except string which, as my problem states, returns the text content and not the serialized xml.
So my question is: is there either a Java way or (preferably) an XPath way (a function?) to get the serialized xml for return type String instead of the text children of the selected node?
thanks!
Alex

Use the XPath return type XPathConstants.NODE; you can then serialize the returned Node yourself.
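A minimal sketch of that approach, assuming the expression is known to select a node. The class and method names (NodeSerializer, evaluateToXml) are made up for illustration; serialization uses the JDK's identity Transformer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class NodeSerializer {

    /** Evaluates the expression as a NODE and returns its XML, or null if nothing matched. */
    static String evaluateToXml(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODE);
            if (node == null) {
                return null;
            }
            // An identity transform copies the node to the output as markup.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation or serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // prints <bar>test</bar>
        System.out.println(evaluateToXml("<foo><bar>test</bar></foo>", "/foo/bar"));
    }
}
```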

Now, you are right to observe that it's difficult to discover the return type of the result; this is a real design weakness of JAXP.
If it's a problem to you, consider using Saxon's s9api interface, which returns XdmValue objects whose type you can interrogate; you also get XPath 2.0 access as a bonus.
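If switching libraries is not an option, newer JDKs (Java 9 and later) add XPath.evaluateExpression, which returns an XPathEvaluationResult whose type can be inspected. The sketch below assumes such a JDK; the class and method names (TypedXPath, evaluate, serialize) are made up for illustration, and attribute nodes simply fall back to their value.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathEvaluationResult;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathNodes;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; requires Java 9 or later.
class TypedXPath {

    /** Returns serialized XML for node results and the plain value for everything else. */
    static String evaluate(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPathEvaluationResult<?> result =
                    XPathFactory.newInstance().newXPath().evaluateExpression(expression, doc);
            switch (result.type()) {
                case NODESET: {
                    StringBuilder sb = new StringBuilder();
                    for (Node node : (XPathNodes) result.value()) {
                        sb.append(serialize(node));
                    }
                    return sb.toString();
                }
                case NODE:
                    return serialize((Node) result.value());
                default:
                    // BOOLEAN, NUMBER or STRING
                    return String.valueOf(result.value());
            }
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    private static String serialize(Node node) throws Exception {
        if (node.getNodeType() == Node.ATTRIBUTE_NODE) {
            return node.getNodeValue(); // attributes have no standalone markup form
        }
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<foo><bar>test</bar></foo>";
        System.out.println(evaluate(xml, "/foo/bar"));         // prints <bar>test</bar>
        System.out.println(evaluate(xml, "string(/foo/bar)")); // prints test
    }
}
```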

As Michael Kay answered, this is difficult in JAXP (the native Java interface).
In Mr Kay's Saxon library's s9api API (see Evaluating XPath Expressions using s9api), once you've called XPathSelector.evaluate() or XPathSelector.evaluateSingle() you can get the XML serialisation by calling XdmValue.toString().
However, if the XPath selected an attribute (e.g. //@name) you will still get the XML serialisation, e.g. name="value". You can call XdmItem.getStringValue(), but for elements that method will return the same values you're already seeing - the textual content of the element, not the serialisation. I've posted separately about how to distinguish between attributes and elements returned from Saxon s9api.

Related

Jayway JSONPath: how to select terminal nodes

I'm using Jayway JSONPath.
Given a JSON document having nodes with the same name at different structure levels, how would I select only those nodes that are terminal nodes, i.e. having only text or no content?
XPath would allow not(child::*) as a predicate, but I can't see a JSONPath equivalent.
Unfortunately, no JSONPath implementation (as of now) offers such an operation. However, some of the more advanced implementations that expanded on Goessner's reference have operations that get close to this.
One workaround is to check the type of a node, if possible. For instance, this is possible in the JavaScript JSONPath-Plus implementation using type selectors for JSON types: e.g. @null(), @boolean(), @number(), @string(), @array(), @object(), @integer() and others. This allows us, for instance, to get only numeric values:
$..*@number()
Combined with a more meaningful path selection we might get close. Nonetheless, this will not yield terminal values only but at least avoids array and object type properties.
Another workaround that should work with basic data types is to use the regex matcher available in quite a few implementations (like Jayway's, many JavaScript implementations, etc.) to infer the type of a node, e.g. again for numeric values:
$..[?(@.price =~ /[0-9]+\.?[0-9]*/)]
Again, this will not give you terminal values only but avoids array and object type properties.
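If post-processing in Java is acceptable, terminal values can also be collected outside JSONPath by walking the parsed tree. The sketch below is only an illustration: it assumes the JSON has already been parsed into java.util.Map and java.util.List structures (which, as far as I know, is what Jayway's default provider produces), and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Jayway JSONPath.
class TerminalNodes {

    /** Returns every value that is neither a JSON object (Map) nor a JSON array (List). */
    static List<Object> leaves(Object root) {
        List<Object> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Object node, List<Object> out) {
        if (node instanceof Map) {
            for (Object child : ((Map<?, ?>) node).values()) {
                collect(child, out);
            }
        } else if (node instanceof List) {
            for (Object child : (List<?>) node) {
                collect(child, out);
            }
        } else {
            out.add(node); // string, number, boolean or null
        }
    }

    /** Builds {"price": 8.95, "book": {"title": "X", "price": 9.0}, "tags": ["a"]}. */
    static Map<String, Object> sample() {
        Map<String, Object> book = new LinkedHashMap<>();
        book.put("title", "X");
        book.put("price", 9.0);
        List<Object> tags = new ArrayList<>();
        tags.add("a");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("price", 8.95);
        root.put("book", book);
        root.put("tags", tags);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(leaves(sample())); // prints [8.95, X, 9.0, a]
    }
}
```

The same walk could be applied to the result of a broader JSONPath query such as $..price to drop any matches that are objects or arrays.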

XPath with custom value for operand instead of xml

Is there any possibility of performing an XPath evaluation with custom values instead of an XML document?
Example:
count(/department/employees) > 10
Here, I will provide the values for /department/employee myself, and I want to use an XPath library in Java to take care of the evaluation.
It would be something like a user-supplied method, String getValue(String operand).
This getValue method should be called by the XPath engine, and I will take care of providing the value for each operand.
Please help me if there is any possibility of doing this.
Thanks
Durai
To evaluate XPath-like expressions over Java objects rather than XML, you can use Apache Commons JXPath:
http://commons.apache.org/proper/commons-jxpath/
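Separately from JXPath, the JDK's own XPath API has a callback close to the getValue idea: an XPathVariableResolver is asked for the value of each $variable in the expression. The sketch below assumes the expression can be rewritten to use variables (for example $employeeCount > 10 instead of count(...) > 10); the class, method and variable names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper class, not part of any library.
class VariableBackedXPath {

    /** Evaluates a boolean XPath expression whose operands are $variables looked up in the map. */
    static boolean evaluate(String expression, Map<String, Object> values) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The engine calls this resolver once per variable reference.
            xpath.setXPathVariableResolver(name -> values.get(name.getLocalPart()));
            // An empty document serves as the context; no real XML is involved.
            Document emptyContext =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            return (Boolean) xpath.evaluate(expression, emptyContext, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> values = new HashMap<>();
        values.put("employeeCount", 12.0); // supplied by the caller, not read from XML
        System.out.println(evaluate("$employeeCount > 10", values)); // prints true
    }
}
```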

How to read XML attribute values using JXPathContext if reference is missing

Given following XML, we are using JXPathContext to create Java object out of it.
<fb1:Activity fb2:metadata="Activity1">
</fb1:Activity>
<fb1:ActivityMetadata fb2:id="Activity1">
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
Reading the value:
String responseCode = (String) context.getValue("metadata[1]/Response/value");
This is working as expected. Now let's say, for instance, that the reference from Activity to ActivityMetadata is missing. What can we do to read the response value in that case? It is guaranteed that there can be at most one ActivityMetadata element in the XML.
Incomplete XML - need to parse this
<fb1:Activity fb2:metadata="">
</fb1:Activity>
<fb1:ActivityMetadata>
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
The path you're giving us doesn't match the document you're showing us.
Ignoring that for a moment:
XML doesn't constrain the tree at all; that's done by the XML Schema (if there is one) and/or the applications which process that kind of document. Only the folks who defined this particular kind of document, or the schema, or the code can tell you whether there are any guarantees about only one ActivityMetadata being present or what it means if there's more than one.
XML is pure syntax. Meaning is someone else's problem.

Apache Digester How do I get some xml nested within a tag as a literal string?

I am parsing an XML document with Digester. Part of it contains content formatted in cryptic pseudo-HTML XML elements, which I need to transform into a PDF. That will be done via Apache FOP. Hence I need to access the XML element which contains the content elements directly and pipe it to FOP. To do so, the Digester FAQ states that one should either
Wrap the nested xml in CDATA
or
If this can't be done then you need to use a NodeCreateRule to create a DOM node representing the body tag and its children, then serialise that DOM node back to text
Since it is third-party XML, the CDATA approach could only be done via (another) XSLT, which I hesitate to do.
It looks like this issue should be solvable via NodeCreateRule, but I cannot figure out how to get it done.
The documentation states that NodeCreateRule will push a Node onto the stack; however, I can only get it to pass null.
I tried
digester.addRule(docPath + "/contents", new NodeCreateRule());
digester.addCallMethod(docPath + "/contents", "setContentsXML");
setContentsXML expects an Element parameter.
I also tried this and this without any luck.
I am using the latest stable Digester. Would be thankful for any advice.
Update:
I found the bug. The result on my system is null, too. I am using JDK 6u24.
The problem in my case, as in the linked bug, lies in the proper serialisation of an Element. In my case the null value mentioned above was not returned by Digester but by Element#toString(). I assume something has changed since JDK 1.4.
Going by the example in the bug report:
result contains another (text) node with the actual content. toString(), however, only considers the content of the Element instance it is called upon.
The Element tree has to be serialized explicitly, for example with the serialization method in this usage example of NodeCreateRule.
In case someone else tries to use that with Digester 3: you have to change the method signature SetSerializedNodeRule#end() to SetSerializedNodeRule#end(String, String).
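As an illustration of explicit serialization, independent of Digester, the sketch below turns an Element subtree back into markup with the DOM Load and Save API that ships with the JDK. The class and method names are made up, and the sample markup is invented; this is not the code from the linked usage example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Digester.
class ElementToString {

    /** Serializes the element and all of its descendants, without an XML declaration. */
    static String serialize(Element element) {
        DOMImplementationLS ls = (DOMImplementationLS) element.getOwnerDocument()
                .getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(element);
    }

    /** Parses a string and returns its root element (stands in for the Node Digester pushes). */
    static Element parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element contents = parse("<contents><p>Hello <b>world</b></p></contents>");
        // Unlike toString(), this yields the nested markup as text.
        System.out.println(serialize(contents));
    }
}
```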

convert from byte array and mime type to string / object

Given a MIME string, how can I parse it to extract the charset? And is there a utility to map the different MIME types to object types (e.g., return 'xml' for both text/xml and application/xml)?
Well, given a byte array of a known character set, the conversion to String is trivial:
String result = new String(byteArray, charset);
So your first question is reduced to "what's the easiest way to extract a charset from the mime type?". This depends on the range of inputs you expect to be able to handle and what libraries you're already using. One way of doing this, for example, is using javax.mail.internet.ContentType to do the parsing; I'm sure other libraries provide similar functionality.
As for the second part, I'm not sure what you mean by "convert to an object". Everything in Java (excluding primitives) is an Object already; if you're talking about something more specific, then you'd need to be more specific. There's no generic framework available that will magically convert from anything to anything, so you'll need to narrow down your requirements there a bit.
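For simple inputs, both steps can be sketched with the standard library alone. This is a deliberately naive illustration, not a full RFC-compliant parser: it assumes a header of the form type/subtype; charset=name, and the class and method names (MimeTypes, charsetOf, family) are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class; handles only simple Content-Type values.
class MimeTypes {

    /** Returns the charset parameter of a Content-Type value, or the fallback if none is present. */
    static Charset charsetOf(String contentType, Charset fallback) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding quotes, as in charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return Charset.forName(value);
            }
        }
        return fallback;
    }

    /** Maps e.g. text/xml, application/xml and application/atom+xml to "xml". */
    static String family(String contentType) {
        String type = contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        String subtype = type.substring(type.indexOf('/') + 1);
        int plus = subtype.lastIndexOf('+');
        return plus >= 0 ? subtype.substring(plus + 1) : subtype;
    }

    public static void main(String[] args) {
        String header = "text/xml; charset=ISO-8859-1";
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        String text = new String(bytes, charsetOf(header, StandardCharsets.UTF_8));
        System.out.println(family(header) + ": " + text); // prints xml: café
    }
}
```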
Jersey's MediaType has a static valueOf method to parse a MIME string.
It also has support for creating an object given a value stream. Unfortunately, it looks like it cannot be used separately.
Building on Andrzej's approach: if you mean that the String object you just got is XML, then there are ways to convert it to a Java object. A simple technique is:
Create an XML Document object from the String.
Convert the XML Document object to a Java object.
There are various libraries/APIs available for the second step. A few to look at:
Castor (http://www.castor.org/xml-framework.html)
XStream (http://x-stream.github.io/)
The libraries are fairly easy to use.
JSON.parse(JSON.stringify(mime))
This worked for me
