XML and Java... Confused about Values versus Index? - java

I am trying to understand how to read out XML files using Java. I would like to have one XML tag, lets call it enable, pass a true to a method and another XML tag that provides a number to another method. I would like to pass the true by having the line in my XML file and pass the number as valueofnumber. I am reading out the XML file using a series of if statements testing for certain strings in an XML file:
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
{
if (localName.equals("enabled")){
currentConfig.setenabled(true);
}
else if (localName.equals("number")){
currentConfig.setnumber(Double.parseDouble(attributes.getValue("number")))
}
}
I am getting confused as to how extract the value of number from the XML file. Currently I am just getting an error that nothing is present when I try getIndex() as well.
Thanks very much in advance

The getValue() method you're calling takes a qualified name, meaning XML namespace + local name in the format :. Your XML document probably uses a namespace, which you'd have to supply. If there's no namespace, you might need to use the other getValue() method and pass null for the namespace. It all depends a lot on what parser you're using and how it's configured. You'd be better advised to move to a higher-level parsing library that takes care of these nuances for you:
StAX isn't much higher level than SAX, but it still has a friendlier and generally more intuitive interface.
JDOM, being a DOM parser, will be slightly less efficient, but it makes parsing XML incredibly easy.
Commons Digester is kind of a rules-based XML parsing engine. You establish rules for what you want to happen when this or that element or attribute is encountered, and then run the digester. Method calls are one of the rules you can set, as is creation and population of a POJO.
JAXB or XStream will completely remove the guesswork and bind the XML straight to POJOs with minimal configuration. Then you don't even have to deal with XML and can work with normal objects instead.
Edit: (Based on the XML sample) Your "number" isn't an attribute. It's a nested element. That's why you're having trouble getting it from the Attributes object. My other advice on other libraries still stands.

Related

How to read XML attribute values using JXPathContext if reference is missing

Given following XML, we are using JXPathContext to create Java object out of it.
<fb1:Activity fb2:metadata="Activity1">
</fb1:Activity>
<fb21:ActivityMetadata fb2:id="Activity1">
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
reading the value -
String responseCode = context.getValue("metadata[1]/Response/value");
This is working as expected. Now lets say for instance, the reference from Activity to ActivityMetadata is missing. What can we do to read the response value in such case? It is guaranteed that there can only be one ActivityMetadata element at max in the XML.
Incomplete XML - need to parse this
<fb1:Activity fb2:metadata="">
</fb1:Activity>
<fb21:ActivityMetadata>
<fb1:Response>XXXX</fb1:Response>
</fb1:ActivityMetadata>
The path you're giving us doesn't match the document you're showing us.
Ignoring that for a moment:
XML doesn't constrain the tree at all; that's done by the XML Schema (if there is one) and/or the applications which process that kind of document. Only the folks who defined this particular kind of document, or the schema, or the code can tell you whether there are any guarantees about only one ActivityMetadata being present or what it means if there's more than one.
XML is pure syntax. Meaning is someone else's problem.

XPath, Java and serialized xml

Assuming some xml like
<foo>
<bar>test</bar>
</foo>
Evaluating an expression with returnType = String like
/foo/bar
will return "test". However, I'd like to get the serialized xml instead, so something like
<bar>test</bar>
should be returned instead. As I can not check for the returnType in java's xpath implementation (xerces), I cannot simply get an object as result and if it indeed is a node, convert it to serialized xml.
Note: I don't know whether the expression will actually return a node, a string, a number or whatever so I cannot provide an appropriate return type to the eval function except string which, as my problem states, returns the text content and not the serialized xml.
So I am curious -> is there either a java- or (preferred) a xpath-way (function?) to get serialized xml for type string instead of the text children of the selected node?
thanks!
Alex
use the xpath return type XPathConstants.NODE and then you can serialize the returned Node yourself.
Now, you are right to observe that it's difficult to discover the return type of the result; this is a real design weakness of JAXP.
If it's a problem to you, consider using Saxon's s9api interface, which returns XdmValue objects whose type you can interrogate; you also get XPath 2.0 access as a bonus.
As Michael Kay answered, this is difficult in JAXP (the native Java interface).
In Mr Kay's Saxon library's s9api API (see Evaluating XPath Expressions using s9api), once you've called XPathSelector.evaluate() or XPathSelector.evaluateSingle() you can get the XML serialisation by calling XdmValue.toString().
However, if the XPath selected an attribute (e.g. //#name) you will still get the XML serialisation, e.g. name="value". You can call XdmItem.getStringValue(), but for elements that method will return the same values you're already seeing - the textual content of the element, not the serialisation. I've posted separately about how to distinguish between attributes and elements returned from Saxon s9api.

Apache Digester How do I get some xml nested within a tag as a literal string?

I am parsing a XML with Digester. A part of it contains content formatted in cryptic pseudo-HTML XML elements which I need to transform into an PDF. That will be done via Apache FOP. Hence I need to access the xml element which contains the content elements directly and pipe it to FOP. To do so the Digester FAQ states that one either
Wrap the nested xml in CDATA
or
If this can't be done then you need to use a NodeCreateRule to create a DOM node representing the body tag and its children, then serialise that DOM node back to text
Since it is a third party XML the CDATA approach could only be done via (another) XSLT which I hestitate to do.
It looks like this issue should be solvable via NodeCreateRule but I can not figure out how to get it done.
The documentation states that NodeCreateRule will push a Node onto the stack however I can only get it to pass null.
I tried
digester.addRule(docPath + "/contents", new NodeCreateRule());
digester.addCallMethod(docPath + "/contents", "setContentsXML");
setContentsXML expects a Element parameter.
I also tried this and this without any luck.
I am using the latest stable Digester. Would be thankful for any advice.
Update:
I found the bug . The result on my system is null, too. I am using JDK 6u24
The problem in my case as well as the linked bug lays in the proper serialisation of an Element. In my case the mentioned null value was not returned by Digester but by Element#toString(). I assume something changed since JDK 1.4.
By the bug example:
result contains another (text-)node with the actual content. toString() however simply takes the content of the Element instance it is called uppon.
The Element tree has to be serialized explicitly. For example with the serialization method in this useage example of NodeCreateRule.
In case someone else tries to use that with Digester 3: you have to change the method signature SetSerializedNodeRule#end() to SetSerializedNodeRule#end(String, String).

Java XML Library that keeps attributes in order?

Is there any java xml library that can create xml tags with attributes in a defined order?
I have to create large xml files with 5 to 20 attributes per tag and I want the attributes ordered for better readability. Attributes which are always created should be at the beginning, followed by common attributes and rarely used attributes should be at the end.
Here's an easy way to create a custom outputter based on jdom:
JDOM uses XmlOutputter to convert a DOM into text output. Attributes are printed with the method:
protected void printAttributes(Writer out, List attributes, Element parent,
NamespaceStack namespaces) throws IOException
As you see, the attributes are passed with a simple list. The (simple) idea would be to subclass this outputter and override the method:
#Override
protected void printAttributes(Writer out, List attributes, Element parent,
NamespaceStack namespaces) throws IOException {
Collections.sort(attributes, new ByNameCompartor());
super.printAttributes(out, attribute, parent, namespaces);
}
The implementation of a comparator shouldn't be that difficult.
I'm not aware of any such library.
The problem is that the XML information model explicitly states that the order the attributes of a tag are irrelevant. Therefore, any in-memory XML representation that goes to the effort of preserving the ordering will more memory than is necessary and/or implement the attribute set/get methods suboptimally.
My advice would be to implement the ordering of attributes in the presentation of your XML content; e.g. in your XML editor.
I should point out that my Answer answers the Question as written - about "keeping the attributes in order". This could mean:
preserving the order of insertion, or
keeping in attributes sorted according to some ordering rule; e.g. ascending order.
From your comment below that you really just want to output the attributes in sorted order. This is a simpler problem, and can be addressed by customizing the code that serializes a DOM. Andreas_D explains how you can do this with JDOM, and there may be other options.
The other point is that something is a bit broken if the order of attributes in the input to a tool makes a significant difference to how it behaves. Because the order of XML attributes is supposed to be non-significant ...

Generate valid XML name in Java

Are there any helpers that will transform/escape a string to be a valid XML name ?
Example, I have the string max(OfAll) and need to generate some XML like e.g.
<max(OfAll)>SomeText</<max(OfAll)>
That's obviously not a valid name, are there some helper methods that can transform the string to be a valid xml name ?
(For comparison, .NET have some methods that the above xml fragment would be:
<max_x028_OfAll_x028_>SomeText</<max_x028_OfAll_x028_>)
The encoding in your .NET example looks like the one defined in ISO9075. I don't think there is a built-in implementation in the jdk, but this encoding is also used by content repositories like alfresco or jackrabbit for their xml import/exports and query apis. A quick search turned up these two implementations, both available under open source licenses:
http://www.docjar.com/html/api/org/apache/jackrabbit/util/ISO9075.java.html
http://kickjava.com/src/org/alfresco/util/ISO9075.java.htm
One class which may be of use in other situations is StringEscapeUtils in the apache commons-lang project. It can escape text for use in XML documents, I'm not aware of anything to escape XML element names.
Could you not generate something more readable such as
<aggregation type="max(OfAll)">SomeText</aggregation>
There are lots of libraries available to marshall/unmarshall objects to xml and back including JAXB (part of the JDK), JiBX, Castor, XStream
I don't know of any helper methods for that, but rules here http://www.w3.org/TR/REC-xml/#NT-Name are pretty straightforward, so it should be easy to implement one.
As should be clear, normal XML escaping (replacing inappropriate characters with character entities) does not result in a valid XML identifier.
For the record, what you are doing is frequently called "name mangling".

Categories

Resources