Generate valid XML name in Java - java

Are there any helpers that will transform or escape a string so that it is a valid XML name?
For example, I have the string max(OfAll) and need to generate XML like:
<max(OfAll)>SomeText</max(OfAll)>
That's obviously not a valid name. Are there helper methods that can transform the string into a valid XML name?
(For comparison, .NET has methods that would turn the above XML fragment into:
<max_x0028_OfAll_x0029_>SomeText</max_x0028_OfAll_x0029_>)

The encoding in your .NET example looks like the one defined in ISO 9075. I don't think there is a built-in implementation in the JDK, but this encoding is also used by content repositories like Alfresco or Jackrabbit for their XML import/exports and query APIs. A quick search turned up these two implementations, both available under open source licenses:
http://www.docjar.com/html/api/org/apache/jackrabbit/util/ISO9075.java.html
http://kickjava.com/src/org/alfresco/util/ISO9075.java.htm

One class which may be of use in other situations is StringEscapeUtils in the Apache commons-lang project. It can escape text for use in XML documents, but I'm not aware of anything that escapes XML element names.
Could you not generate something more readable, such as
<aggregation type="max(OfAll)">SomeText</aggregation>
There are lots of libraries available to marshal/unmarshal objects to XML and back, including JAXB (bundled with the JDK from Java 6 through Java 10), JiBX, Castor and XStream.

I don't know of any helper methods for that, but the rules at http://www.w3.org/TR/REC-xml/#NT-Name are pretty straightforward, so it should be easy to implement one.

As should be clear, normal XML escaping (replacing inappropriate characters with character entities) does not result in a valid XML identifier.
For the record, what you are doing is frequently called "name mangling".
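As a rough illustration of the "implement it yourself" suggestion, here is a minimal sketch of such a name mangler. It follows the Name production of the XML 1.0 recommendation linked above and replaces every disallowed character with an `_xHHHH_` escape, in the spirit of the ISO 9075 encoding mentioned earlier. The class name XmlNameMangler is hypothetical, the colon is treated as disallowed on the assumption that a plain local name is wanted, and the sketch does not escape literal `_x` sequences already present in the input, so it is not guaranteed to be reversible.

```java
public class XmlNameMangler {

    /** Returns a valid XML name; invalid characters become _xHHHH_ escapes. */
    public static String mangle(String raw) {
        if (raw == null || raw.isEmpty()) {
            throw new IllegalArgumentException("an XML name must not be empty");
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            int cp = raw.codePointAt(i);
            // The first character has stricter rules than the following ones.
            boolean allowed = (i == 0) ? isNameStart(cp) : isNameChar(cp);
            if (allowed) {
                out.appendCodePoint(cp);
            } else {
                out.append(String.format("_x%04X_", cp));
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    // NameStartChar from XML 1.0, minus ':' (assumption: no prefixed names).
    private static boolean isNameStart(int c) {
        return c == '_' || in(c, 'A', 'Z') || in(c, 'a', 'z')
                || in(c, 0xC0, 0xD6) || in(c, 0xD8, 0xF6) || in(c, 0xF8, 0x2FF)
                || in(c, 0x370, 0x37D) || in(c, 0x37F, 0x1FFF)
                || in(c, 0x200C, 0x200D) || in(c, 0x2070, 0x218F)
                || in(c, 0x2C00, 0x2FEF) || in(c, 0x3001, 0xD7FF)
                || in(c, 0xF900, 0xFDCF) || in(c, 0xFDF0, 0xFFFD)
                || in(c, 0x10000, 0xEFFFF);
    }

    // NameChar adds '-', '.', digits and a few combining ranges.
    private static boolean isNameChar(int c) {
        return isNameStart(c) || c == '-' || c == '.' || in(c, '0', '9')
                || c == 0xB7 || in(c, 0x300, 0x36F) || in(c, 0x203F, 0x2040);
    }

    private static boolean in(int c, int low, int high) {
        return c >= low && c <= high;
    }
}
```

With this sketch, XmlNameMangler.mangle("max(OfAll)") yields max_x0028_OfAll_x0029_, and a leading digit as in "1st" is escaped to _x0031_st because digits may not start a name.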

Related

Is it legal to reject WSDLs where namespaces differ only in file extension (the part after the last dot)?

I have a complicated web application which uses many web services. I have to start using a new one. The WSDL of this new service defines a target namespace which is almost the same as a namespace used in an older WSDL. Only the part after the last dot differs.
The package names deduced by JAXB are the same for them, and the ObjectFactory generated from the second one overwrites the other.
For example, one WSDL has the target namespace "http://foo.com/a.b.c" and another one has "http://foo.com/a.b.c_2". The Java package name will then be com.foo.a.b for both namespaces, which is a kind of collision.
I checked the JAXB spec and found this ( https://download.oracle.com/otn-pub/jcp/jaxb-2.0-fr-eval-oth-JSpec/jaxb-2_0-fr-spec.pdf?AuthParam=1542978637_f7c18a1892b0ff022071acdab6259bdd ) :
D.5.1 Mapping from a Namespace URI
An XML namespace is represented by a URI. Since XML Namespace will be mapped to a Java package, it is necessary to specify a default mapping from a URI to a Java package name. The URI format is described in [RFC2396]. The following steps describe how to map a URI to a Java package name. The example URI, http://www.acme.com/go/espeak.xsd, is used to illustrate each step.
1. Remove the scheme and ":" part from the beginning of the URI, if present. Since there is no formal syntax to identify the optional URI scheme, restrict the schemes to be removed to case insensitive checks for schemes “http” and “urn”.
//www.acme.com/go/espeak.xsd
2. Remove the trailing file type, one of .?? or .??? or .html.
//www.acme.com/go/espeak
...
There are probably workarounds for this on my side, but I would like the provider of the web services to "correct" the situation by using "proper" namespaces that do not differ only in the last part (the file name extension, in JAXB parlance?).
I am looking for arguments for "my case".
Please see Namespaces in XML 1.0 §2.3 Comparing URI References:
URI references identifying namespaces are compared when determining whether a name belongs to a given namespace, and whether two names belong to the same namespace. [Definition: The two URIs are treated as strings, and they are identical if and only if the strings are identical, that is, if they are the same sequence of characters. ] The comparison is case-sensitive, and no %-escaping is done or undone.
If your namespaces "only differ in file extension", these are different namespaces. If your tools generate the same package for them, the problem is on your side, not on the side of the WSDL author.
So no, you do not have really good arguments for "your case", sorry.
The fix is trivial: simply configure the target package per namespace. See the following question, for instance:
CXF: How to change package of WSDL imported XML Schema using JAXB external binding file?

Mustache kind of String replacement in Java

My application allows users to define a few templates for text etc. E.g. one of the shortcuts could be hi {{name}}, nice to meet you.
I have a complex JSON document which has name and a lot of inner JSON objects. I am looking for a good Mustache-like implementation in Java which can substitute the values from the JSON into the string. Currently I am iterating through each key and replacing the string, but I am looking for a more elegant solution which gives the users more power in their templating, like loops and conditions, similar to Mustache/Handlebars.
Though Mustache for Java looks good, I haven't seen any implementation which can take its values from JSON. All examples apply to a Java object, not to a JSON object. It looks to me as if, internally, it uses an object mapper to convert one object to another and somehow applies that.
Perhaps I can convert JSON into a map and provide it.
Probably I am missing something. Thanks.
You have to convert the JSON string to a Java object. You can use a nested Map, a Multimap, or create your own object to represent the structure.
You probably want to use a JSON serializer to create a Java object from the JSON string. Good solutions are Jackson, Gson or Json-simple.
Once you have a correct Java representation of the JSON, you can use a template engine to do the string replacement. Known libraries are Freemarker, Velocity and StringTemplate.
Personally I recommend Jackson+Freemarker, but all are good solutions.
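To make the "nested Map plus template" idea concrete, here is a minimal JDK-only sketch. It assumes the JSON has already been converted into a nested Map (for example by one of the JSON libraries named above) and only resolves {{dotted.path}} placeholders. The class name MiniTemplate is hypothetical, and the sketch offers no loops or conditions, so it is not a substitute for a real template engine.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MiniTemplate {

    // Matches {{name}} or {{address.city}}, allowing spaces inside the braces.
    private static final Pattern TAG =
            Pattern.compile("\\{\\{\\s*([\\w.]+)\\s*\\}\\}");

    /** Replaces each {{path}} with the value found in the nested map. */
    public static String render(String template, Map<String, Object> data) {
        Matcher matcher = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Object value = lookup(data, matcher.group(1));
            // Missing keys render as an empty string in this sketch.
            String text = (value == null) ? "" : value.toString();
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Walks the dotted path through nested maps; returns null if it breaks off.
    private static Object lookup(Map<String, Object> data, String path) {
        Object current = data;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }
}
```

Given a map with name set to "Ann", rendering "hi {{name}}, nice to meet you." produces "hi Ann, nice to meet you.". Anything beyond simple substitution is where a full engine earns its keep.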
Try Apache Velocity; it does something very similar for property substitution in text.
Chunk is a very JSON-friendly template engine. Loops & conditions, tag syntax is similar to Mustache, and you can reference nested associative arrays of data fairly elegantly right from the template.
See sample code for JSON + Chunk in this answer.

Apache Digester: How do I get some XML nested within a tag as a literal string?

I am parsing an XML document with Digester. A part of it contains content formatted in cryptic pseudo-HTML XML elements which I need to transform into a PDF. That will be done via Apache FOP. Hence I need to access the XML element which contains the content elements directly and pipe it to FOP. To do so, the Digester FAQ states that one should either
Wrap the nested xml in CDATA
or
If this can't be done then you need to use a NodeCreateRule to create a DOM node representing the body tag and its children, then serialise that DOM node back to text
Since it is third-party XML, the CDATA approach could only be done via (another) XSLT, which I hesitate to do.
It looks like this issue should be solvable via NodeCreateRule, but I cannot figure out how to get it done.
The documentation states that NodeCreateRule will push a Node onto the stack; however, I can only get it to pass null.
I tried
digester.addRule(docPath + "/contents", new NodeCreateRule());
digester.addCallMethod(docPath + "/contents", "setContentsXML");
setContentsXML expects an Element parameter.
I also tried this and this without any luck.
I am using the latest stable Digester. Would be thankful for any advice.
Update:
I found the bug. The result on my system is null, too. I am using JDK 6u24.
The problem in my case, as in the linked bug, lies in the proper serialisation of an Element. In my case the null value mentioned above was not returned by Digester but by Element#toString(). I assume something changed since JDK 1.4.
Following the bug example:
result contains another (text) node with the actual content. toString(), however, only considers the Element instance it is called upon, not its child nodes.
The Element tree has to be serialized explicitly, for example with the serialization method in this usage example of NodeCreateRule.
In case someone else tries to use that with Digester 3: you have to change the method signature SetSerializedNodeRule#end() to SetSerializedNodeRule#end(String, String).
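For readers without access to the linked usage example, explicit serialization of a DOM node can be sketched with the JDK's own javax.xml.transform API. This is an illustration under assumptions rather than the code from that example: the class name NodeSerializer and the parseRoot helper (used only to build a test node) are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeSerializer {

    /** Serializes the node and all of its children back to XML text. */
    public static String toXml(Node node) {
        try {
            // An identity transformer simply copies the DOM tree to the output.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    /** Hypothetical helper for trying the sketch: parses XML, returns the root. */
    public static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

A setter such as setContentsXML(Element) could call NodeSerializer.toXml(element) and hand the resulting string to FOP, instead of relying on Element#toString().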

XML data types specification

While there is plenty of documentation about XML document structure, there are very few pointers on how non-textual data (e.g. integer and decimal numbers, boolean values) should be handled.
Is there a consolidated standard? Could you point me to a good starting point?
POST SCRIPT
I add an example because the question is actually too generic.
I've defined a document structure:
<rectangle>
<width>12.45</width>
<height>23.34</height>
<rounded_corners>true</rounded_corners>
</rectangle>
Since I'm using DOM, the API is oriented to textual data. Say doc is an instance of Document: doc.createTextNode("takes a string"). That is, the API doesn't force a particular representation.
So two questions arise:
1) Is saving boolean true as 'true' instead of, say, 1/0 a standard?
2) I have to define simple methods that write to and parse from these string representations. For example, I may use Java wrappers to do this conversion:
Float f = 23.34f;
doc.createTextNode(f.toString());
But does it adhere to the XML standard for decimal numbers? If so, I can say that non-Java programs can read this data because the data representation is XML.
Why not JAXB
This is just an example; my XML is made up of a big tree of data and I need to write and read just a little part of it, so JAXB does not seem to fit very well. Binding architectures tend to establish a tight coupling between XML structure and class structure. Good implementations like MOXy let you loosen this coupling, but there are cases in which the two structures simply don't match. In those cases, the DTO used to adapt them would be more work than using DOM.
Of course, there are dozens of official standards related to core XML. Some of the important ones:
XML schema 1.1 overview
XML schema 1.1 - datatypes ('XSD')
XML schema 1.1 - structures
Core XML overview
XML 1.0
XML 1.1
Further specifications such as XPath and XQuery are available at www.w3.org/standards/xml/. As you tagged this question with java, you may be interested in JAXB as well, which defines a standard approach for mapping XML to Java and vice versa.
You can leverage the javax.xml.bind.DatatypeConverter class (part of JAXB, which is no longer bundled with the JDK as of Java 11) to convert simple Java types to a String that conforms to the XML Schema specification:
Float f = 23.34f;
doc.createTextNode(DatatypeConverter.printFloat(f));
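Where JAXB is not on the classpath, the relevant lexical rules of XML Schema datatypes can be approximated by hand. The sketch below is an assumption-laden illustration, not an official API: the class name XsdLexical is hypothetical. It covers three points from the question: xs:boolean accepts true, false, 1 and 0; xs:decimal does not allow exponent notation; and xs:double writes infinity as INF, whereas Double.toString produces Infinity.

```java
import java.math.BigDecimal;

public class XsdLexical {

    /** Formats a double using the xs:double spellings for special values. */
    public static String printDouble(double value) {
        if (Double.isNaN(value)) {
            return "NaN";
        }
        if (value == Double.POSITIVE_INFINITY) {
            return "INF";
        }
        if (value == Double.NEGATIVE_INFINITY) {
            return "-INF";
        }
        // Ordinary values: Java's output (e.g. 23.34 or 1.0E10) is accepted.
        return Double.toString(value);
    }

    /** Formats a decimal without an exponent, as xs:decimal requires. */
    public static String printDecimal(BigDecimal value) {
        return value.toPlainString();
    }

    /** Parses the four lexical forms that xs:boolean allows. */
    public static boolean parseBoolean(String text) {
        String trimmed = text.trim();
        if (trimmed.equals("true") || trimmed.equals("1")) {
            return true;
        }
        if (trimmed.equals("false") || trimmed.equals("0")) {
            return false;
        }
        throw new IllegalArgumentException("not an xs:boolean: " + text);
    }
}
```

So writing 'true' is standard, and a reader should also be prepared to accept '1'. Plain Float.toString output is fine for ordinary numbers, but the special values need the mapping shown above.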

XML and Java... Confused about Values versus Index?

I am trying to understand how to read XML files using Java. I would like to have one XML tag, let's call it enable, pass a true to a method, and another XML tag that provides a number to another method. I would like to pass the true by having the line in my XML file and pass the number as valueofnumber. I am reading the XML file using a series of if statements testing for certain strings in an XML file:
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
{
    if (localName.equals("enabled")) {
        currentConfig.setenabled(true);
    }
    else if (localName.equals("number")) {
        currentConfig.setnumber(Double.parseDouble(attributes.getValue("number")));
    }
}
I am getting confused as to how to extract the value of number from the XML file. Currently I am just getting an error that nothing is present when I try getIndex() as well.
Thanks very much in advance
The getValue() method you're calling takes a qualified name, meaning namespace prefix plus local name in the format prefix:localName. Your XML document probably uses a namespace, which you'd have to supply. If there's no namespace, you might need to use the other getValue() method and pass an empty string for the namespace URI. It all depends a lot on what parser you're using and how it's configured. You'd be better advised to move to a higher-level parsing library that takes care of these nuances for you:
StAX isn't much higher level than SAX, but it still has a friendlier and generally more intuitive interface.
JDOM, which builds an in-memory tree like a DOM parser, will be slightly less efficient, but it makes parsing XML incredibly easy.
Commons Digester is kind of a rules-based XML parsing engine. You establish rules for what you want to happen when this or that element or attribute is encountered, and then run the digester. Method calls are one of the rules you can set, as is creation and population of a POJO.
JAXB or XStream will completely remove the guesswork and bind the XML straight to POJOs with minimal configuration. Then you don't even have to deal with XML and can work with normal objects instead.
Edit: (Based on the XML sample) Your "number" isn't an attribute. It's a nested element. That's why you're having trouble getting it from the Attributes object. My other advice on other libraries still stands.
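If you stay with SAX, the text of a nested element arrives through characters(), not through the Attributes object. The following is a minimal sketch under assumptions: the XML layout (a config root with an empty enabled element and a number element holding the value) is a guess at the sample, which is not reproduced here, and the class name ConfigHandler and its accessors are hypothetical. It compares qName because the default SAXParserFactory is not namespace-aware, in which case localName may be empty.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ConfigHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private boolean enabled;
    private double number;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
        if (qName.equals("enabled")) {
            enabled = true; // the mere presence of the tag means "true"
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("number")) {
            number = Double.parseDouble(text.toString().trim());
        }
    }

    public boolean isEnabled() { return enabled; }

    public double getNumber() { return number; }

    /** Convenience entry point for trying the sketch on an XML string. */
    public static ConfigHandler parse(String xml) {
        try {
            ConfigHandler handler = new ConfigHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

The value is read in endElement because only then is the element's text known to be complete.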
