XML data types specification - java

While there is plenty of documentation about XML document structure, there are very few pointers on how non-textual data (e.g. integer and decimal numbers, boolean values) should be represented.
Is there a consolidated standard? Could you tell me a good starting point?
POST SCRIPT
I'm adding an example because the question is otherwise too generic.
I've defined a document structure:
<rectangle>
<width>12.45</width>
<height>23.34</height>
<rounded_corners>true</rounded_corners>
</rectangle>
Since I'm using DOM, the API is oriented to textual data. Say doc is an instance of Document: doc.createTextNode("takes a string"). That is, the API doesn't force a particular representation.
So two questions arise:
1) Is saving boolean true as 'true' instead of, say, 1/0 a standard?
2) I have to define simple methods that write to and parse from these string representations. For example, I may use Java wrappers to do this conversion:
Float f = 23.34f;
doc.createTextNode(f.toString());
But does it adhere to the XML standard for decimal numbers? If so, I can say that non-Java programs can read this data because the data representation is XML.
Why not JAXB
This is just an example. My XML is made up of a big tree of data and I need to write and read just a small part of it, so JAXB does not seem to fit very well. A binding architecture tends to establish a tight coupling between XML structure and class structure. Good implementations like MOXy let you loosen this coupling, but there are cases in which the two structures simply don't match. In those cases, the DTO used to adapt them would be more work than using DOM.

Of course, there are dozens of official standards related to core XML. Some of the important ones:
XML schema 1.1 overview
XML schema 1.1 - datatypes ('XSD')
XML schema 1.1 - structures
Core XML overview
XML 1.0
XML 1.1
Further specifications such as XPath and XQuery are available at www.w3.org/standards/xml/. As you tagged this question with java, you may be interested in JAXB as well, which defines a standard approach for mapping XML to Java and vice versa.

You can leverage the javax.xml.bind.DatatypeConverter class to convert simple Java types to a String that conforms to the XML Schema specification:
Float f = 23.34f;
doc.createTextNode(DatatypeConverter.printFloat(f));
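Where javax.xml.bind is not available (it is no longer bundled with the JDK from Java 11 onward), the lexical forms can be approximated by hand. The following is a minimal sketch, not DatatypeConverter's implementation: the class XsdLexical and its method names are hypothetical. It assumes the XML Schema rules that xs:boolean accepts true, false, 1 and 0, that xs:decimal has no exponent notation, and that xs:double spells infinities as INF and -INF (Java's own toString would produce "Infinity").

```java
import java.math.BigDecimal;

// Hypothetical helper (not part of any library): converts Java values to
// XML Schema lexical forms and back.
final class XsdLexical {

    // xs:boolean: "true" and "false" are the canonical forms.
    static String printBoolean(boolean value) {
        return value ? "true" : "false";
    }

    // xs:boolean also allows "1" and "0" when reading.
    static boolean parseBoolean(String text) {
        switch (text.trim()) {
            case "true":
            case "1":
                return true;
            case "false":
            case "0":
                return false;
            default:
                throw new IllegalArgumentException("Not an xs:boolean: " + text);
        }
    }

    // xs:decimal forbids exponents, so avoid BigDecimal.toString(),
    // which may emit scientific notation.
    static String printDecimal(BigDecimal value) {
        return value.toPlainString();
    }

    // xs:double uses INF, -INF and NaN for the special values.
    static String printDouble(double value) {
        if (Double.isNaN(value)) return "NaN";
        if (value == Double.POSITIVE_INFINITY) return "INF";
        if (value == Double.NEGATIVE_INFINITY) return "-INF";
        return Double.toString(value);
    }

    // Lenient: Double.parseDouble accepts some forms that xs:double does not.
    static double parseDouble(String text) {
        String t = text.trim();
        if (t.equals("INF")) return Double.POSITIVE_INFINITY;
        if (t.equals("-INF")) return Double.NEGATIVE_INFINITY;
        return Double.parseDouble(t);
    }

    public static void main(String[] args) {
        System.out.println(printBoolean(true));
        System.out.println(printDecimal(new BigDecimal("23.34")));
        System.out.println(printDouble(Double.POSITIVE_INFINITY));
    }
}
```

With DOM this would be used as doc.createTextNode(XsdLexical.printDecimal(value)); a non-Java reader that follows XML Schema datatypes should then be able to parse the text.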

Related

Deserialize Json with inconstant field to POJO in Java

Is it possible to create a POJO class in Java to which this JSON can be deserialized?
{
"name": "value",
"random-value-01" : {
"constant-field-00":"value_00",
"constant-field-01":"value_01"
},
"random-value-02" : {
"constant-field-00":"value_02",
"constant-field-01":"value_03"
},
...
"random-value-XX" : {
"constant-field-00":"value",
"constant-field-01":"value"
},
}
If all of your random-value-x JsonObjects have the same format (i.e. the two constant fields are the same for each), then you could always have something akin to:
class RandomValue {
private final String constantField00;
private final String constantField01;
// ... Constructors, getters, etc.
}
class Pojo {
private final String name;
private final Map<String, RandomValue> randomValues;
// ...
}
If they're ordered (i.e. they're all the same random-value, like property-01, property-02, etc.) then you could also have the Map be a List (or Set, etc.) of your RandomValue elements.
If, on the other hand, the constant-fields are all random keys as well, then you're probably stuck with something more like:
class Pojo {
private final String name;
private final Map<String, Map<String, String>> additionalInfo;
// ...
}
Where the keys to the additionalInfo Map are your random-value-xs, and the values are a Map of String keys (constant-field-0xs) to String values (values).
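To make the first variant concrete, here is a rough sketch of how the Map-based Pojo could be populated. It assumes the JSON has already been parsed into a generic Map<String, Object> by whatever JSON library is in use; the factory method fromParsedJson is hypothetical and not taken from any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RandomValue {
    final String constantField00;
    final String constantField01;

    RandomValue(String constantField00, String constantField01) {
        this.constantField00 = constantField00;
        this.constantField01 = constantField01;
    }
}

final class Pojo {
    final String name;
    final Map<String, RandomValue> randomValues;

    Pojo(String name, Map<String, RandomValue> randomValues) {
        this.name = name;
        this.randomValues = randomValues;
    }

    // Hypothetical factory: "name" is the only fixed key; every other key
    // whose value is an object is treated as a random-value-XX entry.
    static Pojo fromParsedJson(Map<String, Object> root) {
        String name = null;
        Map<String, RandomValue> values = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : root.entrySet()) {
            Object value = entry.getValue();
            if ("name".equals(entry.getKey())) {
                name = (String) value;
            } else if (value instanceof Map) {
                Map<?, ?> inner = (Map<?, ?>) value;
                values.put(entry.getKey(), new RandomValue(
                        (String) inner.get("constant-field-00"),
                        (String) inner.get("constant-field-01")));
            }
        }
        return new Pojo(name, values);
    }
}
```

Most JSON libraries offer a hook that achieves the same thing during deserialization, so the manual loop is only meant to show the shape of the mapping.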
This is not an answer as such, but may pique one's interest on the general topic.
For those interested in a different approach to JSON, one can look at the ITU's ASN.1. Think of it as a bit like Google Protocol Buffers, but with a whole load of different wire formats (including JSON, XML, and a whole load of binary wire formats with varying properties). Basically there's a wire format for every occasion and purpose.
This can be extremely useful sometimes. If you want to ship data around a distributed system, and parts of that system are, say, C on a microcontroller at the end of a low-bandwidth radio link, whilst other parts are Java on a server, then you can encompass all of your system messaging in a single schema (which acts as the single point of truth). From this you can (depending on the tools you use) generate plain old objects in C, C++, Java, C#, and even Ada and VHDL. Python is a notable omission (there are Python modules to do code-first ASN.1, which kind of misses the point).
Its use of JSON as a wire format is a relatively recent addition to the standard, but some of the commercial tools support it. For those who really need it, it can be a very useful tool.
From the point of view of this particular question, ASN.1 and the tools that support JSON as a wire format are not useful; you cannot take arbitrary JSON and automatically generate a schema that can be compiled to classes. Where it is useful is in a fresh project, where you want to make it easy for other languages / platforms to consume or generate your data.
I have looked for decent C# class generators that consume JSON schemas; unfortunately the best one out there didn't deal with oneOf. However, the tools I'm using for ASN.1 (which has the equivalent CHOICE) do generate complete classes in C#, Java, C/C++. So I'm in this amusing situation where I have an ASN.1 schema that I compile (in this particular project) to C# and C, and it deals with JSON, XML, and the binary wire formats. The generated classes are smart enough to do their own validation, so I don't have to pass the JSON through a JSON schema validator.
JSON schemas and ASN.1 schemas are broadly comparable in terms of the detail one can put into a specification. Similarly, ASN.1 schema and XSD XML schema are broadly equivalent (there's even an official standardised translation between the two languages). The benefit I've seen of using ASN.1 schema instead of JSON or XSD schema is that the tools (especially the commercial tools) seem to be significantly more thorough than the class generators typically associated with JSON and XSD schemas (e.g. Microsoft's xsd.exe sucks). This has had a positive knock-on effect on systems integration, maintenance, being agile with data definitions, etc.

Java CSON parser?

Is there already a CSON parser for Java? JSON is very difficult to hand-write (which I will be doing a lot of for this project) and I would rather not make my own CSON parser.
If not, is there another easy-to-use alternative to JSON that there is Java support for? The major reason I would like to avoid JSON is that I will be dealing with rather large multiline strings.
EDIT: I am referring to CoffeeScript Object Notation, not Cursive Script Object Notation.
YAML is supported by a ton of languages including Java and is very friendly with multiline strings.

Mustache kind of String replacement in Java

My application allows users to define a few templates for text etc. E.g. one of the shortcuts could be hi {{name}}, nice to meet you.
I have a complex JSON document which has name and a lot of nested JSON objects. I am looking for a good Mustache-like implementation in Java which can substitute the values from the JSON into the string. Currently I am iterating through each key and replacing the string, but I am looking for a more elegant solution which gives the users more power in their templating, like loops, conditions etc., similar to Mustache/Handlebars.
Though Mustache for Java looks good, I haven't seen any implementation which can do the replacement from JSON. All examples apply to a Java object, not to a JSON object. It looks to me as though it internally uses an object mapper to convert the object and somehow applies that.
Perhaps I can convert JSON into a map and provide it.
Probably I am missing something. Thanks.
You have to convert the JSON string to a Java object. You can use a nested Map, a Multimap, or create your own object to represent the structure.
You probably want to use a JSON deserializer to create a Java object from the JSON string. Good solutions are Jackson, Gson or Json-simple.
Once you have a correct Java representation of the JSON, you can use a template engine to do the string replacement. Known libraries are Freemarker, Velocity and StringTemplate.
Personally I recommend Jackson+Freemarker, but all are good solutions.
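To illustrate what "nested Map plus substitution" amounts to, here is a minimal sketch in plain Java. It is not how Freemarker, Velocity or any Mustache implementation works internally, and it supports neither loops nor conditions; the class TinyTemplate and the dotted-path syntax ({{user.name}}) are assumptions made for this example. It presumes the JSON has already been converted to nested Maps.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces {{key}} and {{outer.inner}} placeholders
// with values looked up in a nested Map.
final class TinyTemplate {

    private static final Pattern TAG =
            Pattern.compile("\\{\\{\\s*([\\w.]+)\\s*\\}\\}");

    static String render(String template, Map<String, ?> data) {
        Matcher matcher = TAG.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            Object value = lookup(data, matcher.group(1));
            // Unknown keys render as an empty string; quoteReplacement keeps
            // '$' and '\' in values from being treated as group references.
            String text = value == null ? "" : value.toString();
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Walks the dotted path one key at a time through nested Maps.
    private static Object lookup(Map<String, ?> data, String path) {
        Object current = data;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }
}
```

For anything beyond simple substitution (loops, conditions, partials), a real template engine fed with the same nested Map is the better route.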
Try Apache Velocity; it does something very similar for property substitution in text.
Chunk is a very JSON-friendly template engine. Loops & conditions, tag syntax is similar to Mustache, and you can reference nested associative arrays of data fairly elegantly right from the template.
See sample code for JSON + Chunk in this answer.

XJC - Generate xml using Xpath

After converting an XSD to Java objects using XJC, I would like to generate an XML file by supplying XPath expressions and the values to set at them.
Example:
Say I supply the XPath and the value as
customer/name = XXXXX_VALUE
It should internally assign the value on the generated objects ... CustomerType.setName() ..
The XML should then be generated as expected, following the XPath.
I know that in Castor we can do this using ClassDescriptor and FieldDescriptor, but I would like to know how to do this using XJC.
JXPath can be used to navigate JavaBeans via something similar to XPath expressions.
http://commons.apache.org/proper/commons-jxpath/
Specifically, when you supply a factory you can create objects. There are several situations that are not supported natively, but with a little bit of thought you can implement your own extension of createPathAndSetValue that can deal with your specific predicate logic.
http://commons.apache.org/proper/commons-jxpath/users-guide.html#Creating_Objects

Generate valid XML name in Java

Are there any helpers that will transform/escape a string into a valid XML name?
For example, I have the string max(OfAll) and need to generate some XML such as
<max(OfAll)>SomeText</max(OfAll)>
That's obviously not a valid name. Are there helper methods that can transform the string into a valid XML name?
(For comparison, .NET has methods that would turn the above XML fragment into:
<max_x0028_OfAll_x0029_>SomeText</max_x0028_OfAll_x0029_>)
The encoding in your .NET example looks like the one defined in ISO 9075. I don't think there is a built-in implementation in the JDK, but this encoding is also used by content repositories like Alfresco or Jackrabbit for their XML import/exports and query APIs. A quick search turned up these two implementations, both available under open source licenses:
http://www.docjar.com/html/api/org/apache/jackrabbit/util/ISO9075.java.html
http://kickjava.com/src/org/alfresco/util/ISO9075.java.htm
One class which may be of use in other situations is StringEscapeUtils in the Apache commons-lang project. It can escape text for use in XML documents, but I'm not aware of anything to escape XML element names.
Could you not generate something more readable, such as:
<aggregation type="max(OfAll)">SomeText</aggregation>
There are lots of libraries available to marshal/unmarshal objects to XML and back, including JAXB (bundled with the JDK up to Java 10), JiBX, Castor and XStream.
I don't know of any helper methods for that, but the rules at http://www.w3.org/TR/REC-xml/#NT-Name are pretty straightforward, so it should be easy to implement one.
As should be clear, normal XML escaping (replacing inappropriate characters with character entities) does not result in a valid XML identifier.
For the record, what you are doing is frequently called "name mangling".
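A hand-rolled mangler along those lines might look like the sketch below. It is an illustration under simplifying assumptions, not the ISO 9075 algorithm nor the Jackrabbit/Alfresco code: the class XmlNameEncoder is hypothetical, only ASCII letters, digits, '_', '-' and '.' are passed through (the real Name production allows many more Unicode characters), and every other UTF-16 code unit is written as _xHHHH_. It does not escape input that already contains text looking like _xHHHH_, and it does not treat names starting with "xml" specially.

```java
// Hypothetical helper: mangles an arbitrary string into a valid XML name
// using a conservative ASCII-only subset of the Name production.
final class XmlNameEncoder {

    static String encode(String raw) {
        if (raw.isEmpty()) {
            throw new IllegalArgumentException("An XML name cannot be empty");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            // The first character has stricter rules than the rest.
            boolean allowed = (i == 0) ? isNameStart(c) : isNameChar(c);
            if (allowed) {
                out.append(c);
            } else {
                out.append(String.format("_x%04X_", (int) c));
            }
        }
        return out.toString();
    }

    // ':' is legal in the grammar but left out here because of namespaces.
    private static boolean isNameStart(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
    }

    private static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    public static void main(String[] args) {
        System.out.println(encode("max(OfAll)"));
    }
}
```

Because the pass-through set is a strict subset of what XML allows, the output is always a valid name, at the cost of escaping some characters that could legally have been kept.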
