How to create a DTD for this XML? - java

I have been asked to create a simple DTD for this XML:
<?xml version='1.0' encoding='ISO-8859-1'?>
<QUERY>
<PORT>
<NB></NB>
</PORT>
<BLOCK>
<TAB></TAB>
</BLOCK>
<STAND>
<LEVEL></LEVEL>
</STAND>
</QUERY>
I am using Java. I have never written a DTD before, nor do I know precisely what it means.
I would like some guidance if possible, thank you.

A DTD (Document Type Definition) describes the structure of your XML document. Other schema languages include XML Schema, RELAX NG, etc.:
http://en.wikipedia.org/wiki/Document_Type_Definition
It will look something like the following (although my syntax may not be quite right):
<!ELEMENT QUERY (PORT, BLOCK, STAND)>
<!ELEMENT PORT (NB)>
<!ELEMENT NB (#PCDATA)>
<!ELEMENT BLOCK (TAB)>
<!ELEMENT TAB (#PCDATA)>
<!ELEMENT STAND (LEVEL)>
<!ELEMENT LEVEL (#PCDATA)>
If you look at the definition for QUERY you see it defines that it contains the elements: "PORT", "BLOCK", and "STAND". If you look at the definition for NB, we have declared that it should contain text (parsed character data).
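Since the question mentions Java, here is a minimal sketch of checking a document against the DTD above with the JDK's built-in DOM parser. The class name, the GOOD and BAD sample strings, and the choice to embed the DTD as an internal subset are assumptions made to keep the example self-contained; in practice the DTD would more likely live in its own file referenced from the DOCTYPE.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class DtdValidationSketch {

    // The DTD from the answer, placed in an internal subset so the sketch is self-contained.
    static final String DOCTYPE =
          "<!DOCTYPE QUERY [\n"
        + "<!ELEMENT QUERY (PORT, BLOCK, STAND)>\n"
        + "<!ELEMENT PORT (NB)>\n"
        + "<!ELEMENT NB (#PCDATA)>\n"
        + "<!ELEMENT BLOCK (TAB)>\n"
        + "<!ELEMENT TAB (#PCDATA)>\n"
        + "<!ELEMENT STAND (LEVEL)>\n"
        + "<!ELEMENT LEVEL (#PCDATA)>\n"
        + "]>\n";

    // Hypothetical sample documents: one matching the DTD, one missing BLOCK and STAND.
    static final String GOOD = "<QUERY><PORT><NB></NB></PORT><BLOCK><TAB></TAB></BLOCK>"
        + "<STAND><LEVEL></LEVEL></STAND></QUERY>";
    static final String BAD = "<QUERY><PORT><NB></NB></PORT></QUERY>";

    // Parses DOCTYPE + body with DTD validation on and returns the validation error messages.
    static List<String> validate(String body) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // enable DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("good: " + validate(GOOD));
        System.out.println("bad:  " + validate(BAD));
    }
}
```

Validation problems are reported through the ErrorHandler rather than thrown, which is why the sketch collects them in a list.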

XMLBeans comes with a tool called inst2xsd which can inspect an XML file and create an XSD for it that you can then edit/refine. I've used it with pretty good results.
Just read the installation guide for XMLBeans and when you install XMLBeans you'll have the inst2xsd tool installed as well.
edit - just realized you wanted a DTD and not an XSD, leaving this here in case an XSD (which is very similar in purpose) could actually solve your problem anyway

There are some DTD generators out there. Quick search yields this. Haven't used it, though.

Related

Add attribute to xml element not allowed in DTD

I have to use an external DTD that specifies that a certain element can only have an id attribute:
<!ELEMENT x (y | z)>
<!ATTLIST x id ID #IMPLIED>
So something like this is valid
<x id="x">...</x>
But if I try something like this:
<x id="x" custom="custom">...</x>
My parser gives me the following error:
Attribute "custom" must be declared for element type "x".
So I understand what the error says and why it's happening, but as I said, the DTD is external and sadly I can't change it. Is there a workaround or a hack that I can use to add my own custom attribute?
You can either disable DTD validation in your parser, or try declaring the extra attribute in an internal DTD subset of your document.
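The internal-subset option can be sketched as follows. XML allows more than one ATTLIST declaration for the same element, so an internal subset in the document's DOCTYPE can declare the extra attribute without touching the external DTD. The class name, the external.dtd system identifier, and the y and z element declarations are assumptions for illustration; an EntityResolver stands in for the real external DTD so the example runs without any files. This only helps if you control the DOCTYPE line of the instance document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class InternalSubsetSketch {

    // Stand-in for the external DTD that cannot be edited (y and z are assumed to be empty elements).
    static final String EXTERNAL_DTD =
          "<!ELEMENT x (y | z)>\n"
        + "<!ELEMENT y EMPTY>\n"
        + "<!ELEMENT z EMPTY>\n"
        + "<!ATTLIST x id ID #IMPLIED>\n";

    static final String BODY = "<x id=\"x\" custom=\"custom\"><y/></x>";

    // Only the external DTD: the custom attribute is undeclared.
    static final String WITHOUT_SUBSET = "<!DOCTYPE x SYSTEM \"external.dtd\">\n" + BODY;

    // Same external DTD plus an internal subset that declares the custom attribute.
    static final String WITH_SUBSET =
          "<!DOCTYPE x SYSTEM \"external.dtd\" [\n"
        + "<!ATTLIST x custom CDATA #IMPLIED>\n"
        + "]>\n" + BODY;

    // Validates the document and returns the validation error messages.
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Serve the external DTD from memory instead of the file system.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(EXTERNAL_DTD)));
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("without internal subset: " + validate(WITHOUT_SUBSET));
        System.out.println("with internal subset:    " + validate(WITH_SUBSET));
    }
}
```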

Parse XML in which tag name is not fixed

It is easy to parse XML in which the tag names are fixed. In XStream, we can simply use the @XStreamAlias("tagname") annotation. But how do you parse XML in which a tag name is not fixed? Suppose I have the following XML:
<result>
<result1>
<fixed1> ... </fixed1>
<fixed2> ... </fixed2>
</result1>
<result2>
<item>
<America>
<name> America </name>
<language> English </language>
</America>
</item>
<item>
<Spain>
<name> Spain </name>
<language> Spanish </language>
</Spain>
</item>
</result2>
</result>
Tag names America and Spain are not fixed and sometimes I may get other tag names like Germany, India, etc.
How do I define a POJO for the result2 tag in such a case? Is there a way to tell XStream to accept anything as the alias name if the tag name is not known beforehand?
If it is OK for you to get the country from inside the element itself (the 'name' field), then using XPath you can do:
//result2/item/*/name/text()
Another option could be to select the whole element, whatever it is called:
//result2/item/*
or, in XPath 2.0 and later, to get the element names themselves:
//result2/item/*/name()
Some technologies (specifically, data binding approaches) are optimized for handling XML whose structure is known at compile time. Others (like DOM and other DOM-like tree models such as JDOM and XOM) are designed for handling XML whose structure is not known in advance. Use the right tool for the job.
XSLT and XQuery try to blend both. In their schema-aware form, they can take advantage of static structure information when it is available. But more usually they are run in "untyped" mode, where there is no a-priori knowledge of element names or structure, and everything is handled as it comes. The XSLT rule-based processing paradigm is particularly well suited to "semi-structured" XML whose content is unpredictable or variable.
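As a rough illustration of the tree-model approach in Java, the sketch below uses the JDK's DOM and XPath 1.0 APIs to walk the children of each item element without knowing their names in advance. The class and method names are made up for this example, and the sample string is a trimmed version of the XML in the question.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class UnknownTagsSketch {

    // Trimmed version of the question's XML; the country tag names are not known to the code.
    static final String SAMPLE = "<result><result2>"
        + "<item><America><name> America </name><language> English </language></America></item>"
        + "<item><Spain><name> Spain </name><language> Spanish </language></Spain></item>"
        + "</result2></result>";

    // Maps each variable tag name under result2/item to the text of its language child.
    static Map<String, String> languagesByTag(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // "*" matches any element, so the country name does not have to be known.
            NodeList countries = (NodeList) xpath.evaluate(
                "//result2/item/*", doc, XPathConstants.NODESET);
            for (int i = 0; i < countries.getLength(); i++) {
                Element country = (Element) countries.item(i);
                String language = xpath.evaluate("language", country).trim();
                result.put(country.getTagName(), language);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(languagesByTag(SAMPLE)); // prints {America=English, Spain=Spanish}
    }
}
```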

Missing NameSpace Information In XML file using EXIficient

I am using EXIficient to convert XML data to EXI and back to XML. Here, I use their EXIficientDemo class. Sample code:
EXIficientDemo sample = new EXIficientDemo();
sample.parseAndProofFileLocations("FilePath");
sample.codeSchemaLess();
It first converts the XML file to EXI and then back to XML. When it generates XML from the previously generated EXI file, it loses some namespace information.
Actual XML file:
<?xml version="1.0" encoding="utf-8"?>
<tt xml:lang="ja" xmlns="http://www.w3.org/ns/ttml"
xmlns:tts="http://www.w3.org/ns/ttml#styling"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<body>
<div>
<p xml:id="s1">
<span tts:origin="somethings">somethings</span>
</p>
</div>
</body>
</tt>
Generated XML file by EXIficient:
<?xml version="1.0" encoding="UTF-8"?>
<ns3:tt xmlns:ns3="http://www.w3.org/ns/ttml"
xml:lang="ja" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ns3:body><ns3:div>
<ns3:p xml:id="s1">
<ns3:span xmlns:ns4="http://www.w3.org/ns/ttml#styling"
ns4:origin="somethings">somethings</ns3:span>
</ns3:p>
</ns3:div></ns3:body>
</ns3:tt>
In the generated XML file, the declaration xmlns:tts="http://www.w3.org/ns/ttml#styling" is missing from the root element.
How can I fix this problem? If you can, please help me.
EXIficient may be suppressing unused namespaces. Your example doesn't show any use of the ttm namespace.
As you can see, it didn't retain the namespace prefix for the ttml namespace either (changed to ns3). The generated XML is perfectly valid if the ttml#metadata namespace is unused.
Update
With the updated question, where namespace ttml#styling is used by the origin attribute of the span element, the namespace is retained in the rebuilt XML, but it has been moved to the span element.
This is still a perfectly valid XML document.
A namespace declaration (xmlns) can appear on any element in an XML document, and it applies to the element on which it appears and all of its descendants (unless overridden, which is very unusual).
The same namespace can be declared many times on different elements. For simplicity and/or optimization, it is common to declare all namespaces up front, on the root element, using different prefixes, but it is not required to do so.
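To make the equivalence concrete, here is a small sketch using the JDK's namespace-aware DOM parser. It reads the styling attribute from a shortened form of the original document and of the rebuilt one; the class and method names are assumptions for this example. A namespace-aware consumer looks attributes up by namespace URI and local name, so neither the prefix nor the element carrying the declaration affects the result.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NamespaceEquivalenceSketch {

    static final String TTML = "http://www.w3.org/ns/ttml";
    static final String STYLING = "http://www.w3.org/ns/ttml#styling";

    // Shortened original: both namespaces declared on the root element.
    static final String ORIGINAL =
          "<tt xml:lang=\"ja\" xmlns=\"" + TTML + "\" xmlns:tts=\"" + STYLING + "\">"
        + "<body><div><p xml:id=\"s1\"><span tts:origin=\"somethings\">somethings</span></p></div></body>"
        + "</tt>";

    // Shortened rebuilt form: different prefixes, styling namespace declared on the span.
    static final String REBUILT =
          "<ns3:tt xmlns:ns3=\"" + TTML + "\" xml:lang=\"ja\">"
        + "<ns3:body><ns3:div><ns3:p xml:id=\"s1\">"
        + "<ns3:span xmlns:ns4=\"" + STYLING + "\" ns4:origin=\"somethings\">somethings</ns3:span>"
        + "</ns3:p></ns3:div></ns3:body></ns3:tt>";

    // Reads the styling-namespace "origin" attribute of the first TTML span element.
    static String origin(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element span = (Element) doc.getElementsByTagNameNS(TTML, "span").item(0);
            return span.getAttributeNS(STYLING, "origin");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(origin(ORIGINAL));
        System.out.println(origin(REBUILT));
    }
}
```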
I read this question by accident and, unfortunately, rather late.
I am answering just in case people are still struggling with this and are wondering what they can do.
As was pointed out, EXIficient behaves just fine with regard to namespace handling.
Having said that, the EXI specification allows one to preserve prefixes and namespaces (see Preserve Options).
In EXIficient one can set these options accordingly on the EXIFactory instance (called exiFactory here), e.g.,
exiFactory.getFidelityOptions().setFidelity(FidelityOptions.FEATURE_PREFIX, true);

how to reference the path to a DTD value in XML

I am a newbie when it comes to XML and DTD values, so forgive me if this is a simple question or if I am going about this in the wrong way. Can you specify a DTD value in the same way you can specify a path to a property in XML?
For instance, if you have the XML file below:
<!DOCTYPE ... SYSTEM "<path_to_file>">
<BOOK>
<AUTHOR>
<FIRST>John</FIRST>
<LAST>Quncy</LAST>
</AUTHOR>
<NAME>blah</NAME>
<DATE>12/23/13</DATE>
</BOOK>
You could specify the first name of the author by the path:
/BOOK/AUTHOR/FIRST
Is there any syntax to specify a DTD entity like the DOCTYPE in the same way?
Ultimately what I would like to do is use an in house XML parser already written in java to find a DTD entry that I specify and delete it from the XML file. For instance, with the above XML, I would like to specify DOCTYPE and have it removed from the XML. There is already code in place that, given a path, will delete that section from the XML file. I would like to leverage that to also delete DTD entries as well, but I have no idea how to reference it.
No. DOCTYPE is a parsing and validation directive. That is, DOCTYPE and DTD affect parsing and validation but are not part of the document as separate entities after that. The XML data model does not contain DOCTYPE or DTD definitions, and they practically don't exist after the document has been parsed.
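If the underlying goal is only to strip the DOCTYPE, one possible route that does not need a path expression is the DOM API, which exposes the declaration as a DocumentType node even though XPath cannot address it. The sketch below uses only JDK classes and is not based on the in-house parser from the question; the class name and the book.dtd system identifier are hypothetical, and an EntityResolver returns an empty DTD so that no file is needed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

final class StripDoctypeSketch {

    // Hypothetical input; "book.dtd" stands in for the elided path in the question.
    static final String SAMPLE =
          "<!DOCTYPE BOOK SYSTEM \"book.dtd\">\n"
        + "<BOOK><AUTHOR><FIRST>John</FIRST><LAST>Quncy</LAST></AUTHOR>"
        + "<NAME>blah</NAME><DATE>12/23/13</DATE></BOOK>";

    // Parses the XML, removes the DocumentType node if present, and serializes the rest.
    static String stripDoctype(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The DTD file is not available here, so resolve it to an empty stream.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            DocumentType doctype = doc.getDoctype();
            if (doctype != null) {
                doc.removeChild(doctype);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripDoctype(SAMPLE));
    }
}
```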

Getting started with JAXB

Schema Root
<xs:schema jxb:version="1.0"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/XMLSchema
http://www.nubean.com/schemas/schema.xsd" >
<xs:element name="UsOrCanadaAddress" >
JAXB Binding XML
<?xml version='1.0' encoding='utf-8' ?>
<jxb:bindings version="1.0"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns:xs="http://www.w3.org/2001/XMLSchema" >
<jxb:bindings node="/xs:schema" schemaLocation="address.xsd" >
<jxb:schemaBindings>
<jxb:package name="com.apress.jaxb1.example" ></jxb:package>
</jxb:schemaBindings>
</jxb:bindings>
</jxb:bindings>
I am beginning with JAXB and these are the two tags I came across in the books.
I have a few basic questions regarding the various parts of the two tags. Here they go:
Question 1:
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
Does this attribute have to have the exact same value?
Question 2:
xsi:schemaLocation="http://www.w3.org/2001/XMLSchema
http://www.nubean.com/schemas/schema.xsd"
This attribute, in the schema... in the schema??? I mean, I can understand that attribute in an XML document pointing to an XML schema, but this? What does it do, if not trigger a schema-ception?
Also, the namespace-location pair: in an XML document it would point to a physical location. Here, does it have to point to a physical location?
Question 3:
The word binding. In my head I understand bindings as settings that you can change in mobile or computer apps. They have default values which you can change. In the above binding document, they have changed the package setting. Now, assuming that I do not want to put the generated classes in any package, should I leave that as it is?
Will I then not need to write that binding XML document at all?
Question 4:
In the JAXB binding document, schemaLocation="address.xsd" points to the schema location. Now, that is the physical location. What if my schema was packed in a JAR file?
Question 1 - Does this attribute have to have the exact same value?
A JAXB (JSR-222) implementation expects the elements in the binding file to be qualified with the "http://java.sun.com/xml/ns/jaxb" namespace. It does not depend on a particular prefix being used.
Question 2 - This attribute, in the schema.. in the schema ???
Since the XML schema is an XML document I guess doing this is ok, but I have never done this in an XML schema myself.
Question 3 - The word binding.
I have a kind of love/hate relationship with the word "binding". It has come to be associated with converting objects to/from data formats that aren't necessarily persistent (i.e. XML, JSON, etc).
Question 4 - In the JAXB binding document schemaLocation="address.xsd" points to the schema location.
I do not believe that the schemaLocation is required in the bindings file.
Since you are just getting started with JAXB you may not want to get hung up on the binding document. It's only needed when you need to customize the classes generated from an XML schema. Below is an example where it is not needed:
http://blog.bdoughan.com/2010/09/processing-atom-feeds-with-jaxb.html
What I find to be the more interesting use case is starting from objects. Below is an example you may find useful:
http://wiki.eclipse.org/EclipseLink/Examples/MOXy/GettingStarted
