I'm writing a custom Spring namespace handler (Java). If the XML is invalid, I'd like to report an error message that includes the line number (in the parsed document), so that the user knows where to look. However, I don't know how to retrieve the line number from DOM objects or otherwise.
Note that I'm talking about errors that are not discovered by XSD validation (those report line numbers correctly).
Is it even possible to get such information from inside a namespace handler?
Thanks,
Ondrej
If you are using a SAX parser, you can extend the DefaultHandler class and store the Locator that the parser passes to the setDocumentLocator method. The parser keeps that locator up to date as events occur, so you can call its getLineNumber() method inside any callback to obtain the line number of interest.
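As a rough illustration of the locator approach, here is a minimal sketch. The class name LineTrackingHandler and the parse helper are made up for this example, and it only shows the plain SAX mechanism; it assumes you control the parsing yourself, which may not be the case inside a Spring namespace handler that is handed ready-made DOM elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: remembers the Locator and records the line of each start tag.
class LineTrackingHandler extends DefaultHandler {
    private Locator locator;
    final List<String> seen = new ArrayList<>();

    @Override
    public void setDocumentLocator(Locator locator) {
        // Called by the parser before startDocument(); keep the reference for later.
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The locator reports the position where the current event ends.
        int line = (locator != null) ? locator.getLineNumber() : -1;
        seen.add(qName + "@" + line);
    }

    // Convenience helper (an assumption of this sketch): parse a string and return the records.
    static List<String> parse(String xml) throws Exception {
        LineTrackingHandler handler = new LineTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }
}
```

The same stored locator can be used when building your own error message, for example by appending locator.getLineNumber() to the text of the exception you throw.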
Assume the following:
I have a set of XSD schemas S, each with distinct namespace URIs.
I know that I'm going to be receiving an XML document containing a root element that contains exactly one namespace declaration that refers to a member of S. I can abort parsing immediately with an error if I don't receive exactly one namespace declaration, or if the received namespace doesn't refer to any schema in S.
I want to parse the incoming XML document with a SAX parser, and I want to validate the incoming document during parsing against one of the schemas in S. I know from the above that the first call I'm going to see in the ContentHandler that I give to the parser will be a call to startPrefixMapping when the parser encounters the namespace declaration.
Is it possible to, in the startPrefixMapping call, pick one of the schemas in S for validation once I know which one I need?
It seems that I could maybe call setSchema on the parser inside the startPrefixMapping call, but I get the feeling from the API documentation that I'm not supposed to do this (and that it may be too late to call the method at that point anyway).
Is there some other way to supply a set of schemas to the parser and perhaps have it pick the right one itself based on the namespace declaration it receives?
Edit: I was wrong, it's not just inadvisable to call setSchema on a parser once parsing has started - it's actually impossible. Parsers don't expose a setSchema call, only parser factories do. This means that my options are limited to those that can allow the parser to select a schema for itself. Unfortunately, that has its own problems: It's not possible for an XML document to merely specify a namespace, it also has to specify a filename for the intended schema (which in my opinion is an implementation detail on the parser side and should not be required of the incoming data) and the parser has to intercept the request for this filename to supply a member of S for validation.
Edit: I've solved this. I've put together some heavily-commented public domain example code here that looks up schemas based on pre-specified systemIds, and the schemas are delivered programmatically (so they can be served from databases, class resources, etc). It correctly rejects any document that specifies an unknown schema, specifies no schema, or tries to specify its own schemaLocation to try to fool the validator.
https://github.com/io7m/xml-schema-lookup-example
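For the narrower question of letting the parser pick a schema by namespace, one option is to compile all members of S into a single Schema with SchemaFactory.newSchema(Source[]) and give that to the parser factory. The sketch below is not the linked example code; the class name, the two tiny inline schemas and the urn:example:* namespaces are all invented for illustration, and it makes no claim about how schemaLocation hints in the document are treated.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class MultiSchemaValidation {
    // Hypothetical stand-in for a member of S: one 'root' element in the given namespace.
    static String xsd(String ns) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " targetNamespace='" + ns + "' elementFormDefault='qualified'>"
             + "<xs:element name='root' type='xs:string'/></xs:schema>";
    }

    // Compile every schema in the set into one Schema object.
    static Schema combinedSchema() throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Source[] sources = {
            new StreamSource(new StringReader(xsd("urn:example:a")), "a.xsd"),
            new StreamSource(new StringReader(xsd("urn:example:b")), "b.xsd")
        };
        return sf.newSchema(sources);
    }

    // Parse with a validating SAX parser; any validation error aborts the parse.
    static boolean isValid(Schema schema, String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setSchema(schema); // must be set on the factory, before the parser is created
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) throws SAXException {
                        throw e; // DefaultHandler ignores errors unless we rethrow
                    }
                });
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

With this setup a root element in a namespace that has no declaration in the combined schema is reported as a validation error, so documents outside S are rejected without the handler having to choose a schema itself.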
When marshalling a JAXB object, I want to set some default value in the resulting XML.
I do not want to use nillable=true, as it generates an empty tag with an unnecessary xsi:nil="true", and that is not meant for setting default values anyway. Instead I want to generate the XML with some placeholder character such as '?'.
Use case : I am going to build a tool for WebService testing. There I need to present the entire request xml to the user (Like SOAPUI).
The idea of the place holder character isn't really going to work. For example ? is an ok default value for a string, but not an int, boolean, or for most complex values (i.e. representing the nested address information for a customer). Instead you will want a value that reflects the type.
Then I would have to write large and complex reflection-based code. Just assume that is almost not possible in my case.
This reflection code probably won't be as bad as you are imagining. A quick search will probably also reveal libraries that populate objects with "dummy" data. When hooking it up with JAXB you could leverage a Marshaller.Listener to populate the object on the before marshal event.
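To give a feel for the size of such reflection code, here is a minimal sketch that fills only null fields with a type-appropriate placeholder. PlaceholderFiller and Customer are hypothetical names, only three field types are handled, and nested objects are left alone; it is a starting point rather than a complete solution.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class PlaceholderFiller {
    // Give every null, non-static, non-final field a placeholder that matches its type.
    static void fill(Object target) throws IllegalAccessException {
        for (Field field : target.getClass().getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isFinal(mods)) {
                continue;
            }
            field.setAccessible(true);
            if (field.get(target) != null) {
                continue; // already populated (primitives are never null)
            }
            Class<?> type = field.getType();
            if (type == String.class) {
                field.set(target, "?");
            } else if (type == Integer.class) {
                field.set(target, 0);
            } else if (type == Boolean.class) {
                field.set(target, Boolean.FALSE);
            }
            // Other types (dates, nested complex objects, ...) are left null in this sketch.
        }
    }
}

// Hypothetical bean used only to demonstrate the filler.
class Customer {
    String name;
    Integer age;
    Boolean active;
    String city = "Prague";
}
```

A call such as PlaceholderFiller.fill(source) could then be made from the beforeMarshal(Object source) callback of a Marshaller.Listener, so that objects are populated just before they are written.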
So I searched around quite a bit for a solution to this particular issue and I am hoping someone can point me in a good direction.
We are receiving data as XML, and we only have XSD to validate the data. So I used JAXB to generate the Java classes. When I went to unmarshal a sample XML, I found that some attribute values are missing. It turns out that the schema expects those attributes to be QName, but the data provider didn't define the prefix in the XML.
For instance, one XML attribute value is "repository:<uuid>", but the namespace prefix "repository" is never defined in the dataset. (Never mind the provider's best practices suggest defining it!)
So when I went to unmarshal a sample set, the QName attributes with the specified prefix ("repository" in my sample above) are NULL! So it looks like JAXB is "throwing out" those attribute QName values which have undefined namespace prefix. I am surprised that it doesn't preserve even the local name.
Ideally, I would like to maintain the value as is, but it looks like I can't map the QName to a String at binding time (Schema to Java).
I tried "manually" inserting a namespace definition to the XML and it works like a charm. What would be the least complicated method to do this?
Is there a way to "insert" namespace mapping/definition at runtime? Or define it "globally" at binding time?
The simplest would be to use strings instead of QName. You can use the javaType customization to achieve this.
If you want to add prefix/namespace mappings at runtime, there are quite a few ways to do it:
Similar to above, you could provide your own QName converter which would consider your prefixes.
You can put a SAX or StAX filter in between and declare additional prefixes in the startDocument.
What you actually need is to add your prefix mappings into the UnmarshallingContext.environmentNamespaceContext. I've checked the source code but could not find a direct and easy way to do it.
I personally would implement a SAX/StAX filter to "preprocess" your XML on the event level.
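A minimal sketch of such a SAX filter, assuming the missing prefix is "repository" and using a made-up namespace URI (urn:example:repository). The class name ExtraPrefixFilter and the mappingsSeen helper exist only for this example; the helper just records which prefix mappings reach the downstream handler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Declares one extra prefix mapping for the whole document.
class ExtraPrefixFilter extends XMLFilterImpl {
    private final String prefix;
    private final String uri;

    ExtraPrefixFilter(XMLReader parent, String prefix, String uri) {
        super(parent);
        this.prefix = prefix;
        this.uri = uri;
    }

    @Override
    public void startDocument() throws SAXException {
        super.startDocument();
        super.startPrefixMapping(prefix, uri); // forwarded to the downstream handler
    }

    @Override
    public void endDocument() throws SAXException {
        super.endPrefixMapping(prefix);
        super.endDocument();
    }

    // Demo helper: returns the "prefix=uri" mappings the downstream handler was told about.
    static List<String> mappingsSeen(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        ExtraPrefixFilter filter =
                new ExtraPrefixFilter(reader, "repository", "urn:example:repository");
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String p, String u) {
                seen.add(p + "=" + u);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

To use it with JAXB, the content handler would be the one returned by unmarshaller.getUnmarshallerHandler() instead of the recording handler above. Whether JAXB then resolves the QName values as hoped should be verified against your data.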
I did some research, looked at the table at the bottom here (1) and I am trying to find out what kind of API I should use.
Let me introduce the problem my app is going to solve:
My application listens to observer events fired from all places (e.g. events from CDI) in an observer class. In that class, there are methods which observe these events.
I need to construct an XML file on the fly as these events are being observed. More concretely, when I observe the event "start", I need to create this XML:
<start></start>
After that when I observe some other event, like "installed" (does not matter how it is called really), I need to have this structure:
<start><installed></installed></start>
Every time I observe an event, I need to be able to write that XML representation to an external file. Summing up, it seems I cannot use SAX, because SAX only parses XML documents, whereas I need to write or construct them. That leaves StAX or DOM. StAX is "forward only", which I do not quite understand, but when I use the StAX API it behaves like this (2), and being "forward" means I am forced to start and end elements manually, which does not fit my case. I do not know when document generation will end; I just need to have valid XML at every point so that I can write it out.
However, there is this method (3) which says that when I call it, it automatically closes all elements. So e.g. when I have this:
<a>
<b></b>
<c>
<d>
</d>
and I call writeEndDocument(), does that mean that it automatically closes "c" and "a"?
(1) http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.6/tutorial/doc/SJSXP2.html
(2) http://docs.oracle.com/javase/tutorial/jaxp/stax/example.html#bnbgx
(3) http://docs.oracle.com/javase/6/docs/api/javax/xml/stream/XMLStreamWriter.html#writeEndDocument()
I recommend using the following XML libraries (ordered by recommendation; only use the next one if the one before doesn't suit your needs):
JAXB (work with objects rather than XML)
StAX (lower level than JAXB)
SAX (only for reading; should be rarely used now with JAXB and StAX available)
DOM (should be rarely used now with JAXB and StAX available)
Do not use lower level XML techniques (either SAX or DOM) unless you really need them. I believe that this is not the case.
Use JAXB. Create a class that represents your events. Every time you get an event, create an instance of this class and populate its fields. Every time you have to create XML, just marshal the instance(s) to any stream you want (file, socket, whatever).
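On the writeEndDocument() part of the question: the XMLStreamWriter documentation says the method closes any start tags and writes the corresponding end tags. A small sketch that reproduces the a/b/c/d situation from the question (the class name AutoCloseDemo is invented for the example):

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class AutoCloseDemo {
    // Writes <a><b/><c><d/> and then relies on writeEndDocument() to close 'c' and 'a'.
    static String snapshot() throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("a");
        writer.writeStartElement("b");
        writer.writeEndElement();   // closes b
        writer.writeStartElement("c");
        writer.writeStartElement("d");
        writer.writeEndElement();   // closes d
        writer.writeEndDocument();  // closes the still-open c and a
        writer.close();
        return out.toString();
    }
}
```

Keep in mind that once the open elements have been closed this way, a later event that belongs inside them cannot simply be appended; the document would have to be written again from the recorded events.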
While working with JAXB 2.0 I came across a problem that I have been unable to solve so far. When doing validation I have two options:
1) Throw an exception as soon as I find the first error, and stop there.
2) Continue past any error or warning. I assume this is the better way, since it lets me show all errors and warnings for the whole XML document.
However, since this process also involves unmarshalling, my XML is unmarshalled into the corresponding objects even if there is an error or warning, which is wasted work.
My question: is there a way to do the whole validation first, and only if it succeeds bind the XML to the corresponding POJO classes?
Thanks in advance.
You can use the javax.xml.validation APIs to validate an XML document against an XML schema. Then, if validation succeeds, you can choose to unmarshal the document using JAXB.
Below is an example of using these APIs. In this example the input is actually an object model, but you can adapt it to work with any XML input.
http://bdoughan.blogspot.com/2010/11/validate-jaxb-object-model-with-xml.html
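For plain XML input, the validate-first step might look roughly like the sketch below. It collects every warning and error instead of stopping at the first one, and leaves the unmarshalling to the caller. The class name, the inline XSD and the person/name/age element names are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidateBeforeUnmarshal {
    // Hypothetical schema used only for this demonstration.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='person'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='age' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Returns all warnings and errors found; an empty list means the document is valid.
    static List<String> validate(String xsd, String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        final List<String> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
            @Override
            public void error(SAXParseException e) {
                // Record and continue, so that all errors in the document are reported.
                problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
            @Override
            public void fatalError(SAXParseException e) throws SAXException {
                throw e; // not well-formed; nothing sensible to continue with
            }
        });
        validator.validate(new StreamSource(new StringReader(xml)));
        return problems;
    }
}
```

The caller would invoke the JAXB Unmarshaller only when the returned list is empty, and otherwise show the collected messages to the user.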