JAXP: How to force XPath to validate namespace prefixes? - java

I am relying on the default JAXP implementation and using the Oracle JRE.
When evaluating an XPath expression that contains an unknown namespace prefix, it does not throw the exception I would expect.
When I run the same application on an IBM JRE, everything is fine and it throws the expected exception: javax.xml.xpath.XPathExpressionException: org.apache.xpath.domapi.XPathStylesheetDOM3Exception: Prefix must resolve to a namespace
I am using the following code, which tries to access the undeclared namespace prefix unknowns:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory
.newInstance();
documentBuilderFactory.setNamespaceAware(true);
documentBuilderFactory.setValidating(true);
documentBuilderFactory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
DocumentBuilder builder = documentBuilderFactory.newDocumentBuilder();
Document doc = builder.parse(xmlFile_);
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xpath.evaluate("path/to/node/unknowns:@bla", doc,
XPathConstants.NODESET);
Question:
How can I enforce this validation independently from the JAXP implementation?

Try setting a NamespaceContext that binds no prefixes on your XPath instance:
public final class NSValidator {
    private NSValidator() {
    }
    private static final NamespaceContext INSTANCE = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return null;
        }
        @Override public String getPrefix(String namespaceURI) {
            return null;
        }
        @Override public Iterator<String> getPrefixes(String namespaceURI) {
            return Collections.<String>emptyList()
                    .iterator();
        }
    };
    public static NamespaceContext noNamespaces() {
        return INSTANCE;
    }
}
Register it before evaluating the expression: xpath.setNamespaceContext(NSValidator.noNamespaces());
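If the check has to behave the same on every JAXP implementation, one option is to verify the prefixes yourself before the expression reaches the XPath engine. The following is only a sketch of that idea: the class name `PrefixCheck`, both regular expressions, and the rule that a prefix counts as unbound when `getNamespaceURI` returns null or an empty string are assumptions of this example, and the regex is a heuristic rather than a full XPath tokenizer.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.namespace.NamespaceContext;

final class PrefixCheck {
    // Matches "prefix:" only when a name character or "*" follows, so axis
    // separators such as "child::" are skipped. Heuristic, not an XPath lexer.
    private static final Pattern PREFIX =
            Pattern.compile("([A-Za-z_][A-Za-z0-9_.-]*):(?=[A-Za-z_*])");
    // Quoted string literals, which may legitimately contain colons.
    private static final Pattern LITERAL = Pattern.compile("\"[^\"]*\"|'[^']*'");

    private PrefixCheck() {
    }

    /** Returns the prefixes used in the expression that the context does not bind. */
    static Set<String> unboundPrefixes(String expression, NamespaceContext context) {
        // Blank out string literals first so 'a:b' inside quotes is not reported.
        String stripped = LITERAL.matcher(expression).replaceAll("''");
        Set<String> unbound = new LinkedHashSet<String>();
        Matcher matcher = PREFIX.matcher(stripped);
        while (matcher.find()) {
            String prefix = matcher.group(1);
            String uri = context.getNamespaceURI(prefix);
            if (uri == null || uri.isEmpty()) {
                unbound.add(prefix);
            }
        }
        return unbound;
    }

    /** Minimal map-backed context; prefixes missing from the map resolve to null. */
    static NamespaceContext contextOf(final Map<String, String> bindings) {
        return new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return bindings.get(prefix);
            }
            @Override public String getPrefix(String namespaceURI) {
                return null;
            }
            @Override public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyList().iterator();
            }
        };
    }
}
```

If the returned set is not empty, you could throw an XPathExpressionException yourself before calling xpath.evaluate, which would give the same failure regardless of which XPath engine is on the classpath.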

Related

Unable to fetch nodes from xml even with namespace management in place in java

Below is a screenshot of the XML file I am working with. I need to get the value 'switchboardid1' from the Extensions tag:
Below is the code I have written. I need to access the property 'switchboardid1' from the Extensions tag, but I always get null in return. Please correct my code and help me understand.
My NamespaceContext implementation, 'HardcodedNamespaceResolver', correctly returns the URI of the nfh namespace.
public void test() throws Throwable
{
String xpath="//ElectricalProject/Equipments/Equipment/Extensions/Extension/nfh:extensionProperty[@name='switchboardId']";
Node node = GetNodeFromXml("PutNFInProj.xml",xpath);
Element ele = (Element) node;
System.out.println(ele.getNodeValue().toString());
}
//Function to GET a single Node from xml file wrt to xpath defined
public Node GetNodeFromXml(String XmlFileName, String xPathExpression) throws Throwable
{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(GetDataFile(XmlFileName));
((org.w3c.dom.Document) doc).getDocumentElement().normalize();
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new HardcodedNamespaceResolver());
NodeList nodeList = (NodeList) xPath.evaluate(xPathExpression,doc,XPathConstants.NODESET);
switch (nodeList.getLength())
{
case 0:
{
log.error("In Function: GetNodeFromXml - There are no nodes with respect to given xpath, Please Check the Xpath");
return null;
}
case 1:
{
Node nNode = nodeList.item(0);
return nNode;
}
default:
{
log.error("In Function: GetNodeFromXml- There are more than one nodes with respect to given xpath");
return null;
}
}
}
}
SimpleXml can do it:
final String yourxml = ...
final SimpleXml simple = new SimpleXml();
System.out.println(getSwitchBoardId(simple.fromXml(yourxml)));
private static String getSwitchBoardId(final Element element) {
return element.children.get(2).children.get(0).children.get(4).children.get(0).children.get(0).text;
}
Will output:
switchboardid1
From maven central:
<dependency>
<groupId>com.github.codemonstur</groupId>
<artifactId>simplexml</artifactId>
<version>1.4.0</version>
</dependency>
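With the standard JAXP API, two details in the question's code are common causes of a null result: the DocumentBuilderFactory is not namespace-aware, and getNodeValue() is defined to return null for element nodes. Here is a minimal sketch of both fixes. The real XML is only available as a screenshot, so the sample document and the namespace URI `urn:example:nfh` are made up for illustration, as is the class name `ExtensionLookup`.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ExtensionLookup {
    // Hypothetical namespace URI; use the one declared in the real file.
    static final String NFH_NS = "urn:example:nfh";

    // Made-up stand-in for the document shown in the screenshot.
    static final String SAMPLE =
            "<Extensions xmlns:nfh='" + NFH_NS + "'><Extension>"
          + "<nfh:extensionProperty name='switchboardId'>switchboardid1"
          + "</nfh:extensionProperty></Extension></Extensions>";

    private ExtensionLookup() {
    }

    static String switchboardId(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Needed so that prefixed steps in the XPath can match.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "nfh".equals(prefix) ? NFH_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String namespaceURI) {
                    return null;
                }
                @Override public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.<String>emptyList().iterator();
                }
            });
            Element element = (Element) xpath.evaluate(
                    "//Extension/nfh:extensionProperty[@name='switchboardId']",
                    doc, XPathConstants.NODE);
            // getNodeValue() is null for elements; getTextContent() returns the text.
            return element == null ? null : element.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read switchboard id", e);
        }
    }
}
```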

XPath expression unable to match

I'm trying to parse out some information from XML using XPath in Java (v 1.7). My XML looks like this:
<soap:Fault xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>code</faultcode>
<faultstring>string</faultstring>
<detail>detail</detail>
</soap:Fault>
My code:
final InputSource inputSource = new InputSource(new StringReader(xmlContent));
final DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
final Document document = documentBuilder.parse(inputSource);
final XPath xPath = XPathFactory.newInstance().newXPath();
final String faultCode = xPath.compile("/soap:Fault/faultcode/text()[1]").evaluate(document);
I have tried the XPath expression in an online checker with the XML content and it suggests that a match is made. However, when I run it in a wee stand-alone program, I get no value in "faultCode".
This issue is probably something simple, but I am unable to identify what the problem is.
Thanks for any assistance.
You should bind the namespace prefix "soap" to the URI "http://schemas.xmlsoap.org/soap/envelope/" using the XPath.setNamespaceContext() method.
First you need a namespace aware document builder factory:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Then you need a namespace context:
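NamespaceContext nsContext = new NamespaceContext() {
    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        return null;
    }
    @Override
    public String getPrefix(String namespaceURI) {
        return "soap";
    }
    @Override
    public String getNamespaceURI(String prefix) {
        // Bind only the "soap" prefix; any other prefix stays unbound.
        // XMLConstants is javax.xml.XMLConstants.
        return "soap".equals(prefix)
                ? "http://schemas.xmlsoap.org/soap/envelope/"
                : XMLConstants.NULL_NS_URI;
    }
};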
xPath.setNamespaceContext(nsContext);
With these additions, your code should work.
Regarding namespace contexts, I suggest you read http://www.ibm.com/developerworks/xml/library/x-nmspccontext/index.html.
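Put together, the two steps can be sketched as one self-contained method. The class name `SoapFaultReader` and the choice to wrap checked exceptions in an IllegalStateException are assumptions made for this example; the XPath expression and the namespace URI come from the question.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SoapFaultReader {
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    private SoapFaultReader() {
    }

    static String faultCode(String xmlContent) {
        try {
            // Step 1: namespace-aware parsing.
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document document = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlContent)));

            // Step 2: bind the "soap" prefix used in the expression.
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "soap".equals(prefix) ? SOAP_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String namespaceURI) {
                    return SOAP_NS.equals(namespaceURI) ? "soap" : null;
                }
                @Override public Iterator<String> getPrefixes(String namespaceURI) {
                    String prefix = getPrefix(namespaceURI);
                    return prefix == null
                            ? Collections.<String>emptyList().iterator()
                            : Collections.singletonList(prefix).iterator();
                }
            });
            return xPath.evaluate("/soap:Fault/faultcode/text()[1]", document);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate fault code", e);
        }
    }
}
```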
The problem may be caused by namespaces. You can match the elements independently of their prefix by testing local-name() and namespace-uri(), using the syntax below:
/*[local-name()='Fault' and namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']/*[local-name()='faultcode']/text()[1]
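A short sketch of how that expression could be evaluated without any NamespaceContext. The class name `LocalNameQuery` is made up for the example, and the parser is still set to be namespace-aware here, on the assumption that local-name() and namespace-uri() need the namespace information from the DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class LocalNameQuery {
    // No prefix appears in the expression, so no NamespaceContext is registered.
    private static final String EXPRESSION =
            "/*[local-name()='Fault' and "
          + "namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']"
          + "/*[local-name()='faultcode']/text()[1]";

    private LocalNameQuery() {
    }

    static String faultCode(String xmlContent) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document document = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlContent)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(EXPRESSION, document);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate fault code", e);
        }
    }
}
```

The trade-off is readability: the expression gets long quickly, so this tends to suit one-off lookups better than large sets of queries.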

Java Transformer how to ignore namespaces

I have to transform XML to XHTML, but the XML defines a namespace xmlns='http://www.lotus.com/dxl' which is never used in the whole XML, and therefore the parser won't parse anything ...
Is there a way to ignore namespaces? I am using the Oracle Java transformer (import javax.xml.transform.Transformer; import javax.xml.transform.TransformerFactory).
Or are there any better libraries?
No, you can't ignore namespaces.
If the namespace declaration xmlns='http://www.lotus.com/dxl' appears in the outermost element, then you can't say it "isn't used anywhere" - on the contrary, it is used everywhere! It effectively changes every element name in the document to a different name. There's no way you can ignore that.
If you were using XSLT 2.0, then you would be able to say in your stylesheet xpath-default-namespace="http://www.lotus.com/dxl" which would pretty much do what you want: it says that any unprefixed name in a match pattern or XPath expression should be interpreted as referring to a name in namespace http://www.lotus.com/dxl. Sadly, you've chosen an XSLT processor that doesn't implement XSLT 2.0. So you'll have to do it the hard way (which is described in about 10,000 posts that you will find by searching for "XSLT default namespace").
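For reference, the "hard way" in XSLT 1.0 usually means binding a prefix to the input's default namespace in the stylesheet and using that prefix in every match pattern and select expression. A minimal sketch with the JDK's built-in transformer follows; the element names `note` and `title`, the prefix `dxl`, and the class name `DxlToXhtml` are placeholders chosen for the example and are not taken from the question's document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class DxlToXhtml {
    // The prefix "dxl" is bound to the namespace that the input declares as its
    // default; exclude-result-prefixes keeps that binding out of the output.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:dxl='http://www.lotus.com/dxl'"
          + " exclude-result-prefixes='dxl'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/dxl:note'>"
          + "<h1><xsl:value-of select='dxl:title'/></h1>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    private DxlToXhtml() {
    }

    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Unprefixed names such as `match='note'` would not match here, because in XSLT 1.0 an unprefixed name in a pattern always refers to an element in no namespace.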
You can't ignore namespaces easily, and it won't be pretty, but it is possible. Of course, tricking the right part inside the Transformer implementation into just outputting the prefixes without getting flustered is implementation dependent!
OK then, this works for me going from a Node to a StringWriter:
public static String nodeToString(Node node) throws TransformerException {
StringWriter results = new StringWriter();
Transformer transformer = createTransformer();
transformer.transform(new DOMSource(node), new StreamResult(results) {
@Override
public Writer getWriter() {
Field field = findFirstAssignable(transformer.getClass());
try {
field.setAccessible(true);
field.set(transformer, new TransletOutputHandlerFactory(false) {
@Override
public SerializationHandler getSerializationHandler() throws
IOException, ParserConfigurationException {
SerializationHandler handler = super.getSerializationHandler();
SerializerBase base = (SerializerBase) handler.asDOMSerializer();
base.setNamespaceMappings(new NamespaceMappings() {
@Override
public String lookupNamespace(String prefix) {
return prefix;
}
});
return handler;
}
});
} catch(IllegalAccessException e) {
throw new AssertionError("Must not happen", e);
}
return super.getWriter();
}
});
return results.toString();
}
private static <E> Field findFirstAssignable(Class<E> clazz) {
return Stream.<Class<? super E>>iterate(clazz, Convert::iteration)
.flatMap(Convert::classToFields)
.filter(Convert::canAssign).findFirst().get();
}
private static <E> Class<? super E> iteration(Class<? super E> c) {
return c == null ? null : c.getSuperclass();
}
private static boolean canAssign(Field f) {
return f == null ||
f.getType().isAssignableFrom(TransletOutputHandlerFactory.class);
}
private static <E> Stream<Field> classToFields(Class<? super E> c) {
return c == null ? Stream.of((Field) null) :
Arrays.stream(c.getDeclaredFields());
}
What this is doing is pretty much just customizing the mapping of namespaces to prefixes. Every prefix is mapped to a namespace identified by its prefix, so there shouldn't even be any conflicts. The rest of it is fighting the API.
To make the example complete, here are the methods converting to and from the XML as well:
public static Transformer createTransformer()
throws TransformerFactoryConfigurationError,
TransformerConfigurationException {
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.INDENT, "no");
return transformer;
}
public static ArrayList<Node> parseNodes(String uri, String expression)
throws ParserConfigurationException, SAXException,
IOException,XPathExpressionException {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(uri);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(expression);
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
ArrayList<Node> nodes = new ArrayList<>();
for(int i = 0; i < nl.getLength(); i++) {
nodes.add(nl.item(i));
}
return nodes;
}

Extracting XML Elements from Java Object [duplicate]

I am new to XML. I want to read the following XML on the basis of request name. Please help me on how to read the below XML in Java -
<?xml version="1.0"?>
<config>
<Request name="ValidateEmailRequest">
<requestqueue>emailrequest</requestqueue>
<responsequeue>emailresponse</responsequeue>
</Request>
<Request name="CleanEmail">
<requestqueue>Cleanrequest</requestqueue>
<responsequeue>Cleanresponse</responsequeue>
</Request>
</config>
If your XML is a String, then you can do the following:
String xml = ""; //Populated XML String....
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
Element rootElement = document.getDocumentElement();
If your XML is in a file, then Document document will be instantiated like this:
Document document = builder.parse(new File("file.xml"));
The document.getDocumentElement() returns you the node that is the document element of the document (in your case <config>).
Once you have rootElement, you can read the element's attributes (by calling rootElement.getAttribute(name)) and navigate its children; see the Javadoc for org.w3c.dom.Element for the other available methods.
The Javadoc for DocumentBuilder and DocumentBuilderFactory has more information. Bear in mind that this example builds a full DOM tree in memory, so for a very large XML document the tree can be huge.
Update Here's an example to get "value" of element <requestqueue>
protected String getString(String tagName, Element element) {
NodeList list = element.getElementsByTagName(tagName);
if (list != null && list.getLength() > 0) {
NodeList subList = list.item(0).getChildNodes();
if (subList != null && subList.getLength() > 0) {
return subList.item(0).getNodeValue();
}
}
return null;
}
You can effectively call it as,
String requestQueueName = getString("requestqueue", element);
In case you just need one (first) value to retrieve from xml:
public static String getTagValue(String xml, String tagName){
return xml.split("<"+tagName+">")[1].split("</"+tagName+">")[0];
}
In case you want to parse whole xml document use JSoup:
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
for (Element e : doc.select("Request")) {
System.out.println(e);
}
If you are just looking to get a single value from the XML you may want to use Java's XPath library. For an example see my answer to a previous question:
How to use XPath on xml docs having default namespace
It would look something like:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
Node node = (Node) xPath.evaluate("/config/Request/@name", dDoc, XPathConstants.NODE);
System.out.println(node.getNodeValue());
} catch (Exception e) {
e.printStackTrace();
}
}
}
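Since the question asks for a lookup by request name, the same approach can be narrowed with a predicate on the name attribute. This is a sketch under a few assumptions: the class name `RequestConfig` and the variable name `$name` are made up, and the name is passed through an XPathVariableResolver instead of string concatenation so that quotes in the value cannot break the expression.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathVariableResolver;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class RequestConfig {
    private RequestConfig() {
    }

    /** Returns the requestqueue text of the Request with the given name, or "" if none. */
    static String requestQueue(String xml, final String requestName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Supplies the value of $name used in the expression below.
            xpath.setXPathVariableResolver(new XPathVariableResolver() {
                @Override public Object resolveVariable(QName variableName) {
                    return "name".equals(variableName.getLocalPart()) ? requestName : null;
                }
            });
            return xpath.evaluate("/config/Request[@name=$name]/requestqueue", doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read request queue", e);
        }
    }
}
```

For the configuration in the question, calling `RequestConfig.requestQueue(xml, "ValidateEmailRequest")` would be expected to return `emailrequest`.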
There are a number of different ways to do this. You might want to check out XStream or JAXB. There are tutorials and the examples.
If the XML is well formed then you can convert it to Document. By using the XPath you can get the XML Elements.
String xml = "<stackusers><name>Yash</name><age>30</age></stackusers>";
From the XML string, create a Document and find the elements using their XPath.
Document doc = getDocument(xml, true);
public static Document getDocument(String xmlData, boolean isXMLData) throws Exception {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
dbFactory.setIgnoringComments(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc;
if (isXMLData) {
InputSource ips = new org.xml.sax.InputSource(new StringReader(xmlData));
doc = dBuilder.parse(ips);
} else {
doc = dBuilder.parse( new File(xmlData) );
}
return doc;
}
Use org.apache.xpath.XPathAPI to get Node or NodeList.
System.out.println("XPathAPI:"+getNodeValue(doc, "/stackusers/age/text()"));
NodeList nodeList = getNodeList(doc, "/stackusers");
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList));
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList.item(0)));
public static String getNodeValue(Document doc, String xpathExpression) throws Exception {
Node node = org.apache.xpath.XPathAPI.selectSingleNode(doc, xpathExpression);
String nodeValue = node.getNodeValue();
return nodeValue;
}
public static NodeList getNodeList(Document doc, String xpathExpression) throws Exception {
NodeList result = org.apache.xpath.XPathAPI.selectNodeList(doc, xpathExpression);
return result;
}
Using javax.xml.xpath.XPathFactory
System.out.println("javax.xml.xpath.XPathFactory:"+getXPathFactoryValue(doc, "/stackusers/age"));
static XPath xpath = javax.xml.xpath.XPathFactory.newInstance().newXPath();
public static String getXPathFactoryValue(Document doc, String xpathExpression) throws XPathExpressionException, TransformerException, IOException {
Node node = (Node) xpath.evaluate(xpathExpression, doc, XPathConstants.NODE);
String nodeStr = getXmlContentAsString(node);
return nodeStr;
}
Using Document Element.
System.out.println("DocumentElementText:"+getDocumentElementText(doc, "age"));
public static String getDocumentElementText(Document doc, String elementName) {
return doc.getElementsByTagName(elementName).item(0).getTextContent();
}
Get value in between two strings.
String nodeValue = org.apache.commons.lang.StringUtils.substringBetween(xml, "<age>", "</age>");
System.out.println("StringUtils.substringBetween():"+nodeValue);
Full Example:
public static void main(String[] args) throws Exception {
String xml = "<stackusers><name>Yash</name><age>30</age></stackusers>";
Document doc = getDocument(xml, true);
String nodeValue = org.apache.commons.lang.StringUtils.substringBetween(xml, "<age>", "</age>");
System.out.println("StringUtils.substringBetween():"+nodeValue);
System.out.println("DocumentElementText:"+getDocumentElementText(doc, "age"));
System.out.println("javax.xml.xpath.XPathFactory:"+getXPathFactoryValue(doc, "/stackusers/age"));
System.out.println("XPathAPI:"+getNodeValue(doc, "/stackusers/age/text()"));
NodeList nodeList = getNodeList(doc, "/stackusers");
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList));
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList.item(0)));
}
public static String getXmlContentAsString(Node node) throws TransformerException, IOException {
StringBuilder stringBuilder = new StringBuilder();
NodeList childNodes = node.getChildNodes();
int length = childNodes.getLength();
for (int i = 0; i < length; i++) {
stringBuilder.append( toString(childNodes.item(i), true) );
}
return stringBuilder.toString();
}
Output:
StringUtils.substringBetween():30
DocumentElementText:30
javax.xml.xpath.XPathFactory:30
XPathAPI:30
XPathAPI NodeList:<stackusers>
<name>Yash</name>
<age>30</age>
</stackusers>
XPathAPI NodeList:<name>Yash</name><age>30</age>
The following links might help:
http://labe.felk.cvut.cz/~xfaigl/mep/xml/java-xml.htm
http://developerlife.com/tutorials/?p=25
http://www.java-samples.com/showtutorial.php?tutorialid=152
There are two general ways of doing that. The first is to build a Document Object Model (DOM) of the XML file.
The second is event-driven parsing (for example SAX), which is an alternative to the DOM representation. Of course there is much more to know about processing XML; for instance, if you are given an XML schema definition (XSD), you could use JAXB.
There are various APIs available to read and write XML files in Java.
I would prefer using StAX.
An overview of the Java XML APIs can also be useful.
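As a rough illustration of the StAX approach for the configuration file in the question, the reader below walks the stream once and collects each Request name with its requestqueue value. The class and method names are invented for this sketch, and it assumes that every requestqueue element appears inside a Request element.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class StaxRequestReader {
    private StaxRequestReader() {
    }

    /** Maps each Request name to the text of its requestqueue child. */
    static Map<String, String> requestQueues(String xml) {
        Map<String, String> queues = new LinkedHashMap<String, String>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            String currentRequest = null;
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String name = reader.getLocalName();
                if ("Request".equals(name)) {
                    // Remember which Request the following children belong to.
                    currentRequest = reader.getAttributeValue(null, "name");
                } else if ("requestqueue".equals(name) && currentRequest != null) {
                    // getElementText() reads up to the matching end tag.
                    queues.put(currentRequest, reader.getElementText());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read configuration", e);
        }
        return queues;
    }
}
```

Unlike the DOM examples, nothing is kept in memory beyond the resulting map, which is the main reason to consider a streaming API for large files.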
You can write a class that extends org.xml.sax.helpers.DefaultHandler and dispatches every element to a method named after its tag:
start_<tag_name>(String uri, Attributes attrs);
and
end_<tag_name>();
For the <Request> element that is:
start_Request(uri, attrs);
etc.
Then extend that class and implement the parsers for the XML configuration files you want. Example:
...
public void startElement(String uri, String name, String qname,
        org.xml.sax.Attributes attrs)
        throws org.xml.sax.SAXException {
    // Signature of the per-tag handler: start_<tag>(String, Attributes)
    Class<?>[] args = new Class<?>[] { String.class, org.xml.sax.Attributes.class };
    try {
        String mname = name.replace("-", "");
        java.lang.reflect.Method m =
                getClass().getDeclaredMethod("start_" + mname, args);
        m.invoke(this, new Object[] { uri, attrs });
    }
    catch (IllegalAccessException e) {
        throw new RuntimeException(e);
    }
    catch (NoSuchMethodException e) {
        throw new RuntimeException(e);
    }
    catch (java.lang.reflect.InvocationTargetException e) {
        // Rethrow the handler's failure as a SAXException, keeping its stack trace.
        org.xml.sax.SAXException se = new org.xml.sax.SAXException(e);
        se.setStackTrace(e.getTargetException().getStackTrace());
        throw se;
    }
}
and in a particular configuration parser:
public void start_Request(String uri, org.xml.sax.Attributes attrs) {
// make sure to read attributes correctly
System.err.println("Request, name=" + attrs.getValue(0));
}
Since you are using this for configuration, your best bet is apache commons-configuration. For simple files it's way easier to use than "raw" XML parsers.
See the XML how-to

Java Xpath expression

I have the following XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<application xmlns="http://research.sun.com/wadl/2006/10">
<doc xmlns:jersey="http://jersey.dev.java.net/"
jersey:generatedBy="Jersey: 1.0.2 02/11/2009 07:45 PM"/>
<resources base="http://localhost:8080/stock/">
<resource path="categories"> (<<---I want to get here)
<method id="getCategoriesResource" name="GET">
And I want to get the value of resource/@path, so I have the following Java code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = factory.newDocumentBuilder();
// get the xml to parse from URI
Document doc = builder.parse(serviceUri + "application.wadl");
XPathFactory xfactory = XPathFactory.newInstance();
XPath xpath = xfactory.newXPath();
XPathExpression expression =
xpath.compile("/application/resources/resource/@path");
this.baseUri = (String) expression.evaluate(doc, XPathConstants.STRING);
With this XPath expression the result (baseUri) is always the empty string ("").
The nodes are not in the empty namespace, so you must specify their namespace: /wadl:application/wadl:resources/wadl:resource/@path. You also have to register the namespace prefix in the namespace context of the XPath engine.
This is a working example:
xpath.setNamespaceContext(new NamespaceContext()
{
    @Override
    public String getNamespaceURI(final String prefix)
    {
        if(prefix.equals("wadl"))
            return "http://research.sun.com/wadl/2006/10";
        else
            return null;
    }
    @Override
    public String getPrefix(final String namespaceURI)
    {
        throw new UnsupportedOperationException();
    }
    @Override
    public Iterator getPrefixes(final String namespaceURI)
    {
        throw new UnsupportedOperationException();
    }
});
XPathExpression expression = xpath.compile("/wadl:application/wadl:resources/wadl:resource/@path");
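If the WADL file contains several resource elements, the same expression can be evaluated as a node set to collect every path. A sketch, assuming a hypothetical helper class `WadlPaths` and a document that is passed in as a string instead of being fetched from the service URI:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class WadlPaths {
    static final String WADL_NS = "http://research.sun.com/wadl/2006/10";

    private WadlPaths() {
    }

    /** Returns the path attribute of every resource element, in document order. */
    static List<String> resourcePaths(String wadl) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wadl)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    return "wadl".equals(prefix) ? WADL_NS : XMLConstants.NULL_NS_URI;
                }
                @Override public String getPrefix(String namespaceURI) {
                    return null;
                }
                @Override public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.<String>emptyList().iterator();
                }
            });
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/wadl:application/wadl:resources/wadl:resource/@path",
                    doc, XPathConstants.NODESET);
            List<String> paths = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // For attribute nodes, getNodeValue() is the attribute value.
                paths.add(nodes.item(i).getNodeValue());
            }
            return paths;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read WADL", e);
        }
    }
}
```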
