I have an XPath for an element and need to write Java code that gives me exactly that element as an object. I believe I need to use SAX or DOM? I'm a total newbie.
XPath:
/*[local-name(.)='feed']/*[local-name(.)='entry']/*[local-name(.)='title']
Your comment suggests you want to use DOM4J, which supports XPath out of the box:
SAXReader reader = new SAXReader();
Document doc = reader.read(new File(....)); // or URL, or wherever the XML comes from
Node selectedNode = doc.selectSingleNode("/*[local-name(.)='feed']/*[local-name(.)='entry']/*[local-name(.)='title']");
(or there's also selectNodes which returns a List, if there might be more than one node matching that XPath expression - quite likely if this is an Atom feed).
But rather than using the local-name hack like this, if you know the namespace URI of the elements in your XML you can declare a prefix for this namespace and select the nodes by their fully qualified name:
SAXReader reader = new SAXReader();
Map<String, String> namespaces = new HashMap<>();
namespaces.put("atom", "http://www.w3.org/2005/Atom");
reader.getDocumentFactory().setXPathNamespaceURIs(namespaces);
Document doc = reader.read(new File(....)); // or URL, or wherever the XML comes from
List selectedNodes = doc.selectNodes("/atom:feed/atom:entry/atom:title");
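If you would rather avoid an extra dependency, the same prefix-based lookup can be sketched with the XPath API that ships with the JDK. This is only a rough equivalent of the dom4j code above, not a dom4j feature: the class name AtomTitles, the method entryTitles and the sample feed are made up for illustration, and the prefix "atom" is arbitrary as long as it maps to the Atom namespace URI.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class AtomTitles {

    // Maps the (arbitrary) prefix "atom" to the Atom namespace URI.
    private static final NamespaceContext ATOM_CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "atom".equals(prefix) ? "http://www.w3.org/2005/Atom" : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return null; // reverse lookups are not needed in this sketch
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            return null; // reverse lookups are not needed in this sketch
        }
    };

    // Returns the text of every /feed/entry/title element in the Atom namespace.
    public static List<String> entryTitles(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, prefixed names cannot match
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(ATOM_CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/atom:feed/atom:entry/atom:title", doc, XPathConstants.NODESET);

            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent());
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath on the given XML", e);
        }
    }

    public static void main(String[] args) {
        String feed = "<feed xmlns='http://www.w3.org/2005/Atom'>"
                + "<entry><title>First</title></entry>"
                + "<entry><title>Second</title></entry></feed>";
        System.out.println(entryTitles(feed));
    }
}
```

The important detail is setNamespaceAware(true): the JDK's DocumentBuilderFactory is not namespace-aware by default, and a prefixed expression will then match nothing.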
Read here:
https://howtodoinjava.com/java/xml/java-xpath-tutorial-example/
I found it while searching for how to convert an XPath PMD rule to a Java rule. I did not find what I needed in it,
but maybe you will find what you need.
Related
I have a String variable in java with xml tags as its value:
eg: String xml="<root><name>abcd</name><age>22</age><gender>male</gender></root>";
Now I need to get the value within the name tag, i.e. "abcd", from this variable and store it in another String variable. How do I go about this in Java? Can anyone please help me out?
It is not quite clear what you want, but I think what you will need is something to read an XML document (as a file or directly as a string), an XML parser.
There are many XML parsers you can use for this, including:
JDOM
Woodstox
XOM
dom4j
VTD-XML
Xerces-J
Crimson
I would recommend dom4j because it is easy to use. Here is an example of a dom4j implementation:
String xmlPath = "myXmlDocument.xml";
SAXReader reader = new SAXReader();
Document document = reader.read(xmlPath);
Element rootElement = document.getRootElement();
System.out.println("Root Element: "+rootElement.getName());
You can directly feed in a String to be parsed to an XML document too:
String xmlString = "<name>Hello</name>";
Document document = DocumentHelper.parseText(xmlString); // no SAXReader needed when parsing from a String
Element rootElement = document.getRootElement();
System.out.println("Root Element: "+rootElement.getName());
References
Best XML parser for Java
http://dom4j.sourceforge.net/dom4j-1.6.1/faq.html#from-string
I have some xml that looks like this:
<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>
The tags change and are variable, so there won't always be a 'name' tag.
I've tried 3 or 4 parses and they all seem to choke on it. Any hints?
Just because it doesn't have a defined schema, doesn't mean it isn't "valid" XML - your sample XML is "well formed".
The dom4j library will do it for you. Once parsed (your XML will parse OK) you can iterate through child elements, no matter what their tag name, and work with your data.
Here's an example of how to use it:
import java.util.Iterator;
import org.dom4j.*;
String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
Document document = DocumentHelper.parseText(text); // throws DocumentException if the text is not well-formed
Element root = document.getRootElement();
for ( Iterator i = root.elementIterator(); i.hasNext(); ) {
    Element element = (Element) i.next();
    String tagName = element.getName(); // getQName() returns a QName object, not a String
    String contents = element.getText();
    // do something
}
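For comparison, here is a rough sketch of the same idea using only the DOM parser bundled with the JDK instead of dom4j. The class name ChildElements and the method childTexts are made up for illustration; the point is just that the loop never needs to know the tag names in advance.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class ChildElements {

    // Collects tag name -> text content for every direct child element of the root,
    // in document order, whatever the tags happen to be called.
    public static Map<String, String> childTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();

            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip text nodes, comments and so on; keep only elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    result.put(child.getNodeName(), child.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String text = "<xml><name>oscar</name><race>puppet</race><class>grouch</class></xml>";
        System.out.println(childTexts(text));
    }
}
```

Note that a Map keeps only the last value if a tag name repeats; if your documents can contain repeated tags, a List of pairs would be the safer choice.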
This is valid xml; try adding an XML Schema that allows for optional elements. If you can write an xml schema, you can use JAXB to parse it. XML allows for having optional elements; it isn't too "strict" about it.
Your XML sample is well-formed XML, and if anything "chokes" on it then it would be useful for us to know exactly what the symptoms of the "choking" are.
I have an XML document as a string without any namespace, and I want to parse it using Java, JDOM and XPath and create an object tree. Since XPath always requires a prefix and a namespace to query, I added a namespace and a prefix to the root and later to the node I want to get, but it seems XPath requires the namespace on every node in the document, not only on the root.
So is there a way to add the namespace to all of the elements in the document object up front so that my XPath query works correctly?
There are probably other mistakes and bad approaches in the code as well. I will be glad for any ideas.
String response="myXmlString";
ByteArrayInputStream stream = new ByteArrayInputStream(
response.getBytes());
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(stream);
org.jdom.Element request=(org.jdom.Element) doc.getRootElement();
request.setNamespace(Namespace.getNamespace("myNamespace"));
createRequest(request);
And then
public Request createRequest(Element requestXML) {
Request request = new Request();
requestXML.detach();
Document doc = new Document(requestXML);
XPath xpath = XPath.newInstance(myExpression);
xpath.addNamespace("m", doc.getRootElement().getNamespaceURI());
xpath.selectSingleNode(doc);
}
This last line returns an empty result; it is not null, but a JDOM exception is thrown inside.
XPath and XML do NOT require namespace. Go back to your original XML and remove any namespace/prefix hackery in your code.
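To illustrate, here is a small sketch using the JDK's built-in XPath API rather than JDOM (a deliberate swap, so that it runs without extra libraries). The class name NoNamespaceXPath, the method select and the sample documents are made up. An unprefixed expression matches elements that are in no namespace, and it stops matching as soon as a namespace is put on the document, which is probably what happened after the setNamespace call in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class NoNamespaceXPath {

    // Evaluates an XPath expression against an XML string and returns the string value
    // of the result (an empty string if nothing matches).
    public static String select(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so namespaced documents are treated as such
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath on the given XML", e);
        }
    }

    public static void main(String[] args) {
        // No namespace anywhere: the plain, unprefixed expression just works.
        System.out.println(select("<request><id>42</id></request>", "/request/id"));

        // Same expression against a document in a namespace: nothing matches.
        System.out.println(select("<request xmlns='urn:example'><id>42</id></request>", "/request/id"));
    }
}
```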
Currently I am parsing XML messages with XPath Expression. It works very well. However I have the following problem:
I am parsing all the data in the XML, so I instantiate a new InputSource for every call to xPath.evaluate.
StringReader xmlReader = new StringReader(xml);
InputSource source = new InputSource(xmlReader);
XPathExpression xpe = xpath.compile("msg/element/@attribute");
String attribute = (String) xpe.evaluate(source, XPathConstants.STRING);
Now I would like to go deeper into my XML message and evaluate more information. For this I found myself in the need to instantiate source another time. Is this required? If I don't do it, I get Stream closed Exceptions.
Parse the XML to a DOM and keep a reference to the node(s). Example:
XPath xpath = XPathFactory.newInstance()
.newXPath();
InputSource xml = new InputSource(new StringReader("<xml foo='bar' />"));
Node root = (Node) xpath.evaluate("/", xml, XPathConstants.NODE);
System.out.println(xpath.evaluate("/xml/@foo", root));
This avoids parsing the string more than once.
If you must reuse the InputSource for a different XML string, you can probably use the setters with a different reader instance.
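Going "deeper" then just means evaluating further expressions against the nodes you already have. A minimal sketch, assuming a message shaped like the msg/element/@attribute path from the question (the class name ParseOnce, the method attributeValues and the sample message are made up): the string is parsed exactly once, and a compiled relative expression is reused for each element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class ParseOnce {

    // Parses the message once, then evaluates a relative expression on each element node.
    public static List<String> attributeValues(String xml) {
        try {
            // The only place where the XML text is read.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList elements = (NodeList) xpath.evaluate("/msg/element", doc, XPathConstants.NODESET);

            // Compiled once, evaluated relative to each element (no InputSource involved).
            XPathExpression attribute = xpath.compile("@attribute");
            List<String> values = new ArrayList<>();
            for (int i = 0; i < elements.getLength(); i++) {
                values.add(attribute.evaluate(elements.item(i)));
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath on the given XML", e);
        }
    }

    public static void main(String[] args) {
        String message = "<msg><element attribute='a'/><element attribute='b'/></msg>";
        System.out.println(attributeValues(message));
    }
}
```

Because every evaluation runs against the in-memory DOM, there is no stream left to be closed between calls.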
Let's say the string is <title>xyz</title>
I want to extract the xyz out of the string.
I used:
Pattern titlePattern = Pattern.compile("<title>\\s*(.+?)\\s*</title>");
Matcher titleMatcher = titlePattern.matcher(line);
String title=titleMatcher.group(1);
but I am getting an error for titlePattern.matcher(line);
You say the error occurs earlier (what is the actual error? It runs without an error for me), but once that is solved you will need to call find() on the matcher once to actually search for the pattern:
if(titleMatcher.find()){
String title = titleMatcher.group(1);
}
Note that if you are really matching against a string containing literal, non-escaped tags like
<title>xyz</title>
then your regular expression has to use those same literal characters rather than escaped entities such as &lt;title&gt;:
"<title>\\s*(.+?)\\s*</title>"
Also, you should be careful about how far you try to get with this, as you can't really parse HTML or XML with regular expressions. If you are working with XML, it's much easier to use an XML parser, e.g. JDOM.
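Putting the pieces together, a complete version of the regex approach might look like the sketch below. The class name TitleRegex and the method extractTitle are made up for illustration, and this is only suitable for the simple single-line case from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named for illustration only.
public class TitleRegex {

    // Compiled once and reused; group 1 is the title text without surrounding whitespace.
    private static final Pattern TITLE = Pattern.compile("<title>\\s*(.+?)\\s*</title>");

    // Returns the text between <title> and </title>, or null if the line has no title.
    public static String extractTitle(String line) {
        Matcher matcher = TITLE.matcher(line);
        // find() must be called before group(); otherwise group() throws IllegalStateException.
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractTitle("<title>xyz</title>"));
    }
}
```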
Not technically an answer but you shouldn't be using regular expressions to parse HTML. You can try and you can get away with it for simple tasks but HTML can get ugly. There are a number of Java libraries that can parse HTML/XML just fine. If you're going to be working a lot with HTML/XML it would be worth your time to learn them.
As others have suggested, it's probably not a good idea to parse HTML/XML with regex. You can parse XML documents with the standard Java API, but I don't recommend it. As Fabian Steeg already answered, it's probably better to use JDOM or a similar open source library for parsing XML.
With javax.xml.parsers you can do the following:
String xml = "<title>abc</title>";
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new InputSource(new StringReader(xml)));
NodeList nodeList = doc.getElementsByTagName("title");
String title = nodeList.item(0).getTextContent();
This parses your XML string into a Document object which you can use for further lookups. The API is kinda horrible though.
Another way is to use XPath for the lookup:
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xPath = xpathFactory.newXPath();
String titleByXpath = xPath.evaluate("/title/text()", new InputSource(new StringReader(xml)));
// or use the Document for lookup
String titleFromDomByXpath = xPath.evaluate("/title/text()", doc);