How to know whether an element contains a namespace declaration (xmlns) or not? - java

<xxx1 xmlns="hello">
<xxx2>
<xxx3>
<name>rule_1</name>
</xxx3>
</xxx2>
</xxx1>
I select the node with "//*[namespace-uri()='hello']/*[local-name()='name']".
It should get //hello:xxx1/xxx2/xxx3/name, and it does.
Now I am trying to get the element <xxx1>. In reality, I don't know how many parents I have to go up from <name> to reach <xxx1>.
I tried this code:
node.getParent().getNamespaceURI().equals("hello")
and increased the number of getParent() calls to reach <xxx1>.
But the very first check, on <xxx3>.getNamespaceURI(), is already true.
Is the namespace inherited?
How can I tell whether an element itself has an xmlns declaration or not?
Sorry that my question was not clear.
I'm trying to get the first element that declares the namespace "hello".
<xxx1 xmlns="hello">
<xxx2>
<xxx3>
Which of these three nodes carries xmlns="hello"? <xxx2> and <xxx3> do not declare xmlns in their tags.

Hello and Welcome to Stack Overflow!
Yes, namespaces are sort of inherited, but the terminology normally used is that, in your example, the <name> element is in the scope of the namespace declaration xmlns="hello", so the <name> element will be in the hello namespace.
With DOM4J, you can test whether an element is in a namespace or not like this:
boolean hasNamespace(Element e) {
return e.getNamespaceURI().length() > 0;
}
If the element is not in any namespace, getNamespaceURI() returns an empty string.
I guess that you want to select the <name> element, but you don't know at which level it will be, i.e. how many parents it will have. You can always use this XPath expression:
Node node = doc.selectSingleNode("//*[namespace-uri() = 'hello' and local-name() = 'name']");
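For the follow-up part of the question (which element actually carries the xmlns="hello" declaration), one possible approach is to walk up from the selected node and look for the first element that has its own xmlns attribute. The sketch below uses the JDK's built-in org.w3c.dom parser rather than DOM4J, and the names NamespaceDeclarationFinder, parse and findDeclaringElement are made up for illustration. It assumes a namespace-aware parser that exposes namespace declarations as ordinary attributes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class NamespaceDeclarationFinder {

    /** Parses XML text with a namespace-aware JDK DOM parser (hypothetical helper). */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /**
     * Walks from start up through its ancestors and returns the first element
     * that carries its own default-namespace declaration (an "xmlns" attribute)
     * with the given URI, or null if no such element is found.
     */
    public static Element findDeclaringElement(Element start, String uri) {
        for (Node n = start; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            // Being in scope of the namespace is not enough; the attribute must be on this element.
            if (e.hasAttribute("xmlns") && uri.equals(e.getAttribute("xmlns"))) {
                return e;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<xxx1 xmlns=\"hello\"><xxx2><xxx3><name>rule_1</name></xxx3></xxx2></xxx1>");
        Element name = (Element) doc.getElementsByTagNameNS("hello", "name").item(0);
        System.out.println(name.getNamespaceURI());                             // in scope: hello
        System.out.println(findDeclaringElement(name, "hello").getTagName());   // declared on: xxx1
    }
}
```

The point of the sketch is the difference between getNamespaceURI(), which reports the namespace an element is in, and the xmlns attribute, which is only present on the element where the declaration was written.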

Related

Find child element by xpath

public WebElement findChildByXpath(WebElement parent, String xpath) {
loggingService.timeMark("findChildByXpath", "begin. Xpath: " + xpath);
String parentInnerHtml = parent.getAttribute("innerHTML"); // Uncomment for debug purpose.
WebElement child = parent.findElement(By.xpath(xpath));
String childInnerHtml = child.getAttribute("innerHTML"); // Uncomment for debug purpose.
return child;
}
The problem with this code is that childInnerHtml gives me the wrong result. I scrape numbers, and they are all equal.
I even suspect that my code is equivalent to driver.findElement(By.xpath(...)).
Could you tell me whether my code really finds a child, or what I should correct?
The child XPath needs to be a relative XPath. Normally this means the expression starts with a dot (.), which makes it relative to the node it is applied to, i.e. the parent node. Otherwise Selenium searches for the given XPath (the parameter you pass to this method) starting from the top of the entire page.
So if, for example, the passed XPath is "//span[@id='myId']", it should be ".//span[@id='myId']".
Alternatively, you can add the dot inside the parent.findElement(By.xpath(xpath)); line to make it
WebElement child = parent.findElement(By.xpath("." + xpath));
But passing the XPath with the dot already in place is simpler and cleaner, especially if the passed XPath is some complex expression like "(//div[@class='myClass'])[5]//input"; in that case automatically adding a dot in front of the expression may not work properly.
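The same rule can be reproduced without a browser. The following is a minimal sketch that uses the JDK's javax.xml.xpath API instead of Selenium, on a tiny made-up page; RelativeXPathDemo, PAGE and findChildText are hypothetical names. It evaluates an expression with a parent node as the context and shows that "//span" still starts from the document root, while ".//span" stays inside the parent.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class RelativeXPathDemo {

    // Made-up markup standing in for a page with two similar blocks.
    static final String PAGE =
            "<body>"
          + "<div id='first'><span>1</span></div>"
          + "<div id='second'><span>2</span></div>"
          + "</body>";

    /** Evaluates childXPath with the div of the given id as the context node. */
    public static String findChildText(String parentId, String childXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node parent = (Node) xpath.evaluate(
                    "//div[@id='" + parentId + "']", doc, XPathConstants.NODE);
            // The context node is 'parent', but only a leading dot makes the path relative to it.
            Node child = (Node) xpath.evaluate(childXPath, parent, XPathConstants.NODE);
            return child.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findChildText("second", "//span"));  // prints 1: searched the whole document
        System.out.println(findChildText("second", ".//span")); // prints 2: searched inside the parent
    }
}
```

This matches the symptom in the question: every parent appears to yield the same child, because the search never starts at the parent.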

Difference between JSoup Element and JSoup Node

Can anyone please explain the difference between the Element object and Node object provided in JSoup ?
Which is the best thing to be used in which situation/condition.
A node is the generic name for any type of object in the DOM hierarchy.
An element is one specific type of node.
The JSoup class model reflects this:
Node
    Element
Since Element extends Node, anything you can do on a Node you can do on an Element too. But Element provides additional behaviour which makes it easier to use; for example, an Element has properties such as id and class which make it easier to find in an HTML document.
In most cases using Element (or one of the other subclasses of Node) will meet your needs and will be easier to code to. I suspect the only scenario in which you might need to fall back to Node is if there is a specific node type in the DOM for which JSoup does not provide a subclass of Node.
Here's an example showing the same HTML document inspection using both Node and Element:
String html = "<html><head><title>This is the head</title></head><body><p>This is the body</p></body></html>";
Document doc = Jsoup.parse(html);
Node root = doc.root();
// some content assertions, using Node
assertThat(root.childNodes().size(), is(1));
assertThat(root.childNode(0).childNodes().size(), is(2));
assertThat(root.childNode(0).childNode(0), instanceOf(Element.class));
assertThat(((Element) root.childNode(0).childNode(0)).text(), is("This is the head"));
assertThat(root.childNode(0).childNode(1), instanceOf(Element.class));
assertThat(((Element) root.childNode(0).childNode(1)).text(), is("This is the body"));
// the same content assertions, using Element
Elements head = doc.getElementsByTag("head");
assertThat(head.size(), is(1));
assertThat(head.first().text(), is("This is the head"));
Elements body = doc.getElementsByTag("body");
assertThat(body.size(), is(1));
assertThat(body.first().text(), is("This is the body"));
YMMV but I think the Element form is easier to use and much less error prone.
They seem the same, but they are different.
A Node can be an Element, but it can also be a TextNode.
So, for example:
<p>A<span>B</span></p>
On the <p> element:
.childNodes() // get node list
-> A
-> <span>B</span>
.children() // get element list
-> <span>B</span>
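The same Node versus Element distinction exists in other DOM APIs, which makes it easy to try without JSoup on the classpath. Below is a minimal sketch that uses the JDK's org.w3c.dom classes (not JSoup) on the <p>A<span>B</span></p> example; ChildNodesVsElements and countChildren are hypothetical names. It counts all child nodes and then only those that are elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodesVsElements {

    /** Returns {number of child nodes, number of child elements} of the root element. */
    public static int[] countChildren(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            NodeList nodes = root.getChildNodes(); // text nodes and elements alike
            int elements = 0;
            for (int i = 0; i < nodes.getLength(); i++) {
                if (nodes.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    elements++; // keep only the element children
                }
            }
            return new int[] {nodes.getLength(), elements};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countChildren("<p>A<span>B</span></p>");
        System.out.println("child nodes: " + counts[0]);    // 2: the text "A" and <span>
        System.out.println("child elements: " + counts[1]); // 1: only <span>
    }
}
```

In JSoup terms, the first count corresponds to childNodes() and the second to children().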

How to get elements from XPath in Java

I want to get data from an XPath query:
Element location = (Element) doc.query("//location[location_name='"+ locationName +"']/*").get(0).getDocument().getRootElement();
System.out.println(location.toXML());
Element loc = location.getFirstChildElement("location");
System.out.println(loc.getFirstChildElement("location_name").getValue());
However, no matter what I choose, I always get 1 node (because of .get(0)). I don't know how to select the node which was selected by query.
I found that I should cast the node to Element (see "XOM getting attribute from Node?"), but that link only shows how to select the first node.
Call getParent() on the first element in the result:
Builder parse = new Builder();
Document xml = parse.build("/var/www/JAVA/toForum.xml");
System.out.println(xml.query("//location[@id=83]/*").get(0).getParent().toXML());
Produces the following output:
<location id="83">
<location_name>name</location_name>
<company_name>company a</company_name>
<machines>
<machine id="12">A</machine>
<machine id="312">B</machine>
</machines>
</location>
The call you make to getDocument() is returning the entirety of the XML document.
The call to query() returns a Nodes object directly containing references to the nodes that you are after.
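If you change to
Element location = (Element) doc.query(
"//location[location_name='" + locationName + "']").get(0);
System.out.println(location.getFirstChildElement("location_name").getValue());
it should be OK. (location_name is a child element of location, not an attribute, so it is read with getFirstChildElement() on the matched location element.)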
EDIT (by extraneon)
Some extra explanation not worthy of an answer by itself:
By doing
Element location =
(Element) doc.query("//location[location_name='"
+ locationName +"']/*").get(0)
.getDocument().getRootElement();
you search through the tree and get the requested node. But then you call getDocument().getRootElement() on the element you want, which gives you the uppermost element of the document.
The above query can thus be simplified to:
Element location = (Element)doc.getRootElement();
which is not what you intended.
It's a bit like a bungee jump. You go down to where you need to be (the element) but go immediately back to where you came from (the root element).
It's not clear (at least to me) what actually has to be done. Your query should return a list of nodes matching the given criteria. You get a NodeList, which you can then iterate over to read the content of each node, for example with getNodeValue.

Remove Element from JDOM document using removeContent()

Given the following scenario, where the xml, Geography.xml looks like -
<Geography xmlns:ns="some valid namespace">
<Country>
<Region>
<State>
<City>
<Name></Name>
<Population></Population>
</City>
</State>
</Region>
</Country>
</Geography>
and the following sample java code -
InputStream is = new FileInputStream("C:\\Geography.xml");
SAXBuilder saxBuilder = new SAXBuilder();
Document doc = saxBuilder.build(is);
XPath xpath = XPath.newInstance("/*/Country/Region/State/City");
Element el = (Element) xpath.selectSingleNode(doc);
boolean b = doc.removeContent(el);
The removeContent() method doesn't remove the City element from the content list of the doc. The value of b is false.
I don't understand why it is not removing the element. I even tried deleting the Name and Population elements from the XML just to see if that was the issue, but apparently it's not.
Another thing I tried, although I know it is not essentially different, was to go through the Parent:
Parent p = el.getParent();
boolean s = p.removeContent(new Element("City"));
What might the problem be, and what is a possible solution? If anyone can explain the real behaviour of removeContent(), please do; I suspect it has to do with the parent-child relationship.
removeContent(Content child) removes child only if it is one of the parent's immediate children, which is not the case here. Use el.detach() instead.
If you want to remove the City element, get its parent and call removeContent:
XPath xpath = XPath.newInstance("/*/Country/Region/State/City");
Element el = (Element) xpath.selectSingleNode(doc);
el.getParent().removeContent(el);
The reason why doc.removeContent(el) does not work is because el is not a child of doc.
Check the javadocs for details. There are a number of overloaded removeContent methods there.
That works, but keep in mind that getParent() returns a Parent object rather than an Element, whereas detach(), which removes the node itself, must be called on an Element.
To detach the parent instead, do:
el.getParentElement().detach();
This removes the parent element together with all its children!
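The "only immediate children" rule is not specific to JDOM. As a rough illustration, here is a sketch of the same situation with the JDK's org.w3c.dom API (not JDOM); RemoveFromParentDemo and its methods are hypothetical names, and the XML is the structure from the question without the namespace declaration. One difference to be aware of: where JDOM's removeContent returns false, the W3C DOM's removeChild throws a DOMException when the node is not a direct child.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class RemoveFromParentDemo {

    static final String XML =
            "<Geography><Country><Region><State><City><Name/><Population/></City>"
          + "</State></Region></Country></Geography>";

    public static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Asks the document to remove City; fails because City is not a direct child of the document. */
    public static boolean removeViaDocument(Document doc) {
        Node city = doc.getElementsByTagName("City").item(0);
        try {
            doc.removeChild(city);
            return true;
        } catch (DOMException e) {
            return false; // NOT_FOUND_ERR: the document's only child is <Geography>
        }
    }

    /** Asks City's own parent (State) to remove it; returns true once City is gone. */
    public static boolean removeViaParent(Document doc) {
        Node city = doc.getElementsByTagName("City").item(0);
        city.getParentNode().removeChild(city);
        return doc.getElementsByTagName("City").getLength() == 0;
    }

    public static void main(String[] args) {
        Document doc = parse();
        System.out.println(removeViaDocument(doc)); // false
        System.out.println(removeViaParent(doc));   // true
    }
}
```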

JDOM.Element.getChild(String) is returning unexpected results

According to the API at jdom.org, the semantics of getChild(String name):
This returns the first child element within this element with the given local name and belonging to no namespace. If no elements exist for the specified name and namespace, null is returned.
Therefore, if I have an XML structure like:
<?xml version="1.0" encoding="UTF-8"?>
<lvl1>
<lvl2>
<lvl3/>
</lvl2>
</lvl1>
I have a JDOM Element which is currently pointing to <lvl1>. I should be able to make the following call:
Element lvl3 = lvl1Element.getChild("lvl3");
and lvl3 should be non-null.
However, I'm finding that lvl3 is actually null. Am I missing something?
Here is a sample code snippet that should work:
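import java.io.StringReader;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;

public class GetChildExample {
    public static void main(String[] args) throws Exception {
        // build() parses whatever the reader supplies, so the reader must deliver the XML shown above.
        Document doc = new SAXBuilder().build(new StringReader("path to file"));
        Element lvl1Element = doc.getRootElement();
        Element lvl3Element = lvl1Element.getChild("lvl3"); // is null. Why?
    }
}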
In order to get the functionality I was looking for, I used an Iterator from the getDescendants(ElementFilter) function from jdom.org
I then got the Element I was looking for by using code similar to the following:
Element lvl3 = (Element) lvl1.getDescendants(new ElementFilter("lvl3")).next(); // getDescendants returns an Iterator
You've just said it....
This returns the first child element
within this element with the given
local name...
Basically, on lvl1, your first child is lvl2. I haven't used JDOM to help further. My suggestion is to go to lvl2 and retrieve lvl3.
---lvl1
---lvl2(child of lvl1)
---lvl3(child of lvl2)
