I know how to parse XML documents with DOM when they are in the form:
<tagname> valueIWant </tagname>
However, the element I'm now trying to get is instead in the form
<photo farm="9" id="8147664661" isfamily="0" isfriend="0" ispublic="1"
owner="8437609@N04" secret="4902a217af" server="8192" title="Rainbow"/>
I usually use cel.getTextContent() to return the value, but that doesn't work in this case. Neither does cel.getAttributes(), which I thought would work...
Ideally, I need to just get the id and owner numerical values. However if someone can help on how to get all of it, then I can deal with removing the parts I don't want later.
What you're looking for are the values of the attributes attached to an Element. Use the getAttribute(String name) method to read a single attribute by name.
If you want all the attributes, call getAttributes() and iterate over the NamedNodeMap it returns. An example of both approaches might look something like this:
private void getData(Document document){
if(document == null)
return;
NodeList list = document.getElementsByTagName("photo");
Element photoElement = null;
if(list.getLength() > 0){
photoElement = (Element) list.item(0);
}
if(photoElement != null){
System.out.println("ID: "+photoElement.getAttribute("id"));
System.out.println("Owner: "+photoElement.getAttribute("owner"));
NamedNodeMap childList = photoElement.getAttributes();
Attr attribute;
for(int index = 0; index < childList.getLength(); index++){
if(childList.item(index).getNodeType() == Node.ATTRIBUTE_NODE){
attribute = ((Attr)childList.item(index));
System.out.println(attribute.getNodeName()+" : "+attribute.getNodeValue());
}else{
System.out.println(childList.item(index).getNodeType());
}
}
}
}
Something like:
Element photo = (Element)yournode;
photo.getAttribute("farm");
will get you the value of the farm attribute. You need to treat your node as an Element to have access to these attributes (doc).
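For a self-contained check, here is a minimal sketch that parses a photo element from a string and reads id and owner with getAttribute. The class name PhotoAttributes and its two helper methods are made up for illustration; only the element and attribute names come from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class PhotoAttributes {
    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the named attribute of the first <photo> element,
    // or null when the document has no <photo> element at all.
    // Note: getAttribute itself returns "" for an attribute that is absent.
    static String photoAttribute(Document doc, String name) {
        Element photo = (Element) doc.getElementsByTagName("photo").item(0);
        return photo == null ? null : photo.getAttribute(name);
    }

    public static void main(String[] args) {
        String xml = "<photo farm=\"9\" id=\"8147664661\" owner=\"8437609@N04\" title=\"Rainbow\"/>";
        Document doc = parse(xml);
        System.out.println("ID: " + photoAttribute(doc, "id"));
        System.out.println("Owner: " + photoAttribute(doc, "owner"));
    }
}
```

The null check matters because item(0) returns null when no photo element exists, and casting null to Element does not fail until you call a method on it.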
How can I read the value of this XPath '/html/body/article/div[2]/div/div[1]/div[10]/div/div/text()[1]' in Java?
findElement(By.xpath(xpath));
I can reach the item with a Chrome extension, but in Java it fails, saying no element was found. The only path that works is '/html/body/article/div[2]/div/div[1]/div[10]/div/div', but that is not what I want.
For example, on this site: https://www.milliyet.com.tr/siyaset/canikli-kararlilikla-yurumeye-devam-ediyoruz-6277692
I would like to get the text values of these items separately:
/html/body/article/div[2]/div/div[1]/div[10]/div/div/text()[1]
/html/body/article/div[2]/div/div[1]/div[10]/div/div/span/span[1]/time
/html/body/article/div[2]/div/div[1]/div[10]/div/div/span/span[2]/time
thanks.
findElement() can only return the WebElement that contains the text you need. To get the actual text, first find the element and then call getText() on it.
WebElement myElement = findElement(By.xpath("/html/body/article/div[2]/div/div[1]/div[10]/div/div"));
System.out.println(myElement.getText());
Use relative XPath expressions to fetch your data. You can use the following three:
//div[@class='nd-article__info-block']
(//time[@datetime])[1]
(//time[@datetime])[2]
To extract the text, use the getText() method:
String el1 = driver.findElement(By.xpath("//div[@class='nd-article__info-block']")).getText();
String el2 = driver.findElement(By.xpath("(//time[@datetime])[1]")).getText();
String el3 = driver.findElement(By.xpath("(//time[@datetime])[2]")).getText();
You can also store the 3 results in a list with a single XPath expression:
//div[@class='nd-article__info-block']|//time[@datetime]
Code:
List<WebElement> list = driver.findElements(By.xpath("//div[@class='nd-article__info-block']|//time[@datetime]"));
List<String> els_text=new ArrayList<>();
for(int i=0; i<list.size(); i++){
els_text.add(list.get(i).getText());
System.out.println(list.get(i).getText());
}
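The underlying issue is that an XPath ending in text() selects a text node, not an element, and findElement can only hand back elements. The sketch below illustrates the difference using the JDK's built-in XPath engine instead of Selenium. The SAMPLE markup and the class and method names are invented for illustration and are not taken from the real page.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextNodeXPathDemo {
    // Hypothetical markup, loosely shaped like an article info block.
    static final String SAMPLE =
            "<div class=\"info\">Ankara <span><time>10:15</time><time>11:30</time></span></div>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Evaluates the expression and returns its string value, trimmed.
    static String text(Document doc, String expr) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc).trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    // True when the first node selected by the expression is a text node.
    static boolean selectsTextNode(Document doc, String expr) {
        try {
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODE);
            return node != null && node.getNodeType() == Node.TEXT_NODE;
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(selectsTextNode(doc, "/div/text()[1]")); // a text node, not an element
        System.out.println(selectsTextNode(doc, "/div"));           // an element
        System.out.println(text(doc, "/div/text()[1]"));
        System.out.println(text(doc, "(//time)[2]"));
    }
}
```

In Selenium the practical consequence is the one the answers describe: locate the enclosing element, then take its text with getText().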
Is it possible to get the contents of an XML tag as a String in Java using Simple XML?
I'm trying to do it using a Converter. I can obtain <tag1> as an InputNode object, but there is no API to retrieve the contents as String. I could iterate the children with InputNode.getNext() and reconstruct the content by recursively retrieving name, attributes, values, etc... but I would never be sure that it would match the original XML.
Example:
<root>
<tag1>
<unknownTag>Unknown</unknownTag>
<otherUnknownTag>
<children1>hello</children1>
<children2>bye</children2>
</otherUnknownTag>
</tag1>
<tag2>
...
</tag2>
</root>
I would like to retrieve the following contents of <tag1> as a String (and prevent deserialisation for all <tag1> children):
<unknownTag>Unknown</unknownTag>
<otherUnknownTag>
<children1>hello</children1>
<children2>bye</children2>
</otherUnknownTag>
The contents of <tag1> are not known at deserialisation time.
As far as I know it is possible only partially. This is how far I've got:
public String getNodeAsString(InputNode node) throws Exception {
StringBuilder builder = new StringBuilder();
String value = node.getValue();
if (value != null) {
builder.append(value);
}
InputNode child = node.getNext();
while (child != null) {
builder.append("<").append(child.getName());
for (String attribute : child.getAttributes()) {
builder.append(" ")
.append(attribute)
.append("=\"")
.append(child.getAttribute(attribute).getValue())
.append("\"");
}
builder.append(">")
.append(child.getValue())
.append("</").append(child.getName()).append(">");
value = node.getValue();
if (value != null) {
builder.append(value);
}
child = child.getNext();
}
return builder.toString();
}
This kind of works but has two flaws:
The order of attributes is not preserved, because SimpleXML stores attributes in a map and iteration follows the order of the map keys.
It cannot handle tags nested inside a direct child of the InputNode, or at least I don't know how to get the list of children of a child node.
I want to count the elements in an XML document that have a specific name, e.g. "while".
The problem is that with the following code I only get the top-level elements with this name:
for ( Iterator i = root.elementIterator( "while" ); i.hasNext(); ) {
Element foo = (Element) i.next();
But any lower level "while" element is not part of the iterator...
What is the most efficient way of obtaining all elements (whether top level or lower level) that have name="while"? Do I have to parse through all elements in document for this purpose?
You can use XPath for that using //while or the name() function and a wildcard node *: //*[name() = 'while']
List list = document.selectNodes("//*[name() = 'while']"); // or "//while"
int numberOfNodes = list.size();
for (Iterator iter = list.iterator(); iter.hasNext(); ) {
// do something
}
This XPath expression worked for me (it selects every descendant element, so you still have to filter by name afterwards):
document.selectNodes(".//*")
An element iterator only covers the next level of the tree. To walk the whole XML document and collect every matching Element, you need recursion. For example, the following code returns a list of all elements with the given name:
public List<Element> getElementsByName(String name, Element parent, List<Element> elementList)
{
if (elementList == null)
elementList = new ArrayList<Element>();
for ( Iterator i = parent.elementIterator(); i.hasNext(); ) {
Element current = (Element) i.next();
if (current.getName().equalsIgnoreCase(name))
{
elementList.add(current);
}
getElementsByName(name, current, elementList);
}
return elementList;
}
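To see what the //*[name() = '...'] expression matches, here is a small sketch that counts nested elements. It uses the JDK's built-in DOM and XPath classes rather than dom4j, so it runs without extra libraries. The sample document and the names CountWhileElements and countByName are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CountWhileElements {
    // Hypothetical sample: one top-level <while> and two nested ones.
    static final String SAMPLE =
            "<program><while><if><while/></if></while><for><while/></for></program>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Counts every element with the given name at any depth.
    // The name is pasted into the expression, so it must not contain quotes.
    static int countByName(Document doc, String name) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//*[name() = '" + name + "']", doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad element name: " + name, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("while elements: " + countByName(doc, "while"));
    }
}
```

With dom4j the same expression goes into selectNodes, as in the answer above; only the surrounding API differs.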
I am trying to parse an xml file in which there is a group element "patent-assignee" which contains some elements- name, address1, address2,city,state, postcode, country.
While values will always be there for "name" and "address1" the other elements may or may not have values.
I have navigated to a single patent-assignee element, and now want to check if this record has value for address2 (and other fields) or not.
Some relevant code is given below--
el_patentassignees= (Element) npassignee.item(ncount);
//now el_patentassignee has in it the content of one patent assignee element
el_assigneeaddress2= (Element) el_patentassignees.getElementsByTagName("address2").item(0);
val_assigneeaddress2= el_assigneeaddress2.getTextContent();
Iterate through all child nodes of el_assigneeaddress2 and, when you see a Text node, take its value. Check for null first, because item(0) returns null when the record has no address2 element:
if (el_assigneeaddress2 != null) {
NodeList nodeList = el_assigneeaddress2.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
Node child = nodeList.item(i);
if (child.getNodeName().equals("#text")) {
val_assigneeaddress2 = child.getTextContent();
break;
}
}
}
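The check can also be wrapped in a small helper that treats a missing element and an empty element the same way. This is only a sketch: the helper name textOrNull, the class name and the sample record are assumptions, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class OptionalChildText {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the trimmed text of the first descendant <tag> of parent,
    // or null when the element is missing or contains only whitespace.
    static String textOrNull(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        if (node == null) {
            return null; // element not present in this record
        }
        String text = node.getTextContent().trim();
        return text.isEmpty() ? null : text;
    }

    public static void main(String[] args) {
        // Hypothetical record: address2 is present but empty, city is absent.
        Document doc = parse("<patent-assignee><name>Acme</name>"
                + "<address1>1 Main St</address1><address2/></patent-assignee>");
        Element assignee = doc.getDocumentElement();
        System.out.println("name: " + textOrNull(assignee, "name"));
        System.out.println("address2: " + textOrNull(assignee, "address2"));
        System.out.println("city: " + textOrNull(assignee, "city"));
    }
}
```

Keep in mind that getElementsByTagName searches all descendants, not just direct children, which is fine here as long as the field names are unique within one patent-assignee element.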
I want to modify an XML file using DOM, but when I call node.getNodeValue() it returns null and I don't know why. My XML file contains the following tags:
[person], which contains the child [name], which in turn contains the children [firstname, middleInitial, lastName].
I want to update the first name, middle initial and last name using DOM.
This is my Java DOM processing code:
NodeList refPeopleList = doc.getElementsByTagName("person");
for (int i = 0; i < refPeopleList.getLength(); i++) {
NodeList personList = refPeopleList.item(i).getChildNodes();
for (int personDetalisCnt = 0; personDetalisCnt < refPeopleList.getLength(); personDetalisCnt++) {
{
currentNode = personList.item(personDetalisCnt);
String nodeName = currentNode.getNodeName();
System.out.println("node name is " + nodeName);
if (nodeName.equals("name")) {
System.out.println("indise name");
NodeList nameList = currentNode.getChildNodes();
for(int cnt=0;cnt<nameList.getLength();cnt++)
{
currentNode=nameList.item(cnt);
if(currentNode.getNodeName().equals("firstName"))
{
System.out.println("MODIFID NAME :"+currentNode.getNodeValue()); //prints null
System.out.println("indide fname"+" node name is "+currentNode.getNodeName()); //prints firstName
String nodeValue="salma";
currentNode.setNodeValue(nodeValue);
System.out.println("MODIFID NAME :"+currentNode.getNodeValue());//prints null
}
}
}
}
Rather than calling getNodeValue() / setNodeValue() on the <firstName> element node, try getting the firstName element's text node child, and call getNodeValue() / setNodeValue() on it.
Try
if(currentNode.getNodeName().equals("firstName"))
{
Node textNode = currentNode.getFirstChild();
System.out.println("Initial value:" + textNode.getNodeValue());
String nodeValue="salma";
textNode.setNodeValue(nodeValue);
System.out.println("Modified value:" + textNode.getNodeValue());
}
From the DOM spec:
"The attributes nodeName, nodeValue and attributes are included as a mechanism to get at node information without casting down to the specific derived interface. In cases where there is no obvious mapping of these attributes for a specific nodeType (e.g., nodeValue for an Element or attributes for a Comment), this returns null."
Similarly in the Java docs for the Node interface, the table near the top shows that the nodeValue of an element is null.
This is why using getNodeValue on an element will always return null, and why you need to use getFirstChild() first in order to get the text node (assuming there are no other child nodes). If there is a mixture of element and text child nodes, you can use getNodeType() to check which child is which (text is type 3).
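The behaviour described above can be reproduced with a few lines. The sketch below assumes a minimal person document; the class name ElementTextDemo and its helpers are invented for illustration. It also shows setTextContent(), a DOM Level 3 alternative that replaces all children of the element with a single text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class ElementTextDemo {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the first <firstName> element; assumes the document has one.
    static Element firstName(Document doc) {
        return (Element) doc.getElementsByTagName("firstName").item(0);
    }

    public static void main(String[] args) {
        Document doc = parse("<person><name><firstName>Sara</firstName></name></person>");
        Element first = firstName(doc);

        // nodeValue is defined as null for element nodes.
        System.out.println("element value: " + first.getNodeValue());

        // The text lives in the child text node.
        System.out.println("text node value: " + first.getFirstChild().getNodeValue());

        // Updating the text node changes what the element contains.
        first.getFirstChild().setNodeValue("salma");
        System.out.println("after setNodeValue: " + first.getTextContent());

        // Alternative: replace all children with one text node.
        first.setTextContent("Salma");
        System.out.println("after setTextContent: " + first.getTextContent());
    }
}
```

setTextContent() is the simpler choice when the element may be empty, because getFirstChild() returns null for an element with no children.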
Is it firstName or firstname? Watch the case.