I have the following code
try {
String xml = "<ADDITIONALIDENT><FEATURE MID=\"TEST\"><NAME>ONE NAME</NAME><VALUE>ONE VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>TWO NAME</NAME><VALUE>TWO VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>THREE NAME</NAME><VALUE>THREE VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>FOUR NAME</NAME><VALUE>FOUR VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>FIVE NAME</NAME><VALUE>FIVE VALUE</VALUE></FEATURE></ADDITIONALIDENT>";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document document = dBuilder.newDocument();
document = dBuilder.parse(new InputSource(new StringReader(xml)));
NodeList featureList = document.getElementsByTagName("FEATURE");
for (int i = 0; i < featureList.getLength(); i++) {
Element featureElement = (Element) featureList.item(i);
NodeList nameList = featureElement.getElementsByTagName("NAME");
NodeList valueList = featureElement.getElementsByTagName("VALUE");
System.out.println("THIS IS NAME: " + nameList.item(0).getTextContent());
System.out.println("THIS IS VALUE: " + valueList.item(0).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
It works fine and it finds the correct values, but I don't think I am doing it the right way. I feel like I shouldn't be using lists within the actual featureList Element.
Is there a way to get the values without making two lists?
<ADDITIONALIDENT>
<FEATURE MID="TEST">
<NAME>ONE NAME</NAME>
<VALUE>ONE VALUE</VALUE>
</FEATURE>
<FEATURE MID="TEST">
<NAME>TWO NAME</NAME>
<VALUE>TWO VALUE</VALUE>
</FEATURE>
</ADDITIONALIDENT>
Try the following solution:
try {
String xml = "YOUR_XML_CONTENT";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document document = dBuilder.newDocument();
document = dBuilder.parse(new InputSource(new StringReader(xml)));
NodeList featureList = document.getElementsByTagName("FEATURE");
for (int i = 0; i < featureList.getLength(); i++) {
Element featureElement = (Element) featureList.item(i);
System.out.println("THIS IS NAME: " +
featureElement.getElementsByTagName("NAME").item(0).getTextContent());
System.out.println("THIS IS VALUE: " +
featureElement.getElementsByTagName("VALUE").item(0).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
Output:
THIS IS NAME: ONE NAME
THIS IS VALUE: ONE VALUE
THIS IS NAME: TWO NAME
THIS IS VALUE: TWO VALUE
THIS IS NAME: THREE NAME
THIS IS VALUE: THREE VALUE
THIS IS NAME: FOUR NAME
THIS IS VALUE: FOUR VALUE
THIS IS NAME: FIVE NAME
THIS IS VALUE: FIVE VALUE
Good question. There are several ways to query an XML document using Java:
a) Parse the XML document into a Document and use getElementsByTagName to extract the nodes that you need. This is your current approach. It is OK for simple documents but it does not scale well because the Document class knows nothing about the structure of the document. The getElementsByTagName() method has to assume that any tag that it finds might occur more than once.
But you can fix that...
b) Generate Java classes for your specific document structure. This requires you to have an XML schema that describes the structure of your XML. You can then use JAXB to generate Java classes to process your specific XML format. In your example, the generated code would know (from the schema) that there is exactly one instance of NAME and VALUE within each FEATURE tag. The getter methods for NAME and VALUE would each return a single value rather than a list, so your code would not need to handle lists for single-occurrence elements.
See https://docs.oracle.com/javase/tutorial/jaxb/intro/index.html for more details.
c) Use the XPath support that is built into Java to extract exactly the nodes that you need. XPath is designed for processing XML documents and is very powerful and flexible.
See How to read XML using XPath in Java for more details.
Option a) is hardly ever used for processing non-trivial XML documents. Both b) and c) are very common.
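To make option c) concrete for the FEATURE document from the question, here is a minimal sketch. It selects the FEATURE elements once and then evaluates the relative expressions NAME and VALUE against each of them, so no inner lists are needed. The class and method names (FeatureXPathSketch, readFeatures) are made up for illustration, and the sketch assumes every FEATURE has at most one NAME and one VALUE child.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question or answers.
class FeatureXPathSketch {

    // Returns one "name=value" string per FEATURE element, in document order.
    static List<String> readFeatures(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList features =
                    (NodeList) xpath.evaluate("//FEATURE", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < features.getLength(); i++) {
                // A relative expression is evaluated against the current FEATURE
                // node; the two-argument evaluate() returns the string value of
                // the first match (or "" if there is none).
                String name = xpath.evaluate("NAME", features.item(i));
                String value = xpath.evaluate("VALUE", features.item(i));
                result.add(name + "=" + value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read features", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ADDITIONALIDENT>"
                + "<FEATURE MID=\"TEST\"><NAME>ONE NAME</NAME><VALUE>ONE VALUE</VALUE></FEATURE>"
                + "<FEATURE MID=\"TEST\"><NAME>TWO NAME</NAME><VALUE>TWO VALUE</VALUE></FEATURE>"
                + "</ADDITIONALIDENT>";
        readFeatures(xml).forEach(System.out::println);
    }
}
```

The outer query still returns a NodeList, because there really are several FEATURE elements, but NAME and VALUE are read as plain strings.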
Related
I use the worldweatheronline API. The service gives xml in the following form:
<hourly>
<tempC>-3</tempC>
<weatherDesc>rain</weatherDesc>
<precipMM>0.0</precipMM>
</hourly>
<hourly>
<tempC>5</tempC>
<weatherDesc>no</weatherDesc>
<precipMM>0.1</precipMM>
</hourly>
Can I somehow get all the <hourly> nodes in which <tempC> > 0 and <weatherDesc> = rain?
How can I exclude the <hourly> nodes that do not interest me from the response?
This is quite feasible using XPath.
You can filter a document based on element values, attribute values and other criteria.
Here is a working example that gets the elements according to the first point in the question:
try (InputStream is = Files.newInputStream(Paths.get("C:/temp/test.xml"))) {
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document xmlDocument = builder.parse(is);
XPath xPath = XPathFactory.newInstance().newXPath();
// get hourly elements that have tempC child element with value > 0 and weatherDesc child element with value = "rain"
String expression = "//hourly[tempC>0 and weatherDesc=\"rain\"]";
NodeList hours = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < hours.getLength(); i++) {
System.out.println(hours.item(i) + " " + hours.item(i).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
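For the second point in the question (dropping the <hourly> nodes you are not interested in), one possible approach is to select the nodes that do not match and remove them from the DOM. The following is only a sketch: the root element name (data), the class name and the method name are assumptions, since the question does not show the full response.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class HourlyFilterSketch {

    // Removes every <hourly> that is NOT (tempC > 0 and weatherDesc = 'rain')
    // and returns the tempC values of the <hourly> elements that remain.
    static List<String> keepWarmRain(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            NodeList unwanted = (NodeList) xpath.evaluate(
                    "//hourly[not(tempC > 0 and weatherDesc = 'rain')]",
                    doc, XPathConstants.NODESET);

            // Copy the nodes first so that removal cannot disturb the iteration.
            List<Node> toRemove = new ArrayList<>();
            for (int i = 0; i < unwanted.getLength(); i++) {
                toRemove.add(unwanted.item(i));
            }
            for (Node node : toRemove) {
                node.getParentNode().removeChild(node);
            }

            NodeList kept = (NodeList) xpath.evaluate(
                    "//hourly/tempC", doc, XPathConstants.NODESET);
            List<String> temps = new ArrayList<>();
            for (int i = 0; i < kept.getLength(); i++) {
                temps.add(kept.item(i).getTextContent());
            }
            return temps;
        } catch (Exception e) {
            throw new IllegalStateException("Could not filter document", e);
        }
    }

    public static void main(String[] args) {
        // The <data> root element is an assumption made for this example.
        String xml = "<data>"
                + "<hourly><tempC>-3</tempC><weatherDesc>rain</weatherDesc></hourly>"
                + "<hourly><tempC>5</tempC><weatherDesc>no</weatherDesc></hourly>"
                + "<hourly><tempC>7</tempC><weatherDesc>rain</weatherDesc></hourly>"
                + "</data>";
        System.out.println(keepWarmRain(xml));
    }
}
```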
I think you should create an XSD from the XML and generate JAXB classes. Using those JAXB classes you can easily unmarshal the XML and apply your logic.
I am able to parse something like this:
<tag>value</tag>
via:
File inputFile = new File("input.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
NodeList nodes = doc.getElementsByTagName("tag");
String value = nodes.getLength() > 0 ? nodes.item(0).getTextContent().trim() : "";
System.out.println(value); // prints 'value'
I tried several things, but I am not able to successfully parse 'value' from this:
<tag><value/></tag>
It seems to be valid XML, but I don't know what this format is or how to parse the value.
Any suggestions are welcome!
EDIT:
From the comments, it looks like <value/> is just another tag within <tag>. To get 'value' I have to get the child tag's name:
NodeList nodes = doc.getElementsByTagName("tag");
String value = nodes.item(0).getChildNodes().item(0).getNodeName();
Is there a name for a value in this format (one in <> markers ending in /)?
To get the name of the node inside <tag>, do the following:
System.out.println(nodes.item(0).getFirstChild().getNodeName());
To get the text content of value (for example, in <tag><value>5</value></tag>), do the following:
System.out.println(nodes.item(0).getFirstChild().getTextContent());
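As a small illustration of the two calls above: <value/> is an empty element written in its self-closing form and is equivalent to <value></value>, so it has a name but no text. The sketch below shows this; the class and method names are invented for the example. It assumes there is no whitespace between <tag> and its child, because otherwise getFirstChild() would return a whitespace text node instead of the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class EmptyElementSketch {

    // Returns "<child name>|<child text>" for the first child of the first <tag>.
    static String describeFirstChild(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node tag = doc.getElementsByTagName("tag").item(0);
            Node child = tag.getFirstChild();
            // For an element node, getTextContent() is "" when it has no text.
            return child.getNodeName() + "|" + child.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFirstChild("<tag><value/></tag>"));
        System.out.println(describeFirstChild("<tag><value></value></tag>"));
        System.out.println(describeFirstChild("<tag><value>5</value></tag>"));
    }
}
```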
I have an XML with the following structure.
<message>
<header>
</header>
<body>
</body>
<end>
</end>
</message>
Each of the header, body and end nodes contains fields that I need to extract into separate hash maps. What is the best way to go about this without using external libraries? The end result is to display a two-column view of the entire message (field name, value).
You can use the DocumentBuilderFactory and DocumentBuilder classes that come with the Java API.
For an example, refer to the link.
It depends on the structure of your data and your hash map: what is the key, and what is the value.
Nevertheless, DOM and XPath do the job:
String xml = "..."; // your XML
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "//header"; // same for body, end, ...
XPathExpression expr = xpath.compile(expression);
NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int k = 0; k < nodes.getLength(); k++) {
    // do what you want with nodes.item(k)
}
Hope it helps.
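To show what "do what you want" could look like for the hash maps in the question, here is a rough sketch using only the JDK. It assumes each field is a direct child element of header, body or end, with the element name as the key and its text as the value; the field names in main (id, type, text) and the class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class MessageFieldsSketch {

    // Collects the direct child elements of the first <section> element
    // into a map of element name -> trimmed text content.
    static Map<String, String> fieldsOf(String xml, String section) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // LinkedHashMap keeps the fields in document order for display.
            Map<String, String> fields = new LinkedHashMap<>();
            NodeList sections = doc.getElementsByTagName(section);
            if (sections.getLength() == 0) {
                return fields;
            }
            NodeList children = sections.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and comments between the fields.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    fields.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read section " + section, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<message>"
                + "<header><id>42</id><type>A</type></header>"
                + "<body><text>hello</text></body>"
                + "<end></end>"
                + "</message>";
        for (String section : new String[] {"header", "body", "end"}) {
            System.out.println(section + ": " + fieldsOf(xml, section));
        }
    }
}
```

Note that a map keeps only the last value if the same field name occurs twice within a section.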
I have the part of an HTML page shown below and want to extract the content of the div tag whose id is hiddenDivHL using a DOM parser:
Part of the HTML page:
<div id='hiddenDivHL' style='display:none'>http://74.127.61.106/udayavaniIpad/details.php?home=0&catid=882&newsid=123069[InnerSep]http://www.udayavani.com/udayavani_cms/gall_content/2012/1/2012_1$thumbimg117_Jan_2012_000221787.jpg[InnerSep]ಯುವಜನತೆಯಿಂದ ಭವ್ಯಭಾರತ[OuterSep]
So far I have used the code below, but I am unable to use getElementById. How can I do that?
DOM Parser:
try {
URL url = new URL("http://74.127.61.106/udayavaniIpad/details_android.php?home=1&catid=882&newsid=27593");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName("item");
/** Assign TextView array length by ArrayList size */
name = new TextView[nodeList.getLength()];
website = new TextView[nodeList.getLength()];
category = new TextView[nodeList.getLength()];
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
name[i] = new TextView(this);
Element fstElmnt = (Element) node;
NodeList nameList = fstElmnt.getElementsByTagName("hiddenDivHL");
Element nameElement = (Element) nameList.item(0);
nameList = nameElement.getChildNodes();
name[i].setText("Name = "
+ ((Node) nameList.item(0)).getNodeValue());
layout.addView(name[i]);
}
} catch (Exception e) {
System.out.println("XML Parsing Exception = " + e);
}
/** Set the layout view to display */
setContentView(layout);
}
XPath is IMHO the most common and easiest way to navigate the DOM in Java.
try{
URL url = new URL("http://74.127.61.106/udayavaniIpad/details_android.php?home=1&catid=882&newsid=27593");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/item/div[@id='hiddenDivHL']";
Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
} catch (Exception e) {
System.out.println("XML Parsing Exception = " + e);
}
I'm not sure if the XPath expression is right, but the link is here: http://developer.android.com/reference/javax/xml/xpath/package-summary.html
There are 2 differences between getElementById and getElementsByName:
getElementById requires a single unique id in your document, whereas getElementsByName can fetch several occurrences of the same name.
getElementById is a method (or function) of the document object. You can only access it by using document.getElementById(..).
Your code seems to violate both of these requirements. First, you loop over a list of nodes and expect a hiddenDivHL id in each of them, so the id is not unique. Second, your starting point is not the document but each node in that list.
If you know you have a single instance with that id, try document.getElementById.
I didn't really get the question.
a) Do you mean getting more than one element with document.getElementById('hiddenDivHL')?
If so, my answer would be that, in an HTML document, an id has to be reserved for one element only.
b) Or do you just want to fetch that one element?
What exactly does not work? What are you trying to achieve? I fear I don't really get the point.
You have to call fstElmnt.getElementsByTagName("div"); to get all the div elements and then check whether their id attribute equals hiddenDivHL.
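A minimal sketch of that loop, assuming the page is well-formed XML (a plain DOM parser will reject typical real-world HTML). The class and method names are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class DivByIdSketch {

    // Returns the text of the first <div> whose id attribute equals the given
    // id, or null if there is no such div.
    static String textOfDivWithId(String xhtml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            NodeList divs = doc.getElementsByTagName("div");
            for (int i = 0; i < divs.getLength(); i++) {
                Element div = (Element) divs.item(i);
                // getAttribute returns "" when the attribute is missing.
                if (id.equals(div.getAttribute("id"))) {
                    return div.getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<div id='other'>ignored</div>"
                + "<div id='hiddenDivHL' style='display:none'>payload</div>"
                + "</body></html>";
        System.out.println(textOfDivWithId(page, "hiddenDivHL"));
    }
}
```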
The easiest way I can think of is to use the jsoup library. It parses the DOM for you and lets you select elements using a CSS-style (or jQuery-style) selector.
In this example you would do something like this:
Document doc = Jsoup.connect("http://74.127.61.106/udayavaniIpad/details_android.php?home=1&catid=882&newsid=27593").get();
String divContents = doc.select("#hiddenDivHL").first().text();
Why are you unable to use getElementById()? It is in JavaSE 7 and JavaSE6/5/1.4.2, since 'DOM Level 2'.
To get the contents of an element in JavaScript:
var el = document.getElementById('hiddenDivHL');
var contents = el.innerHTML;
alert("Found " + contents.length + " characters of content.");
See your example on jsfiddle.
I think the confusion is due to the fact that your question is tagged JavaScript, but the code you posted is Java. They are different languages, and JavaScript people will only be confused by that parser. I haven't used Java in years so I can't really help you there.
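On the Java side, one possible reason getElementById() finds nothing: in the Java DOM it only matches attributes that are typed as IDs (through a DTD or schema, or by calling setIdAttribute), and an attribute that merely happens to be named id does not count. The sketch below illustrates this under the assumption that the input is well-formed XML; the class and method names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class GetElementByIdSketch {

    // Looks up an element with Document.getElementById. When markIds is true,
    // every "id" attribute is first declared to be of type ID.
    static String findById(String xml, String id, boolean markIds) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            if (markIds) {
                NodeList all = doc.getElementsByTagName("*");
                for (int i = 0; i < all.getLength(); i++) {
                    Element element = (Element) all.item(i);
                    if (element.hasAttribute("id")) {
                        // Tell the DOM that this attribute is an ID attribute.
                        element.setIdAttribute("id", true);
                    }
                }
            }
            Element hit = doc.getElementById(id);
            return hit == null ? null : hit.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><div id='hiddenDivHL'>payload</div></body></html>";
        System.out.println(findById(page, "hiddenDivHL", false)); // no ID typing
        System.out.println(findById(page, "hiddenDivHL", true));  // ID typing added
    }
}
```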
I have an XML document, and an XPath expression for that doc. I have to update the doc by using XPath at runtime.
How can I do this using Java?
The below is my xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<PersonList>
<Person>
<Name>Sonu Kapoor</Name>
<Age>24</Age>
<Gender>M</Gender>
<PostalCode>54879</PostalCode>
</Person>
<Person>
<Name>Jasmin</Name>
<Age>28</Age>
<Gender>F</Gender>
<PostalCode>78745</PostalCode>
</Person>
<Person>
<Name>Josef</Name>
<Age>232</Age>
<Gender>F</Gender>
<PostalCode>53454</PostalCode>
</Person>
</PersonList>
I have to change the values of Name and Age under //PersonList/Person[2].
Use setNodeValue. First, get a NodeList, for example:
myNodeList = (NodeList) xpath.compile("//MyXPath/text()")
.evaluate(myXmlDoc, XPathConstants.NODESET);
Then set the value of e.g. the first node:
myNodeList.item(0).setNodeValue("Hi mom!");
More examples e.g. here.
As mentioned in two other answers here, as well as in your previous question: technically, XPath is not a way to "update" an XML document, but only to locate nodes within an XML document. But I presume the above is what you want.
EDIT: Responding to your comment... Are you asking how to write your DOM to an XML file after you've finished editing the DOM? If so, here are two examples of how to do it:
http://www.java2s.com/Code/Java/XML/WriteDOMout.htm
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPXSLT4.html
You can delete the file and create a new one.
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
new InputSource("data.xml"));
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("//employee/name[text()='old']", doc,
XPathConstants.NODESET);
for (int idx = 0; idx < nodes.getLength(); idx++) {
nodes.item(idx).setTextContent("new value");
}
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(doc), new StreamResult(new File("data_new.xml")));
XPath is used to select parts of an XML document. It has no provision for updating. But since it returns DOM objects (Elements, if memory serves, or maybe Nodes), you can then use DOM methods to alter the document.
XPath can be used to select nodes in a document, not to modify it.
You apply the XPath expression to your document and get an element (in your case). Once you have this Element, you can use the Element methods to change values (name and age in your case).
Starting from a NodeList it should work like that:
NodeList nodes = getNodeListFromXPathExpression(); // you know how
if (nodes.getLength() == 0)
    return; // empty NodeList, the XPath didn't select anything
Node first = nodes.item(0); // take the first node from the list, your element
// this is a shortcut for your example:
// first is the actual selected element (a node)
// .getFirstChild() returns the first child node, the text node (e.g. "Jasmin" or "28")
// .setNodeValue() replaces the value of that text node with a new string
first.getFirstChild().setNodeValue("New name or new age");
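Put together for the PersonList document from the question, the select-then-modify idea might look like the sketch below. The class and helper names (XPathUpdateSketch, parse, setText, text) are made up for illustration. It uses setTextContent rather than setNodeValue on the first child, so it also works when the selected element is empty.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are for illustration only.
class XPathUpdateSketch {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Sets the text of every node selected by the expression and returns
    // how many nodes were changed.
    static int setText(Document doc, String expression, String newValue) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                nodes.item(i).setTextContent(newValue);
            }
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Bad expression: " + expression, e);
        }
    }

    // Reads the string value of the first node selected by the expression.
    static String text(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Bad expression: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<PersonList>"
                + "<Person><Name>Sonu Kapoor</Name><Age>24</Age></Person>"
                + "<Person><Name>Jasmin</Name><Age>28</Age></Person>"
                + "</PersonList>");
        setText(doc, "//PersonList/Person[2]/Name", "Anonymous");
        setText(doc, "//PersonList/Person[2]/Age", "30");
        System.out.println(text(doc, "//PersonList/Person[2]/Name"));
        System.out.println(text(doc, "//PersonList/Person[2]/Age"));
    }
}
```

The changes live only in the in-memory DOM; writing them back to a file still needs a Transformer, as in the other answers.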
Consider using XQuery Update instead of XPath. This allows you to write
replace value of node //PersonList/Person[2]/Name with "Anonymous"
This is much easier than using the Java DOM API.
I've created a small project for using XPATH to create/update XML:
https://github.com/shenghai/xmodifier
The code to change your XML looks like this:
Document document = readDocument("personList.xml");
XModifier modifier = new XModifier(document);
modifier.addModify("//PersonList/Person[2]/Name", "newName");
modifier.modify();
This function lets you modify the value of any tag in an XML document using its XPath. You pass three arguments (xml, xpathExpression and newValue) and it returns the XML as a String with the modified value.
If you want to pass the XML as a file, you need to change the function accordingly, but the logic will be the same.
public String updateXML(String xml, String xpathExpression, String newValue)
{
try {
//Creating document builder
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new org.xml.sax.InputSource(new StringReader(xml)));
//Evaluating xpath expression using Element
XPath xpath = XPathFactory.newInstance().newXPath();
Element element = (Element)xpath.evaluate(xpathExpression, document, XPathConstants.NODE);
//Setting value in the text
element.setTextContent(newValue);
//Transformation of document to xml
StringWriter stringWriter = new StringWriter();
Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(document), new StreamResult(stringWriter));
xml = stringWriter.toString();
}
catch (Exception e)
{
e.printStackTrace();
}
return xml;
}
Here is the code to change the content with vtd-xml... vtd-xml is unique in that it is the only API that offers incremental update capability.
import com.ximpleware.*;
import java.io.*;
public class changeName {
public static void main(String s[]) throws VTDException,java.io.UnsupportedEncodingException,java.io.IOException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", false))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
XMLModifier xm = new XMLModifier(vn);
ap.selectXPath("//PersonList/Person[2]");
int i=0;
while((i=ap.evalXPath())!=-1){
if (vn.toElement(VTDNav.FIRST_CHILD,"Name")){
int k=vn.getText();
if (k!=-1) // getText() returns -1 when the element has no text
xm.updateToken(k, "Jonathan");
vn.toElement(VTDNav.PARENT);
}
if (vn.toElement(VTDNav.FIRST_CHILD,"Age")){
int k=vn.getText();
if (k!=-1) // getText() returns -1 when the element has no text
xm.updateToken(k, "42");
vn.toElement(VTDNav.PARENT);
}
}
xm.output("new.xml");
}
}