I use the worldweatheronline API. The service gives xml in the following form:
<hourly>
<tempC>-3</tempC>
<weatherDesc>rain</weatherDesc>
<precipMM>0.0</precipMM>
</hourly>
<hourly>
<tempC>5</tempC>
<weatherDesc>no</weatherDesc>
<precipMM>0.1</precipMM>
</hourly>
Can I somehow get all the nodes <hourly> in which <tempC>> 0 and <weatherDesc> = rain?
How to exclude from the response the nodes that are not interesting to me <hourly>?
This is quite feasible using XPath.
You can filter a document based on element values, attribute values and other criteria.
Here is a working example that gets the elements according to the first point in the question:
try (InputStream is = Files.newInputStream(Paths.get("C:/temp/test.xml"))) {
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document xmlDocument = builder.parse(is);
XPath xPath = XPathFactory.newInstance().newXPath();
// get hourly elements that have tempC child element with value > 0 and weatherDesc child element with value = "rain"
String expression = "//hourly[tempC>0 and weatherDesc=\"rain\"]";
NodeList hours = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < hours.getLength(); i++) {
System.out.println(hours.item(i) + " " + hours.item(i).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
I think you should create xsd from xml and generate JAXB classes.Using those JAXB class you can easily unmarshal the xml and process your logic.
Related
I have the following code
try {
String xml = "<ADDITIONALIDENT><FEATURE MID=\"TEST\"><NAME>ONE NAME</NAME><VALUE>ONE VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>TWO NAME</NAME><VALUE>TWO VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>THREE NAME</NAME><VALUE>THREE VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>FOUR NAME</NAME><VALUE>FOUR VALUE</VALUE></FEATURE><FEATURE MID=\"TEST\"><NAME>FIVE NAME</NAME><VALUE>FIVE VALUE</VALUE></FEATURE></ADDITIONALIDENT>";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document document = dBuilder.newDocument();
document = dBuilder.parse(new InputSource(new StringReader(xml)));
NodeList featureList = document.getElementsByTagName("FEATURE");
for (int i = 0; i < featureList.getLength(); i++) {
Element featureElement = (Element) featureList.item(i);
NodeList nameList = featureElement.getElementsByTagName("NAME");
NodeList valueList = featureElement.getElementsByTagName("VALUE");
System.out.println("THIS IS NAME: " + nameList.item(0).getTextContent());
System.out.println("THIS IS VALUE: " + valueList.item(0).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
It works fine and it finds the correct values, but I don't think I am doing it the right way. I feel like I shouldn't be using lists within the actual featureList Element.
Is there a way to get the values without making two lists?
<ADDITIONALIDENT>
<FEATURE MID="TEST">
<NAME>ONE NAME</NAME>
<VALUE>ONE VALUE</VALUE>
</FEATURE>
<FEATURE MID="TEST">
<NAME>TWO NAME</NAME>
<VALUE>TWO VALUE</VALUE>
</FEATURE>
<ADDITIONALIDENT>
try with following solution,
try {
String xml = "YOUR_XML_CONTEN";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document document = dBuilder.newDocument();
document = dBuilder.parse(new InputSource(new StringReader(xml)));
NodeList featureList = document.getElementsByTagName("FEATURE");
for (int i = 0; i < featureList.getLength(); i++) {
Element featureElement = (Element) featureList.item(i);
System.out.println("THIS IS NAME: " +
featureElement.getElementsByTagName("NAME").item(0).getTextContent());
System.out.println("THIS IS VALUE: " +
featureElement.getElementsByTagName("VALUE").item(0).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
output,
THIS IS NAME: ONE NAME
THIS IS VALUE: ONE VALUE
THIS IS NAME: TWO NAME
THIS IS VALUE: TWO VALUE
THIS IS NAME: THREE NAME
THIS IS VALUE: THREE VALUE
THIS IS NAME: FOUR NAME
THIS IS VALUE: FOUR VALUE
THIS IS NAME: FIVE NAME
THIS IS VALUE: FIVE VALUE
Good question. There are several ways to query an XML document using Java:
a) Parse the XML document into a Document and use getElementsByTagName to extract the nodes that you need. This is your current approach. It is OK for simple documents but it does not scale well because the Document class knows nothing about the structure of the document. The getElementsByTagName() method has to assume that any tag that it finds might occur more than once.
But you can fix that...
b) Generate Java classes for your specific document structure. This requires you to have an XML schema that describes the structure of your XML. You can then use JAXB to generate Java classes to process your specific XML format. In your example, the generated code would know (from the schema) that there is exactly one instance of NAME and VALUE within each FEATURE tag. The getter methods for NAME and VALUE would return a single Node, so your code would not need to use arrays for single-occurrence elements.
See https://docs.oracle.com/javase/tutorial/jaxb/intro/index.html for more details.
c) Use the XPath support that is built into Java to extract exactly the nodes that you need. XPath is designed for processing XML documents and is very powerful and flexible.
See How to read XML using XPath in Java for more details.
Option a) is hardly ever used for processing non-trivial XML documents. Both b) and c) are very common.
got a little problem. I have the following code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("result1.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//element");
String elements = (String) expr.evaluate(doc, XPathConstants.STRING);
What i get :
jcruz0#exblog.jp
Cheryl
Blake
195115
What i want:
<person>
<email>jcruz0#exblog.jp</email>
<firstname>Cheryl</firstname>
<lastname>Blake</lastname>
<number>195115</number>
</person>
So as you can see i want the full XML tree. Not just the NodeValue.
Maybe somebody knows the trick.
Thanks for any help.
You got the string value of the selected XML element because you specified XPathConstants.STRING to XPathExpression.evaluate().
Instead, specify a return type of XPathConstants.NODE if you know for sure that your XPath will select a single element,
String elements = (String) expr.evaluate(doc, XPathConstants.NODE);
or XPathConstants.NODESET for multiple elements, which you would then iterate over to process as necessary.
Something like this can be done.
XPathExpression expr = xpath.compile("/person");
NodeList elements = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < elements.getLength(); i++) {
// the person node
System.out.println(elements.item(i).getNodeName());
for (int x = 0; x < elements.item(i).getChildNodes().getLength(); x++) {
// the elements under person
if (elements.item(i).getChildNodes().item(x).getNodeType() == Node.ELEMENT_NODE) {
System.out.println("\t" + elements.item(i).getChildNodes().item(x).getNodeName() + " - " + elements.item(i).getChildNodes().item(x).getTextContent());
}
}
}
Output
person
email - jcruz0#exblog.jp
firstname - Cheryl
lastname - Blake
number - 195115
You can use the nodes to do what you want, or wrap them in < and > if you just want to print them.
I have a XML file that contains two elements: Project and Layer. I want to get attribute idLayer with the highest number using java. My code is not working properly:
public int GetMaxID() throws JAXBException {
try {
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
String expression = "//Project/Layer/#idLayer[not(. <=../preceding-sibling::Layer/#idLayer) and not(. <=../following-sibling::Layer/#idLayer)]";
XPathExpression xPathExpression = xPath.compile(expression);
InputSource doc = newInputSource(newInputStreamReader(newFileInputStream(newFile("Projects//asdad//ProjectDataBase.xml"))));
NodeList elem1List = (NodeList) xPathExpression.evaluate(doc, XPathConstants.NODESET);
int maxId = elem1List.getLength();//give me 0
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}
My XML code:
<tns:Project xmlns:tns="http://www.example.org/ProjectDataBase" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.example.org/ProjectDataBase ProjectDataBase.xsd ">
<tns:Layer idLayer="1">
<tns:LayerName>tns:LayerName1</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="2">
<tns:LayerName>tns:LayerName2</tns:LayerName>
</tns:Layer>
<tns:Layer idLayer="3">
<tns:LayerName>tns:LayerName3</tns:LayerName>
</tns:Layer>
</tns:Project>
Can you point me to the right direction?
Your problem is the tns namespace. You don't use it in your XPath expression, therefore it cannot select anything.
There are countless examples of how to register XML namespaces with JDOM, for example this one.
Also, your XPath is way too complicated.
Use //tns:Project/tns:Layer[not(#idLayer < ../tns:Layer/#idLayer)]/#idLayer.
Mind that this does not give the maximum node, but all the maximum nodes - there could be more than one.
I need some advice on how to parse XML with Java where there are multiple nodes that have the same tag. For example, if I have an XML file that looks like this:
<?xml version="1.0"?>
<TrackResponse>
<TrackInfo ID="EJ958083578US">
<TrackSummary>Your item was delivered at 8:10 am on June 1 in Wilmington DE 19801.</TrackSummary>
<TrackDetail>May 30 11:07 am NOTICE LEFT WILMINGTON DE 19801.</TrackDetail>
<TrackDetail>May 30 10:08 am ARRIVAL AT UNIT WILMINGTON DE 19850.</TrackDetail>
<TrackDetail>May 29 9:55 am ACCEPT OR PICKUP EDGEWATER NJ 07020.</TrackDetail>
</TrackInfo>
</TrackResponse>
I am able to get the "TrackSummary" but I do not know how to handle the "TrackDetail", since there is more than 1. There could be more than the 3 on that sample XML so I need a way to handle that.
So far I have this code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xmlResponse));
Document dom = builder.parse(is);
//Get the ROOT: "TrackResponse"
Element docEle = dom.getDocumentElement();
//Get the CHILD: "TrackInfo"
NodeList nl = docEle.getElementsByTagName("TrackInfo");
String summary = "";
//Make sure we found the child node okay
if (nl != null && nl.getLength() > 0)
{
//In the event that there is more then one node, loop
for (int i = 0 ; i < nl.getLength(); i++)
{
summary = getTextValue(docEle,"TrackSummary");
Log.d("SUMMARY", summary);
}
return summary;
}
How would I handle the whole 'multiple TrackDetail nodes' ordeal? I'm new to XML parsing so I am a bit unfamiliar on how to tackle things like this.
You can try like this :
public Map getValue(Element element, String str) {
NodeList n = element.getElementsByTagName(str);
for (int i = 0; i < n.getLength(); i++) {
System.out.println(getElementValue(n.item(i)));
}
return list/MapHere;
}
If you are free to change your implementation then i would suggest you to use implementation given here.
you can collect the trackdetail in string array and when you are in XmlPullParser.END_TAG check for trackinfo tag end and then stop
You can refer below code for that.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(f);
Element root = doc.getDocumentElement();
NodeList nodeList = doc.getElementsByTagName("TrackInfo");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i); // this is node under track info
// do your stuff
}
for more information you can go through below link.
How to parse same name tag in xml using dom parser java?
It may help.
I am having part of HTML page given below and want to extract the content of div tag its id is hiddenDivHL using DOM Parser:
Part Of a HTML Page:
<div id='hiddenDivHL' style='display:none'>http://74.127.61.106/udayavaniIpad/details.php?home=0&catid=882&newsid=123069[InnerSep]http://www.udayavani.com/udayavani_cms/gall_content/2012/1/2012_1$thumbimg117_Jan_2012_000221787.jpg[InnerSep]ಯುವಜನತೆಯಿಂದ ಭವ್ಯಭಾರತ[OuterSep]
So far I have used the below code but I am unable to use getElementById.How to do that?
DOM Parser:
try {
URL url = new URL("http://74.127.61.106/udayavaniIpad/details_android.php?home=1&catid=882&newsid=27593");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName("item");
/** Assign textview array lenght by arraylist size */
name = new TextView[nodeList.getLength()];
website = new TextView[nodeList.getLength()];
category = new TextView[nodeList.getLength()];
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
name[i] = new TextView(this);
Element fstElmnt = (Element) node;
NodeList nameList = fstElmnt.getElementsByTagName("hiddenDivHL");
Element nameElement = (Element) nameList.item(0);
nameList = nameElement.getChildNodes();
name[i].setText("Name = "
+ ((Node) nameList.item(0)).getNodeValue());
layout.addView(name[i]);
}
} catch (Exception e) {
System.out.println("XML Pasing Excpetion = " + e);
}
/** Set the layout view to display */
setContentView(layout);
}
XPath is IMHO the most common and easiest way to navigate the DOM in Java.
try{
URL url = new URL("http://74.127.61.106/udayavaniIpad/details_android.php?home=1& catid=882&newsid=27593");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/item/div[#id='hiddenDivHL']";
Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
} catch (Exception e) {
System.out.println("XML Pasing Excpetion = " + e);
}
I'm not sure if the XPath expression is right, but the link is here: http://developer.android.com/reference/javax/xml/xpath/package-summary.html
There are 2 differences between getElementById and getElementsByName:
getElementById requires a single unique id in your document, whereas getElementsByName can fetch several occurances of the same name.
getElementById is a method (or function) of the document object. You can only access it by using document.getElementById(..).
Your code seems to violate both these requirements, you seem to go through a loop of nodes and expect a hiddenDivHL id in each node list. So the id is not unique. Second your root point is not the document but the root point of each node in that list.
If you know you have a single instance with that id try document.getElementById.
I didn't really get the question.
a) Do you mean getting more elements by document.getElementById('hiddenDivHL')?
so my answer would be that, in a HTML-Document, the id has to be reserved for one element only.
b) If you just want to catch that element?
what exactly does not work? what are you trying to achieve? I fear I don't really get the point.
You have to call fstElmnt.getElementsByTagName("div"); to get all div's elements and them check if their attribute id is equal hiddenDivHL.
The easiest way i can think of is to use jSoup library, what it does is parse the DOM for you and lets you select elements using a css style (or jquery style) selector.
in this example you would do something like this
Document doc = Jsoup.connect("http://74.127.61.106/udayavaniIpad/details_android.php?home=1&catid=882&newsid=27593").get();
String divContents = doc.select("#hiddenDivHL").first().text();
Why are you unable to use getElementById()? It is in JavaSE 7 and JavaSE6/5/1.4.2, since 'DOM Level 2'.
To get the contents of an element in JavaScript:
var el = document.getElementById('hiddenDivHL');
var contents = el.innerHTML;
alert("Found " + contents.length + " characters of content.");
See your example on jsfiddle.
I think the confusion is due to the fact that your question is tagged JavaScript, but the code you posted is Java. They are different languages, and JavaScript people will only be confused by that parser. I haven't used Java in years so I can't really help you there.