Java XML Parse/Query

I have the XML structure below. When I call NodeList nList = doc.getElementsByTagName("stock"); it returns 3 stock elements: the 2 top-level stock tags and the one nested under substocks. I want to get only the two top-level stock elements and ignore any that are under substocks tags.
Is it possible in Java to do something like a LINQ query in C#, for example, to return only the elements whose name equals "Sony"?
Thanks!
<city>
<stock>
<name>Sony</name>
</stock>
<stock>
<name>Panasonic</name>
<substocks>
<stock>
<name>Panasonic Shop 2</name>
</stock>
</substocks>
</stock>
</city>

I recommend using XPath via the javax.xml.xpath package:
final InputStream is = new FileInputStream("your.xml");
final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
final DocumentBuilder builder = factory.newDocumentBuilder();
final Document doc = builder.parse(is);
final XPathFactory xPathfactory = XPathFactory.newInstance();
final XPath xpath = xPathfactory.newXPath();
final XPathExpression expr = xpath.compile("/city/stock/name[text()='Sony']");
and then:
final NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
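To get the top-level stock elements themselves rather than their name children, the predicate can be moved onto stock. The following is only a minimal sketch, assuming the XML from the question is available as a string; the class name TopLevelStocks and the helper selectStockNames are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TopLevelStocks {

    // The XML from the question, inlined so the sketch is self-contained.
    static final String XML =
        "<city>"
        + "<stock><name>Sony</name></stock>"
        + "<stock><name>Panasonic</name>"
        + "<substocks><stock><name>Panasonic Shop 2</name></stock></substocks>"
        + "</stock>"
        + "</city>";

    // Hypothetical helper: evaluates an XPath that selects <stock> elements
    // and returns the text of each selected element's direct <name> child.
    static List<String> selectStockNames(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList stocks = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < stocks.getLength(); i++) {
            Element stock = (Element) stocks.item(i);
            // A relative path is evaluated against the selected <stock> element.
            names.add(xpath.evaluate("name", stock));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Absolute path: only <stock> elements that are direct children of <city>.
        System.out.println(selectStockNames(XML, "/city/stock"));
        // Predicate on the <stock> element: keep only those whose <name> is "Sony".
        System.out.println(selectStockNames(XML, "/city/stock[name='Sony']"));
    }
}
```

Because /city/stock is an absolute path with two steps, the stock nested under substocks is never visited, which is the difference from getElementsByTagName("stock").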

Take a look at XPath and at JXPath, a Java implementation of it. Another possible approach is to parse the XML with JAXB and query the resulting list of objects with LambdaJ.

There is also the dom4j library, which offers powerful navigation with XPath:
import org.dom4j.Document;
import org.dom4j.io.SAXReader;
SAXReader reader = new SAXReader();
Document document = reader.read("test.xml");
List list = document.selectNodes("/city/stock/name[text()='Sony']");
for (Iterator iter = list.iterator(); iter.hasNext(); ) {
// TODO: place your logic here
}

Try jcabi-xml (see this blog post) with a one-liner:
Collection<XML> found = new XMLDocument("your document here").nodes(
"/city/stock/name[text()='Sony']"
);

Related

How to read child XML using XPath in Java

The XML file is as follows:
Xml File
Code I have written:
List queryXmlUsingXpathAndReturnList(String xml, String xpathExpression) {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance()
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder()
Document doc = dBuilder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
doc.getDocumentElement().normalize()
XPath xPath = XPathFactory.newInstance().newXPath()
NodeList nodeList = (NodeList) xPath.compile(xpathExpression).evaluate(doc, XPathConstants.NODESET)
List returnElements = new ArrayList<>()
nodeList.each { n ->
returnElements.add(n.getTextContent())
}
return returnElements
}
When I am passing the xpath as:
/Envelope/Body/CommandResponseData/OperationResult/Operation/ParameterList/ListParameter/StringElement
It returns all the values.
But I want to return only the values of the ListParameter whose name="PackageTypeList".
For that I am using this XPath:
/Envelope/Body/CommandResponseData/OperationResult/Operation/ParameterList/ListParameter[@name='PackageTypeList']/StringElement
But then the list comes back as null.

I guess you are missing "CommandResult" between "CommandResponseData" and "OperationResult" in your XPath expression.
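An attribute predicate in XPath is written with @, and a path with a missing or misspelled step simply matches nothing. The sketch below illustrates both points. Since the real response is not reproduced in the question, it uses a made-up fragment; the element values and the class name AttributePredicateDemo are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributePredicateDemo {

    // Made-up fragment; the real document from the question is not shown.
    static final String XML =
        "<ParameterList>"
        + "<ListParameter name=\"PackageTypeList\">"
        + "<StringElement>A</StringElement><StringElement>B</StringElement>"
        + "</ListParameter>"
        + "<ListParameter name=\"Other\">"
        + "<StringElement>C</StringElement>"
        + "</ListParameter>"
        + "</ParameterList>";

    // Returns the text content of every node selected by the expression.
    static List<String> query(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Full path with an attribute predicate on ListParameter.
        System.out.println(query(XML,
            "/ParameterList/ListParameter[@name='PackageTypeList']/StringElement"));
        // Same query with the root step left out: nothing matches, so the list is empty.
        System.out.println(query(XML,
            "/ListParameter[@name='PackageTypeList']/StringElement"));
    }
}
```

An absolute path has to name every element between the root and the target, so one skipped level is enough to produce an empty result without any error.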

Need to extract element from XML

Below is the XML:
<?xml version="1.0" encoding="utf-8"?>
<Stock>
<Identification>
<AccountID></AccountID>
<CustomerId></CustomerId>
</Identification>
<Product>
<ArticleName>Monitors</ArticleName>
<BaseUnit></BaseUnit>
<Notes></Notes>
<ID>11f13e2e-ae97-45b5-a9a9-23fa7f6bb767</ID>
<ID>b22834c0-a570-4e6b-97c3-5067a14d118d</ID>
<ID>ed458593-5e1a-4dc1-94f0-a66eeef2dd79</ID>
<ID>d25584a9-1db2-48cf-9a70-9b81e5a7e7f2</ID>
<LogisticalInfo>
<BaseUnit></BaseUnit>
<Compoundheight>4.78</Compoundheight>
<Compoundwidth>5.67</Compoundwidth>
<Compounddepth></Compounddepth>
<Compoundweight></Compoundweight>
<CompoundweightUnit>g</CompoundweightUnit>
<TotalHeight>30.5</TotalHeight>
<Totalwidth>542.7</Totalwidth>
<Totaldepth>37.5</Totaldepth>
<TotalWeight>2840</TotalWeight>
<height>mm</height>
<Weight>g</Weight>
<Depth>mm</Depth>
</LogisticalInfo>
</Product>
I would like to extract Compoundheight and Compounddepth. Below is the part of the code that extracts them, but it throws java.lang.StringIndexOutOfBoundsException.
stringXmlDocument = productHeader + toStringXml(node, true) + productTrailer;
int CompoundheightCodeStart = stringXmlDocument.indexOf("<Compoundheight>");
int CompoundheightCodeEnd = stringXmlDocument.indexOf("</Compoundheight>");
height_packed = Double.parseDouble(stringXmlDocument.substring(CompoundheightCodeStart + 13, CompoundheightCodeEnd));
int CompounddepthCodeStart = stringXmlDocument.indexOf("<Compounddepth>");
int CompounddepthCodeEnd = stringXmlDocument.indexOf("</Compounddepth>");
depth_packed = Double.parseDouble(stringXmlDocument.substring(CompounddepthCodeStart + 12, CompounddepthCodeEnd));

A better approach than indexOf would be to use the Java XPath implementation to navigate the XML.
See the Java XPath implementation and the XPath specification.
An example:
// parse the xml into a Document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream inputStream = getClass().getResourceAsStream("test.xml");
Document document = builder.parse(inputStream);
// Obtain a specific element from within the Document
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
String articleName = xPath.evaluate("/Stock/Product/ArticleName", document);
System.out.println("ArticleName is: " + articleName);
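Applied to the two fields from the question, the same idea could look like the sketch below. It assumes a trimmed-down copy of the XML as a string, and it treats an empty element such as Compounddepth as "no value" instead of passing an empty string to Double.parseDouble. The class name LogisticalInfoReader and the helper readDouble are hypothetical.

```java
import java.io.StringReader;
import java.util.OptionalDouble;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class LogisticalInfoReader {

    // Shortened version of the XML from the question.
    static final String XML =
        "<Stock><Product><LogisticalInfo>"
        + "<Compoundheight>4.78</Compoundheight>"
        + "<Compoundwidth>5.67</Compoundwidth>"
        + "<Compounddepth></Compounddepth>"
        + "</LogisticalInfo></Product></Stock>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: returns the numeric value at the given path,
    // or an empty OptionalDouble when the element is missing or has no text.
    static OptionalDouble readDouble(Document doc, String expression)
            throws XPathExpressionException {
        String text = XPathFactory.newInstance().newXPath().evaluate(expression, doc).trim();
        return text.isEmpty()
            ? OptionalDouble.empty()
            : OptionalDouble.of(Double.parseDouble(text));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        OptionalDouble height = readDouble(doc, "/Stock/Product/LogisticalInfo/Compoundheight");
        OptionalDouble depth = readDouble(doc, "/Stock/Product/LogisticalInfo/Compounddepth");
        System.out.println("height present: " + height.isPresent());
        System.out.println("depth present: " + depth.isPresent());
    }
}
```

With XPath there are no hand-counted offsets, and a missing element yields an empty string rather than a negative index.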

How do I return a subsection of an XML request based on an XPath expression

I have written code that lets me extract a subsection of an XML request based on a given XPath. However, only the values between the tags are returned, not the tags themselves.
I want both the values and the elements to be returned for a given XPath.
For example, in this xml:
<?xml version="1.0"?>
<company>
<staff1>
<name>john</name>
<phone>465456433</phone>
<email>gmail1</email>
<area>area1</area>
<city>city1</city>
</staff1>
<staff2>
<name>mary</name>
<phone>4655556433</phone>
<email>gmail2</email>
<area>area2</area>
<city>city2</city>
</staff2>
<staff3>
<name>furvi</name>
<phone>4655433</phone>
<email>gmail3</email>
<area>area3</area>
<city>city3</city>
</staff3>
</company>
My XPath returns only the values of the first staff element, i.e.
john
465456433
gmail1
area1
city1
It does not return the tags associated with them, i.e. it should return the following:
<staff1>
<name>john</name>
<phone>465456433</phone>
<email>gmail1</email>
<area>area1</area>
<city>city1</city>
</staff1>
Here is my code:
InputSource inputSource = new InputSource(new StringReader(xmlString));
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
String RecordCategory;
Document doc = documentBuilderFactory.newDocumentBuilder().parse(inputSource);
// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();
// Create XPath object
XPath xpath = xpathFactory.newXPath();
System.out.println("TESTING XPATH");
xpath.setNamespaceContext(new NamespaceContext() {
@Override
public String getNamespaceURI(String prefix) {
...
});
XPathExpression expr = xpath.compile("//staff1[1]");
String staff1 = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println("staff1: " + staff1);
Does anyone have any idea what I could do to resolve this issue?

Your Java call to XPathExpression.evaluate() is returning the string value of the node selected by your XPath expression. If you instead want to return the node selected by your XPath expression, change
String staff1 = (String) expr.evaluate(doc, XPathConstants.STRING);
to
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
See this answer for how to pretty print the node.
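One common way to turn the selected node back into markup is an identity Transformer from javax.xml.transform. The sketch below assumes a shortened version of the company XML; the class name NodeSerializer and the helpers nodeToString and selectAsXml are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeSerializer {

    // Shortened version of the XML from the question.
    static final String XML =
        "<company>"
        + "<staff1><name>john</name><phone>465456433</phone></staff1>"
        + "<staff2><name>mary</name><phone>4655556433</phone></staff2>"
        + "</company>";

    // Serializes a DOM node, tags included, without the XML declaration.
    static String nodeToString(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(writer));
        return writer.toString();
    }

    // Selects a single node with XPath and returns it as markup.
    static String selectAsXml(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Node node = (Node) XPathFactory.newInstance().newXPath()
            .evaluate(expression, doc, XPathConstants.NODE);
        return nodeToString(node);
    }

    public static void main(String[] args) throws Exception {
        // Prints the staff1 element together with its child tags.
        System.out.println(selectAsXml(XML, "//staff1[1]"));
    }
}
```

Setting OutputKeys.INDENT to "yes" on the transformer is the usual way to ask for indented output, although the exact layout depends on the XSLT implementation in use.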

Getting current Node Value with XPath in Java

I'm building an application in Java and I'm having trouble getting some values.
The idea is that I have an XML document in the cloud (in this case from the Last.fm API), and I want to retrieve the value of a node that has an attribute. The value is a string, and I want to select it by the attribute.
An example of the Last.fm XML is the following:
<track>
<id>1019817</id>
<name>Believe</name>
<mbid/>
<url>http://www.last.fm/music/Cher/_/Believe</url>
<duration>240000</duration>
<streamable fulltrack="1">1</streamable>
<listeners>69572</listeners>
<playcount>281445</playcount>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
<album position="1">
<artist>Cher</artist>
<title>Believe</title>
<mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid>
<url>http://www.last.fm/music/Cher/Believe</url>
<image size="small">http://userserve-ak.last.fm/serve/34/8674593.jpg</image>
<image size="medium">http://userserve-ak.last.fm/serve/64/8674593.jpg</image>
<image size="large">http://userserve-ak.last.fm/serve/126/8674593.jpg</image>
</album>
<toptags>
<tag>
<name>pop</name>
<url>http://www.last.fm/tag/pop</url>
</tag>
...
</toptags>
<wiki>
<published>Sun, 27 Jul 2008 15:44:58 +0000</published>
<summary>...</summary>
<content>...</content>
</wiki>
</track>
So my idea is to get, for example, the value of the image element whose size attribute is "medium".
I've written the following code using XPath:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document documento = builder.parse("http://ws.audioscrobbler.com/2.0/?method=track.getInfo&api_key=" + apikey + "&artist=cher&track=believe");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//lfm/track/album/image[@size='medium']");
NodeList nlLastFm = (NodeList) expr.evaluate(documento, XPathConstants.NODESET);
Element eLastFm = (Element) nlLastFm.item(0);
Log.i(TAG, "eLastFm: " + nlLastFm.item(0));
coverUrl = parser.getValue(eLastFm, "image");
But the problem is that it doesn't return the value correctly. I searched through a lot of related posts, but they didn't solve my problem.
Could anybody help me?
Thanks for your help!

Try:
String value = nlLastFm.item(0).getTextContent();
A little test code (with a smaller piece of your XML and the XPath edited accordingly):
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException, XPathExpressionException {
String xml = "<album position=\"1\"><artist>Cher</artist><title>Believe</title><mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid><url>http://www.last.fm/music/Cher/Believe</url><image size=\"small\">http://userserve-ak.last.fm/serve/34/8674593.jpg</image><image size=\"medium\">http://userserve-ak.last.fm/serve/64/8674593.jpg</image><image size=\"large\">http://userserve-ak.last.fm/serve/126/8674593.jpg</image></album>";
InputStream stream = new ByteArrayInputStream(xml.getBytes("UTF-8"));
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document documento = builder.parse(stream);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//album/image[@size='medium']");
NodeList nlLastFm = (NodeList) expr.evaluate(documento,
XPathConstants.NODESET);
String coverUrl = nlLastFm.item(0).getTextContent();
System.out.println(coverUrl);
}
Outputs http://userserve-ak.last.fm/serve/64/8674593.jpg

Convert XML String to ArrayList

This seems like a basic question, but I can't find the answer anywhere. Basically, I've got a list of links inside XML like so (all in one string).
I already have the "string" variable, which contains all the XML; I just need to extract the URL strings.
<?xml version="1.0" encoding="UTF-8"?>
<fql_query_response xmlns="http://api.facebook.com/1.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" list="true">
<photo>
<src_small>http://photos-a.ak.fbcdn.net/hphotos-ak-ash4/486603_10151153207000351_1200565882_t.jpg</src_small>
</photo>
<photo>
<src_small>http://photos-c.ak.fbcdn.net/hphotos-ak-ash3/578919_10150988289678715_1110488833_t.jpg</src_small>
</photo>
I want to convert these into an ArrayList, so that something like URLArray[0] would be the first address as a string.
Can anyone tell me how to do this? Thanks.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource( new StringReader( xmlString) );
Document doc = builder.parse( is );
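XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
// PersonalNamespaceContext is a user-supplied NamespaceContext implementation (not shown here)
xpath.setNamespaceContext(new PersonalNamespaceContext());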
XPathExpression expr = xpath.compile("//src_small/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
List<String> urls = new ArrayList<String>();
for (int i = 0; i < nodes.getLength(); i++) {
urls.add (nodes.item(i).getNodeValue());
System.out.println(nodes.item(i).getNodeValue());
}
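The snippet above refers to a PersonalNamespaceContext class that is not shown. One possible way to fill that gap is sketched below, under these assumptions: the parser is made namespace-aware, the prefix fb is chosen arbitrarily and bound to the default namespace of the response, and the sample URLs are shortened placeholders. The class name PhotoUrls and the method extractUrls are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PhotoUrls {

    static final String FB_NS = "http://api.facebook.com/1.0/";

    // Shortened sample with placeholder URLs, same shape as the question's XML.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<fql_query_response xmlns=\"" + FB_NS + "\" list=\"true\">"
        + "<photo><src_small>http://example.com/a_t.jpg</src_small></photo>"
        + "<photo><src_small>http://example.com/b_t.jpg</src_small></photo>"
        + "</fql_query_response>";

    static List<String> extractUrls(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // With namespace awareness on, elements must be addressed by namespace in XPath.
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Minimal context: maps the arbitrary prefix "fb" to the response namespace.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "fb".equals(prefix) ? FB_NS : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return FB_NS.equals(namespaceURI) ? "fb" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });

        NodeList nodes = (NodeList) xpath.evaluate(
            "//fb:src_small/text()", doc, XPathConstants.NODESET);
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            urls.add(nodes.item(i).getNodeValue());
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = extractUrls(XML);
        // urls.get(0) plays the role of URLArray[0] from the question.
        System.out.println(urls.get(0));
    }
}
```

The prefix used in the XPath expression does not have to appear in the document; only the namespace URI has to match.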

You are right, there should be other resources out there that can help you. Maybe your searches just did not use the right keywords.
You basically have 2 choices:
Use an XML processing library. SAX, DOM, XPath, and XMLReader are some keywords you can use to find one.
Ignore the fact that your string is XML and perform normal string operations on it: splitting, iterating through it, regular expressions, etc.

Yes, for that you have to parse the XML
and then store the results in an ArrayList.
For example:
ArrayList<String> aList = new ArrayList<String>();
aList.add("your string");
