Need to extract an element from XML in Java

Below is the XML:
<?xml version="1.0" encoding="utf-8"?>
<Stock>
<Identification>
<AccountID></AccountID>
<CustomerId></CustomerId>
</Identification>
<Product>
<ArticleName>Monitors</ArticleName>
<BaseUnit></BaseUnit>
<Notes></Notes>
<ID>11f13e2e-ae97-45b5-a9a9-23fa7f6bb767</ID>
<ID>b22834c0-a570-4e6b-97c3-5067a14d118d</ID>
<ID>ed458593-5e1a-4dc1-94f0-a66eeef2dd79</ID>
<ID>d25584a9-1db2-48cf-9a70-9b81e5a7e7f2</ID>
<LogisticalInfo>
<BaseUnit></BaseUnit>
<Compoundheight>4.78</Compoundheight>
<Compoundwidth>5.67</Compoundwidth>
<Compounddepth></Compounddepth>
<Compoundweight></Compoundweight>
<CompoundweightUnit>g</CompoundweightUnit>
<TotalHeight>30.5</TotalHeight>
<Totalwidth>542.7</Totalwidth>
<Totaldepth>37.5</Totaldepth>
<TotalWeight>2840</TotalWeight>
<height>mm</height>
<Weight>g</Weight>
<Depth>mm</Depth>
</LogisticalInfo>
</Product>
I would like to extract Compoundheight and Compounddepth. Below is the part of the code that extracts them, but it throws java.lang.StringIndexOutOfBoundsException:
stringXmlDocument = productHeader + toStringXml(node, true) + productTrailer;
int CompoundheightCodeStart = stringXmlDocument.indexOf("<Compoundheight>");
int CompoundheightCodeEnd = stringXmlDocument.indexOf("</Compoundheight>");
height_packed = Double.parseDouble(stringXmlDocument.substring(CompoundheightCodeStart + 13, CompoundheightCodeEnd));
int CompounddepthCodeStart = stringXmlDocument.indexOf("<Compounddepth>");
int CompounddepthCodeEnd = stringXmlDocument.indexOf("</Compounddepth>");
depth_packed = Double.parseDouble(stringXmlDocument.substring(CompounddepthCodeStart + 12, CompounddepthCodeEnd));

A better approach than indexOf would be to use the Java XPath API (javax.xml.xpath) to navigate the XML.
See the Java XPath implementation and the XPath specification.
An example:
// parse the xml into a Document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream inputStream = getClass().getResourceAsStream("test.xml");
Document document = builder.parse(inputStream);
// Obtain a specific element from within the Document
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
String articleName = xPath.evaluate("/Stock/Product/ArticleName", document);
System.out.println("ArticleName is: " + articleName);
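Applied to the elements from the question, the same approach might look like the sketch below. The inline XML is a trimmed copy of the question's document, and the class name CompoundDimensions and the helper extractDouble are hypothetical names chosen for illustration. An empty element such as Compounddepth evaluates to an empty string, so the sketch returns null in that case instead of passing the empty string to Double.parseDouble. As a side note, indexOf returns -1 when a tag is not found, and substring then throws StringIndexOutOfBoundsException, which may be what happened in the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CompoundDimensions {

    // Hypothetical helper: evaluates the XPath expression against the XML text
    // and returns the value as a Double, or null if the element is empty or absent.
    static Double extractDouble(String xml, String expression) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String text = XPathFactory.newInstance().newXPath()
                .evaluate(expression, document)
                .trim();
        return text.isEmpty() ? null : Double.valueOf(text);
    }

    public static void main(String[] args) throws Exception {
        // Trimmed copy of the XML from the question (assumed to be well-formed).
        String xml = "<Stock><Product><LogisticalInfo>"
                + "<Compoundheight>4.78</Compoundheight>"
                + "<Compounddepth></Compounddepth>"
                + "</LogisticalInfo></Product></Stock>";
        String base = "/Stock/Product/LogisticalInfo/";

        Double heightPacked = extractDouble(xml, base + "Compoundheight");
        Double depthPacked = extractDouble(xml, base + "Compounddepth");

        System.out.println("Compoundheight = " + heightPacked);
        System.out.println("Compounddepth = " + depthPacked);
    }
}
```

Returning null for an empty element is only one possible design choice; a default value or an Optional would work equally well, depending on how the caller treats missing dimensions.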

Related

How to read child XML using XPath in Java

The XML file is as follows:
Xml File
Code I have written:
List queryXmlUsingXpathAndReturnList(String xml, String xpathExpression) {
    // Parse the XML text into a DOM document
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance()
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder()
    Document doc = dBuilder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
    doc.getDocumentElement().normalize()
    // Evaluate the XPath expression and collect the text of each matching node
    XPath xPath = XPathFactory.newInstance().newXPath()
    NodeList nodeList = (NodeList) xPath.compile(xpathExpression).evaluate(doc, XPathConstants.NODESET)
    List returnElements = new ArrayList<>()
    nodeList.each { n ->
        returnElements.add(n.getTextContent())
    }
    return returnElements
}
When I pass the XPath as:
/Envelope/Body/CommandResponseData/OperationResult/Operation/ParameterList/ListParameter/StringElement
it returns all the values.
But I want to return only the values of the ListParameter whose name="PackageTypeList".
For that I am using the XPath:
/Envelope/Body/CommandResponseData/OperationResult/Operation/ParameterList/ListParameter[@name='PackageTypeList']/StringElement
But it returns the list as null.

I guess you are missing "CommandResult" between "CommandResponseData" and "OperationResult" in your XPath expression.
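To illustrate how a skipped step interacts with an attribute predicate, here is a minimal sketch. The XML in it is a hypothetical, namespace-free stand-in, since the real file was not shown in the question; the class name AttributePredicateDemo and the values A, B and C are invented for the example. With absolute paths, every intermediate element has to appear in the expression, otherwise the result is an empty node set.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AttributePredicateDemo {

    // Hypothetical stand-in for the response; the structure is an assumption.
    static final String XML =
            "<Envelope><Body><CommandResponseData><CommandResult><OperationResult><Operation>"
            + "<ParameterList>"
            + "<ListParameter name=\"PackageTypeList\">"
            + "<StringElement>A</StringElement><StringElement>B</StringElement>"
            + "</ListParameter>"
            + "<ListParameter name=\"Other\"><StringElement>C</StringElement></ListParameter>"
            + "</ParameterList>"
            + "</Operation></OperationResult></CommandResult></CommandResponseData></Body></Envelope>";

    // Returns the text content of every node selected by the expression.
    static List<String> query(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String tail = "/OperationResult/Operation/ParameterList"
                + "/ListParameter[@name='PackageTypeList']/StringElement";
        // Skipping the CommandResult step matches nothing in this document.
        System.out.println(query(XML, "/Envelope/Body/CommandResponseData" + tail));
        // Including every step selects only the PackageTypeList values.
        System.out.println(query(XML, "/Envelope/Body/CommandResponseData/CommandResult" + tail));
    }
}
```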

How do I return a subsection of an XML request based on an XPath expression

I have written code that enables me to extract a subsection of an XML request based on a given XPath. However, only the values between the tags are returned, not the tags themselves.
I want both the values and the elements to be returned for a given XPath.
For example, in this xml:
<?xml version="1.0"?>
<company>
<staff1>
<name>john</name>
<phone>465456433</phone>
<email>gmail1</email>
<area>area1</area>
<city>city1</city>
</staff1>
<staff2>
<name>mary</name>
<phone>4655556433</phone>
<email>gmail2</email>
<area>area2</area>
<city>city2</city>
</staff2>
<staff3>
<name>furvi</name>
<phone>4655433</phone>
<email>gmail3</email>
<area>area3</area>
<city>city3</city>
</staff3>
</company>
my XPath would only return the values of the first staff element, i.e.
john
465456433
gmail1
area1
city1
It does not return the tags associated with it, i.e. it should return the following:
<staff1>
<name>john</name>
<phone>465456433</phone>
<email>gmail1</email>
<area>area1</area>
<city>city1</city>
</staff1>
Here is my code:
InputSource inputSource = new InputSource(new StringReader(xmlString));
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
String RecordCategory;
Document doc = documentBuilderFactory.newDocumentBuilder().parse(inputSource);
// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();
// Create XPath object
XPath xpath = xpathFactory.newXPath();
System.out.println("TESTING XPATH");
xpath.setNamespaceContext(new NamespaceContext() {
@Override
public String getNamespaceURI(String prefix) {
...
});
XPathExpression expr = xpath.compile("//staff1[1]");
String staff1 = (String) expr.evaluate(doc, XPathConstants.STRING);
System.out.println("staff1: " + staff1);
Does anyone have any idea what I could do to resolve this issue?

Your Java call to XPathExpression.evaluate() is returning the string value of the node selected by your XPath expression. If you instead want to return the node selected by your XPath expression, change
String staff1 = (String) expr.evaluate(doc, XPathConstants.STRING);
to
Node node = (Node) expr.evaluate(doc, XPathConstants.NODE);
See this answer for how to pretty-print the node.
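Since the linked answer is not reproduced here, the following is a minimal sketch of one common way to turn the selected Node back into markup, using an identity Transformer from javax.xml.transform. The class name NodeToString and the methods serialize and selectAsXml are hypothetical, and the XML is a shortened version of the question's example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeToString {

    // Serializes the node, tags included, with an identity transformation.
    static String serialize(Node node) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Leave out the <?xml ...?> declaration because this is only a fragment.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(writer));
        return writer.toString();
    }

    // Selects a single node with XPath and returns its markup, or null if nothing matched.
    static String selectAsXml(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODE);
        return node == null ? null : serialize(node);
    }

    public static void main(String[] args) throws Exception {
        // Shortened version of the XML from the question.
        String xml = "<company>"
                + "<staff1><name>john</name><phone>465456433</phone></staff1>"
                + "<staff2><name>mary</name><phone>4655556433</phone></staff2>"
                + "</company>";
        System.out.println(selectAsXml(xml, "//staff1[1]"));
    }
}
```

Setting OutputKeys.INDENT to "yes" on the transformer is one way to request indented output, although the exact formatting depends on the transformer implementation.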

How to retrieve XML element attribute having namespaces using java?

I have the XML below and I am trying to retrieve the value of the id attribute of the BoostBuryDimensionValue tag using Java code, but it returns nothing.
Can someone help me with this? Thanks in advance.
Input XML:
<?xml version="1.0" encoding="UTF-8"?>
<ContentItem type="OrganicZoneContent" xmlns="http://endeca.com/schema/content/2008" >
<TemplateId>OrganicResults</TemplateId>
<Name>OrganicResults</Name>
<Property name="navigation_records">
<BoostBury rollupKey="grp_id" recspecField="grp_id" xmlns="http://endeca.com/schema/content/xtags/2010">
<BoostBuryRecords>
<BoostBuryRecord recordType="CRITERIA" boostBuryType="BOOST">
<BoostBurySearch terms="null" key="null"/>
<BoostBuryDimensionValues>
<BoostBuryDimensionValue id="4294965238" name="career" dimensionName="Occasion"/>
</BoostBuryDimensionValues>
</BoostBuryRecord>
</BoostBuryRecords>
</BoostBury>
</Property>
</ContentItem>
And the Java code I am using is:
public static void main(String[] args) throws Exception {
InputStream xml = new FileInputStream("tempinput.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(xml);
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("ContentItem/Property[@name='navigation_records']/BoostBury/BoostBuryRecords/BoostBuryRecord/BoostBuryDimensionValues/BoostBuryDimensionValue/@id");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println("BoostBuryDimensionValue id = " + result);
}
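The input XML declares default namespaces, and in XPath 1.0 an unprefixed element name only matches elements in no namespace, so namespaced documents usually need either a NamespaceContext with prefixes or a name test that ignores the namespace. The sketch below takes the second route with local-name(), under the assumption that matching on the local name alone is acceptable for this document. The class name NamespaceAttributeDemo and the method findId are hypothetical, and the XML is a trimmed copy of the question's input.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceAttributeDemo {

    // Returns the id attribute of the first BoostBuryDimensionValue element,
    // or an empty string if there is none.
    static String findId(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Parse with namespace awareness so each element carries a namespace URI and a local name.
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The id attribute itself is in no namespace, so @id needs no prefix.
        return xpath.evaluate("//*[local-name()='BoostBuryDimensionValue']/@id", doc);
    }

    public static void main(String[] args) throws Exception {
        // Trimmed copy of the XML from the question.
        String xml = "<ContentItem type=\"OrganicZoneContent\""
                + " xmlns=\"http://endeca.com/schema/content/2008\">"
                + "<Property name=\"navigation_records\">"
                + "<BoostBury xmlns=\"http://endeca.com/schema/content/xtags/2010\">"
                + "<BoostBuryDimensionValues>"
                + "<BoostBuryDimensionValue id=\"4294965238\" name=\"career\""
                + " dimensionName=\"Occasion\"/>"
                + "</BoostBuryDimensionValues>"
                + "</BoostBury></Property></ContentItem>";
        System.out.println("BoostBuryDimensionValue id = " + findId(xml));
    }
}
```

If elements with the same local name exist in several namespaces, registering prefixes through XPath.setNamespaceContext is the more precise option.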

Getting current Node Value with XPath in Java

I'm building an application in Java and I have some problems getting certain values.
The idea is that I have an XML document in the cloud (in this case from the Last.fm API), and I want to retrieve the value of a node which has an attribute. This value is a string, and I want to select it by the attribute.
An example of the Last.fm XML is the following:
<track>
<id>1019817</id>
<name>Believe</name>
<mbid/>
<url>http://www.last.fm/music/Cher/_/Believe</url>
<duration>240000</duration>
<streamable fulltrack="1">1</streamable>
<listeners>69572</listeners>
<playcount>281445</playcount>
<artist>
<name>Cher</name>
<mbid>bfcc6d75-a6a5-4bc6-8282-47aec8531818</mbid>
<url>http://www.last.fm/music/Cher</url>
</artist>
<album position="1">
<artist>Cher</artist>
<title>Believe</title>
<mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid>
<url>http://www.last.fm/music/Cher/Believe</url>
<image size="small">http://userserve-ak.last.fm/serve/34/8674593.jpg</image>
<image size="medium">http://userserve-ak.last.fm/serve/64/8674593.jpg</image>
<image size="large">http://userserve-ak.last.fm/serve/126/8674593.jpg</image>
</album>
<toptags>
<tag>
<name>pop</name>
<url>http://www.last.fm/tag/pop</url>
</tag>
...
</toptags>
<wiki>
<published>Sun, 27 Jul 2008 15:44:58 +0000</published>
<summary>...</summary>
<content>...</content>
</wiki>
</track>
So my idea is to get, for example, the value of the image element whose size attribute is "medium".
I've written the following code using XPath:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document documento = builder.parse("http://ws.audioscrobbler.com/2.0/?method=track.getInfo&api_key=" + apikey + "&artist=cher&track=believe");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//lfm/track/album/image[@size='medium']");
NodeList nlLastFm = (NodeList) expr.evaluate(documento, XPathConstants.NODESET);
Element eLastFm = (Element) nlLastFm.item(0);
Log.i(TAG, "eLastFm: " + nlLastFm.item(0));
coverUrl = parser.getValue(eLastFm, "image");
But the problem is that it doesn't return the value correctly. I searched a lot of related posts, but they didn't solve my problem.
Could anybody help me?
Thanks for your help!

Try:
String value = nlLastFm.item(0).getTextContent();
A little test code (with a smaller piece of your XML and the XPath edited accordingly):
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException, XPathExpressionException {
String xml = "<album position=\"1\"><artist>Cher</artist><title>Believe</title><mbid>61bf0388-b8a9-48f4-81d1-7eb02706dfb0</mbid><url>http://www.last.fm/music/Cher/Believe</url><image size=\"small\">http://userserve-ak.last.fm/serve/34/8674593.jpg</image><image size=\"medium\">http://userserve-ak.last.fm/serve/64/8674593.jpg</image><image size=\"large\">http://userserve-ak.last.fm/serve/126/8674593.jpg</image></album>";
InputStream stream = new ByteArrayInputStream(xml.getBytes("UTF-8"));
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document documento = builder.parse(stream);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//album/image[@size='medium']");
NodeList nlLastFm = (NodeList) expr.evaluate(documento,
XPathConstants.NODESET);
String coverUrl = nlLastFm.item(0).getTextContent();
System.out.println(coverUrl);
}
Outputs http://userserve-ak.last.fm/serve/64/8674593.jpg

Java XML Parse/Query

I have the XML structure below. When I use NodeList nList = doc.getElementsByTagName("stock"); it returns 3 stock elements: the 2 top-level stock tags and one nested under substocks. I want to get only the two top-level stock elements and ignore any that are under substocks tags.
Is it possible in Java to write something like a LINQ query in C#, for example to return only the elements whose name equals "Sony"?
Thanks!
<city>
<stock>
<name>Sony</name>
</stock>
<stock>
<name>Panasonic</name>
<substocks>
<stock>
<name>Panasonic Shop 2</name>
</stock>
</substocks>
</stock>
</city>

I recommend using XPath with the javax.xml.xpath package:
final InputStream is = new FileInputStream("your.xml");
final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
final DocumentBuilder builder = factory.newDocumentBuilder();
final Document doc = builder.parse(is);
final XPathFactory xPathfactory = XPathFactory.newInstance();
final XPath xpath = xPathfactory.newXPath();
final XPathExpression expr = xpath.compile("/city/stock/name[text()='Sony']");
and then:
final NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
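A self-contained sketch of the same idea follows, using the XML from the question as an inline string. The class name TopLevelStocks and the helper names are hypothetical. The path /city/stock only selects stock elements that are direct children of city, whereas getElementsByTagName matches descendants at any depth, which is why the nested substocks entry drops out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TopLevelStocks {

    // XML from the question, written on one line.
    static final String XML = "<city>"
            + "<stock><name>Sony</name></stock>"
            + "<stock><name>Panasonic</name>"
            + "<substocks><stock><name>Panasonic Shop 2</name></stock></substocks>"
            + "</stock>"
            + "</city>";

    // Returns the text content of every node selected by the expression.
    static List<String> names(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Names of the top-level stocks only; the nested shop is not a child of city.
        System.out.println(names(XML, "/city/stock/name"));
        // Filter by value, similar in spirit to a LINQ where clause.
        System.out.println(names(XML, "/city/stock[name='Sony']/name"));
    }
}
```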

Take a look at XPath and its Java implementation JXPath. Another possible approach is to parse the XML using JAXB and operate on the resulting list of objects using LambdaJ.

There is also the dom4j library, which has powerful navigation with XPath:
import org.dom4j.Document;
import org.dom4j.io.SAXReader;
SAXReader reader = new SAXReader();
Document document = reader.read("test.xml");
List list = document.selectNodes("/city/stock/name[text()='Sony']");
for (Iterator iter = list.iterator(); iter.hasNext(); ) {
// TODO: place your logic here
}
More examples are here

Try jcabi-xml (see this blog post) with a one-liner:
Collection<XML> found = new XMLDocument("your document here").nodes(
"/city/stock/name[text()='Sony']"
);
