Parse xml without tagname

I have an XML file:
<Response>
<StatusCode>0</StatusCode>
<StatusDetail>OK</StatusDetail>
<AccountInfo>
<element1>value</element1>
<element2>value</element2>
<element3>value</element3>
<elementN>value</elementN>
</AccountInfo>
</Response>
I want to parse the elements inside AccountInfo, but I don't know their tag names.
For now I'm using the code below for tests, but in the future I will receive more elements in AccountInfo, and I don't know how many there will be or what their names are.
String name="";
String balance="";
Node accountInfo = document.getElementsByTagName("AccountInfo").item(0);
if (accountInfo.getNodeType() == Node.ELEMENT_NODE){
Element accountInfoElement = (Element) accountInfo;
name = accountInfoElement.getElementsByTagName("Name").item(0).getTextContent();
balance = accountInfoElement.getElementsByTagName("Balance").item(0).getTextContent();
}

Here are two ways you can do it:
Node accountInfo = document.getElementsByTagName("AccountInfo").item(0);
NodeList children = accountInfo.getChildNodes();
Or you can use XPath:
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList children = (NodeList) xPath.evaluate("//AccountInfo/*", document.getDocumentElement(), XPathConstants.NODESET);
Once you have the NodeList, you can loop through it:
for(int i=0;i<children.getLength();i++) {
if(children.item(i).getNodeType() == Node.ELEMENT_NODE) {
Element elem = (Element)children.item(i);
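// If the document was parsed namespace-aware, use the local name
// (getLocalName() returns null for elements parsed without namespace awareness)
String localName = elem.getLocalName();
// getTagName() returns the qualified name: the namespace prefix (if any) plus the local name
String tagName = elem.getTagName();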
// do stuff with the children
}
}
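Putting the first variant together, here is a minimal, self-contained sketch that collects every child element of AccountInfo into a map without knowing the tag names in advance. The class name AccountInfoReader, the method readAccountInfo, and the sample values in main are illustrative assumptions, not part of the original code.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class AccountInfoReader {

    // Collects every child element of the first <AccountInfo> as tagName -> text content.
    static Map<String, String> readAccountInfo(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, String> values = new LinkedHashMap<>();
        Node accountInfo = document.getElementsByTagName("AccountInfo").item(0);
        if (accountInfo == null) {
            return values; // no <AccountInfo> element in the document
        }
        NodeList children = accountInfo.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes, comments and other non-element nodes.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                Element elem = (Element) child;
                values.put(elem.getTagName(), elem.getTextContent());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Sample input with made-up values.
        String xml = "<Response><StatusCode>0</StatusCode><AccountInfo>"
                + "<Name>Alice</Name><Balance>12.50</Balance>"
                + "</AccountInfo></Response>";
        System.out.println(readAccountInfo(xml));
    }
}
```

A LinkedHashMap is used so the entries keep document order; if the same tag name can occur more than once inside AccountInfo, a list of name/value pairs would be a better fit than a map.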

Related

Java: split node with splitText() method

My aim is to take XML input, replace some text in a node with an XML DOM element, and produce XML output. My XML input and expected output can be found here, in this SO question.
Here is my java code:
private static void textTransformCitations(Document document)
{
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "/article/body/sec/p/text()";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++)
{
Node textNode = nodeList.item(i);
Matcher m = Pattern.compile("\\[(\\d+)\\]").matcher(textNode.getNodeValue());
while (m.find())
{
Text number = textNode.splitText(m.start(1));
textNode = number.splitText(m.group(1).length());
Element xref = document.createElement("xref");
xref.setAttribute("rid", "bib" + m.group(1));
xref.setAttribute("ref-type", "bibr");
number.getParentNode().replaceChild(number, xref);
xref.appendChild(number);
}
}
} // Added by edit!
Obviously the problem is that splitText() can only be called on the Text interface:
textNode.splitText
and the textNode variable is declared as a Node, not a Text. But I have explicitly asked XPath to retrieve text nodes.
What can I do to make this code work?
How can I use the splitText method in this case?

Change the statement Node textNode = nodeList.item(i); to Text textNode = (Text) nodeList.item(i);.
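For illustration, here is a minimal sketch of the loop with the Text cast applied. It also makes two further changes that go beyond the answer above and are my own assumptions about what the code is meant to do: replaceChild is called as (newChild, oldChild), which is the order the DOM API declares, and the pattern is rematched against the remaining tail after each split, because match offsets are relative to the text of the node that was matched. The class name CitationWrapper and the method wrapCitations are hypothetical.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class CitationWrapper {

    private static final Pattern CITATION = Pattern.compile("\\[(\\d+)\\]");

    // Wraps the digits of every "[n]" inside the given text node in an <xref> element.
    static void wrapCitations(Document document, Text textNode) {
        Text rest = textNode;
        Matcher m = CITATION.matcher(rest.getData());
        while (m.find()) {
            String digits = m.group(1);
            // After this split, 'rest' ends just before the digits and 'number' starts with them.
            Text number = rest.splitText(m.start(1));
            // After this split, 'number' holds only the digits; 'rest' is the tail starting at "]".
            rest = number.splitText(digits.length());

            Element xref = document.createElement("xref");
            xref.setAttribute("rid", "bib" + digits);
            xref.setAttribute("ref-type", "bibr");
            // replaceChild takes (newChild, oldChild).
            number.getParentNode().replaceChild(xref, number);
            xref.appendChild(number);

            // Offsets are relative to a node's own text, so rematch against the new tail.
            m = CITATION.matcher(rest.getData());
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<p>See [1] and [12].</p>")));
        Element p = doc.getDocumentElement();
        wrapCitations(doc, (Text) p.getFirstChild());
        System.out.println(p.getElementsByTagName("xref").getLength() + " xref elements created");
    }
}
```

In the original method, the same helper could be called once per node returned by the XPath query, as wrapCitations(document, (Text) nodeList.item(i)).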

Java, XPath Expression to read all node names, node values, and attributes

I need help writing an XPath expression to read all node names, node values, and attributes in an XML string. I made this:
private List<String> listOne = new ArrayList<String>();
private List<String> listTwo = new ArrayList<String>();
public void read(String xml) {
try {
// Turn String into a Document
Document document = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes()));
// Setup XPath to retrieve all tags and values
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(node.getNodeName());
listTwo.add(node.getNodeValue());
// Another list to hold attributes
}
} catch(Exception e) {
LogHandle.info(e.getMessage());
}
}
I found the expression //text()[normalize-space()=''] online; however, it doesn't work. When I try to get the node name from listOne, it is just #text. I tried //, but that doesn't work either. If I had this XML:
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
listOne[0] should hold Data, listOne[1] should hold Test, listTwo[1] should hold blah, etc... All the attributes will be saved in another parallel list.
What expression should xPath evaluate?
Note: The XML String can have different tags, so I can't hard code anything.
Update: Tried this loop:
NodeList nodeList = (NodeList) xPath.evaluate("//*", document, XPathConstants.NODESET);
// Iterate through nodes
for(int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
listOne.add(i, node.getNodeName());
// If null then must be text node
if(node.getChildNodes() == null)
listTwo.add(i, node.getTextContent());
}
However, this only gets the root element Data, then just stops.

//* will select all element nodes, //@* all attribute nodes. However, an element node does not have a meaningful node value in the DOM, so you would need to read out getTextContent() instead of getNodeValue().
As you seem to consider an element with child elements to have a "null" value, I think you need to check whether there are any child elements:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse("sampleInput1.xml");
XPathFactory fact = XPathFactory.newInstance();
XPath xpath = fact.newXPath();
NodeList allElements = (NodeList)xpath.evaluate("//*", doc, XPathConstants.NODESET);
ArrayList<String> elementNames = new ArrayList<>();
ArrayList<String> elementValues = new ArrayList<>();
for (int i = 0; i < allElements.getLength(); i++)
{
Node currentElement = allElements.item(i);
elementNames.add(i, currentElement.getLocalName());
elementValues.add(i, xpath.evaluate("*", currentElement, XPathConstants.NODE) != null ? null : currentElement.getTextContent());
}
for (int i = 0; i < elementNames.size(); i++)
{
System.out.println("Name: " + elementNames.get(i) + "; value: " + (elementValues.get(i)));
}
For the sample input
<Data xmlns="Somenamespace.nsc">
<Test>blah</Test>
<Foo>bar</Foo>
<Date id="2">12242016</Date>
<Phone>
<Home>5555555555</Home>
<Mobile>5555556789</Mobile>
</Phone>
</Data>
the output is
Name: Data; value: null
Name: Test; value: blah
Name: Foo; value: bar
Name: Date; value: 12242016
Name: Phone; value: null
Name: Home; value: 5555555555
Name: Mobile; value: 5555556789
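The question also asks for the attributes. As a rough sketch of the //@* expression mentioned above, the following collects every attribute together with the name of its owner element. The class name AttributeLister, the method listAttributes, and the "element/@name=value" output format are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class AttributeLister {

    // Returns one "element/@attribute=value" string per attribute node in the document.
    static List<String> listAttributes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//@*" selects attribute nodes, so each item can be cast to Attr.
        NodeList attrs = (NodeList) xpath.evaluate("//@*", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            result.add(attr.getOwnerElement().getNodeName()
                    + "/@" + attr.getName() + "=" + attr.getValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(listAttributes("<Data><Date id=\"2\">12242016</Date></Data>"));
    }
}
```

Unlike element nodes, attribute nodes do carry a node value, so getValue() (or getNodeValue()) works directly here.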

Couldn't able to read the attribute using DOM parser

I am having issues reading the attribute of a link.
This is the structure of my XML:
<entry>
<updated>
<title>
<link href="">
</entry>
I managed to read the date and title correctly, but reading the href attribute of the link is not working.
Here is my code:
NodeList nList = doc.getElementsByTagName("entry");
System.out.println("============================");
for (int temp = 0; temp < nList.getLength(); temp++)
{
Node node = nList.item(temp);
System.out.println(""); //Just a separator
if (node.getNodeType() == Node.ELEMENT_NODE)
{
Element eElement = (Element) node;
System.out.println("Date : " + eElement.getElementsByTagName("updated").item(0).getTextContent());
System.out.println("Title : " + eElement.getElementsByTagName("title").item(0).getTextContent());
// The below code is for reading href attribute of link,
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
System.out.println(eElement1.getAttribute("href"));
}
}
I am creating a new nodelist for the attributes of link but the code is not working.
error:
java.lang.ClassCastException: com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl cannot be cast to org.w3c.dom.Element
at Demo.main(Demo.java:45)

A NodeList is not an Element and cannot be cast to one (successfully), so this code isn't going to work:
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
A NodeList is, as the name suggests, a list of nodes (and in your case, the nodes will be Elements). So this code would work for the first link:
NodeList list = eElement.getElementsByTagName("link");
Element eElement1 = (Element) list.item(0);
...whereupon your getAttribute should work fine, as Element has getAttribute.
Side note: If your DOM library supports the newer query functions, you could also do this:
String href = ((Element)eElement.querySelector("link")).getAttribute("href");
...because querySelector returns just the first match (not a list), or null if there are no matches; if that's a possibility, add a guard to the above. Note that the standard org.w3c.dom.Element interface shipped with the JDK does not declare querySelector, so this only applies if your library adds it.

// The below code is for reading href attribute of link,
NodeList node1 = eElement.getElementsByTagName("link");
Element eElement1 = (Element) node1;
getElementsByTagName returns a NodeList, not an Element, so take the first item from the list. You can then get the href value as follows:
String hrefValue = node1.item(0)
.getAttributes().getNamedItem("href").getNodeValue();
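As a minimal sketch of the item(0) fix applied to the whole loop, the following reads the href of the first link in each entry and guards against entries that have no link. The class name LinkHrefReader, the method readHrefs, and the sample URL in main are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class LinkHrefReader {

    // Returns the href of the first <link> inside each <entry>; entries without a link are skipped.
    static List<String> readHrefs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        List<String> hrefs = new ArrayList<>();
        NodeList entries = doc.getElementsByTagName("entry");
        for (int i = 0; i < entries.getLength(); i++) {
            // getElementsByTagName only returns elements, so this cast is safe.
            Element entry = (Element) entries.item(i);
            NodeList links = entry.getElementsByTagName("link");
            if (links.getLength() > 0) {
                // Cast a single item of the list, never the list itself.
                Element link = (Element) links.item(0);
                hrefs.add(link.getAttribute("href"));
            }
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        // Sample feed with a made-up URL.
        String xml = "<feed><entry><updated>2016-01-01</updated><title>First</title>"
                + "<link href=\"http://example.com/1\"/></entry></feed>";
        System.out.println(readHrefs(xml));
    }
}
```

Element.getAttribute returns an empty string when the attribute is missing, so no null check is needed on the href itself.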

Building DOM document from xml string gives me a null document

I'm trying to use the DOM library to parse a string in xml format. For some reason my document contains nulls and I run into issues trying to parse it. The string variable 'response' is not null and I am able to see the string when in debug mode.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(response));
Document doc = builder.parse(is);
NodeList nodes = doc.getElementsByTagName("BatchFile");
for (int i = 0; i < nodes.getLength(); i++) {
Element element = (Element) nodes.item(i);
NodeList batchItem = element.getChildNodes();
String uri = batchItem.item(0).getNodeValue();
String id = batchItem.item(1).getNodeValue();
String fqName = batchItem.item(2).getNodeValue();
}
Highlighting over the line Document doc = builder.parse(is); after it has run shows the result of [#document: null].
Edit: I've managed to get a non-empty doc now, but the string values are still null (at the end of the code). How would I get the values from something like this?
<GetBatchFilesResult>
<BatchFile>
<Uri>uri</Uri>
<ID>id</ID>
<FQName>file.zip</FQName>
</BatchFile>
</GetBatchFilesResult>

Use getTextContent() instead: getNodeValue() returns null for element nodes. It is also better to look up the children with getElementsByTagName than by index, since whitespace between the tags is also returned as child (text) nodes.
Element element = (Element) nodes.item(i);
String uri = element.getElementsByTagName("Uri").item(0).getTextContent();
String id = element.getElementsByTagName("ID").item(0).getTextContent();
String fqName = element.getElementsByTagName("FQName").item(0).getTextContent();
Check the Node API documentation to see which node types return null from getNodeValue().
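Here is a minimal, self-contained sketch of that approach, starting from the response string. The class name BatchFileReader, the helper childText, and the choice to return each BatchFile as a String array of {Uri, ID, FQName} are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class BatchFileReader {

    // Returns the text of the first descendant element with the given tag name, or null if absent.
    static String childText(Element parent, String tagName) {
        NodeList matches = parent.getElementsByTagName(tagName);
        return matches.getLength() > 0 ? matches.item(0).getTextContent() : null;
    }

    // Returns one {Uri, ID, FQName} array per <BatchFile> element in the response.
    static List<String[]> readBatchFiles(String response) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(response)));

        List<String[]> files = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("BatchFile");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            files.add(new String[] {
                childText(element, "Uri"),
                childText(element, "ID"),
                childText(element, "FQName")
            });
        }
        return files;
    }

    public static void main(String[] args) throws Exception {
        String response = "<GetBatchFilesResult>\n<BatchFile>\n<Uri>uri</Uri>\n<ID>id</ID>\n"
                + "<FQName>file.zip</FQName>\n</BatchFile>\n</GetBatchFilesResult>";
        for (String[] file : readBatchFiles(response)) {
            System.out.println(String.join(", ", file));
        }
    }
}
```

Because the lookup is by tag name, the line breaks between the child elements do not affect the result, which is not the case when indexing into getChildNodes().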

I found the solution. Seems stupid that you have to do it this way to get a value from a node.
Element element = (Element) nodes.item(i);
NodeList batchItem = element.getChildNodes();
Element uri = (Element) batchItem.item(0);
Element id = (Element) batchItem.item(1);
Element fqName = (Element) batchItem.item(2);
NodeList test = uri.getChildNodes();
NodeList test1 = id.getChildNodes();
NodeList test2 = fqName.getChildNodes();
String strURI= test.item(0).getNodeValue();
String strID= test1.item(0).getNodeValue();
String strFQName= test2.item(0).getNodeValue();

Child elements of DOM

I have this XML file:
<scene>
<texture file="file1.dds"/>
<texture file="file2.dds"/>
...
<node name="cube">
<texture name="stone" unit="0" sampler="anisotropic"/>
</node>
</scene>
I need all child elements of 'scene' that are named "texture", but with this code:
Element rootNode = document.getDocumentElement();
NodeList childNodes = rootNode.getElementsByTagName("texture");
for (int nodeIx = 0; nodeIx < childNodes.getLength(); nodeIx++) {
Node node = childNodes.item(nodeIx);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// cool stuff here
}
}
I also get the 'texture' elements that are inside 'node'.
How can I filter these out? Or how can I get only the elements that are direct children of 'scene'?

You can do it using XPath. Consider the following example, taken from the JAXP Specification 1.4 (which I recommend you consult for this):
// parse the XML as a W3C Document
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document document = builder.parse(new File("/widgets.xml"));
// evaluate the XPath expression against the Document
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "/widgets/widget[@name='a']/@quantity";
Double quantity = (Double) xpath.evaluate(expression, document, XPathConstants.NUMBER);
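Applied to the scene document from the question, the same idea might look like the sketch below. The path /scene/texture uses the child axis at each step, so it only matches texture elements directly under scene and not the one nested in node. The class name DirectChildTextures and the method textureFiles are illustrative assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here for illustration only.
class DirectChildTextures {

    // Returns the "file" attribute of every <texture> that is a direct child of the root <scene>.
    static List<String> textureFiles(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Each "/" step selects children only, so nested textures are not matched.
        NodeList textures = (NodeList) xpath.evaluate(
                "/scene/texture", document, XPathConstants.NODESET);

        List<String> files = new ArrayList<>();
        for (int i = 0; i < textures.getLength(); i++) {
            Element texture = (Element) textures.item(i);
            files.add(texture.getAttribute("file"));
        }
        return files;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<scene><texture file=\"file1.dds\"/><texture file=\"file2.dds\"/>"
                + "<node name=\"cube\"><texture name=\"stone\" unit=\"0\" sampler=\"anisotropic\"/></node>"
                + "</scene>";
        System.out.println(textureFiles(xml));
    }
}
```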

I found myself a solution that works fine:
Element parent = ... ;
String childName = "texture";
NodeList childs = parent.getChildNodes();
for (int nodeIx = 0; nodeIx < childs.getLength(); nodeIx++) {
Node node = childs.item(nodeIx);
if (node.getNodeType() == Node.ELEMENT_NODE
&& node.getNodeName().equals(childName)) {
// cool stuff here
}
}
