Parsing XSI with Java

I'm trying to parse XML strings into a Document that I can use for easy searching. With certain kinds of XML this doesn't seem to work: the document is never constructed, and the method returns null for an XML message like the one at the bottom. No exception is thrown by anything in my try/catch.
My code currently looks like this:
Document convertMessageToDoc(String message){
Document doc = null;
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(message));
doc = db.parse(is);
}
catch (Exception e) {
//e.printStackTrace();
doc = null;
}
return doc;
}
What are some ways that I would be able to work with something like this:
<ns1:SubmitFNOLResponse xmlns:ns1="http://website.com/">
<ns1:FNOLReporting xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="ns1:FNOLReporting">
<ns1:FNOLResponse>
<ns1:FNOLStatusInfo>
<ns1:StatusCode>0</ns1:StatusCode>
<ns1:StatusMessages />
</ns1:FNOLStatusInfo>
</ns1:FNOLResponse>
</ns1:FNOLReporting>
</ns1:SubmitFNOLResponse>

It looks like your document is not "well formed". You need a single root element where you have two sibling "ns1:Prod" tags at the root.

Your document is not well-formed XML. Once it is, everything appears to work as expected.
String message =
"<ns1:Prods xmlns:ns1='/foo'>"// xmlns:ns1='uri'>"
+ "<ns1:Prod>"
+ " <ns1:ProductID>316</ns1:ProductID>"
+ " <ns1:Name>Blade</ns1:Name>"
+ "</ns1:Prod>"
+ "<ns1:Prod>"
+ " <ns1:ProductID>317</ns1:ProductID>"
+ " <ns1:Name>LL Crankarm</ns1:Name>"
+ " <ns1:Color>Black</ns1:Color>"
+ "</ns1:Prod>"
+ "</ns1:Prods>";
Document doc = null;
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setValidating(false);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(message));
doc = db.parse(is);
NodeList sections = doc.getElementsByTagName("ns1:Prod");
int numSections = sections.getLength();
for (int i = 0; i < numSections; i++) {
Element section = (Element) sections.item(i);
NodeList prodinfos = section.getChildNodes();
for (int j = 0; j < prodinfos.getLength(); j++) {
Node info = prodinfos.item(j);
if (info.getNodeType() != Node.TEXT_NODE) {
System.out.println(info.getNodeName() + ": " + info.getTextContent());
}
}
System.out.println("");
}
} catch (Exception e) {
e.printStackTrace();
doc = null;
}
// Outputs
ns1:ProductID: 316
ns1:Name: Blade
ns1:ProductID: 317
ns1:Name: LL Crankarm
ns1:Color: Black
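
For the message in the question, a namespace-aware lookup can be sketched as follows. This is a minimal sketch, not the asker's code: the class name XsiLookup and the helper firstText are made up for illustration. It assumes the message is well-formed, and it lets parse errors propagate instead of catching them, because the commented-out printStackTrace() in the original catch block hides whatever the parser reports.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class XsiLookup {

    // Returns the text of the first element with the given namespace URI and
    // local name, or null if no such element exists. Parser exceptions are
    // passed to the caller rather than swallowed.
    static String firstText(String xml, String namespaceUri, String localName) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // must be set before the builder is created
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        NodeList hits = doc.getElementsByTagNameNS(namespaceUri, localName);
        return hits.getLength() == 0 ? null : hits.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<ns1:SubmitFNOLResponse xmlns:ns1=\"http://website.com/\">"
            + "<ns1:FNOLReporting xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
            + " xsi:type=\"ns1:FNOLReporting\">"
            + "<ns1:FNOLResponse><ns1:FNOLStatusInfo>"
            + "<ns1:StatusCode>0</ns1:StatusCode><ns1:StatusMessages />"
            + "</ns1:FNOLStatusInfo></ns1:FNOLResponse>"
            + "</ns1:FNOLReporting></ns1:SubmitFNOLResponse>";
        // Look the element up by namespace URI, not by the ns1 prefix.
        System.out.println(firstText(xml, "http://website.com/", "StatusCode"));
    }
}
```

Matching on the namespace URI rather than the prefix keeps the lookup working even if the sender picks a different prefix than ns1.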

Related

Java get tag name of a Node

I need to read a small XML file and validate its content against a hardcoded HashMap with key = tag and value = text inside the tag.
I cannot get the tag name of the Node.
If I cast the Node to Element I get a cast exception.
I am reading using the DOM classes:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
NodeList list = doc.getElementsByTagName("MergeOptions");
if (list.getLength() == 0)
{
//throw
}
NodeList config = list.item(0).getChildNodes();
for (int i = 0; i <= config.getLength() - 1; i++)
{
Node setting = config.item(i);
String nodeName = setting.getNodeValue();
String value = setting.getTextContent();
if (defaultMergeOptions.containsKey(nodeName) == false)
{
//throw
}
if (defaultMergeOptions.get(nodeName).equals(value))
{
//throw
}
Xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MergeOptions>
<sometagName>false</sometagName>
</MergeOptions>

The following code structure should help. Once you can see each tag name and its value, you can apply your logic to compare them against the HashMap keys or values.
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Test1 {
public static void main(String[] args) throws Exception {
String xmlFile = "test.xml";
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
Element root = doc.getDocumentElement();
System.out.println(root.getNodeName());
NodeList list = root.getChildNodes();
for (int i = 0; i < list.getLength(); i++) {
Node node = list.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE)
{
System.out.println(node.getNodeName() + " : " + node.getTextContent());
}
}
}
}

I have tried to run your code and it works fine, with no class cast exceptions.
Note how I use the Element in the for loop to get its name, its value, or whether it has any children.
final String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n" +
"<MergeOptions>\n<sometagName>false</sometagName>\n</MergeOptions>";
final InputStream xsmlStream = new ByteArrayInputStream(xml.getBytes());
final DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
final DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
final Document doc = dBuilder.parse(xsmlStream);
final NodeList nodes = doc.getElementsByTagName("MergeOptions");
for (int i = 0; i < nodes.getLength(); i++) {
final Element element = (Element) nodes.item(i);
System.out.println(element.hasChildNodes());
System.out.println(element.getNodeValue());
System.out.println(element.getTagName());
}
Using a HashMap with node names as keys is a bit tricky: if your XML file has multiple nodes with the same name but different values, the HashMap can store only one value per key, so only one of those nodes can match it. The other nodes with the same name but different values will be reported as invalid.
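
One way to make that behaviour explicit is to walk the child elements and compare each against the map, so every duplicate is checked on its own. This is a minimal sketch under assumptions: the class name MergeOptionsCheck and the method mismatches are made up for illustration, and the expected values are assumed to live in a plain Map<String, String> like the defaultMergeOptions in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class MergeOptionsCheck {

    // Returns the name of every child element of the root whose name is not
    // a key in 'expected' or whose trimmed text differs from the mapped value.
    static List<String> mismatches(String xml, Map<String, String> expected) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> bad = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes, comments, etc.
            }
            String name = child.getNodeName();          // tag name, not getNodeValue()
            String value = child.getTextContent().trim();
            if (!value.equals(expected.get(name))) {    // also false for unknown names
                bad.add(name);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> expected = new HashMap<>();
        expected.put("sometagName", "false");
        String xml = "<MergeOptions>\n<sometagName>false</sometagName>\n"
                + "<sometagName>true</sometagName>\n</MergeOptions>";
        // The second sometagName differs from the expected value and is reported.
        System.out.println(mismatches(xml, expected));
    }
}
```

Note that getNodeName() returns the tag name for an element, whereas getNodeValue() returns null for element nodes.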

Well, I did something different.
It seems to work:
IntegrationTest.getInstance().getLogger().log(Level.INFO, "Reading merge-que file: " + xmlFile.getAbsolutePath());
try
{
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
for (Entry<String, String> entry : defaultMergeOptions.entrySet())
{
String tagName = entry.getKey();
NodeList list = doc.getElementsByTagName(tagName);
if (list.getLength() != 1)
{
IntegrationTest.getInstance().getLogger().log(Level.SEVERE, TestResult.FAIL, "Merge option [{0}] has invalid content. Tag [{1}] missing or too many",
new Object[] { xmlFile.getName(), tagName });
result = TestResult.FAIL;
continue;
}
if (!defaultMergeOptions.get(tagName).equals(list.item(0).getTextContent()))
{
IntegrationTest.getInstance().getLogger().log(Level.WARNING, TestResult.FAIL, "Merge option [{0}] has different content for tag [{1}].",
new Object[] { xmlFile.getCanonicalPath(), tagName });
result = TestResult.FAIL;
}
}
}
catch (Exception e)
{
IntegrationTest.getInstance().getLogger().log(Level.SEVERE, SBUtil.stackTraceToString(e.getStackTrace()));
throw new IntegrationTestException(e);
}
}

Unable to parse XML using java

I receive an XML string as a response, but I am unable to reach the ResponseCode and Remark elements. Can anybody help me get the response code?
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body>
<GetIMEIInfoResponse xmlns="http://tempuri.org/">
<GetIMEIInfoResult>
<![CDATA[
<SerialsDetail>
<Item>
<ResponseCode>2</ResponseCode>
<Remark>Invalid Input</Remark>
</Item>
</SerialsDetail>
]]>
</GetIMEIInfoResult>
</GetIMEIInfoResponse>
</s:Body>
</s:Envelope>
This is what I am trying:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(response)));
NodeList list = doc.getElementsByTagName("Remark");
System.out.println(list.getLength());
Node n = list.item(0);
System.out.println(n.getTextContent());
} catch (Exception e) {
e.printStackTrace();
}

You are asking for an element with name "Remark", but your document does not contain such an element. Instead, it contains only a "GetIMEIInfoResult" element with a bunch of text in it. This text happens to be XML. To access the contents of the inner piece of XML, you have to parse the text of "GetIMEIInfoResult" in the same way that you parsed the entire document.
Here is how you can do it:
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
public class NestedCDATA {
private static String response =
"<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
" <s:Body>" +
" <GetIMEIInfoResponse xmlns=\"http://tempuri.org/\">" +
" <GetIMEIInfoResult>" +
" <![CDATA[" +
" <SerialsDetail>" +
" <Item>" +
" <ResponseCode>2</ResponseCode>" +
" <Remark>Aawwwwwwww yeaaaah!</Remark>" +
" </Item>" +
" </SerialsDetail>" +
" ]]>" +
" </GetIMEIInfoResult>" +
" </GetIMEIInfoResponse>" +
" </s:Body>" +
"</s:Envelope>";
public static String getCdata(Node parent) {
NodeList cs = parent.getChildNodes();
for(int i = 0; i < cs.getLength(); i++){
Node c = cs.item(i);
if(c instanceof CharacterData) {
CharacterData cdata = (CharacterData)c;
String content = cdata.getData().trim();
if (content.length() > 0) {
return content;
}
}
}
return "";
}
public static void main(String[] args) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(response)));
Node cdataParent = doc.getElementsByTagName("GetIMEIInfoResult").item(0);
DocumentBuilder cdataBuilder = factory.newDocumentBuilder();
Document cdataDoc = cdataBuilder.parse(new InputSource(new StringReader(
getCdata(cdataParent)
)));
Node remark = cdataDoc.getElementsByTagName("Remark").item(0);
System.out.println("Content of Remark in CDATA: " + getCdata(remark));
} catch (Exception e) {
e.printStackTrace();
}
}
}
Result: "Content of Remark in CDATA: Aawwwwwwww yeaaaah!".
Here is another interesting question for you: why does your service output XML with XML in it? XML all by itself is already nested enough. Is it really necessary to wrap parts of it in CDATA?

The problem with the XML is that the data in the tag GetIMEIInfoResult is CDATA, so the builder treats it as plain text rather than as XML. To access the data in the tag GetIMEIInfoResult you can use the following:
Element infoResult = (Element) list.item(0);
String elementData = getCharacterDataOfNode(infoResult.getFirstChild());
public static String getCharacterDataOfNode(Node node) {
String data = "";
if (node instanceof CharacterData) {
data = ((CharacterData) node).getData();
}
return data;
}
Then you have to parse that data again with a DocumentBuilder, after which you can access the tag Remark. To get its content, use the getCharacterDataOfNode() method again.

Document.toString() is "[#document: null]" even though XML was parsed

Consider this example
@Test
public void testXML() {
final String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n" +
" <status>OK</status>\n" +
" <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>\n" +
" <url/>\n" +
" <language>english</language>\n" +
" <docSentiment>\n" +
" <type>neutral</type>\n" +
" </docSentiment>\n" +
"</results> ";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
Document doc = builder.parse( new InputSource( new StringReader( s ) ) );
System.out.println(doc.toString());
} catch (Exception e) {
e.printStackTrace();
}
}
When I run this example, System.out.println(doc.toString()); prints [#document: null].
I also validated this XML online and no errors were found. What am I missing?
What I need: the value of <docSentiment> in this XML.
Thanks

As per MadProgrammer's advice, I managed to get the value.
Note: Even though [#document: null] was shown, the document was not null, in reality.
@Test
public void testXML() {
final String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n" +
" <status>OK</status>\n" +
" <usage>By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html</usage>\n" +
" <url/>\n" +
" <language>english</language>\n" +
" <docSentiment>\n" +
" <type>neutral</type>\n" +
" </docSentiment>\n" +
"</results>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
Document doc = builder.parse( new InputSource( new StringReader( s ) ) );
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//docSentiment/type");
NodeList nl = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.out.println("Sentiment:" + ((DTMNodeList) nl).getDTMIterator().toString());
} catch (Exception e) {
e.printStackTrace();
}
}
and I got the output:
Sentiment:neutral
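
The cast to DTMNodeList above relies on a class internal to one XPath implementation. A more portable alternative, sketched here under the assumption that only the text of the first match is needed, is the two-argument XPath.evaluate, which returns the result as a String. The class name SentimentLookup and the method sentiment are made up for illustration. As far as I can tell, the "[#document: null]" text printed by the JDK's bundled parser is just the node name followed by the node value, and a Document node has no value, so it says nothing about whether parsing succeeded.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class SentimentLookup {

    // Returns the text of /results/docSentiment/type, or an empty string
    // if the path matches nothing.
    static String sentiment(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The String-returning overload converts the first matching node to its text.
        return xpath.evaluate("/results/docSentiment/type", doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><results>\n"
                + " <status>OK</status>\n"
                + " <docSentiment>\n"
                + "  <type>neutral</type>\n"
                + " </docSentiment>\n"
                + "</results>";
        System.out.println("Sentiment:" + sentiment(xml));
    }
}
```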

Read inside a Tag using XPath Java

Hi, I am new to reading XML files with Java. I am trying to read an XML file and get the data from a specific tag. I am using XPath, and my query is:
String expression = "/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE[@type='STRING']";
It works fine. The specific tag I want to read from is:
<ATTRIBUTE name="Description" type="STRING"> SOME TEXT </ATTRIBUTE>
I want to read only the data inside tags of this kind, so that my output is:
SOME TEXT
Can somebody help me do this? My best attempt so far is:
String expression = "/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE[@name='Description' and ./type/text()='STRING']";
But it doesn't give me any output.
Thanks in advance.
My code:
DocumentBuilderFactory builderFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
builder = builderFactory.newDocumentBuilder();
org.w3c.dom.Document document = builder.parse(
new FileInputStream("c:\\y.xml"));
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE[@name='Description' and @type='STRING']";
System.out.println(expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}
} catch (ParserConfigurationException | SAXException | IOException e) {
System.out.print(e);
}
There is a problem with my code, but I can't figure out what it is.

This code works fine for me with the XPath changed to
"/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE[@name='Description'][@type='STRING']":
private static final String EXAMPLE_XML =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<ADOXML adoversion=\"Version 5.1\" username=\"kvarga\" database=\"adonisdb\" time=\"08:55\" date=\"30.11.2013\" version=\"3.1\">" +
"<MODELS>" +
"<MODEL version=\"\" applib=\"ADONIS BPMS BP Library 5.1\" libtype=\"bp\" modeltype=\"Business process model\" name=\"Product development\" id=\"mod.25602\">" +
"<MODELATTRIBUTES>" +
"<ATTRIBUTE name=\"Version number\" type=\"STRING\"> </ATTRIBUTE>" +
"<ATTRIBUTE name=\"Author\" type=\"STRING\">kvarga</ATTRIBUTE>" +
"<ATTRIBUTE name=\"Description\" type=\"STRING\">I WANT THIS PARA 2</ATTRIBUTE>" +
"</MODELATTRIBUTES>" +
"</MODEL>" +
"</MODELS>" +
"</ADOXML>";
public static void main(String[] args) {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
builder = builderFactory.newDocumentBuilder();
Document document = builder.parse(new ByteArrayInputStream(EXAMPLE_XML.getBytes()));
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE[@name='Description'][@type='STRING']";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println("###" + nodeList.item(i).getFirstChild().getNodeValue() + "###");
}
} catch (Exception e) {
System.out.print(e);
}
}
OUTPUT:
###I WANT THIS PARA 2###

The code above works fine.
You can also select the text node directly:
String expression = "/ADOXML/MODELS/MODEL/MODELATTRIBUTES/ATTRIBUTE/text()";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
System.out.println(nodeList.item(0).getNodeValue());
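
Without a predicate, item(0) of that node list is the text of the first ATTRIBUTE element, which is not necessarily the Description one. A possible refinement, sketched here with the made-up names AttributeText and description, keeps the attribute predicates and wraps the path in the XPath normalize-space() function so that surrounding whitespace such as in " SOME TEXT " is trimmed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class AttributeText {

    // Returns the whitespace-normalized text of the ATTRIBUTE element whose
    // name is 'Description' and whose type is 'STRING', or "" if none matches.
    static String description(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expression = "normalize-space(/ADOXML/MODELS/MODEL/MODELATTRIBUTES/"
                + "ATTRIBUTE[@name='Description' and @type='STRING'])";
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ADOXML><MODELS><MODEL><MODELATTRIBUTES>"
                + "<ATTRIBUTE name=\"Author\" type=\"STRING\">kvarga</ATTRIBUTE>"
                + "<ATTRIBUTE name=\"Description\" type=\"STRING\"> SOME TEXT </ATTRIBUTE>"
                + "</MODELATTRIBUTES></MODEL></MODELS></ADOXML>";
        System.out.println(description(xml));
    }
}
```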

getNodeName() operation on an XML node returns #text

<person>
<firstname>
<lastname>
<salary>
</person>
This is the XML I am parsing. When I try printing the node names of child elements of person,
I get
#text
firstname
#text
lastname
#text
salary
How do I eliminate #text being generated?
Update -
Here is my code
try {
NodeList nl = null;
int l, i = 0;
File fXmlFile = new File("file.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
dbFactory.setValidating(false);
dbFactory.setIgnoringElementContentWhitespace(true);
dbFactory.setNamespaceAware(true);
dbFactory.setIgnoringComments(true);
dbFactory.setCoalescing(true);
InputStream in;
in = new FileInputStream(fXmlFile);
Document doc = dBuilder.parse(in);
doc.getDocumentElement().normalize();
Node n = doc.getDocumentElement();
System.out.println(dbFactory.isIgnoringElementContentWhitespace());
System.out.println(n);
if (n != null && n.hasChildNodes()) {
nl = n.getChildNodes();
for (i = 0; i < nl.getLength(); i++) {
System.out.println(nl.item(i).getNodeName());
}
}
} catch (Exception e) {
e.printStackTrace();
}

setIgnoringElementContentWhitespace only works if you use setValidating(true), and then only if the XML file you are parsing references a DTD that the parser can use to work out which whitespace-only text nodes are actually ignorable. If your document doesn't have a DTD it errs on the safe side and assumes that no text nodes can be ignored, so you'll have to write your own code to ignore them as you traverse the child nodes.
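
A minimal sketch of that manual filtering, assuming only element children are of interest. The class name ElementChildren and the method childElementNames are made up for illustration. Separately, note that factory settings are read when newDocumentBuilder() is called, so setters invoked after the builder has been created, as in the question, do not affect that builder.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
public class ElementChildren {

    // Returns the names of the root's child elements, skipping the
    // whitespace-only #text nodes that sit between them.
    static List<String> childElementNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) { // ignore text, comments, etc.
                names.add(child.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person>\n<firstname/>\n<lastname/>\n<salary/>\n</person>";
        for (String name : childElementNames(xml)) {
            System.out.println(name);
        }
    }
}
```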
