Java XML Parsing into a List and Grabbing Nodes

I am parsing an XML document and need to put every child node into a List so that I can then grab a specific child node by its index in the List. My code so far prints every child node, but I don't know how to put the nodes in a List; looping through them doesn't seem to work. Here is what I have so far:
public static void main(String[] args){
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try {
URL url = new URL ("http://feeds.cdnak.neulion.com/fs/nhl/mobile/feeds/data/20140401.xml");
URLConnection connection = url.openConnection();
InputStream is = connection.getInputStream();
// use the factory to create a documentbuilder
DocumentBuilder builder = factory.newDocumentBuilder();
// create a new document from input stream
Document doc = builder.parse(is);
// get the root element
Element element = doc.getDocumentElement();
System.out.println(element);
// get all child nodes
NodeList nodes = element.getChildNodes();
// print the text content of each child
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println("" + nodes.item(i).getTextContent());
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
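One way to approach this is to copy the element children of the root into an ArrayList and then index into that list. The following is only a minimal sketch: the class name, the helper names, and the sample XML are hypothetical (the real feed is not fetched here), and text or comment nodes between elements are skipped on the assumption that only elements are of interest.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildElementList {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Copies the element children of parent into a List, skipping text and comment nodes.
    static List<Element> childElements(Element parent) {
        List<Element> children = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                children.add((Element) node);
            }
        }
        return children;
    }

    public static void main(String[] args) {
        // Hypothetical sample document standing in for the real feed.
        String xml = "<games>\n  <game id=\"1\">A at B</game>\n  <game id=\"2\">C at D</game>\n</games>";
        List<Element> games = childElements(parse(xml).getDocumentElement());
        Element second = games.get(1); // index into the List like any other List
        System.out.println(games.size() + " children; second is: " + second.getTextContent());
    }
}
```

Filtering on Node.ELEMENT_NODE matters because getChildNodes() also returns the whitespace between tags as text nodes, which would otherwise shift the indexes.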

Related

How do I get relevant information from a JSON output?

I am trying to parse an XML file into JSON. While I am able to parse it successfully using a HashMap, the XML file contains a lot of irrelevant information, which also ends up in the JSON.
My XML file is a topology file, basically the topology of network elements and their respective processes, so it is divided into parent and child nodes. Most of the relevant information I seek lies with the parent nodes, and I want to disregard the child nodes by whatever means, so that only the parent nodes appear in the JSON.
Here's the code I wrote to parse it. I have tried to write code that gets the child nodes, but I can't figure out how to remove them (for example, what conditions I could use):
static String nodeType1,nodeType;
static String nodeName1,nodeName;
static String nodeIP1,nodeIP;
public static void main(String[] args) {
try {
File fXmlFile = new File("SystemTopology.txt");
DocumentBuilderFactory dbFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("SNOSNE") ;
Map<String, Object> data = new HashMap<String, Object>();
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
Element el = (Element) nNode;
nodeType = el.getAttribute("snostype");
nodeName = el.getAttribute("cimname");
nodeIP = el.getAttribute("snoshostip");
NodeList list = el.getChildNodes();
for (int i = 0; i < list.getLength(); i++) {
Node nNode1 = list.item(i);
if(list.item(i).getNodeType() == Node.ELEMENT_NODE){
Element element = (Element) list.item(i);
nodeType1 = element.getAttribute("snostype");
nodeName1 = element.getAttribute("cimname");
nodeIP1 = element.getAttribute("snoshostip");
if(!nodeIP1.isEmpty()) {
System.out.println(nodeType1);
System.out.println(nodeName1);
System.out.println(nodeIP1);
}
}
}
//Need to add conditions here that would get only child nodes
if(!nodeIP.isEmpty()) {
data.put(nodeName, nodeType+","+nodeIP);
}
}
JSONObject json = new JSONObject(data);
System.out.printf( "JSON: %s", json.toString(2));
}
catch (Exception excp)
{
System.out.println("topology file not found " + excp.getMessage());
}
}
Topology file looks like:
<SNOSNE cimname="EDA_01" snoshostip="1.1.1.1" snostype="EDA">
<SNOSNE cimname="Resources" snoshostip="1.1.1.1" snostype="EDA">
</SNOSNE>
<SNOSNE cimname="CPU" snoshostip="1.1.1.1" snostype="EDA">
</SNOSNE>
...
...
...
</SNOSNE>
The expected JSON output should contain only the parent with cimname="EDA_01"; all child nodes should be left out.
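One possible condition is to keep an SNOSNE element only when its parent is not itself an SNOSNE element. The sketch below illustrates that idea under stated assumptions: the class and method names are hypothetical, the wrapping Topology root in the sample is invented for the demo, and a plain Map is returned instead of a JSONObject so that no external library is needed.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TopLevelSnosne {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Maps cimname -> "snostype,snoshostip" for SNOSNE elements not nested in another SNOSNE.
    static Map<String, String> topLevelNodes(Document doc) {
        Map<String, String> data = new LinkedHashMap<>();
        NodeList all = doc.getElementsByTagName("SNOSNE");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            Node parent = el.getParentNode();
            boolean nested = parent != null && "SNOSNE".equals(parent.getNodeName());
            String ip = el.getAttribute("snoshostip");
            if (!nested && !ip.isEmpty()) {
                data.put(el.getAttribute("cimname"), el.getAttribute("snostype") + "," + ip);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        // Hypothetical sample; the <Topology> wrapper is an assumption, not from the real file.
        String xml = "<Topology>"
                + "<SNOSNE cimname=\"EDA_01\" snoshostip=\"1.1.1.1\" snostype=\"EDA\">"
                + "<SNOSNE cimname=\"Resources\" snoshostip=\"1.1.1.1\" snostype=\"EDA\"/>"
                + "<SNOSNE cimname=\"CPU\" snoshostip=\"1.1.1.1\" snostype=\"EDA\"/>"
                + "</SNOSNE></Topology>";
        System.out.println(topLevelNodes(parse(xml)));
    }
}
```

The resulting map could then be passed to new JSONObject(data) as in the code above. If SNOSNE is the document root, its parent is the Document node, so the same check still treats it as top-level.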

How to get data from XML node?

I am struggling to get the data out of the following XML node. I use DocumentBuilder to parse XML, and I usually get the value of a node by its tag name, but in this case I am not sure what the node name would be.
<Session.openRs status="success" sessionID="19217B84:AA3649FE:B211FF37:E61A78F1:7A35D91D:48E90C41" roleBasedSecurity="1" entityID="1" />
This is how I am getting the values for other tags by the tag name.
public List<NYProgramTO> getNYPPAData() throws Exception {
this.getConfiguration();
List<NYProgramTO> to = dao.getLatestNYData();
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
Document document = null;
// Returns chunkSize
/*List<NYProgramTO> myList = getNextChunk(to);
ExecutorService executor = Executors.newFixedThreadPool(myList.size());
myList.stream().parallel()
.forEach((NYProgramTO nyTo) ->
{
executor.execute(new NYExecutorThread(nyTo, migrationConfig , appContext, dao));
});
executor.shutdown();
executor.awaitTermination(300, TimeUnit.SECONDS);
System.gc();*/
try {
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource source = new InputSource();
for(NYProgramTO nyProgram: to) {
String reqXML = nyProgram.getRequestXML();
String response = RatingRequestProcessor.postRequestToDC(reqXML, URL);
// dao.storeData(nyProgram);
System.out.println(response);
if(response != null) {
source.setCharacterStream(new StringReader(response));
document = builder.parse(source);
NodeList list = document.getElementsByTagName(NYPG3Constants.SERVER);
for(int iterate = 0; iterate < list.getLength(); iterate++){
Node node = list.item(iterate);
if(node.getNodeType() == Node.ELEMENT_NODE) {
Element element = (Element) node;
nyProgram.setResponseXML(response);
nyProgram.setFirstName(element.getElementsByTagName(NYPG3Constants.F_NAME).item(0).getTextContent());
nyProgram.setLastName(element.getElementsByTagName(NYPG3Constants.L_NAME).item(0).getTextContent());
nyProgram.setPolicyNumber(element.getElementsByTagName(NYPG3Constants.P_NUMBER).item(0).getTextContent());
nyProgram.setZipCode(element.getElementsByTagName(NYPG3Constants.Z_CODE).item(0).getTextContent());
nyProgram.setDateOfBirth(element.getElementsByTagName(NYPG3Constants.DOB).item(0).getTextContent());
nyProgram.setAgencyCode(element.getElementsByTagName(NYPG3Constants.AGENCY_CODE).item(0).getTextContent());
nyProgram.setLob(element.getElementsByTagName(NYPG3Constants.LINE_OF_BUSINESS).item(0).getTextContent());
if(element.getElementsByTagName(NYPG3Constants.SUBMISSION_NUMBER).item(0) != null){
nyProgram.setSubmissionNumber(element.getElementsByTagName(NYPG3Constants.SUBMISSION_NUMBER).item(0).getTextContent());
} else {
nyProgram.setSubmissionNumber("null");
}
I need to get the value of sessionID. What I want to know is which node to ask for. I am retrieving the values via tag names, so what would the tag name be in this case?
Thanks in advance

You should consider using XPath. At least for me, it is much easier to use. In your case, to get sessionID you could try something like this:
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "/Session.openRs/@sessionID";
String sessionID = xPath.evaluate(expression,document);
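You can obtain 'document' like this (response is the XML string from your code):
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(response)));
Hope this helps!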
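For reference, here is a small self-contained sketch of both routes to the attribute: the XPath expression with '@', and plain DOM, where the tag name is simply the full element name including the dot. The class and method names are hypothetical, and the sample response assumes Session.openRs is the root element, which the question does not confirm.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class SessionIdLookup {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // XPath route: '@' selects an attribute of the matched element.
    static String viaXPath(String xml) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate("/Session.openRs/@sessionID", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // DOM route: look the element up by its full name, then read the attribute.
    static String viaDom(String xml) {
        Element session = (Element) parse(xml).getElementsByTagName("Session.openRs").item(0);
        return session == null ? "" : session.getAttribute("sessionID");
    }

    public static void main(String[] args) {
        // Shortened, hypothetical response; assumes Session.openRs is the root element.
        String response = "<Session.openRs status=\"success\" sessionID=\"19217B84\" entityID=\"1\" />";
        System.out.println(viaXPath(response));
        System.out.println(viaDom(response));
    }
}
```

If the element is nested deeper in the real response, the DOM route still finds it, while the XPath expression would need to start with // instead of /.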

Filling Vector from NodeList

I am trying to fill a Vector of Strings with data from a NodeList (whose values are Strings too), but it doesn't work and the Vector stays empty.
What am I doing wrong, and how can I fix it?
Thanks in advance!
Document doc = parseFile(xml);
Vector <String> x = new Vector <>();
NodeList list = doc.getElementsByTagName("Stuff");
for (int i = 0; i < list.getLength(); i++) {
x.addElement(list.item(i).getFirstChild().getNodeValue());
}
public Document parseFile(File file) {
// declared here so it is in scope for the return statement
Document doc = null;
try {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
doc = builder.parse(file);
} catch (Exception e) { e.printStackTrace(); }
return doc;
}

I made a mistake in the tag name in NodeList list = doc.getElementsByTagName("Stuff");
I was looking for the wrong "Stuff" :)
Sorry, and thank you all for the help anyway
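A wrong tag name fails silently because getElementsByTagName returns an empty NodeList rather than throwing, so the loop body never runs. The sketch below shows this with a difference in letter case, which is only one hypothetical way the name could be wrong; the class name, method name, and sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class TagNameLookup {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collects the text content of every element with the given tag name.
    static List<String> textOf(Document doc, String tagName) {
        List<String> values = new ArrayList<>();
        NodeList list = doc.getElementsByTagName(tagName);
        for (int i = 0; i < list.getLength(); i++) {
            values.add(list.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        Document doc = parse("<root><stuff>a</stuff><stuff>b</stuff></root>");
        System.out.println(textOf(doc, "Stuff")); // name does not match: nothing collected
        System.out.println(textOf(doc, "stuff")); // exact name: both values collected
    }
}
```

Printing list.getLength() before the loop is a quick way to spot this kind of mismatch.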

How to get all attributes from each element separately?

Here's a basic XML document:
<h1>My Heading</h1>
<p align = "center"> My paragraph
<img src="smiley.gif" alt="Smiley face" height="42" width="42"></img>
<img src="sad.gif" alt="Sad face" height="45" width="45"></img>
<img src="funny.gif" alt="Funny face" height="48" width="48"></img>
</p>
<p>My para</p>
What I am trying to do is find an element and all of its attributes, and save the attribute name and attribute value for each element. Here's my code so far:
private Map <String, String> tag = new HashMap <String,String> ();
public Map <String, String> findElement () {
try {
FileReader fRead = new FileReader (sourcePage);
BufferedReader bRead = new BufferedReader (fRead);
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance ();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder ();
Document doc = docBuilder.parse(new FileInputStream (new File (sourcePage)));
XPathFactory xFactory = XPathFactory.newInstance ();
XPath xPath = xFactory.newXPath ();
NodeList nl = (NodeList) xPath.evaluate("//img/@*", doc, XPathConstants.NODESET);
for( int i=0; i<nl.getLength (); i++) {
Attr attr = (Attr) nl.item(i);
String name = attr.getName();
String value = attr.getValue();
tag.put (name,value);
}
bRead.close ();
fRead.close ();
}
catch (Exception e) {
e.printStackTrace();
System.err.println ("An error has occured.");
}
The problem appears when I look for the attributes of the img elements, because they all share the same attribute names. A HashMap is not suitable for this, since it overwrites values that have the same key. Maybe I'm using the wrong expression to find all attributes. Is there another way to get the attribute names and values of the nth img element?

First, let's level the field a little. I cleaned up your code a bit to have a compiling starting point. I removed the unnecessary code and fixed the method according to my best guess of what it is supposed to do. I also generalized it a little so that it accepts a tagName parameter. It's still the same code and makes the same mistake, but now it compiles (Java 7 features are used for convenience; switch it back to Java 6 if you want). I also split the try-catch into multiple blocks just for the sake of it:
public Map<String, String> getElementAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList attributeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
attributeList = (NodeList)xPath.evaluate("//descendant::" + tagName + "[1]/@*", document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeList.getLength(); i++) {
Attr attribute = (Attr)attributeList.item(i);
tagInfo.put(attribute.getName(), attribute.getValue());
}
return tagInfo;
}
When run against your example code above, it returns:
{height=48, alt=Funny face, width=48, src=funny.gif}
The solution depends on what your expected output is. You want either:
1. To get the attributes of only one of the <img> elements (say, the first one)
2. To get a list of all <img> elements and their attributes
For the first solution, it's enough to change your XPath expression to
//descendant::img[1]/@*
or
//descendant::" + tagName + "[1]/@*
with the tagName parameter. Beware that this is not the same as //img[1]/@*, even though it returns the same element in this particular case.
When changed this way, the method returns:
{height=42, alt=Smiley face, width=42, src=smiley.gif}
which are correctly returned attributes of the first <img> element.
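To make the difference between these expressions concrete, the following sketch counts how many <img> elements each one selects in a document where the images sit under two different parents. The class name, method name, and sample markup are hypothetical. Note one caveat that follows from XPath 1.0 semantics: with a leading //, the descendant::img[1] step is evaluated from every node, so it can also match more than one element in such a document, whereas /descendant::img[1] and (//img)[1] always select only the first <img> in document order.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class FirstImgXPath {

    // Returns how many nodes the expression selects in the given XML string.
    static int count(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: two images under <p>, one under <div>.
        String xml = "<body><p><img src=\"a.gif\"/><img src=\"b.gif\"/></p>"
                + "<div><img src=\"c.gif\"/></div></body>";
        String[] expressions = {
            "//img[1]",             // first img child of each parent
            "//descendant::img[1]", // first img below each node
            "/descendant::img[1]",  // first img in the whole document
            "(//img)[1]"            // same, written with parentheses
        };
        for (String expression : expressions) {
            System.out.println(expression + " selects " + count(xml, expression) + " element(s)");
        }
    }
}
```

Appending /@* to any of these expressions then returns the attributes of whichever elements were selected.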
Note that you don't even have to use XPath expression for this kind of work. Here's a non-XPath version:
public Map<String, String> getElementAttributesByTagNameNoXPath(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
Node node = document.getElementsByTagName(tagName).item(0);
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeMap.getLength(); i++) {
Node attribute = attributeMap.item(i);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
return tagInfo;
}
The second solution needs to change things a bit. We want to return the attributes of all <img> elements in the document. Multiple elements means we'll use a List which will hold multiple Map<String, String> instances, where every Map represents one <img> element.
A complete XPath version in case you actually need some complex XPath expression:
public List<Map<String, String>> getElementsAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList nodeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
nodeList = (NodeList)xPath.evaluate("//" + tagName, document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
List<Map<String, String>> tagInfoList = new ArrayList<>();
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int j = 0; j < attributeMap.getLength(); j++) {
Node attribute = attributeMap.item(j);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
tagInfoList.add(tagInfo);
}
return tagInfoList;
}
To get rid of the XPath part, you can simply switch it to a one-liner:
NodeList nodeList = document.getElementsByTagName(tagName);
Both these versions, when run against your test case above with an "img" parameter, return this: (formatted for clarity)
[ {height=42, alt=Smiley face, width=42, src=smiley.gif},
{height=45, alt=Sad face, width=45, src=sad.gif },
{height=48, alt=Funny face, width=48, src=funny.gif } ]
which is a correct list of all the <img> elements.

try using
Map <String, ArrayList<String>> tag = new HashMap <String, ArrayList<String>> ();

You can use a map inside the map:
Map<Map<Integer, String>, String> // Integer = some index (0, 1, etc.); the inner String is the attribute name (e.g. src); the outer String is the attribute value (e.g. smiley.gif)
OR
You can invert it and keep that in mind when using it, like:
Map<String, String> // key = attribute value (e.g. smiley.gif), value = attribute name (e.g. src)

How to retrieve XML including tags using the DOM parser

I am using org.w3c.dom to parse an XML file. I need to return the ENTIRE XML for a specific node, including the tags, not just the values inside the tags. I'm using a NodeList because I need to count how many records are in the file, but I also need to read the file wholesale from the beginning and then write it out to a new XML file. My current code prints only the value of the node, not the node itself. I'm stumped.
public static void main(String[] args) {
try {
File fXmlFile = new File (args[0]);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList listOfRecords = doc.getElementsByTagName("record");
int totalRecords = listOfRecords.getLength();
System.out.println("Total number of records : " + totalRecords);
int amountToSplice = queryUser();
for (int i = 0; i < amountToSplice; i++) {
String stringNode = listOfRecords.item(i).getTextContent();
System.out.println(stringNode);
}
} catch (Exception e) {
e.printStackTrace();
}
}

getTextContent() will only "return the text content of this node and its descendants" i.e. you only get the content of the 'text' type nodes. When parsing XML it's good to remember there are several different types of node, see XML DOM Node Types.
To do what you want, you could create a utility method like this...
public static String nodeToString(Node node) throws TransformerException
{
// an identity transformer copies the node, tags included, to the output
Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
t.setOutputProperty(OutputKeys.INDENT, "yes");
StringWriter sw = new StringWriter();
t.transform(new DOMSource(node), new StreamResult(sw));
return sw.toString();
}
Then loop and print like this...
for (int i = 0; i < amountToSplice; i++)
System.out.println(nodeToString(listOfRecords.item(i)));
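Put together, the approach might look like the following self-contained sketch. The class name and the sample XML are hypothetical, checked exceptions are wrapped in an unchecked one for brevity, and the INDENT property is left out here so that the serialized text stays close to the input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RecordSerializer {

    // Parses an XML string; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Serializes a node, tags and attributes included, using an identity transform.
    static String nodeToString(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sw = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document with two records.
        Document doc = parse("<records><record id=\"1\"><name>A</name></record>"
                + "<record id=\"2\"><name>B</name></record></records>");
        NodeList records = doc.getElementsByTagName("record");
        System.out.println("Total number of records : " + records.getLength());
        for (int i = 0; i < records.getLength(); i++) {
            System.out.println(nodeToString(records.item(i)));
        }
    }
}
```

To write the result to a new XML file instead of the console, the StreamResult can wrap a File rather than a StringWriter.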
