Java: Parsing an XML value using javax.xml and XPath

I have an XmlParserClass to get values from the xml file which looks like this.
<?xml version="1.0" encoding="utf-8" ?>
<HomePageData>
<LogoTopLeft>//*[@id='corp_logo']</LogoTopLeft>
<SingInLink>//*[@id='login']</SingInLink>
<SingUpLink>//*[@id='signup']</SingUpLink>
</HomePageData>
And the method in my class file looks like this:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
XPath xp = XPathFactory.newInstance().newXPath();
try{
String value = xp.evaluate("/LogoTopLeft/text()", doc);
return value;
} catch(XPathExpressionException e)
{
return null;
}
I am not able to get the expected data from the XML file using this class. Execution reaches the try block and then falls into the catch, which returns null. Most of the questions on Stack Overflow are answered with a for loop that collects all nodes, but I need to fetch one value at a time, not all elements at once. I also need to return the value to another class that accepts only Strings, so I can't pass a NodeList or any other element type.
P.S. The XML file is in a different location from the parser class. I stored the classpath value "/projName/src/com/core/path/indexPage.xml" in a file and passed it in.

You just need to fix your XPath. /LogoTopLeft looks for LogoTopLeft as the root element of the document, but it is a child of the root. So either use //LogoTopLeft or specify the full path, /HomePageData/LogoTopLeft:
String logo = xp.evaluate("//LogoTopLeft/text()", doc);
String signIn = xp.evaluate("//SignInLink/text()", doc);
String signUp = xp.evaluate("//SignUpLink/text()", doc);
System.out.println( "logo = " + logo +
"; signIn = " + signIn +
"; signUp = " + signUp);
/* prints:
logo = //*[@id='corp_logo']; signIn = //*[@id='login']; signUp = //*[@id='signup']
*/
EDIT : (My Test Code)
Document doc = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(new File("input.xml"));
XPath xp = XPathFactory.newInstance().newXPath();
try {
String logo = xp.evaluate("/HomePageData/LogoTopLeft/text()", doc);
String signIn = xp.evaluate("//SignInLink/text()", doc);
String signUp = xp.evaluate("//SignUpLink/text()", doc);
System.out.println( "logo = " + logo +
"; signIn = " + signIn +
"; signUp = " + signUp);
} catch (XPathExpressionException e) {
e.printStackTrace();
}
input.xml (placed in project directory)
<?xml version="1.0" encoding="utf-8" ?>
<HomePageData>
<LogoTopLeft>//*[@id='corp_logo']</LogoTopLeft>
<SignInLink>//*[@id='login']</SignInLink>
<SignUpLink>//*[@id='signup']</SignUpLink>
</HomePageData>
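Since the calling class accepts only Strings, the lookup can be wrapped in a small helper that returns one value at a time. The following is a minimal sketch, not the asker's actual class: the names `XmlValueReader` and `readValue` are hypothetical, and the XML is passed in as a String only to keep the example self-contained. Note that `XPath.evaluate(String, Object)` returns an empty String when nothing matches, so a wrong path does not necessarily end up in the catch block.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the text of one child of <HomePageData> as a String.
public class XmlValueReader {

    public static String readValue(String xml, String elementName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            // Full path from the root element; this overload of evaluate returns a String.
            return xp.evaluate("/HomePageData/" + elementName + "/text()", doc);
        } catch (Exception e) {
            // Parsing and XPath errors are wrapped so callers only deal with Strings.
            throw new IllegalStateException("Could not read " + elementName, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<HomePageData>"
                + "<LogoTopLeft>//*[@id='corp_logo']</LogoTopLeft>"
                + "<SignInLink>//*[@id='login']</SignInLink>"
                + "</HomePageData>";
        System.out.println(readValue(xml, "LogoTopLeft"));
        // An element that does not exist yields an empty String, not an exception.
        System.out.println(readValue(xml, "Missing").isEmpty());
    }
}
```

Throwing an unchecked exception instead of returning null is a design choice of this sketch; it makes a broken file distinguishable from a missing element.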

Related

Unable to parse element attribute with XOM

I'm attempting to parse an RSS feed using the XOM Java library. Each entry's image URL is stored as an attribute of the <img> element, as seen below.
<rss version="2.0">
<channel>
<item>
<title>Decision Paralysis</title>
<link>https://xkcd.com/1801/</link>
<description>
<img src="https://imgs.xkcd.com/comics/decision_paralysis.png"/>
</description>
<pubDate>Mon, 20 Feb 2017 05:00:00 -0000</pubDate>
<guid>https://xkcd.com/1801/</guid>
</item>
</channel>
</rss>
Attempting to parse <img src=""> with .getFirstChildElement("img") only returns a null pointer, making my code crash when I try to retrieve <img src= ...>. Why is my program failing to read in the <img> element, and how can I read it in properly?
import nu.xom.*;
public class RSSParser {
public static void main(String[] args) {
try {
Builder parser = new Builder();
Document doc = parser.build ( "https://xkcd.com/rss.xml" );
Element rootElement = doc.getRootElement();
Element channelElement = rootElement.getFirstChildElement("channel");
Elements itemList = channelElement.getChildElements("item");
// Iterate through itemList
for (int i = 0; i < itemList.size(); i++) {
Element item = itemList.get(i);
Element descElement = item.getFirstChildElement("description");
Element imgElement = descElement.getFirstChildElement("img");
// Crashes with NullPointerException
String imgSrc = imgElement.getAttributeValue("src");
}
}
catch (Exception error) {
error.printStackTrace();
System.exit(1);
}
}
}

There is no img element in the item. Try
if (imgElement != null) {
String imgSrc = imgElement.getAttributeValue("src");
}
What the item contains is this:
<description><img
src="http://imgs.xkcd.com/comics/us_state_names.png"
title="Technically DC isn't a state, but no one is too
pedantic about it because they don't want to disturb the snakes
."
alt="Technically DC isn't a state, but no one is too pedantic about it because they don't want to disturb the snakes." />
</description>
That's not an img element. It's plain text.
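The point can be illustrated with a short sketch. It uses the JDK's built-in DOM parser rather than XOM, and it assumes that the feed stores the markup escaped (as `&lt;img ... /&gt;`) and that the escaped fragment happens to be well-formed XML; neither is guaranteed for every feed. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class EscapedDescriptionDemo {

    private static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the src attribute of the <img> markup stored as text inside <description>.
    public static String imageSrc(String itemXml) {
        Element description = (Element) parse(itemXml)
                .getElementsByTagName("description").item(0);
        // The escaped markup is character data, so it is read as text
        // and then parsed as a second, separate XML document.
        Element img = parse(description.getTextContent().trim()).getDocumentElement();
        return img.getAttribute("src");
    }

    public static void main(String[] args) {
        String item = "<item><description>"
                + "&lt;img src=\"https://imgs.xkcd.com/comics/decision_paralysis.png\"/&gt;"
                + "</description></item>";
        System.out.println(imageSrc(item));
    }
}
```

If the embedded fragment is not well-formed XML, the second parse fails, and an HTML-tolerant parser or a pattern match is needed instead.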

I managed to come up with a somewhat hacky solution using regex and pattern matching.
// Iterate through itemList
for (int i = 0; i < itemList.size(); i++) {
Element item = itemList.get(i);
String descString = item.getFirstChildElement("description").getValue();
// Parse image URL (hacky)
String imgSrc = "";
Pattern pattern = Pattern.compile("src=\"[^\"]*\"");
Matcher matcher = pattern.matcher(descString);
if (matcher.find()) {
imgSrc = descString.substring( matcher.start()+5, matcher.end()-1 );
}
}
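The offset arithmetic (`start()+5`, `end()-1`) can be avoided with a capturing group. This is a small variation on the code above, wrapped in a hypothetical `ImgSrcExtractor` class so that it runs on its own; it makes the same assumption that the attribute is written with double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ImgSrcExtractor {

    // Group 1 captures everything between the double quotes that follow src=.
    private static final Pattern SRC = Pattern.compile("src=\"([^\"]*)\"");

    // Returns the first src value found, or an empty String if there is none.
    public static String extractSrc(String description) {
        Matcher matcher = SRC.matcher(description);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        String description =
                "<img src=\"https://imgs.xkcd.com/comics/decision_paralysis.png\"/>";
        System.out.println(extractSrc(description));
    }
}
```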

Java based lotus agent how to change content type

I have a Java agent in Lotus Notes that receives an XML document and parses it.
This is the code:
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
import java.util.ArrayList;
import java.util.Date;
import java.util.Iterator;
import java.util.Arrays;
public class JavaAgent extends AgentBase {
private static PrintWriter pw;
public void NotesMain() {
try {
Session session = getSession();
AgentContext agentContext = session.getAgentContext();
Database db = agentContext.getCurrentDatabase();
//domino 03
boolean ProductieDatabase = false;
String ApplicatieServer = "CN=Server01";
String OrderDB = "General\\Order.nsf";
String TestOrderDB = "General\\TestOrder.nsf";
String Relations= "General\\Relations.nsf";
String Tabel= "General\\Tabel.nsf";
org.w3c.dom.Document domdoc = null;
lotus.domino.Document doc = agentContext.getDocumentContext();
Item requestContent = null;
StringBuffer sb = new StringBuffer();
pw = getAgentOutput();
//Open databases
Database RelationDB= session.getDatabase(ApplicatieServer,Relations, false);
Database TabelDB= session.getDatabase(ApplicatieServer,Tabel, false);
//Open order database
Database TabelDB;
if(ProductieDatabase == true)
{
TabelDB= session.getDatabase(ApplicatieServer,OrderDB, false);
}
else
{
TabelDB= session.getDatabase(ApplicatieServer,TestOrderDB, false);
}
//Create a new request document
lotus.domino.Document RequestDoc = db.createDocument();
//Add all fields from the HTTP POST to the document
Vector items = doc.getItems();
for (int j=0; j<items.size(); j++)
{
// System.out.println ("Testorders 3");
Item item = (Item)items.elementAt(j);
String fldName = item.getName();
String fldValue = item.getValueString();
RequestDoc.replaceItemValue(fldName, fldValue);
if ( fldName.matches("(?i).*request_content.*") )
{
sb.append( fldValue );
}
}
RequestDoc.replaceItemValue("Form", "Response");
RequestDoc.replaceItemValue("Status", "0");
RequestDoc.replaceItemValue("ID_Request", RequestDoc.getUniversalID());
Date currentDate=new Date(System.currentTimeMillis());
SimpleDateFormat formatterDatum=new SimpleDateFormat("yyyyMMdd");
SimpleDateFormat formatterTijd=new SimpleDateFormat("HHmmss");
String HubwooBestandspad = "C:\\OCI\\asd\\";
String Bestnaam = cxvxcv+ "asdasd" + RequestDoc.getUniversalID() + "_" + formatterDatum.format(currentDate) + "_" + formatterTijd.format(currentDate) + ".xml";
Stream outStream = session.createStream();
if (outStream.open(Bestnaam, "ASCII")) {
if (outStream.getBytes() == 0) {
outStream.writeText(sb.toString());
outStream.close();
}
else
System.out.println("Output file exists and has content");
}
else{
System.out.println("Output file open failed");
}
File input = new File(Bestnaam);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
domdoc = dBuilder.parse(input);
domdoc.getDocumentElement().normalize();
NodeList nList = domdoc.getElementsByTagName("ItemOut");
String test = "TEST OCI AGENT";
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
Element eElement = (Element) nNode;
System.out.println("\nCurrent Element :" + nList.getLength());
test = test + "**Delivery Date**" + eElement.getAttribute("requestedDeliveryDate") + "!END! [" + temp + "]";
test = test + " " + eElement.getElementsByTagName("SupplierPartID").item(0).getTextContent();
test = test + " " + eElement.getElementsByTagName("Description").item(0).getTextContent();
}
RequestDoc.replaceItemValue("JSON", sb.toString());
RequestDoc.replaceItemValue("JSON", test);
// TO GET THE WHOLE XML FILE
RequestDoc.replaceItemValue("XmlTU",sb.toString());
RequestDoc.save();
//change content-type
pw.println("Content-Type: text/xml");
//change charset
pw.println("charset: UFT-8");
//send 200 OK back
pw.println("200 OK");
}
catch(Exception e) {
e.printStackTrace();
}
}
}
How can I successfully change the content type to text/xml and the charset to UTF-8? I was thinking about creating an extra class that extends HttpServlet and has a doPost method, but how would I then use that object in this agent?
At the moment the service works: with the Firefox Poster add-on I receive a 200 OK response with Content-Type text/xml. But when the service is called from a .NET environment, the .NET side reports a protocol violation error.
Can someone help?
Code of the .NET application:
Public Function SubmitOrder(ByVal OrderXML, ByVal URL)
Dim objXMLHTTP 'MSXML2.ServerXMLHTTP
'We provide our own error handling
On Error Resume Next
'The MSXML2.ServerXMLHTTP object allows xml to be posted to
'external webpages.
Set objXMLHTTP = CreateObject("MSXML2.ServerXMLHTTP.6.0")
If Err.Number <> 0 Then
SubmitOrder = "ERROR: Failed to create ServerXMLHTTP - " & Err.Description
Exit Function
End If
'Specify the webpage we wish to use
objXMLHTTP.Open "POST", URL, False
If Err.Number <> 0 Then
SubmitOrder = "ERROR: Failed to open URL - " & Err.Description
Exit Function
End If
'The type of information we are sending
objXMLHTTP.setRequestHeader "Content-Type", "text/xml"
If Err.Number <> 0 Then
SubmitOrder = "ERROR: Failed to setRequestHeader - " & Err.Description
Exit Function
End If
'Send the information
objXMLHTTP.send replace(OrderXML,"UTF-16","UTF-8")
If Err.Number <> 0 Then
SubmitOrder = "ERROR: Failed to send data to " & URL & " - " & Err.Description & ", Error number --> " & Err.Number
Exit Function
End If
'Return the result
SubmitOrder = objXMLHTTP.responseText
end Function
public Function CheckSuccess(ByVal Response)
'Check for any internal errors
If Left(Response, 6) = "ERROR:" Then
CheckSuccess = False
Exit Function
End If
'If the response includes the text code="200" then the
'export was a success
CheckSuccess = instr(Response, "code=""200""") <> 0
End Function

Parsing XML in Java to extract all nodes & attributes

I am stuck trying to parse some XML documents to obtain the output I require.
Take this sample XML:
<root>
<ZoneRule Name="After" RequiresApproval="false">
<Zone>
<WSAZone ConsecutiveDayNumber="1">
<DaysOfWeek>
<WSADaysOfWeek Saturday="false"/>
</DaysOfWeek>
<SelectedLimits>
</SelectedLimits>
<SelectedHolidays>
</SelectedHolidays>
</WSAZone>
</Zone>
</ZoneRule>
<ZoneRule Name="Before" RequiresApproval="false">
<Zone>
<WSAZone ConsecutiveDayNumber="3">
<DaysOfWeek>
<WSADaysOfWeek Saturday="true"/>
</DaysOfWeek>
<SelectedLimits>
</SelectedLimits>
<SelectedHolidays>
</SelectedHolidays>
</WSAZone>
</Zone>
</ZoneRule>
</root>
What I am attempting to do is ignore the root tag (this is working, so no problems here) and treat each ZoneRule as its own individual block.
Once I have each ZoneRule isolated, I need to extract all of its nodes and attributes so that I can create a string to query a database and check whether it exists (this part is also working).
The issue I am having is that my code cannot separate out each individual ZoneRule block; for some reason they are all processed as one.
My sample code is as follows:
public String testXML = "";
int andCount = 0;
public void printNote(NodeList nodeList) {
for (int count = 0; count < nodeList.getLength(); count++) {
Node tempNode = nodeList.item(count);
// make sure it's element node.
if (tempNode.getNodeType() == Node.ELEMENT_NODE) {
if (tempNode.hasAttributes()) {
// get attributes names and values
NamedNodeMap nodeMap = tempNode.getAttributes();
for (int i = 0; i < nodeMap.getLength(); i++) {
Node node = nodeMap.item(i);
if (andCount == 0) {
testXML = testXML + "XMLDataAsXML.exist('//" + tempNode.getNodeName() + "[./@" + node.getNodeName() + "=\"" + node.getNodeValue() + "\"]')=1 \n";
} else {
testXML = testXML + " and XMLDataAsXML.exist('//" + tempNode.getNodeName() + "[./@" + node.getNodeName() + "=\"" + node.getNodeValue() + "\"]')=1 \n";
}
andCount = andCount + 1;
}
}
if (tempNode.hasChildNodes()) {
// loop again if has child nodes
printNote(tempNode.getChildNodes());
}
}
}
}
private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {
try {
File file = new File("C:\\Test.xml");
DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = dBuilder.parse(file);
//System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
if (doc.hasChildNodes()) {
printNote(doc.getChildNodes());
}
} catch (Exception e) {
System.out.println(e.getMessage());
}
System.out.println(testXML);
}
Which produces this output (both nodes combined).
XMLDataAsXML.exist('//ZoneRule[./@Name="After"]')=1
and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1
and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="1"]')=1
and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="false"]')=1
and XMLDataAsXML.exist('//ZoneRule[./@Name="Before"]')=1
and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1
and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="3"]')=1
and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="true"]')=1
What I am actually after is this (excuse the incomplete SQL statements):
XMLDataAsXML.exist('//ZoneRule[./@Name="After"]')=1
and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1
and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="1"]')=1
and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="false"]')=1
XMLDataAsXML.exist XMLDataAsXML.exist('//ZoneRule[./@Name="Before"]')=1
and XMLDataAsXML.exist('//ZoneRule[./@RequiresApproval="false"]')=1
and XMLDataAsXML.exist('//WSAZone[./@ConsecutiveDayNumber="3"]')=1
and XMLDataAsXML.exist('//WSADaysOfWeek[./@Saturday="true"]')=1
The XML to be parsed will not always look exactly like the above, so I cannot use hardcoded XPaths. I need to loop through the document dynamically, using the ZoneRule node as my base (I will generate this name dynamically from the file received), and then extract all the required information.
I am completely open to better methods than what I have tried above.
Thanks very much.

In your code, testXML and andCount are declared outside the printNote method and are never reset between iterations.
You start with the first ZoneRule and generate the correct text during the first iterations of the for loop (let's ignore the recursion for now). When you move on to the next ZoneRule, testXML still contains all of the text generated so far and andCount is larger than 0, so the text for the next ZoneRule is simply appended.
You could reset andCount and testXML at the beginning of each iteration of the for loop, but then the children processed recursively would not be rendered correctly.
So either you need two methods, one for the top-level ZoneRule elements and another for their children, or, much better, stop appending to a shared variable and redesign your methods so that they return a String, which the caller can then append correctly (with or without "and", with or without a newline) at the point of the recursive call.
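One way to sketch that redesign: a recursive method collects the conditions for a single block into a list owned by the caller, and a top-level method starts a fresh list for every base element and joins it with "and". The class name `ZoneRuleQueryBuilder`, the method names, and passing the XML as a String are assumptions made for illustration; the condition text follows the format from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ZoneRuleQueryBuilder {

    // Adds one condition per attribute of this element and of every descendant element.
    private static void collect(Node node, List<String> conditions) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attr = attributes.item(i);
            conditions.add("XMLDataAsXML.exist('//" + node.getNodeName()
                    + "[./@" + attr.getNodeName() + "=\"" + attr.getNodeValue() + "\"]')=1");
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), conditions);
        }
    }

    // Returns one query string per element named baseName (for example "ZoneRule").
    public static List<String> buildQueries(String xml, String baseName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList blocks = doc.getElementsByTagName(baseName);
            List<String> queries = new ArrayList<>();
            for (int i = 0; i < blocks.getLength(); i++) {
                // Fresh state for every block, so nothing leaks between ZoneRules.
                List<String> conditions = new ArrayList<>();
                collect(blocks.item(i), conditions);
                queries.add(String.join("\n and ", conditions));
            }
            return queries;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>"
                + "<ZoneRule Name=\"After\"><WSAZone ConsecutiveDayNumber=\"1\"/></ZoneRule>"
                + "<ZoneRule Name=\"Before\"><WSAZone ConsecutiveDayNumber=\"3\"/></ZoneRule>"
                + "</root>";
        for (String query : buildQueries(xml, "ZoneRule")) {
            System.out.println(query);
            System.out.println();
        }
    }
}
```

Because the "and" is inserted when the list is joined, no counter is needed to decide whether a condition is the first one in its block.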

Reading all the namespaces in a DOM document

I want to read all the namespaces in a DOM document.
My input XML file is:
<a:Sample xmlns:a="http://a.org/"
xmlns:b="http://b.org/">
<a:Element b:Attribute="text"> </a:Element>
</a:Sample>
I want to get all the prefixes with their associated namespaces in the given input XML.
I have a method with the following definition.
public Document check(Document srcfile) {
Document naReport = null;
if(srcfile != null) {
// Parse the document using builder.
if (srcfile instanceof DocumentTraversal) {
DocumentTraversal dt = (DocumentTraversal) srcfile;
NodeIterator i = dt.createNodeIterator(srcfile, NodeFilter.SHOW_ELEMENT, null, false);
System.out.println(srcfile.getPrefix());
System.out.println(srcfile.getNamespaceURI());
Element element = (Element) i.nextNode();
while (element != null) {
String prefix = element.getPrefix();
if (prefix != null) {
String uri = element.getNamespaceURI();
System.out.println("Prefix: " + prefix);
System.out.println("URI: " + uri);
// bindings.put(prefix, uri);
}
element = (Element) i.nextNode();
}
}
}
return naReport;
}
But, when I run my program, I'm getting the following output:
Prefix: a
URI: http://a.org/
Prefix: a
URI: http://a.org/
Only the prefix a is reported, and b is missing. Could someone help me?

You will need to loop over the attributes of each element inside your main element loop:
NamedNodeMap map = element.getAttributes();
for (int iattr=0; iattr<map.getLength(); iattr++) {
Attr attr = (Attr)map.item(iattr);
if (attr.getNamespaceURI() != null) {
System.out.println("Attr " + attr.getName() + " - " + attr.getNamespaceURI());
}
}
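To collect the prefix-to-URI bindings themselves, the same attribute loop can pick out the namespace declarations. The sketch below assumes a namespace-aware parser, in which `xmlns` and `xmlns:*` attributes are reported in the reserved namespace given by `XMLConstants.XMLNS_ATTRIBUTE_NS_URI`. The class name `NamespaceCollector` is hypothetical, and the XML is passed as a String only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceCollector {

    // Returns prefix -> namespace URI for every declaration found in the document.
    // The default namespace, if declared, is stored under the empty prefix "".
    public static Map<String, String> collect(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getNamespaceURI() to work
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Map<String, String> bindings = new LinkedHashMap<>();
            NodeList elements = doc.getElementsByTagName("*"); // every element
            for (int i = 0; i < elements.getLength(); i++) {
                NamedNodeMap attrs = elements.item(i).getAttributes();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Attr attr = (Attr) attrs.item(j);
                    if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attr.getNamespaceURI())) {
                        // xmlns:a has prefix "xmlns" and local name "a"; plain xmlns has no prefix.
                        String prefix = attr.getPrefix() == null ? "" : attr.getLocalName();
                        bindings.put(prefix, attr.getValue());
                    }
                }
            }
            return bindings;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a:Sample xmlns:a=\"http://a.org/\" xmlns:b=\"http://b.org/\">"
                + "<a:Element b:Attribute=\"text\"> </a:Element>"
                + "</a:Sample>";
        collect(xml).forEach((prefix, uri) -> System.out.println(prefix + " = " + uri));
    }
}
```

A prefix can be rebound to a different URI on a nested element; this sketch keeps only the last binding seen for each prefix.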

Java XML DOM error when adding elements

I am trying to replicate this XML:
<?xml version="1.0"?>
<AccessRequest xml:lang="en-US">
<AccessLicenseNumber>YourLicenseNumber</AccessLicenseNumber>
<UserId>YourUserID</UserId>
<Password>YourPassword</Password>
</AccessRequest>
<?xml version="1.0"?>
<AddressValidationRequest xml:lang="en-US">
<Request>
<TransactionReference>
<CustomerContext>Your Test Case Summary Description</CustomerContext>
<XpciVersion>1.0</XpciVersion>
</TransactionReference>
<RequestAction>XAV</RequestAction>
<RequestOption>3</RequestOption>
</Request>
<AddressKeyFormat>
<AddressLine>AIRWAY ROAD SUITE 7</AddressLine>
<PoliticalDivision2>SAN DIEGO</PoliticalDivision2>
<PoliticalDivision1>CA</PoliticalDivision1>
<PostcodePrimaryLow>92154</PostcodePrimaryLow>
<CountryCode>US</CountryCode>
</AddressKeyFormat>
</AddressValidationRequest>
I am using one class to build the request:
public UpsRequestBuilder()
{
try
{
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
doc = docBuilder.newDocument();
}
catch(Exception e)
{
System.out.println(e.getMessage());
}
}
public void accessRequestBuilder(String accessKey, String username, String password)
{
Element accessRequest = doc.createElement("AccessRequest");
doc.appendChild(accessRequest);
Element license = doc.createElement("AccessLicenseNumber");
accessRequest.appendChild(license);
license.setTextContent(accessKey);
Element userId = doc.createElement("UserId");
accessRequest.appendChild(userId);
userId.setTextContent(username);
Element pass = doc.createElement("Password");
accessRequest.appendChild(pass);
pass.setTextContent(password);
System.out.println("completed Requestbuilder");
}
public void addAddress(Address address)
{
Element addressKeyFormat = doc.createElement("AddressKeyFormat");
doc.appendChild(addressKeyFormat);
Element addressLine = doc.createElement("AddressLine");
addressKeyFormat.appendChild(addressLine);
addressLine.setTextContent(address.getState() + ' ' + address.getStreet2());
Element city = doc.createElement("PoliticalDivision2");
addressKeyFormat.appendChild(city);
city.setTextContent(address.getCity());
Element state = doc.createElement("PoliticalDivision1");
addressKeyFormat.appendChild(state);
state.setTextContent(address.getState());
Element zip = doc.createElement("PostcodePrimaryLow");
addressKeyFormat.appendChild(zip);
zip.setTextContent(address.getZip());
Element country = doc.createElement("CountryCode");
addressKeyFormat.appendChild(country);
country.setTextContent(address.getCountry());
System.out.println("completed addAddress");
}
public void validateAddressRequest(String customerContextString, String action)
{
Element addressValidation = doc.createElement("AddressValidationRequest");
doc.appendChild(addressValidation);
Element transactionReference = doc.createElement("TransactionReference");
addressValidation.appendChild(transactionReference);
Element customerContext = doc.createElement("CustomerContext");
Element version = doc.createElement("XpciVersion");
transactionReference.appendChild(customerContext);
customerContext.setTextContent(customerContextString); //TODO figure out a way to optionally pass context text
transactionReference.appendChild(version);
version.setTextContent("1.0");//change this if the api version changes
Element requestAction = doc.createElement("RequestAction");
addressValidation.appendChild(requestAction);
requestAction.setTextContent(action);
System.out.println("completed validateAddressRequest");
}
And this is the function that uses it:
public void validateAddress(Address address)
{
UpsRequestBuilder request = new UpsRequestBuilder();
request.accessRequestBuilder(accessKey, username, password);
request.validateAddressRequest("", "3");
request.addAddress(address);
System.out.println(request.toString());
}
When I try and print out the XML from this, I get the error "HIERARCHY_REQUEST_ERR: An attempt was made to insert a node where it is not permitted." It happens in the validateAddressRequest function when I try and add the addressValidation element to the document (doc). Here is the exact line:
doc.appendChild(addressValidation);
what is the problem with adding this element to the document?

what is the problem with adding this element to the document?
You're trying to add it at the top level of the document. You can't do that, as the document already has a root element. Any XML document can only have a single root element.
The XML you've shown at the top of your question isn't a single XML document - it's two.
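A minimal sketch of building the request as two separate documents, each with its own single root element, and concatenating their serialized forms. The class name `TwoDocumentRequest` and its method signatures are hypothetical, only a few of the child elements from the question are included, and the exact XML declaration written by the transformer may differ from the `<?xml version="1.0"?>` shown in the question.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TwoDocumentRequest {

    // Creates a new document whose one and only root element is rootName.
    private static Document newDocument(String rootName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement(rootName));
        return doc;
    }

    // Appends <name>text</name> to the root element, never to the document itself.
    private static void addChild(Document doc, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        doc.getDocumentElement().appendChild(child);
    }

    private static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static String build(String licence, String user, String password, String city) {
        try {
            Document access = newDocument("AccessRequest");
            addChild(access, "AccessLicenseNumber", licence);
            addChild(access, "UserId", user);
            addChild(access, "Password", password);

            // The second request goes into its own document.
            Document validation = newDocument("AddressValidationRequest");
            Element address = validation.createElement("AddressKeyFormat");
            validation.getDocumentElement().appendChild(address);
            Element cityElement = validation.createElement("PoliticalDivision2");
            cityElement.setTextContent(city);
            address.appendChild(cityElement);

            return serialize(access) + serialize(validation);
        } catch (Exception e) {
            throw new IllegalStateException("Could not build request", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("YourLicenseNumber", "YourUserID", "YourPassword", "SAN DIEGO"));
    }
}
```

Elements such as AddressKeyFormat are appended to the root element rather than to the document, which avoids the HIERARCHY_REQUEST_ERR.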
