Parsing dom error in java - java

I am trying to parse an XML and then insert it an Excel File.
If I run my code it works even with errors but I cannot make any modification to it because I still got errors. Here is my code:
public class Parsing {
private void parseXmlFile(){
//get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
//using Factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
//parse using builder to get DOM representation
dom = db.parse("Employee.xml"); }
} catch )
}
}
What is wrong with this?
Can someone help me? I've been searching all over google and it's eating my nerves.

it should be like this :-
private void parseXmlFile(){
//get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
//using Factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
//parse using builder to get DOM representation
Document dom = db.parse("Employee.xml");
} catch(IOException ex ){ // OR Any Specific Exception should be catched here
// your error handling code here
}
}
Also Employee.xml should be in the current directory or give complete abosulte path of Employee.xml file also.

Related

How to get xml page by url

Ok, so I got some url link like https://stackoverflow.com/ and I'm trying to parse it in document but getting error. Why? Because this is not xml file, so the question is how can I get data as xml if i got only url?
My code:
public class URLReader {
public static void main(String[] args) throws Exception {
// or if you prefer DOM:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new URL("https://stackoverflow.com/").openStream());
int nodes = doc.getChildNodes().getLength();
System.out.println(nodes + " nodes found");
}
}
To parse HTML you may use JSOUP: https://jsoup.org/
This library provides also some features to transform HTML to XHTML, which some sort of XML:
Document document = Jsoup.parse(html);
document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
document.outputSettings().escapeMode(org.jsoup.nodes.Entities.EscapeMode.xhtml);
String xhtml=document.html();

Java parse a xml with file drop

Having the filedrop already implemented in my code, I need to parse the xml file I drop in the main().
Main()
case "XML":
text.append("Processing file type XML: "+files[i].getCanonicalPath() + "\n" );
ReadXml read_xml = new ReadXml();
read_xml.read(files[i].getCanonicalPath(), text);
break;
ReadXml.java
public class ReadXml {
ProgramDocument programDocument = new ProgramDocument();
public void read(String FILE, javax.swing.JTextArea text ) {
try {
JAXBContext context = JAXBContext.newInstance(ProgramDocument.class);
Unmarshaller u = context.createUnmarshaller();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(FILE);
Object o = u.unmarshal( doc );
doc.getDocumentElement().normalize();
text.append("Account : " +doc.getElementsByTagName("Account").item(0));
}
catch(Exception e) {
text.append("XML file not parsed correctly.\n");
}
}
}
I am not able to print anything, and when I am, I see "NULL" or just empty row or some path#numbers
I am not a developer, I just need to try opening a xml a send contents to a DB, but this is too far already.
EDIT: added part of xml
<?xml version="1.0" encoding="UTF-8"?>
<ARRCD Version="48885" Release="38">
<Identification v="ORCOZIO"/>
<Version v="013"/>
<Account v="OCTO">
<Type v="MAJO"/>
<Date v="2016-05-14"/>
</AARCD>
There are no elements tagged "Account" in the element "Account".
What you want to read here are the Attributes of Account, not other elements.
Thus you should use eElement.getAttribute("v") if you want to read attribute v, not getElementsByTagName()

XPath expression unable to match

I'm trying to parse out some information from XML using XPath in Java (v 1.7). My XML looks like this:
<soap:Fault xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<faultcode>code</faultcode>
<faultstring>string</faultstring>
<detail>detail</detail>
</soap:Fault>
My code:
final InputSource inputSource = new InputSource(new StringReader(xmlContent));
final DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
final Document document = documentBuilder.parse(inputSource);
final XPath xPath = XPathFactory.newInstance().newXPath();
final String faultCode = xPath.compile("/soap:Fault/faultcode/text()[1]").evaluate(document);
I have tried the XPath expression in an online checker with the XML content and it suggests that a match is made. However, when I run it in a wee stand-alone program, I get no value in "faultCode".
This issue is probably something simple, but I am unable to identify what the problem is.
Thanks for any assistance.
You should bind the namespace prefix "soap" to the URI "http://schemas.xmlsoap.org/soap/envelope/" using the XPath.setNamespaceContext() method.
First you need a namespace aware document builder factory:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Then you need a namespace context:
NamespaceContext nsContext = new NamespaceContext() {
#Override
public Iterator getPrefixes(String namespaceURI) {
return null;
}
#Override
public String getPrefix(String namespaceURI) {
return "soap";
}
#Override
public String getNamespaceURI(String prefix) {
return "http://schemas.xmlsoap.org/soap/envelope/";
}
};
xPath.setNamespaceContext(nsContext);
With these additions, your code should work.
Regarding namespace contexts, I suggest you read http://www.ibm.com/developerworks/xml/library/x-nmspccontext/index.html.
It may be because of the namespaces effect. You can try namespace independent tags by matching with the local-name giving the syntax below:
/*[local-name()='Fault' and namespace-uri()='http://schemas.xmlsoap.org/soap/envelope/']/*[local-name()='faultcode']/text()[1]

Java DOM parser returns null document

I have an HTML template which I want to read in:
<html>
<head>
<title>TEST</title>
</head>
<body>
<h1 id="hey">Hello, World!</h1>
</body>
</html>
I want find the tag with the id hey and then paste in new stuff (e.g. new tags). For this purpose I use the DOM parser. But my code returns me null:
public static void main(String[] args) {
try {
File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(file);
doc.getDocumentElement().normalize();
System.out.println(doc.getElementById("hey")); // returns null
} catch (Exception e) {
e.printStackTrace();
}
}
What am I doing wrong?
You are trying to parse a piece of XML with the Java XML API, that is very compliant with the XML specification and doesn't help the casual developer.
In XML an attribute named id is not automatically of ID type, and thus the XML implementation doesn't get it with .getElementById(). Either you use another library (Jsoup for example), or instruct the parser to treat id as an ID (via the DTD) or you use custom code.
I modified your example to using jsoup
public static void main(String[] args) {
try {
File file = new File("C:\\Users\\<username>\\Desktop\\template.html");
Document doc = Jsoup.parse(file, "UTF8");
Element elementById = doc.getElementById("hey");
System.out.println("hey ="+doc.getElementById("hey").ownText());
System.out.println("hey ="+doc.getElementById("hey"));
} catch (Exception e) {
e.printStackTrace();
}
}

How to prevent XML Injection like XML Bomb and XXE attack

I am developing an android application with
android:minSdkVersion="14"
In this app in need to parse an xml.For that I am using a DOM parser like this
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = null;
Document doc = null;
try {
dBuilder = dbFactory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
e.printStackTrace();
}
But when the code is checked for security I got two security issues on line
dBuilder = dbFactory.newDocumentBuilder();, which are
1.XML Entity Expansion Injection (XML Bomb)
2.XML External Entity Injection (XXE attack)
After some researching I added the line
dbFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
But now I am getting an exception when this line is executed
javax.xml.parsers.ParserConfigurationException: http://javax.xml.XMLConstants/feature/secure-processing
Can anybody help me?
Did you try the following snippet from OWASP page?
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException; // catching unsupported features
...
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// This is the PRIMARY defense. If DTDs (doctypes) are disallowed, almost all XML entity attacks are prevented
// Xerces 2 only - http://xerces.apache.org/xerces2-j/features.html#disallow-doctype-decl
String FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
dbf.setFeature(FEATURE, true);
// If you can't completely disable DTDs, then at least do the following:
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities
FEATURE = "http://xml.org/sax/features/external-general-entities";
dbf.setFeature(FEATURE, false);
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-parameter-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities
FEATURE = "http://xml.org/sax/features/external-parameter-entities";
dbf.setFeature(FEATURE, false);
// and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks" (see reference below)
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
// And, per Timothy Morgan: "If for some reason support for inline DOCTYPEs are a requirement, then
// ensure the entity settings are disabled (as shown above) and beware that SSRF attacks
// (http://cwe.mitre.org/data/definitions/918.html) and denial
// of service attacks (such as billion laughs or decompression bombs via "jar:") are a risk."
// remaining parser logic
...
catch (ParserConfigurationException e) {
// This should catch a failed setFeature feature
logger.info("ParserConfigurationException was thrown. The feature '" +
FEATURE +
"' is probably not supported by your XML processor.");
...
}
catch (SAXException e) {
// On Apache, this should be thrown when disallowing DOCTYPE
logger.warning("A DOCTYPE was passed into the XML document");
...
}
catch (IOException e) {
// XXE that points to a file that doesn't exist
logger.error("IOException occurred, XXE may still possible: " + e.getMessage());
...
}
String jaxbContext = "com.fnf.dfbatch.jaxb";
JAXBContext jc = null;
Unmarshaller u = null;
String FEATURE_GENERAL_ENTITIES = "http://xml.org/sax/features/external-general-entities";
String FEATURE_PARAMETER_ENTITIES = "http://xml.org/sax/features/external-parameter-entities";
try {
jc = JAXBContext.newInstance(jaxbContext);
u = jc.createUnmarshaller();
/*jobsDef = (BatchJobs) u.unmarshal(DfBatchDriver.class
.getClassLoader().getResourceAsStream(
DfJobManager.configFile));*/
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature(FEATURE_GENERAL_ENTITIES, false);
dbf.setFeature(FEATURE_PARAMETER_ENTITIES, false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(DfBatchDriver.class
.getClassLoader().getResourceAsStream(
DfJobManager.configFile));
jobsDef = (BatchJobs) u.unmarshal(document);

Categories

Resources