How to parse an XML document? - java

I have an XML document in a variable (not in a file). How can I get the data stored in it? I don't have any additional file for it; it is 'inside' my source code. When I use
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(XML);
(XML is my XML variable), I get an error:
java.io.FileNotFoundException: C:\netbeans\app-s7013\<network ip_addr="10.0.0.0\8" save_ip="true"> File not found.

Read your XML into a StringReader, wrap it in an InputSource, and pass that to your DocumentBuilder:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xml)));
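For completeness, here is the same idea as a small self-contained class with its imports. It is only a sketch: the class name ParseXmlFromString and the sample <network> string are made up for illustration and are not the asker's real data.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class ParseXmlFromString {

    // Parses XML held in a String. Wrapping the text in an InputSource keeps
    // DocumentBuilder from interpreting it as a file name or URL.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, loosely modelled on the question.
        String xml = "<network ip_addr=\"10.0.0.0/8\" save_ip=\"true\"/>";
        Document doc = parse(xml);
        // Read one attribute from the root element.
        System.out.println(doc.getDocumentElement().getAttribute("save_ip"));
    }
}
```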

Assuming that XML is a String, don't be confused by the version that takes a string - the string is a URL, not your input!
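What you need is the version that takes an input stream or an InputSource.
You need to create one of those from your string (I'll try and find a code sample, but you can Google for that). Usually a StringReader is involved; since it is a Reader rather than an InputStream, it has to be wrapped in an InputSource.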
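If you do want the overload that takes an InputStream, one possible sketch is to encode the string to bytes and wrap them in a ByteArrayInputStream. The class name ParseXmlFromBytes is hypothetical, and the code assumes the string has no XML declaration naming an encoding other than UTF-8; if it does, the bytes and the declaration would disagree.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class ParseXmlFromBytes {

    // Turns the String into UTF-8 bytes and hands the parser an InputStream.
    // Assumption: the XML does not declare a different encoding.
    static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<greeting>hello</greeting>");
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```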

Related

Can I parse XML in Java without taking XML file input from outside?

Generally, when using a DOM, SAX, or XPath parser, we take input from outside the Java code, like this:
File inputFile = new File("C:\\Users\\DELL\\Desktop\\catalog.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
So can you parse XML without taking input like this? I want to write my XML alongside my Java code.
Use DocumentBuilder.parse(new InputSource(new StringReader(xml))) where xml is a string containing the XML to be parsed.
That's if you really must use DOM. I can't imagine why anyone uses it any more, when alternatives such as JDOM2 are so much better.

Java replacing ampersand in XML file

I am processing a list of URLs pointing to XML files. My problem is that some of them are not well formed because they contain "&" (ampersand) characters, so my code cannot parse them correctly.
<elementType>CK037 - AT&ZN -SET</elementType>
How can I avoid this? Should I first read the XML as a String and replace each "&" with "&amp;"? Are there any more appropriate solutions for my problem?
This is my code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
Document doc = null;
try {
doc = factory.newDocumentBuilder().parse(new URL(inputURLString).openStream());
(...)
Thanks in advance.
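One possible approach, sketched below, is exactly the one suggested in the question: read the XML as a String, escape only the ampersands that do not already start an entity or character reference, then parse the result. The class and method names are hypothetical, and the regular expression is a heuristic: it would also rewrite ampersands inside CDATA sections or comments, so fixing the producer of the XML is preferable where that is possible.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AmpersandEscaper {

    // Matches '&' unless it begins something shaped like an entity reference
    // (&name;) or a character reference (&#38; or &#x26;).
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z_:][A-Za-z0-9_:.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Replaces every bare ampersand with the &amp; entity.
    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    // Escapes bare ampersands, then parses the repaired text.
    static Document parseLenient(String rawXml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        String repaired = escapeBareAmpersands(rawXml);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(repaired)));
    }

    public static void main(String[] args) throws Exception {
        String raw = "<elementType>CK037 - AT&ZN -SET</elementType>";
        Document doc = parseLenient(raw);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

With the element from the question, the parsed text content is the original string with a literal "&" in it, because the parser decodes "&amp;" back to "&".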

Http charset vs xml encoding (utf-8, utf-16, etc)

Which one should I use to parse the XML file: the HTTP charset or the XML encoding? What is the recommended approach for parsing XML received over HTTP? My approach is to read the XML as a String and use DocumentBuilder to parse the String.
Is this the right approach?
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
Document doc = null;
InputSource is = null;
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
is = new InputSource(new StringReader(xmlString));
doc = dBuilder.parse(is);
XML specifies its own encoding in the XML declaration, <?xml version="1.0" encoding="..."?>, defaulting to UTF-8.
Building a StringReader from a String assumes the bytes have already been decoded, possibly with a guessed encoding. That is less advisable than handing the parser a binary source such as a File or an InputStream, so that it can apply the declared encoding itself.
Another factor is the document base, which is used to find included documents, XSDs, and DTDs. An XML catalog can help there by storing such files offline.
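To illustrate the point about binary input, here is a minimal sketch in which the parser receives raw bytes and picks the encoding from the XML declaration. The class name and the sample document are hypothetical; in a real HTTP client the bytes would come from the response body instead of being built in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class EncodingAwareParse {

    // Parses raw bytes; the parser reads the XML declaration and decodes the
    // rest of the document with the encoding named there.
    static String rootText(byte[] xmlBytes)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document declared and encoded as ISO-8859-1.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        // No charset is passed to the parser; it comes from the declaration.
        System.out.println(rootText(bytes).equals("caf\u00e9"));
    }
}
```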

org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'tns:root_element'

I have spent the past 2 hours on this and am unable to figure out why this error is occurring. I have a simple XSD and a simple XML file.
xsd file:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema">
<element name="root_element" type="string"/>
</schema>
xml file:
<?xml version="1.0" encoding="UTF-8"?>
<root_element>"asd"</root_element>
My java code is:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
SchemaFactory s_factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
dbf.setSchema(s_factory.newSchema(new File(schemafile)));
dbf.setValidating(true);
dbf.setFeature("http://apache.org/xml/features/validation/schema", true);
DocumentBuilder db = dbf.newDocumentBuilder();
CommodityPropsErrorHandler cp_eh = new CommodityPropsErrorHandler();
db.setErrorHandler(cp_eh);
Document doc = db.parse(new File(props_file));
Any comments would be helpful. regards
I think the main issue is with:
dbf.setValidating(true);
According to the Java API documentation for DocumentBuilderFactory.setValidating:
Specifies that the parser produced by this code will validate
documents as they are parsed. By default the value of this is set to
false.
Note that "the validation" here means a validating parser as defined
in the XML recommendation. In other words, it essentially just
controls the DTD validation. (except the legacy two properties
defined in JAXP 1.2.)
To use modern schema languages such as W3C XML Schema or RELAX NG
instead of DTD, you can configure your parser to be a non-validating
parser by leaving the setValidating(boolean) method false, then
use the setSchema(Schema) method to associate a schema to a parser.
Also you don't need:
dbf.setFeature("http://apache.org/xml/features/validation/schema", true);
Your working code would probably be just this (though I don't know what is in the CommodityPropsErrorHandler class):
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
SchemaFactory s_factory =
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
dbf.setSchema(s_factory.newSchema(new File(schemafile)));
DocumentBuilder db = dbf.newDocumentBuilder();
CommodityPropsErrorHandler cp_eh = new CommodityPropsErrorHandler();
db.setErrorHandler(cp_eh);
Document doc = db.parse(new File(props_file));
Here is a second, alternative approach that keeps dbf.setValidating(true); it uses the two JAXP properties mentioned in the Java API documentation:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setValidating(true);
dbf.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
XMLConstants.W3C_XML_SCHEMA_NS_URI);
dbf.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaSource",
new File(schemafile));
DocumentBuilder db = dbf.newDocumentBuilder();
CommodityPropsErrorHandler cp_eh = new CommodityPropsErrorHandler();
db.setErrorHandler(cp_eh);
Document doc = db.parse(new File(props_file));
This line makes the parser, and therefore validation, namespace aware. Without it, validation fails because the declaration of the element cannot be found:
dbf.setNamespaceAware(true);
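The first approach can be tried without any files on disk. The sketch below inlines the schema from the question as a string and collects validation errors in a list. The class name and the collecting ErrorHandler are assumptions for illustration; they stand in for CommodityPropsErrorHandler, whose contents are not shown in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class SchemaValidationSketch {

    // The schema from the question, inlined so the example needs no files.
    static final String XSD =
            "<schema xmlns=\"http://www.w3.org/2001/XMLSchema\">"
          + "<element name=\"root_element\" type=\"string\"/>"
          + "</schema>";

    // Parses xml against XSD and returns the validation error messages.
    static List<String> validate(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for XML Schema validation
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        dbf.setSchema(sf.newSchema(new StreamSource(new StringReader(XSD))));
        // setValidating(true) is deliberately not called: it controls DTD validation.

        List<String> problems = new ArrayList<>();
        DocumentBuilder db = dbf.newDocumentBuilder();
        db.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        db.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<root_element>asd</root_element>")); // matches the schema
        System.out.println(validate("<other_element/>"));                 // not declared in the schema
    }
}
```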

Java XML parser?

I'm currently converting a program I wrote in Visual Basic .NET (the 2005 variety) into Java. It used built-in XML methods to parse and generate the user's saved data. Does Java have an equivalent feature built in, or am I going to have to change file processing implementations? (I'd rather not; there's a lot of code I'd have to change.)
Yes, Java can parse XML. Here's an example that takes in a String that contains XML and builds a Document object out of it:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inputSource = new InputSource(new StringReader(xml));
Document document = documentBuilder.parse(inputSource);
You can then use the XPath API to query the DOM. Here's a tutorial/writeup about it.
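As a rough illustration of querying with the standard javax.xml.xpath API, here is a small sketch. The class name, the sample XML, and the expression are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XPathSketch {

    // Parses the XML string and returns the string value of the XPath expression.
    static String evaluate(String xml, String expression)
            throws ParserConfigurationException, SAXException, IOException,
                   XPathExpressionException {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, document);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<root><child attr=\"value\">I am a child node</child></root>";
        System.out.println(evaluate(xml, "/root/child/@attr")); // attribute value
        System.out.println(evaluate(xml, "/root/child"));       // element text
    }
}
```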
As far as serializing objects to XML goes, the official implementation is JAXB, which shipped with Java from 1.6 (it was removed from the JDK in Java 11 and is now added as a separate dependency). Here's a simple example. It will let you serialize and deserialize to and from XML.
You can also create a DOM object manually and add nodes to it, but it's a little more tedious:
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootNode = document.createElement("root");
Element childNode = document.createElement("child");
childNode.setTextContent("I am a child node");
childNode.setAttribute("attr", "value");
rootNode.appendChild(childNode);
document.appendChild(rootNode);
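To turn a DOM built this way back into XML text, one common option is the identity Transformer from javax.xml.transform. The sketch below rebuilds the same small tree and serializes it to a String; the class name is hypothetical, and omitting the XML declaration is just a choice made for this example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomToString {

    // Builds the small root/child tree shown above and serializes it to text.
    static String buildAndSerialize()
            throws ParserConfigurationException, TransformerException {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element rootNode = document.createElement("root");
        Element childNode = document.createElement("child");
        childNode.setTextContent("I am a child node");
        childNode.setAttribute("attr", "value");
        rootNode.appendChild(childNode);
        document.appendChild(rootNode);

        // A Transformer created without a stylesheet copies input to output.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildAndSerialize());
    }
}
```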
I'm assuming you mean that the properties/structure were generated through the classes/beans themselves? If so, then the answer is no (not without a third-party component). I've used XStream before, and that is about the closest I've gotten to .NET's XML class serialization.
