I am processing a list of URLs pointing to XML files. My problem is that some of them are not well formed because they contain bare "&" (ampersand) characters, so my code cannot parse them correctly.
<elementType>CK037 - AT&ZN -SET</elementType>
How can I avoid this? Should I first read the XML as a String and replace each "&" with "&amp;"? Are there any more appropriate solutions to my problem?
This is my code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
Document doc = null;
try {
doc = factory.newDocumentBuilder().parse(new URL(inputURLString).openStream());
(...)
Thanks in advance.
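A minimal sketch of the "read it as a String first" idea from the question. It assumes the feeds are UTF-8 and that the only defect is a bare "&". The class name AmpersandRepair and its method names are made up for illustration. The regular expression leaves anything that already looks like an entity or character reference (such as &amp; or &#38;) untouched.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class AmpersandRepair {
    // Matches "&" that does not start a named entity or a character reference
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    static Document parseLenient(InputStream in) throws Exception {
        // Read the whole stream into memory (works on Java 8 and later)
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        // Assumption: the documents are UTF-8 encoded
        String raw = new String(buffer.toByteArray(), StandardCharsets.UTF_8);

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(escapeBareAmpersands(raw))));
    }

    public static void main(String[] args) throws Exception {
        String broken = "<elementType>CK037 - AT&ZN -SET</elementType>";
        Document doc = parseLenient(
                new ByteArrayInputStream(broken.getBytes(StandardCharsets.UTF_8)));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Note that this is a workaround, not a real fix: it would also rewrite a "&" inside a CDATA section, and a reference to an undeclared entity (for example &ZN;) would still fail to parse. If you can, get the producer of the files to emit well-formed XML.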
Related
Generally, when using a DOM, SAX or XPath parser, we take the input from outside the Java code, like this:
File inputFile = new File("C:\\Users\\DELL\\Desktop\\catalog.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
Can I parse XML without taking the input from a file like this? I want to write my XML alongside the Java code.
Use DocumentBuilder.parse(new InputSource(new StringReader(xml))), where xml is a string containing the XML to be parsed.
That's if you really must use DOM. I can't imagine why anyone uses it any more, when alternatives such as JDOM2 are so much better.
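For the DOM route, here is a small self-contained sketch that keeps the XML inside the Java source and parses it through a StringReader wrapped in an InputSource. The catalog content and the class name InlineXmlExample are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class InlineXmlExample {
    // Hypothetical catalog, written inline instead of being loaded from a file
    static final String CATALOG_XML =
            "<catalog>"
          + "<book id=\"bk101\"><title>First title</title></book>"
          + "<book id=\"bk102\"><title>Second title</title></book>"
          + "</catalog>";

    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // InputSource accepts a character stream, so no file or URL is needed
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(CATALOG_XML);
        System.out.println(doc.getElementsByTagName("book").getLength());
    }
}
```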
Which one should I use to parse the XML? What is the recommended approach for parsing XML fetched over HTTP? My approach is to read the XML as a String and use DocumentBuilder to parse the String.
Is this the right approach?
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
Document doc = null;
InputSource is = null;
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
is = new InputSource(new StringReader(xmlString));
doc = dBuilder.parse(is);
XML specifies its own encoding in the declaration, <?xml version="1.0" encoding="..."?>, defaulting to UTF-8.
Building a StringReader from a String assumes that the bytes have already been decoded using a guessed encoding. That is less advisable than handing the parser the raw bytes, as with a File or an InputStream.
Another factor is the document base, which is used to resolve included documents, XSDs and DTDs. Here an XML catalog can help by storing such files offline.
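To illustrate the encoding point, here is a rough sketch with a made-up ISO-8859-1 document. Parsing the raw bytes lets the parser honour the declared encoding, while decoding the bytes to a String with a wrong guess (UTF-8 here) damages the text before the parser ever sees it. The class name EncodingAwareParse and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class EncodingAwareParse {
    // Hypothetical payload as it might arrive over HTTP, encoded as ISO-8859-1
    static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>caf\u00e9</name>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // The parser reads the declaration and decodes the bytes itself
    static String textFromBytes(byte[] bytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(bytes))
                .getDocumentElement().getTextContent();
    }

    // The bytes are decoded up front with a guessed charset; the declared encoding is ignored
    static String textFromGuessedString(byte[] bytes) throws Exception {
        String guessed = new String(bytes, StandardCharsets.UTF_8);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(guessed)))
                .getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = sampleBytes();
        System.out.println("caf\u00e9".equals(textFromBytes(bytes)));
        System.out.println("caf\u00e9".equals(textFromGuessedString(bytes)));
    }
}
```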
(Disclaimer: using Rhino inside RingoJS)
Let's say I have a document with an element, and I don't see how I can append nodes given as a string to this element. To parse the string into XML nodes and then append them to the element, I tried to use a DocumentFragment, but I couldn't get anywhere. In short, I need something as easy as .NET's InnerXml, but it's not in the Java API.
var dbFactory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
var dBuilder = dbFactory.newDocumentBuilder();
var doc = dBuilder.newDocument();
var el = doc.createElement('test');
var nodesToAppend = '<foo bar="1">Hi <baz>there</baz></foo>';
el.appendChild(???);
How can I do this without using any third-party library?
[EDIT] It's not obvious in the example but I'm not supposed to know the content of variable 'nodesToAppend'. So please, don't point me to tutorials about how to create elements in an xml document.
You can do this in Java; you should be able to derive the Rhino equivalent:
// Needs java.io.ByteArrayInputStream, java.nio.charset.StandardCharsets,
// javax.xml.parsers.* and org.w3c.dom.*
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.newDocument();
Element el = doc.createElement("test");
doc.appendChild(el);
String xml = "<foo bar=\"1\">Hi <baz>there</baz></foo>";
// Parse the string into its own temporary document
Document doc2 = dBuilder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
// Copy the parsed root element into doc so that it can be attached to el
Node node = doc.importNode(doc2.getDocumentElement(), true);
el.appendChild(node);
Since doc and doc2 are two different Documents, the trick is to import the node from one document into the other, which is done with the importNode API above.
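If the string may contain several top-level nodes or loose text (closer to what InnerXml accepts), one possible variation on the importNode trick is to wrap the string in a temporary root element, parse it, and import each child. This is only a sketch: FragmentAppender, appendXml and the "wrapper" element name are made up, and it assumes the fragment has no XML declaration or DOCTYPE of its own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FragmentAppender {
    // Parses an XML fragment of unknown content and appends its nodes to target
    static void appendXml(Element target, String fragment) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Temporary root so the fragment may hold text and several sibling elements
        Document parsed = builder.parse(
                new InputSource(new StringReader("<wrapper>" + fragment + "</wrapper>")));
        Document owner = target.getOwnerDocument();
        NodeList children = parsed.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // importNode makes a deep copy owned by the target document
            target.appendChild(owner.importNode(children.item(i), true));
        }
    }

    // Builds <test> and fills it from a string, as in the question
    static Element demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element el = doc.createElement("test");
        doc.appendChild(el);
        appendXml(el, "<foo bar=\"1\">Hi <baz>there</baz></foo>tail<x/>");
        return el;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo().getChildNodes().getLength());
    }
}
```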
I think your question is similar to this one, which already has an answer:
Java: How to read and write xml files?
Or see this link: http://www.mkyong.com/java/how-to-create-xml-file-in-java-dom/
In my main activity I have this call:
InputStream stream = http_conn.getInputStream();
ParseXML.Login(stream);
I know the input stream is working because I can wrap it in a buffered reader and build a string that I can send to the UI. The issue is that this shows the entire XML document that is being returned to me.
Within my ParseXML class Login method, I have the following:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(stream);
doc.getDocumentElement().normalize();
So far so good, I think? I am new to using parsers, but basically the layout of my XML document is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<string xmlns="http://www.xxx.com/asmx">TOKEN HERE</string>
I have seen examples in which you can retrieve various items from deeper within an XML file, as per the example here: http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/
I'm new not only to XML parsers but to Java as well, and I just can't figure out how to pull that string out of the XML document!
Thanks
I'm not sure I understand, but if you only want to get TOKEN HERE, try doc.getDocumentElement().getTextContent()
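Spelled out as a complete method, that could look roughly like the sketch below. The class name TokenExtractor and the method extractToken are hypothetical, and the sample document mirrors the layout from the question. The default namespace on the root element does not matter here, because the root is fetched with getDocumentElement() rather than looked up by name.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class TokenExtractor {
    // Returns the text of the root element, e.g. the token inside <string>...</string>
    static String extractToken(InputStream stream) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(stream);
        // trim() drops any whitespace the server may put around the token
        return doc.getDocumentElement().getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String response = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<string xmlns=\"http://www.xxx.com/asmx\">TOKEN HERE</string>";
        InputStream stream = new ByteArrayInputStream(response.getBytes(StandardCharsets.UTF_8));
        System.out.println(extractToken(stream));
    }
}
```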
I have an XML document in a variable (not in a file). How can I get at the data stored in it? I don't have any additional file; the XML is 'inside' my source code. When I use
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(XML);
(XML is my xml variable), I get an error:
java.io.FileNotFoundException: C:\netbeans\app-s7013\<network ip_addr="10.0.0.0\8" save_ip="true"> File not found.
Read your XML into a StringReader, wrap it in an InputSource, and pass that to your DocumentBuilder:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xml)));
Assuming that XML is a String, don't be confused by the parse overload that takes a String: that string is a URI, not your input!
What you need is the version that takes an input stream.
You need to create an input stream based on the string (I'll try to find a code sample, but you can search for one). Usually a StringReader wrapped in an InputSource is involved.
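For the input-stream variant specifically, a minimal sketch could look like this. The network element is a simplified, made-up stand-in for the XML in the question, and StringToStream is a hypothetical class name. The string is encoded as UTF-8, which is only correct if the XML has no declaration or declares UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class StringToStream {
    static Document parseViaStream(String xml) throws Exception {
        // Turn the in-memory string into bytes and expose them as a stream
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // parse(InputStream) reads content; parse(String) would treat the text as a URI
        return db.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical XML held in a variable
        String xml = "<network ip_addr=\"10.0.0.0\" save_ip=\"true\"/>";
        Document doc = parseViaStream(xml);
        System.out.println(doc.getDocumentElement().getAttribute("save_ip"));
    }
}
```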