I'm trying to read an XML file from an Android app using XOM as the XML library. I'm trying this:
Builder parser = new Builder();
Document doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
But I'm getting nu.xom.ParsingException: Premature end of file. This happens when the file is empty.
I need to parse a very simple XML file, and I'm ready to use another library instead of XOM, so let me know if there's a better one, or just a solution to the problem using XOM.
In case it helps, I'm using Xerces to get the parser.
------Edit-----
PS: The purpose of this wasn't to parse an empty file. The file just happened to be empty on the first run, which triggered this error.
If you follow this post to the end, it seems that this has to do with Xerces and the fact that it's an empty file, and they didn't reach a solution on the Xerces side.
So I handled the issue as follows:
Document doc = null;
try {
    Builder parser = new Builder();
    doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
} catch (ParsingException ex) { // other catch blocks are required for other exceptions.
    // Fails to open the file with a parsing error.
    // I create a new root element and a new document.
    // I fill them with XML data (elsewhere in the code) and save them.
    Element root = new Element("root");
    doc = new Document(root);
}
And then I can do whatever I want with doc. You can add extra checks to make sure that the cause really is an empty file (for example, check the file size, as indicated by one of sam's comments on the question).
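The file-size check mentioned above could look roughly like this. It is only a sketch: it uses the JDK's built-in JAXP DOM parser instead of XOM, takes a plain java.io.File rather than an Android file stream, and the class and method names (EmptyFileGuard, loadOrCreate) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class EmptyFileGuard {
    /**
     * Parses the file, or returns a fresh document with an empty <root>
     * element when the file is missing or has zero length.
     */
    static Document loadOrCreate(File file) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            if (!file.exists() || file.length() == 0) {
                // Nothing to parse: start from a new document instead of catching the parse error.
                Document doc = builder.newDocument();
                doc.appendChild(doc.createElement("root"));
                return doc;
            }
            return builder.parse(file);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not load " + file, e);
        }
    }

    public static void main(String[] args) throws IOException {
        File empty = File.createTempFile("empty", ".xml"); // zero-length file
        empty.deleteOnExit();
        System.out.println(loadOrCreate(empty).getDocumentElement().getTagName());
    }
}
```

Checking the size first means a genuinely malformed (non-empty) file still surfaces as an error instead of being silently replaced.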
An empty file is not a well-formed XML document. Throwing a ParsingException is the right thing to do here.
Related
I have an issue: I get a response as a String.
This String could be a normal string, a number, etc., or the contents of an .xml file.
When I get an XML file, I want to treat it differently.
I am not able to distinguish between a plain string and an XML file.
Also, the XML file could contain syntax errors.
Please suggest how I should go ahead.
Code is like this:
Document document = reader.read(new StringReader(xml));
where xml can be a plain string or the contents of an XML file.
If xml is a plain string, that is fine, but if it is an XML file with a syntax error, then an exception should be thrown.
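A proper XML document usually begins with an XML declaration. If that's there, the string is intended to be a conforming XML document. Note that in XML 1.0 the declaration is optional, so its absence alone does not prove the string is not XML; the reliable test is to parse it and see whether the parser throws.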
If you are using a language like C#, you can use XmlDocument.LoadXml:
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.loadxml.aspx
This throws an XmlException if the string is not well-formed XML.
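Since the question's code looks like Java, here is a rough sketch of the same "try to parse it" idea with the JDK's JAXP parser. The rule "anything that does not start with '<' is a plain string" is my own assumption about what counts as a normal string, and the names XmlSniffer, Kind and classify are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XmlSniffer {
    enum Kind { PLAIN_TEXT, XML, BROKEN_XML }

    /** Classifies a response string; the leading '<' test is a heuristic, not a rule of XML. */
    static Kind classify(String text) {
        if (text == null || !text.trim().startsWith("<")) {
            return Kind.PLAIN_TEXT;
        }
        return isWellFormedXml(text) ? Kind.XML : Kind.BROKEN_XML;
    }

    /** Returns true if the parser accepts the text as a well-formed document. */
    static boolean isWellFormedXml(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, the caller can treat Kind.BROKEN_XML as the "throw an exception" case and pass Kind.PLAIN_TEXT through unchanged.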
I am working on an Android application that parses one or more XML feeds based on user preferences. Is it possible to parse (using SAX Parser) more than one XML feed at once by providing the parser with an array of URLs of my XML feeds?
If not, what would be an alternative way of listing the parsed items from different XML feeds in one list? An intuitive approach is to use java.io.SequenceInputStream to merge the two input streams. However, this throws a NullPointerException:
try {
    URL urlOne = new URL("http://example.com/feedone.xml");
    URL urlTwo = new URL("http://example.com/feedtwo.xml");
    InputStream streamOne = urlOne.openStream();
    InputStream streamTwo = urlTwo.openStream();
    InputStream streamBoth = new SequenceInputStream(streamOne, streamTwo);
    InputSource sourceBoth = new InputSource(streamBoth);
    //Parsing
    stream = xmlHandler.getStream();
}
catch (Exception error) {
    error.printStackTrace();
}
List<Item> content = stream.getList();
return content;
The tactic of appending the streams before parsing is not likely to work well, as the appended XML will not be valid XML. As each XML input has its own root element, the appended XML will have multiple roots, which is not permitted in XML. Additionally it's likely to have multiple XML headers like
<?xml version="1.0" encoding="UTF-8"?>
which is also invalid.
While it's possible to preprocess the input to work around these issues, you're likely better off parsing them separately and dealing with getting the results combined later.
It's possible to make a SAX parser add the parsed elements to an existing list of elements. If you post code in your question showing how you're parsing a single file, we might be able to help figure out how to adjust it to your need for multiple inputs.
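As a rough illustration of parsing the feeds separately and collecting into one list: the sketch below runs a fresh SAX parse per input and lets every handler append to the same list. Since the question's handler isn't shown, the handler here is invented; it just collects the text of every <title> element, and the names FeedMerger, TitleHandler and parseAll are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class FeedMerger {
    /** Appends the text of every <title> element to a caller-supplied list. */
    static final class TitleHandler extends DefaultHandler {
        private final List<String> titles;
        private StringBuilder current; // non-null only while inside <title>

        TitleHandler(List<String> titles) {
            this.titles = titles;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("title".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName) && current != null) {
                titles.add(current.toString());
                current = null;
            }
        }
    }

    /** Parses each feed on its own and returns all titles in feed order. */
    static List<String> parseAll(List<InputStream> feeds) {
        List<String> titles = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            for (InputStream feed : feeds) {
                // One parse per feed, so each input keeps its own root element and header.
                factory.newSAXParser().parse(feed, new TitleHandler(titles));
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Feed parsing failed", e);
        }
        return titles;
    }
}
```

In the question's setup, each element of feeds would be the result of url.openStream() for one feed URL.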
I have this XML file which doesn't have a root node. Other than manually adding a "fake" root element, is there any way I would be able to parse an XML file in Java? Thanks.
I suppose you could create a new implementation of InputStream that wraps the one you'll be parsing from. This implementation would return the bytes of the opening root tag before the bytes from the wrapped stream and the bytes of the closing root tag afterwards. That would be fairly simple to do.
I may be faced with this problem too. Legacy code, eh?
Ian.
Edit: You could also look at java.io.SequenceInputStream which allows you to append streams to one another. You would need to put your prefix and suffix in byte arrays and wrap them in ByteArrayInputStreams but it's all fairly straightforward.
Your XML document needs a root xml element to be considered well formed. Without this you will not be able to parse it with an xml parser.
One way is to provide your own dummy wrapper document without touching the original 'xml' (the not well-formed 'xml'), pulling it in as an external entity:
Syntax
<!DOCTYPE some_root_elem SYSTEM "/home/ego/some.dtd"
[
<!ENTITY entity-name "Some value to be inserted at the entity">
]>
Example:
<!DOCTYPE dummy [
<!ENTITY data SYSTEM "http://wherever-my-data-is">
]>
<dummy>
&data;
</dummy>
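In Java, the wrapper above can be tried out without any real URL by handing the parser an EntityResolver that supplies the rootless content. This is only a sketch under assumptions: the system identifier http://example.invalid/fragment.xml is a placeholder that is never fetched, the names EntityWrapper and parseFragment are made up, and it relies on the parser expanding external entities, which the JDK's default DocumentBuilder does as far as I know.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class EntityWrapper {
    // Wrapper document: a dummy root whose only content is the external entity.
    private static final String WRAPPER =
            "<!DOCTYPE dummy [\n"
            + "<!ENTITY data SYSTEM \"http://example.invalid/fragment.xml\">\n"
            + "]>\n"
            + "<dummy>&data;</dummy>";

    /** Parses rootless XML by including it in <dummy> through an external entity. */
    static Document parseFragment(String fragment) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Serve the entity from memory instead of letting the parser fetch the URL.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader(fragment)));
            return builder.parse(new InputSource(new StringReader(WRAPPER)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseFragment("<item>a</item><item>b</item>");
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```

Only do this with content you trust; external entities are a well-known attack surface when the input comes from outside.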
You could use another parser like Jsoup. It can parse XML without a root.
I think that even if some API had an option for this, it would only return the first node of the "XML", which looks like a root, and discard the rest.
So the answer is probably to do it yourself. Scanner or StringTokenizer might do the trick.
Maybe some html parsers might help, they are usually less strict.
Here's what I did:
There's an old java.io.SequenceInputStream class, which is so old that it takes Enumeration rather than List or such.
With it, you can prepend and append the root element tags (<div> and </div> in my case) around your no-root XML stream. (You shouldn't do it by concatenating Strings due to performance and memory reasons.)
public void tryExtractHighestHeader(ParserContext context)
{
    String xhtmlString = context.getBody();
    if (xhtmlString == null || "".equals(xhtmlString))
        return;

    // The XHTML needs to be wrapped, because it has no root element.
    ByteArrayInputStream divStart = new ByteArrayInputStream("<div>".getBytes(StandardCharsets.UTF_8));
    ByteArrayInputStream divEnd = new ByteArrayInputStream("</div>".getBytes(StandardCharsets.UTF_8));
    ByteArrayInputStream is = new ByteArrayInputStream(xhtmlString.getBytes(StandardCharsets.UTF_8));
    // IteratorEnumeration is presumably the Apache Commons Collections class.
    Enumeration<InputStream> streams = new IteratorEnumeration(Arrays.asList(new InputStream[]{divStart, is, divEnd}).iterator());

    try (SequenceInputStream wrapped = new SequenceInputStream(streams)) {
        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        Document xmlDocument = builder.parse(wrapped);

        // From here you can do whatever you like, but keep in mind the extra element.
        XPath xPath = XPathFactory.newInstance().newXPath();
    }
    catch (Exception e) {
        // Keep the original exception as the cause so the stack trace is not lost.
        throw new RuntimeException("Failed parsing XML: " + e.getMessage(), e);
    }
}
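For anyone who wants to try the same wrapping trick with only the JDK, here is a trimmed-down sketch. It swaps the third-party IteratorEnumeration for java.util.Collections.enumeration, which produces the Enumeration that SequenceInputStream expects. The class and method names (RootWrapper, parseWithRoot) are invented, and it assumes the rootless input is UTF-8 and carries no XML declaration or DOCTYPE of its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class RootWrapper {
    /** Parses a rootless XML stream by surrounding it with <rootName>...</rootName>. */
    static Document parseWithRoot(InputStream rootless, String rootName) {
        List<InputStream> parts = Arrays.asList(
                utf8("<" + rootName + ">"),
                rootless,
                utf8("</" + rootName + ">"));
        // The streams are read one after another, so nothing is copied into one big String.
        try (InputStream wrapped = new SequenceInputStream(Collections.enumeration(parts))) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Failed parsing wrapped XML", e);
        }
    }

    private static InputStream utf8(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Document doc = parseWithRoot(utf8("<p>one</p><p>two</p>"), "div");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```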
I'm working on a project in which I have to take a raw file from the server and convert it into an XML file.
Is there any tool available in Java that can help me accomplish this task, the way JAXP can be used to parse an XML document?
I guess you will need your objects for later use, so create a MyObject bean, load the values from your raw file into it, and write it to someFile.xml:
FileOutputStream os = new FileOutputStream("someFile.xml");
XMLEncoder encoder = new XMLEncoder(os);
MyObject p = new MyObject();
p.setFirstName("Mite");
encoder.writeObject(p);
encoder.close();
Or you can go with TransformerFactory if you don't need the objects for later use.
Yes. This assumes that the text in the raw file is already XML.
You start with the DocumentBuilderFactory to get a DocumentBuilder, and then you can use its parse() method to turn an input stream into a Document, which is an internal XML representation.
If the raw file contains something other than XML, you'll want to scan it somehow (your own code here) and use the stuff you find to build up from an empty Document.
I then usually use a Transformer from a TransformerFactory to convert the Document into XML text in a file, but there may be a simpler way.
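The steps described above (DocumentBuilderFactory, DocumentBuilder, parse(), then a Transformer to write the Document back out as text) could be sketched like this. The class and method names (DomRoundTrip, parse, toXml) are made up, and writing to a String instead of a file is just to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomRoundTrip {
    /** Turns XML text into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Serializes a Document to XML text with an identity Transformer. */
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(parse("<root><a>1</a></root>")));
    }
}
```

To write to a file instead, pass new StreamResult(new File("someFile.xml")) to transform().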
JAXP can also be used to create a new, empty document:
Document dom = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.newDocument();
Then you can use that Document to create elements, and append them as needed:
Element root = dom.createElement("root");
dom.appendChild(root);
But, as Jørn noted in a comment to your question, it all depends on what you want to do with this "raw" file: how should it be turned into XML. And only you know that.
I think if you try to load it in an XmlDocument this will be fine
I need your expertise once again. I have a java class that searches a directory for xml files (displays the files it finds in the eclipse console window), applies the specified xslt to these and sends the output to a directory.
What I want to do now is create an XML file containing the file names and file format types. The format should be something like:
<file>
<fileName> </fileName>
<fileType> </fileType>
</file>
<file>
<fileName> </fileName>
<fileType> </fileType>
</file>
Where for every file it finds in the directory it creates a new <file>.
Any help is truly appreciated.
Use an XML library. There are plenty around, and the third-party ones are almost all easier to use than the built-in DOM API in Java. Last time I used it, JDOM was pretty good. (I haven't had to do much XML recently.)
Something like:
Element rootElement = new Element("root"); // You didn't show what this should be
Document document = new Document(rootElement);
for (Whatever file : files)
{
    Element fileElement = new Element("file");
    fileElement.addContent(new Element("fileName").addContent(file.getName()));
    fileElement.addContent(new Element("fileType").addContent(file.getType()));
    rootElement.addContent(fileElement); // attach each <file> to the root
}
// outputString is an instance method, so an XMLOutputter is needed.
String xml = new XMLOutputter().outputString(document);
Have a look at DOM and ECS. The following example was adapted to your requirements from here:
XMLDocument document = new XMLDocument();
for (File f : files) {
    // getType() stands in for however you determine the type; java.io.File has no such method.
    document.addElement(new XML("file")
        .addXMLAttribute("fileName", f.getName())
        .addXMLAttribute("fileType", f.getType())
    );
}
You can use the StringBuilder approach suggested by Vinze, but one caveat is that you will need to make sure your filenames contain no special XML characters, and escape them if they do (for example replace < with &lt; and & with &amp;, and deal with quotes appropriately).
In this case it probably doesn't arise and you will get away without it, however if you ever port this code to reuse in another case, you may be bitten by this. So you might want to look at an XMLWriter class which will do all the escaping work for you.
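One writer that does the escaping for you and ships with the JDK is the StAX XMLStreamWriter. Below is a small sketch of the file list built with it. The <files> root element is my own addition (the question's snippet has none), the map from file name to file type is a stand-in for however you collect that information, and the names FileListWriter and toXml are hypothetical.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class FileListWriter {
    /** Writes one <file> element per map entry (file name -> file type). */
    static String toXml(Map<String, String> typeByName) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("files"); // assumed root element
            for (Map.Entry<String, String> entry : typeByName.entrySet()) {
                writer.writeStartElement("file");
                writer.writeStartElement("fileName");
                writer.writeCharacters(entry.getKey()); // special characters are escaped here
                writer.writeEndElement();
                writer.writeStartElement("fileType");
                writer.writeCharacters(entry.getValue());
                writer.writeEndElement();
                writer.writeEndElement(); // </file>
            }
            writer.writeEndElement(); // </files>
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write file list", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("report.xml", "xml");
        files.put("a<b&c.xml", "xml");
        System.out.println(toXml(files));
    }
}
```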
Well, just use a StringBuilder:
StringBuilder builder = new StringBuilder();
for (File f : files) {
    builder.append("<file>\n\t<fileName>").append(f.getName()).append("</fileName>\n");
    [...]
}
System.out.println(builder.toString());