I'm working on a project under which i have to take a raw file from the server and convert it into XML file.
Is there any tool available in java which can help me to accomplish this task like JAXP can be used to parse the XML document ?
I guess you will need your objects for later use ,so create MyObject that will be some bean that you will load the values form your Raw File and you can write this to someFile.xml
FileOutputStream os = new FileOutputStream("someFile.xml");
XMLEncoder encoder = new XMLEncoder(os);
MyObject p = new MyObject();
p.setFirstName("Mite");
encoder.writeObject(p);
encoder.close();
Or you con go with TransformerFactory if you don't need the objects for latter use.
Yes. This assumes that the text in the raw file is already XML.
You start with the DocumentBuilderFactory to get a DocumentBuilder, and then you can use its parse() method to turn an input stream into a Document, which is an internal XML representation.
If the raw file contains something other than XML, you'll want to scan it somehow (your own code here) and use the stuff you find to build up from an empty Document.
I then usually use a Transformer from a TransformerFactory to convert the Document into XML text in a file, but there may be a simpler way.
JAXP can also be used to create a new, empty document:
Document dom = DocumentBuilderFactory.newInstance()
.newDocumentBuilder()
.newDocument();
Then you can use that Document to create elements, and append them as needed:
Element root = dom.createElement("root");
dom.appendChild(root);
But, as Jørn noted in a comment to your question, it all depends on what you want to do with this "raw" file: how should it be turned into XML. And only you know that.
I think if you try to load it in an XmlDocument this will be fine
Related
This code is in Java and uses Saxon
I am implementing a transform function to transform xml and several secondary xml sources
All of the inputs are not files, so I cannot use document() or other methods that define files directly
String transform(String xml, List<String> secondaryXmls, String xslt);
It outputs the transformed xml result
I am successful in applying the transformation from xslt to the single xml file, but I have difficulties in applying transformation that also utilize the secondaryXmls. I have done my research and still could not find the right method to apply these
here is a snapshot of the code
TransformerFactory tFactory = TransformerFactory.newInstance("net.sf.saxon.TransformerFactoryImpl",null);
Document transformerDoc = loadXMLFromString(xslt);
Source transformerSource = new DOMSource(transformerDoc);
Transformer transformer = tFactory.newTransformer(transformerSource);
Document sourceDoc = loadXMLFromString(xml);
Source source = new DOMSource(sourceDoc);
DOMResult result = new DOMResult();
transformer.transform(source, result);
Document resultDoc = (Document) result.getNode();
return getStringFrom(resultDoc);
Thanks!
EDIT:
Which is the better way:
concatenating all the xmls, transform, return only the original part filtering the concatenated secondary xmls
Write a code that adds
<xsl:variable name="asd" select="document('asd')">
on top of the xslt string
First thing - get rid of all that DOM stuff! Using the DOM with Saxon slows it down by a factor of ten. Let Saxon build the trees in its own format, by using a StreamSource or SAXSource, and a StreamResult. Or you can build a tree in Saxon format yourself, if you want, using the s9api DocumentBuilder class.
Then as to the answer to your question: here are three possible solutions:
(a) supply the documents as a stylesheet parameter of type document-node()* (that is, a sequence of document nodes). In the Java, convert your list of XML strings to a list of document nodes by calling Configuration.buildDocument() on each one.
(b) write a URIResolver whose effect is to interpret the URI doc/3 as meaning the third document in the list; then use document('doc/3') to fetch that document.
(c) write a CollectionURIResolver which makes the whole collection of documents available using the collection() function.
I'm trying to read an xml file on from an android app using XOM as the XML library. I'm trying this:
Builder parser = new Builder();
Document doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
But I'm getting nu.xom.ParsingException: Premature end of file. even when the file is empty.
I need to parse a very simple XML file, and I'm ready to use another library instead of XOM so let me know if there's a better one. or just a solution to the problem using XOM.
In case it helps, I'm using xerces to get the parser.
------Edit-----
PS: The purpose of this wasn't to parse an empty file, the file just happened to be empty on the first run which showed this error.
If you follow this post to the end, it seems that this has to do with xerces and the fact that its an empty file, and they didn't reach a solution on xerces side.
So I handled the issue as follows:
Document doc = null;
try {
Builder parser = new Builder();
doc = parser.build(context.openFileInput(XML_FILE_LOCATION));
}catch (ParsingException ex) { //other catch blocks are required for other exceptions.
//fails to open the file with a parsing error.
//I create a new root element and a new document.
//I fill them with xml data (else where in the code) and save them.
Element root = new Element("root");
doc = new Document(root);
}
And then I can do whatever I want with doc. and you can add extra checks to make sure that the cause is really an empty file (like check the file size as indicated by one of sam's comments on the question).
An empty file is not a well-formed XML document. Throwing a ParsingException is the right thing to do here.
I've got an xml file looked like this. employees.xml
<Employees>
<Employee>
<FirstName>myFirstName</FirstName>
<LastName>myLastName</LastName>
<Salary>10000</Salary>
<Employee>
</Employees>
Now how do I add new Employee elements to the existing XML file?.. An example code is highly appreciated.
You can't 'write nodes to an existing XML file.' You can read an existing XML file into memory, add to the data model, and then write a new file. You can rename the old file and write the new file under the old name. But there is no commonly-used Java utility that will modify an XML file in place.
To add to an existing XML file, you generally need to read it in to an internal data structure, add the needed data in the internal form and then write it all out again, overwriting the original file.
The internal structure can be DOM or one of your own making, and there are multiple ways of both reading it in and writing it out.
If the data is reasonably small, DOM is probably easiest, and there is some sample code in the answers to this related question.
If your data is large, DOM will not do. Possible approaches are to use SAX to read and write (though SAX is traditionally only a reading mechanism) as described in an answer to another related question.
You might also want to consider JAXB or (maybe even best) StAX.
Please use xstream to parse your file as an object, or create a list with employees and then you can directly convert that to xml.
I think this link can be useful for you.
Here you have samples how to read / parse, modify (add elements) and save (write to xml file again).
The following samples you can find at: http://www.petefreitag.com/item/445.cfm
Read:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse("/path/to/file.xml");
Modify:
// attributes
Node earth = doc.getFirstChild();
NamedNodeMap earthAttributes = earth.getAttributes();
Attr galaxy = doc.createAttribute("galaxy");
galaxy.setValue("milky way");
earthAttributes.setNamedItem(galaxy);
// nodes
Node canada = doc.createElement("country");
canada.setTextContent("ca");
earth.appendChild(canada);
Write XML file:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);
You need to use DOM to write/edit your xml file.
It's very easy:
You just need to create nodes and add attributes to it.
You can also write/edit XSLT files by using DOM.
just search google for DOM java
I am having a XML document objecy created on the fly. I need to validate it against Schema. I am using xerces 2. I have set features for the parser.Now i need to parse to validate the XML.
For this i need to call "parser.parse()". But parse() method takes "InputSource" as parameter. But i have Document object. How do i convert this Document object to "InputSource" for passing it to parse() method.
Can anybody help.
Best Regards,
ByteArrayOutputStream docOutputStream = new ByteArrayOutputStream();
((XmlDocument)domDocument).write(docOutputStream);
ByteArrayInputStream docInputStream = new
ByteArrayInputStream(docOutputStream.toByteArray());
InputSource inputSource = new InputSource(docInputStream);
parser.parse(inputSource);
See this question to convert the Document to an InputStream: how to create an InputStream from a Document or Node
Then use InputSource(java.io.InputStream byteStream) to wrap that with an InputSource.
You should be able to to this:
Create a javax.xml.validation.Schema instance based on your schema resources.
Create a javax.xml.validation.Validator from the schema instance
validate your DOM document using the validator and a javax.xml.transfrom.dom.DOMSource
I have this XML file which doesn't have a root node. Other than manually adding a "fake" root element, is there any way I would be able to parse an XML file in Java? Thanks.
I suppose you could create a new implementation of InputStream that wraps the one you'll be parsing from. This implementation would return the bytes of the opening root tag before the bytes from the wrapped stream and the bytes of the closing root tag afterwards. That would be fairly simple to do.
I may be faced with this problem too. Legacy code, eh?
Ian.
Edit: You could also look at java.io.SequenceInputStream which allows you to append streams to one another. You would need to put your prefix and suffix in byte arrays and wrap them in ByteArrayInputStreams but it's all fairly straightforward.
Your XML document needs a root xml element to be considered well formed. Without this you will not be able to parse it with an xml parser.
One way is to provide your own dummy wrapper without touching the original 'xml' (the not well formed 'xml') Need the word for that:
Syntax
<!DOCTYPE some_root_elem SYSTEM "/home/ego/some.dtd"
[
<!ENTITY entity-name "Some value to be inserted at the entity">
]
Example:
<!DOCTYPE dummy [
<!ENTITY data SYSTEM "http://wherever-my-data-is">
]>
<dummy>
&data;
</dummy>
You could use another parser like Jsoup. It can parse XML without a root.
I think even if any API would have an option for this, it will only return you the first node of the "XML" which will look like a root and discard the rest.
So the answer is probably to do it yourself. Scanner or StringTokenizer might do the trick.
Maybe some html parsers might help, they are usually less strict.
Here's what I did:
There's an old java.io.SequenceInputStream class, which is so old that it takes Enumeration rather than List or such.
With it, you can prepend and append the root element tags (<div> and </div> in my case) around your no-root XML stream. (You shouldn't do it by concatenating Strings due to performance and memory reasons.)
public void tryExtractHighestHeader(ParserContext context)
{
String xhtmlString = context.getBody();
if (xhtmlString == null || "".equals(xhtmlString))
return;
// The XHTML needs to be wrapped, because it has no root element.
ByteArrayInputStream divStart = new ByteArrayInputStream("<div>".getBytes(StandardCharsets.UTF_8));
ByteArrayInputStream divEnd = new ByteArrayInputStream("</div>".getBytes(StandardCharsets.UTF_8));
ByteArrayInputStream is = new ByteArrayInputStream(xhtmlString.getBytes(StandardCharsets.UTF_8));
Enumeration<InputStream> streams = new IteratorEnumeration(Arrays.asList(new InputStream[]{divStart, is, divEnd}).iterator());
try (SequenceInputStream wrapped = new SequenceInputStream(streams);) {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(wrapped);
From here you can do whatever you like, but keep in mind the extra element.
XPath xPath = XPathFactory.newInstance().newXPath();
}
catch (Exception e) {
throw new RuntimeException("Failed parsing XML: " + e.getMessage());
}
}