I have to generate huge and quite complex XML files in Java, with the data fetched from an Oracle database. What I really don't know is a proper and reliable way to do this. I could of course create a String and concatenate all the tags, attributes and data, but that doesn't feel right. I guess this is quite a common task and there are many established ways to do it in Java. What is the best way to do this? What is your suggestion?
Thank you for any clues...
You could use JAXB for building XML out of structured objects that are a result of querying your data store.
If your object hierarchy is not complex, you can use Oracle's capability to generate results in XML.
There are several options for object-to-XML transformation:
JAXB
SAX parser
DOM parser
I would personally suggest JAXB for ease of use and a SAX parser for performance-critical applications.
You can use JAXP (Java API for XML Processing) to create an XML structure. It has all the features you want.
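For very large output, one JAXP option is the StAX XMLStreamWriter, which writes each element as it is produced instead of building the whole document in memory. The following is only a minimal sketch: the Customer class and the element names are hypothetical stand-ins for whatever the Oracle query returns.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StreamingXmlExport {

    /** Hypothetical row type standing in for one record fetched from the database. */
    static final class Customer {
        final int id;
        final String name;

        Customer(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Writes each row as soon as it is available, so memory use does not grow with the row count. */
    static void write(Iterable<Customer> rows, Writer out) {
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("customers");
            for (Customer c : rows) {
                xml.writeStartElement("customer");
                xml.writeAttribute("id", Integer.toString(c.id));
                xml.writeStartElement("name");
                xml.writeCharacters(c.name); // the writer escapes &, < and > in text
                xml.writeEndElement(); // </name>
                xml.writeEndElement(); // </customer>
            }
            xml.writeEndElement(); // </customers>
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
    }

    /** Convenience wrapper for small inputs; real exports would pass a file or socket Writer. */
    static String toXml(List<Customer> rows) {
        StringWriter out = new StringWriter();
        write(rows, out);
        return out.toString();
    }
}
```

In a real export the Iterable would typically wrap a JDBC ResultSet, so rows are streamed from the database straight into the writer.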
Background
I have a situation where I can get data either in the form of an XML file or as Excel/CSV files. If the data comes in a non-XML format, it will be divided into several different files/tables, representing different subsections of the XML. The end goal is to validate the data and generate a valid XML file using an existing schema, regardless of the format of the input data.
When receiving an XML file, the idea is to unmarshal and validate it. For simple errors, automatic fixes will be applied, and in the end a new XML file will be marshalled from the JAXB classes.
Question
In order to generalize as much of the solution as possible, my idea was to generate a JAXB representation of the non-XML data too, and then generate the final XML file from those classes. I have been trying to find a good tutorial or introduction to converting non-XML data to a JAXB representation, but I haven't been able to find anything useful, which makes me wonder: is this a really bad approach? Any better suggestions for how to solve this problem? In the majority of cases the files are likely to be non-XML, so I am willing to throw out the current approach if anyone has a better solution that uses some other technology.
I've worked with univocity-parsers before. They work well and are simple to use for converting CSV to Java objects, which you can then serialize using JAXB as well.
Suppose one needs to display only the elements that contain certain data, and then sort them based on that data.
Which would be the better choice of XML parser, DOM or SAX?
Also, can either of these sort XML data without having to store the data first?
Sorting will require you to read in all of the XML document to memory. So working with a DOM will probably be easier. There are good libraries available that make working with a DOM easier:
dom4j
JDOM
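As a rough illustration of the DOM approach, the sketch below uses only the JDK's built-in DOM parser rather than dom4j or JDOM. The book element and its title and price attributes are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomSortDemo {

    /** Returns the titles of all <book> elements priced at or above minPrice, cheapest first. */
    static List<String> titlesSortedByPrice(String xml, double minPrice) {
        Document doc;
        try {
            // The whole document is loaded into memory here, which is what makes sorting easy.
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        // Filter: keep only the elements whose data matches.
        NodeList nodes = doc.getElementsByTagName("book");
        List<Element> books = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element book = (Element) nodes.item(i);
            if (price(book) >= minPrice) {
                books.add(book);
            }
        }

        // Sort the surviving elements by the same data.
        books.sort(Comparator.comparingDouble(DomSortDemo::price));

        List<String> titles = new ArrayList<>();
        for (Element book : books) {
            titles.add(book.getAttribute("title"));
        }
        return titles;
    }

    private static double price(Element book) {
        return Double.parseDouble(book.getAttribute("price"));
    }
}
```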
It would be better to use StAX (Streaming API for XML), because it is a universal solution for both tiny and large files, but if your XML files aren't big you could use DOM, because it will be easier. With DOM you can also run XPath queries, which could be helpful for you.
Woodstox
Aalto XML processor
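A minimal StAX sketch of the filter-then-sort idea, using the StAX implementation bundled with the JDK (Woodstox or Aalto would be used through the same API). The item element and type attribute are assumptions for illustration. Note that only the matching values are kept in memory, but they still have to be collected before they can be sorted.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxFilterDemo {

    /** Streams through the document and returns the text of every <item type="wantedType">, sorted. */
    static List<String> sortedItemsOfType(String xml, String wantedType) {
        List<String> matches = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())
                        && wantedType.equals(reader.getAttributeValue(null, "type"))) {
                    // getElementText() consumes the text up to the matching end tag.
                    matches.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("could not read XML", e);
        }
        Collections.sort(matches); // sorting needs all matches, but not the whole document
        return matches;
    }
}
```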
I don't know how to read data from such an XML file. Let's say I want to read every GUID and userID. How do I do it?
Here is part of XML: http://pastebin.com/7B25eyFz
If your XML file is tree-based (deeply nested), use DOM; if it is not nested, use SAX, which is faster than DOM.
You may use XStream.
Look into SAX Parser. Also, do a search for your terms - there are a ton of questions about this topic.
Have you read the trail about XML of the Java tutorial?
You should use an XML library like XOM. You can then use it to query the XML document using XPath. XOM offers a tutorial.
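The same XPath idea can be sketched with the javax.xml.xpath API that ships with the JDK, without XOM. The document layout in the usage line is an assumption, since the linked XML is not reproduced here; the expressions //GUID and //userID simply select every element with that name at any depth.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathReadDemo {

    /** Evaluates an XPath expression and returns the text content of every node it selects. */
    static List<String> select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                values.add(hits.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad expression or unreadable XML", e);
        }
    }
}
```

For example, XPathReadDemo.select(xml, "//GUID") would collect every GUID value, and a second call with "//userID" would collect the user IDs.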
Adding to @user651407's point: if you just want to read the XML, go for SAX. It parses the XML serially, so it is faster. If you want to do more complex operations such as adding, updating or deleting a node, go for DOM, but DOM has limitations:
1. It requires more memory, as the entire XML document is loaded at once.
2. It is slower to process, as it is a tree-based parser.
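For the read-only case, a SAX handler might look roughly like the sketch below. It collects the text of every element with a given name, such as GUID or userID. The element names come from the question, but the surrounding document structure is assumed, and nested elements with the same name are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SaxReadDemo {

    /** Returns the text of every element called elementName, in document order. */
    static List<String> textOf(String xml, String elementName) {
        List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a wanted element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals(elementName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length); // may be called several times per element
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (text != null && qName.equals(elementName)) {
                    values.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return values;
    }
}
```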
I am working on converting an Excel spreadsheet into an XML document that needs to be validated against a schema. I am currently building the XML document using the DOM API, and validating at the end using SAX and a custom error handler. However, I would really like to be able to validate the XML produced from each cell as I parse the Excel document, so I can indicate which cells are problematic in a friendlier way.
The problem I am currently encountering is that after validating the XML for the simple types, once they are built into a complex type, all the child nodes get validated again, producing redundant errors.
I found this question here at SO but it is using C# and the Microsoft API.
Thoughts? Thanks!
Sorry, but I don't see the problem. You are producing the XML, so what's the point in validating the XML while you produce it?
Are you looking to validate the cell contents? If yes, then write validation logic into your code. This validation logic may replicate the schema, but I suspect that it will actually be much more detailed than the schema.
Are you looking to validate your program's output? If yes, then write unit tests.
You could try having your parsing code fire SAX events instead of directly constructing a DOM. Then you could just register a validating SAX ContentHandler to listen to it and have that build your DOM for you. That should detect validation errors as they're encountered.
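One way to sketch this with the JDK alone is javax.xml.validation.ValidatorHandler, a SAX ContentHandler that validates the events it receives. In the example below the schema, the row and qty element names, and the idea of tracking the current cell index are all assumptions for illustration. Building the DOM is left out; it could be added by chaining a downstream handler with setContentHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

final class CellValidationDemo {

    // Hypothetical schema: a <row> holding one or more integer <qty> cells.
    private static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='row'><xs:complexType><xs:sequence>"
          + "<xs:element name='qty' type='xs:int' maxOccurs='unbounded'/>"
          + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    /** Fires SAX events for each cell and returns the indexes of the cells the schema rejected. */
    static List<Integer> invalidCells(String... cells) {
        List<Integer> bad = new ArrayList<>();
        int[] current = {-1}; // index of the cell whose events are being fired
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            ValidatorHandler validator = schema.newValidatorHandler();
            validator.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // One bad value can raise several errors; record the cell only once.
                    if (!bad.contains(current[0])) {
                        bad.add(current[0]);
                    }
                }
            });

            Attributes none = new AttributesImpl();
            validator.startDocument();
            validator.startElement("", "row", "row", none);
            for (int i = 0; i < cells.length; i++) {
                current[0] = i;
                char[] text = cells[i].toCharArray();
                validator.startElement("", "qty", "qty", none);
                validator.characters(text, 0, text.length);
                validator.endElement("", "qty", "qty"); // simple-type errors surface here
            }
            validator.endElement("", "row", "row");
            validator.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException("validation could not run", e);
        }
        return bad;
    }
}
```

Because errors are reported while the events for a cell are being fired, each one can be tied back to the spreadsheet cell that caused it.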
So the solution that I decided to go with, and have almost finished implementing, was to use XSOM to parse the XSD. Then, when parsing the Excel file, I looked up the column name in the parsed XSD to pull out the restrictions (since the column headers map to simple types in the XSD) and then did manual validation against the restrictions. I am still building the tree so that at the end I can validate the entire XML tree against the XSD, since there are some things that I can't catch at the cell level.
Thanks for all of your input.
Try building schemas at multiple levels of granularity. Test the simple (Cells) ones against the most granular, and the complex ones (Rows?) against a less granular schema that doesn't decompose the complex types.
I have a requirement where I need to generate HTML forms on the fly based on many different XML schemas (as of now I have 20 of them and the count keeps increasing). I need to collect data from the user to create instance documents corresponding to each of them, and then store the instance documents in a database.
challenges
1) The schemas have a lot of unbounded complex types, so we don't know in advance the number and type of inputs to be created. Pre-creating the HTML is therefore not an option.
2) Even if I can handle generating the form on the fly, the problem is collecting the data entered, since dynamically generated forms will have dynamic ids/names for their inputs.
Can anyone suggest the best way to implement this?
thank you in advance
It seems to me like a clear case for XSLT.
Generating HTML from XML is one of the primary uses of XSLT.
As for the id/names, you can create an XSLT which will also generate a set of id/names in a way that you can use.
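A very reduced sketch of the XSLT idea, run through the JDK's javax.xml.transform API. The stylesheet is a hypothetical illustration, not a complete XSD-to-form translator: it only turns each xs:element that has a type attribute into a labelled input whose name is the element name, and ignores nesting, unbounded occurrences and everything else in XSD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsdToFormDemo {

    // Hypothetical stylesheet: one <input> per typed xs:element, named after the element.
    private static final String XSLT =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:xs='http://www.w3.org/2001/XMLSchema'"
          + " exclude-result-prefixes='xs'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'>"
          + "<form><xsl:apply-templates select='//xs:element[@type]'/></form>"
          + "</xsl:template>"
          + "<xsl:template match='xs:element'>"
          + "<label><xsl:value-of select='@name'/><input name='{@name}'/></label>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Applies the stylesheet to a schema document and returns the generated form markup. */
    static String toForm(String xsd) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xsd)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("transformation failed", e);
        }
    }
}
```

Deriving the input names from the schema in a predictable way, as here, is what lets the server map submitted values back to the elements of the instance document.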
Use WSDL2XForms to create XForms from XML Schemas (XSD). Then publish them with Chiba (chiba.sourceforge.net) - it converts these XForms to standard HTML forms on the server side.
The Google Code project xsd-forms seems to be a promising approach.
An XQuery-based translator from XSD to XForms is available at http://en.wikibooks.org/wiki/XRX/XForms_Generator.
I don't know much about that one: http://nunojob.wordpress.com/2008/01/05/creating-a-user-interface-for-xml-schema-using-xforms/. Seems to be a presentation only.
We had a problem somewhat like this. One of our team thought that we ought to be able to create a web form UI on the fly to accept data conforming to an XSD. It turned out that this is very difficult ... given all the complexity of full XSD. So we ended up inventing our own schema language (which was both simpler and richer than XSD) and using this as the basis for generating our UI layouts. We also implemented a tool-chain for creating and validating the schemas and for generating equivalent XSDs and OWL schemas.