My Java application takes XML input, and parses it using the simpleframework. I want it to accept JSON as well, therefore I want to convert the JSON to XML.
Tags and attributes are important, therefore I use the Badgerfish convention.
This works well in Python with xmljson, but I can't find a decent Java package to do this. GSON doesn't seem to have a Badgerfish implementation. This topic doesn't list any packages that retain tags/attributes, and it is a bit old as well.
Which Java packages can do the conversion from JSON to XML while putting tags/attributes at the right place?
Suggestions for alternatives to Badgerfish are welcome as well.
Many thanks in advance!
You can use the XPath 3.1 json-to-xml() function, and then do an XSLT transformation on the generated XML to get it into the format you require.
Background
I have a situation where I can get data either in the form of an XML-file or Excel/CSV-files. In case the data comes in a non-XML format it will be divided into several different files/tables, representing different subsections of the XML. The end goal is to validate the data and generate a valid XML-file using an existing schema, regardless of the format of the indata.
When receiving an XML-file the idea is to unmarshal and validate it. For simple errors, automatic fixes will be applied, and in the end a new XML-file will be marshalled from the JAXB classes.
Question
In order to generalize as much of the solution as possible, my idea was to generate a JAXB representation of the non-XML data too, and then generate the final XML-file from those classes. I have been trying to find a good tutorial or introduction to converting non-XML data to a JAXB representation, but I haven't been able to find anything useful, which makes me wonder: is this a really bad approach? Any better suggestions for how to solve this problem? In the majority of cases the files are likely to be non-XML, so I am willing to throw out the current approach if anyone has a better solution that uses some other technology.
I've worked with univocity parsers before. They work well and are simple to use for converting CSV to Java objects, which you can then serialize using JAXB as well.
I have a requirement to transform an incoming JSON to an output JSON. For this I am looking for a solution that can work based on templates. What I have in my mind is a solution on lines of XSLT transformation that allows converting an XML to a desired output format (XML, HTML, Text) defined by the style sheet.
One option (or rather a workaround) for using XSLT is to convert the JSON to XML, that is:
input JSON -> XML -> transform -> output JSON
This approach has the performance overhead of converting between JSON and XML, which becomes more prominent as the size of the incoming object increases.
I found a Node/client layer solution that transforms JSON based on the rules specified in a template. More details can be found [here][1]. However, I was not able to find any solution that works for a Java-based application.
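For reference, the middle "transform" step of that pipeline could look roughly like this with the XSLT 1.0 processor that ships with the JDK. This is only a sketch: the stylesheet, the element names and the `XsltStep` class are made up for illustration, and the JSON-to-XML and XML-to-JSON conversions on either side are left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltStep {
    // Hypothetical stylesheet: maps <person><name> to <user><fullName>.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/person\">"
      + "<user><fullName><xsl:value-of select=\"name\"/></fullName></user>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML string and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The input here stands in for XML that was produced from the incoming JSON.
        System.out.println(transform("<person><name>Ada</name></person>", STYLESHEET));
    }
}
```

Both the input and the output are materialized as full XML documents here, which is where the overhead mentioned above comes from.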
Any thoughts/help in terms of solution/frameworks to resolve this would be really helpful.
Thanks.
You could try JOLT, advertised as a JSON to JSON transformation library written in Java.
Or you can search this thread for other libraries and tools which can transform JSON.
The new XSLT 3.0 draft also includes support for JSON as an input and output format. Saxon has already started an implementation and seems to support the JSON part.
You could try JSLT, which is a transform language where you write the fixed part of the output in JSON syntax, then insert expressions to compute the values you want to insert in the template. It's quite similar to how XSLT and XPath work together.
It's implemented in Java on top of Jackson.
For simple transformations you can use the jmom library.
For a complex transformation you can use a template framework like FreeMarker.
Convert the JSON data to Map/List form using a JSON library first, so that the template framework can work with it.
I have an HCSP file used by Stellent (an Oracle product), and the business need is to convert the HCSP file to JSON format in Java. Is there a standard way of doing this that I might not be aware of? Please give me a pointer on how to approach such a conversion.
There might be a few core Java classes that could assist in this. Some of these are used when IsJson=1 is added in the URL.
Do you need a specific JSON layout? What does your HTML look like?
I want to convert a request XML to a JSON string. Which framework is better to use: Jettison, Jackson, json-org, ...? And how can I do this?
Any idea?
thanks
Afsaneh
If the goal is to go straight from XML to JSON, without any structure transformations, data changes, or inspections, then XStream or even the JSON in Java reference implementation at org.json can get the job done rather simply. In similar fashion, XSLT options are available, including XSLTJSON and xml2json-xslt.
If complicated interrogation and/or manipulation of the data and resulting JSON is in order, then Jackson combined with the jackson-xml-databind extension provides a feature-rich option that also has excellent performance. (Performance comparisons of some JSON and XML serialization APIs are available at https://github.com/eishay/jvm-serializers/wiki.)
I would like to be able to parse XML that isn't necessarily well-formed. I'd be looking for a fuzzy rather than a strict parser, able to recover from badly nested tags, for example. I could write my own but it's worth asking here first.
Update:
What I'm trying to do is extract links and other info from HTML. In the case of well-formed XML I can use the Scala XML API. In the case of ill-formed XML, it would be nice to somehow convert it into correct XML and deal with it the same way; otherwise I'd have to have two completely different sets of functions for dealing with documents.
Obviously because the input is not well-formed and I'm trying to create a well-formed tree, there would have to be some heuristic involved (such as when you see <parent><child></parent> you would close the <child> first and when you then see a <child> you ignore it). But of course this isn't a proper grammar and so there's no correct way of doing it.
What you're looking for would not be an XML parser. XML is very strict about nesting, closing, etc. One of the other answers suggests Tag Soup. This is a good suggestion, though technically it is much closer to a lexer than a parser. If all you want from XML-ish content is an event stream without any validation, then it's almost trivial to roll your own solution. Just loop through the input, consuming content which matches regular expressions along the way (this is exactly what Tag Soup does).
The problem is that a lexer is not going to be able to give you many of the features you want from a parser (e.g. production of a tree-based representation of the input). You have to implement that logic yourself because there is no way that such a "lenient" parser would be able to determine how to handle cases like the following:
<parent>
<child>
</parent>
</child>
Think about it: what sort of tree would you expect to get out of this? There's really no sane answer to that question, which is precisely why a parser isn't going to be of much help.
Now, that's not to say that you couldn't use Tag Soup (or your own hand-written lexer) to produce some sort of tree structure based on this input, but the implementation would be very fragile. With tree-oriented formats like XML, you really have no choice but to be strict, otherwise it becomes nearly impossible to get a reasonable result (this is part of why browsers have such a hard time with compatibility).
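To make that concrete, here is a rough sketch of a hand-written regex lexer plus the heuristic from the question (when a parent closes, close its open children first, then ignore the stray closing tag). It is not how Tag Soup works internally. The `LenientNesting` class and the token pattern are invented for illustration, and comments, CDATA and self-closing tags are not handled at all.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LenientNesting {
    // One token is either a tag (group 1 = "/" for closing tags, group 2 = tag name)
    // or a run of text. Anything else, such as a lone '<', is silently skipped.
    private static final Pattern TOKEN =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|[^<]+");

    static String repair(String input) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>();   // stack of currently open element names
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String name = m.group(2);
            if (name == null) {
                out.append(m.group());             // plain text: copy through
            } else if (m.group(1).isEmpty()) {
                open.push(name);                   // opening tag: remember it and copy through
                out.append(m.group());
            } else if (open.contains(name)) {
                // Closing tag for an open element: close everything nested inside it first.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            }
            // Otherwise it is a closing tag with no matching open element: drop it.
        }
        // Close whatever is still open at the end of the input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: <parent><child></child></parent>
        System.out.println(repair("<parent><child></parent></child>"));
    }
}
```

The output is well-formed, but whether it is what the author of the document meant is anyone's guess, which is exactly the fragility described above.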
Try the parser on the XHtml object. It is much more lenient than the one on XML.
Take a look at htmlcleaner. I have used it successfully to convert "HTML from the wild" to valid XML.
Try Tag Soup.
JTidy does something similar but only for HTML.
I mostly agree with Daniel Spiewak's answer. This is just another way to create "your own parser".
While I don't know of any Scala-specific solution, you can try using Woodstox, a Java library that implements the StAX API. (Since it is an event-based API, I am assuming it will be more fault tolerant than a DOM parser.)
There is also a Scala wrapper around Woodstox called Frostbridge, developed by the same guy who made the Simple Build Tool for Scala.
I had mixed opinions about Frostbridge when I tried it, but perhaps it is more suitable for your purposes.
I agree with the answers that turning invalid XML into "correct" XML is impossible.
Why don't you just do a regular text search for the hrefs if that's all you're interested in? One issue would be commented out links, but if the XML is invalid, it might not be possible to tell what is intended to be commented out!
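Something along these lines would do as a starting point. It is a minimal sketch that only handles quoted attribute values; the `HrefScan` class name is made up, and as noted above it will also pick up links inside comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefScan {
    // Matches href="..." or href='...', ignoring case. Unquoted values are not handled.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Returns every quoted href value found in the text, in document order.
    static List<String> hrefs(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(text);
        while (m.find()) {
            found.add(m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        // The markup is deliberately badly nested; the scan does not care.
        System.out.println(hrefs("<p><a href=\"a.html\">x<b><A HREF='b.html'>y</p>"));
    }
}
```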
Caucho has a JAXP compliant XML parser that is a little bit more tolerant than what you would usually expect. (Including support for dealing with escaped character entity references, AFAIK.)
Find JavaDoc for the parsers here
A related topic (with my solution) is listed below:
Scala and html parsing