What is the difference between them? I have read that JAXP is only an API specification and that JDOM and DOM4J implement it; is that right? And do all of them need an XML parser, such as Xerces?
Thanks in advance!
JAXP (JSR-206)
Is a set of standard APIs for Java XML parsers. It covers the following areas:
DOM (org.w3c.dom package)
SAX (org.xml.sax package)
StAX/JSR-173 (javax.xml.stream)
XSLT (javax.xml.transform)
XPath (javax.xml.xpath)
Validation (javax.xml.validation)
Datatypes (javax.xml.datatype)
This standard was created by an expert group with representatives from many companies and individuals. Because it is a standard, there can be multiple implementations (Xerces implements JAXP), and it can be included in the JDK.
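To make the DOM part of that list concrete, here is a minimal sketch of parsing through the JAXP factory classes only. The class name JaxpDomExample, the titles method and the sample XML are made up for illustration; the javax.xml.parsers and org.w3c.dom calls are the standard ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; only the JAXP/DOM calls are standard API.
class JaxpDomExample {

    // Parses the XML text with whichever DOM implementation JAXP locates
    // and returns the text content of every <title> element.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = doc.getElementsByTagName("title");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title></book>"
                   + "<book><title>B</title></book></books>";
        System.out.println(titles(xml));
    }
}
```

Nothing in this code names Xerces; the factory decides at runtime which parser does the work.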
Xerces
Is an open source Java XML parser that provides DOM and SAX implementations that are compliant with the JAXP standard.
JDOM and DOM4J
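Are open source Java libraries for working with XML. Each defines its own document model and API rather than implementing JAXP, and each relies on an underlying parser to read the XML.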
You're comparing apples and automobiles.
JAXP is an API that is now bundled with the JDK
JDOM is a different API, but also a library
DOM4J is also a different API and library
XERCES is an XML parser implemented in Java. A version of XERCES is also bundled in the JDK.
Which API you use is largely a question of personal preference. I like JDOM in part because I'm used to working with it. There are, similarly, several implementations of XML parsers. If you're programming in Java using a recent JDK, you will be able to use JAXP without having to add external libraries.
Related
I have the following question: Is there any example of how to process xsl transformation?
I know that I can use the Transformer class (XSL API) but I want to know something more about the background work so I can adopt it for other languages/systems.
I already read some xsl tutorials but there are only descriptions about how xsl is structured and the meaning of the tags but I found no overview about the steps to process.
It might help you to know that the native API for XSLT transformations is called JAXP. If you google for "JAXP Transformation API examples" you will find many sites that give examples and tutorials - it's hard for me to advise which you should go to first, because everyone's learning style is different.
You need to be aware that the built-in XSLT engine in the JDK (a version of Xalan) is very old, and only supports the XSLT 1.0 language which came out in 1999. Unless you have strong reasons to use XSLT 1.0, you should really be using something more up to date (XSLT 2.0 came out in 2007, XSLT 3.0 in 2017). These versions are supported by third-party products such as Saxon. Saxon supports the JAXP API, but if you want to take advantage of all the features of XSLT 2.0/3.0 then you're better off using Saxon's native API, known as s9api (pronounced "snappy"). You can find examples of that in the documentation and resource downloads at www.saxonica.com.
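As a starting point, a JAXP transformation can look roughly like the sketch below. It uses only the standard javax.xml.transform classes and an XSLT 1.0 stylesheet, so it should run on the engine built into the JDK. The class name JaxpXsltExample, the inline stylesheet and the greetings input format are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; the stylesheet and input format are made up.
class JaxpXsltExample {

    // An XSLT 1.0 stylesheet that emits each <greeting> followed by ';' as plain text.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='greetings/greeting'>"
      + "<xsl:value-of select='.'/><xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiles the stylesheet, applies it to the XML text, and returns the result.
    static String transform(String xml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(
            "<greetings><greeting>hello</greeting><greeting>world</greeting></greetings>"));
    }
}
```

The three steps (compile the stylesheet, build a source and a result, call transform) are the same whichever JAXP-compatible engine sits behind TransformerFactory.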
Is it possible to switch the native parser, which I believe is based on Java reflection? We have some performance issues and are wondering whether we can switch the implementation.
Your advice is highly appreciated.
Additional information: This is inherited code and we need to fix performance issues in our web-services. I am looking for performance boost without code changes. The existing code uses JAXB for marshalling and unmarshalling java objects which are generated via CXF (wsdl to java).
My goal is to switch the implementation to StAX and then use the Woodstox library.
If your JAXB implementation uses a StAX parser under the covers via the standard JAXP APIs, then adding the Woodstox jar to your classpath should cause your JAXB impl to use Woodstox. You should see a performance improvement by doing this.
Since the Woodstox jar contains the following entries, adding it to the classpath will allow the JAXP APIs to return an instance of it:
META-INF/services/javax.xml.stream.XMLInputFactory
META-INF/services/javax.xml.stream.XMLOutputFactory
Note: I lead EclipseLink JAXB (MOXy), and MOXy uses a StAX parser when one is available. The other JAXB implementations (Metro, JaxMe) probably do the same thing.
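One way to check which StAX implementation actually gets picked up is to ask the factories for their class names, as in this small sketch. StaxProviderCheck is a made-up name. With only the JDK on the classpath you would expect a JDK-internal class; with Woodstox present I would expect a class from its own packages instead, but that is an assumption you should verify in your environment.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

// Hypothetical diagnostic class: reports which StAX factories JAXP resolved.
class StaxProviderCheck {

    // Class name of the XMLInputFactory implementation found at runtime.
    static String inputFactoryClass() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Class name of the XMLOutputFactory implementation found at runtime.
    static String outputFactoryClass() {
        return XMLOutputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("input factory:  " + inputFactoryClass());
        System.out.println("output factory: " + outputFactoryClass());
    }
}
```

Running this once before and once after adding the jar shows whether the switch took effect without touching the application code.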
The Java XML ecosystem seems awash in current implementations, API definitions and libraries, all with cryptic names. (Web searches frequently turn up references to old/out-of-date implementations as well.)
To list just some of the terms out there (by no means exhaustive): Crimson, Xerces, Xalan, JDOM, Saxon, XOM, JAXP
Are there any good references out there for getting an overview of what libraries and frameworks are currently available and how they compare?
Particular questions it would be helpful for a reference to address:
What things are part of a standard current java JDK or SDK download?
What are the dependencies amongst the libraries/frameworks?
What is current, and what supersedes what?
There is an article "Choose Your Java XML Parser" that compares Xerces (Apache), XDK (Oracle) and JAXP (from Sun, part of the JDK).
Another article is "XML and Java technologies" from IBM. It compares JDOM, dom4j, EXML, XPP, Crimson and Xerces.
Does Java have a built in XML library for generating and parsing documents? If not, which third party one should I be using?
The Sun Java Runtime comes with the Xerces and Xalan implementations that provide the ability to parse XML (via the DOM and SAX interfaces), and also perform XSL transformations and execute XPath queries.
However, it is better to use the JAXP API to work on XML, since JAXP allows you to not worry about the underlying implementation used (Xerces or Crimson or any other). When you use JAXP, at runtime the JRE will use the service provider it can locate to perform the needed operations. As indicated previously, Xerces/Xalan will be used since it is shipped with the Sun JRE (not with others, though), so you don't have to download and install a specific provider (say, a different version of Xerces, or Crimson).
A basic JAXP tutorial can be found in The J2EE 1.4 tutorial (it's from the J2EE tutorial, but it will help).
Do note that the Xerces/Xalan implementations provided by the Sun JRE, will not be found in the org.apache.xerces.* or org.apache.xalan.* packages. Instead, they will be present in the internal com.sun.org.apache.xerces.* and com.sun.org.apache.xalan.* packages.
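You can see this for yourself by printing the classes that the JAXP factories hand back. This is only a sketch; JaxpProviders is an invented name, and the exact class names printed depend on the JRE and on what is on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic class: shows which providers JAXP located.
class JaxpProviders {

    // Implementation class behind the DOM factory.
    static String domFactory() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Implementation class behind the SAX factory.
    static String saxFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Implementation class behind the XSLT factory.
    static String xsltFactory() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM:  " + domFactory());
        System.out.println("SAX:  " + saxFactory());
        System.out.println("XSLT: " + xsltFactory());
    }
}
```

Application code should still depend only on the javax.xml.* types, never on the internal package names this prints.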
By the way, JDOM is not an XML parser - it will use the parser provided to it by JAXP in order to provide you with an easier abstraction to work with.
Yes. It has two options, both available through the javax.xml.parsers package: DOM builds documents in memory, and SAX is an event-based approach.
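For the event-based side, a rough SAX sketch might look like this: the parser calls back into a handler for each element and no document tree is kept in memory. SaxCountExample and countElements are made-up names for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; only the SAX/JAXP calls are standard API.
class SaxCountExample {

    // Counts how many elements with the given name occur in the XML text.
    static int countElements(String xml, String name) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] count = {0};

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Called once per opening tag as the parser streams through the input.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };

        parser.parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/><c/></a>", "b"));
    }
}
```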
You may also want to look at JDOM, which is a 3rd party library that offers a combination of the two, and can be easier to use.
Yes. Java contains the javax.xml library. You can check out some samples at Sun's Java API for XML Code Samples.
However, I personally like using JDOM library.
javax.xml package contains Java's native XML solution which is actually a special version of Xerces. You can do what you asked with it, however using 3rd party libraries such as JDOM makes the whole process a lot easier.
Have a look at JAXB. This is increasingly the "standard" way to do XML processing. It uses Java annotations to simplify the programming model. The reference gives sample code for reading and writing XML.
Java does come with a large set of packages and classes to handle XML. These are part of the Standard Edition JDK, and located under the javax.xml package.
Aside from reading XML and writing it with DOM or SAX, these packages also perform XSL transformations, JAX-B object marshalling and unmarshalling, XPath processing and web services SOAP handling. I advise you to read more about these online in Sun's excellent tutorials.
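As one small illustration of the XPath support, here is a sketch that evaluates an expression directly against XML text using javax.xml.xpath. The class name XPathExample and the order/item sample document are assumptions made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical example class; only the javax.xml.xpath calls are standard API.
class XPathExample {

    // Evaluates an XPath 1.0 expression against the XML text
    // and returns the result converted to a string.
    static String evaluate(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item price='3'/><item price='4'/></order>";
        System.out.println(evaluate(xml, "count(/order/item)"));
        System.out.println(evaluate(xml, "sum(/order/item/@price)"));
    }
}
```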
I can't tell you which one to use (few requirements specified, and there are a dozen libraries), but I would seriously consider XOM (here). Written by Elliotte Rusty Harold, it is quite complete in terms of the XML spec, and generally excellent. I have found it very easy to use. See the link above for Harold's motivation and criticism of other solutions.
You could have a look at the javax.xml package, which contains everything you need to work with XML documents in Java...
The Java API for XML Processing (JAXP) is part of the Java SE standard library. JAXP allows you to code against a standard interface and lets you pick the parser implementation later if needed.
The Java API for XML Processing, or JAXP for short, enables applications to parse and transform XML documents using an API that is independent of a particular XML processor implementation. JAXP also provides a pluggability feature which enables applications to easily switch between particular XML processor implementations.
You can use StAX (streaming API for XML)
http://en.wikipedia.org/wiki/StAX
http://www.xml.com/pub/a/2003/09/17/stax.html
https://sjsxp.dev.java.net/
StAX is optimized to process large XML files without causing OOM (out of memory) problems :)
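A minimal sketch of what that looks like: the reader is pulled forward one event at a time, so only the current event is held in memory. StaxReadExample and countElements are invented names. For a real large file you would pass a FileReader or create the reader from an InputStream instead of the StringReader used in main.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; only the javax.xml.stream calls are standard API.
class StaxReadExample {

    // Streams through the input and counts elements with the given local name.
    static int countElements(Reader source, String name) throws Exception {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(source);
        int count = 0;
        try {
            while (reader.hasNext()) {
                // next() advances to the following event and returns its type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(name)) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<log><entry/><entry/><entry/></log>";
        System.out.println(countElements(new StringReader(xml), "entry"));
    }
}
```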
As is said above... Java's SDK now comes with Xerces and Xalan. Xalan only implements version 1.0 of the XSLT language, so if you want 2.0, you should look at Saxon from Michael Kay.
I've been using JDOM for general XML parsing for a long time, but get the feeling that there must be something better, or at least more lightweight, for Java 5 or 6.
There's nothing wrong with the JDOM API, but I don't like having to include Xerces with my deployments. Is there a more lightweight alternative, if all I want to do is read in an XML file or write one out?
The best lightweight alternative is, in my opinion, XOM, but JDOM is still a very good API, and I see no reason to replace it.
It doesn't have a dependency on Xerces, though (at least, it doesn't need the Apache Xerces distro, it works alongside the Xerces that's packaged into the JRE).
I've used the javax.xml.stream package (XMLStreamReader/XMLStreamWriter) to read and write XML using xml pull/push techniques. It's worked for me so far.
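For the writing half, a sketch along these lines shows the push-style XMLStreamWriter. StaxWriteExample and the people/person element names are made up; the exact serialization (for instance how empty elements or quotes are written) can vary between StAX implementations.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical example class; only the javax.xml.stream calls are standard API.
class StaxWriteExample {

    // Writes one <person> element per name inside a <people> root element.
    static String write(String[] names) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer =
            XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        writer.writeStartElement("people");
        for (String name : names) {
            writer.writeStartElement("person");
            writer.writeCharacters(name); // special characters such as & are escaped
            writer.writeEndElement();
        }
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write(new String[] {"Ann", "B&C"}));
    }
}
```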
We use JAXB - it generates the classes based on the schema. You can generate your files without a schema, and just annotate how you want the xml to be.
There was recently a fork of JDOM for java 5 called coffeeDOM. You should check it out.
You should check out Commons Digester (see the answer I've given here). It provides a very lightweight mechanism for parsing XML.
JDOM is very good and simple. Many new ways to parse XML have appeared since JDOM was released, but they focus on things other than simplicity. JAXB makes things simple in some cases, namely when you have a well-known XML document format and your schema does not change on a daily basis.
The new push parsers are very good and even mandatory for very large XML files (hundreds of MBs).
The speed benefit of a SAX parser can be tenfold.
Use one of the XML APIs that come standard with Java, so that you don't have to include any third-party libraries.
XML in the Java Platform Standard Edition (Java SE) 6
I would think JAXP is a good choice for you.
It's standard, included in the JDK, provides a clear interface, and lets you plug in any implementation.
If all you need is to read and write XML files that are not very large or overly complicated, the JAXP DOM API embedded in the JDK will cover your requirements.
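Reading with JAXP DOM goes through DocumentBuilder.parse; for the writing side, one common approach is to build the tree in memory and serialize it with an identity Transformer, roughly as sketched here. DomWriteExample and the config/entry element names are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; only the JAXP/DOM calls are standard API.
class DomWriteExample {

    // Builds <config><entry key="...">value</entry></config> and returns it as text.
    static String build(String key, String value) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .newDocument();
        Element root = doc.createElement("config");
        doc.appendChild(root);

        Element entry = doc.createElement("entry");
        entry.setAttribute("key", key);
        entry.setTextContent(value);
        root.appendChild(entry);

        // A Transformer created without a stylesheet copies the source to the result.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("timeout", "30"));
    }
}
```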