I'm learning XSLT and I find that Xalan is really helpful. I know that Xalan can be used through commandline commands, like:
java -classpath .;%XALAN_JAR% org.apache.xalan.xslt.Process -IN input.xml -XSL transform.xsl -OUT output.xml
However, how can I call this from Java code? Something like:
process(input.xml, transform.xsl, result.xml)
Thanks!
Java supports a transformation API sometimes referred to as JAXP. There's a tutorial on it here:
http://docs.oracle.com/javase/tutorial/jaxp/index.html
JAXP has also been implemented by other Java-based XSLT engines, though the only two in really common use are now Xalan and Saxon.
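As a minimal sketch of what the `process(input.xml, transform.xsl, result.xml)` call from the question could look like with the JAXP transformation API: the class name `XsltProcess` and its `process` methods are hypothetical names chosen for illustration, while the `javax.xml.transform` calls are the standard ones. Checked exceptions are wrapped in an unchecked one only to keep the example short.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltProcess {
    // Core step: compile the stylesheet, then run it over the input.
    static void process(Source input, Source stylesheet, Result output) {
        try {
            // newInstance() picks whichever JAXP processor is available
            // (the JDK's built-in one unless another is on the classpath).
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer(stylesheet);
            transformer.transform(input, output);
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    // File-based variant, mirroring -IN / -XSL / -OUT on the command line.
    static void process(File input, File stylesheet, File output) {
        process(new StreamSource(input), new StreamSource(stylesheet),
                new StreamResult(output));
    }

    // In-memory variant, convenient for small documents and experiments.
    static String process(String xml, String xslt) {
        StringWriter out = new StringWriter();
        process(new StreamSource(new StringReader(xml)),
                new StreamSource(new StringReader(xslt)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("usage: XsltProcess input.xml transform.xsl output.xml");
            return;
        }
        process(new File(args[0]), new File(args[1]), new File(args[2]));
    }
}
```

Because the code only refers to `javax.xml.transform`, the same sketch should run unchanged with any JAXP-compliant processor.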
If you're new to XSLT, you need to be aware that the language has come a long way since XSLT 1.0, which is what Xalan implements. XSLT 2.0 provides many useful enhancements such as user-written functions, date and time handling, regular expressions, multiple output files, and grouping. To use those features you'll need to move from Xalan to Saxon. The open-source version of Saxon (Saxon-HE 9.7) can be found via http://saxon.sf.net/.
You can check this, which has sample code showing how to do it.
I have the following question: is there an example of how an XSL transformation is actually processed?
I know that I can use the Transformer class (XSL API), but I want to know more about the background work so I can adapt it for other languages/systems.
I have already read some XSL tutorials, but they only describe how XSL is structured and what the tags mean; I found no overview of the processing steps.
It might help you to know that the native API for XSLT transformations is called JAXP. If you google for "JAXP Transformation API examples" you will find many sites that give examples and tutorials - it's hard for me to advise which you should go to first, because everyone's learning style is different.
You need to be aware that the built-in XSLT engine in the JDK (a version of Xalan) is very old, and only supports the XSLT 1.0 language which came out in 1999. Unless you have strong reasons to use XSLT 1.0, you should really be using something more up to date (XSLT 2.0 came out in 2007, XSLT 3.0 in 2017). These versions are supported by third-party products such as Saxon. Saxon supports the JAXP API, but if you want to take advantage of all the features of XSLT 2.0/3.0 then you're better off using Saxon's native API, known as s9api (pronounced "snappy"). You can find examples of that in the documentation and resource downloads at www.saxonica.com.
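If you are unsure which XSLT version the processor behind JAXP implements, one way to check is to ask it through the standard XSLT `system-property()` function. The sketch below is only an illustration: `XsltVersionProbe` and `systemProperty` are hypothetical names, and the stylesheet is built by string concatenation purely for brevity. On a stock JDK the reported version would be expected to be 1.0, in line with the description above; with another processor on the classpath the answer may differ.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltVersionProbe {
    // Runs a one-template stylesheet that prints system-property(name).
    // Standard property names are xsl:version, xsl:vendor and xsl:vendor-url.
    static String systemProperty(String name) {
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select=\"system-property('" + name + "')\"/>"
          + "</xsl:template></xsl:stylesheet>";
        StringWriter out = new StringWriter();
        try {
            TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)))
                .transform(new StreamSource(new StringReader("<probe/>")),
                           new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Probe stylesheet failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("XSLT version: " + systemProperty("xsl:version"));
        System.out.println("Vendor:       " + systemProperty("xsl:vendor"));
    }
}
```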
What is the difference between them? It is said that JAXP is only an API specification and that JDOM and DOM4J implement it; is that right? And do all of them need an XML parser, such as Xerces?
Thanks in advance!
JAXP (JSR-206)
Is a set of standard APIs for Java XML parsers. It covers the following areas:
DOM (org.w3c.dom package)
SAX (org.xml.sax package)
StAX/JSR-173 (javax.xml.stream)
XSLT (javax.xml.transform)
XPath (javax.xml.xpath)
Validation (javax.xml.validation)
Datatypes (javax.xml.datatype)
This standard was created by an expert group with representatives from many companies and individuals. As a standard this means there are multiple implementations (Xerces implements JAXP), and it can be included in the JDK.
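To make two of the areas listed above concrete, here is a small sketch that parses a document with the JAXP DOM API and queries it with the JAXP XPath API, using nothing outside the JDK. The class name `JaxpXPathDemo`, the `evaluate` helper and the sample XML are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class JaxpXPathDemo {
    // Parses the XML into a DOM tree and returns the string value
    // of the given XPath expression.
    static String evaluate(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (ParserConfigurationException | SAXException
                 | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><book id='1'>XSLT</book><book id='2'>XPath</book></library>";
        System.out.println(evaluate(xml, "/library/book[@id='2']"));
        System.out.println(evaluate(xml, "count(/library/book)"));
    }
}
```

Note that the code never names Xerces or any other implementation; it only uses the factory classes from `javax.xml`.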
Xerces
Is an open source Java XML parser that provides DOM and SAX implementations that are compliant with the JAXP standard.
JDOM and DOM4J
Are open source Java XML parsers.
You're comparing apples and automobiles.
JAXP is an API that is now bundled with the JDK
JDOM is a different API, but also a library
DOM4J is also a different API and library
Xerces is an XML parser implemented in Java. A version of Xerces is also bundled in the JDK.
Which API you use is largely a question of personal preference. I like JDOM in part because I'm used to working with it. There are, similarly, several implementations of XML parsers. If you're programming in Java using a recent JDK, you will be able to use JAXP without having to add external libraries.
Which XSLT processor should I use for Java transformations? There are Saxon, Xalan and TrAX. What are the criteria for choosing? I need simple transformations without complicated logic, and a solution that is fast and easy to implement. The app runs under Tomcat on Java 1.5.
I have some JAXP libraries and could not work out which version of JAXP is used.
Thanks in advance.
Best regards.
The JDK comes bundled with an internal version of Xalan, and you get an instance of it by using the standard API (e.g. TransformerFactory.newInstance()).
Unless Xalan doesn't work for you (which is highly unlikely), there's no need to look elsewhere.
By the way, TrAX is the old name for the javax.xml.transform API, from the days when it was an optional extension to the JDK.
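For a server-side application such as one running under Tomcat, a common pattern with this API is to compile the stylesheet once into a `Templates` object and create a fresh `Transformer` per request, since `Templates` is specified to be thread-safe while `Transformer` is not. The following is a sketch under that assumption; `CachedStylesheet` and `apply` are hypothetical names, not part of JAXP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CachedStylesheet {
    // Compiled stylesheet; safe to share between threads.
    private final Templates templates;

    CachedStylesheet(String xslt) {
        try {
            templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Invalid stylesheet", e);
        }
    }

    // Each call gets its own Transformer, which is cheap to create
    // from Templates but must not be shared across threads.
    String apply(String xml) {
        StringWriter out = new StringWriter();
        try {
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
        return out.toString();
    }
}
```

Usage would be to keep one `CachedStylesheet` per stylesheet (for example in a servlet field) and call `apply` for every incoming document.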
In general, it's a hard question to answer because it depends heavily on what you mean by "simple transformation" and "fast", as well as on the amount of XML you want to process. There are probably other considerations as well. To illustrate, "fast" could mean "fast to write" or "fast to execute". If you process files the size of available memory, you might make a different choice (maybe STX, as described in another SO question) than if you parse small files.
Does Java have a built in XML library for generating and parsing documents? If not, which third party one should I be using?
The Sun Java Runtime comes with the Xerces and Xalan implementations, which provide the ability to parse XML (via the DOM and SAX interfaces), perform XSL transformations and execute XPath queries.
However, it is better to use the JAXP API to work on XML, since JAXP allows you to not worry about the underlying implementation used (Xerces or Crimson or any other). When you use JAXP, at runtime the JRE will use the service provider it can locate to perform the needed operations. As indicated previously, Xerces/Xalan will be used since it is shipped with the Sun JRE (not others though), so you don't have to download and install a specific provider (say, a different version of Xerces, or Crimson).
A basic JAXP tutorial can be found in The J2EE 1.4 tutorial (it's from the J2EE tutorial, but it will help).
Do note that the Xerces/Xalan implementations provided by the Sun JRE, will not be found in the org.apache.xerces.* or org.apache.xalan.* packages. Instead, they will be present in the internal com.sun.org.apache.xerces.* and com.sun.org.apache.xalan.* packages.
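A quick way to see which provider JAXP actually located at runtime is to print the class names of the factory instances it returns. This is only a diagnostic sketch (`JaxpProviders` is a made-up class name); on a stock Sun/Oracle/OpenJDK runtime the names would be expected to fall under the internal `com.sun.org.apache` packages mentioned above, but that depends on the runtime and classpath in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class JaxpProviders {
    // Each method asks JAXP for a factory and reports the concrete class.
    static String domFactory() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    static String saxFactory() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    static String transformerFactory() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("DOM:  " + domFactory());
        System.out.println("SAX:  " + saxFactory());
        System.out.println("XSLT: " + transformerFactory());
    }
}
```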
By the way, JDOM is not an XML parser - it will use the parser provided to it by JAXP in order to provide you with an easier abstraction to work with.
Yes. It has two options in the javax.xml packages: DOM, which builds documents in memory, and SAX, which is an event-based approach.
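To illustrate the difference between the two styles, the sketch below counts the elements of a document twice: once by building a DOM tree and asking it, and once by reacting to SAX events as the parser reports them. The class and method names (`DomVsSax`, `countWithDom`, `countWithSax`) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class DomVsSax {
    // DOM: the whole document is loaded into memory, then queried.
    static int countWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("*").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX: the parser calls back for each element; nothing is retained.
    static int countWithSax(String xml) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<library><book/><book/></library>";
        System.out.println(countWithDom(xml) + " " + countWithSax(xml));
    }
}
```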
You may also want to look at JDOM, which is a 3rd party library that offers a combination of the two, and can be easier to use.
Yes. Java contains the javax.xml library. You can check out some samples at Sun's Java API for XML Code Samples.
However, I personally like using the JDOM library.
javax.xml package contains Java's native XML solution which is actually a special version of Xerces. You can do what you asked with it, however using 3rd party libraries such as JDOM makes the whole process a lot easier.
Have a look at JAXB. This is increasingly the "standard" way to do XML processing. It uses Java annotations to simplify the programming model. The reference gives sample code for reading and writing XML.
Java does come with a large set of packages and classes to handle XML. These are part of the Standard Edition JDK, and located under the javax.xml package.
Aside from reading XML and writing it with DOM or SAX, these packages also perform XSL transformations, JAXB object marshalling and unmarshalling, XPath processing and web services SOAP handling. I advise you to read more about these online in Sun's excellent tutorials.
I can't tell you which one to use (few requirements specified, and there
are a dozen libraries), but I would seriously consider XOM (here).
Written by Elliotte Rusty Harold, it is quite complete in terms of the XML
spec, and generally excellent. I have found it very easy to use. See the
link above for Harold's motivation and criticism of other solutions.
You could have a look at the javax.xml package, which contains everything you need to work with XML documents in Java...
Java API for XML Processing (JAXP) is part of the Java SE standard library. JAXP allows you to code against a standard interface and lets you pick the parser implementation later if needed.
The Java API for XML Processing, or JAXP for short, enables applications to parse and transform XML documents using an API that is independent of a particular XML processor implementation. JAXP also provides a pluggability feature which enables applications to easily switch between particular XML processor implementations.
You can use StAX (streaming API for XML)
http://en.wikipedia.org/wiki/StAX
http://www.xml.com/pub/a/2003/09/17/stax.html
https://sjsxp.dev.java.net/
StAX is optimized to process large XML files without causing out-of-memory (OOM) problems :)
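As a rough sketch of the pull style StAX offers, the code below walks a document event by event and collects the text of every element with a given name, without ever holding the whole document in memory. `StaxTexts` and `readTexts` are hypothetical names; for a large file you would pass a `FileReader` or use the `InputStream` overload of `createXMLStreamReader` instead of a `StringReader`.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTexts {
    // Pulls events one at a time; only the current event is in memory.
    static List<String> readTexts(Reader xml, String elementName) {
        List<String> texts = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    // Reads the text content and moves to the end tag.
                    texts.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parse failed", e);
        }
        return texts;
    }

    public static void main(String[] args) {
        String xml = "<library><book>XSLT</book><book>XPath</book></library>";
        System.out.println(readTexts(new StringReader(xml), "book"));
    }
}
```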
As is said above... the JDK now comes with Xerces and Xalan. Xalan only implements version 1.0 of the XSLT language, so if you want 2.0, you should look at Saxon from Michael Kay.
I've been using JDOM for general XML parsing for a long time, but get the feeling that there must be something better, or at least more lightweight, for Java 5 or 6.
There's nothing wrong with the JDOM API, but I don't like having to include Xerces with my deployments. Is there a more lightweight alternative, if all I want to do is read in an XML file or write one out?
The best lightweight alternative is, in my opinion, XOM, but JDOM is still a very good API, and I see no reason to replace it.
It doesn't have a dependency on Xerces, though (at least, it doesn't need the Apache Xerces distro, it works alongside the Xerces that's packaged into the JRE).
I've used the javax.xml.stream package (XMLStreamReader/XMLStreamWriter) to read and write XML using xml pull/push techniques. It's worked for me so far.
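For the writing side, a minimal sketch with `XMLStreamWriter` might look like the following. `StaxWriterDemo`, `writeLibrary` and the `library`/`book` element names are invented for the example; only the `javax.xml.stream` calls come from the JDK.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxWriterDemo {
    // Pushes start/end element events to the writer; special characters
    // in text content are escaped by writeCharacters.
    static String writeLibrary(String... titles) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartElement("library");
            for (String title : titles) {
                writer.writeStartElement("book");
                writer.writeCharacters(title);
                writer.writeEndElement();
            }
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeLibrary("XSLT", "Tom & Jerry"));
    }
}
```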
We use JAXB - it generates the classes based on the schema. You can also generate your files without a schema, and just annotate how you want the XML to be.
There was recently a fork of JDOM for Java 5 called coffeeDOM. You should check it out.
You should check out Commons Digester (see the answer I've given here). It provides a very lightweight mechanism for parsing XML.
JDOM is very good and simple. There have been many new ways to parse XML since the release of JDOM, but those have a different focus than simplicity. JAXB makes things simple in some cases, when you have a well-known XML document and your schema does not get updated on a daily basis.
New push parsers are very good and even mandatory for very large XML files (hundreds of MBs).
The speed benefit of a SAX parser can be tenfold.
Use one of the XML APIs that come standard with Java, so that you don't have to include any third-party libraries.
XML in the Java Platform Standard Edition (Java SE) 6
I would like to think JAXP is a good choice for you.
It's standard, it's included in the JDK, it provides a clear interface and it allows you to hook up any implementation.
If all you need is to read and write XML files that are not very large or overly complicated, the JAXP DOM API embedded in the JDK will cover your requirements.
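As a sketch of that read-and-write use case with only the JDK: the document is read into a DOM tree with `DocumentBuilder`, and written back out with an identity `Transformer` (one created without a stylesheet). `DomRoundTrip`, `read` and `write` are hypothetical helper names, and the XML declaration is omitted only to keep the output short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomRoundTrip {
    // Reads XML text into an in-memory DOM tree.
    static Document read(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not read XML", e);
        }
    }

    // Serializes a DOM tree; newTransformer() with no stylesheet
    // copies the source to the result unchanged.
    static String write(Document doc) {
        StringWriter out = new StringWriter();
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            identity.transform(new DOMSource(doc), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Document doc = read("<config><entry>1</entry></config>");
        doc.getDocumentElement().setAttribute("version", "2");
        System.out.println(write(doc));
    }
}
```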