I have followed Obtaining DOCTYPE details using SAX (JDK 7), implementing it like this:
public class MyXmlReader {
public static void parse(InputSource inputSource) {
try {
XMLReader xmlReader = XMLReaderFactory.createXMLReader();
MyContentHandler handler = new MyContentHandler();
xmlReader.setContentHandler(handler);
xmlReader.setProperty("http://xml.org/sax/properties/lexical-handler", handler); // Does not work; handler is set, but startDTD/endDTD is not called
xmlReader.setDTDHandler(handler);
xmlReader.setErrorHandler(new MyErrorHandler());
xmlReader.setFeature("http://xml.org/sax/features/validation", false);
xmlReader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
xmlReader.parse(inputSource);
}
catch (SAXException e) {
throw new MyImportException("Error while parsing file", e);
}
}
}
MyContentHandler extends DefaultHandler2, but neither startDTD nor endDTD is called (but e.g. startEntity is in fact called, so the lexical handler is set).
I have tried to leave the features out, but this changes nothing.
What goes wrong here?
I am using Java 8 JDK 1.8.0_144.
The XML looks like this:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE MyMessage SYSTEM "http://www.testsite.org/mymessage/5.1/reference/international.dtd">
<MyMessage>
<Header>
...
According to the XMLReader API documentation, you need to set a DTDHandler; otherwise DTD events are silently ignored. DefaultHandler2 already implements the DTDHandler interface, so you can register the same handler with xmlReader.setDTDHandler(handler);
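For reference, here is a minimal sketch of how the two handler interfaces divide the work, assuming the JDK's bundled SAX parser: startDTD/endDTD are LexicalHandler callbacks, while notation and unparsed-entity declarations arrive through DTDHandler. The class names DoctypeSketch and DoctypeHandler and the inline document are made up for illustration; the document uses an internal DTD subset so nothing is fetched over the network.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

class DoctypeSketch {

    /** Records what the parser reports about the DOCTYPE declaration. */
    static class DoctypeHandler extends DefaultHandler2 {
        String name;
        String systemId;
        int notations;

        @Override // LexicalHandler callback
        public void startDTD(String name, String publicId, String systemId) {
            this.name = name;
            this.systemId = systemId;
        }

        @Override // DTDHandler callback
        public void notationDecl(String name, String publicId, String systemId) {
            notations++;
        }
    }

    static DoctypeHandler parse(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            DoctypeHandler handler = new DoctypeHandler();
            reader.setContentHandler(handler);
            reader.setDTDHandler(handler);
            // The lexical handler is registered as a property, not through a setter.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE MyMessage [\n"
                + "  <!ELEMENT MyMessage (#PCDATA)>\n"
                + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
                + "]>\n"
                + "<MyMessage>hi</MyMessage>";
        DoctypeHandler h = parse(xml);
        System.out.println("DOCTYPE name: " + h.name + ", notations seen: " + h.notations);
    }
}
```

If startDTD is still not invoked in a setup like the one in the question, it may be worth checking which XMLReader implementation XMLReaderFactory actually returns on that classpath.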
I want to generate an XML file in the following format using Java:
each attribute should be on a separate line.
<parameters>
<parameter
name="Tom"
city="York"
number="123"
/>
</parameters>
But I can only get all attributes in one line
<parameters>
<parameter name="Tom" city="York" number="123"/>
</parameters>
I'm using dom4j. Could anyone tell me how I can do this? Does dom4j support this kind of format?
Thanks.
You cannot do it with the XMLWriter unless you want to substantially rewrite its main logic. However, since XMLWriter is also a SAX ContentHandler, it can consume SAX events and serialize them to XML, and in this mode of operation XMLWriter uses a different code path which is easier to customize. The following subclass will give you almost what you want, except that empty elements will not use the short form <element/>. Maybe that can be fixed by further tweaking.
static class ModifiedXmlWriter extends XMLWriter {
    // indentLevel is private, need reflection to read it
    Field il;

    public ModifiedXmlWriter(OutputStream out, OutputFormat format) throws UnsupportedEncodingException {
        super(out, format);
        try {
            il = XMLWriter.class.getDeclaredField("indentLevel");
            il.setAccessible(true);
        } catch (NoSuchFieldException e) {
            throw new RuntimeException(e);
        }
    }

    int getIndentLevel() {
        try {
            return il.getInt(this);
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    protected void writeAttributes(Attributes attributes) throws IOException {
        int l = getIndentLevel();
        setIndentLevel(l+1);
        super.writeAttributes(attributes);
        setIndentLevel(l);
    }

    @Override
    protected void writeAttribute(Attributes attributes, int index) throws IOException {
        writePrintln();
        indent();
        super.writeAttribute(attributes, index);
    }
}
public static void main(String[] args) throws Exception {
String XML = "<parameters>\n" +
" <parameter name=\"Tom\" city=\"York\" number=\"123\"/>\n" +
"</parameters>";
Document doc = DocumentHelper.parseText(XML);
XMLWriter writer = new ModifiedXmlWriter(System.out, OutputFormat.createPrettyPrint());
SAXWriter sw = new SAXWriter(writer);
sw.write(doc);
}
Sample output:
<?xml version="1.0" encoding="UTF-8"?>
<parameters>
<parameter
name="Tom"
city="York"
number="123"></parameter>
</parameters>
Generally speaking, very few XML serializers give you this level of control over the output format.
You can get something close to this with the Saxon serializer if you specify the options method=xml, indent=yes, saxon:line-length=20. The Saxon serializer is capable of taking a DOM4J tree as input. You will need Saxon-PE or -EE because it requires a serialization parameter in the Saxon namespace. It still won't be exactly what you want because the first attribute will be on the same line as the element name and the others will be vertically aligned underneath the first.
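If no serializer offers the layout, one fallback is to write the markup by hand. The following is a rough sketch using only the JDK's DOM parser, not dom4j or Saxon; the class name AttributePerLineSketch and its helper methods are invented for this example. It handles elements, attributes and simple text content only, so comments, processing instructions, namespaces and mixed content would need extra work. Note that DOM does not preserve the original attribute order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AttributePerLineSketch {

    static String format(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            write(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not format XML", e);
        }
    }

    static void write(Element e, int depth, StringBuilder out) {
        String pad = indent(depth);
        out.append(pad).append('<').append(e.getTagName());

        // One attribute per line, indented one level deeper than the element.
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            out.append('\n').append(indent(depth + 1))
               .append(a.getNodeName()).append("=\"").append(escape(a.getNodeValue())).append('"');
        }

        // Split children into elements and (trimmed) text; mixed content is not handled.
        List<Element> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        NodeList kids = e.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.ELEMENT_NODE) {
                children.add((Element) k);
            } else if (k.getNodeType() == Node.TEXT_NODE || k.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(k.getNodeValue().trim());
            }
        }

        if (children.isEmpty() && text.length() == 0) {
            // Empty element: short form, with "/>" on its own line when attributes were written.
            if (attrs.getLength() > 0) {
                out.append('\n').append(pad);
            }
            out.append("/>\n");
        } else if (children.isEmpty()) {
            out.append('>').append(escape(text.toString()))
               .append("</").append(e.getTagName()).append(">\n");
        } else {
            out.append(">\n");
            for (Element child : children) {
                write(child, depth + 1, out);
            }
            out.append(pad).append("</").append(e.getTagName()).append(">\n");
        }
    }

    static String indent(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        return sb.toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        System.out.print(format(
                "<parameters><parameter name=\"Tom\" city=\"York\" number=\"123\"/></parameters>"));
    }
}
```

The trade-off is that all escaping and layout rules become your responsibility, which is why a real serializer is usually preferable when its output is acceptable.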
For a project at university, I need to parse a GML file. GML files are XML based so I use JDOM2 to parse it. To fit my purposes, I extended org.jdom2.Document like so:
package datenbank;
import java.io.File;
// some more imports
public class GMLDatei extends org.jdom2.Document {
public void saveAsFile() {
// ...
}
public GMLKnoten getRootElement(){
return (GMLKnoten) this.getDocument().getRootElement();
}
public void setRootElement(GMLKnoten root){
this.getDocument().setRootElement(root);
}
}
I also extended org.jdom2.Element and named the subclass GMLKnoten but this does not matter too much for my question.
When testing, I try to load a GML file. When using the native document and element classes, it loads fine, but when using my subclasses, I get the following scenario:
I load the file using:
SAXBuilder saxBuilder = new SAXBuilder();
File inputFile = new File("gml/Roads_Munich_Route_Lines.gml");
GMLDatei document = null;
ArrayList<String> types = new ArrayList<String>();
try {
document = (GMLDatei) saxBuilder.build(inputFile);
} catch (JDOMException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
In the line
document = (GMLDatei) saxBuilder.build(inputFile);
I get a Cast-Exception:
Exception in thread "main" java.lang.ClassCastException: org.jdom2.Document cannot be cast to datenbank.GMLDatei
at datenbank.GMLTest.main(GMLTest.java:27)
I thought that casting should be no problem, as I am subclassing org.jdom2.Document. What am I missing?
In general I want to "challenge" your requirement to extend Document - what value do you get from your custom classes that are not already part of the native implementation? I ask this for 2 reasons:
as the maintainer of JDOM, should I be adding some new feature?
I am just curious.....
JDOM has a system in place for allowing you to extend its core classes and have a different implementation of them when parsing a document. It is done by extending the JDOMFactory.
See the JDOMFactory interface. When SAXBuilder parses a document, it uses those methods to build the document.
There is a default, overridable implementation in DefaultJDOMFactory that you can extend, and, for example, in your implementation, you must override the non-final "Element" methods like:
@Override
public Element element(final int line, final int col, final String name,
        String prefix, String uri) {
    return new Element(name, prefix, uri);
}
and instead have:
@Override
public Element element(final int line, final int col, final String name,
        String prefix, String uri) {
    return new GMLKnoten(name, prefix, uri);
}
Note that you will have to override all methods that are non-final and return content that is to be customised (for example, you will have to override 4 Element methods by my count).
With your own GMLJDOMFactory you can then use SAXBuilder, either by using the full constructor new SAXBuilder(null, null, new GMLJDOMFactory()) or by setting the JDOMFactory after you have constructed it with setJDOMFactory(...).
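The underlying Java rule can be sketched with stand-in classes. This is not the JDOM API: BaseDocument, CustomDocument, buildDefault and buildWith are hypothetical names that only mimic the situation. A downcast succeeds only if the object was created as the subclass, which is why the builder has to be handed a factory.

```java
import java.util.function.Supplier;

class CastSketch {

    static class BaseDocument { }                        // stand-in for org.jdom2.Document
    static class CustomDocument extends BaseDocument { } // stand-in for GMLDatei

    /** Mimics a builder with its default factory: it always creates the base type. */
    static BaseDocument buildDefault() {
        return new BaseDocument();
    }

    /** Mimics a builder that asks a caller-supplied factory for the document object. */
    static BaseDocument buildWith(Supplier<? extends BaseDocument> factory) {
        return factory.get();
    }

    /** Returns true if casting the default result to the subclass throws. */
    static boolean castFails() {
        try {
            CustomDocument d = (CustomDocument) buildDefault();
            return d == null; // not reached: the cast above throws
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Cast of default result fails: " + castFails());
        CustomDocument ok = (CustomDocument) buildWith(CustomDocument::new);
        System.out.println("Factory result type: " + ok.getClass().getSimpleName());
    }
}
```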
Hey guys, I am brand new to the world of Java XML parsing and found that the StAX API is probably my best bet, as I need to both read and write XML files. I have a very short (and what should be very simple) program that should create an XMLInputFactory and use that to create an XMLStreamReader. The XMLStreamReader is created using a FileInputStream attached to an XML file in the same directory as the source file. However, even though the FileInputStream line compiles properly, the XMLInputFactory cannot access it, and without the FileInputStream it cannot create the XMLStreamReader. Please help, as I have no idea what to do and am frustrated to the point of giving up!
import javax.xml.stream.*;
import java.io.*;
public class xml {
static String status;
public static void main(String[] args) {
status = "Program has started";
printStatus();
XMLInputFactory inFactory = XMLInputFactory.newInstance();
status = "XMLInputFactory (inFactory) defined"; printStatus();
try { FileInputStream fIS = new FileInputStream("stax.xml"); }
catch (FileNotFoundException na) { System.out.println("FileNotFound"); }
status = "InputStream (fIS) declared"; printStatus();
try { XMLStreamReader xmlReader = inFactory.createXMLStreamReader(fIS); } catch (XMLStreamException xmle) { System.out.println(xmle); }
status = "XMLStreamReader (xmlReader) created by 'inFactory'"; printStatus();
}
public static void printStatus(){ //this is a little code that send notifications when something has been done
System.out.println("Status: " + status);
}
}
also here is the XML file if you need it:
<?xml version="1.0"?>
<dennis>
<hair>brown</hair>
<pants>blue</pants>
<gender>male</gender>
</dennis>
Your problem has to do w/ basic java programming, nothing to do w/ stax. your FileInputStream is scoped within a try block (some decent code formatting would help) and therefore not visible to the code where you are attempting to create the XMLStreamReader. with formatting:
XMLInputFactory inFactory = XMLInputFactory.newInstance();
try {
// fIS is only visible within this try{} block
FileInputStream fIS = new FileInputStream("stax.xml");
} catch (FileNotFoundException na) {
System.out.println("FileNotFound");
}
try {
// fIS is not visible here
XMLStreamReader xmlReader = inFactory.createXMLStreamReader(fIS);
} catch (XMLStreamException xmle) {
System.out.println(xmle);
}
on a secondary note, StAX is a nice API, and a great one for highly performant XML processing in java. however, it is not the simplest XML api. you would probably be better off starting with the DOM based apis, and only using StAX if you experience performance issues using DOM. if you do stay with StAX, i'd advise using XMLEventReader instead of XMLStreamReader (again, an easier api).
lastly, do not hide exception details (e.g. catch them and print out something which does not include the exception itself) or ignore them (e.g. continue processing after the exception is thrown without attempting to deal with the problem).
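One way to restructure the code, sketched under a few assumptions: the stream and the reader live in the same scope, try-with-resources closes the stream, and failures are rethrown with their cause attached. The class name StaxScopeSketch and the method elementNames are invented for this example, and an in-memory stream stands in for new FileInputStream("stax.xml") so the sketch runs without the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxScopeSketch {

    /** Reads the stream and returns the local names of all start tags, in document order. */
    static List<String> elementNames(InputStream in) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        names.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close(); // closes the reader, not the underlying stream
            }
        } catch (XMLStreamException e) {
            // Keep the original exception as the cause instead of swallowing it.
            throw new IllegalStateException("Could not read XML", e);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String xml = "<dennis><hair>brown</hair><pants>blue</pants><gender>male</gender></dennis>";
        // For a real file, replace the stream with: new FileInputStream("stax.xml")
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            // 'in' is visible for the whole block, so the reader can be created from it here.
            System.out.println(elementNames(in));
        }
    }
}
```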
I'm trying to capture xsl:message in java when calling my transform. Below is a snippet of my code.
final ArrayList<TransformerException> errorList = new ArrayList<TransformerException>();
ErrorListener errorListener = new ErrorListener() {
@Override
public void warning(TransformerException e) throws TransformerException {
//To change body of implemented methods use File | Settings | File Templates.
log.error(e.getMessage());
errorList.add(e);
}
@Override
public void error(TransformerException e) throws TransformerException {
//To change body of implemented methods use File | Settings | File Templates.
log.error(e.getMessage());
errorList.add(e);
}
@Override
public void fatalError(TransformerException e) throws TransformerException {
//To change body of implemented methods use File | Settings | File Templates.
errorList.add(e);
throw e;
}
};
...
try
{
transformer.setErrorListener(errorListener);
newDoc = transform(transformer, oldDoc);
}
catch (TransformerException e) {
log.error("Problem transforming normalized document into PUBS-XML", e);
throw e;
}
Unfortunately this is not working.
Is there a better way?
Thanks in advance!
If you are using Saxon, then you may need to set the message emitter using setMessageEmitter().
https://www.saxonica.com/html/documentation10/javadoc/net/sf/saxon/trans/XsltController.html#setMessageEmitter-net.sf.saxon.event.Receiver-
public void setMessageEmitter(Receiver receiver)
Set the Receiver to be used for xsl:message output.
Recent versions of the JAXP interface specify that by default the
output of xsl:message is sent to the
registered ErrorListener. Saxon does
not implement this convention.
Instead, the output is sent to a
default message emitter, which is a
slightly customised implementation of
the standard Saxon Emitter interface.
This interface can be used to change the way in which Saxon outputs
xsl:message output.
Michael Kay has explained why Saxon doesn't output xsl:message according to the JAXP interface, and has suggested two options for obtaining the output:
ErrorListener was something that was
introduced to JAXP at a rather late
stage (one of many regrettable
occasions where the spec was changed
unilaterally to match the Xalan
implementation), and I decided not to
implement this change as a default
behaviour, because it would have been
disruptive to existing applications.
In Saxon, xsl:message output is
directed to a Receiver, which you can
nominate to the Transformer:
((net.sf.saxon.Controller)transformer).setMessageEmitter(....)
If you want to follow the JAXP model
of sending the output to the
ErrorListener, you can nominate a
Receiver that does this:
((net.sf.saxon.Controller)transformer).setMessageEmitter(new net.sf.saxon.event.MessageWarner())
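For comparison, here is a small sketch of the JAXP convention described above, assuming the JDK's built-in XSLT processor is the one TransformerFactory.newInstance() returns (not Saxon). In that setup, xsl:message output is expected to reach the ErrorListener as a warning. The class name XslMessageSketch, the stylesheet and the message text are made up; with another processor on the classpath the list may stay empty.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XslMessageSketch {

    static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:template match='/'>"
          + "<xsl:message>hello from xsl:message</xsl:message>"
          + "<out/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Runs the stylesheet and returns every message the ErrorListener received. */
    static List<String> collectMessages() {
        final List<String> messages = new ArrayList<>();
        ErrorListener listener = new ErrorListener() {
            @Override
            public void warning(TransformerException e) {
                messages.add(e.getMessage());
            }

            @Override
            public void error(TransformerException e) {
                messages.add(e.getMessage());
            }

            @Override
            public void fatalError(TransformerException e) throws TransformerException {
                throw e;
            }
        };
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Set the listener on the Transformer itself, before calling transform().
            transformer.setErrorListener(listener);
            transformer.transform(new StreamSource(new StringReader("<in/>")),
                    new StreamResult(new StringWriter()));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(collectMessages());
    }
}
```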
I'm copying code from one part of our application (an applet) to inside the app. I'm parsing XML as a String. It's been awhile since I parsed XML, but from the error that's thrown it looks like it might have to do with not finding the .dtd. The stack trace makes it difficult to find the exact cause of the error, but here's the message:
java.net.MalformedURLException: no protocol: http://www.mycomp.com/MyComp.dtd
and the XML has this as the first couple lines:
<?xml version='1.0'?>
<!DOCTYPE MYTHING SYSTEM 'http://www.mycomp.com/MyComp.dtd'>
and here's the relevant code snippets
class XMLImportParser extends DefaultHandler {
private SAXParser m_SaxParser = null;
private String is_InputString = "";
XMLImportParser(String xmlStr) throws SAXException, IOException {
super();
is_InputString = xmlStr;
createParser();
try {
preparseString();
parseString(is_InputString);
} catch (Exception e) {
throw new SAXException(e); //"Import Error : "+e.getMessage());
}
}
void createParser() throws SAXException {
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
try {
factory.setFeature("http://xml.org/sax/features/namespaces", true);
factory.setFeature("http://xml.org/sax/features/namespace-prefixes", true);
m_SaxParser = factory.newSAXParser();
m_SaxParser.getXMLReader().setFeature("http://xml.org/sax/features/namespaces", true);
m_SaxParser.getXMLReader().setFeature("http://xml.org/sax/features/namespace-prefixes", true);
} catch (SAXNotRecognizedException snre){
throw new SAXException("Failed to create XML parser");
} catch (SAXNotSupportedException snse) {
throw new SAXException("Failed to create XML parser");
} catch (Exception ex) {
throw new SAXException(ex);
}
}
void preparseString() throws SAXException {
try {
InputSource lSource = new InputSource(new StringReader(is_InputString));
lSource.setEncoding("UTF-8");
m_SaxParser.parse(lSource, this);
} catch (Exception ex) {
throw new SAXException(ex);
}
}
}
It looks like the error is happening in the preparseString() method, on the line that actually does the parsing, the m_SaxParser.parse(lSource, this); line.
FYI, the 'MyComp.dtd' file does exist at that location and is accessible via http. The XML file comes from a different service on the server, so I can't change it to a file:// format and put the .dtd file on the classpath.
I think you have some extra code in the XML declaration. Try this:
<?xml version='1.0'?>
<!DOCTYPE MYTHING SYSTEM "http://www.mycomp.com/MyComp.dtd">
The above was captured from the W3C Recommendations: http://www.w3.org/QA/2002/04/valid-dtd-list.html
You can use the http link to set the Schema on the SAXParserFactory before creating your parser.
void createParser() throws SAXException {
Schema schema = SchemaFactory.newSchema(new URL("http://www.mycomp.com/MyComp.dtd"));
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
factory.setSchema(schema);
The problem is that this:
http://www.mycomp.com/MyComp.dtd
is an HTML hyperlink, not a URL. Replace it with this:
http://www.mycomp.com/MyComp.dtd
Since this XML comes from an external source, the first thing to do would be to complain to them that they are sending invalid XML.
As a workaround, you can set an EntityResolver on your parser that compares the SystemId to this invalid url and returns a correct http url:
m_SaxParser.getXMLReader().setEntityResolver(
new EntityResolver() {
public InputSource resolveEntity(final String publicId, final String systemId) throws SAXException {
if ("http://www.mycomp.com/MyComp.dtd".equals(systemId)) {
return new InputSource("http://www.mycomp.com/MyComp.dtd");
} else {
return null;
}
}
}
);
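A variation on the same idea, as a sketch only: the resolver can also hand the parser a local copy of the DTD, so parsing does not depend on the remote server at all. The class name LocalDtdSketch and the one-line DTD content are invented for this example; the real MyComp.dtd would have to be supplied instead. It assumes the JDK's default SAX parser, which asks the resolver for the external DTD even when validation is off.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class LocalDtdSketch {

    static final String DTD_URL = "http://www.mycomp.com/MyComp.dtd";
    // Hypothetical DTD content, invented for this example.
    static final String LOCAL_DTD = "<!ELEMENT MYTHING (#PCDATA)>";

    /** Parses the document and returns the system ids the parser asked the resolver for. */
    static List<String> parseAndRecordLookups(String xml) {
        final List<String> requested = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver((publicId, systemId) -> {
                requested.add(systemId);
                if (DTD_URL.equals(systemId)) {
                    // Serve the DTD from memory instead of opening the URL.
                    InputSource local = new InputSource(new StringReader(LOCAL_DTD));
                    local.setSystemId(systemId);
                    return local;
                }
                return null; // let the parser resolve anything else itself
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return requested;
    }

    public static void main(String[] args) {
        String xml = "<?xml version='1.0'?>\n"
                + "<!DOCTYPE MYTHING SYSTEM 'http://www.mycomp.com/MyComp.dtd'>\n"
                + "<MYTHING>content</MYTHING>";
        System.out.println("Resolver was asked for: " + parseAndRecordLookups(xml));
    }
}
```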