Java XML Unmarshalling fails on ampersand (&) using JAXB

I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<details>
...
<address1>Test&Address</address1>
...
</details>
When I try to unmarshal it using JAXB, it throws the following exception:
Caused by: org.xml.sax.SAXParseException: The reference to entity "Address" must end with the ';' delimiter.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)
But when I change the & in the XML to &apos;, it works. It looks like the problem is only with the ampersand (&), and I cannot understand why.
The code to unmarshal is:
JAXBContext context = JAXBContext.newInstance("some.package.name", this.getClass().getClassLoader());
Unmarshaller unmarshaller = context.createUnmarshaller();
obj = unmarshaller.unmarshal(new StringReader(xml));
Anyone have some insight?
EDIT: I tried the solution suggested by @abhin4v below (i.e., adding a space after &), but it doesn't seem to work either. Here's the stack trace:
Caused by: org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanEntityReference(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:194)

I've run into this too. On the first pass I simply replaced the &amp with a token string (AMPERSAND_TOKEN), sent the XML through JAXB, then restored the ampersand afterwards. Not ideal, but it was a quick fix.
Second pass I made a lot of significant changes, so I'm not sure what exactly solved the problem. I suspect that providing JAXB access to the html dtds made it much happier, but that's only a guess and could be specific to my project.
HTH

Xerces converts &amp; to & and then tries to resolve &Address, which fails because it does not end with ;. Putting a space between & and Address will not work either, as Xerces will then try to resolve the bare & and throw the second error given in the OP. You can wrap the text in a CDATA section, and Xerces will not try to resolve the entities.
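To make the two well-formed options concrete, here is a minimal sketch. It uses the JDK's built-in DOM parser rather than JAXB (an assumption made so the example runs without extra libraries); the failure in the question comes from the underlying XML parser, so the same inputs should behave the same way under the unmarshaller. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question.
class AmpersandDemo {

    // Parses the XML string and returns the text content of its root element.
    public static String textOf(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Option 1: escape the ampersand as the predefined entity &amp;
        System.out.println(textOf("<address1>Test&amp;Address</address1>"));

        // Option 2: wrap the raw text in a CDATA section
        System.out.println(textOf("<address1><![CDATA[Test&Address]]></address1>"));

        // A bare & is not well-formed XML and is rejected by the parser.
        try {
            textOf("<address1>Test&Address</address1>");
        } catch (org.xml.sax.SAXParseException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Whichever option is used, the fix has to be applied by whoever produces the XML; the parser cannot be told to accept a bare ampersand.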

It turns out that the problem is because of the framework I'm using (Mentawai framework). The said XML comes from the POST body of an HTTP request.
Apparently, the framework converts the character entities in the XML body, so &amp; becomes & and the unmarshaller fails to unmarshal the XML.

Related

ERROR: 'Namespace for prefix 'xsi' has not been declared.'

Why am I getting this error:
ERROR: 'Namespace for prefix 'xsi' has not been declared.'
Here is my Java code:
package com.emp.ma.jbl.nsnhlrspmlpl.nsnhlrspmlpl.internal.action;
import com.emp.ma.util.xml.XMLDocument;
import com.emp.ma.util.xml.XMLDocumentBuilder;
public class yay {
    public static void main(String[] args) {
        XMLDocument xmldoc = XMLDocumentBuilder.newDocument().addRoot("spml:modifyRequest");
        xmldoc.gotoRoot().addTag("modification").addText("");
        xmldoc.gotoChild("modification").addTag("valueObject").addText("");
        xmldoc.gotoChild("valueObject").addAttribute("xsi:type","halo");
        System.out.println(xmldoc);
    }
}
This code was functioning properly until I tried throwing a TransformerException while converting an XML file to HTML, purely as an experiment.
I need to create an xml file with the format:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<spml:modifyRequest>
<modification>
<valueObject xsi:type="halo">
</valueObject>
</modification>
</spml:modifyRequest>
I removed the transformer part from the code already and yet I'm getting this error in eclipse:
ERROR: 'Namespace for prefix 'xsi' has not been declared.'
Exception in thread "main" com.emp.ma.util.xml.XMLDocumentException: java.lang.RuntimeException: Namespace for prefix 'xsi' has not been declared.
at com.emp.ma.util.xml.XMLDocumentImpl.toResult(XMLDocumentImpl.java:1244)
at com.emp.ma.util.xml.XMLDocumentImpl.toStream(XMLDocumentImpl.java:1314)
at com.emp.ma.util.xml.XMLDocumentImpl.toString(XMLDocumentImpl.java:1336)
at com.emp.ma.util.xml.XMLDocumentImpl.toString(XMLDocumentImpl.java:1325)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.emp.ma.util.xml.XMLDocumentBuilder$XMLDocumentHandler.invoke(XMLDocumentBuilder.java:55)
at $Proxy1.toString(Unknown Source)
at java.lang.String.valueOf(Unknown Source)
at java.io.PrintStream.println(Unknown Source)
at com.emp.ma.jbl.nsnhlrspmlpl.nsnhlrspmlpl.internal.action.yay.main(yay.java:13)
Caused by: javax.xml.transform.TransformerException: java.lang.RuntimeException: Namespace for prefix 'xsi' has not been declared.
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(Unknown Source)
at com.emp.ma.util.xml.XMLDocumentImpl.toResult(XMLDocumentImpl.java:1242)
... 12 more
Caused by: java.lang.RuntimeException: Namespace for prefix 'xsi' has not been declared.
at com.sun.org.apache.xml.internal.serializer.SerializerBase.getNamespaceURI(Unknown Source)
at com.sun.org.apache.xml.internal.serializer.SerializerBase.addAttribute(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(Unknown Source)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(Unknown Source)
... 15 more
I'm stuck because of this one exception and I don't know how to undo this. Please help. Like I said, it was functioning properly before I tried this experiment of mine; if possible, how do I remove this transformer integration? I've tried changing the workspace as well, and it is still not working.
For an XML document to be well-formed, all used namespace prefixes must be declared.
Simply declare the xsi namespace prefix on the root element of your XML,
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<spml:modifyRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<modification>
<valueObject xsi:type="halo">
</valueObject>
</modification>
</spml:modifyRequest>
and your error will go away.
Note that you'll similarly have to define the spml namespace prefix.
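For anyone building the document in code rather than by hand, here is a rough sketch of the same fix using the standard JDK DOM and Transformer APIs. It does not use the asker's com.emp.ma.util.xml library, whose API I don't know. The spml namespace URI below is a made-up placeholder; substitute the real one for your schema.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class using only standard JDK APIs.
class XsiNamespaceDemo {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";
    // Namespace that xmlns:* declaration attributes themselves belong to.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";
    // ASSUMPTION: placeholder URI, not the real SPML namespace.
    static final String SPML_NS = "urn:example:spml";

    public static String build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Root element, with both prefixes declared on it.
        Element root = doc.createElementNS(SPML_NS, "spml:modifyRequest");
        root.setAttributeNS(XMLNS_NS, "xmlns:spml", SPML_NS);
        root.setAttributeNS(XMLNS_NS, "xmlns:xsi", XSI_NS);
        doc.appendChild(root);

        Element modification = doc.createElementNS(null, "modification");
        root.appendChild(modification);

        Element valueObject = doc.createElementNS(null, "valueObject");
        // The attribute is created in the xsi namespace, not just given a prefixed name.
        valueObject.setAttributeNS(XSI_NS, "xsi:type", "halo");
        modification.appendChild(valueObject);

        // Serialize with the identity transformer.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```

The point is that the serializer can only write xsi:type if it can map the xsi prefix to a namespace URI, which is what the declaration on the root element provides.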
I got the same error, "AxisFault faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server faultSubcode: faultString: java.lang.RuntimeException: Namespace for prefix 'xsi' has not been declared.", while calling a SOAP web service from a ColdFusion 9 server. It took me a long time to resolve, and I finally found that the real cause was an incorrect date value being supplied to the web service parameters. Whenever you hit this error, check the input values you are supplying to the web service parameters. In my case the datetime format 2015-03-04T00:00:00.000Z (an ISO 8601 date representation) caused the issue, and 2015-03-04 00:00 resolved it. Likewise, if I pass an arbitrary string (xxxx) for a datetime parameter, the ColdFusion Axis web service shows the same irrelevant error: Namespace for prefix 'xsi' has not been declared.

issue with SAX parser in java

When the web service is called using a SOAP request, it gives the following parse error.
I have checked the prolog of the request and it looks right: there is no whitespace or dash. Even so, it causes the following error:
org.xml.sax.SAXParseException: Content is not allowed in prolog.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
at javax.xml.parsers.SAXParser.parse(SAXParser.java:198)
at requestModel.SimpleCheckMail.checkMail(SimpleCheckMail.java:162)
at model.InboxDataBean.prepareList(InboxDataBean.java:97)
at model.InboxDataBean.getemailList(InboxDataBean.java:207)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at javax.el.BeanELResolver.getValue(BeanELResolver.java:87)
at com.sun.faces.el.DemuxCompositeELResolver._getValue(DemuxCompositeELResolver.java:176)
at com.sun.faces.el.DemuxCompositeELResolver.getValue(DemuxCompositeELResolver.java:203)
at org.apache.el.parser.AstValue.getValue(AstValue.java:169)
at org.apache.el.ValueExpressionImpl.getValue(ValueExpressionImpl.java:189)
at com.sun.faces.facelets.el.TagValueExpression.getValue(TagValueExpression.java:109)
at javax.faces.component.ComponentStateHelper.eval(ComponentStateHelper.java:194)
at javax.faces.component.ComponentStateHelper.eval(ComponentStateHelper.java:182)
at javax.faces.component.UIData.getValue(UIData.java:731)
at javax.faces.component.UIData.getDataModel(UIData.java:1798)
at javax.faces.component.UIData.setRowIndexWithoutRowStatePreserved(UIData.java:484)
at javax.faces.component.UIData.setRowIndex(UIData.java:473)
at com.sun.faces.renderkit.html_basic.TableRenderer.encodeBegin(TableRenderer.java:81)
at javax.faces.component.UIComponentBase.encodeBegin(UIComponentBase.java:820)
at javax.faces.component.UIData.encodeBegin(UIData.java:1118)
at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1754)
at javax.faces.render.Renderer.encodeChildren(Renderer.java:168)
at javax.faces.component.UIComponentBase.encodeChildren(UIComponentBase.java:845)
at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1756)
at javax.faces.component.UIComponent.encodeAll(UIComponent.java:1759)
at com.sun.faces.application.view.FaceletViewHandlingStrategy.renderView(FaceletViewHandlingStrategy.java:401)
at com.sun.faces.application.view.MultiViewHandler.renderView(MultiViewHandler.java:131)
at com.sun.faces.lifecycle.RenderResponsePhase.execute(RenderResponsePhase.java:121)
at com.sun.faces.lifecycle.Phase.doPhase(Phase.java:101)
at com.sun.faces.lifecycle.LifecycleImpl.render(LifecycleImpl.java:139)
at javax.faces.webapp.FacesServlet.service(FacesServlet.java:410)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:169)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:168)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:927)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:987)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:579)
at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1805)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Please let me know what is causing this error. Thanks in advance.
Yes, all of you are right. But this is a SOAP request, and I had already checked the request carefully: there is no bad character in it. The problem was that when I invoked the web service through the SOAP request, it returned null as the response, and that is why I was getting the error.
As soon as the web service worked properly, this worked fine too. Thanks, all of you.
It means that there is something in the XML before <?xml ...; look at it carefully. Also check that there is no invisible character (you can do this in any hex editor). Sometimes Windows Notepad adds its own marker at the beginning of the file.
The parser sees character data before the actual XML itself is started. Either make sure your XML does not contain any stuff before the XML starts, or let your SAX parser ignore this...
Try to display the data you're actually parsing. Maybe some bad characters are inserted before the beginning of your xml, or maybe you're not reading the right file.
This may be because of a BOM, if your XML file is stored as UTF-8 (which it probably is).
Here, you have an example of an InputStream, that gets rid of the BOM.
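The linked example did not survive here, so below is a minimal sketch of the idea, assuming a UTF-8 byte order mark (the bytes EF BB BF). It is not the code from the original link; the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strips a leading UTF-8 BOM, if present, from a stream.
class BomSkipper {

    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];

        // Read up to three bytes; read() may return fewer than requested, so loop.
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break; // end of stream
            }
            n += r;
        }

        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;

        // Not a BOM: push the bytes back so the caller sees the stream unchanged.
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        InputStream clean = skipUtf8Bom(new ByteArrayInputStream(data));
        System.out.println(new String(clean.readAllBytes(), StandardCharsets.UTF_8));
    }
}
```

The returned stream can then be handed to the SAX parser in place of the original one.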

Handle Applet throwing java.lang.ExceptionInInitializerError preventing it from running

We are developing java applet and embedding it in our web pages. When the applet is loaded via HTML APPLET tags, the browser/JVM prompts the user to allow it to run. When we hit cancel, the java console indicates the following exception:
java.lang.RuntimeException: java.lang.ExceptionInInitializerError
at sun.plugin2.applet.Plugin2Manager.createApplet(Unknown Source)
at sun.plugin2.applet.Plugin2Manager$AppletExecutionRunnable.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ExceptionInInitializerError
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at java.lang.Class.newInstance0(Unknown Source)
at java.lang.Class.newInstance(Unknown Source)
at sun.plugin2.applet.Plugin2Manager$12.run(Unknown Source)
at java.awt.event.InvocationEvent.dispatch(Unknown Source)
at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
at java.awt.EventQueue.access$000(Unknown Source)
at java.awt.EventQueue$1.run(Unknown Source)
at java.awt.EventQueue$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.AccessControlContext$1.doIntersectionPrivilege(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
Caused by: java.security.AccessControlException: access denied (java.lang.RuntimePermission getenv.TEMP)
at java.security.AccessControlContext.checkPermission(Unknown Source)
at java.security.AccessController.checkPermission(Unknown Source)
at java.lang.SecurityManager.checkPermission(Unknown Source)
at java.lang.System.getenv(Unknown Source)
at downLoadApp.<clinit>(downLoadApp.java:15)
... 21 more
Exception: java.lang.RuntimeException: java.lang.ExceptionInInitializerError
Now, I realize it's most likely because the class java tried to load was prevented from loading, and therefore the exception is thrown, but how do we gracefully handle this situation in the browser? I'd like to detect that the applet was denied and post a reasonable response to the condition, but I'm unaware of how to catch this exception since it appears to have been thrown by the JVM in reaction to not getting the jar file to load rather than code written in it...
Ideas?
Thanks!
You can catch the access control exception by putting the call to System.getenv in downLoadApp.java line 15 in a try-catch statement. That is, instead of this:
static String tmp = System.getenv("TEMP");
you should have:
static String tmp;
static {
    try {
        tmp = System.getenv("TEMP");
    } catch (java.security.AccessControlException ace) {
        // tmp is not set, maybe use some default value?
    }
}
Do you call System.getenv("TEMP") somewhere in the static context of the downLoadApp class? Most probably this is in the initializer of a static field, but it could be in a static code block.
This is the place where it fails (a java.security.AccessControlException is thrown). The rest of the stack trace are errors following from that. You need to catch this exception to detect that you don't have the appropriate permissions.
As you can't catch exceptions of static field initializers, you need to move the call to getenv into a method or a static code block.
I don't think that you can catch this exception as this is thrown by JVM while loading applet.

Get xml namespace (without triggering UnknownHostException)

I have some Java code that determines the namespace of the root-level element of an xml document using SAX. If the namespace is "http://sbgn.org/libsbgn/pd/0.1", it should return version 1. If the namespace is "http://sbgn.org/libsbgn/0.2", the version should be 2. So all the code does is read the first element, and set a variable based on the namespace. Here is the code:
private static class VersionHandler extends DefaultHandler
{
    private int version = -1;

    @Override
    public void startElement (String uri, String localName, String qName, Attributes attributes) throws SAXException
    {
        if ("sbgn".equals (qName))
        {
            System.out.println (uri);
            if ("http://sbgn.org/libsbgn/0.2".equals(uri))
            {
                version = 2;
            }
            else if ("http://sbgn.org/libsbgn/pd/0.1".equals(uri))
            {
                version = 1;
            }
            else
            {
                version = -1;
            }
        }
    }

    public int getVersion() { return version; }
}

public static int getVersion(File file) throws SAXException, FileNotFoundException, IOException
{
    XMLReader xr;
    xr = XMLReaderFactory.createXMLReader();
    VersionHandler versionHandler = new VersionHandler();
    xr.setContentHandler(versionHandler);
    xr.setErrorHandler(versionHandler);
    xr.parse(new InputSource(
        InputStreamToReader.inputStreamToReader(
            new FileInputStream (file))));
    return versionHandler.getVersion();
}
This works, but has two problems:
It is inefficient, because the whole document will be parsed even though only the first element is needed.
More importantly, this code sometimes (apparently depending on firewall configuration) triggers a UnknownHostException like so:
java.net.UnknownHostException: www.w3.org
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at sun.net.NetworkClient.doConnect(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.openServer(Unknown Source)
at sun.net.www.http.HttpClient.<init>(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.http.HttpClient.New(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at org.sbgn.SbgnVersionFinder.getVersion(SbgnVersionFinder.java:57)
So my questions are:
Apparently this bit of code is connecting to the internet. How can I avoid that? Besides leading to problems with firewalls, it is also needlessly slow.
Why is it connecting to the internet? Please help me understand the logic here, there should be absolutely no need for it.
Is there a more efficient way to determine the namespace of the root element of an xml document?
Edit: here is a link to a sample document that I'm trying to parse this way: https://libsbgn.svn.sourceforge.net/svnroot/libsbgn/trunk/test-files/PD/adh.sbgn
Edit2: A note regarding the solution of this bug: in fact the problem was triggered because the wrong document was being parsed. Instead of the intended document, I was parsing an XHTML document that does in fact refer to www.w3.org. Of course the solution is to use the correct document. Nevertheless, I found it useful to add this line:
xr.setEntityResolver(null);
To prevent xerces from going over the internet when it's really completely unnecessary.
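A caveat, as far as I understand the SAX API: registering no resolver (or null) leaves the parser with its default entity resolution, which may still fetch an external DTD. A commonly used alternative is a resolver that hands back an empty source for every external entity. The sketch below assumes the JDK's bundled parser; the class name and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: parse a document with an external DTD without touching the network.
class OfflineParseDemo {

    public static String rootNamespace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Answer every external entity request (such as the DTD) with empty content,
        // so the parser never needs to open a URL.
        reader.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));

        final String[] namespace = new String[1];
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (namespace[0] == null) {
                    namespace[0] = uri; // remember the namespace of the first (root) element
                }
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return namespace[0];
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" "
                + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">"
                + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body/></html>";
        System.out.println(rootNamespace(xhtml));
    }
}
```

One trade-off: entities defined only in the skipped DTD (for example &nbsp; in XHTML) will then be undefined, so this suits cases like this one where only the element structure matters.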
I believe you need to set the entity resolver. See the javadoc. Also, this article seems relevant.
It's probably connecting to the internet because your document is referring to a DTD or other external entity on the W3C web site. Earlier this year, W3C stopped serving these documents because they couldn't handle the traffic.
You can solve the problem of reading the whole document by throwing a SAXException from one of your callbacks once you've seen as much of the document as you need to see. Be sure in the code that calls the XMLReader.parse() method to distinguish this exception from exceptions thrown by the parser itself (for example, you could subclass SAXException: though not all parsers throw your original exception unchanged and you may need to experiment.)
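Here is a rough sketch of that early-exit approach, using the two namespace URIs from the question. The exception class and the other names are invented for the example. To allow for parsers that wrap the exception, the catch block checks a flag on the handler instead of relying on the exception type alone.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: stop parsing as soon as the root element has been seen.
class RootNamespaceFinder {

    /** Marker exception thrown on purpose to abort the parse. */
    static final class StopParsing extends SAXException {
        StopParsing() {
            super("root element reached");
        }
    }

    static final class RootHandler extends DefaultHandler {
        String namespace;
        boolean seen;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            namespace = uri;
            seen = true;
            throw new StopParsing(); // nothing after the root start tag is needed
        }
    }

    public static String find(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        RootHandler handler = new RootHandler();
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // If the root was seen, this is our own abort (possibly wrapped);
            // otherwise it is a genuine parse error and must be reported.
            if (!handler.seen) {
                throw e;
            }
        }
        return handler.namespace;
    }

    // Maps the namespace URIs from the question to version numbers.
    public static int version(String namespace) {
        if ("http://sbgn.org/libsbgn/0.2".equals(namespace)) {
            return 2;
        }
        if ("http://sbgn.org/libsbgn/pd/0.1".equals(namespace)) {
            return 1;
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sbgn xmlns=\"http://sbgn.org/libsbgn/0.2\"><map/></sbgn>";
        System.out.println(version(find(xml)));
    }
}
```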

Is there ANY way to save a graph object containing nodes and edges?

I've tried using the standard serializing type things, stuff like:
FileOutputStream f_out;
try {
    f_out = new FileOutputStream("MAOS.data");
    ObjectOutputStream obj_out = new ObjectOutputStream (f_out);
    obj_out.writeObject(s);
    obj_out.flush();
    obj_out.close();
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
But the problem seems to be that if my object s contains any recursion at ALL I get a stack overflow. If s is a graph that contains nodes and edges (with nodes knowing about edges for purposes of spreading activation, and edges knowing about nodes for the same reason) then it stack overflows. If I take edges out entirely, and just have nodes that know about which nodes they're supposed to spread activation to, the same thing happens! I can even just try to save the ArrayList of nodes that the graph knows about, and the stack overflows again!
I'm so frustrated!
Graphs aren't exactly strange and mysterious, surely SOMEONE has wanted to save one before me. I'm seeing something about saving them as XML files here...but if my problem is the recursiveness, wouldn't I still be having the same problems even if I saved it differently? I just can't think of how you could make a graph without there being connections!
Am I just doing things wrong, or is this object serialization less powerful than I thought? Or do I need to just abandon the idea of saving a graph?
-Jenny
Edit, part of the HUGE stack trace:
Exception in thread "main" java.lang.StackOverflowError
at java.io.ObjectStreamClass.getPrimFieldValues(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
These sort of structures are best saved like this:
collection of nodes, each node has a unique ID
collection of edges, each edge has two node IDs (or however many nodes an edge connects to)
without using any recursion. On reading the nodes, create a dictionary of nodes indexed by their ID. Then use the dictionary to fix up the edges when they're read. The IDs do not need to be part of the objects' run time structure, they only need to be unique within the data stream when the stream is written/read.
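As a rough illustration of that scheme, here is a small sketch with a made-up Node class and a made-up line-based text format ("N id" for nodes, "E from to" for edges). It is not the asker's graph class, just the shape of the idea: write flat lists, then rebuild the references from a map keyed by ID.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of ID-based graph persistence; all names are illustrative.
class GraphStore {

    static final class Node {
        final int id;
        final List<Node> neighbours = new ArrayList<>();

        Node(int id) {
            this.id = id;
        }
    }

    // Writes all nodes first, then all edges as pairs of node IDs. No recursion involved.
    public static String save(Collection<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            sb.append("N ").append(n.id).append('\n');
        }
        for (Node n : nodes) {
            for (Node m : n.neighbours) {
                sb.append("E ").append(n.id).append(' ').append(m.id).append('\n');
            }
        }
        return sb.toString();
    }

    // Rebuilds the graph: nodes go into a map by ID, then edges are fixed up via the map.
    public static Map<Integer, Node> load(String data) {
        Map<Integer, Node> byId = new LinkedHashMap<>();
        for (String line : data.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split(" ");
            if (parts[0].equals("N")) {
                int id = Integer.parseInt(parts[1]);
                byId.put(id, new Node(id));
            } else if (parts[0].equals("E")) {
                Node from = byId.get(Integer.parseInt(parts[1]));
                Node to = byId.get(Integer.parseInt(parts[2]));
                from.neighbours.add(to);
            }
        }
        return byId;
    }

    public static void main(String[] args) {
        Node a = new Node(1);
        Node b = new Node(2);
        a.neighbours.add(b);
        b.neighbours.add(a); // a cycle, which is no problem for this format
        System.out.print(save(List.of(a, b)));
    }
}
```

Because every edge is stored as a pair of IDs, cycles and long chains cost nothing extra on the call stack, whichever concrete format (text, XML, JSON) ends up holding the two lists.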
You could use the JGraphT library, which supports serializing graphs into a text file in the GraphML format. See the GraphMLExporter Javadoc.
Java serialisation can cope with arbitrary graphs (although not necessarily very efficiently). Probably the problem lies with a custom implementation of writeObject. Perhaps a section of stack trace might help.
A useful serialization format you should consider is JSON, where dictionaries (as suggested by @Skizz) are easily represented:
A JSONObject is an unordered collection of name/value pairs. Its external form is a string wrapped in curly braces with colons between the names and values, and commas between the values and names. The internal form is an object having get() and opt() methods for accessing the values by name, and put() methods for adding or replacing values by name. The values can be any of these types: Boolean, JSONArray, JSONObject, Number, and String, or the JSONObject.NULL object.
Java serialization is capable of handling cyclic references (I assume this is what you mean by recursion), but there is a known problem with large graphs that is described here.
Don't let the date of the article throw you off, just follow chain of comments after it.
It seems you will have to use another serialization technique to accomplish this. Several have been mentioned, and some performance metrics give JSON high marks.
Hmmm. One solution would be to make it into a java bean and use XMLEncoder/XMLDecoder. This is a solution I've used in the past to save and load classes.
