Groovy: parse string with special symbols - java

I have this description that I get from the user:
sample description with special symbols >.
I want to turn this into a valid XML string to pass in my REST call. Currently, if I pass it as is, the third-party implementation throws an exception saying "it cannot handle any special symbols".
I have tried XmlParser and XmlSlurper, but both throw this exception:
[Fatal Error] :1:1: Content is not allowed in prolog.
Exception in thread "main" org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.

It needs to be escaped if you are sending it inside XML.
Approach 1:
Replace > with &gt; in the value.
Approach 2:
Put the string inside CDATA as shown below.
<![CDATA[sample description with special symbols >]]>
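Both approaches can be sketched in plain Java, which is also callable from Groovy. This is a minimal sketch, not a definitive implementation: the class name XmlText and its method names are made up for illustration. The CDATA variant also splits any "]]>" inside the text, since that sequence would otherwise close the section early.

```java
// Hypothetical helper; the class and method names are illustrative only.
class XmlText {

    /** Approach 1: replace the five XML special characters with entities. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Approach 2: wrap in CDATA, splitting any "]]>" so it cannot end the section early. */
    static String cdata(String raw) {
        return "<![CDATA[" + raw.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String description = "sample description with special symbols >";
        // The <description> element name is an assumption, not taken from the question.
        System.out.println("<description>" + escape(description) + "</description>");
        System.out.println("<description>" + cdata(description) + "</description>");
    }
}
```

Escaping is usually the safer default, because the result can be placed in element content or in attribute values, while CDATA only works in element content.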

Related

SAXParseException when generating a PDF from XSLT and XML

I am trying to generate a PDF from XML and XSLT.
[Fatal Error] :89:14: Invalid byte 1 of 1-byte UTF-8 sequence.
file:////; Line #89; Column #14; org.xml.sax.SAXParseException; systemId: file:////; lineNumber: 89; columnNumber: 14; Invalid byte 1 of 1-byte UTF-8 sequence.
severe:XSLT Transformation failed null
JBAS014134: EJB Invocation failed on component PDFGenerationBean for
method public abstract java.util.List
au.com.copl.dbaccesslayer.session.PDFGenerationRemote.getPDFs(java.util.List,java.util.List,java.lang.Integer,java.lang.Integer)
throws java.lang.Exception: javax.ejb.EJBException:
java.lang.RuntimeException: java.lang.NullPointerException
Line 89 is
Part 1 – contains information about us and the services we can provide
to you; and.
Actually, the – sign is causing the problem here. I have now removed it from this line, and the PDF was generated successfully.
The line now reads:
Part 1 contains information about us and the services we can provide
to you; and.
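Removing the character works, but an "Invalid byte ... UTF-8 sequence" error usually means the file is not actually UTF-8. A possible alternative, assuming the file was saved in a single-byte encoding such as windows-1252 (the thread does not say which encoding was used), is to decode the bytes explicitly and hand the parser a Reader. The class and method names below are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; names are illustrative only.
class CharsetAwareParse {

    /** Decodes the bytes with the given charset before the XML parser sees them. */
    static String rootText(byte[] xmlBytes, String charsetName) {
        try {
            Reader reader = new InputStreamReader(
                    new ByteArrayInputStream(xmlBytes), Charset.forName(charsetName));
            // With a character stream, the parser does not have to guess the byte encoding.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(reader));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // \u2013 is the en dash from line 89; windows-1252 is an assumed encoding.
        byte[] bytes = "<p>Part 1 \u2013 contains information about us</p>"
                .getBytes(Charset.forName("windows-1252"));
        System.out.println(rootText(bytes, "windows-1252"));
    }
}
```

For an XSLT transformation the same idea should apply, since javax.xml.transform.stream.StreamSource also accepts a Reader. Re-saving the file as real UTF-8 would fix the problem at the source.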

XML parsing tries to parse more lines than exist

I have a strange problem when parsing an XML request with JAXB: somehow it tries to parse more lines than exist in the string:
String xml; //content with 139 lines in xml format
MyReq req = JAXB.unmarshal(new StringReader(xml), MyReq.class);
Result:
Caused by: org.xml.sax.SAXParseException; lineNumber: 140; columnNumber: 1; Content is not allowed in trailing section.
What might be wrong with this? The line that supposedly contains the error does not exist...
I can copy the xml just as it is to soapUI and execute the request successfully, thus concluding the xml is valid in general.
You should check the XML content. Most of the time, the "Content is not allowed in trailing section" error means the content is not valid, probably because of some bad characters at the end of the stream.
Print the content of the XML between known delimiters to make sure that what you received is what you actually tested and expected, something like:
System.out.println("*"+xml+"*");
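Delimiters will not reveal characters that are invisible when printed, so it can help to dump the last few characters as code values too. Here is a rough sketch; the class, the method names and the sample input are made up, and cutAfterLastTag is a crude heuristic that assumes the stray content contains no '>' of its own.

```java
// Hypothetical helper; names are illustrative only.
class XmlTail {

    /** Returns the last n UTF-16 code units of s as labels such as "U+003E". */
    static String describeTail(String s, int n) {
        StringBuilder out = new StringBuilder();
        for (int i = Math.max(0, s.length() - n); i < s.length(); i++) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return out.toString();
    }

    /** Drops everything after the final '>' so stray trailing characters are removed. */
    static String cutAfterLastTag(String s) {
        int end = s.lastIndexOf('>');
        return end < 0 ? s : s.substring(0, end + 1);
    }

    public static void main(String[] args) {
        // Made-up input: a newline and a non-breaking space after the root element.
        String xml = "<req><id>1</id></req>\n\u00A0";
        System.out.println(describeTail(xml, 3));
        System.out.println("*" + cutAfterLastTag(xml) + "*");
    }
}
```

If the dump shows anything other than whitespace (space, tab, CR, LF) after the closing tag, that is most likely what the parser is complaining about.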

How to use ESC (&#27;) character in XML?

I'm trying this:
<root>
text: &#27;
</root>
But parser says:
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 12;
Character reference "&#27" is an invalid XML character.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
How to use it there?
You cannot (in XML 1.0). In XML 1.1 there is a slightly larger range of characters you can use, and the characters are expressed differently, but even then, &#27; is 'restricted' (in hex it is &#x1B;), which means it may appear only as a character reference, never as a literal character. Note that the 'null' character (&#0;) is never valid. Here's a Wiki article on these characters in XML.
You can try forcing the XML document to XML 1.1 and see if your parser will process it successfully. Set the first line of the XML to:
<?xml version="1.1"?>
In fact, I did that, and it works:
<?xml version="1.1"?>
<root>&#27;</root>
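To check this from Java, a small test along the following lines should do. It is a sketch that assumes the JDK's built-in parser, which supports XML 1.1; the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; names are illustrative only.
class EscInXml11 {

    /** Parses the document and returns the code of the first character of the root's text. */
    static int firstCharCode(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent().charAt(0);
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // The version="1.1" declaration is what makes the &#27; reference acceptable.
        String xml = "<?xml version=\"1.1\"?><root>&#27;</root>";
        System.out.println(firstCharCode(xml));
    }
}
```

Keep in mind that whoever consumes the document must also accept XML 1.1, which not every tool does.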

Native Java API for Checking Valid XML

How do you determine whether a Document object in Java contains valid XML? Is this checked when the object is constructed?
I can't seem to find any information on this in
http://docs.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Document.html
How do you determine whether you have a valid XML Document without using external libraries?
Note: I received this Document object by parsing from an input stream with a DOM XML parser.
Use the Java DOM API. It can parse any well-formed XML document, and a document that parses without an exception is at least well-formed. You need no external libraries for DOM.
In case of an error the exception message looks like this:
[Fatal Error] MyXMLFile.xml:6:2: The end-tag for element type "lastname" must end with a '>' delimiter.
The end-tag for element type "lastname" must end with a '>' delimiter.
BUILD SUCCESSFUL (total time: 0 seconds)
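A minimal sketch of such a check using only the JDK is below. The class and method names are made up. Note that it tests well-formedness only; checking against a DTD or schema would need extra configuration that is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; names are illustrative only.
class WellFormedCheck {

    /** Returns true if the text parses as XML, false if the parser reports an error. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing "[Fatal Error]" to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser could not be set up or input not read", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<person><lastname>Doe</lastname></person>"));
        // Same document with a broken end-tag, as in the error message above.
        System.out.println(isWellFormed("<person><lastname>Doe</lastname</person>"));
    }
}
```

Since the Document in the question was already produced by a DOM parser, it must have passed this check at parse time.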

About SAXParseException: Content is not allowed in prolog

I am using glassfish server and the following error keeps coming:
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
at com.sun.enterprise.deployment.io.DeploymentDescriptorFile.read(DeploymentDescriptorFile.java:304)
at com.sun.enterprise.deployment.io.DeploymentDescriptorFile.read(DeploymentDescriptorFile.java:226)
at com.sun.enterprise.deployment.archivist.Archivist.readStandardDeploymentDescriptor(Archivist.java:480)
at com.sun.enterprise.deployment.archivist.Archivist.readDeploymentDescriptors(Archivist.java:305)
at com.sun.enterprise.deployment.archivist.Archivist.open(Archivist.java:213)
at com.sun.enterprise.deployment.archivist.ApplicationArchivist.openArchive(ApplicationArchivist.java:813)
at com.sun.enterprise.instance.WebModulesManager.getDescriptor(WebModulesManager.java:395)
... 65 more
Check this link
http://mark.koli.ch/2009/02/resolving-orgxmlsaxsaxparseexception-content-is-not-allowed-in-prolog.html
In short, the XML file starts with the three-byte pattern 0xEF 0xBB 0xBF (right before <?xml ...?>), which is the UTF-8 byte order mark (BOM). Java's default XML parser can't handle this case.
The quick and dirty solution is to strip everything in front of the first < in the XML:
String xml = "<?xml ...";
xml = xml.replaceFirst("^([\\W]+)<","<");
Note that String.trim() is not enough, since it only trims a limited set of whitespace characters (code points up to U+0020).
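A narrower alternative to the regex, assuming the BOM has already been decoded into the string as the single character U+FEFF, is to drop just that character. The class and method names below are made up for illustration.

```java
// Hypothetical helper; names are illustrative only.
class BomStripper {

    /** Removes a single leading byte order mark (U+FEFF) if present. */
    static String stripBom(String xml) {
        return xml.startsWith("\uFEFF") ? xml.substring(1) : xml;
    }

    public static void main(String[] args) {
        String xml = "\uFEFF<?xml version=\"1.0\"?><root/>";
        // trim() leaves the BOM in place because U+FEFF is above U+0020.
        System.out.println(xml.trim().startsWith("<"));
        System.out.println(stripBom(xml).startsWith("<"));
    }
}
```

Unlike the regex above, this leaves any other leading characters untouched, so a genuinely malformed file still fails loudly instead of being silently repaired.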
