How to stop Jackson from parsing an element? - java

I have a XML Document where there are nested tags that should not be interpreted as XML tags
For example something like this
<something>cbaabc</something> should be parsed as a plain String "cbaabc" (it should be mentioned that the document has other elements as well that get parsed just fine). Jackson tho tries to interpret it as an Object and I don't know how to prevent this. I tried using #JacksonXmlText, turning off wrapping and a custom Deserializer, but I didn't get it to work.

The <a should be translated to <a. This back and forth conversion normally happens with every XML API, setting and getting text will use those entities &...;.
An other option is to use an additional CDATA section: <![CDATA[ ... ]]>.
<something><![CDATA[cbaabc]]></something>
If you cannot correct that, and have to live with an already corrupted XML text, you must do your own hack:
Load the wrong XML in a String
Repair the XML
Pass the XML string to jackson
Repairing:
String xml = ...
xml = xml.replaceAll("<(/?a\\b[^>]*)>", "<$1>"); // Links
StringReader in = new StringReader(xml);

Related

How to convert nested tags in dynamic XML into a for loop?

I want to convert dynamic xml file into a specific file format. i could able to parse the xml using jsoup parser but the problem is I want to parse the nested tags and put it into a for-loop.Is there any way to do it. Attaching the sample below for reference
Input XML(sample)
<lineComponents>
<invoiceComponents>
<invoiceComponent>
<type></type>
<name></name>
<amount>16.00</amount>
<taxPercentage>0.00</taxPercentage>
<taxAmount>0E-8</taxAmount>
</invoiceComponent>
</invoiceComponents>
<acctComponents>
<acctComponent>
<componentCode>BASE</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>10.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>10.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode></componentCode>
<glAccountNr></glAccountNr>
<baseAmount>3.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>3.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode>DISC</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>-2.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>-2.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode>ARPIT</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>5.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>5.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
</acctComponents>
</lineComponents>
Expected output:
for(OrderItem invoiceLineItem: orderLineWrp.invoiceLineItems){
Dom.XMLNode invoiceComponentNode = invoiceComponentsNode.addChildElement(EP_OrderConstant.invoiceComponent,null,null);
invoiceComponentNode.addChildElement(EP_OrderConstant.seqId,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_SeqId__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.TYPE,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_ChargeType__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.name,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Invoice_Name__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.amount,null,null).addTextNode(getValueforNode(invoiceLineItem.UnitPrice)); //Value for amount
invoiceComponentNode.addChildElement(EP_OrderConstant.taxPercentage,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Tax_Percentage__c)); //Value for taxPercentage
invoiceComponentNode.addChildElement(EP_OrderConstant.taxAmount,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Tax_Amount_XML__c)); //Value for taxAmount
}
This Xml file is dynamic. Is there any way to handle dynamic XML file into a specific format like above?
Jsoup is rather for HTML parsing.
If you have XSD/DTD to your XML, you should use JAXB-generated classes and an unmarshaller to read it.
Otherwise you can use JAXP (DOMParser, if the file is small, and XPath, or event based SAXParser(however this is not so easy to use) for really large XML files).

Parsing string to extract XML

Lets say I have the following String:
abc: def
qxy
<?xml version='1.0'><xyz>
...
</xyz>
other text
<?xml version='1.0'><www>
...
</www>
more text
Is there a way to parse this? I am currently trying with an XMLStreamReader and it throws a parsing error: javax.xml.stream.XMLStreamException: ParseError. If I remove all the test and just try to parse one of the XML sections (like only xyz) then it works perfectly.
You have to filter out the xml part. No general purpose XMLStreamReader will do it for you since they have no idea where your document starts or ends. You may craft your own specialized version that can filter the input, but other implementations expect a full xml document only.

Convert DFDL to XML

I'm trying to parse a web service response message in the following format (message tree):
Message
Properties
Properties..[]
DFDL
ObjectIWantUnmarshalled
AllItsDataIwant[]
And unmarshal the "ObjectIWantUnmarshalled". However, this data is in DFDL format.
In my request, I use the following line in order to format from XML to DFDL:
Document outDocument = outMessage.createDOMDocument(MbDFDL.PARSER_NAME);
But there doesn't seem to be a way to to the opposite, of DFDL to XML.
I have tried:
Document outDocument = inMessage.createDOMDocument(MbXMLNSC.PARSER_NAME);
As well as other attempts to simply unmarshal the data directly from the MbMessage:
jaxbContext_COBOL.createUnmarshaller().unmarshal(inMessage.getDOMDocument())
But I have not been able to get a Document node this way, or any other way, it is always null.
Probably a lot too late, but you were going about this the wrong way.
When using WMB and IIB you should use the built-in XML support - not the javax.XML.* class library. So instead of using the JAXB unmarshaller, you should
create an XMLNSC tree under the output message root
copy the input DFDL message tree to the output XMLNSC message tree ( one line )
...and the message flow will serialize ( unmarshall ) the tree as XML whenever it needs to - when it encounters an output node, or when you call outMessage.toBitstream().

Invalid character while converting from JSON to XML using jsonlib

I'm trying to convert a JSON string to XML using jsonlib in Java.
JSONObject json = JSONObject.fromObject(jsonString);
XMLSerializer serializer = new XMLSerializer();
String xml = serializer.write( json );
System.out.println(xml);
The error that I get is
nu.xom.IllegalNameException: 0x24 is not a legal NCName character
The problem here is that I have some properties in my JSON that are invalid XML characters. eg. I have a property named "$t". The XMLSerializer throws the exception while trying to create a XML tag in this name because $ is not allowed in XML tag names. Is there any way in which I can override this XML well formedness check done by the serializer?
First I'd suggest to add the language you are using (it is Java, right?).
You could override the method where it checks your XML tag name to do nothing.
I took a look at the spec for the json-lib XMLSerializer and to my surprise it seems to have no option for serialising a JSON object whose keys are not valid XML names. If that's the case then I think you will need to find a different library.
You could loop over json.keySet (recursively if necessary) and replace invalid keys with valid ones (using remove and add).

How to determine whether a given string is an .xml file

I have an issue that I get some some response as a String.
This String could be a normal string,number etc.. or an .xml file.
Now ,when I get an xml file, I want to treat it differently.
I am not able to distinguish between a string or an .xml file.
Also, this xml file could have some syntatic error.
Please suggest , how do I go ahead
Code is like this:
Document document = reader.read(new StringReader(xml));
where xml can be a string or an xml file itself.
If xml is a string , it is fine but if it is an xml file and with some syntax error then it should throw exception
If it is a proper XML document it should begin with a XML declaration. If that's there, it's intended to be a conforming XML document. If that's not there it cannot be a conforming XML document.
If you are using a coding language like C#, then you can use - XmlDocument.loadxml -
http://msdn.microsoft.com/en-us/library/system.xml.xmldocument.loadxml.aspx
This will throw error if the string is not in correct xml format.

Categories

Resources