I'm trying to parse a web service response message in the following format (message tree):
Message
  Properties
    Properties..[]
  DFDL
    ObjectIWantUnmarshalled
      AllItsDataIwant[]
I want to unmarshal the "ObjectIWantUnmarshalled" element. However, this data is in DFDL format.
In my request, I use the following line in order to format from XML to DFDL:
Document outDocument = outMessage.createDOMDocument(MbDFDL.PARSER_NAME);
But there doesn't seem to be a way to do the opposite, from DFDL to XML.
I have tried:
Document outDocument = inMessage.createDOMDocument(MbXMLNSC.PARSER_NAME);
As well as other attempts to simply unmarshal the data directly from the MbMessage:
jaxbContext_COBOL.createUnmarshaller().unmarshal(inMessage.getDOMDocument())
But I have not been able to get a Document node this way or any other way; it is always null.
Probably a lot too late, but you were going about this the wrong way.
When using WMB and IIB you should use the built-in XML support, not the javax.xml.* class library. So instead of using the JAXB unmarshaller, you should:
1. Create an XMLNSC tree under the output message root.
2. Copy the input DFDL message tree to the output XMLNSC message tree (one line).
The message flow will then serialize (marshal) the tree as XML whenever it needs to: when it encounters an output node, or when you call outMessage.toBitstream().
I have an XML document containing nested tags that should not be interpreted as XML tags.
For example, something like this:
<something>cbaabc</something> should be parsed as a plain String "cbaabc" (the document has other elements as well, and those get parsed just fine). Jackson, though, tries to interpret it as an Object, and I don't know how to prevent this. I tried using @JacksonXmlText, turning off wrapping, and a custom Deserializer, but I couldn't get it to work.
The <a should be translated to &lt;a. This back-and-forth conversion normally happens in every XML API: setting and getting text uses those &...; entities.
Another option is to use an additional CDATA section: <![CDATA[ ... ]]>.
<something><![CDATA[cbaabc]]></something>
If you cannot correct that, and have to live with an already corrupted XML text, you must do your own hack:
1. Load the broken XML into a String
2. Repair the XML
3. Pass the XML string to Jackson
Repairing:
String xml = ...
xml = xml.replaceAll("<(/?a\\b[^>]*)>", "&lt;$1&gt;"); // Escape the <a ...> and </a> link tags
StringReader in = new StringReader(xml);
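The repair step can be tried out end to end with nothing but the JDK. The following is a minimal sketch, not the poster's actual code: the class name LinkEscaper, the helper method names, and the sample document (including the href value) are made up for illustration, and the JAXP DOM parser stands in for Jackson simply to show that the escaped links come back as plain text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: escapes stray <a ...> / </a> tags so they survive as text.
final class LinkEscaper {

    // Escape only <a ...> and </a>; every other tag is left untouched.
    static String escapeLinks(String xml) {
        return xml.replaceAll("<(/?a\\b[^>]*)>", "&lt;$1&gt;");
    }

    // Parse the repaired text and return the text content of the first <something>.
    static String somethingText(String brokenXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(escapeLinks(brokenXml))));
        return doc.getElementsByTagName("something").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input; the real document is not shown in the question.
        String broken = "<root><something>cba<a href=\"x\">link</a>abc</something>"
                + "<other>1</other></root>";
        System.out.println(somethingText(broken));
    }
}
```

The word boundary in the pattern keeps elements whose names merely start with "a" from being escaped by accident.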
I'm calling a SOAP web service from my Java application.
I get a response, and I want to parse it and extract data from it.
The problem is that the field <tranData> contains a structure with &lt; and &gt; instead of < and >. How can I parse this document to get data from the field <tranData>?
This is the response structure:
<response>
<Portfolio>
<ID>1</ID>
<holder>2</holder>
</Portfolio>
<tranData>&lt;responseOne&gt;&lt;header&gt;&lt;code&gt;1&lt;/code&gt;&lt;/header&gt;&lt;/responseOne&gt;</tranData>
</response>
Please remember that this is only an example of the response; the amount of data will be much bigger, so the solution should be fast.
What you show us is the actual document as it is received over the wire, right? So <tranData> contains an XML string that has been escaped to not interfere with the markup of the rest of the containing document.
When you read the content of the <tranData> element, the XML processor will 'unescape' the string and give you the 'original' value:
<responseOne><header><code>1</code></header></responseOne>
What you do with that value is a different story. You can parse it as yet another XML document and retrieve the value of the <code> element, or just pass the string along to some other processing step.
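As a rough illustration of the "parse it as yet another XML document" option, here is a minimal sketch using the JDK's DOM parser. The class and method names (TranDataReader, parse, innerCode) are invented for this example, and DOM is only one possible choice; for very large responses a streaming parser may be a better fit.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: reads the escaped payload of <tranData> and parses it again.
final class TranDataReader {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the <code> element nested inside the escaped payload.
    static String innerCode(String responseXml) throws Exception {
        Document outer = parse(responseXml);
        // getTextContent() hands back the unescaped string, i.e. real markup.
        String payload = outer.getElementsByTagName("tranData").item(0).getTextContent();
        Document inner = parse(payload.trim());
        return inner.getElementsByTagName("code").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String response = "<response><Portfolio><ID>1</ID><holder>2</holder></Portfolio>"
                + "<tranData>&lt;responseOne&gt;&lt;header&gt;&lt;code&gt;1&lt;/code&gt;"
                + "&lt;/header&gt;&lt;/responseOne&gt;</tranData></response>";
        System.out.println(innerCode(response)); // prints 1
    }
}
```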
I want to convert a dynamic XML file into a specific file format. I was able to parse the XML using the jsoup parser, but the problem is that I want to parse the nested tags and emit them in a for-loop. Is there any way to do it? A sample is attached below for reference.
Input XML (sample):
<lineComponents>
<invoiceComponents>
<invoiceComponent>
<type></type>
<name></name>
<amount>16.00</amount>
<taxPercentage>0.00</taxPercentage>
<taxAmount>0E-8</taxAmount>
</invoiceComponent>
</invoiceComponents>
<acctComponents>
<acctComponent>
<componentCode>BASE</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>10.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>10.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode></componentCode>
<glAccountNr></glAccountNr>
<baseAmount>3.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>3.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode>DISC</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>-2.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>-2.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
<acctComponent>
<componentCode>ARPIT</componentCode>
<glAccountNr></glAccountNr>
<baseAmount>5.00000</baseAmount>
<taxRate>0.00</taxRate>
<taxAmount>0.00000</taxAmount>
<totalAmount>5.00000</totalAmount>
<isVAT>No</isVAT>
</acctComponent>
</acctComponents>
</lineComponents>
Expected output:
for(OrderItem invoiceLineItem: orderLineWrp.invoiceLineItems){
Dom.XMLNode invoiceComponentNode = invoiceComponentsNode.addChildElement(EP_OrderConstant.invoiceComponent,null,null);
invoiceComponentNode.addChildElement(EP_OrderConstant.seqId,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_SeqId__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.TYPE,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_ChargeType__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.name,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Invoice_Name__c));
invoiceComponentNode.addChildElement(EP_OrderConstant.amount,null,null).addTextNode(getValueforNode(invoiceLineItem.UnitPrice)); //Value for amount
invoiceComponentNode.addChildElement(EP_OrderConstant.taxPercentage,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Tax_Percentage__c)); //Value for taxPercentage
invoiceComponentNode.addChildElement(EP_OrderConstant.taxAmount,null,null).addTextNode(getValueforNode(invoiceLineItem.EP_Tax_Amount_XML__c)); //Value for taxAmount
}
This XML file is dynamic. Is there any way to turn a dynamic XML file into a specific format like the one above?
Jsoup is intended for HTML parsing rather than XML.
If you have an XSD or DTD for your XML, you should use JAXB-generated classes and an unmarshaller to read it.
Otherwise you can use JAXP: a DOM parser combined with XPath if the file is small, or the event-based SAXParser (which is not as easy to use) for really large XML files.
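For the small-file case, the DOM plus XPath route might look roughly like the sketch below. It assumes the element names from the sample input (acctComponent, componentCode, totalAmount); the class name AcctComponentLister and the "code=total" output format are invented here purely for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: loops over every <acctComponent> and collects two of its fields.
final class AcctComponentLister {

    static List<String> list(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select all acctComponent elements, wherever they are nested.
        NodeList components =
                (NodeList) xpath.evaluate("//acctComponent", doc, XPathConstants.NODESET);

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < components.getLength(); i++) {
            Node component = components.item(i);
            // Relative XPath expressions are evaluated against the current component.
            String code = xpath.evaluate("componentCode", component);
            String total = xpath.evaluate("totalAmount", component);
            rows.add(code + "=" + total);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<lineComponents><acctComponents>"
                + "<acctComponent><componentCode>BASE</componentCode>"
                + "<totalAmount>10.00000</totalAmount></acctComponent>"
                + "<acctComponent><componentCode>DISC</componentCode>"
                + "<totalAmount>-2.00000</totalAmount></acctComponent>"
                + "</acctComponents></lineComponents>";
        list(xml).forEach(System.out::println);
    }
}
```

An empty element such as <componentCode></componentCode> simply yields an empty string, so the loop needs no special handling for missing values.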
I want to map different XML elements to the proper DTOs generated from XSD files (which are themselves generated from RNG schemas). These XML elements must be read from a socket, and the data arrives continuously. The server never closes the stream, and I receive no signal that a new element is arriving. The elements on the stream are complete: the next one appears only after the previous one has been fully sent.
This is part of the data that I am reading from the socket:
<element1 prop1="value">
<otherTag foo="baaaaaar"/>
</element1>
<otherElement prop3="value" prop4="value">
<otherTag3 foo="baaaaaar">
<aaa bbb="ccc1"/>
<aaa bbb="ccc2"/>
<aaa bbb="ccc3"/>
</otherTag3>
</otherElement>
...and other elements...
I tried to parse it with StAX (I count START_ELEMENT and END_ELEMENT events to determine whether I have received the whole element), build a string, and then send it to JAXB for mapping. I realised that this is not a good approach, because it parses everything twice (StAX and then JAXB). In any case, I have to specify the class to which JAXB should map the received XML element.
Is there a better way to do this?
I have a Python script and Java test running side-by-side. They both attempt to do the exact same thing - open a socket, receive a never-ending stream of XML, and parse the XML as it is received. The Python script is using Expat, while the Java test is using XMLStreamReader and an Unmarshaller.
The Python script is always one step/object ahead of the Java test, e.g. when I have enough XML to unmarshal an object, the Python script immediately does so, while the Java unmarshaller only BEGINS the unmarshalling, and WAITS for the beginning of the next XML start tag to stream in before returning the previously unmarshalled object. If I receive XML objects 20 seconds apart, without fail the Java unmarshaller will NOT return until the next is received.
1. XML Received
2. Python and Java start unmarshalling
3. Python returns immediately
4. New XML Received
5. Java unmarshaller returns
6. Back to step 2
XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(socket.getInputStream());
while (reader.hasNext()) {
// Unmarshal here -- hangs until next XML comes in
}
The XMLStreamReader is com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl
This issue seems to also describe it fairly well: http://java.net/jira/browse/JAXB-419
The 2.1.10 classes referred to in that issue appear to be:
UnmarshallerImpl
StAXStreamConnector
The unmarshaller wants to position the "cursor" on the event following the end element event of the portion it has unmarshalled. Therefore, it "hangs" until something becomes available. From the API doc:
This method assumes that the parser is on a START_DOCUMENT or
START_ELEMENT event. Unmarshalling will be done from this start event
to the corresponding end event. If this method returns successfully,
the reader will be pointing at the token right after the end event.
However, any XML event will do: maybe you can do some trick by inserting an XML comment into the stream...
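To make the cursor behaviour concrete, the sketch below tracks element depth directly on an XMLStreamReader and stops on the END_ELEMENT event itself, without asking for the token after it. This is only an illustration of where an element ends, not a replacement for the unmarshaller. It assumes the stream is wrapped in a single root element (called stream here, a made-up name), and the class ElementBoundaryTracker is likewise hypothetical. A real parser may still buffer input internally, so this does not by itself guarantee non-blocking reads on a socket.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reports each direct child of the root as soon as it closes.
final class ElementBoundaryTracker {

    // Reads events until `wanted` direct children of the root have been closed.
    // The reader is left ON the last END_ELEMENT, not on the token after it.
    static List<String> completedChildren(XMLStreamReader reader, int wanted) throws Exception {
        List<String> done = new ArrayList<>();
        int depth = 0;
        while (done.size() < wanted && reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
                if (depth == 1) {
                    // Back at root level: a direct child has just been completed.
                    done.add(reader.getLocalName());
                }
            }
        }
        return done;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample; a live socket stream would never send the closing </stream>.
        String xml = "<stream><element1 prop1=\"value\"><otherTag foo=\"bar\"/></element1>"
                + "<otherElement><otherTag3><aaa/></otherTag3></otherElement></stream>";
        XMLStreamReader reader =
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
        System.out.println(completedChildren(reader, 2));
    }
}
```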