I want to skip a node that contains an error. I am using SAXParser.
Example XML:
<file>
<person>
<id>1
<name>Jhon</name>
</person>
<person>
<id>2</id>
<name>Julia</name>
</person>
</file>
I use:
SAXParserFactory fact= SAXParserFactory.newInstance();
SAXParser parser= fact.newSAXParser();
MyHandler handler = new MyHandler ();
parser.parse(new File(path), handler);
Example of handler :
public class MyHandler extends org.xml.sax.helpers.DefaultHandler
{
private String message = "";
@Override
public void fatalError(final SAXParseException e)
{
message += "Error : " + e.getMessage();
}
}
I want to skip the person with id 1, whose <id> element has no closing </id> tag,
continue with person 2, and just save the error message.
There are parsers such as TagSoup and validator.nu that attempt to parse bad XML. Whether they succeed depends on just how bad the XML is.
And of course they have to guess what the "correct" XML was meant to be. In your example, the XML can be made well-formed by adding an </id> end tag anywhere before the </person> tag, so the repair may not be the one you would have liked.
You say you want to skip invalid records, but I think the philosophy of these products is to try to repair them rather than skipping them.
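If skipping really is the goal, one workaround is to cut the raw text into records before parsing and parse each record on its own, so a malformed record only fails its own parse. This is a minimal sketch, not a general solution: it assumes <person> elements are never nested, carry no attributes, always have their own </person> end tag, and that each well-formed record has a <name>. The class name RecordSkipper is made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: parses each <person> record separately.
class RecordSkipper {
    // Assumption: records are never nested and have no attributes.
    private static final Pattern PERSON =
            Pattern.compile("<person>.*?</person>", Pattern.DOTALL);

    final List<String> names = new ArrayList<>();   // names from well-formed records
    final List<String> errors = new ArrayList<>();  // one message per skipped record

    void parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Matcher m = PERSON.matcher(xml);
            while (m.find()) {
                // A fresh builder per record, so a failed parse cannot affect the next one.
                DocumentBuilder builder = factory.newDocumentBuilder();
                // DefaultHandler rethrows fatal errors instead of printing them to stderr.
                builder.setErrorHandler(new DefaultHandler());
                try {
                    Document doc = builder.parse(new InputSource(new StringReader(m.group())));
                    names.add(doc.getElementsByTagName("name").item(0).getTextContent());
                } catch (SAXException e) {
                    errors.add("Error : " + e.getMessage()); // skip this record only
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the example file, the first record fails to parse and is recorded in errors, while the second record is read normally. If the end tag that goes missing is </person> itself, the regular expression will merge two records, so this only helps for damage inside a record.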
I am new to StAX and XStream. I am trying to unmarshal some common elements from a huge XML stream (there might be between 1.5 million and 2.5 million elements to unmarshal).
I have tried to use StAX to parse the stream up to an element of interest and then call XStream to unmarshal the XML up to the matching end element.
XMLStreamReader reader = xmlInputFactory.createXMLStreamReader(fis);
while (reader.hasNext()) {
if (reader.isStartElement() && reader.getLocalName().toLowerCase().equals("person")) {
break;
}
reader.next();
}
StaxDriver sd = new StaxDriver();
AbstractPullReader rd = sd.createStaxReader(reader);
XStream xstream = new XStream(sd);
xstream.registerConverter(new PersonConverter());
Person p = (Person) xstream.unmarshal(rd);
I create a test input
<Persons>
<Person>
<name>A</name>
</Person>
<Person>
<name>B</name>
</Person>
<Person>
<name>C</name>
</Person>
</Persons>
There are two problems with this. First, my converter is not called. Second, I get a CannotResolveClassException for the element "name" in Person, and XStream doesn't create my Person object.
What did I miss in my code?
When you instantiate an AbstractPullReader it will read the first open-element event from the stream, establishing the "root" element. Because you've already read the first Person event it will advance to the next one (name), which it doesn't know how to unmarshal.
You'll have to do two things to make your example work:
First, alias the element name Person to your Java class:
xstream.alias("Person", Person.class);
Second, only advance the StAX cursor up to the element before the one you want to read:
while (reader.hasNext()) {
if (reader.isStartElement() && reader.getLocalName().equals("Persons")) {
break;
}
reader.next();
}
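The cursor behaviour described above can be seen with plain StAX, without XStream. The sketch below only illustrates the idea that a reader which consumes the current start tag as its root will see the following start tag next; it does not reproduce XStream's internals, and the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of XStream.
class CursorDemo {
    /** Stops on the start tag named stopAt, then returns the name of the next start tag. */
    static String nextStartAfter(String xml, String stopAt) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // Same loop as in the question: leave the cursor ON the start tag.
            while (reader.hasNext()) {
                if (reader.isStartElement() && reader.getLocalName().equals(stopAt)) {
                    break;
                }
                reader.next();
            }
            // A consumer that treats the current tag as its root moves on to the next one.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null; // no further start tag
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Stopping on Person means the next start tag is name, which matches the CannotResolveClassException in the question. Stopping on Persons means the next start tag is Person.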
Hi, I am writing unmarshalling code in Java for an XML response string. Below are the unmarshalling code, the error I received, and the XML response.
Please help me fix the issue and explain what the problem is.
JAXBContext jaxbContext = JAXBContext.newInstance(Customer.class);
javax.xml.bind.Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Customer customer = (Customer) jaxbUnmarshaller.unmarshal(new StreamSource(new StringReader(response.toString() ) ) );
System.out.println(customer.getNAME());
Exception in thread "main" javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"response"). Expected elements are <{}customer>
<?xml version="1.0" encoding="UTF-8"?>
<response>
<control>
<status>success</status>
<senderid>XXXX</senderid>
<controlid>ControlIdHere</controlid>
<uniqueid>false</uniqueid>
<dtdversion>3.0</dtdversion>
</control>
<operation>
<authentication>
<status>XXXX</status>
<userid>XXXX</userid>
<companyid>XXXXXX</companyid>
<sessiontimestamp>2014-08-12T03:49:00-07:00</sessiontimestamp>
</authentication>
<result>
<status>success</status>
<function>readByQuery</function>
<controlid>testControlId</controlid>
<data listtype="customer" count="26" totalcount="26" numremaining="0">
<customer>
<RECORDNO>15</RECORDNO>
<CUSTOMERID>RIC001</CUSTOMERID>
<NAME>XYZ</NAME>
<ENTITY>CRIC001</ENTITY>
<PARENTKEY></PARENTKEY>
<PARENTID></PARENTID>
<PARENTNAME></PARENTNAME>
</customer>
<customer>
<RECORDNO>15</RECORDNO>
<CUSTOMERID>RIC001</CUSTOMERID>
<NAME>BBB</NAME>
<ENTITY>CRIC001</ENTITY>
<PARENTKEY></PARENTKEY>
<PARENTID></PARENTID>
<PARENTNAME></PARENTNAME>
</customer>
</data>
</result>
</operation>
The problem is that you are telling the unmarshaller that you want a Customer object and will give it an XML string representing a Customer, but you are passing it an XML string that represents a Response object. If you have a Response class, use it to create the JAXBContext instance. If not, extract the string representing the Customer object from the response
<customer>
<name>ABC</name>
<country>India<country>
</customer>
and use it with the unmarshaller.
== Update ==
Assuming you do not have a Response or a Data class, you can use code similar to the following:
XMLInputFactory xif = XMLInputFactory.newInstance();
StreamSource xml = new StreamSource(new StringReader(response.toString()));
XMLStreamReader xsr = xif.createXMLStreamReader(xml);
// Create the context and unmarshaller once, outside the loop
JAXBContext jc = JAXBContext.newInstance(Customer.class);
Unmarshaller unmarshaller = jc.createUnmarshaller();
List<Customer> customers = new ArrayList<Customer>();
// Advance to each "customer" element
while (xsr.hasNext()) {
if (xsr.isStartElement() && "customer".equals(xsr.getLocalName())) {
// Unmarshal to Customer; afterwards the reader points just past </customer>
Customer customer = unmarshaller.unmarshal(xsr, Customer.class).getValue();
customers.add(customer);
} else {
// Only advance when nothing was unmarshalled, so no start tag is skipped
xsr.next();
}
}
You're currently trying to turn the XML, which is opened with a <response> tag, into the Customer object.
You need to provide the element specifically to the JAXBUnmarshaller for this to work. For example:
<customer>
<name>ABC</name>
<country>India<country>
</customer>
The XML string is not valid XML (closing tag missing), but I assume that is a mistake when posting the question?
It looks like JAXB isn't expecting the <response> root element when unmarshalling to a Customer object. What does the Customer class look like?
See this question for how to walk through the XML until you reach the customer element. From there you can unmarshal the XML:
How to get a particular element through JAXB xml parsing?
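For illustration, walking to the customer elements can be sketched with StAX alone, without JAXB. This version only collects the text of each <NAME> inside a <customer>; the class name CustomerNames is made up, and a real solution would hand each element to the unmarshaller instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reads <NAME> values from <customer> elements at any depth.
class CustomerNames {
    static List<String> read(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader xsr =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            boolean inCustomer = false;
            while (xsr.hasNext()) {
                int event = xsr.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    if ("customer".equals(xsr.getLocalName())) {
                        inCustomer = true;
                    } else if (inCustomer && "NAME".equals(xsr.getLocalName())) {
                        // getElementText() leaves the cursor on the matching end tag.
                        names.add(xsr.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "customer".equals(xsr.getLocalName())) {
                    inCustomer = false;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Because the wrapping <response>, <operation> and <data> elements are simply skipped, no Response or Data class is needed for this approach.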
The Java class must be declared with
@XmlRootElement
class Response
The problem: when I query an XML document with namespaces through XPath, it only works partially. For example, the XPath /SOAP-ENV:Envelope/SOAP-ENV:Body is resolved correctly, but if I go any deeper the parser fails without throwing an exception.
This is the XML:
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:m0="http://schemas.compassplus.com/two/1.0/fimi.xsd"
xmlns:m="http://schemas.compassplus.com/two/1.0/fimi.wsdl">
<SOAP-ENV:Body>
<m:GetAcctInfoRp xmlns:m="http://schemas.compassplus.com/two/1.0/fimi.xsd">
<Response Response="1" Product="FIMI" Ver="0"><m0:Avail>995526.4</m0:Avail>
<m0:Bonus>0</m0:Bonus>
<m0:Branch>1</m0:Branch>
<m0:Cards><m0:Row>
<m0:PAN>6706250002450356</m0:PAN>
<m0:MBR>0</m0:MBR>
<m0:CardUID>E4BFBC24A2844F13BE5C5AEEB15D27CE</m0:CardUID>
<m0:Status>1</m0:Status>
</m0:Row>
<m0:Row>
<m0:PAN>6706255660781224</m0:PAN>
<m0:MBR>0</m0:MBR>
<m0:CardUID>971111C18D774C3BA26434336CB57475</m0:CardUID>
<m0:Status>1</m0:Status>
</m0:Row>
</m0:Cards>
<m0:CreditHold>0</m0:CreditHold>
<m0:Currency>810</m0:Currency>
<m0:DebitHold>50240.81</m0:DebitHold>
<m0:DropTmpOverOnRefresh>0</m0:DropTmpOverOnRefresh>
<m0:FoundAccount>40817810200001058114</m0:FoundAccount>
<m0:FoundAccountUID>E79459BEEEF94BEFBA57D0D23503EF7E</m0:FoundAccountUID>
<m0:LastDepAmount>10</m0:LastDepAmount>
<m0:LastDepTime>2012-03-06T17:23:35</m0:LastDepTime>
<m0:LastRefreshTime>2012-02-22T12:49:47</m0:LastRefreshTime>
<m0:LastTranId>4172</m0:LastTranId>
<m0:LastWdlAmount>100</m0:LastWdlAmount>
<m0:LastWdlTime>2012-03-06T17:29:44</m0:LastWdlTime>
<m0:Ledger>1045767.21</m0:Ledger>
<m0:MaskBalances>0</m0:MaskBalances>
<m0:Overdraft>9000000</m0:Overdraft>
<m0:PersonExtId>1891</m0:PersonExtId>
<m0:PersonFIO>bla bla bla</m0:PersonFIO>
<m0:PersonId>782</m0:PersonId>
<m0:Remain>1046068.21</m0:Remain>
<m0:Status>3</m0:Status>
<m0:TmpOverdraft>0</m0:TmpOverdraft>
<m0:Type>1</m0:Type>
<m0:UserFields><m0:Row>
<m0:Name>test</m0:Name>
<m0:TextValue>тест</m0:TextValue>
</m0:Row>
</m0:UserFields>
</Response>
</m:GetAcctInfoRp>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
Java code that should parse this XML file:
String res = RequestMethods.executeXMLPostRequest(url, xml);
Document doc = DocumentHelper.parseText(res.trim());
String session = doc.valueOf("SOAP-ENV:Envelope/SOAP-ENV:Body/m:GetAcctInfoRp/Response/m0:Avail");
PS: Sorry for my English. Has anyone solved this problem already?
Please help.
The first issue is solved. Another question: how can I handle a repeated structure, for example if there are several SOAP-ENV:Envelope/SOAP-ENV:Body//m0:Avail elements?
This works for me
public static void main(String[] args) throws DocumentException {
String xmlText = getContents(new File("/home/bpgergo/Temp/9682103.xml"));
Document doc = DocumentHelper.parseText(xmlText);
String session = doc.valueOf("SOAP-ENV:Envelope/SOAP-ENV:Body//m0:Avail");
System.out.println("session:"+session);
}
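For the follow-up about several m0:Avail elements, valueOf returns a single string, so a node-set query is needed. The sketch below uses the JDK's javax.xml.xpath API rather than dom4j, so it is a swapped-in technique and not the code from the answer above. The class name AvailReader is made up, and the prefix-to-URI mapping is copied from the XML in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the text of every m0:Avail under the SOAP body.
class AvailReader {
    static List<String> readAll(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // without this, prefixed XPath steps match nothing
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override public String getNamespaceURI(String prefix) {
                    if ("SOAP-ENV".equals(prefix)) {
                        return "http://schemas.xmlsoap.org/soap/envelope/";
                    }
                    if ("m0".equals(prefix)) {
                        return "http://schemas.compassplus.com/two/1.0/fimi.xsd";
                    }
                    return XMLConstants.NULL_NS_URI;
                }
                // Reverse lookups are not needed for evaluating this expression.
                @Override public String getPrefix(String uri) { return null; }
                @Override public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(
                    "/SOAP-ENV:Envelope/SOAP-ENV:Body//m0:Avail", doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The result is one list entry per matching element, in document order, which can then be iterated in an ordinary loop.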
I want to parse XML elements using Java. I have succeeded in part, but I am not sure how to do the rest. My XML is:
<MainTag>
<userid>user1</userid>
<country>US</country>
<city>LA</city>
<phone>
<number>1111111111</number>
</phone>
<phone>
<number>222222222</number>
</phone>
</MainTag>
<MainTag>
<userid>user2</userid>
<country>Aus</country>
<city>MB</city>
<phone>
<number>23233</number>
</phone>
<phone>
<number>8787822</number>
</phone>
<phone>
<number>10101</number>
</phone>
I am able to parse XML elements such as country and city as shown below.
public void endelement()
{
if (someText.equalsIgnoreCase("country"))
{
pojo.setCountry(Val);
}
else if(someText.equalsIgnoreCase("city"))
{
pojo.setCity(Val);
}
}
public void startelement()
{
............
}
But how can I parse the phone elements, since one user has multiple phone numbers?
I want to find all the phone numbers for a particular user.
For example, in the XML above,
user1 has two phone numbers and
user2 has three phone numbers.
Can anybody help with this? Thanks in advance.
I would recommend using JAXB, since it appears you are attempting to bind your xml to a POJO.
Looking at the code you have written here (and assuming that the example XML you have provided is a snippet of well-formed XML), I am guessing that your POJO should have a member for phone numbers of type List<String>, and a method that allows you to add a phone number to the list (perhaps addPhoneNumber(String phoneNumber) {...}).
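If you stay with SAX rather than JAXB, the List<String> idea can be sketched as follows. This assumes the <MainTag> elements are wrapped in a single root element so the input is well-formed. The names User and PhoneHandler are made up for illustration, and public fields are used only to keep the sketch short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical POJO: one user with any number of phone numbers.
class User {
    String userid;
    final List<String> phones = new ArrayList<>();
}

// Hypothetical handler: starts a new User at <MainTag> and adds one entry per <number>.
class PhoneHandler extends DefaultHandler {
    final List<User> users = new ArrayList<>();
    private User current;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        text.setLength(0); // discard text collected before this start tag
        if ("MainTag".equals(qName)) {
            current = new User();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // outside any <MainTag>
        }
        switch (qName) {
            case "userid": current.userid = text.toString().trim(); break;
            case "number": current.phones.add(text.toString().trim()); break;
            case "MainTag": users.add(current); current = null; break;
            default: break;
        }
    }

    static List<User> parse(String xml) {
        try {
            PhoneHandler handler = new PhoneHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.users;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each <number> end tag appends to the current user's list, so the number of phones per user does not need to be known in advance.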
First, that is not well-formed XML (it has two root elements), and you can't parse it with any parser API unless it is well-formed. To parse the XML you would normally use one of the APIs meant for it, such as SAX, DOM or StAX, or even better the JAXB binding API.
Since you seem to be new to this, I suggest you start learning JAXP. Use StAX instead of DOM or SAX.
If you know the format of the incoming XML (for example, how many nodes it has and their names), you can use the JDK's DocumentBuilderFactory. It is very simple; look at this code:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document dom = null; // declared outside the try block so it stays in scope below
try {
//documentBuilder instance
DocumentBuilder db = dbf.newDocumentBuilder();
dom = db.parse("employees.xml");
}catch(ParserConfigurationException pce) {
pce.printStackTrace();
}catch(SAXException se) {
se.printStackTrace();
}catch(IOException ioe) {
ioe.printStackTrace();
}
//and then get the root element (dom is still null here if parsing failed)
Element de= dom.getDocumentElement();
//get the nodelist of Employee elements
NodeList nl = de.getElementsByTagName("Employee");
if(nl != null && nl.getLength() > 0) {
for(int i = 0 ; i < nl.getLength();i++) {
//get the employee element and read its data
Element el = (Element)nl.item(i);
getEmployee(el);
}
}
//and then get data
private void getEmployee(Element el) {
//for each <employee> element get values
//(getTextValue and getIntValue are helper methods that are not shown here)
String name = getTextValue(el,"Name");
int id = getIntValue(el,"Id");
int age = getIntValue(el,"Age");
//get any element attribute
//String type = el.getAttribute("type");
}
That's all.
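The helpers getTextValue and getIntValue are called above but never defined. One possible shape for them, as a minimal sketch: the method names come from the answer, but the bodies, the class name EmployeeXml and the parseRoot convenience method are my assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical implementations of the helpers used by getEmployee.
class EmployeeXml {
    /** Returns the trimmed text of the first descendant named tagName, or null if absent. */
    static String getTextValue(Element parent, String tagName) {
        NodeList nl = parent.getElementsByTagName(tagName);
        if (nl.getLength() == 0) {
            return null;
        }
        return nl.item(0).getTextContent().trim();
    }

    /** Same lookup, parsed as int; throws NumberFormatException if missing or not numeric. */
    static int getIntValue(Element parent, String tagName) {
        return Integer.parseInt(getTextValue(parent, tagName));
    }

    /** Convenience for trying the helpers out: parses a string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that getElementsByTagName searches all descendants, not only direct children, so these helpers assume the tag name is unique within each employee element.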
I'm trying to parse an XML string, and the tagnames are variable; I haven't seen any examples on how to pull the information out without knowing them. For example, I will always know the <response> and <data> tags below, but what falls inside/outside of them could be anything from <employee> to you name it.
<?xml version="1.0" encoding="UTF-8"?>
<response>
<generic>
....
</generic>
<data>
<employee>
<name>Seagull</name>
<id>3674</id>
<age>34</age>
</employee>
<employee>
<name>Robin</name>
<id>3675</id>
<age>25</age>
</employee>
</data>
</response>
You could parse it into a generic dom object and traverse it. For example, you could use dom4j to do this.
From the dom4j quick start guide:
public void treeWalk(Document document) {
treeWalk( document.getRootElement() );
}
public void treeWalk(Element element) {
for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
Node node = element.node(i);
if ( node instanceof Element ) {
treeWalk( (Element) node );
}
else {
// do something....
}
}
}
public Document parse(URL url) throws DocumentException {
SAXReader reader = new SAXReader();
Document document = reader.read(url);
return document;
}
I have seen a similar situation in projects.
If you are going to deal with large XML documents, you can use a StAX or SAX parser to read the XML. At every step (for example, on reaching an end element), put the data into a Map or another data structure of your choice, with the tag name as the key and the element text as the value. Once parsing is done, use this Map to decide which object to build, since by then you have a proper representation of the information in the XML.
If the XML is small, use DOM and build the entity object directly, either by reading a specific tag (like <employee>) or by using XPath to look where you expect the tag to be, which tells you which entity you are dealing with. Then build that object from the specific information in the XML.
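The Map idea for a small document can be sketched with the JDK's DOM API. This assumes the known <data> element exists and that each of its children is a flat record of simple text fields. The class name GenericRecords and the "#type" key, which stores the record's own tag name, are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: turns every child of <data> into a tag-name -> text map.
class GenericRecords {
    static List<Map<String, String>> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node data = doc.getElementsByTagName("data").item(0); // the one tag known up front
            List<Map<String, String>> records = new ArrayList<>();
            for (Node rec = data.getFirstChild(); rec != null; rec = rec.getNextSibling()) {
                if (rec.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text between records
                }
                Map<String, String> fields = new LinkedHashMap<>();
                fields.put("#type", rec.getNodeName()); // e.g. "employee", not known in advance
                for (Node f = rec.getFirstChild(); f != null; f = f.getNextSibling()) {
                    if (f.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(f.getNodeName(), f.getTextContent().trim());
                    }
                }
                records.add(fields);
            }
            return records;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The "#type" entry can then drive the decision about which entity object to build from the remaining keys.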