Read child nodes with same name using StAX - java

While trying to read an XML file using StAX, I came across this problem.
In the XML file (essentially it's an XLIFF file), I have child nodes with the same name.
I couldn't quite figure out how to read these duplicate nodes.
Below is the part of the code I am working on, along with an example of the XLIFF file.
This is only the working part of the code.
Java Code:
// Initialize ArrayList to return
ArrayList<SourceCollection> xmlData = new ArrayList<>();
boolean isSource = false;
boolean isTarget = false;
boolean isContext = false;
// Setting Up Data Class
SourceCollection srcData = null;
// Start StAX XLIFF reader
XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
try {
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(inStream);
int event = xmlStreamReader.getEventType();
while (true) {
switch (event) {
case XMLStreamConstants.START_ELEMENT:
switch (xmlStreamReader.getLocalName()) {
case "group":
// Create SourceCollection Object
srcData = new SourceCollection();
srcData.setID(xmlStreamReader.getAttributeValue(0));
break;
case "source":
isSource = true;
break;
case "target":
isTarget = true;
break;
case "context":
isContext = true;
break;
default:
isSource = false;
isTarget = false;
isContext = false;
break;
}
break;
case XMLStreamConstants.CHARACTERS:
if (srcData != null) {
String srcTrns = xmlStreamReader.getText();
if (!Utility.isStringNullOrEmptyOrWhiteSpace(srcTrns)) {
if (isSource) {
srcData.setSource(srcTrns);
isSource = false;
} else if (isTarget) {
srcData.setTarget(srcTrns);
isTarget = false;
}
}
}
break;
case XMLStreamConstants.END_ELEMENT:
if (xmlStreamReader.getLocalName().equals("group")) {
xmlData.add(srcData);
}
break;
}
if (!xmlStreamReader.hasNext()) {
break;
}
event = xmlStreamReader.next();
}
} catch (XMLStreamException ex) {
LOG.log(Level.WARNING, ex.getMessage(), MessageFormat.format("{0} {1}", ex.getCause(), ex.getLocation()));
}
XLIFF file sample:
<XLIFF>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<file datatype="xml">
<body>
<group id="25032014">
<context-group>
<context context-type="sub1">xxxx</context>
<context context-type="sub2">yyyy</context>
<context context-type="sub3"/>
</context-group>
<target-unit>
<source>ABC</source>
<target>ABC</target>
</target-unit>
</group>
</body>
</file>
</xliff>
</XLIFF>
Of course, this is a modified XLIFF file, but the structure is exactly the same as the original.
Any sample or suggestions would be helpful.

But you already process these duplicates. I modified your code slightly, like this:
switch (event) {
case XMLStreamConstants.START_ELEMENT:
System.out.println(xmlStreamReader.getLocalName());
switch (xmlStreamReader.getLocalName()) {
and System.out prints:
XLIFF
xliff
file
body
group
context-group
context
context
context
target-unit
source
target
You can see the multiple context outputs. Now you have to adapt your data structure to hold a list of context elements, not just one.
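As a rough illustration of that idea, here is a minimal, self-contained sketch that collects every context element into a list. The class ContextListExample, the holder ContextEntry and the method readContexts are hypothetical names made up for this example (they are not part of the original SourceCollection code), and checked exceptions are wrapped only to keep the sketch short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ContextListExample {

    /** Hypothetical holder for one <context> element; not part of the original code. */
    static final class ContextEntry {
        final String type;
        final String value;

        ContextEntry(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    /** Collects every <context> element in document order. */
    static List<ContextEntry> readContexts(String xml) {
        List<ContextEntry> contexts = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "context".equals(reader.getLocalName())) {
                    String type = reader.getAttributeValue(null, "context-type");
                    // getElementText() reads up to the matching end tag and
                    // returns "" for an empty element such as <context .../>.
                    contexts.add(new ContextEntry(type, reader.getElementText()));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Assumption for this sketch: surface parse errors as unchecked.
            throw new IllegalStateException("Could not parse XML", e);
        }
        return contexts;
    }

    public static void main(String[] args) {
        String xml = "<group id=\"25032014\"><context-group>"
                + "<context context-type=\"sub1\">xxxx</context>"
                + "<context context-type=\"sub2\">yyyy</context>"
                + "<context context-type=\"sub3\"/>"
                + "</context-group></group>";
        for (ContextEntry c : readContexts(xml)) {
            System.out.println(c.type + " = '" + c.value + "'");
        }
    }
}
```

In your own code the equivalent change would probably be a List field on SourceCollection that you append to each time a context element starts, instead of a single boolean flag.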

Related

Parse special characters in xml stax file

I have the following problem.
This is part of the original RSS file:
<item>
<title> I can get data in tag this </title>
<description>&lt;p&gt; i don't get data in this &lt;/p&gt;</description></item>
When I read the file using the StAX parser, the special character '&lt;' is automatically converted to '<', and after that I cannot get the rest of the data in the <description> tag.
This is my code:
public Feed readFeed() {
Feed feed = null;
try {
boolean isFeedHeader = true;
String description = "";
String title = "";
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
InputStream in = read();
XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
if (event.isStartElement()) {
String localPart = event.asStartElement().getName()
.getLocalPart();
switch (localPart) {
case "title":
title = getCharacterData(event, eventReader);
break;
case "description":
description = getCharacterData(event, eventReader);
break;
}
} else if (event.isEndElement()) {
if ("item".equals(event.asEndElement().getName().getLocalPart())) {
FeedMessage message = new FeedMessage();
message.setDescription(description);
message.setTitle(title);
feed.getMessages().add(message);
event = eventReader.nextEvent();
continue;
}
}
}
} catch (XMLStreamException e) {
throw new RuntimeException(e);
}
return feed;
}
private String getCharacterData(XMLEvent event, XMLEventReader eventReader)
throws XMLStreamException {
String result = "";
event = eventReader.nextEvent();
if (event instanceof Characters) {
result = event.asCharacters().getData();
}
return result;
}
I am following the instructions at: http://www.vogella.com/tutorials/RSSFeed/article.html
The tutorial is flawed. It doesn't account for the fact that you could get multiple text events for a single block of text (which tends to happen when you have embedded entities).
In order to make your life easier, make sure you set the IS_COALESCING property to true on the XMLInputFactory before creating your XMLEventReader (this property forces the reader to combine all adjacent text events into a single event).
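A minimal sketch of that setting, assuming the feed is available as a String: the class name CoalescingExample and the helper readDescription are made up for illustration, while XMLInputFactory.IS_COALESCING is the standard StAX property.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class CoalescingExample {

    /** Returns the text of the first <description> element, or "" if there is none. */
    static String readDescription(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character events (including expanded entities) into one.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        try {
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "description".equals(event.asStartElement().getName().getLocalPart())) {
                    // With coalescing on, the whole text arrives as one Characters event.
                    XMLEvent next = reader.nextEvent();
                    return next.isCharacters() ? next.asCharacters().getData() : "";
                }
            }
            return "";
        } catch (XMLStreamException e) {
            // Assumption for this sketch: surface parse errors as unchecked.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item><description>&lt;p&gt; some text &lt;/p&gt;</description></item>";
        System.out.println(readDescription(xml));
    }
}
```

In the code from the question, the only change needed should be the setProperty call on inputFactory before createXMLEventReader is invoked.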

XmlPullParserException during parsing XML with XMLPullParser

I need your help to resolve this.
I am trying to parse this XML:
<LungProtocol><configuration name="FLUS Sitting">
<segment order="1" name="left upper anterior">
<segment order="2" name="left lower anterior">
</configuration>
<configuration name="FLUS Supine">
<segment order="1" name="left upper anterior">
<segment order="2" name="left lower anterior">
</configuration></LungProtocol>
with the following function:
public List<LungSegment> parse(InputStream is, String configuration) {
try {
XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
XmlPullParser parser = factory.newPullParser();
segment = new LungSegment();
parser.setInput(is, null);
parser.require(XmlPullParser.START_TAG, null, "configuration");
int eventType = parser.getEventType();
while (eventType != XmlPullParser.END_DOCUMENT) {
String tagname = parser.getName();
switch (eventType) {
case XmlPullParser.START_TAG:
if(("configuration").equalsIgnoreCase(tagname) && parser.getAttributeValue(null, "name").equalsIgnoreCase(configuration)){
eventType = parser.next();
//eventType = parser.next();
tagname = parser.getName();
Log.v("XMLTAG", "configuration = "+configuration);
if (("segment").equalsIgnoreCase(tagname)) {
// create a new instance of segment
segment = new LungSegment();
segment.setSegmentName(parser.getAttributeValue(null, "name"));
segment.setSegmentOrder(Integer.parseInt(parser.getAttributeValue(null, "order")));
}
}
break;
case XmlPullParser.END_TAG:
if (tagname.equalsIgnoreCase("segment")) {
// add segment object to list
segments.add(segment);
} else if (("configuration").equalsIgnoreCase(tagname) && parser.getAttributeValue(null, "name").equalsIgnoreCase(configuration)) {
return segments;
}
break;
}
eventType = parser.next();
}
} catch (XmlPullParserException e) {e.printStackTrace();}
catch (IOException e) {e.printStackTrace();}
return segments;
}
where configuration is the name attribute of the configuration tag. But I am getting an exception:
org.xmlpull.v1.XmlPullParserException: expected: START_TAG {null}configuration (position:START_DOCUMENT null#1:1 in java.io.InputStreamReader#42323800)
Please help me find where I am going wrong.
Make these two changes in your code:
1. Make the XML well-formed. The segment tags are not closed. They should be: <segment order="1" name="left upper anterior"/>
2. Remove the line parser.require(XmlPullParser.START_TAG, null, "configuration"); at that point the parser is still at START_DOCUMENT, which is exactly what the exception message reports.
Everything else looks fine. I was able to run your code as well.

Cyclomatic Complexity is 11 ( max allowed is 10 ) in Java code

I have the following Java code that violates a Checkstyle rule, reporting "Cyclomatic Complexity is 11 (max allowed is 10)":
public boolean validate(final BindingResult bindingResult) {
boolean validate = true;
for (String channel : getConfiguredChannels()) {
switch (channel) {
case "SMS":
// do nothing
break;
case "Email":
// do nothing
break;
case "Facebook":
// do nothing
break;
case "Voice":
final SpelExpressionParser parser = new SpelExpressionParser();
if (parser
.parseExpression(
"!voiceMessageForm.audioForms.?[audioId == '' || audioId == null].isEmpty()")
.getValue(this, Boolean.class)) {
bindingResult.rejectValue("voiceMessageForm.audioForms",
"message.voice.provide.all.audios");
validate = false;
}
boolean voiceContentErrorSet = false;
boolean voiceDescriptionErrorSet = false;
for (AudioForm audioForm : (List<AudioForm>) parser
.parseExpression(
"voiceMessageForm.audioForms.?[description.length() > 8000]")
.getValue(this)) {
if (audioForm.getAddAudioBy().equals(
AudioForm.AddBy.TTS)
&& !voiceContentErrorSet) {
voiceContentErrorSet = true;
bindingResult.rejectValue(
"voiceMessageForm.audioForms",
"message.voice.content.exceed.limit");
} else {
if (!voiceDescriptionErrorSet) {
voiceDescriptionErrorSet = true;
bindingResult
.rejectValue(
"voiceMessageForm.audioForms",
"message.describe.voice.content.exceed.limit");
}
}
validate = false;
}
break;
default:
throw new IllegalStateException("Unsupported channel: "
+ channel);
}
}
return validate;
}
}
Please suggest a suitable way to avoid this Checkstyle issue.
I'd go ahead and extract the code of the "Voice" case into another method (you can use the refactoring tools of your IDE to do so). After that, your validate method will look like this:
public boolean validate(final BindingResult bindingResult) {
boolean validate = true;
for (String channel : getConfiguredChannels()) {
switch (channel) {
case "SMS":
// do nothing
break;
case "Email":
// do nothing
break;
case "Facebook":
// do nothing
break;
case "Voice":
validate = validateVoice(bindingResult);
break;
default:
throw new IllegalStateException("Unsupported channel: "
+ channel);
}
}
return validate;
}
Edit: (Added extracted method, although I did not really look into it.)
private boolean validateVoice(final BindingResult bindingResult) {
boolean validate = true;
final SpelExpressionParser parser = new SpelExpressionParser();
if (parser.parseExpression("!voiceMessageForm.audioForms.?[audioId == '' || audioId == null].isEmpty()").getValue(this, Boolean.class)) {
bindingResult.rejectValue("voiceMessageForm.audioForms", "message.voice.provide.all.audios");
validate = false;
}
boolean voiceContentErrorSet = false;
boolean voiceDescriptionErrorSet = false;
for (AudioForm audioForm : (List<AudioForm>) parser.parseExpression("voiceMessageForm.audioForms.?[description.length() > 8000]").getValue(this)) {
if (audioForm.getAddAudioBy().equals(AudioForm.AddBy.TTS) && !voiceContentErrorSet) {
voiceContentErrorSet = true;
bindingResult.rejectValue("voiceMessageForm.audioForms", "message.voice.content.exceed.limit");
} else {
if (!voiceDescriptionErrorSet) {
voiceDescriptionErrorSet = true;
bindingResult.rejectValue("voiceMessageForm.audioForms", "message.describe.voice.content.exceed.limit");
}
}
validate = false;
}
return validate;
}

Android: Xml Pull Parser not working

I'm trying to extract data from an XML file. I followed this tutorial:
XmlPullParser tutorial
And now have the following code:
public void parse(InputStream is) {
// create new Study object to hold data
try {
// get a new XmlPullParser object from Factory
XmlPullParser parser = XmlPullParserFactory.newInstance().newPullParser();
// set input source
parser.setInput(is, null);
// get event type
int eventType = parser.getEventType();
// process tag while not reaching the end of document
while(eventType != XmlPullParser.END_DOCUMENT) {
switch(eventType) {
// at start of document: START_DOCUMENT
case XmlPullParser.START_DOCUMENT:
break;
// at start of a tag: START_TAG
case XmlPullParser.START_TAG:
// get tag name
String tagName = parser.getName();
Log.i("AT START TAG","AT START TAG..."+tagName);
// if <study>, get attribute: 'id'
if(tagName.equalsIgnoreCase("Date")) {
Log.i("****PARSER INFO","TAG NAME="+tagName+"...."+parser.nextText());
eventDates.add(parser.nextText());
//study.mId = Integer.parseInt(parser.getAttributeValue(null, Study.ID));
}
// if <content>
else if(tagName.equalsIgnoreCase("Name")) {
Log.i("****PARSER INFO","TAG NAME="+tagName+"...."+parser.nextText());
performanceNames.add(parser.nextText());
//study.mContent = parser.nextText();
}
// if <topic>
else if(tagName.equalsIgnoreCase("RequestURL")) {
Log.i("****PARSER INFO","TAG NAME="+tagName+"...."+parser.nextText());
eventsURLS.add(parser.nextText());
//study.mTopic = parser.nextText();
}
break;
}
// jump to next event
eventType = parser.next();
}
// exception stuffs
} catch (XmlPullParserException e) {
//study = null;
} catch (IOException e) {
//study = null;
}
// return Study object
}
For some reason, the code within the IF statements is not running even though I have made sure the tag names do equal the strings above.
What am I doing wrong?

Locating Specific Attributes in Digester - Java

I'm using the Apache Commons Digester and trying to locate a particular tag in the structure to include in the object.
<parent>
<image size="small">some url</image>
<image size="medium">some url</image>
<image size="large">some url</image>
<image size="huge">some url</image>
</parent>
I really only want the medium image to be included in my parent object, but I'm not sure how I would do that.
Right now I'm using digester.addBeanPropertySetter(PathToParent+"/image","image"); but this gets updated for every image tag (as it should).
Ideally I would like something like digester.addBeanPropertySetter(PathToParent+"/image/medium","image"); but you can't do that.
I omitted generic getters/setters.
public class Parent {
private Image image;
public void setImage(Image image) {
if ("medium".equals(image.getSize())) {
this.image = image;
}
}
}
public class Image {
private String size;
private String url;
}
public static void main(String[] args) throws IOException, SAXException {
String s = "<parent>"
+ "<image size='small'>some url1</image>"
+ "<image size='medium'>some url2</image>"
+ "<image size='large'>some url3</image>"
+ "<image size='huge'>some url4</image>"
+ "</parent>";
Digester digester = new Digester();
digester.addObjectCreate("parent", Parent.class);
digester.addFactoryCreate("parent/image", new ImageCreationFactory());
digester.addBeanPropertySetter("parent/image", "url");
digester.addSetNext("parent/image", "setImage");
Parent p = (Parent) digester.parse(new StringReader(s));
}
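// AbstractObjectCreationFactory supplies getDigester()/setDigester(), which the ObjectCreationFactory interface also requires
public class ImageCreationFactory extends AbstractObjectCreationFactory {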
public Object createObject(Attributes attributes) throws Exception {
Image i = new Image();
i.setSize(attributes.getValue("size"));
return i;
}
}
I actually figured this out using XmlPullParser. Here is the code that picks up only the image whose attribute is "large" and ignores the rest; it's the last "if" in the START_TAG case.
public class XmlPullFeedParser extends BaseFeedParser {
public XmlPullFeedParser(String feedUrl) {
super(feedUrl);
}
public ArrayList<Message> parse() {
ArrayList<Message> messages = null;
XmlPullParser parser = Xml.newPullParser();
try {
// auto-detect the encoding from the stream
parser.setInput(this.getInputStream(), null);
int eventType = parser.getEventType();
Message currentMessage = null;
boolean done = false;
while (eventType != XmlPullParser.END_DOCUMENT && !done){
String name = null;
String attrib = null;
switch (eventType){
case XmlPullParser.START_DOCUMENT:
messages = new ArrayList<Message>();
break;
case XmlPullParser.START_TAG:
name = parser.getName();
attrib = parser.getAttributeValue(0);
if (name.equalsIgnoreCase(EVENT)){
currentMessage = new Message();
} else if (currentMessage != null){
if (name.equalsIgnoreCase(WEBSITE)){
currentMessage.setWebsite(parser.nextText());
} else if (name.equalsIgnoreCase(DESCRIPTION)){
currentMessage.setDescription(parser.nextText());
} else if (name.equalsIgnoreCase(START_DATE)){
currentMessage.setDate(parser.nextText());
} else if (name.equalsIgnoreCase(TITLE)){
currentMessage.setTitle(parser.nextText());
} else if (name.equalsIgnoreCase(HEADLINER)){
currentMessage.setHeadliner(parser.nextText());
} else if ((name.equalsIgnoreCase(IMAGE)) && (attrib.equalsIgnoreCase("large"))) {
currentMessage.setImage(parser.nextText());
}
}
break;
case XmlPullParser.END_TAG:
name = parser.getName();
if (name.equalsIgnoreCase(EVENT) && currentMessage != null){
messages.add(currentMessage);
} else if (name.equalsIgnoreCase(EVENTS)){
done = true;
}
break;
}
eventType = parser.next();
}
} catch (Exception e) {
Log.e("AndroidNews::PullFeedParser", e.getMessage(), e);
throw new RuntimeException(e);
}
return messages;
}
}
I do not think that it is possible with a Digester pattern alone. You have to write your own code to perform this kind of filtering.
But it is very simple. If you want clean code, write a class named ImageAccessor with a method getImage(String size). This method takes the data collected by the Digester and compares it with a predefined size string (or pattern).
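A rough sketch of what such a class could look like, in plain Java with no Digester wiring shown. The method addImage is a hypothetical callback that would be invoked once per image element during parsing (for example from a setter on the parent object); only the name ImageAccessor and the getImage(String size) signature come from the suggestion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ImageAccessor {

    // URLs keyed by the value of the size attribute, in document order.
    private final Map<String, String> urlsBySize = new LinkedHashMap<>();

    /** Hypothetical callback: would be invoked once per <image> element during parsing. */
    public void addImage(String size, String url) {
        urlsBySize.put(size, url);
    }

    /** Returns the URL stored for the given size, or null if none was parsed. */
    public String getImage(String size) {
        return urlsBySize.get(size);
    }

    public static void main(String[] args) {
        ImageAccessor accessor = new ImageAccessor();
        accessor.addImage("small", "some url1");
        accessor.addImage("medium", "some url2");
        System.out.println(accessor.getImage("medium"));
    }
}
```

Keeping all sizes and filtering on read means the choice of "medium" lives in the calling code rather than in the parsing rules.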
