I need your help parsing XML announcements.
Consider the XML below:
<?xml version="1.0" encoding="UTF-8"?>
<FAVORIS>
<LOGGED>1</LOGGED>
<NOTICES NUM_PAGE="2">
<NOTICE>
<DEPARTMENT>33</DEPARTMENT>
<SOURCE>EMP</SOURCE>
<OBJET><![CDATA[Fourniture d'une solution régionale]]></OBJET>
<ID_PROCEDURE>7543</ID_PROCEDURE>
<ORGANISME><![CDATA[Gcs]]></ORGANISME>
<TYPE_AVIS>ZZ</TYPE_AVIS>
<TYPE_PROCEDURE>W3</TYPE_PROCEDURE>
<CATEGORIE>Service</CATEGORIE>
<DATE_OFFRE>16/09/2013 12:00</DATE_OFFRE>
<DECALAGE_HORAIRE>0</DECALAGE_HORAIRE>
</NOTICE>
<NOTICE>
<DEPARTMENT>33</DEPARTMENT>
<SOURCE>EMP</SOURCE>
<OBJET><![CDATA[Refonte du portail]]></OBJET>
<ID_PROCEDURE>4323</ID_PROCEDURE>
<ORGANISME><![CDATA[Mairie de W]]></ORGANISME>
<TYPE_AVIS>Z</TYPE_AVIS>
<TYPE_PROCEDURE>W1</TYPE_PROCEDURE>
<CATEGORIE>Service</CATEGORIE>
<DATE_OFFRE>03/09/2013 12:00</DATE_OFFRE>
<DECALAGE_HORAIRE>0</DECALAGE_HORAIRE>
</NOTICE>
....
</NOTICES>
</FAVORIS>
To parse this XML I followed the tutorial on XML parsing at http://developer.android.com/training/basics/network-ops/xml.html
When I parse this XML I get the following error:
expected: START_TAG {null}NOTICES (position:START_TAG <FAVORIS>#2:10 in java.io.InputStreamReader#40e69360)
It is caused by the call to require(), but how can I resolve it?
Here is my code:
public class AOThumbnail_ParserXML {
private static final String ns = null;
public ArrayList<AOThumbnail> parse(String in) throws XmlPullParserException, IOException {
Decoder decoder = new Decoder();
String decoded_in = decoder.decode(in);
XmlPullParser parser = Xml.newPullParser();
parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, true);
parser.setInput(new ByteArrayInputStream(decoded_in.getBytes()), null);
parser.nextTag();
return readFeed(parser);
}
private ArrayList<AOThumbnail> readFeed(XmlPullParser parser) throws XmlPullParserException, IOException {
ArrayList<AOThumbnail> entries = new ArrayList<AOThumbnail>();
parser.require(XmlPullParser.START_TAG, ns, "NOTICES");
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
// Starts by looking for the entry tag
if (name.equals("NOTICE")) {
entries.add(readEntry(parser));
} else {
skip(parser);
}
}
return entries;
}
private AOThumbnail readEntry(XmlPullParser parser) throws XmlPullParserException, IOException {
parser.require(XmlPullParser.START_TAG, ns, "NOTICE");
String organisme = null;
String objet = null;
String date_candidature = null;
String date_offre = null;
String departement = null;
String source = null;
String idProcedure = null;
String decalageHoraire = null;
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
if (name.equals("ORGANISME")) {
organisme = read(parser, "ORGANISME");
} else if (name.equals("OBJET")) {
objet = read(parser, "OBJET");
} else if (name.equals("DATE_CAND")) {
date_candidature = read(parser, "DATE_CAND");
} else if (name.equals("DATE_OFFRE")) {
date_offre = read(parser, "DATE_OFFRE");
} else if (name.equals("DEPARTMENT")) {
departement = read(parser, "DEPARTMENT");
} else if (name.equals("SOURCE")) {
source = read(parser, "SOURCE");
} else if (name.equals("ID_PROCEDURE")) {
idProcedure = read(parser, "ID_PROCEDURE");
} else if (name.equals("DECALAGE_HORAIRE")) {
decalageHoraire = read(parser, "DECALAGE_HORAIRE");
} else {
skip(parser);
}
}
String date = "";
try {
if (date_candidature.length() > 0)
date = date_candidature;
else if (date_offre.length() > 0)
date = date_offre;
else
date = "Se référer à l'annonce";
} catch (Exception ex) {
Logger.logit(ex);
}
return new AOThumbnail(organisme, objet, date, departement, idProcedure, source, Boolean.parseBoolean(decalageHoraire));
}
private String read(XmlPullParser parser, String balise) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, balise);
String content = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, balise);
return content;
}
private String readText(XmlPullParser parser) throws IOException, XmlPullParserException {
String result = "";
if (parser.next() == XmlPullParser.TEXT) {
result = parser.getText();
parser.nextTag();
}
return result;
}
private void skip(XmlPullParser parser) throws XmlPullParserException, IOException {
if (parser.getEventType() != XmlPullParser.START_TAG) {
throw new IllegalStateException();
}
int depth = 1;
while (depth != 0) {
switch (parser.next()) {
case XmlPullParser.END_TAG:
depth--;
break;
case XmlPullParser.START_TAG:
depth++;
break;
}
}
}
}
If I parse XML like the following, it works, but the XML above does not!
<NOTICES NUM_PAGE="2">
<NOTICE>
<DEPARTMENT>33</DEPARTMENT>
<SOURCE>EMP</SOURCE>
<OBJET><![CDATA[Fourniture d'une solution régionale]]></OBJET>
<ID_PROCEDURE>7543</ID_PROCEDURE>
<ORGANISME><![CDATA[Gcs]]></ORGANISME>
<TYPE_AVIS>ZZ</TYPE_AVIS>
<TYPE_PROCEDURE>W3</TYPE_PROCEDURE>
<CATEGORIE>Service</CATEGORIE>
<DATE_OFFRE>16/09/2013 12:00</DATE_OFFRE>
<DECALAGE_HORAIRE>0</DECALAGE_HORAIRE>
</NOTICE>
<NOTICE>
<DEPARTMENT>33</DEPARTMENT>
<SOURCE>EMP</SOURCE>
<OBJET><![CDATA[Refonte du portail]]></OBJET>
<ID_PROCEDURE>4323</ID_PROCEDURE>
<ORGANISME><![CDATA[Mairie de W]]></ORGANISME>
<TYPE_AVIS>Z</TYPE_AVIS>
<TYPE_PROCEDURE>W1</TYPE_PROCEDURE>
<CATEGORIE>Service</CATEGORIE>
<DATE_OFFRE>03/09/2013 12:00</DATE_OFFRE>
<DECALAGE_HORAIRE>0</DECALAGE_HORAIRE>
</NOTICE>
....
</NOTICES>
I hope you can help me,
Thanks a lot !
In readFeed() you tell the parser that the current event must be a START_TAG named "NOTICES"; otherwise require() throws an exception, which is the one you posted. That's why the second XML file works but the first does not: the first start tag in that file is "FAVORIS".
So you need another way to reach the NOTICES block. You can use something like this (there are other ways to do it):
while (!(parser.getEventType() == XmlPullParser.START_TAG && "NOTICES".equals(parser.getName()))
&& parser.next() != XmlPullParser.END_DOCUMENT);
When this loop ends, the parser is at the start of the NOTICES block (or at the end of the document if there is none).
PS: I didn't test the code, so it might not work, but the solution should be similar.
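The idea of advancing until the NOTICES start tag is reached can be tried off-device with a small sketch. It is only an illustration: it uses the JDK's StAX XMLStreamReader as a stand-in for Android's XmlPullParser (a similar pull model, but different constants and method names), and the class SeekDemo and its helpers seekTo and numPage are hypothetical names, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SeekDemo {
    /** Advances until the reader sits on the start tag called name; false if the document ends first. */
    static boolean seekTo(XMLStreamReader reader, String name) throws XMLStreamException {
        while (true) {
            if (reader.getEventType() == XMLStreamConstants.START_ELEMENT
                    && name.equals(reader.getLocalName())) {
                return true;
            }
            if (!reader.hasNext()) {
                return false; // reached the end of the document without a match
            }
            reader.next();
        }
    }

    /** Returns the NUM_PAGE attribute of NOTICES, or null when there is no NOTICES element. */
    static String numPage(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            return seekTo(reader, "NOTICES")
                    ? reader.getAttributeValue(null, "NUM_PAGE")
                    : null;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }
}
```

Checking for the end of the document inside the loop matters: without it, a file that has no NOTICES element would make the parser run past the last event.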
Issue resolved by requiring "FAVORIS" first!
private ArrayList<AOThumbnail> readFeed(XmlPullParser parser) throws XmlPullParserException, IOException {
ArrayList<AOThumbnail> entries = new ArrayList<AOThumbnail>();
//Go to FAVORIS' level
parser.require(XmlPullParser.START_TAG, ns, "FAVORIS");
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
if (name.equals("NOTICES")) {
// Descend into NOTICES: its NOTICE children are read by the next iterations.
// (Calling readFeed() again here would fail on the require("FAVORIS") above.)
continue;
}
// Starts by looking for the entry tag
if (name.equals("NOTICE")) {
Logger.logit("Dans notice !!!!!!");
entries.add(readEntry(parser));
} else {
skip(parser);
}
}
return entries;
}
Thanks to @javi9375 for his tips ;)
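An alternative to handling both levels in one loop is to give each level its own reader method, in the style of the tutorial. The sketch below is an assumption-laden illustration rather than the original code: it runs on the plain JDK with StAX's XMLStreamReader instead of XmlPullParser, only collects ID_PROCEDURE values, and FavorisDemo, procedureIds, readNotices and readNotice are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FavorisDemo {
    /** Collects the ID_PROCEDURE of every NOTICE under FAVORIS/NOTICES. */
    static List<String> procedureIds(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            r.nextTag(); // move to the root start tag
            r.require(XMLStreamConstants.START_ELEMENT, null, "FAVORIS");
            List<String> ids = new ArrayList<>();
            while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
                if ("NOTICES".equals(r.getLocalName())) {
                    readNotices(r, ids);
                } else {
                    skip(r); // e.g. LOGGED
                }
            }
            return ids;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    private static void readNotices(XMLStreamReader r, List<String> ids) throws XMLStreamException {
        r.require(XMLStreamConstants.START_ELEMENT, null, "NOTICES");
        while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
            if ("NOTICE".equals(r.getLocalName())) {
                ids.add(readNotice(r));
            } else {
                skip(r);
            }
        }
    }

    private static String readNotice(XMLStreamReader r) throws XMLStreamException {
        String id = null;
        while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
            if ("ID_PROCEDURE".equals(r.getLocalName())) {
                id = r.getElementText(); // leaves the reader on </ID_PROCEDURE>
            } else {
                skip(r);
            }
        }
        return id;
    }

    /** Consumes the current element, including nested children, up to its end tag. */
    private static void skip(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth != 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }
}
```

Because every method returns with the reader on the end tag of the element it was given, each loop only ever sees the end tag of its own parent.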
Related
Hello, I'm new to XML parsing and have completed the Google tutorial on XML parsing. The tutorial uses this feed: https://stackoverflow.com/feeds/tag?tagnames=android&sort=newest
I wanted to extend it with information about the author of each post, but whatever I read from inside the author tag (the uri tag and the name tag) keeps coming back as null.
The file where the search for the tags happens:
/**
* This class parses XML feeds from stackoverflow.com.
* Given an InputStream representation of a feed, it returns a List of entries,
* where each list element represents a single entry (post) in the XML feed.
*/
public class StackOverflowXmlParser {
private static final String ns = null;
// We don't use namespaces
public List<Entry> parse(InputStream in) throws XmlPullParserException, IOException {
try {
XmlPullParser parser = Xml.newPullParser();
parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
parser.setInput(in, null);
parser.nextTag();
return readFeed(parser);
} finally {
in.close();
}
}
private List<Entry> readFeed(XmlPullParser parser) throws XmlPullParserException, IOException {
List<Entry> entries = new ArrayList<Entry>();
parser.require(XmlPullParser.START_TAG, ns, "feed");
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
// Starts by looking for the entry tag
if (name.equals("entry")) {
entries.add(readEntry(parser));
}
else {
skip(parser);
}
}
return entries;
}
// This class represents a single entry (post) in the XML feed.
// It includes the data members "title," "link," and "summary."
public static class Entry
{
public final String rating;
public final String title;
public final String link;
public final String author;
private Entry(String rating, String title, String link, String author)
{
this.rating = rating;
this.title = title;
this.link = link;
this.author = author;
}
}
// Parses the contents of an entry. If it encounters a title, summary, or link tag, hands them
// off
// to their respective "read" methods for processing. Otherwise, skips the tag.
private Entry readEntry(XmlPullParser parser) throws XmlPullParserException, IOException {
parser.require(XmlPullParser.START_TAG, ns, "entry");
String rating = null;
String title = null;
String link = null;
String author = null;
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
if (name.equals("re:rank")){
rating = readRating_name(parser);
}
else if (name.equals("title")){
title = readTitle_name(parser);
}
else if (name.equals("id")){
link = readLink_name(parser);
}
else if (name.equals("name")){
author = readAuthor_name(parser);
}
else
{
skip(parser);
}
}
return new Entry(rating, title, link, author);
}
// Processes title tags in the feed.
private String readRating_name(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "re:rank");
String rating = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "re:rank");
return rating;
}
private String readTitle_name(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "title");
String title = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "title");
return title;
}
private String readLink_name(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "id");
String link = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "id");
return link;
}
private String readAuthor_name(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "name");
String author = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "name");
return author;
}
// For the tags title and summary, extracts their text values.
private String readText(XmlPullParser parser)
throws IOException, XmlPullParserException {
String result = "";
if (parser.next() == XmlPullParser.TEXT) {
result = parser.getText();
parser.nextTag();
}
return result;
}
// Skips tags the parser isn't interested in. Uses depth to handle nested tags. i.e.,
// if the next tag after a START_TAG isn't a matching END_TAG, it keeps going until it
// finds the matching END_TAG (as indicated by the value of "depth" being 0).
private void skip(XmlPullParser parser) throws XmlPullParserException, IOException {
if (parser.getEventType() != XmlPullParser.START_TAG) {
throw new IllegalStateException();
}
int depth = 1;
while (depth != 0) {
switch (parser.next()) {
case XmlPullParser.END_TAG:
depth--;
break;
case XmlPullParser.START_TAG:
depth++;
break;
}
}
}
}
The class where I output the results of the parsing:
// Uploads XML from stackoverflow.com, parses it, and combines it with
// HTML markup. Returns HTML string.
private String loadXmlFromNetwork(String urlString) throws XmlPullParserException, IOException
{
InputStream stream = null;
StackOverflowXmlParser stackOverflowXmlParser = new StackOverflowXmlParser();
List<Entry> entries = null;
StringBuilder htmlString = new StringBuilder();
try {
stream = downloadUrl(urlString);
entries = stackOverflowXmlParser.parse(stream);
// Makes sure that the InputStream is closed after the app is
// finished using it.
}
finally
{
if (stream != null)
{
stream.close();
}
}
// Content section
for (Entry entry : entries)
{
// Name link
String question = getString(R.string.question);
String rating = getString(R.string.rating);
String author = getString(R.string.author);
// Question link + title
htmlString.append("<p>" + question + "<a href='" + entry.link + "'>" + entry.title + "</a><br />");
htmlString.append(rating + entry.rating + "<br />");
htmlString.append(author + entry.author + "</p>");
}
return htmlString.toString();
}
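One likely cause, judging only from the code shown: in an Atom feed, name and uri are children of author, not of entry, so a readEntry loop that skips author never reaches a tag called name. A minimal sketch of descending one level further follows. It uses the JDK's StAX XMLStreamReader as a stand-in for XmlPullParser so that it runs off-device, and AuthorDemo, authorNames, readEntryAuthor and readAuthor are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AuthorDemo {
    /** Returns the author name of every entry in the feed (null for entries without one). */
    static List<String> authorNames(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            r.nextTag();
            r.require(XMLStreamConstants.START_ELEMENT, null, "feed");
            List<String> names = new ArrayList<>();
            while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
                if ("entry".equals(r.getLocalName())) {
                    names.add(readEntryAuthor(r));
                } else {
                    skip(r);
                }
            }
            return names;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }

    private static String readEntryAuthor(XMLStreamReader r) throws XMLStreamException {
        String author = null;
        while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
            if ("author".equals(r.getLocalName())) {
                author = readAuthor(r); // descend instead of skipping
            } else {
                skip(r);
            }
        }
        return author;
    }

    private static String readAuthor(XMLStreamReader r) throws XMLStreamException {
        String name = null;
        while (r.nextTag() == XMLStreamConstants.START_ELEMENT) {
            if ("name".equals(r.getLocalName())) {
                name = r.getElementText();
            } else {
                skip(r); // e.g. uri
            }
        }
        return name;
    }

    private static void skip(XMLStreamReader r) throws XMLStreamException {
        int depth = 1;
        while (depth != 0) {
            int event = r.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }
}
```

In the XmlPullParser version, the equivalent change would be an "author" branch in readEntry that calls a dedicated readAuthor method instead of looking for "name" directly.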
Hope this helps: instead of parsing the XML yourself, you can use this approach, which gives you JSON data (easy to code):
String xml = "xml here...";
XMLSerializer xmlSerializer = new XMLSerializer();
JSON json = xmlSerializer.read( xml );
You can get json-lib -- http://json-lib.sourceforge.net/index.html
For an easy introduction to XML parsing, go through this tutorial. It's really easy to understand.
XML Parsing - Android Hive
I've created an XML parser based on the Android docs example found here:
http://developer.android.com/training/basics/network-ops/xml.html
The primary difference between the docs' implementation and my own is that I am trying to implement the XML parser inside my Activity. However, the code does not seem to be reached, although it is in onCreate (which I thought would start it). I'm not sure what should be done to correct this, so any suggestions are greatly appreciated.
I've tried moving the closing bracket to the end of my parser, but I end up with 84 new errors (mostly duplicate local variable errors; I can provide them if necessary).
I believe I need to leave the bracket where it is and simply call the parser from the Android docs example somehow, but I'm not sure exactly how this should be done.
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
}
// ///////////////////////////////////////////////////////
// Read and Parse XML Data from BootConfiguration
// ///////////////////////////////////////////////////////
public List parse(InputStream in) throws XmlPullParserException,
IOException {
try {
XmlPullParser parser = Xml.newPullParser();
parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
parser.setInput(in, Environment.getExternalStorageDirectory()
+ "/BootConfiguration.xml");
parser.nextTag();
return readFeed(parser);
} finally {
in.close();
}
}
private List readFeed(XmlPullParser parser) throws XmlPullParserException,
IOException {
List entries = new ArrayList();
parser.require(XmlPullParser.START_TAG, ns, "feed");
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
// Starts by looking for the entry tag
if (name.equals("BootConfiguration")) {
entries.add(readEntry(parser));
} else {
skip(parser);
}
}
return entries;
}
public static class Entry {
public final String url;
public final String user;
public final String password;
private Entry(String url, String user, String password) {
this.url = url;
this.user = user;
this.password = password;
}
}
// Parses the contents of an entry. If it encounters a title, summary, or
// link tag, hands them off
// to their respective "read" methods for processing. Otherwise, skips the
// tag.
private Entry readEntry(XmlPullParser parser)
throws XmlPullParserException, IOException {
parser.require(XmlPullParser.START_TAG, ns, "BootConfiguration");
String url = null;
String user = null;
String password = null;
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
if (name.equals("ServiceUrl")) {
url = readUrl(parser);
} else if (name.equals("ApiUser")) {
user = readUser(parser);
} else if (name.equals("ApiPassword")) {
password = readPassword(parser);
} else {
skip(parser);
}
}
return new Entry(url, user, password);
}
// Processes title tags in the feed.
private String readUrl(XmlPullParser parser) throws IOException,
XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "ServiceUrl");
String url = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "ServiceUrl");
return url;
}
// Processes title tags in the feed.
private String readUser(XmlPullParser parser) throws IOException,
XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "ApiUser");
String user = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "ApiUser");
return user;
}
// Processes summary tags in the feed.
private String readPassword(XmlPullParser parser) throws IOException,
XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "ApiPassword");
String password = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "ApiPassword");
return password;
}
// For the tags title and summary, extracts their text values.
private String readText(XmlPullParser parser) throws IOException,
XmlPullParserException {
String result = "";
if (parser.next() == XmlPullParser.TEXT) {
result = parser.getText();
parser.nextTag();
}
return result;
}
private void skip(XmlPullParser parser) throws XmlPullParserException,
IOException {
if (parser.getEventType() != XmlPullParser.START_TAG) {
throw new IllegalStateException();
}
int depth = 1;
while (depth != 0) {
switch (parser.next()) {
case XmlPullParser.END_TAG:
depth--;
break;
case XmlPullParser.START_TAG:
depth++;
break;
}
}
// //////////////////////////////////////////////////////
// End XML Parser
// //////////////////////////////////////////////////////
XmlPullParser doesn't read my second tag, term. The loop always iterates twice: the first time, the inner tag is correctly identified as appname, but the second time it is null.
public static List<Term> readConfig(XmlPullParser parser)
throws XmlPullParserException, IOException
{
List<Term> terms = null;
parser.require(XmlPullParser.START_TAG, ns, "app");
while (parser.next() != XmlPullParser.END_TAG)
{
if (parser.getEventType() != XmlPullParser.START_TAG)
{
continue;
}
String innerTag = parser.getName();
if (innerTag.equals("appname"))
{
Logger.log("2");
}
else if (innerTag.equals("term"))
{
// terms = readTerm(parser);
Logger.log("1");
}
}
return terms;
}
My XML file:
<?xml version="1.0" encoding="UTF-8"?>
<app>
<appname>abdalla</appname>
<term>term1</term>
</app>
The exception:
Caused by: java.lang.NullPointerException: asset
at android.content.res.AssetManager.getAssetRemainingLength(Native Method)
at android.content.res.AssetManager.access$300(AssetManager.java:36)
at android.content.res.AssetManager$AssetInputStream.available(AssetManager.java:555)
at java.io.InputStreamReader.read(InputStreamReader.java:234)
at org.kxml2.io.KXmlParser.fillBuffer(KXmlParser.java:1496)
at org.kxml2.io.KXmlParser.readValue(KXmlParser.java:1340)
at org.kxml2.io.KXmlParser.next(KXmlParser.java:390)
at org.kxml2.io.KXmlParser.next(KXmlParser.java:310)
at model.XMLParser.readConfig(XMLParser.java:55)
at com.example.xmlparser.MainActivity.onCreate(MainActivity.java:47)
at android.app.Activity.performCreate(Activity.java:5133)
I found the answer. I fixed the code above with:
else if (eventType == XmlPullParser.END_TAG && tagName.equals(APP))
{
break;
}
The complete code:
public static List<Term> readConfig(XmlPullParser parser)
throws XmlPullParserException, IOException
{
List<Term> terms = new ArrayList<Term>();
parser.require(XmlPullParser.START_TAG, ns, APP);
int eventType = parser.getEventType();
while (eventType != XmlPullParser.END_DOCUMENT)
{
String tagName = parser.getName();
if (eventType == XmlPullParser.START_TAG && tagName.equals(APP))
{
// Attributes
String name = parser.getAttributeValue(null, "name");
Logger.log(name);
}
else if (tagName != null && tagName.equals(TERM))
{
terms.add(readTerm(parser));
}
else if (eventType == XmlPullParser.END_TAG && tagName.equals(APP))
{
break;
}
eventType = parser.next();
}
return terms;
}
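A likely reason the first version stopped early is that a loop of the form "while (parser.next() != END_TAG)" ends at the first end tag it meets, which is </appname> when the child element is not consumed inside the loop. The sketch below contrasts the two stopping rules. It is an illustration under assumptions: it uses the JDK's StAX XMLStreamReader rather than XmlPullParser, and LoopEndDemo and childNames are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class LoopEndDemo {
    /**
     * Lists the start tags seen below the root element.
     * With stopAtAnyEndTag the loop ends at the first end tag of any element;
     * otherwise it tracks depth and ends only at the root's own end tag.
     */
    static List<String> childNames(String xml, boolean stopAtAnyEndTag) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            r.nextTag(); // root start tag, depth 1
            List<String> seen = new ArrayList<>();
            int depth = 1;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    seen.add(r.getLocalName());
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                    if (stopAtAnyEndTag || depth == 0) {
                        break;
                    }
                }
            }
            return seen;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }
}
```

For the app/appname/term file above, the first rule sees only appname, while the depth-based rule sees both appname and term. Checking the tag name on END_TAG, as the fixed code does, is another way to get the second behaviour.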
I am very new to Java/Android programming and need a little help with some code I am writing. I have been following the "Parsing XML Data" article, with no success.
The intent from "MainActivity.java" sends a string of XML text to my "ParsingXMLStringActivity.java", where I would like to parse it. Here is the string, which arrives in the activity successfully:
<action><app>survo</app><parameters><id>5666</id><p_t>205</p_t></parameters></action>
I wrote the public class XmlParser inside the following "ParsingXMLStringActivity.java":
public class ParsingXMLStringActivity extends Activity {
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_parsing_xmlstring);
// Get the message from the intent
Intent intent = getIntent();
String receivedXMLstring = intent.getStringExtra(MainActivity.Authorize.XML_STRING);
System.out.println("STRING:"+receivedXMLstring);
InputStream in_stream;
try {
in_stream = new ByteArrayInputStream(receivedXMLstring.getBytes("UTF-8"));
System.out.println("STREAM:"+ in_stream);
XmlParser XmlParser = new XmlParser();
System.out.println("XmlParser:"+ XmlParser);
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
} // OnCreate
public class XmlParser {
public final String ns = null;
public List<Entry> parse (InputStream in_stream) throws XmlPullParserException, IOException {
try {
XmlPullParser parser = Xml.newPullParser();
parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
parser.setInput(in_stream, null);
parser.nextTag();
return readAction(parser);
} finally {
in_stream.close();
}
} // Public List Parse
private List<Entry> readAction(XmlPullParser parser) throws XmlPullParserException, IOException {
List<Entry> action = new ArrayList<Entry>();
parser.require(XmlPullParser.START_TAG, ns, "action");
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
// Starts by looking for the entry tag
if (name.equals("action")) {
action.add(readParameters(parser));
} else {
skip(parser);
}
}
System.out.println("ACTION: "+action);
return action;
} // Public List ReadAction
private void skip(XmlPullParser parser) throws XmlPullParserException, IOException {
if (parser.getEventType() != XmlPullParser.START_TAG) {
throw new IllegalStateException();
}
int depth = 1;
while (depth != 0) {
switch (parser.next()) {
case XmlPullParser.END_TAG:
depth--;
break;
case XmlPullParser.START_TAG:
depth++;
break;
}
}
} //Private Void Skip
public class Entry {
public final String id;
public final String pt;
private Entry(String id, String pt) {
this.id = id;
this.pt = pt;
}
} // public static class Entry
private Entry readParameters(XmlPullParser parser) throws XmlPullParserException, IOException {
parser.require(XmlPullParser.START_TAG, ns, "entry");
String id = null;
String pt = null;
while (parser.next() != XmlPullParser.END_TAG) {
if (parser.getEventType() != XmlPullParser.START_TAG) {
continue;
}
String name = parser.getName();
if (name.equals("id")) {
id = readId(parser);
System.out.println("ID: "+ id);
} else if (name.equals("p_t")) {
pt = readPT(parser);
System.out.println("P_T: "+ pt);
}
else {
skip(parser);
}
}
return new Entry(id, pt);
} // Private Entry
// Processes title tags in the feed.
private String readId(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "id");
String id = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "id");
System.out.println(id);
return id;
}
private String readPT(XmlPullParser parser) throws IOException, XmlPullParserException {
parser.require(XmlPullParser.START_TAG, ns, "p_t");
String pt = readText(parser);
parser.require(XmlPullParser.END_TAG, ns, "p_t");
System.out.println(pt);
return pt;
}
private String readText(XmlPullParser parser) throws IOException, XmlPullParserException {
String result = "";
if (parser.next() == XmlPullParser.TEXT) {
result = parser.getText();
parser.nextTag();
}
System.out.println(result);
return result;
}
}
The System.out.println("ID: "+ id); calls print nothing to LogCat. I have set breakpoints to check whether the parser is even started, and it does not seem to start parsing the string I supply.
Does anyone have an idea of what I am missing?
Kind regards, Ben
I am trying to pull some XHTML out of an RSS feed so I can place it in a WebView. The RSS feed in question has a tag called <content>, and the characters inside it are XHTML. (The site I'm parsing is a Blogger feed.)
What is the best way to try to pull this content? The < characters are confusing my parser. I have tried both DOM and SAX but neither can handle this very well.
Here is a sample of the XML as requested. In this case, I want basically XHTML inside the content tag to be a string. <content> XHTML </content>
Edit: based on ignyhere's suggestion I have tried XPath, but I am still having the same issue. Here is a pastebin sample of my tests.
It's not pretty, but this is (the essence of) what I use to parse an ATOM feed from Blogger using XmlPullParser. The code is pretty icky, but it is from a real app. You can probably get the general flavor of it, anyway.
final String TAG_FEED = "feed";
public int parseXml(Reader reader) {
XmlPullParserFactory factory = null;
StringBuilder out = new StringBuilder();
int entries = 0;
try {
factory = XmlPullParserFactory.newInstance();
factory.setNamespaceAware(true);
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(reader);
while (true) {
int eventType = xpp.next();
if (eventType == XmlPullParser.END_DOCUMENT) {
break;
} else if (eventType == XmlPullParser.START_DOCUMENT) {
out.append("Start document\n");
} else if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
// out.append("Start tag " + tag + "\n");
if (TAG_FEED.equalsIgnoreCase(tag)) {
entries = parseFeed(xpp);
}
} else if (eventType == XmlPullParser.END_TAG) {
// out.append("End tag " + xpp.getName() + "\n");
} else if (eventType == XmlPullParser.TEXT) {
// out.append("Text " + xpp.getText() + "\n");
}
}
out.append("End document\n");
} catch (XmlPullParserException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
// return out.toString();
return entries;
}
private int parseFeed(XmlPullParser xpp) throws XmlPullParserException, IOException {
int depth = xpp.getDepth();
assert (depth == 1);
int eventType;
int entries = 0;
xpp.require(XmlPullParser.START_TAG, null, TAG_FEED);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT) && (xpp.getDepth() > depth)) {
// loop invariant: At this point, the parser is not sitting on
// end-of-document, and is at a level deeper than where it started.
if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
// Log.d("parseFeed", "Start tag: " + tag); // Uncomment to debug
if (FeedEntry.TAG_ENTRY.equalsIgnoreCase(tag)) {
FeedEntry feedEntry = new FeedEntry(xpp);
feedEntry.persist(this);
entries++;
// Log.d("FeedEntry", feedEntry.title); // Uncomment to debug
// xpp.require(XmlPullParser.END_TAG, null, tag);
}
}
}
assert (depth == 1);
return entries;
}
class FeedEntry {
String id;
String published;
String updated;
// Timestamp lastRead;
String title;
String subtitle;
String authorName;
int contentType;
String content;
String preview;
String origLink;
String thumbnailUri;
// Media media;
static final String TAG_ENTRY = "entry";
static final String TAG_ENTRY_ID = "id";
static final String TAG_TITLE = "title";
static final String TAG_SUBTITLE = "subtitle";
static final String TAG_UPDATED = "updated";
static final String TAG_PUBLISHED = "published";
static final String TAG_AUTHOR = "author";
static final String TAG_CONTENT = "content";
static final String TAG_TYPE = "type";
static final String TAG_ORIG_LINK = "origLink";
static final String TAG_THUMBNAIL = "thumbnail";
static final String ATTRIBUTE_URL = "url";
/**
* Create a FeedEntry by pulling its bits out of an XML Pull Parser. Side effect: Advances
* XmlPullParser.
*
* @param xpp
*/
public FeedEntry(XmlPullParser xpp) {
int eventType;
int depth = xpp.getDepth();
assert (depth == 2);
try {
xpp.require(XmlPullParser.START_TAG, null, TAG_ENTRY);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT)
&& (xpp.getDepth() > depth)) {
if (eventType == XmlPullParser.START_TAG) {
String tag = xpp.getName();
if (TAG_ENTRY_ID.equalsIgnoreCase(tag)) {
id = Util.XmlPullTag(xpp, TAG_ENTRY_ID);
} else if (TAG_TITLE.equalsIgnoreCase(tag)) {
title = Util.XmlPullTag(xpp, TAG_TITLE);
} else if (TAG_SUBTITLE.equalsIgnoreCase(tag)) {
subtitle = Util.XmlPullTag(xpp, TAG_SUBTITLE);
} else if (TAG_UPDATED.equalsIgnoreCase(tag)) {
updated = Util.XmlPullTag(xpp, TAG_UPDATED);
} else if (TAG_PUBLISHED.equalsIgnoreCase(tag)) {
published = Util.XmlPullTag(xpp, TAG_PUBLISHED);
} else if (TAG_CONTENT.equalsIgnoreCase(tag)) {
int attributeCount = xpp.getAttributeCount();
for (int i = 0; i < attributeCount; i++) {
String attributeName = xpp.getAttributeName(i);
if (attributeName.equalsIgnoreCase(TAG_TYPE)) {
String attributeValue = xpp.getAttributeValue(i);
if (attributeValue
.equalsIgnoreCase(FeedReaderContract.FeedEntry.ATTRIBUTE_NAME_HTML)) {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_HTML;
} else if (attributeValue
.equalsIgnoreCase(FeedReaderContract.FeedEntry.ATTRIBUTE_NAME_XHTML)) {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_XHTML;
} else {
contentType = FeedReaderContract.FeedEntry.CONTENT_TYPE_TEXT;
}
break;
}
}
content = Util.XmlPullTag(xpp, TAG_CONTENT);
extractPreview();
} else if (TAG_AUTHOR.equalsIgnoreCase(tag)) {
// Skip author for now -- it is complicated
int authorDepth = xpp.getDepth();
assert (authorDepth == 3);
xpp.require(XmlPullParser.START_TAG, null, TAG_AUTHOR);
while (((eventType = xpp.next()) != XmlPullParser.END_DOCUMENT)
&& (xpp.getDepth() > authorDepth)) {
}
assert (xpp.getDepth() == 3);
xpp.require(XmlPullParser.END_TAG, null, TAG_AUTHOR);
} else if (TAG_ORIG_LINK.equalsIgnoreCase(tag)) {
origLink = Util.XmlPullTag(xpp, TAG_ORIG_LINK);
} else if (TAG_THUMBNAIL.equalsIgnoreCase(tag)) {
thumbnailUri = Util.XmlPullAttribute(xpp, tag, null, ATTRIBUTE_URL);
} else {
@SuppressWarnings("unused")
String throwAway = Util.XmlPullTag(xpp, tag);
}
}
} // while
} catch (XmlPullParserException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
assert (xpp.getDepth() == 2);
}
}
public static String XmlPullTag(XmlPullParser xpp, String tag)
throws XmlPullParserException, IOException {
xpp.require(XmlPullParser.START_TAG, null, tag);
String itemText = xpp.nextText();
if (xpp.getEventType() != XmlPullParser.END_TAG) {
xpp.nextTag();
}
xpp.require(XmlPullParser.END_TAG, null, tag);
return itemText;
}
public static String XmlPullAttribute(XmlPullParser xpp,
String tag, String namespace, String name)
throws XmlPullParserException, IOException {
assert (!TextUtils.isEmpty(tag));
assert (!TextUtils.isEmpty(name));
xpp.require(XmlPullParser.START_TAG, null, tag);
String itemText = xpp.getAttributeValue(namespace, name);
if (xpp.getEventType() != XmlPullParser.END_TAG) {
xpp.nextTag();
}
xpp.require(XmlPullParser.END_TAG, null, tag);
return itemText;
}
I'll give you a hint: None of the return values matter. The data is saved into a database by a method (not shown) called at this line:
feedEntry.persist(this);
I would attempt to attack it with XPath. Would something like this work?
public static String parseAtom (InputStream atomIS)
throws Exception {
// Below should yield the second content block
String xpathString = "(//*[starts-with(name(),'content')])[2]";
// or, String xpathString = "//*[name() = 'content'][2]";
// remove the '[2]' to get all content tags or get the count,
// if needed, and then target specific blocks
//String xpathString = "count(//*[starts-with(name(),'content')])";
// note the evaluate expression below returns a glob and not a node set
XPathFactory xpf = XPathFactory.newInstance ();
XPath xpath = xpf.newXPath ();
XPathExpression xpathCompiled = xpath.compile (xpathString);
// use the first to recast and evaluate as NodeList
//Object atomOut = xpathCompiled.evaluate (
// new InputSource (atomIS), XPathConstants.NODESET);
String atomOut = (String) xpathCompiled.evaluate (
new InputSource (atomIS), XPathConstants.STRING);
System.out.println (atomOut);
return atomOut;
}
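Note that evaluating an element as a string yields only its text, with the XHTML tags dropped. If the markup itself is needed (for a WebView, say), one option is to select the node and serialize its children again. The sketch below shows both variants using only JDK classes. It assumes that the same javax.xml APIs are available on the target platform and that the feed fits in memory as a DOM tree; ContentXPathDemo, contentText and contentMarkup are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ContentXPathDemo {
    private static final String FIRST_CONTENT = "(//*[local-name()='content'])[1]";

    /** Text-only value of the first content element; the markup is dropped. */
    static String contentText(String atom) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("string(" + FIRST_CONTENT + ")",
                    new InputSource(new StringReader(atom)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    /** Children of the first content element, serialized back to markup. */
    static String contentMarkup(String atom) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(atom)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node content = (Node) xpath.evaluate(FIRST_CONTENT, doc, XPathConstants.NODE);
            if (content == null) {
                return "";
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // Serialize each child so the <content> wrapper itself is left out.
            for (Node child = content.getFirstChild(); child != null; child = child.getNextSibling()) {
                serializer.transform(new DOMSource(child), new StreamResult(out));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not extract content", e);
        }
    }
}
```

Using local-name() rather than name() keeps the expression working whether or not the feed declares a namespace prefix for its elements.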
I can see your problem here. These parsers are not producing the result you expect because the contents of your <content> tag are not wrapped in <![CDATA[ ]]>, so the parser treats them as nested elements rather than as text. Until I found a more adequate solution, I would use this quick and dirty trick:
private void parseFile(String fileName) throws IOException {
String line;
BufferedReader br = new BufferedReader(new FileReader(new File(fileName)));
StringBuilder sb = new StringBuilder();
boolean match = false;
while ((line = br.readLine()) != null) {
// Start copying at the line that opens <content ...>
if (line.contains("<content")) {
match = true;
}
if (match) {
sb.append(line);
sb.append("\n");
}
// Stop copying after the line that closes </content>
if (line.contains("</content")) {
match = false;
}
}
br.close();
System.out.println(sb.toString());
}
This gives you all the content as a String. You can optionally separate the blocks by slightly modifying this method, or filter out the <content> tags themselves if you don't need them.