Java -Android. Parser problem - java

I am making a very simple app with an RSS reader. The reader works great, but it's only giving me the title, and i want the description too.
I'am very new to android, and I have tried a lot of things, but I can't get it to work.
I've found a lot of parsers but they are to complicated for me to understand, so I was hoping to find a simple solution, since it's only title and description i want.
Can anyone help me?
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import android.app.Activity;
import android.os.Bundle;
import android.widget.TextView;
public class NyhedActivity extends Activity {
String streamTitle = "";
#Override
protected void onCreate(Bundle savedInstanceState) {
// TODO Auto-generated method stub
super.onCreate(savedInstanceState);
setContentView(R.layout.nyheder);
TextView result = (TextView)findViewById(R.id.result);
try {
URL rssUrl = new URL("http://tv2sport.dk/rss/*/*/*/248/*/*");
SAXParserFactory mySAXParserFactory = SAXParserFactory.newInstance();
SAXParser mySAXParser = mySAXParserFactory.newSAXParser();
XMLReader myXMLReader = mySAXParser.getXMLReader();
RSSHandler myRSSHandler = new RSSHandler();
myXMLReader.setContentHandler(myRSSHandler);
InputSource myInputSource = new InputSource(rssUrl.openStream());
myXMLReader.parse(myInputSource);
result.setText(streamTitle);
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
result.setText("Cannot connect RSS!");
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
result.setText("Cannot connect RSS!");
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
result.setText("Cannot connect RSS!");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
result.setText("Cannot connect RSS!");
}
}
private class RSSHandler extends DefaultHandler
{
final int stateUnknown = 0;
final int stateTitle = 1;
int state = stateUnknown;
int numberOfTitle = 0;
String strTitle = "";
String strElement = "";
#Override
public void startDocument() throws SAXException {
// TODO Auto-generated method stub
strTitle = "Nyheder fra ";
}
#Override
public void endDocument() throws SAXException {
// TODO Auto-generated method stub
strTitle += "";
streamTitle = "" + strTitle;
}
#Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
// TODO Auto-generated method stub
if (localName.equalsIgnoreCase("title"))
{
state = stateTitle;
strElement = "";
numberOfTitle++;
}
else
{
state = stateUnknown;
}
}
#Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
// TODO Auto-generated method stub
if (localName.equalsIgnoreCase("title"))
{
strTitle += strElement + "\n"+"\n";
}
state = stateUnknown;
}
#Override
public void characters(char[] ch, int start, int length)
throws SAXException {
// TODO Auto-generated method stub
String strCharacters = new String(ch, start, length);
if (state == stateTitle)
{
strElement += strCharacters;
}
}
}
}

I've never really used SAX, when it comes to parsing XML in Java. I allways use JDOM. It's simple and really easy to use.
To read a an XML-file with JDOM, you create a document and fill it using an InputStream and a SAXBuilder:
SAXBuilder builder = new SAXBuilder();
Document document = builder.builder( myInputStream );
In your posted case: myInputStream = url.openStream();
Then you need to fetch the root of the XML-document:
Element root = document.getRootElement();
Now it's very simple. Since I don't know the structure of the XML you're getting I'll just assume that it looks something like:
<rssfeed>
<news>
<title> Title </title>
<description> Description </description>
</news>
<news>
<title> ... </title>
<description> ... </description>
</news>
<news>
<title> ... </title>
<description> ... </description>
</news>
<rssfeed>
You can then list all elements of like this:
List<Element> news = root.getChildren( "news" );
Then you run through the list in a for-each loop, getting the title and description (Having a data-class to hold these information would be a help e.g. a News-class):
ArrayList<News> newsList = new ArrayList<News>();
for( Element child : news ) {
News news = new News();
news.setTitle( child.getChildText( "title" );
news.setDescription( child.getChildText( "description" );
newsList.add( news );
}
Now you have a list of news that you can play around with.

Kano,
You can simplify your life and get top-notch performance by utilizing SJXP to write this RSS feed parser with (disclaimer: I am the author).
SJXP is a very-very thin abstraction layer that sits on top of the XML Pull Parsing API (Android provides its own so you only have the sjxp.JAR dependency, XPP3 for every other platform) and allows you to use XPath-like parsing rules for matching simple rules against certain places of a document and then telling the parsing what information you want from those locations.
I wrote an example Eclipse project for you that parses that TV2 Sports feed for you in 6 minutes (I'll link it at the bottom).
The main method looks like this so you get an idea of the flow:
public static void main(String[] args) throws IllegalArgumentException,
XMLParserException, IOException {
// Location we want to parse.
URL feedURL = new URL("http://tv2sport.dk/rss/*/*/*/248/*/*");
// List we will hold all parsed stories in.
List<Item> itemList = new ArrayList<Item>();
// Get all the rules we will use to parse this file
IRule[] rules = createRules();
// Create the parser and populate it with the rules.
XMLParser<List<Item>> parser = new XMLParser<List<Item>>(rules);
// Parse the RSS feed.
parser.parse(feedURL.openStream(), itemList);
// Print the results.
System.out.println("Parsed " + itemList.size() + " RSS items.");
for (Item i : itemList)
System.out.println("\t" + i);
}
You see the flow starts with creating our List to hold our Items in as we parse them from the doc. Then we get a set of IRule instances to give the parser, then create the parser and give it the rules to use while working.
We then invoke the parse method on the contents of the feed and pass it what is called a "user object", more specifically, just an instance of anything that we would like it to pass-through to the rules when they execute.
In this case, we want access to our List so we can add items to it, so we just pass that in and the parser passes it right through to our IRule logic when it executes so we can use it.
The Item class utilizes is just a simple POJO to hold the data and make printing look nice:
public class Item {
public String title;
public String description;
#Override
public String toString() {
return "Item [title='" + title + "', description='" + description + "']";
}
}
All the interesting stuff happens in your IRule where you define what kind of element you are targeting (character data, attribute data or just tag open/close events) and then override the appropriate method from the IRule interface to provide a handler that does something.
For example, here is the handler that parses the titles:
IRule<List<Item>> itemDescRule = new DefaultRule<List<Item>>(Type.CHARACTER, "/rss/channel/item/description") {
#Override
public void handleParsedCharacters(XMLParser<List<Item>> parser, String text, List<Item> userObject) {
Item item = userObject.get(userObject.size() - 1);
item.description = text;
}
};
You see that you get given the parser instance itself (so you can trigger the 'stop' method if you want to end parsing early), you get the text that was the character data and you get that 'user object' which happened to be our list passed through to you.
We grab the item we are populating out of the list, give it the description and that's it. 2 lines of code.
There is another IRule that adds a new Item to the list every time an open-tag is encountered, that is what allows our subsequent rules like this one to just pop the end element off the list and populate it.
When you run the project, the output looks like this:
Parsed 50 RSS items.
Item [title='Barcas bøddel beæret over Barca-føler', description='Tirsdag snød Thiago Silva Barcelona for tre point, da han headede AC Milans udligning i kassen i Champions League-kampens overtid.']
Item [title='Guardiola: Pato hurtigere end Usain Bolt', description='FC Barcelona-træner, Josep Guardiola, er dybt imponeret af Milan-målscoreren Alexandre Patos hurtighed.']
Item [title='Milan-profil: Vi kan nå semifinalen', description='Clarence Seedorf mener, at AC Milan kan nå semifinalerne i Champions League efter 2-2 i Barcelona.']
<SNIP...>
You can download the entire Eclipse project I created for you here.
Hope that helps.

I hope I can help you:
#Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
// TODO Auto-generated method stub
if (localName.equalsIgnoreCase("title"))
{
strTitle += strElement + "\n"+"\n";
}
else if (localName.equalsIgnoreCase("lead"))
{
lead += strElement + "\n"+"\n";
}
}

Related

How to read this XML file in Java? [duplicate]

I have to read and write to and from an XML file. What is the easiest way to read and write XML files using Java?
Here is a quick DOM example that shows how to read and write a simple xml file with its dtd:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE roles SYSTEM "roles.dtd">
<roles>
<role1>User</role1>
<role2>Author</role2>
<role3>Admin</role3>
<role4/>
</roles>
and the dtd:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT roles (role1,role2,role3,role4)>
<!ELEMENT role1 (#PCDATA)>
<!ELEMENT role2 (#PCDATA)>
<!ELEMENT role3 (#PCDATA)>
<!ELEMENT role4 (#PCDATA)>
First import these:
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.xml.sax.*;
import org.w3c.dom.*;
Here are a few variables you will need:
private String role1 = null;
private String role2 = null;
private String role3 = null;
private String role4 = null;
private ArrayList<String> rolev;
Here is a reader (String xml is the name of your xml file):
public boolean readXML(String xml) {
rolev = new ArrayList<String>();
Document dom;
// Make an instance of the DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// use the factory to take an instance of the document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// parse using the builder to get the DOM mapping of the
// XML file
dom = db.parse(xml);
Element doc = dom.getDocumentElement();
role1 = getTextValue(role1, doc, "role1");
if (role1 != null) {
if (!role1.isEmpty())
rolev.add(role1);
}
role2 = getTextValue(role2, doc, "role2");
if (role2 != null) {
if (!role2.isEmpty())
rolev.add(role2);
}
role3 = getTextValue(role3, doc, "role3");
if (role3 != null) {
if (!role3.isEmpty())
rolev.add(role3);
}
role4 = getTextValue(role4, doc, "role4");
if ( role4 != null) {
if (!role4.isEmpty())
rolev.add(role4);
}
return true;
} catch (ParserConfigurationException pce) {
System.out.println(pce.getMessage());
} catch (SAXException se) {
System.out.println(se.getMessage());
} catch (IOException ioe) {
System.err.println(ioe.getMessage());
}
return false;
}
And here a writer:
public void saveToXML(String xml) {
Document dom;
Element e = null;
// instance of a DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// use factory to get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// create instance of DOM
dom = db.newDocument();
// create the root element
Element rootEle = dom.createElement("roles");
// create data elements and place them under root
e = dom.createElement("role1");
e.appendChild(dom.createTextNode(role1));
rootEle.appendChild(e);
e = dom.createElement("role2");
e.appendChild(dom.createTextNode(role2));
rootEle.appendChild(e);
e = dom.createElement("role3");
e.appendChild(dom.createTextNode(role3));
rootEle.appendChild(e);
e = dom.createElement("role4");
e.appendChild(dom.createTextNode(role4));
rootEle.appendChild(e);
dom.appendChild(rootEle);
try {
Transformer tr = TransformerFactory.newInstance().newTransformer();
tr.setOutputProperty(OutputKeys.INDENT, "yes");
tr.setOutputProperty(OutputKeys.METHOD, "xml");
tr.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tr.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "roles.dtd");
tr.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
// send DOM to file
tr.transform(new DOMSource(dom),
new StreamResult(new FileOutputStream(xml)));
} catch (TransformerException te) {
System.out.println(te.getMessage());
} catch (IOException ioe) {
System.out.println(ioe.getMessage());
}
} catch (ParserConfigurationException pce) {
System.out.println("UsersXML: Error trying to instantiate DocumentBuilder " + pce);
}
}
getTextValue is here:
private String getTextValue(String def, Element doc, String tag) {
String value = def;
NodeList nl;
nl = doc.getElementsByTagName(tag);
if (nl.getLength() > 0 && nl.item(0).hasChildNodes()) {
value = nl.item(0).getFirstChild().getNodeValue();
}
return value;
}
Add a few accessors and mutators and you are done!
Writing XML using JAXB (Java Architecture for XML Binding):
http://www.mkyong.com/java/jaxb-hello-world-example/
package com.mkyong.core;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
#XmlRootElement
public class Customer {
String name;
int age;
int id;
public String getName() {
return name;
}
#XmlElement
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
#XmlElement
public void setAge(int age) {
this.age = age;
}
public int getId() {
return id;
}
#XmlAttribute
public void setId(int id) {
this.id = id;
}
}
package com.mkyong.core;
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
public class JAXBExample {
public static void main(String[] args) {
Customer customer = new Customer();
customer.setId(100);
customer.setName("mkyong");
customer.setAge(29);
try {
File file = new File("C:\\file.xml");
JAXBContext jaxbContext = JAXBContext.newInstance(Customer.class);
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
// output pretty printed
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
jaxbMarshaller.marshal(customer, file);
jaxbMarshaller.marshal(customer, System.out);
} catch (JAXBException e) {
e.printStackTrace();
}
}
}
The above answer only deal with DOM parser (that normally reads the entire file in memory and parse it, what for a big file is a problem), you could use a SAX parser that uses less memory and is faster (anyway that depends on your code).
SAX parser callback some functions when it find a start of element, end of element, attribute, text between elements, etc, so it can parse the document and at the same time you
get what you need.
Some example code:
http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
The answers only cover DOM / SAX and a copy paste implementation of a JAXB example.
However, one big area of when you are using XML is missing. In many projects / programs there is a need to store / retrieve some basic data structures. Your program has already a classes for your nice and shiny business objects / data structures, you just want a comfortable way to convert this data to a XML structure so you can do more magic on it (store, load, send, manipulate with XSLT).
This is where XStream shines. You simply annotate the classes holding your data, or if you do not want to change those classes, you configure a XStream instance for marshalling (objects -> xml) or unmarshalling (xml -> objects).
Internally XStream uses reflection, the readObject and readResolve methods of standard Java object serialization.
You get a good and speedy tutorial here:
To give a short overview of how it works, I also provide some sample code which marshalls and unmarshalls a data structure.
The marshalling / unmarshalling happens all in the main method, the rest is just code to generate some test objects and populate some data to them.
It is super simple to configure the xStream instance and marshalling / unmarshalling is done with one line of code each.
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import com.thoughtworks.xstream.XStream;
public class XStreamIsGreat {
public static void main(String[] args) {
XStream xStream = new XStream();
xStream.alias("good", Good.class);
xStream.alias("pRoDuCeR", Producer.class);
xStream.alias("customer", Customer.class);
Producer a = new Producer("Apple");
Producer s = new Producer("Samsung");
Customer c = new Customer("Someone").add(new Good("S4", 10, new BigDecimal(600), s))
.add(new Good("S4 mini", 5, new BigDecimal(450), s)).add(new Good("I5S", 3, new BigDecimal(875), a));
String xml = xStream.toXML(c); // objects -> xml
System.out.println("Marshalled:\n" + xml);
Customer unmarshalledCustomer = (Customer)xStream.fromXML(xml); // xml -> objects
}
static class Good {
Producer producer;
String name;
int quantity;
BigDecimal price;
Good(String name, int quantity, BigDecimal price, Producer p) {
this.producer = p;
this.name = name;
this.quantity = quantity;
this.price = price;
}
}
static class Producer {
String name;
public Producer(String name) {
this.name = name;
}
}
static class Customer {
String name;
public Customer(String name) {
this.name = name;
}
List<Good> stock = new ArrayList<Good>();
Customer add(Good g) {
stock.add(g);
return this;
}
}
}
Ok, already having DOM, JaxB and XStream in the list of answers, there is still a complete different way to read and write XML: Data projection You can decouple the XML structure and the Java structure by using a library that provides read and writeable views to the XML Data as Java interfaces. From the tutorials:
Given some real world XML:
<weatherdata>
<weather
...
degreetype="F"
lat="50.5520210266113" lon="6.24060010910034"
searchlocation="Monschau, Stadt Aachen, NW, Germany"
... >
<current ... skytext="Clear" temperature="46"/>
</weather>
</weatherdata>
With data projection you can define a projection interface:
public interface WeatherData {
#XBRead("/weatherdata/weather/#searchlocation")
String getLocation();
#XBRead("/weatherdata/weather/current/#temperature")
int getTemperature();
#XBRead("/weatherdata/weather/#degreetype")
String getDegreeType();
#XBRead("/weatherdata/weather/current/#skytext")
String getSkytext();
/**
* This would be our "sub projection". A structure grouping two attribute
* values in one object.
*/
interface Coordinates {
#XBRead("#lon")
double getLongitude();
#XBRead("#lat")
double getLatitude();
}
#XBRead("/weatherdata/weather")
Coordinates getCoordinates();
}
And use instances of this interface just like POJOs:
private void printWeatherData(String location) throws IOException {
final String BaseURL = "http://weather.service.msn.com/find.aspx?outputview=search&weasearchstr=";
// We let the projector fetch the data for us
WeatherData weatherData = new XBProjector().io().url(BaseURL + location).read(WeatherData.class);
// Print some values
System.out.println("The weather in " + weatherData.getLocation() + ":");
System.out.println(weatherData.getSkytext());
System.out.println("Temperature: " + weatherData.getTemperature() + "°"
+ weatherData.getDegreeType());
// Access our sub projection
Coordinates coordinates = weatherData.getCoordinates();
System.out.println("The place is located at " + coordinates.getLatitude() + ","
+ coordinates.getLongitude());
}
This works even for creating XML, the XPath expressions can be writable.
SAX parser is working differently with a DOM parser, it neither load any XML document into memory nor create any object representation of the XML document. Instead, the SAX parser use callback function org.xml.sax.helpers.DefaultHandler to informs clients of the XML document structure.
SAX Parser is faster and uses less memory than DOM parser.
See following SAX callback methods :
startDocument() and endDocument() – Method called at the start and end of an XML document.
startElement() and endElement() – Method called at the start and end of a document element.
characters() – Method called with the text contents in between the start and end tags of an XML document element.
XML file
Create a simple XML file.
<?xml version="1.0"?>
<company>
<staff>
<firstname>yong</firstname>
<lastname>mook kim</lastname>
<nickname>mkyong</nickname>
<salary>100000</salary>
</staff>
<staff>
<firstname>low</firstname>
<lastname>yin fong</lastname>
<nickname>fong fong</nickname>
<salary>200000</salary>
</staff>
</company>
XML parser:
Java file Use SAX parser to parse the XML file.
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXMLFile {
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bfname = false;
boolean blname = false;
boolean bnname = false;
boolean bsalary = false;
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
System.out.println("Start Element :" + qName);
if (qName.equalsIgnoreCase("FIRSTNAME")) {
bfname = true;
}
if (qName.equalsIgnoreCase("LASTNAME")) {
blname = true;
}
if (qName.equalsIgnoreCase("NICKNAME")) {
bnname = true;
}
if (qName.equalsIgnoreCase("SALARY")) {
bsalary = true;
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bfname) {
System.out.println("First Name : " + new String(ch, start, length));
bfname = false;
}
if (blname) {
System.out.println("Last Name : " + new String(ch, start, length));
blname = false;
}
if (bnname) {
System.out.println("Nick Name : " + new String(ch, start, length));
bnname = false;
}
if (bsalary) {
System.out.println("Salary : " + new String(ch, start, length));
bsalary = false;
}
}
};
saxParser.parse("c:\\file.xml", handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Result
Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
and so on...
Source(MyKong) - http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/

How to import XML data Using Java and store them into a Java List? [duplicate]

I have to read and write to and from an XML file. What is the easiest way to read and write XML files using Java?
Here is a quick DOM example that shows how to read and write a simple xml file with its dtd:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE roles SYSTEM "roles.dtd">
<roles>
<role1>User</role1>
<role2>Author</role2>
<role3>Admin</role3>
<role4/>
</roles>
and the dtd:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT roles (role1,role2,role3,role4)>
<!ELEMENT role1 (#PCDATA)>
<!ELEMENT role2 (#PCDATA)>
<!ELEMENT role3 (#PCDATA)>
<!ELEMENT role4 (#PCDATA)>
First import these:
import javax.xml.parsers.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import org.xml.sax.*;
import org.w3c.dom.*;
Here are a few variables you will need:
private String role1 = null;
private String role2 = null;
private String role3 = null;
private String role4 = null;
private ArrayList<String> rolev;
Here is a reader (String xml is the name of your xml file):
public boolean readXML(String xml) {
rolev = new ArrayList<String>();
Document dom;
// Make an instance of the DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// use the factory to take an instance of the document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// parse using the builder to get the DOM mapping of the
// XML file
dom = db.parse(xml);
Element doc = dom.getDocumentElement();
role1 = getTextValue(role1, doc, "role1");
if (role1 != null) {
if (!role1.isEmpty())
rolev.add(role1);
}
role2 = getTextValue(role2, doc, "role2");
if (role2 != null) {
if (!role2.isEmpty())
rolev.add(role2);
}
role3 = getTextValue(role3, doc, "role3");
if (role3 != null) {
if (!role3.isEmpty())
rolev.add(role3);
}
role4 = getTextValue(role4, doc, "role4");
if ( role4 != null) {
if (!role4.isEmpty())
rolev.add(role4);
}
return true;
} catch (ParserConfigurationException pce) {
System.out.println(pce.getMessage());
} catch (SAXException se) {
System.out.println(se.getMessage());
} catch (IOException ioe) {
System.err.println(ioe.getMessage());
}
return false;
}
And here a writer:
public void saveToXML(String xml) {
Document dom;
Element e = null;
// instance of a DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// use factory to get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// create instance of DOM
dom = db.newDocument();
// create the root element
Element rootEle = dom.createElement("roles");
// create data elements and place them under root
e = dom.createElement("role1");
e.appendChild(dom.createTextNode(role1));
rootEle.appendChild(e);
e = dom.createElement("role2");
e.appendChild(dom.createTextNode(role2));
rootEle.appendChild(e);
e = dom.createElement("role3");
e.appendChild(dom.createTextNode(role3));
rootEle.appendChild(e);
e = dom.createElement("role4");
e.appendChild(dom.createTextNode(role4));
rootEle.appendChild(e);
dom.appendChild(rootEle);
try {
Transformer tr = TransformerFactory.newInstance().newTransformer();
tr.setOutputProperty(OutputKeys.INDENT, "yes");
tr.setOutputProperty(OutputKeys.METHOD, "xml");
tr.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tr.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "roles.dtd");
tr.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
// send DOM to file
tr.transform(new DOMSource(dom),
new StreamResult(new FileOutputStream(xml)));
} catch (TransformerException te) {
System.out.println(te.getMessage());
} catch (IOException ioe) {
System.out.println(ioe.getMessage());
}
} catch (ParserConfigurationException pce) {
System.out.println("UsersXML: Error trying to instantiate DocumentBuilder " + pce);
}
}
getTextValue is here:
private String getTextValue(String def, Element doc, String tag) {
String value = def;
NodeList nl;
nl = doc.getElementsByTagName(tag);
if (nl.getLength() > 0 && nl.item(0).hasChildNodes()) {
value = nl.item(0).getFirstChild().getNodeValue();
}
return value;
}
Add a few accessors and mutators and you are done!
Writing XML using JAXB (Java Architecture for XML Binding):
http://www.mkyong.com/java/jaxb-hello-world-example/
package com.mkyong.core;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
#XmlRootElement
public class Customer {
String name;
int age;
int id;
public String getName() {
return name;
}
#XmlElement
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
#XmlElement
public void setAge(int age) {
this.age = age;
}
public int getId() {
return id;
}
#XmlAttribute
public void setId(int id) {
this.id = id;
}
}
package com.mkyong.core;
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
public class JAXBExample {
public static void main(String[] args) {
Customer customer = new Customer();
customer.setId(100);
customer.setName("mkyong");
customer.setAge(29);
try {
File file = new File("C:\\file.xml");
JAXBContext jaxbContext = JAXBContext.newInstance(Customer.class);
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
// output pretty printed
jaxbMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
jaxbMarshaller.marshal(customer, file);
jaxbMarshaller.marshal(customer, System.out);
} catch (JAXBException e) {
e.printStackTrace();
}
}
}
The above answer only deal with DOM parser (that normally reads the entire file in memory and parse it, what for a big file is a problem), you could use a SAX parser that uses less memory and is faster (anyway that depends on your code).
SAX parser callback some functions when it find a start of element, end of element, attribute, text between elements, etc, so it can parse the document and at the same time you
get what you need.
Some example code:
http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
The answers only cover DOM / SAX and a copy paste implementation of a JAXB example.
However, one big area of when you are using XML is missing. In many projects / programs there is a need to store / retrieve some basic data structures. Your program has already a classes for your nice and shiny business objects / data structures, you just want a comfortable way to convert this data to a XML structure so you can do more magic on it (store, load, send, manipulate with XSLT).
This is where XStream shines. You simply annotate the classes holding your data, or if you do not want to change those classes, you configure a XStream instance for marshalling (objects -> xml) or unmarshalling (xml -> objects).
Internally XStream uses reflection, the readObject and readResolve methods of standard Java object serialization.
You get a good and speedy tutorial here:
To give a short overview of how it works, I also provide some sample code which marshalls and unmarshalls a data structure.
The marshalling / unmarshalling happens all in the main method, the rest is just code to generate some test objects and populate some data to them.
It is super simple to configure the xStream instance and marshalling / unmarshalling is done with one line of code each.
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import com.thoughtworks.xstream.XStream;
public class XStreamIsGreat {
public static void main(String[] args) {
XStream xStream = new XStream();
xStream.alias("good", Good.class);
xStream.alias("pRoDuCeR", Producer.class);
xStream.alias("customer", Customer.class);
Producer a = new Producer("Apple");
Producer s = new Producer("Samsung");
Customer c = new Customer("Someone").add(new Good("S4", 10, new BigDecimal(600), s))
.add(new Good("S4 mini", 5, new BigDecimal(450), s)).add(new Good("I5S", 3, new BigDecimal(875), a));
String xml = xStream.toXML(c); // objects -> xml
System.out.println("Marshalled:\n" + xml);
Customer unmarshalledCustomer = (Customer)xStream.fromXML(xml); // xml -> objects
}
static class Good {
Producer producer;
String name;
int quantity;
BigDecimal price;
Good(String name, int quantity, BigDecimal price, Producer p) {
this.producer = p;
this.name = name;
this.quantity = quantity;
this.price = price;
}
}
static class Producer {
String name;
public Producer(String name) {
this.name = name;
}
}
static class Customer {
String name;
public Customer(String name) {
this.name = name;
}
List<Good> stock = new ArrayList<Good>();
Customer add(Good g) {
stock.add(g);
return this;
}
}
}
Ok, already having DOM, JaxB and XStream in the list of answers, there is still a complete different way to read and write XML: Data projection You can decouple the XML structure and the Java structure by using a library that provides read and writeable views to the XML Data as Java interfaces. From the tutorials:
Given some real world XML:
<weatherdata>
<weather
...
degreetype="F"
lat="50.5520210266113" lon="6.24060010910034"
searchlocation="Monschau, Stadt Aachen, NW, Germany"
... >
<current ... skytext="Clear" temperature="46"/>
</weather>
</weatherdata>
With data projection you can define a projection interface:
public interface WeatherData {
#XBRead("/weatherdata/weather/#searchlocation")
String getLocation();
#XBRead("/weatherdata/weather/current/#temperature")
int getTemperature();
#XBRead("/weatherdata/weather/#degreetype")
String getDegreeType();
#XBRead("/weatherdata/weather/current/#skytext")
String getSkytext();
/**
* This would be our "sub projection". A structure grouping two attribute
* values in one object.
*/
interface Coordinates {
#XBRead("#lon")
double getLongitude();
#XBRead("#lat")
double getLatitude();
}
#XBRead("/weatherdata/weather")
Coordinates getCoordinates();
}
And use instances of this interface just like POJOs:
private void printWeatherData(String location) throws IOException {
final String BaseURL = "http://weather.service.msn.com/find.aspx?outputview=search&weasearchstr=";
// We let the projector fetch the data for us
WeatherData weatherData = new XBProjector().io().url(BaseURL + location).read(WeatherData.class);
// Print some values
System.out.println("The weather in " + weatherData.getLocation() + ":");
System.out.println(weatherData.getSkytext());
System.out.println("Temperature: " + weatherData.getTemperature() + "°"
+ weatherData.getDegreeType());
// Access our sub projection
Coordinates coordinates = weatherData.getCoordinates();
System.out.println("The place is located at " + coordinates.getLatitude() + ","
+ coordinates.getLongitude());
}
This works even for creating XML, the XPath expressions can be writable.
SAX parser is working differently with a DOM parser, it neither load any XML document into memory nor create any object representation of the XML document. Instead, the SAX parser use callback function org.xml.sax.helpers.DefaultHandler to informs clients of the XML document structure.
SAX Parser is faster and uses less memory than DOM parser.
See following SAX callback methods :
startDocument() and endDocument() – Method called at the start and end of an XML document.
startElement() and endElement() – Method called at the start and end of a document element.
characters() – Method called with the text contents in between the start and end tags of an XML document element.
XML file
Create a simple XML file.
<?xml version="1.0"?>
<company>
<staff>
<firstname>yong</firstname>
<lastname>mook kim</lastname>
<nickname>mkyong</nickname>
<salary>100000</salary>
</staff>
<staff>
<firstname>low</firstname>
<lastname>yin fong</lastname>
<nickname>fong fong</nickname>
<salary>200000</salary>
</staff>
</company>
XML parser:
Java file Use SAX parser to parse the XML file.
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXMLFile {
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bfname = false;
boolean blname = false;
boolean bnname = false;
boolean bsalary = false;
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
System.out.println("Start Element :" + qName);
if (qName.equalsIgnoreCase("FIRSTNAME")) {
bfname = true;
}
if (qName.equalsIgnoreCase("LASTNAME")) {
blname = true;
}
if (qName.equalsIgnoreCase("NICKNAME")) {
bnname = true;
}
if (qName.equalsIgnoreCase("SALARY")) {
bsalary = true;
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bfname) {
System.out.println("First Name : " + new String(ch, start, length));
bfname = false;
}
if (blname) {
System.out.println("Last Name : " + new String(ch, start, length));
blname = false;
}
if (bnname) {
System.out.println("Nick Name : " + new String(ch, start, length));
bnname = false;
}
if (bsalary) {
System.out.println("Salary : " + new String(ch, start, length));
bsalary = false;
}
}
};
saxParser.parse("c:\\file.xml", handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Result
Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
and so on...
Source(MyKong) - http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/

Parsing XML without document start and end tags

I'm parsing a document that I cannot change from the internet using a SAX Parser. It was working just fine when the documents came formatted as such:
<outtertag>
<innertag>data</innertag>
<innerag>moreData</innertag>
</outtertag>
However, there are certain calls I make where the XML comes formatted without the outer tags, so I essentially get just a list of data, like such:
<innertag>data</innertag>
<innerag>moreData</innertag>
This seems silly to me, but I don't get to choose how the XML is formatted and it can't be changed for now. The problem is that it seems that the SAX Parser hits the endDocument event as soon as it hits the first closing innertag.
I have a rather hacky solution of converting the InputStream into a String, throwing tags around it, and then converting it back to an InputStream. It actually parses fine that way. But, surely there's a better way. I'd also would prefer not to write a whole other parser. Most of the tags are the same aside from the lack of opening and closing tags.
Just for the heck of it, I'll post the code, but it's pretty standard SAX Parser. The original is actually parsing about 30 some tags:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
XMLReader xmlReader = saxParser.getXMLReader();
MyHandler handler = new MyHandler();
xmlReader.setContentHandler(handler);
InputSource inputSource = new InputSource(url.openStream());
xmlReader.parse(inputSource);
}
catch (SAXException e) { e.printStackTrace(); }
catch (ParserConfigurationException e) { e.printStackTrace(); }
catch(Exception e) { e.printStackTrace(); }
}
private class MyHandler extends DefaultHandler {
private StringBuilder content;
public MyHandler() {
content = new StringBuilder();
}
public void startElement(String uri, String localName, String qName,
Attributes atts) throws SAXException {
content = new StringBuilder();
if(localName.equalsIgnoreCase("innertag")) {
//Doing stuff
}
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
//Doing stuff
}
public void characters(char[] ch, int start, int length)
throws SAXException {
content.append(ch, start, length);
}
public void endDocument() throws SAXException {
//When parsing the second type of document, hits this event almost immediately after parsing first tag
}
}
And, if it matters, here's my hacky code I'm using, but just feels wrong, yet it works:
BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuilder sb = new StringBuilder("<tag>");
String line = null;
while ((line = reader.readLine()) != null) {
sb.append(line);
}
sb.append("</tag>");
String xml =sb.toString();
InputStream is = new ByteArrayInputStream(xml.getBytes());
InputSource source = new InputSource(is);
xmlReader.parse(source);
I'd say what you're doing now is about as good as you'll get. The one thing to consider improving is the stream -> string -> stream conversion, especially if the documents are large. You could use something like Guava's ByteStreams.join(), which lets you concatenate streams together instead of strings. Something like the following:
import com.google.common.io.*;
import java.io.*;
public class ConcatenateStreams {
public static void main(String[] args) throws Exception {
InputStream malformedXmlContent = externalXmlStream();
InputSupplier<InputStream> joined = ByteStreams.join(
inputSupplier("<root>"),
inputSupplier(malformedXmlContent),
inputSupplier("</root>"));
ByteStreams.copy(joined, System.out);
}
private static InputStream externalXmlStream() {
return new ByteArrayInputStream("<foo>5</foo><bar>10</bar>".getBytes());
}
private static InputSupplier<InputStream> inputSupplier(final String text) {
return inputSupplier(new ByteArrayInputStream(text.getBytes()));
}
private static InputSupplier<InputStream> inputSupplier(final InputStream inputStream) {
return new InputSupplier<InputStream>() {
#Override
public InputStream getInput() throws IOException {
return inputStream;
}
};
}
}
which outputs:
<root><foo>5</foo><bar>10</bar></root>
The XML you have is not a well-formed document, but it is a well-formed external parsed entity, which means it can be referenced from a well-formed document by means of an entity reference. So create a skeleton document like this:
<!DOCTYPE doc [
<!ENTITY e SYSTEM "data.xml">
]>
<doc>&e;</doc>
where data.xml is your XML, and pass this document to the XML parser in place of the original. Beats writing dozens of lines of Java code.

Parsing XML from a website to a String array in Android please help me

Hello I am in the process of making an Android app that pulls some data from a Wiki, at first I was planning on finding a way to parse the HTML, but from something that someone pointed out to me is that XML would be much easier to work with. Now I am stuck trying to find a way to parse the XML correctly. I am trying to parse from a web address right now from:
http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml
I am trying to get the titles of each of the games into a string array and I am having some trouble. I don't have an example of the code I was trying out, it was by using xmlpullparser. My app crashes everytime that I try to do anything with it. Would it be better to save the XML locally and parse from there? or would I be okay going from the web address? and how would I go about parsing this correctly into a string array? Please help me, and thank you for taking the time to read this.
If you need to see code or anything I can get it later tonight, I am just not near my PC at this time. Thank you.
Whenever you find yourself writing parser code for simple formats like the one in your example you're almost always doing something wrong and not using a suitable framework.
For instance - there's a set of simple helpers for parsing XML in the android.sax package included in the SDK and it just happens that the example you posted could be easily parsed like this:
public class WikiParser {
public static class Cm {
public String mPageId;
public String mNs;
public String mTitle;
}
private static class CmListener implements StartElementListener {
final List<Cm> mCms;
CmListener(List<Cm> cms) {
mCms = cms;
}
#Override
public void start(Attributes attributes) {
Cm cm = new Cm();
cm.mPageId = attributes.getValue("", "pageid");
cm.mNs = attributes.getValue("", "ns");
cm.mTitle = attributes.getValue("", "title");
mCms.add(cm);
}
}
public void parseInto(URL url, List<Cm> cms) throws IOException, SAXException {
HttpURLConnection con = (HttpURLConnection) url.openConnection();
try {
parseInto(new BufferedInputStream(con.getInputStream()), cms);
} finally {
con.disconnect();
}
}
public void parseInto(InputStream docStream, List<Cm> cms) throws IOException, SAXException {
RootElement api = new RootElement("api");
Element query = api.requireChild("query");
Element categoryMembers = query.requireChild("categorymembers");
Element cm = categoryMembers.requireChild("cm");
cm.setStartElementListener(new CmListener(cms));
Xml.parse(docStream, Encoding.UTF_8, api.getContentHandler());
}
}
Basically, called like this:
WikiParser p = new WikiParser();
ArrayList<WikiParser.Cm> res = new ArrayList<WikiParser.Cm>();
try {
p.parseInto(new URL("http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml"), res);
} catch (MalformedURLException e) {
} catch (IOException e) {
} catch (SAXException e) {}
Edit: This is how you'd create a List<String> instead:
public class WikiParser {
private static class CmListener implements StartElementListener {
final List<String> mTitles;
CmListener(List<String> titles) {
mTitles = titles;
}
#Override
public void start(Attributes attributes) {
String title = attributes.getValue("", "title");
if (!TextUtils.isEmpty(title)) {
mTitles.add(title);
}
}
}
public void parseInto(URL url, List<String> titles) throws IOException, SAXException {
HttpURLConnection con = (HttpURLConnection) url.openConnection();
try {
parseInto(new BufferedInputStream(con.getInputStream()), titles);
} finally {
con.disconnect();
}
}
public void parseInto(InputStream docStream, List<String> titles) throws IOException, SAXException {
RootElement api = new RootElement("api");
Element query = api.requireChild("query");
Element categoryMembers = query.requireChild("categorymembers");
Element cm = categoryMembers.requireChild("cm");
cm.setStartElementListener(new CmListener(titles));
Xml.parse(docStream, Encoding.UTF_8, api.getContentHandler());
}
}
and then:
WikiParser p = new WikiParser();
ArrayList<String> titles = new ArrayList<String>();
try {
p.parseInto(new URL("http://zelda.wikia.com/api.php?action=query&list=categorymembers&cmtitle=Category:Games&cmlimit=500&format=xml"), titles);
} catch (MalformedURLException e) {
} catch (IOException e) {
} catch (SAXException e) {}

parse an xml string in java?

how do you parse xml stored in a java string object?
Java's XMLReader only parses XML documents from a URI or inputstream. is it not possible to parse from a String containing an xml data?
Right now I have the following:
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser sp = factory.newSAXParser();
XMLReader xr = sp.getXMLReader();
ContactListXmlHandler handler = new ContactListXmlHandler();
xr.setContentHandler(handler);
xr.p
} catch (ParserConfigurationException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
And on my handler i have this:
public class ContactListXmlHandler extends DefaultHandler implements Resources {
private List<ContactName> contactNameList = new ArrayList<ContactName>();
private ContactName contactItem;
private StringBuffer sb;
public List<ContactName> getContactNameList() {
return contactNameList;
}
#Override
public void startDocument() throws SAXException {
// TODO Auto-generated method stub
super.startDocument();
sb = new StringBuffer();
}
#Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
// TODO Auto-generated method stub
super.startElement(uri, localName, qName, attributes);
if(localName.equals(XML_CONTACT_NAME)){
contactItem = new ContactName();
}
sb.setLength(0);
}
#Override
public void characters(char[] ch, int start, int length){
// TODO Auto-generated method stub
try {
super.characters(ch, start, length);
} catch (SAXException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
sb.append(ch, start, length);
}
#Override
public void endDocument() throws SAXException {
// TODO Auto-generated method stub
super.endDocument();
}
/**
* where the real stuff happens
*/
#Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
// TODO Auto-generated method stub
//super.endElement(arg0, arg1, arg2);
if(contactItem != null){
if (localName.equalsIgnoreCase("title")) {
contactItem.setUid(sb.toString());
Log.d("handler", "setTitle = " + sb.toString());
} else if (localName.equalsIgnoreCase("link")) {
contactItem.setFullName(sb.toString());
} else if (localName.equalsIgnoreCase("item")){
Log.d("handler", "adding rss item");
contactNameList.add(contactItem);
}
sb.setLength(0);
}
}
Thanks in advance
The SAXParser can read an InputSource.
An InputSource can take a Reader in its constructor
So, you can put parse XML string via a StringReader
new InputSource(new StringReader("... your xml here....")));
Try jcabi-xml (see this blog post) with a one-liner:
XML xml = new XMLDocument("<document>...</document>")
Your XML might be simple enough to parse manually using the DOM or SAX API, but I'd still suggest using an XML serialization API such as JAXB, XStream, or Simple instead because writing your own XML serialization/deserialization code is a drag.
Note that the XStream FAQ erroneously claims that you must use generated classes with JAXB:
How does XStream compare to JAXB (Java API for XML Binding)?
JAXB is a Java binding tool. It generates Java code from a schema and
you are able to transform from those classes into XML matching the
processed schema and back. Note, that you cannot use your own objects,
you have to use what is generated.
It seems this was true was true at one time, but JAXB 2.0 no longer requires you to use Java classes generated from a schema.
If you go this route, be sure to check out the side-by-side comparisons of the serialization/marshalling APIs I've mentioned:
http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-xstream.html
http://blog.bdoughan.com/2010/10/how-does-jaxb-compare-to-simple.html
Take a look at this: http://www.rgagnon.com/javadetails/java-0573.html
import javax.xml.parsers.*;
import org.xml.sax.InputSource;
import org.w3c.dom.*;
import java.io.*;
public class ParseXMLString {
public static void main(String arg[]) {
String xmlRecords =
"<data>" +
" <employee>" +
" <name>John</name>" +
" <title>Manager</title>" +
" </employee>" +
" <employee>" +
" <name>Sara</name>" +
" <title>Clerk</title>" +
" </employee>" +
"</data>";
try {
DocumentBuilderFactory dbf =
DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlRecords));
Document doc = db.parse(is);
NodeList nodes = doc.getElementsByTagName("employee");
// iterate the employees
for (int i = 0; i < nodes.getLength(); i++) {
Element element = (Element) nodes.item(i);
NodeList name = element.getElementsByTagName("name");
Element line = (Element) name.item(0);
System.out.println("Name: " + getCharacterDataFromElement(line));
NodeList title = element.getElementsByTagName("title");
line = (Element) title.item(0);
System.out.println("Title: " + getCharacterDataFromElement(line));
}
}
catch (Exception e) {
e.printStackTrace();
}
/*
output :
Name: John
Title: Manager
Name: Sara
Title: Clerk
*/
}
public static String getCharacterDataFromElement(Element e) {
Node child = e.getFirstChild();
if (child instanceof CharacterData) {
CharacterData cd = (CharacterData) child;
return cd.getData();
}
return "?";
}
}

Categories

Resources