How to parse GPX files with SAXReader? - java

I'm trying to parse a GPX file. I tried it with JDOM, but it does not work very well.
SAXBuilder builder = new SAXBuilder();
Document document = builder.build(filename);
Element root = document.getRootElement();
System.out.println("Root:\t" + root.getName());
List<Element> listTrks = root.getChildren("trk");
System.out.println("Count trk:\t" + listTrks.size());
for (Element tmpTrk : listTrks) {
List<Element> listTrkpts = tmpTrk.getChildren("trkpt");
System.out.println("Count pts:\t" + listTrkpts.size());
for (Element tmpTrkpt : listTrkpts) {
System.out.println(tmpTrkpt.getAttributeValue("lat") + ":" + tmpTrkpt.getAttributeValue("lon"));
}
}
I opened the example file (CC-BY-SA OpenStreetMap) and the output is just:
Root: gpx
Count trk: 0
What can I do? Should I use a SAXParserFactory (javax.xml.parsers.SAXParserFactory) and implement a Handler class?

Here is my gpx reader. It ignores some of the tags but I hope it will help.
package ch.perry.rando.geocode;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
/**
*
* @author perrym
*/
public class GpxReader extends DefaultHandler {
private static final DateFormat TIME_FORMAT
= new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
private List<Trackpoint> track = new ArrayList<Trackpoint>();
private StringBuffer buf = new StringBuffer();
private double lat;
private double lon;
private double ele;
private Date time;
public static Trackpoint[] readTrack(InputStream in) throws IOException {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating(true);
SAXParser parser = factory.newSAXParser();
GpxReader reader = new GpxReader();
parser.parse(in, reader);
return reader.getTrack();
} catch (ParserConfigurationException e) {
throw new IOException(e.getMessage());
} catch (SAXException e) {
throw new IOException(e.getMessage());
}
}
public static Trackpoint[] readTrack(File file) throws IOException {
InputStream in = new FileInputStream(file);
try {
return readTrack(in);
} finally {
in.close();
}
}
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
buf.setLength(0);
if (qName.equals("trkpt")) {
lat = Double.parseDouble(attributes.getValue("lat"));
lon = Double.parseDouble(attributes.getValue("lon"));
}
}
@Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (qName.equals("trkpt")) {
track.add(Trackpoint.fromWGS84(lat, lon, ele, time));
} else if (qName.equals("ele")) {
ele = Double.parseDouble(buf.toString());
} else if (qName.equals("time")) {
try {
time = TIME_FORMAT.parse(buf.toString());
} catch (ParseException e) {
throw new SAXException("Invalid time " + buf.toString());
}
}
}
@Override
public void characters(char[] chars, int start, int length)
throws SAXException {
buf.append(chars, start, length);
}
private Trackpoint[] getTrack() {
return track.toArray(new Trackpoint[track.size()]);
}
}
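A likely reason the JDOM snippet in the question prints `Count trk: 0` is that GPX elements live in an XML namespace (for GPX 1.1 it is http://www.topografix.com/GPX/1/1), while `getChildren("trk")` only matches children in no namespace; passing the namespace, e.g. `root.getChildren("trk", root.getNamespace())`, should help there. Also note that `trkpt` elements are nested inside `trkseg`, not directly inside `trk`. Below is a minimal sketch using only the JDK's DOM parser that matches `trkpt` by local name in any namespace. The names `GpxDomSketch` and `readTrackpoints` are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative sketch; GpxDomSketch and readTrackpoints are hypothetical names.
final class GpxDomSketch {

    /** Returns {lat, lon} pairs for every trkpt element, whatever the GPX namespace is. */
    static List<double[]> readTrackpoints(InputStream in) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the namespace-aware lookup below
            Document doc = factory.newDocumentBuilder().parse(in);
            // "*" matches any namespace URI, so GPX 1.0 and 1.1 files are both covered.
            NodeList nodes = doc.getElementsByTagNameNS("*", "trkpt");
            List<double[]> points = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element pt = (Element) nodes.item(i);
                points.add(new double[] {
                    Double.parseDouble(pt.getAttribute("lat")),
                    Double.parseDouble(pt.getAttribute("lon"))
                });
            }
            return points;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse GPX input", e);
        }
    }

    public static void main(String[] args) {
        String gpx = "<gpx xmlns=\"http://www.topografix.com/GPX/1/1\" version=\"1.1\">"
            + "<trk><trkseg>"
            + "<trkpt lat=\"47.1\" lon=\"8.5\"/>"
            + "<trkpt lat=\"47.2\" lon=\"8.6\"/>"
            + "</trkseg></trk></gpx>";
        InputStream in = new ByteArrayInputStream(gpx.getBytes(StandardCharsets.UTF_8));
        for (double[] p : readTrackpoints(in)) {
            System.out.println(p[0] + ":" + p[1]);
        }
    }
}
```

DOM loads the whole file into memory, so for very large tracks the SAX handler above remains the better fit.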

To read GPX files easily in Java see: http://sourceforge.net/p/gpsanalysis/wiki/Home/
example:
//gets points from a GPX file
final List points= GpxFileDataAccess.getPoints(new File("/path/toGpxFile.gpx"));

Ready to use, open source, and fully functional java GpxParser (and much more) here
https://sourceforge.net/projects/geokarambola/
Details here
https://plus.google.com/u/0/communities/110606810455751902142
With the above library, parsing a GPX file is a one-liner:
Gpx gpx = GpxFileIo.parseIn( "SomeGeoCollection.gpx" ) ;
Getting its points, routes or tracks is trivial too:
for(Point pt: gpx.getPoints( ))
Location loc = new Location( pt.getLatitude( ), pt.getLongitude( ) ) ;

Related

dynamic header CSVParser

The code below parses the CSV records when the header is always known in advance, so the values of FILE_HEADER_MAPPING can be declared up front.
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
FileReader fileReader = new FileReader("file");
CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat);
Iterable<CSVRecord> records = csvFileParser.getRecords();
But how can I create the CSVParser for CSV files whose headers differ from file to file?
I will not know the header of the CSV file in advance, so I cannot create the format with
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withHeader(FILE_HEADER_MAPPING);
I want a CSV parser that works for every possible set of CSV headers.
Please help me solve this.
package dfi.fin.dcm.syn.loantrading.engine.source.impl;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.AMOUNT;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.FCN;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.FEE_TYPE;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.LINE_TYPE;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.LINE_TYPE_VALUE_CARRY_EVT;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.MARKIT_ID;
import static dfi.fin.dcm.syn.loantrading.engine.task.impl.BackOfficeCSVHelper.VALUE_DATE;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigDecimal;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Calendar;
import java.util.List;
import com.csvreader.CsvReader.CatastrophicException;
import com.csvreader.CsvReader.FinalizedException;
import dfi.fin.dcm.syn.loantrading.engine.source.SourceException;
import dfi.fin.dcm.syn.loantrading.model.portfolio.Portfolio;
@Deprecated
public class CarryEventStreamSource extends AbstractInputStreamSource<CarryEventData> {
private static String [] headers = {LINE_TYPE,VALUE_DATE,MARKIT_ID,FEE_TYPE,AMOUNT};
private SimpleDateFormat dateFormat = null;
public CarryEventStreamSource(InputStream stream) {
super(stream);
dateFormat = new SimpleDateFormat("dd/MM/yy");
}
public CarryEventData readNextElementInternal() throws SourceException, IOException, CatastrophicException, FinalizedException {
//skipping all events which are not Carry
boolean loop = true;
while (loop) {
// skipping all events which are not Carry
if(getReader().readRecord() && !getReader().get(LINE_TYPE).trim().equals(LINE_TYPE_VALUE_CARRY_EVT)) {
loop = true;
} else {
loop = false;
}
}
//EOF?
if (getReader().get(LINE_TYPE).trim().equals(LINE_TYPE_VALUE_CARRY_EVT)) {
CarryEventData toReturn = new CarryEventData();
toReturn.setComputationDate(Calendar.getInstance().getTime());
try {
toReturn.setValueDate(getDateFormat().parse(getReader().get(VALUE_DATE).trim()));
} catch (ParseException e) {
throw new SourceException(e);
}
if (!getPortfolio().getMtmSourceType().equals(Portfolio.MTM_SOURCE_TYPE_NONE)) {
if (getReader().get(MARKIT_ID).trim() == null) {
throw new SourceException("Back Office file invalid data format: the markit id is missing on line "+getReader().getCurrentRecord());
}
toReturn.setTrancheMarkitId(getReader().get(MARKIT_ID).trim());
} else {
if (getReader().get(FCN)==null || "".equals(getReader().get(FCN).trim())) {
throw new SourceException("Back Office file invalid data format: missing loan tranche id on line "+getReader().getCurrentRecord());
}
toReturn.setTrancheMarkitId(getReader().get(FCN).trim());
}
if (getReader().get(FEE_TYPE).equals("")) {
toReturn.setFeeType(null);
} else {
toReturn.setFeeType(getReader().get(FEE_TYPE).trim());
}
if (getReader().get(AMOUNT)==null) {
throw new SourceException("Back Office file invalid data format: missing amount on line "+getReader().getCurrentRecord());
}
try {
toReturn.setAmount(new BigDecimal(getReader().get(AMOUNT)));
} catch (NumberFormatException ex) {
throw new SourceException(ex,"Back Office file invalid data format: invalid amount on line "+getReader().getCurrentRecord());
}
return toReturn;
}
// no carry found, null is returned
return null;
}
public SimpleDateFormat getDateFormat() {
return dateFormat;
}
public void setDateFormat(SimpleDateFormat dateFormat) {
this.dateFormat = dateFormat;
}
@Override
public char getDelimiter() {
return ',';
}
@Override
public List<String> getHeaderSet() {
return Arrays.asList(headers);
}
@Override
public String getName() {
return "File import";
}
}
package dfi.fin.dcm.syn.loantrading.engine.source.impl;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigDecimal;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Calendar;
import java.util.List;
import com.csvreader.CsvReader.CatastrophicException;
import com.csvreader.CsvReader.FinalizedException;
import dfi.fin.dcm.syn.loantrading.engine.source.SourceException;
import dfi.fin.dcm.syn.loantrading.model.common.LTCurrency;
import dfi.fin.dcm.syn.loantrading.model.engine.event.CurrencyEvent;
public class SpotForexRateStreamSource extends AbstractInputStreamSource<CurrencyEvent> {
private SimpleDateFormat dateFormat;
private static String [] headers = {"CURRENCY","DATE","MID"};
public SpotForexRateStreamSource(InputStream stream) {
super(stream);
dateFormat = new SimpleDateFormat("dd/MM/yy");
}
@Override
public CurrencyEvent readNextElementInternal() throws SourceException, IOException, FinalizedException, CatastrophicException {
//skipping all events which are not Trade
if (getReader().readRecord()) {
CurrencyEvent event = new CurrencyEvent();
//retrieving the currency
LTCurrency currency = getCurrencyDAO().getLTCurrencyByISOCode(getReader().get("CURRENCY"));
event.setCurrency(currency);
try {
event.setDate(getDateFormat().parse(getReader().get("DATE")));
} catch (ParseException e) {
throw new SourceException(e, "Parse error while reading currency event date");
}
event.setExchangeRate(new BigDecimal(getReader().get("MID")));
event.setComputationDate(Calendar.getInstance().getTime());
return event;
}
return null;
}
@Override
public char getDelimiter() {
return ';';
}
public SimpleDateFormat getDateFormat() {
return dateFormat;
}
public void setDateFormat(SimpleDateFormat dateFormat) {
this.dateFormat = dateFormat;
}
@Override
public List<String> getHeaderSet() {
return Arrays.asList(headers);
}
@Override
public String getName() {
return "CSV File";
}
}
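For the original question (headers that are not known in advance), Commons CSV can take the column names from the first record when `withHeader()` is called with no arguments. The same idea in plain JDK code, as a rough sketch: read the first line as the header and key every following row by those names. `DynamicHeaderCsv` and `parse` are invented names, and the naive `split` is a simplification that does not handle quoted fields containing the delimiter.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch; DynamicHeaderCsv and parse are hypothetical names.
final class DynamicHeaderCsv {

    /** Reads the header from the first line and returns one name-to-value map per data row. */
    static List<Map<String, String>> parse(Reader source, char delimiter) {
        String separator = Pattern.quote(String.valueOf(delimiter));
        List<Map<String, String>> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                return rows; // empty input: no header, no rows
            }
            String[] header = headerLine.split(separator, -1);
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] cells = line.split(separator, -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < header.length; i++) {
                    // Missing trailing cells are reported as empty strings.
                    row.put(header[i].trim(), i < cells.length ? cells[i].trim() : "");
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String csv = "CURRENCY;DATE;MID\nEUR;01/02/14;1.35\n";
        for (Map<String, String> row : parse(new StringReader(csv), ';')) {
            System.out.println(row);
        }
    }
}
```

Because rows are keyed by whatever names the file itself declares, the calling code can decide per file which columns it expects, instead of hard-coding one header array per parser.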

How to access an XML file from a .jar executed via PHP's shell_exec function

I am running a jar from an Apache server with PHP.
I use shell_exec and pass it my jar file.
But because I parse an XML file in my Java class, I have a problem: the jar can't access the XML file.
Files
SampleSax.java
package phpJavaPack;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
public class SampleSAX {
public static void main(String[] args) throws IOException, SAXException, ParserConfigurationException {
/*int i=0;
MyBufferedReaderWriter f = new MyBufferedReaderWriter();
f.openRFile("dblp.xml");
String sLine="";
while ((i<10) && (sLine=f.readLine()) != null) {
System.out.println(sLine);
i++;
}*/
int i=0;
while(i<args.length){
System.out.println("Argument's: " + args[0]);
System.setProperty("entityExpansionLimit", "1000000");
SAXParserFactory spfac = SAXParserFactory.newInstance();
spfac.setNamespaceAware(true);
SAXParser saxparser = spfac.newSAXParser();
MyHandler handler = new MyHandler(args[i]);
InputSource is = new InputSource("dblpmini.xml");
is.setEncoding("ISO-8859-1");
System.out.println("Please wait...");
saxparser.parse(is, handler);
System.out.println("---->" + handler.getprofessorsPublications());
i++;
}
//System.out.println("#####################################################################################################");
//System.out.println("List of George A. Vouros: " + handler.getprofessorsPublicationsValue("George A. Vouros"));
//System.out.println(handler.getProfessors());
//handler.createHtmlPage();//emfanizei mia html selida me ta apotelesmata
}
}
MyHandler.java
package phpJavaPack;
import java.util.ArrayList;
import java.util.Hashtable;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class MyHandler extends DefaultHandler {
private Publication publication;
protected Hashtable<String, ArrayList<Publication>> professorsPublications = new Hashtable<String, ArrayList<Publication>>();
private String temp ;
private ArrayList<Publication> al;
//private EntryLd entry = new EntryLd();
// public MyHandler() {
// super();
//
// String[] namesTable = entry.getNamesInput();
//
//
// for (String name: namesTable) {
//
// al = new ArrayList<Publication>();
// professorsPublications.put(name, al);
// }
//
// System.out.println("HashTable: " + professorsPublications);
// }
public MyHandler(String authorForSearch){
super();
String name = authorForSearch;
al = new ArrayList<Publication>();
professorsPublications.put(name, al);
System.out.println("HashTable: " + professorsPublications);
}
public Hashtable<String, ArrayList<Publication>> getprofessorsPublications() {
return professorsPublications;
}
public ArrayList<Publication> getprofessorsPublicationsValue(String author) {
return professorsPublications.get(author);
}
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
{
temp = "";
if (qName.equalsIgnoreCase("article") || qName.equalsIgnoreCase("inproceedings")
|| qName.equalsIgnoreCase("proceedings") || qName.equalsIgnoreCase("book")
|| qName.equalsIgnoreCase("incollection") || qName.equalsIgnoreCase("phdthesis")
|| qName.equalsIgnoreCase("mastersthesis") || qName.equalsIgnoreCase("www")) {
publication = new Publication();
}
}
public void characters(char[] ch, int start, int length) {
temp = new String(ch, start, length);
System.out.println("----->>>" + temp);
//System.out.print(" <--- My temp's start: " + temp + " :My temp's end --> ");
}
public void endElement(String uri, String localName, String qName) throws SAXException
{
if (qName.equalsIgnoreCase("article") || qName.equalsIgnoreCase("inproceedings")
|| qName.equalsIgnoreCase("proceedings") || qName.equalsIgnoreCase("book")
|| qName.equalsIgnoreCase("incollection") || qName.equalsIgnoreCase("phdthesis")
|| qName.equalsIgnoreCase("mastersthesis") || qName.equalsIgnoreCase("www"))
{
for(int i=0; i<publication.getAuthors().size(); i++) {
String authorName = publication.getAuthors().get(i);
if(this.professorsPublications.containsKey(authorName)) {
this.publication.setType(localName);
this.professorsPublications.get(authorName).add(publication);
}
}
}
if(qName.equalsIgnoreCase("author")) {
publication.addAuthor(temp);
}
else if(qName.equalsIgnoreCase("title")) {
publication.setTitle(temp);
}
else if(qName.equalsIgnoreCase("year")) {
publication.setYear(Short.parseShort(temp));
}
else if(qName.equalsIgnoreCase("booktitle")) {
publication.setBooktitle(temp);
}
//String xl = publication.toString();
//System.out.println(xl);
}
}
How can I make the XML file accessible to a runnable jar?
Thanks
See "PHP exec() command: how to specify working directory?"; most probably the proc_open suggestion/example there is the best one.
See the other suggestions in that thread too, such as chdir.
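The underlying issue is probably that `new InputSource("dblpmini.xml")` uses a relative path, which the JVM resolves against the working directory of the process; when started through shell_exec, that need not be the directory containing the jar or the XML file. Besides changing the working directory on the PHP side, the Java side can be told explicitly where the data lives. A small sketch of that idea, where `XmlPathResolver` and the convention of passing the data directory as the first argument are assumptions for illustration, not part of the original program:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch; XmlPathResolver is a hypothetical helper.
final class XmlPathResolver {

    /**
     * Resolves fileName against baseDir when one is given; otherwise falls back
     * to the JVM working directory (the "user.dir" system property).
     */
    static Path resolve(String baseDir, String fileName) {
        String dir = (baseDir == null || baseDir.isEmpty())
            ? System.getProperty("user.dir")
            : baseDir;
        return Paths.get(dir).resolve(fileName).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // Hypothetical convention: the first argument names the data directory.
        String baseDir = args.length > 0 ? args[0] : null;
        Path xml = resolve(baseDir, "dblpmini.xml");
        System.out.println("Reading " + xml);
        // An InputSource could then be created from xml.toUri().toString().
    }
}
```

Printing the resolved absolute path is also a cheap way to see which directory the jar is actually running in when it is launched from PHP.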

Un/marshalling base64 encoded binary data as stream

I have an application importing and exporting data from an Oracle database to/from XML using JAXB. Now there are some BLOB fields in the DB containing uploaded files which I would like to have in the XML as a base64 encoded string. This works quite well out of the box with JAXB using @XmlSchemaType(name = "base64Binary") as done below:
@XmlType
public class DocumentTemplateFile {
// other fields omitted
@XmlElement(required = true)
@XmlSchemaType(name = "base64Binary")
private byte[] data;
// other code omitted
}
The problem with this solution is that the whole file content is stored in memory because of the byte array. Depending on the size of the file this could cause some issues.
I was therefore wondering if there's a way to create an XmlAdapter or similar that gets Streams from and to the file, so that I can stream it directly to / from the DB's BLOB without having the whole content in memory. I was thinking of something similar to this:
public class BlobXmlAdapter extends XmlAdapter<InputStream, OutputStream> {
@Override
public InputStream marshal(final OutputStream value) throws Exception {
return null;
}
@Override
public OutputStream unmarshal(final InputStream value) throws Exception {
return null;
}
}
This is obviously only an illustrative example so that you can have an idea of what I'm looking for. The end solution doesn't necessarily have to make use of XmlAdapters. All I need is a way to hook into the un/marshalling process and stream the data through a buffer / queue rather than storing everything in a byte array.
This solution uses the following third-party library. You should use the following Maven dependency:
<dependency>
<groupId>jlibs</groupId>
<artifactId>jlibs-xsd</artifactId>
<version>2.0</version>
</dependency>
<repository>
<id>jlibs-snapshots-repository</id>
<name>JLibs Snapshots Repository</name>
<url>https://raw.githubusercontent.com/santhosh-tekuri/maven-repository/master</url>
<layout>default</layout>
</repository>
We need to use the following custom XmlAdapter:
import javax.xml.bind.annotation.adapters.XmlAdapter;
import java.io.File;
/**
* @author Santhosh Kumar Tekuri
*/
public class Base64Adapter extends XmlAdapter<String, File>{
@Override
public File unmarshal(String v) throws Exception{
return new File(v);
}
@Override
public String marshal(File v) throws Exception{
throw new UnsupportedOperationException();
}
}
Now change your POJO to use the above adapter:
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlSchemaType;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;
import java.io.File;
@XmlRootElement
public class DocumentTemplateFile {
@XmlElement(required = true)
public String userName;
@XmlElement(required = true)
@XmlSchemaType(name = "base64Binary")
@XmlJavaTypeAdapter(Base64Adapter.class)
public File data;
}
Now the following helper class should be used to read the XML file:
import jlibs.xml.Namespaces;
import jlibs.xml.xsd.DOMLSInputList;
import jlibs.xml.xsd.XSParser;
import jlibs.xml.xsd.XSUtil;
import org.apache.xerces.xs.XSElementDeclaration;
import org.apache.xerces.xs.XSModel;
import org.apache.xerces.xs.XSSimpleTypeDefinition;
import org.apache.xerces.xs.XSTypeDefinition;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.SchemaOutputResolver;
import javax.xml.namespace.QName;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Result;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import java.io.*;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
* @author Santhosh Kumar Tekuri
*/
public class JAXBBlobUtil{
public static XSModel generateSchemas(Class clazz) throws Exception{
final Map<String, ByteArrayOutputStream> schemas = new HashMap<String, ByteArrayOutputStream>();
JAXBContext.newInstance(clazz).generateSchema(new SchemaOutputResolver(){
@Override
public Result createOutput(String namespaceUri, String suggestedFileName) throws IOException{
ByteArrayOutputStream bout = new ByteArrayOutputStream();
schemas.put(suggestedFileName, bout);
StreamResult result = new StreamResult(bout);
result.setSystemId(suggestedFileName);
return result;
}
});
DOMLSInputList lsInputList = new DOMLSInputList();
for(Map.Entry<String, ByteArrayOutputStream> entry : schemas.entrySet()){
ByteArrayInputStream bin = new ByteArrayInputStream(entry.getValue().toByteArray());
lsInputList.addStream(bin, entry.getKey(), null);
}
return new XSParser().parse(lsInputList);
}
private static Object unmarshal(Class clazz, InputSource is) throws Exception{
XSModel xsModel = generateSchemas(clazz);
JAXBContext context = JAXBContext.newInstance(clazz);
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
XMLReader xmlReader = factory.newSAXParser().getXMLReader();
xmlReader = new Base64Filter(xmlReader, xsModel);
return context.createUnmarshaller().unmarshal(new SAXSource(xmlReader, is));
}
private static class Base64Filter extends XMLFilterImpl{
private XSModel schema;
private List<QName> xpath = new ArrayList();
private FileWriter fileWriter;
public Base64Filter(XMLReader parent, XSModel schema){
super(parent);
this.schema = schema;
}
@Override
public void startDocument() throws SAXException{
xpath.clear();
super.startDocument();
}
@Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException{
super.startElement(uri, localName, qName, atts);
xpath.add(new QName(uri, localName));
XSElementDeclaration elem = XSUtil.findElementDeclaration(schema, this.xpath);
if(elem!=null){
XSTypeDefinition type = elem.getTypeDefinition();
if(type.getTypeCategory()==XSTypeDefinition.SIMPLE_TYPE){
XSSimpleTypeDefinition simpleType = (XSSimpleTypeDefinition)type;
while(!Namespaces.URI_XSD.equals(simpleType.getNamespace()))
simpleType = (XSSimpleTypeDefinition)simpleType.getBaseType();
if("base64Binary".equals(simpleType.getName())){
try{
File file = File.createTempFile("data", "binary");
file.deleteOnExit();
fileWriter = new FileWriter(file);
String absolutePath = file.getAbsolutePath();
super.characters(absolutePath.toCharArray(), 0, absolutePath.length());
}catch(IOException ex){
throw new SAXException(ex);
}
}
}
}
}
@Override
public void characters(char[] ch, int start, int length) throws SAXException{
try{
if(fileWriter==null)
super.characters(ch, start, length);
else
fileWriter.write(ch, start, length);
}catch(IOException ex){
throw new SAXException(ex);
}
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException{
xpath.remove(xpath.size() - 1);
try{
if(fileWriter!=null)
fileWriter.close();
fileWriter = null;
}catch(IOException ex){
throw new SAXException(ex);
}
super.endElement(uri, localName, qName);
}
};
}
Now read the XML file as below:
public static void main(String[] args) throws Exception{
DocumentTemplateFile obj = (DocumentTemplateFile)unmarshal(DocumentTemplateFile.class, new InputSource("sample.xml"));
// obj.data refers to File which contains base64 encoded data
}
Create a custom XmlAdapter as below:
public class Base64FileAdapter extends XmlAdapter<String, File>{
@Override
public String marshal(File file) throws Exception {
// todo: read file and convert to base64 and return
}
@Override
public File unmarshal(String data) throws Exception {
File file = File.createTempFile("dataFile", "binary");
file.deleteOnExit();
//todo: base64 decode string data and write bytes to file
return file;
}
}
Now use it inside your bean:
@XmlElement(required = true)
@XmlJavaTypeAdapter(Base64FileAdapter.class)
private File dataFile;
Now the entire binary content is stored in a file. You can read from and write to this file, and it is deleted on JVM exit.
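The two TODOs in Base64FileAdapter can be filled without holding the decoded bytes in memory by using the streaming wrappers of `java.util.Base64` (Java 8 or later). The following is a sketch under that assumption: `Base64Files`, `encodeTo`, `decodeTo` and `roundTrip` are invented helper names. Keep in mind that an `XmlAdapter<String, File>` still receives the Base64 text itself as a single String.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Illustrative sketch; all names in this class are hypothetical.
final class Base64Files {

    /** Streams the raw bytes of source into target as Base64 text. */
    static void encodeTo(Path source, OutputStream target) {
        // Closing the wrapper writes the final padding; it also closes target.
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Base64.getEncoder().wrap(target)) {
            copy(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Streams Base64 text from base64 into target as raw bytes. */
    static void decodeTo(InputStream base64, Path target) {
        // Base64.getMimeDecoder() would additionally tolerate line breaks in the text.
        try (InputStream in = Base64.getDecoder().wrap(base64);
             OutputStream out = Files.newOutputStream(target)) {
            copy(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    /** Demo only: encodes data via a temp file and decodes it again. */
    static byte[] roundTrip(byte[] data) {
        try {
            Path raw = Files.createTempFile("raw", ".bin");
            Path decoded = Files.createTempFile("decoded", ".bin");
            try {
                Files.write(raw, data);
                ByteArrayOutputStream text = new ByteArrayOutputStream();
                encodeTo(raw, text);
                decodeTo(new ByteArrayInputStream(text.toByteArray()), decoded);
                return Files.readAllBytes(decoded);
            } finally {
                Files.deleteIfExists(raw);
                Files.deleteIfExists(decoded);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] back = roundTrip(new byte[] {1, 2, 3, 4, 5});
        System.out.println("Round trip returned " + back.length + " bytes");
    }
}
```

In a real import/export, the in-memory streams of the demo would be replaced by the BLOB's stream on one side and the XML reader or writer on the other, so only the 8 KB copy buffer is held at a time.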

Extracting links of a facebook page

How can I extract all the links of a Facebook page? Can I extract them using jsoup and pass the "like" link as a parameter to extract the info of all the users who liked that particular page?
private static String readAll(Reader rd) throws IOException
{
StringBuilder sb = new StringBuilder();
int cp;
while ((cp = rd.read()) != -1)
{
sb.append((char) cp);
}
return sb.toString();
}
public static JSONObject readurl(String url) throws IOException, JSONException
{
InputStream is = new URL(url).openStream();
try
{
BufferedReader rd = new BufferedReader
(new InputStreamReader(is, Charset.forName("UTF-8")));
String jsonText = readAll(rd);
JSONObject json = new JSONObject(jsonText);
return json;
}
finally
{
is.close();
}
}
public static void main(String[] args) throws IOException,
JSONException, FacebookException
{
try
{
System.out.println("\nEnter the search string:");
@SuppressWarnings("resource")
Scanner sc=new Scanner(System.in);
String s=sc.nextLine();
JSONObject json = readurl("https://graph.facebook.com/"+s);
System.out.println(json);
}}
Can I modify this and integrate this code? The code below extracts all the links of a particular page. I tried it with the above code, but it's not working.
String url = "http://www.firstpost.com/tag/crime-in-india";
Document doc = Jsoup.connect(url).get();
Elements links = doc.getElementsByTag("a");
System.out.println(links.size());
for (Element link : links)
{
System.out.println(link.absUrl("href") +trim(link.text(), 35));
}
}
public static String trim(String s, int width) {
if (s.length() > width)
return s.substring(0, width-1) + ".";
else
return s;
}
}
You can also try an alternative approach like this:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTML.Tag;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
public class URLExtractor {
private static class HTMLPaserCallBack extends HTMLEditorKit.ParserCallback {
private Set<String> urls;
public HTMLPaserCallBack() {
urls = new LinkedHashSet<String>();
}
public Set<String> getUrls() {
return urls;
}
@Override
public void handleSimpleTag(Tag t, MutableAttributeSet a, int pos) {
handleTag(t, a, pos);
}
@Override
public void handleStartTag(Tag t, MutableAttributeSet a, int pos) {
handleTag(t, a, pos);
}
private void handleTag(Tag t, MutableAttributeSet a, int pos) {
if (t == Tag.A) {
Object href = a.getAttribute(HTML.Attribute.HREF);
if (href != null) {
String url = href.toString();
if (!urls.contains(url)) {
urls.add(url);
}
}
}
}
}
public static void main(String[] args) throws IOException {
InputStream is = null;
try {
String u = "https://www.facebook.com/";
URL url = new URL(u);
is = url.openStream(); // throws an IOException
HTMLPaserCallBack cb = new HTMLPaserCallBack();
new ParserDelegator().parse(new BufferedReader(new InputStreamReader(is)), cb, true);
for (String aUrl : cb.getUrls()) {
System.out.println("Found URL: " + aUrl);
}
} catch (MalformedURLException mue) {
mue.printStackTrace();
} catch (IOException ioe) {
ioe.printStackTrace();
} finally {
try {
is.close();
} catch (IOException ioe) {
// nothing to see here
}
}
}
}
This kind of works, but I'm not sure you can use jsoup for this. I would rather look into casperjs or phantomjs.
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class getFaceBookLinks {
public static Elements getElementsByTag_then_FilterBySelector (String tag, String httplink, String selector){
Document doc = null;
try {
doc = Jsoup.connect(httplink).get();
} catch (IOException e) {
e.printStackTrace();
}
Elements links = doc.getElementsByTag(tag);
return links.select(selector);
}
//Test functionality
public static void main(String[] args){
// The class name for the like links on facebook is UFILikeLink
Elements likeLinks = getElementsByTag_then_FilterBySelector("a", "http://www.facebook.com", ".UFILikeLink");
System.out.println(likeLinks);
}
}

Get line number from xml node - java

I have parsed an XML file and have gotten a Node that I am interested in. How can I now find the line number in the source XML file where this node occurs?
EDIT:
Currently I am using the SAXParser to parse my XML. However I will be happy with a solution using any parser.
Along with the Node, I also have the XPath expression for the node.
I need to get the line number because I am displaying the XML file in a textbox, and need to highlight the line where the node occurred. Assume that the XML file is nicely formatted with sufficient line breaks.
I have got this working by following this example:
http://eyalsch.wordpress.com/2010/11/30/xml-dom-2/
This solution follows the method suggested by Michael Kay. Here is how you use it:
// XmlTest.java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
public class XmlTest {
public static void main(final String[] args) throws Exception {
String xmlString = "<foo>\n"
+ " <bar>\n"
+ " <moo>Hello World!</moo>\n"
+ " </bar>\n"
+ "</foo>";
InputStream is = new ByteArrayInputStream(xmlString.getBytes());
Document doc = PositionalXMLReader.readXML(is);
is.close();
Node node = doc.getElementsByTagName("moo").item(0);
System.out.println("Line number: " + node.getUserData("lineNumber"));
}
}
If you run this program, it will output: "Line number: 3"
PositionalXMLReader is a slightly modified version of the example linked above.
// PositionalXMLReader.java
import java.io.IOException;
import java.io.InputStream;
import java.util.Stack;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class PositionalXMLReader {
final static String LINE_NUMBER_KEY_NAME = "lineNumber";
public static Document readXML(final InputStream is) throws IOException, SAXException {
final Document doc;
SAXParser parser;
try {
final SAXParserFactory factory = SAXParserFactory.newInstance();
parser = factory.newSAXParser();
final DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
final DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.newDocument();
} catch (final ParserConfigurationException e) {
throw new RuntimeException("Can't create SAX parser / DOM builder.", e);
}
final Stack<Element> elementStack = new Stack<Element>();
final StringBuilder textBuffer = new StringBuilder();
final DefaultHandler handler = new DefaultHandler() {
private Locator locator;
@Override
public void setDocumentLocator(final Locator locator) {
this.locator = locator; // Save the locator, so that it can be used later for line tracking when traversing nodes.
}
@Override
public void startElement(final String uri, final String localName, final String qName, final Attributes attributes)
throws SAXException {
addTextIfNeeded();
final Element el = doc.createElement(qName);
for (int i = 0; i < attributes.getLength(); i++) {
el.setAttribute(attributes.getQName(i), attributes.getValue(i));
}
el.setUserData(LINE_NUMBER_KEY_NAME, String.valueOf(this.locator.getLineNumber()), null);
elementStack.push(el);
}
@Override
public void endElement(final String uri, final String localName, final String qName) {
addTextIfNeeded();
final Element closedEl = elementStack.pop();
if (elementStack.isEmpty()) { // Is this the root element?
doc.appendChild(closedEl);
} else {
final Element parentEl = elementStack.peek();
parentEl.appendChild(closedEl);
}
}
@Override
public void characters(final char ch[], final int start, final int length) throws SAXException {
textBuffer.append(ch, start, length);
}
// Outputs text accumulated under the current node
private void addTextIfNeeded() {
if (textBuffer.length() > 0) {
final Element el = elementStack.peek();
final Node textNode = doc.createTextNode(textBuffer.toString());
el.appendChild(textNode);
textBuffer.delete(0, textBuffer.length());
}
}
};
parser.parse(is, handler);
return doc;
}
}
If you are using a SAX parser, the line number of an event can be obtained from the Locator object, which the parser passes to the ContentHandler through the setDocumentLocator() callback. This callback is invoked at the start of parsing, before any other event, and you need to save the Locator. Then, during any event callback (such as startElement()), you can call methods such as getLineNumber() to obtain the current position in the source file. (For startElement(), the reported line is the one on which the ">" of the start tag appears.)
Note that, according to the specification of Locator.getLineNumber(), the method returns the line number where the SAX event ends.
For startElement() this means:
Here the line number for Element is 1:
<Element></Element>
Here the line number for Element is 3:
<Element
attribute1="X"
attribute2="Y">
</Element>
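To check this behaviour yourself, here is a minimal sketch that records the line number reported during startElement() for each element. The class name LocatorDemo and the helper startTagLines are made up for illustration, and the exact numbers depend on the parser in use (the Locator only promises an approximation), so treat it as an experiment rather than a guarantee.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any answer above.
class LocatorDemo {

    /** Maps each element name to the line number reported during its startElement() event. */
    static Map<String, Integer> startTagLines(String xml) throws Exception {
        final Map<String, Integer> lines = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called before any other event; keep the reference for later callbacks.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // A parser is not required to supply a Locator, so guard against null.
                lines.put(qName, locator == null ? -1 : locator.getLineNumber());
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String multiLine = "<Element\n  attribute1=\"X\"\n  attribute2=\"Y\">\n</Element>";
        // Prints the element name with the line on which its start tag ends.
        System.out.println(startTagLines(multiLine));
    }
}
```

With the parser bundled in the JDK, the multi-line start tag should be reported on the line where its closing ">" sits, matching the second example above.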
priomsrb's answer is great and works. For my use case I needed to integrate it into an existing framework that also handles the encoding, so I refactored the handler into a separate LineNumberHandler class.
With that change the code also works with a SAX InputSource, on which the encoding can be set like this:
// read in the xml document
org.xml.sax.InputSource is=new org.xml.sax.InputSource();
is.setByteStream(instream);
if (encoding!=null) {
is.setEncoding(encoding);
if (Debug.CORE)
Debug.log("setting XML encoding to - "+is.getEncoding());
}
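As a self-contained illustration of the same idea, the sketch below feeds ISO-8859-1 bytes without an XML declaration to a SAXParser through an InputSource with an explicit encoding. EncodingDemo and readText are hypothetical names used only for this example, and the handler just collects character data instead of building a DOM.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the framework mentioned above.
class EncodingDemo {

    /** Parses the stream with the given encoding (if any) and returns all character data. */
    static String readText(InputStream in, String encoding) throws Exception {
        InputSource source = new InputSource();
        source.setByteStream(in);
        if (encoding != null) {
            // Tells the parser which encoding to use for the byte stream.
            source.setEncoding(encoding);
        }
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        // SAXParser.parse accepts an InputSource as well as an InputStream.
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // "caf\u00e9" encoded as ISO-8859-1; the single byte 0xE9 is not valid UTF-8.
        byte[] latin1 = "<a>caf\u00e9</a>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(readText(new ByteArrayInputStream(latin1), "ISO-8859-1"));
    }
}
```

A readXML variant that takes an InputSource could pass it to parser.parse in the same way, keeping the LineNumberHandler unchanged.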
Separate LineNumberHandler
/**
* LineNumber Handler
* @author wf
*
*/
public static class LineNumberHandler extends DefaultHandler {
final Stack<Element> elementStack = new Stack<Element>();
final StringBuilder textBuffer = new StringBuilder();
private Locator locator;
private Document doc;
/**
* Create a line number handler for the given document.
* @param doc the document to which the parsed elements are added
*/
public LineNumberHandler(Document doc) {
this.doc=doc;
}
@Override
public void setDocumentLocator(final Locator locator) {
this.locator = locator; // Save the locator, so that it can be used
// later for line tracking when traversing
// nodes.
}
@Override
public void startElement(final String uri, final String localName,
final String qName, final Attributes attributes) throws SAXException {
addTextIfNeeded();
final Element el = doc.createElement(qName);
for (int i = 0; i < attributes.getLength(); i++) {
el.setAttribute(attributes.getQName(i), attributes.getValue(i));
}
el.setUserData(LINE_NUMBER_KEY_NAME,
String.valueOf(this.locator.getLineNumber()), null);
elementStack.push(el);
}
@Override
public void endElement(final String uri, final String localName,
final String qName) {
addTextIfNeeded();
final Element closedEl = elementStack.pop();
if (elementStack.isEmpty()) { // Is this the root element?
doc.appendChild(closedEl);
} else {
final Element parentEl = elementStack.peek();
parentEl.appendChild(closedEl);
}
}
@Override
public void characters(final char ch[], final int start, final int length)
throws SAXException {
textBuffer.append(ch, start, length);
}
// Outputs text accumulated under the current node
private void addTextIfNeeded() {
if (textBuffer.length() > 0) {
final Element el = elementStack.peek();
final Node textNode = doc.createTextNode(textBuffer.toString());
el.appendChild(textNode);
textBuffer.delete(0, textBuffer.length());
}
}
}
PositionalXMLReader
public class PositionalXMLReader {
final static String LINE_NUMBER_KEY_NAME = "lineNumber";
/**
* Read a document from the given input stream.
*
* @param is
* - the input stream
* @return - the Document
* @throws IOException
* @throws SAXException
*/
public static Document readXML(final InputStream is)
throws IOException, SAXException {
final Document doc;
SAXParser parser;
try {
final SAXParserFactory factory = SAXParserFactory.newInstance();
parser = factory.newSAXParser();
final DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
final DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.newDocument();
} catch (final ParserConfigurationException e) {
throw new RuntimeException("Can't create SAX parser / DOM builder.", e);
}
LineNumberHandler handler = new LineNumberHandler(doc);
parser.parse(is, handler);
return doc;
}
}
JUnit Testcase
package com.bitplan.common.impl;
import static org.junit.Assert.assertEquals;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import org.junit.Test;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import com.bitplan.bobase.PositionalXMLReader;
public class TestXMLWithLineNumbers {
/**
* Get an example XML stream.
* @return the example stream
*/
public InputStream getExampleXMLStream() {
String xmlString = "<foo>\n" + " <bar>\n"
+ " <moo>Hello World!</moo>\n" + " </bar>\n" + "</foo>";
InputStream is = new ByteArrayInputStream(xmlString.getBytes());
return is;
}
@Test
public void testXMLWithLineNumbers() throws Exception {
InputStream is = this.getExampleXMLStream();
Document doc = PositionalXMLReader.readXML(is);
is.close();
Node node = doc.getElementsByTagName("moo").item(0);
assertEquals("3", node.getUserData("lineNumber"));
}
}
