Unable to parse JSON from url - java

Write a piece of code that will query a URL that returns JSON and can parse the JSON string to pull out pieces of information. The information that should be parsed and returned is the pageid and the list of “See Also” links. Those links should be formatted to be actual links that can be used by a person to find the appropriate article.
Use the Wikipedia API for the query. A sample query is:
URL
Other queries can be generated changing the “titles” portion of the query string. The code to parse the JSON and pull the “See Also” links should be generic enough to work on any Wikipedia article.
I tried writing the below code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import org.json.JSONException;
import org.json.JSONObject;
public class JsonRead {
private static String readUrl(String urlString) throws Exception {
BufferedReader reader = null;
try {
URL url = new URL(urlString);
reader = new BufferedReader(new InputStreamReader(url.openStream()));
StringBuffer buffer = new StringBuffer();
int read;
char[] chars = new char[1024];
while ((read = reader.read(chars)) != -1)
buffer.append(chars, 0, read);
return buffer.toString();
} finally {
if (reader != null)
reader.close();
}
}
public static void main(String[] args) throws IOException, JSONException {
JSONObject json;
try {
json = new JSONObject(readUrl("https://en.wikipedia.org/w/api.php?format=json&action=query&titles=SMALL&prop=revisions&rvprop=content"));
System.out.println(json.toString());
System.out.println(json.get("pageid"));
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
I have used the json jar from the below link in eclipse:
Json jar
When I run the above code I am getting the below error;
org.json.JSONException: JSONObject["pageid"] not found.
at org.json.JSONObject.get(JSONObject.java:471)
at JsonRead.main(JsonRead.java:35)
How can I extract the details of the pageid and also the "See Also" links from the url?
I have never worked on JSON before hence kindly let me know how to proceed here
The json:
{
"batchcomplete":"",
"query":{
"pages":{
"1808130":{
"pageid":1808130,
"ns":0,
"title":"SMALL",
"revisions":[
{
"contentformat":"text/x-wiki",
"contentmodel":"wikitext",
"*":"{{About|the ALGOL-like programming language|the scripting language formerly named Small|Pawn (scripting language)}}\n\n'''SMALL''', Small Machine Algol Like Language, is a [[computer programming|programming]] [[programming language|language]] developed by Dr. [[Nevil Brownlee]] of [[Auckland University]].\n\n==History==\nThe aim of the language was to enable people to write [[ALGOL]]-like code that ran on a small machine. It also included the '''string''' type for easier text manipulation.\n\nSMALL was used extensively from about 1980 to 1985 at [[Auckland University]] as a programming teaching aid, and for some internal projects. Originally written to run on a [[Burroughs Corporation]] B6700 [[Main frame]] in [[Fortran]] IV, subsequently rewritten in SMALL and ported to a DEC [[PDP-10]] Architecture (on the [[Operating System]] [[TOPS-10]]) and IBM S360 Architecture (on the Operating System VM/[[Conversational Monitor System|CMS]]).\n\nAbout 1985, SMALL had some [[Object-oriented programming|object-oriented]] features added to handle structures (that were missing from the early language), and to formalise file manipulation operations.\n\n==See also==\n*[[ALGOL]]\n*[[Lua (programming language)]]\n*[[Squirrel (programming language)]]\n\n==References==\n*[http://www.caida.org/home/seniorstaff/nevil.xml Nevil Brownlee]\n\n[[Category:Algol programming language family]]\n[[Category:Systems programming languages]]\n[[Category:Procedural programming languages]]\n[[Category:Object-oriented programming languages]]\n[[Category:Programming languages created in the 1980s]]"
}
]
}
}
}
}

If you read your exception carefully, you will find the solution on your own.
Exception in thread "main" org.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
at org.json.JSONTokener.syntaxError(JSONTokener.java:433)
The exception says that a JSONObject text must begin with '{', which means the JSON you received from the API is probably not valid.
So I suggest you debug your code and find out what you actually received in your String variable jsonText.

You get the exception org.json.JSONException: JSONObject["pageid"] not found. when calling json.get("pageid") because pageid is not a direct sub-element of your root. You have to go all the way down through the object graph:
int pid = json.getJSONObject("query")
.getJSONObject("pages")
.getJSONObject("1808130")
.getInt("pageid");
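If there is an array in there, you will also have to iterate over its elements (or pick the one you want). Note that the page id key (1808130 here) is different for every article, so for a generic solution iterate over the keys of the pages object (for example with keys()) instead of hard-coding it.
Edit: here's the code to get the field containing the 'See also' values: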
String s = json.getJSONObject("query")
.getJSONObject("pages")
.getJSONObject("1808130")
.getJSONArray("revisions")
.getJSONObject(0)
.getString("*");
The resulting string is wikitext, not JSON, so you will have to parse it manually.
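One way to do that manual parsing is sketched below. It assumes the wikitext has a level-2 heading spelled "See also" and that every `[[...]]` link in that section is an article link; the class name `SeeAlsoLinks` and the `https://en.wikipedia.org/wiki/` prefix for English Wikipedia are choices made for this illustration, not something the API returns.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeeAlsoLinks {
    // Level-2 heading "==See also==", case-insensitive, on a line of its own
    private static final Pattern HEADING =
            Pattern.compile("(?mi)^==\\s*See also\\s*==\\s*$");
    // Start of the next level-2 heading (deeper "===" headings stay in the section)
    private static final Pattern NEXT_HEADING = Pattern.compile("(?m)^==[^=]");
    // [[Target]], [[Target|label]] or [[Target#anchor]]; group 1 is the target title
    private static final Pattern LINK = Pattern.compile("\\[\\[([^\\]|#]+)[^\\]]*\\]\\]");

    /** Returns clickable article URLs for the links in the "See also" section. */
    public static List<String> seeAlsoLinks(String wikitext) {
        List<String> links = new ArrayList<>();
        Matcher heading = HEADING.matcher(wikitext);
        if (!heading.find()) {
            return links; // the article has no "See also" section
        }
        String rest = wikitext.substring(heading.end());
        Matcher next = NEXT_HEADING.matcher(rest);
        String section = next.find() ? rest.substring(0, next.start()) : rest;
        Matcher link = LINK.matcher(section);
        while (link.find()) {
            // Wikipedia article URLs use underscores instead of spaces
            String title = link.group(1).trim().replace(' ', '_');
            links.add("https://en.wikipedia.org/wiki/" + encode(title));
        }
        return links;
    }

    private static String encode(String title) {
        try {
            return URLEncoder.encode(title, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

Passing the string `s` obtained above to `SeeAlsoLinks.seeAlsoLinks(s)` should yield one URL per bullet in the section. Parentheses in titles come out percent-encoded (`%28`, `%29`), which Wikipedia still resolves.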

Related

Parse a String java

I have a StringBuilder that contains the same result as in this link:
https://hadoop.apache.org/docs/current/hadoop-project-dist/
I want to extract the values of the `` and return a list of Strings containing all the file names.
My code is:
try {
HttpURLConnection conHttp = (HttpURLConnection) url.openConnection();
conHttp.setRequestMethod("GET");
conHttp.setDoInput(true);
InputStream in = conHttp.getInputStream();
int ch;
StringBuilder sb = new StringBuilder();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
How can I parse the JSON to take all the values of pathSuffix and return a list of Strings containing the file names?
Could you please give me a suggestion? Thanks.
That is JSON-formatted data. JSON is not a regular language, so parsing it with a regular expression is impossible, and trying to pick it apart with substring and friends will take you a week and be very error-prone.
Read up on what JSON is (no worries; it's very simple to understand!), then get a good JSON library (the standard json.org library absolutely sucks, don't get that one), such as Jackson or GSON, and the code to extract what you need will be robust and easy to write and test.
The good option
Do the following steps:
Parse the string into a JSON object
Get the array using (Gson-style calls on the parsed root object, adjust to your library): root.getAsJsonObject("FileStatuses").getAsJsonArray("FileStatus")
Iterate over all objects in the array to get the value you want
The bad option
Although, as mentioned, it is not recommended, if you want to stay with Strings you can use:
// adjust the spacing around the colon to match your actual response
String strToFind = "\"pathSuffix\":\"";
while (str.indexOf(strToFind) != -1) {
// drop everything up to and including the opening quote of the value
str = str.substring(str.indexOf(strToFind) + strToFind.length());
String value = str.substring(0, str.indexOf("\""));
System.out.println("Value is " + value);
}
I would not recommend building an API binding for Hadoop from scratch.
Such a binding already exists for the Java language:
https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#listLocatedStatus-org.apache.hadoop.fs.Path-org.apache.hadoop.fs.PathFilter-

SLOW SPEED in using SAX Parser to parse XML data and save it to mysql localhost (JAVA)

I am programming in JAVA for my current program with the problem.
I have to parse a big .rdf file(XML format) which is 1.60 GB in size,
and then insert the parsed data to mysql localhost server.
After googling, I decided to use SAX parser in my code.
Many sites encouraged using SAX parser over DOM parser,
saying that SAX parser is much faster than DOM parser.
However, when I executed my code, which uses a SAX parser, I found that
my program runs very slowly.
One senior in my lab told me that the slow speed issue might have occurred
from file I/O process.
In the code of 'javax.xml.parsers.SAXParser.class',
'InputStream' is used for file input, which could make the job slow compared
to using 'Scanner' class or 'BufferedReader' class.
My question is..
1. Are SAX parsers good for parsing large-scale xml documents?
My program took 10 minutes to parse a 14MB sample file and insert data
to mysql localhost.
Actually, another senior in my lab who made a similar program
as mine but using DOM parser parses the 1.60GB xml file and saves data
in an hour.
How can I use 'BufferedReader' instead of using 'InputStream',
while using the SAX parser library?
This is my first question on Stack Overflow, so any advice would be appreciated. Thank you for reading.
Added after receiving initial feedback
I should have uploaded my code to clarify my problem; I apologize for that.
package xml_parse;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
public class Readxml extends DefaultHandler {
Connection con = null;
String[] chunk; // to check /A/, /B/, /C/ kind of stuff.
public Readxml() throws SQLException {
// connect to local mysql database
con = DriverManager.getConnection("jdbc:mysql://localhost/lab_first",
"root", "2030kimm!");
}
public void getXml() {
try {
// obtain and configure a SAX based parser
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
// obtain object for SAX parser
SAXParser saxParser = saxParserFactory.newSAXParser();
// default handler for SAX handler class
// all three methods are written in handler's body
DefaultHandler default_handler = new DefaultHandler() {
String topic_gate = "close", category_id_gate = "close",
new_topic_id, new_catid, link_url;
java.sql.Statement st = con.createStatement();
public void startElement(String uri, String localName,
String qName, Attributes attributes)
throws SAXException {
if (qName.equals("Topic")) {
topic_gate = "open";
new_topic_id = attributes.getValue(0);
// apostrophe escape in SQL query
new_topic_id = new_topic_id.replace("'", "''");
if (new_topic_id.contains("International"))
topic_gate = "close";
if (new_topic_id.equals("") == false) {
chunk = new_topic_id.split("/");
for (int i = 0; i < chunk.length - 1; i++)
if (chunk[i].length() == 1) {
topic_gate = "close";
break;
}
}
if (new_topic_id.startsWith("Top/"))
new_topic_id = new_topic_id.replace("Top/", ""); // Strings are immutable: keep the result
}
if (topic_gate.equals("open") && qName.equals("catid"))
category_id_gate = "open";
// add each new link to table "links" (MySQL)
if (topic_gate.equals("open") && qName.contains("link")) {
link_url = attributes.getValue(0);
link_url = link_url.replace("'", "''"); // take care of
// apostrophe
// escape
String insert_links_command = "insert into links(link_url, catid) values('"
+ link_url + "', " + new_catid + ");";
try {
st.executeUpdate(insert_links_command);
} catch (SQLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
if (category_id_gate.equals("open")) {
new_catid = new String(ch, start, length);
// add new row to table "Topics" (MySQL)
String insert_topics_command = "insert into topics(topic_id, catid) values('"
+ new_topic_id + "', " + new_catid + ");";
try {
st.executeUpdate(insert_topics_command);
} catch (SQLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
if (qName.equals("Topic"))
topic_gate = "close";
if (qName.equals("catid"))
category_id_gate = "close";
}
};
// BufferedInputStream!!
String filepath = null;
BufferedInputStream buffered_input = null;
/*
* // Content filepath =
* "C:/Users/Kim/Desktop/2016여름/content.rdf.u8/content.rdf.u8";
* buffered_input = new BufferedInputStream(new FileInputStream(
* filepath)); saxParser.parse(buffered_input, default_handler);
*
* // Adult filepath =
* "C:/Users/Kim/Desktop/2016여름/ad-content.rdf.u8"; buffered_input =
* new BufferedInputStream(new FileInputStream( filepath));
* saxParser.parse(buffered_input, default_handler);
*/
// Kids-and-Teens
filepath = "C:/Users/Kim/Desktop/2016여름/kt-content.rdf.u8";
buffered_input = new BufferedInputStream(new FileInputStream(
filepath));
saxParser.parse(buffered_input, default_handler);
System.out.println("Finished.");
} catch (SQLException sqex) {
System.out.println("SQLException: " + sqex.getMessage());
System.out.println("SQLState: " + sqex.getSQLState());
} catch (Exception e) {
e.printStackTrace();
}
}
}
This is my whole code of my program..
My original code from yesterday tried file I/O like the following way
(instead of using 'BufferedInputStream')
saxParser.parse("file:///C:/Users/Kim/Desktop/2016여름/content.rdf.u8/content.rdf.u8",
default_handler);
I expected some speed improvements in my program after I used
'BufferedInputStream', but speed didn't improve at all.
I am having trouble figuring out the bottleneck causing the speed issue.
Thank you very much.
the rdf file being read in the code is about 14 MB in size, and it takes about
11 minutes for my computer to execute this code.
Are SAX parsers good for parsing large-scale xml documents?
Yes. SAX and StAX parsers are the best choices for parsing big XML documents because they consume little memory and CPU. DOM parsers load the entire document into memory, which is clearly not the right choice in this case.
Response Update:
Looking at your code, I think the slowness comes from how you store the data in your database rather than from parsing. Your current code executes every query in auto-commit mode; since you have a lot of data to insert, you should use transactional mode (disable auto-commit and commit explicitly) for better performance. To reduce the round trips between your application and the database, you should also consider using batch updates.
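As a rough illustration of that advice, the sketch below inserts rows into the links table from the question inside a single transaction, using a PreparedStatement and JDBC batching. The class name, the `String[]` row representation, the batch size of 1000, and treating catid as a numeric column are assumptions made for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInsertSketch {
    /** Hypothetical batch size; tune it for your driver and data. */
    static final int BATCH_SIZE = 1000;

    /**
     * Inserts {link_url, catid} rows in one transaction.
     * Returns the number of batches sent to the database.
     */
    public static int insertLinks(Connection con, List<String[]> rows) throws SQLException {
        String sql = "insert into links(link_url, catid) values(?, ?)";
        int batches = 0;
        con.setAutoCommit(false); // one commit at the end instead of one per row
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]); // placeholders also remove the manual quote escaping
                ps.setLong(2, Long.parseLong(row[1]));
                ps.addBatch();
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch(); // one round trip for the whole batch
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the final partial batch
                batches++;
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback(); // leave the table untouched if anything failed
            throw e;
        }
        return batches;
    }
}
```

In the SAX handler this would mean collecting rows in a list (or calling `addBatch()` directly from `startElement`) instead of calling `executeUpdate` for every element.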
With a SAX parser you should be able to achieve a parsing speed of 1 GB/minute without too much difficulty. If it's taking 10 minutes to parse 14 MB, then either you are doing something wrong or the time is being spent on something other than SAX parsing (e.g. database updates).
You can keep the SAX parser and use a BufferedInputStream rather than a BufferedReader (you then need not guess the charset encoding of the XML).
With XML in general, it could be that extra files are being read: DTDs and such. For instance, there is a huge number of named entities for (X)HTML. Using an XML catalog to keep those remote files locally then helps enormously.
Maybe you can switch off validation.
You might also weigh network traffic against CPU cost by using gzip compression. By setting and inspecting headers, a GZIPInputStream used case by case might be more efficient (or not).
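A minimal sketch of these suggestions, using only the JDK: it parses from a BufferedInputStream, leaves validation off, and answers every external entity request (such as a remote DTD) with an empty document so that nothing is fetched over the network. The class name and the choice to count `link` elements are illustrative. An XML catalog is the more complete solution when the DTD's entity definitions are actually needed.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxNoRemoteDtd {
    /** Counts elements named "link" without validating or fetching external DTDs. */
    public static int countLinks(InputStream xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false); // already the default, stated explicitly here
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if (qName.equals("link")) {
                    count[0]++;
                }
            }

            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Replace any external DTD or entity with an empty document
                return new InputSource(new StringReader(""));
            }
        };
        // Pass bytes, not a Reader, so the parser detects the encoding itself
        parser.parse(new BufferedInputStream(xml), handler);
        return count[0];
    }
}
```

Timing a run of a handler like this one, with no database work in it, against the 14 MB sample is a quick way to see whether parsing or the inserts dominate the 10 minutes.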

pass object to another JVM using serialization - same Java version and jars (both running our app)

Updates:
For now we are using a Map. A class that wants to send something to the other instance sends the object and a routing string.
Use an object stream and Java serialization to write the object to the servlet.
Write the String first and then the object.
The receiving servlet wraps the input stream in an ObjectInputStream and reads the String first and then the Object. The routing string decides where it goes.
A more generic way might have been to send a class name and its declared method or a Spring bean name, but this was enough for us.
Original question
Know the basic way but want details of steps. Also know I can use Jaxb or RMI or EJB ... but would like to do this using pure serialization to a bytearray and then encode that send it from servlet 1 in jvm 1 to servlet 2 in jvm 2 (two app server instances in same LAN, same java versions and jars set up in both J2EE apps)
Basic steps are (Approach 1):
1. Serialize any Serializable object to a byte array and make a string. Exact code see below.
2. Base64 the output of 1. Is Base64 required, or can I skip step 2?
3. Use java.net.URLEncoder.encode to encode the string.
4. Use Apache HTTP components or the URL class to send from servlet 1 to 2 after naming the params.
5. On servlet 2 the J2EE framework will already have URL-decoded it, so just do the reverse steps and cast to an object according to the param name.
Since both are our apps we would know the param name to type / class mapping. Basically looking for the fastest & most convenient way of sending objects between JVMs.
Example :
POJO class to send
package tst.ser;
import java.io.Serializable;
public class Bean1 implements Serializable {
/**
* make it 2 if add something without default handling
*/
private static final long serialVersionUID = 1L;
private String s;
public String getS() {
return s;
}
public void setS(String s) {
this.s = s;
}
}
* Utility *
package tst.ser;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.URLEncoder;
public class SerUtl {
public static String serialize(Object o) {
String s = null;
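ObjectOutputStream os = null;
try {
// keep a reference to the byte stream so its contents can be read back
ByteArrayOutputStream baos = new ByteArrayOutputStream();
os = new ObjectOutputStream(baos);
os.writeObject(o);
os.flush(); // push buffered object data into baos before reading it
s = Base64.encode(baos.toByteArray()); // adjust to suit your Base64 API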
//s = URLEncoder.encode(s, "UTF-8");//keep this for sending part
} catch (Exception e) {
// TODO: logger
e.printStackTrace();
return null;
} finally {
// close OS but is in RAM
try {
os.close();// not required in RAM
} catch (Exception e2) {// TODO: handle exception logger
}
os = null;
}
return s;
}
public static Object deserialize(String s) {
Object o = null;
ObjectInputStream is = null;
try {
// do base 64 decode if done in serialize
is = new ObjectInputStream(new ByteArrayInputStream(
Base64.decode(s)));
o = is.readObject();
} catch (Exception e) {
// TODO: logger
e.printStackTrace();
return null;
} finally {
// close OS but is in RAM
try {
is.close();// not required in RAM
} catch (Exception e2) {// TODO: handle exception logger
}
is = null;
}
return o;
}
}
**** sample sending servlet ***
Bean1 b = new Bean1(); b.setS("asdd");
String s = SerUtl.serialize(b);
//do UrlEncode.encode here if sending lib does not.
HttpParam p = new HttpParam ("bean1", s);
//http components send obj
**** sample receiving servlet ***
String s = request.getParameter("bean1");
Bean1 b1 = (Bean1)SerUtl.deserialize(s);
Serialize any Serializable object to a byte array
Yes.
and make a string.
No.
Exact statements see below
os = new ObjectOutputStream(new ByteArrayOutputStream());
os.writeObject(o);
s = os.toString();
// s = Base64.encode(s);//Need this some base 64 impl like Apache ?
s = URLEncoder.encode(s, "UTF-8");
These statements don't even do what you have described, which is in any case incorrect. ObjectOutputStream does not override toString(), so os.toString() doesn't turn any bytes into a String; it just returns the class name followed by a hash code.
Base64 output of 1.
The base64 output should use the byte array as the input, not a String. String is not a container for binary data. See below for corrected code.
ByteArrayOutputStream baos = new ByteArrayOutputStream();
os = new ObjectOutputStream(baos);
os.writeObject(o);
os.close();
s = Base64.encode(baos.toByteArray()); // adjust to suit your API
s = URLEncoder.encode(s, "UTF-8");
This at least accomplishes your objective.
Is it required to base 64 or can skip step 2?
If you want a String you must encode it somehow.
Use java.net.URLEncoder.encode to encode the string
This is only necessary if you're sending it as a GET or POST parameter.
Use apache http components or URL class to send from servlet 1 to 2 after naming params
Yes.
On Servlet 2 J2EE framework would have already URLDecoded it, now just do reverse steps and cast to object according to param name.
Yes, but remember to go directly from the base64-encoded string to the byte array, no intermediate String.
Basically looking for the fastest & most convenient way of sending objects between JVMs.
These objectives aren't necessarily reconcilable. The most convenient these days is probably XML or JSON but I doubt that these are faster than Serialization.
os = null;
Setting references that are about to fall out of scope to null is pointless.
HttpParam p = new HttpParam ("bean1", s);
It's possible that HttpParam does the URLEncoding for you. Check this.
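For reference, the whole round trip can be sketched with the JDK alone. This assumes Java 8 or later, where `java.util.Base64` is available. Using its URL-safe alphabet without padding is a choice made for this sketch (not something from the question), so that the string can go into a request parameter without a separate URLEncoder step.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

public class SerRoundTrip {
    /** Object -> bytes -> URL-safe Base64 text, with no intermediate String of raw bytes. */
    public static String serialize(Serializable o) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream os = new ObjectOutputStream(baos)) {
            os.writeObject(o);
        } // closing flushes everything into baos
        return Base64.getUrlEncoder().withoutPadding().encodeToString(baos.toByteArray());
    }

    /** URL-safe Base64 text -> bytes -> object. */
    public static Object deserialize(String s) throws IOException, ClassNotFoundException {
        byte[] bytes = Base64.getUrlDecoder().decode(s);
        try (ObjectInputStream is = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return is.readObject();
        }
    }
}
```

On the receiving side the cast stays the same as in the question, e.g. `Bean1 b1 = (Bean1) SerRoundTrip.deserialize(request.getParameter("bean1"));`. Only deserialize data that comes from a source you trust, such as your own application on the same LAN.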
You need not convert to a string. You can post the binary data straight to the servlet, for example by creating an ObjectOutputStream on top of an HttpURLConnection's output stream. Set the request method to POST.
The servlet handling the POST can deserialize from an ObjectInputStream created from the HttpServletRequest's ServletInputStream.
I'd recommend JAXB any time over binary serialization, though. The frameworks are not only great for interoperability, they also speed up development and create more robust solutions.
The advantages I see are way better tooling, type safety, and code generation, keeping your options open so you can call your code from another version or another language, and easier debugging. Don't underestimate the cost of hard to solve bugs caused by accidentally sending the wrong type or doubly escaped data to the servlet. I'd expect the performance benefits to be too small to compensate for this.
Found this Base64 impl that does a lot of the heavy lifting for me : http://iharder.net/base64
Has utility methods :
String encodeObject(java.io.Serializable serializableObject, int options )
Object decodeToObject(String encodedObject, int options, final ClassLoader loader )
Using :
try {
String dat = Base64.encodeObject(srlzblObj, options);
StringBuilder data = new StringBuilder().append("type=");
data.append(appObjTyp).append("&obj=").append(java.net.URLEncoder.encode(dat, "UTF-8"));
Use the type param to tell the receiving JVM what type of object I'm sending. Each servlet/ jsps at most receives 4 types, usually 1. Again since its our own app and classes that we are sending this is quick (as in time to send over the network) and simple.
On the other end unpack it by :
String objData = request.getParameter("obj");
Object obj = Base64.decodeToObject(objData, options, null);
Process it, encode the result, send result back:
reply = Base64.encodeObject(result, options);
out.print("rsp=" + reply);
Calling servlet / jsp gets the result:
if (reply != null && reply.length() > 4) {
String objDataFromServletParam = reply.substring(4);
Object obj = Base64.decodeToObject(objDataFromServletParam, options, null);
options can be 0 or Base64.GZIP
You can use JMS as well.
Apache Active-MQ is one good solution. You will not have to bother with all this conversion.
/**
* @param objectToQueue
* @throws JMSException
*/
public void sendMessage(Serializable objectToQueue) throws JMSException
{
ObjectMessage message = session.createObjectMessage();
message.setObject(objectToQueue);
producerForQueue.send(message);
}
/**
* @return the received object, or null if no ObjectMessage arrived
* @throws JMSException
*/
public Serializable receiveMessage() throws JMSException
{
Message message = consumerForQueue.receive(timeout);
if (message instanceof ObjectMessage)
{
ObjectMessage objMsg = (ObjectMessage) message;
Serializable sobject = objMsg.getObject();
return sobject;
}
return null;
}
My point is: do not write custom code for serialization if it can be avoided.
When you use AMQ, all you need to do is make your POJO serializable.
Active-MQ functions take care of serialization.
If you want fast response from AMQ, use vm-transport. It will minimize n/w overhead.
You will automatically get benefits of AMQ features.
I am suggesting this because
You have your own Applications running on network.
You need a mechanism to transfer objects.
You will need a way to monitor it as well.
If you go for custom solution, you might have to solve above things yourselves.

Parsing Java syntax with regex

I am currently developing a corrector for Java in my text editor. To do so, I think the best way is to use Pattern to look for elements of Java syntax (import or package declarations, class or method declarations...). I have already written some of these patterns:
private String regimport = "^import(\\s+)(static |)(\\w+\\.)*(\\w+)(\\s*);(\\s*)$",
regpackage="^package(\\s+)[\\w+\\.]*[\\w+](\\s*);(\\s*)$",
regclass="^((public(\\s+)abstract)|(abstract)|(public)|(final)|(public(\\s+)final)|)(\\s+)class(\\s+)(\\w+)(((\\s+)(extends|implements)(\\s+)(\\w+))|)(\\s*)(\\{)?(\\s*)$";
It's not very difficult for now but I am afraid it will take a long time to achieve it. Does someone know if something similar already exists?
To do so I think the best way is to use Pattern to look for element of java syntax
Incorrect. Regular expression patterns cannot adequately identify Java syntax elements. That is why the much more complex parsers exist. For a simple example, just imagine how you would avoid a false match for a reserved word inside a comment, such as the following:
/* this is not importing anything
import java.util.*;
*/
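The false match is easy to reproduce with the import pattern from the question. The sketch below applies that pattern line by line (via `Pattern.MULTILINE`) to a source string whose only import statement sits inside a block comment. The class and method names are made up for the demonstration, and the commented-out line uses `java.util.List` because the question's pattern does not accept wildcard imports such as `java.util.*`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFalseMatch {
    // The import pattern from the question, matched against each line
    static final Pattern IMPORT = Pattern.compile(
            "^import(\\s+)(static |)(\\w+\\.)*(\\w+)(\\s*);(\\s*)$",
            Pattern.MULTILINE);

    /** Counts the lines that the pattern believes are import declarations. */
    public static int countImportMatches(String source) {
        Matcher m = IMPORT.matcher(source);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "/* this is not importing anything\n"
                + "import java.util.List;\n"
                + "*/\n"
                + "class A {}\n";
        // The class has no real imports, yet the commented-out line is counted
        System.out.println(countImportMatches(source));
    }
}
```

The pattern reports one import here even though the compiler sees none, because a line-oriented regex has no notion of being inside a comment or a string literal.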
But if you are very keen to use regular expressions and willing to spend a lot of effort, look at Emacs font-lock-mode, which uses regular expressions to identify and fontify syntax elements.
PS: The "lot of effort" I mention refers to learning how Emacs works, reading elisp code and translating Emacs regexps to Java. If you already know all that, you will need less effort.
Thank you all for your answers. I think I'm going to work with javaparser AST, it will be a lot easier :)
Here is some code to check for errors with the AST:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.eclipse.jdt.core.compiler.IProblem;
import org.eclipse.jdt.core.dom.AST;
import org.eclipse.jdt.core.dom.ASTParser;
import org.eclipse.jdt.core.dom.CompilationUnit;
public class Main {
public static void main(String[] args) {
ASTParser parser = ASTParser.newParser(AST.JLS2);
FileInputStream in=null;
try {
in = new FileInputStream("/root/java/Animbis.java"); //your personal java source file
int n;
String text="";
while( (n=in.read()) !=-1) {
text+=(char)n;
}
CompilationUnit cu;
// parse the file
parser.setSource(text.toCharArray());
in.close();
}catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
CompilationUnit unit = (CompilationUnit) parser.createAST(null);
//unit.recordModifications();
AST ast = unit.getAST();
IProblem[] problems = unit.getProblems();
boolean error = false;
for (IProblem problem : problems) {
StringBuffer buffer = new StringBuffer();
buffer.append(problem.getMessage());
buffer.append(" line: ");
buffer.append(problem.getSourceLineNumber());
String msg = buffer.toString();
if(problem.isError()) {
error = true;
msg = "Error:\n" + msg;
}
else
if(problem.isWarning())
msg = "Warning:\n" + msg;
System.out.println(msg);
}
}
}
Run it with the following jars on the classpath:
org.eclipse.core.contenttype.jar
org.eclipse.core.jobs.jar
org.eclipse.core.resources.jar
org.eclipse.core.runtime.jar
org.eclipse.equinox.common.jar
org.eclipse.equinox.preferences.jar
org.eclipse.jdt.core.jar
org.eclipse.osgi.jar
Sources: Eclipse ASTParser and Example of ASTParser
Java's complete syntax cannot be parsed by RegEx. They are different classes of language. Java is at least a Chomsky type 2 language, whereas RegEx is type 3, and type 2 is fundamentally more complex than type 3. See also this famous answer about parsing HTML with RegEx... it's essentially the same problem.

Strange Whitespace Error when Accessing RSS Feed

I'm not sure if anyone else has encountered or asked about this before, but for my application I make use of two Yahoo! RSS feeds: Top News and Weather Forecast. I'm new to the idea of using these in the first place, but from what I've read, I simply need to make an HTTP GET request to a specific URL to retrieve an XML file which I can parse for the information I want. I have the parser working just fine, for I tested it with a sample XML file from each feed; however, a strange error is occurring when I use the AJAX GET call to the URLs:
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
Whitespace is not allowed at this location.
Error processing resource 'http://localhost:8080/BBS/fservlet?p=n'. Line 28, P...
for (i = 0; i < s.length; i++){
-------------------^
Note that I have this application "BBS" currently deployed on my local system with Tomcat. I looked into whitespace errors like this, and most seem to point to some line within the XML file itself that's having a problem. In most cases, it had something to do with escaping the "&" symbol, but it appears as though IE is telling me that the error is within a for-loop. I'm no XML expert, but I've never seen a for-loop within an XML file. Even so, I've gone to the URL directly in my browser and viewed the XML file (it's the one I used to test my parsing) and found no such line. In addition, no such loop exists anywhere in my code. In other words, I'm not sure if this is an error on my end or some configuration setting. Here's the code I'm working with, however:
jQuery Code
// Located in my JSP file
var baseContext = "<%=request.getContextPath()%>";
$(document).ready(function() {
ParseWeather();
ParseNews();
});
// Located in a separate JS file
function ParseWeather() {
$.get(baseContext + "/servlet?p=w", function(data) {
// XML Parser
});
// Data Manipulation
}
function ParseNews() {
$.get(baseContext + "/servlet?p=n", function(data) {
// XML Parser
});
// Data Manipulation
}
Java Code
import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;
import javax.servlet.http.HttpServlet;
import java.net.URL;
public class FeedServlet extends HttpServlet {
protected void doGet(final HttpServletRequest request, final HttpServletResponse response) throws ServletException, IOException {
try {
response.setContentType("text/xml");
final URL url;
String line = "";
if(request.getParameter("p").equals("w")) {
// Configuration setting that returns: "http://xml.weather.yahoo.com/forecastrss?p=USOR0186"
url = new URL(AppConfiguration.getInstance().getForcastUrl());
} else {
// Configuration setting that returns: "http://news.yahoo.com/rss/"
url = new URL(AppConfiguration.getInstance().getNewsUrl());
}
final BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
final PrintWriter writer = response.getWriter();
while((line = reader.readLine()) != null) {
writer.println(line);
writer.flush();
}
writer.close();
} catch(IOException e) {
e.printStackTrace();
}
}
}
My company has an AppConfiguration class that allows certain variables, like the URLs, to be changed through the configuration page. At any rate, those two calls simply return the URLs...
Yahoo! Forecast RSS Feed:
http://xml.weather.yahoo.com/forecastrss?p=USOR0186
Yahoo! News: Top Stories Feed:
http://news.yahoo.com/rss/
Anyway, any help would be greatly appreciated.
for (i = 0; i < s.length; i++){
The error is at the less-than symbol, which means that the XML parser is reading source code rather than XML! Use wget to fetch the resource and check that actual XML is returned and not source code.
