Parsing Java syntax with regex - java

I am currently developing a corrector for Java in my text editor. I think the best way to do this is to use Pattern to look for elements of Java syntax (import or package declarations, class or method declarations, ...). I have already written some of these patterns:
private String regimport = "^import(\\s+)(static |)(\\w+\\.)*(\\w+)(\\s*);(\\s*)$",
regpackage="^package(\\s+)(\\w+\\.)*(\\w+)(\\s*);(\\s*)$",
regclass="^((public(\\s+)abstract)|(abstract)|(public)|(final)|(public(\\s+)final)|)(\\s+)class(\\s+)(\\w+)(((\\s+)(extends|implements)(\\s+)(\\w+))|)(\\s*)(\\{)?(\\s*)$";
It's not very difficult so far, but I am afraid it will take a long time to finish. Does anyone know whether something similar already exists?

To do so I think the best way is to use Pattern to look for element of java syntax
Incorrect. Regular expression patterns cannot adequately identify Java syntax elements; that is why far more complex parsers exist. For a simple example, imagine how you would avoid a false match on a reserved word inside a comment, such as the following:
/* this is not importing anything
import java.util.*;
*/
But if you are very keen to use regular expressions and willing to spend a lot of effort, look at Emacs font-lock-mode, which uses regular expressions to identify and fontify syntax elements.
PS: The "lot of effort" I mention refers to learning how Emacs works, reading elisp code, and translating Emacs regexps to Java. If you already know all that, you will need less effort.
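The false match described above can be reproduced with a minimal sketch. It assumes the question's import pattern is applied line by line (the MULTILINE flag is an assumption), and the class and method names are made up for illustration. The sample uses `java.util.List` rather than `java.util.*`, because the question's pattern does not accept the wildcard form at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImportInComment {
    // The import pattern from the question; MULTILINE makes ^ and $ match at line boundaries.
    static final Pattern IMPORT = Pattern.compile(
            "^import(\\s+)(static |)(\\w+\\.)*(\\w+)(\\s*);(\\s*)$", Pattern.MULTILINE);

    // Counts how many lines of the source look like import declarations to the regex.
    static int countImports(String source) {
        Matcher m = IMPORT.matcher(source);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The only "import" here sits inside a block comment, so the file has no real imports.
        String source = "/* this is not importing anything\n"
                + "import java.util.List;\n"
                + "*/\n"
                + "class A {}\n";
        System.out.println("imports found by regex: " + countImports(source));
    }
}
```

The regex reports one import even though the compiler would see none, because a line-based pattern has no notion of being inside a comment.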

Thank you all for your answers. I think I'm going to work with the javaparser AST; it will be a lot easier :)
Here is some code that checks for errors with the AST (it uses Eclipse JDT's ASTParser):
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.eclipse.jdt.core.compiler.IProblem;
import org.eclipse.jdt.core.dom.AST;
import org.eclipse.jdt.core.dom.ASTParser;
import org.eclipse.jdt.core.dom.CompilationUnit;

public class Main {
    public static void main(String[] args) {
        // AST.JLS2 only understands Java 1.4 syntax; pick a newer level
        // (e.g. AST.JLS3) for generics and for-each loops
        ASTParser parser = ASTParser.newParser(AST.JLS2);
        StringBuilder text = new StringBuilder();
        FileInputStream in = null;
        try {
            in = new FileInputStream("/root/java/Animbis.java"); // your own Java source file
            int n;
            // read the file byte by byte (assumes a single-byte encoding)
            while ((n = in.read()) != -1) {
                text.append((char) n);
            }
            in.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
            return; // nothing to parse
        } catch (IOException e) {
            e.printStackTrace();
            return; // nothing to parse
        }
        // parse the file
        parser.setSource(text.toString().toCharArray());
        CompilationUnit unit = (CompilationUnit) parser.createAST(null);
        // report every problem the parser found, with its line number
        for (IProblem problem : unit.getProblems()) {
            String msg = problem.getMessage() + " line: " + problem.getSourceLineNumber();
            if (problem.isError()) {
                msg = "Error:\n" + msg;
            } else if (problem.isWarning()) {
                msg = "Warning:\n" + msg;
            }
            System.out.println(msg);
        }
    }
}
Run it with the following JARs on the classpath:
org.eclipse.core.contenttype.jar
org.eclipse.core.jobs.jar
org.eclipse.core.resources.jar
org.eclipse.core.runtime.jar
org.eclipse.equinox.common.jar
org.eclipse.equinox.preferences.jar
org.eclipse.jdt.core.jar
org.eclipse.osgi.jar
Sources: Eclipse ASTParser and Example of ASTParser

Java's full syntax cannot be parsed with regular expressions: they belong to different classes of language. Java is at least a Chomsky type 2 (context-free) language, whereas regular expressions describe type 3 (regular) languages, and type 2 is strictly more expressive than type 3. See also the famous answer about parsing HTML with regex; it is essentially the same problem.
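One way to see the difference in practice is brace nesting: checking that every `{` has a matching `}` needs a counter (the simplest form of a stack), which a classic regular expression cannot keep. The following is only a sketch with made-up names. It deliberately ignores braces inside strings, character literals and comments, which a real parser would also have to handle.

```java
class BraceBalance {
    // Returns true when every '{' is closed by a later '}' and no '}' appears too early.
    // Simplification: braces inside string literals and comments are counted too.
    static boolean isBalanced(String source) {
        int depth = 0; // current nesting depth
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth < 0) {
                    return false; // closing brace without an opening one
                }
            }
        }
        return depth == 0; // anything left open means unbalanced
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("class A { void f() { } }"));
        System.out.println(isBalanced("class A { void f() { }"));
    }
}
```

The depth can grow without bound, which is exactly the kind of state a finite automaton does not have.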

Related

How to insert a line of Java from a text file into my Java code [duplicate]

For debug reasons, I want to be able to run code that is typed in through the console. For example:
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
while (true) {
    String str = br.readLine(); // This can return 'a = 5;', 'b = "Text";' or 'pckg.example.MyClass.run(5);'
    if (str == null)
        return;
    runCode(str); // How would I do this?
}
PLEASE DON'T ACTUALLY USE THIS
I was under the assumption that you wanted to evaluate a string as Java code, not run it through a scripting engine such as JavaScript, so I created this on a whim after reading this, using the compiler API mark mentioned. It's probably very bad practice, but it (somewhat) works the way you wanted. I doubt it will be much use for debugging, since it runs the code in the context of a new class. Sample usage is included at the bottom.
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLClassLoader;

public class main {
    public static void runCode(String s) throws Exception {
        JavaCompiler jc = ToolProvider.getSystemJavaCompiler();
        StandardJavaFileManager sjfm = jc.getStandardFileManager(null, null, null);
        File jf = new File("test.java"); // create file in current working directory
        PrintWriter pw = new PrintWriter(jf);
        // wrap the input in a class with a static, no-argument main()
        pw.println("public class test {public static void main(){" + s + "}}");
        pw.close();
        Iterable<? extends JavaFileObject> fO = sjfm.getJavaFileObjects(jf);
        if (!jc.getTask(null, sjfm, null, null, null, fO).call()) { // compile the code
            throw new Exception("compilation failed");
        }
        URL[] urls = new URL[]{new File("").toURI().toURL()}; // use current working directory
        URLClassLoader ucl = new URLClassLoader(urls);
        // the generated main() is static, so no instance is needed
        ucl.loadClass("test").getMethod("main").invoke(null);
    }

    public static void main(String[] args) {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        while (true) {
            try {
                String str = br.readLine(); // one line of Java statements typed at the console
                if (str == null)
                    return;
                runCode(str);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
//command line
> System.out.println("hello");
hello
> System.out.println(3+2+3+4+5+2);
19
> for(int i = 0; i < 10; i++) {System.out.println(i);}
0
1
2
3
4
5
6
7
8
9
With the SimpleJavaFileObject you could actually avoid using a file, as shown here, but the syntax seems a bit cumbersome so I just opted for a file in the current working directory.
EDIT: Convert String to Code offers a similar approach but it's not fully fleshed out
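The SimpleJavaFileObject variant mentioned above could look roughly like the sketch below. It is not the answerer's code: the names `StringSourceRunner`, `StringSource`, `Snippet` and `run` are invented for illustration, and the sketch assumes it runs on a JDK (on a plain JRE `ToolProvider.getSystemJavaCompiler()` returns null). The source is kept in memory, while the compiled class is still written to a temporary directory, which is never cleaned up here. The same warning applies: this executes arbitrary code.

```java
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class StringSourceRunner {
    /** A compilation unit whose source text lives in a String instead of a file. */
    static class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Wraps the statements in a throwaway class, compiles it and invokes it. */
    static void runCode(String statements) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler available (JRE only?)");
        }
        String source = "public class Snippet { public static void run() throws Exception {"
                + statements + "} }";
        Path outDir = Files.createTempDirectory("snippet"); // class files go here
        List<String> options = Arrays.asList("-d", outDir.toString());
        List<JavaFileObject> units =
                Arrays.<JavaFileObject>asList(new StringSource("Snippet", source));
        if (!compiler.getTask(null, null, null, options, null, units).call()) {
            throw new Exception("compilation failed");
        }
        // A new class loader per call, pointed at the directory just written.
        try (URLClassLoader loader = new URLClassLoader(new URL[]{outDir.toUri().toURL()})) {
            loader.loadClass("Snippet").getMethod("run").invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        runCode("System.out.println(\"hello\");");
    }
}
```

Compared with the file-based version, nothing is written to the current working directory, at the cost of the extra SimpleJavaFileObject subclass.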
If the code is JavaScript, you can run it with the JavaScript engine:
Object res = new ScriptEngineManager().getEngineByName("js").eval(str);
A JavaScript engine has been part of Java SE since 1.6 (note that the bundled engine was removed in JDK 15). See this guide http://download.java.net/jdk8/docs/technotes/guides/scripting/programmer_guide/index.html for details.
You can use the Java scripting API, which is located in the package javax.script. It lets you plug in several scripting languages, such as bsh.
You can find a programmer's guide on Oracle's website.
Rhino, a JavaScript implementation, is already included with the Oracle JVM.
For this you may want to look into the Java Compiler API. I haven't studied how it works in much detail, but it allows you to load a Java file, compile it, and load the class in an already running system. Maybe it can be repurposed to accept input from the console.
For a general compiler you could use Janino, which allows you to compile and run Java code. Its expression evaluator may help with your example.
If you are just looking to evaluate expressions while debugging, Eclipse has the Display view, which allows you to execute expressions. See this question.

Parse a file using ANTLR4

I have a Lexer and a Parser called y86 Lexer and Parser which work as far as I know. But I have a file with y86 commands and I want to parse them using Java. So far I have code as follows.
y86Lexer y86 = null;
CommonTokenStream tokenStream = null;
y86Parser y86p = null;
try
{
    y86 = new y86Lexer(CharStreams.fromFileName("C:\\Users\\saigbomian\\Documents"
            + "\\LearnANTLR\\src\\sum.ys"));
    tokenStream = new CommonTokenStream(y86);
    y86p = new y86Parser(tokenStream);
}
catch (IOException e)
{
    log.error("Error occurred while reading from file");
    e.printStackTrace();
}
I'm not sure how to do the parsing. I have seen people use something like y86Parser.CompilationUnitContext, but I can't seem to find that class. I have tried printing from the listeners ANTLR creates, but I don't know how to trigger these listeners.
For each rule ruleName in your grammar, the y86Parser class will contain a class named RuleNameContext and a method named ruleName(), which will parse the input according to that rule and return an instance of the RuleNameContext class containing the parse tree. You can then use listeners or visitors to walk that parse tree.
So if you don't have a compilationUnit method or a CompilationUnitContext class, your grammar probably just doesn't have a rule named compilationUnit. Instead you should pick a rule that you do have and call the method corresponding to that rule.

Unable to parse JSON from url

Write a piece of code that will query a URL that returns JSON and can parse the JSON string to pull out pieces of information. The information that should be parsed and returned is the pageid and the list of “See Also” links. Those links should be formatted to be actual links that can be used by a person to find the appropriate article.
Use the Wikipedia API for the query. A sample query is:
URL
Other queries can be generated changing the “titles” portion of the query string. The code to parse the JSON and pull the “See Also” links should be generic enough to work on any Wikipedia article.
I tried writing the below code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import org.json.JSONException;
import org.json.JSONObject;

public class JsonRead {
    private static String readUrl(String urlString) throws Exception {
        BufferedReader reader = null;
        try {
            URL url = new URL(urlString);
            reader = new BufferedReader(new InputStreamReader(url.openStream()));
            StringBuffer buffer = new StringBuffer();
            int read;
            char[] chars = new char[1024];
            while ((read = reader.read(chars)) != -1)
                buffer.append(chars, 0, read);
            return buffer.toString();
        } finally {
            if (reader != null)
                reader.close();
        }
    }

    public static void main(String[] args) throws IOException, JSONException {
        JSONObject json;
        try {
            json = new JSONObject(readUrl("https://en.wikipedia.org/w/api.php?format=json&action=query&titles=SMALL&prop=revisions&rvprop=content"));
            System.out.println(json.toString());
            System.out.println(json.get("pageid"));
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
I used the JSON JAR from the link below in Eclipse:
Json jar
When I run the above code, I get the following error:
org.json.JSONException: JSONObject["pageid"] not found.
at org.json.JSONObject.get(JSONObject.java:471)
at JsonRead.main(JsonRead.java:35)
How can I extract the pageid and the "See Also" links from the URL?
I have never worked with JSON before, so kindly let me know how to proceed.
The json:
{
"batchcomplete":"",
"query":{
"pages":{
"1808130":{
"pageid":1808130,
"ns":0,
"title":"SMALL",
"revisions":[
{
"contentformat":"text/x-wiki",
"contentmodel":"wikitext",
"*":"{{About|the ALGOL-like programming language|the scripting language formerly named Small|Pawn (scripting language)}}\n\n'''SMALL''', Small Machine Algol Like Language, is a [[computer programming|programming]] [[programming language|language]] developed by Dr. [[Nevil Brownlee]] of [[Auckland University]].\n\n==History==\nThe aim of the language was to enable people to write [[ALGOL]]-like code that ran on a small machine. It also included the '''string''' type for easier text manipulation.\n\nSMALL was used extensively from about 1980 to 1985 at [[Auckland University]] as a programming teaching aid, and for some internal projects. Originally written to run on a [[Burroughs Corporation]] B6700 [[Main frame]] in [[Fortran]] IV, subsequently rewritten in SMALL and ported to a DEC [[PDP-10]] Architecture (on the [[Operating System]] [[TOPS-10]]) and IBM S360 Architecture (on the Operating System VM/[[Conversational Monitor System|CMS]]).\n\nAbout 1985, SMALL had some [[Object-oriented programming|object-oriented]] features added to handle structures (that were missing from the early language), and to formalise file manipulation operations.\n\n==See also==\n*[[ALGOL]]\n*[[Lua (programming language)]]\n*[[Squirrel (programming language)]]\n\n==References==\n*[http://www.caida.org/home/seniorstaff/nevil.xml Nevil Brownlee]\n\n[[Category:Algol programming language family]]\n[[Category:Systems programming languages]]\n[[Category:Procedural programming languages]]\n[[Category:Object-oriented programming languages]]\n[[Category:Programming languages created in the 1980s]]"
}
]
}
}
}
}
If you read your exception carefully, you will find the solution on your own.
Exception in thread "main" org.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
at org.json.JSONTokener.syntaxError(JSONTokener.java:433)
The exception says "A JSONObject text must begin with '{'", which means the JSON you received from the API is probably not valid.
So I suggest you debug your code and find out what you actually received in your String variable jsonText.
You get the exception org.json.JSONException: JSONObject["pageid"] not found. when calling json.get("pageid") because pageid is not a direct sub-element of your root. You have to go all the way down through the object graph:
int pid = json.getJSONObject("query")
        .getJSONObject("pages")
        .getJSONObject("1808130")
        .getInt("pageid");
If you have an array in there, you will also have to iterate over the array elements (or pick the one you want).
Edit: Here's the code to get the field containing the "See also" values:
String s = json.getJSONObject("query")
        .getJSONObject("pages")
        .getJSONObject("1808130")
        .getJSONArray("revisions")
        .getJSONObject(0)
        .getString("*");
The resulting string is wikitext, not JSON, so you will have to parse it manually.
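As a rough sketch of that manual step, the wikitext can be searched for the "See also" section and its `[[...]]` links using only java.util.regex. The class and method names below are made up, the URL prefix assumes English Wikipedia, and titles are only converted from spaces to underscores rather than fully URL-encoded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeeAlsoLinks {
    // Text between the "==See also==" heading and the next level-2 heading (or end of text).
    private static final Pattern SECTION = Pattern.compile(
            "==\\s*See also\\s*==(.*?)(?=\\n==[^=]|\\z)",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Matches [[Target]] and [[Target|label]]; group 1 is the target page title.
    private static final Pattern LINK = Pattern.compile("\\[\\[([^\\]|#]+)[^\\]]*\\]\\]");

    // Returns one article URL per link found in the "See also" section.
    static List<String> seeAlsoUrls(String wikitext) {
        List<String> urls = new ArrayList<>();
        Matcher section = SECTION.matcher(wikitext);
        if (!section.find()) {
            return urls; // the article has no "See also" section
        }
        Matcher link = LINK.matcher(section.group(1));
        while (link.find()) {
            String title = link.group(1).trim().replace(' ', '_');
            urls.add("https://en.wikipedia.org/wiki/" + title);
        }
        return urls;
    }

    public static void main(String[] args) {
        String wikitext = "Intro text.\n\n==See also==\n*[[ALGOL]]\n"
                + "*[[Lua (programming language)]]\n\n==References==\n*[[Ignored]]\n";
        for (String url : seeAlsoUrls(wikitext)) {
            System.out.println(url);
        }
    }
}
```

To make the JSON side generic as well, the hard-coded "1808130" key can be avoided by iterating over the keys of the "pages" object (org.json exposes them through `keys()`) and reading "pageid" from each entry.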

Integrating ANTLR4 into Java

I have generated and compiled a grammar with ANTLR4. Via the command line I am able to see whether there is an error, but I am having trouble integrating this parser into a Java program. I can use the ANTLR4 methods, as I've added the JARs to my library in Eclipse, but I am completely unable to retrieve token text or find out in any meaningful way whether an error is being generated. Any help would be appreciated. If I'm being ambiguous in any way, please let me know and I'll go into more detail.
Looking at previous versions, an equivalent method to something like compilationUnit() might be what I want.
Something like this should work (assuming you generated GeneratedLexer and GeneratedParser from your grammar):
import java.io.FileInputStream;
import java.io.InputStream;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTree;
import test.GeneratedLexer;
import test.GeneratedParser;

public class Main {
    public static void main(String[] args) throws Exception {
        // read from the file given as first argument, or from standard input
        String inputFile = null;
        if (args.length > 0) {
            inputFile = args[0];
        }
        InputStream is = System.in;
        if (inputFile != null) {
            is = new FileInputStream(inputFile);
        }
        // ANTLRInputStream is deprecated since ANTLR 4.7; CharStreams.fromStream(is) replaces it
        ANTLRInputStream input = new ANTLRInputStream(is);
        GeneratedLexer lexer = new GeneratedLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        GeneratedParser parser = new GeneratedParser(tokens);
        // startRule() stands for whatever your grammar's entry rule is called
        ParseTree tree = parser.startRule();
        // Do something useful with the tree (e.g. use a visitor if you generated one)
        System.out.println(tree.toStringTree(parser));
    }
}
You could also use a parser and lexer interpreter if you don't want to pregenerate them from your grammar (or you have a dynamic grammar).
