I am attempting to parse this Javascript via Nashorn:
function someFunction() { return b + 1 };
and navigate to all of the statements. This including statements inside the function.
The code below just prints:
"function {U%}someFunction = [] function {U%}someFunction()"
How do I "get inside" the function node to it's body "return b + 1"? I presume I need to traverse the tree with a visitor and get the child node?
I have been following the second answer to the following question:
Javascript parser for Java
import jdk.nashorn.internal.ir.Block;
import jdk.nashorn.internal.ir.FunctionNode;
import jdk.nashorn.internal.ir.Statement;
import jdk.nashorn.internal.parser.Parser;
import jdk.nashorn.internal.runtime.Context;
import jdk.nashorn.internal.runtime.ErrorManager;
import jdk.nashorn.internal.runtime.Source;
import jdk.nashorn.internal.runtime.options.Options;
import java.util.List;
public class Main {
public static void main(String[] args){
Options options = new Options("nashorn");
options.set("anon.functions", true);
options.set("parse.only", true);
options.set("scripting", true);
ErrorManager errors = new ErrorManager();
Context context = new Context(options, errors, Thread.currentThread().getContextClassLoader());
Source source = Source.sourceFor("test", "function someFunction() { return b + 1; } ");
Parser parser = new Parser(context.getEnv(), source, errors);
FunctionNode functionNode = parser.parse();
Block block = functionNode.getBody();
List<Statement> statements = block.getStatements();
for(Statement statement: statements){
System.out.println(statement);
}
}
}
Using private/internal implementation classes of nashorn engine is not a good idea. With security manager on, you'll get access exception. With jdk9 and beyond, you'll get module access error w/without security manager (as jdk.nashorn.internal.* packages not exported from nashorn module).
You've two options to parse javascript using nashorn:
Nashorn parser API ->https://docs.oracle.com/javase/9/docs/api/jdk/nashorn/api/tree/Parser.html
To use Parser API, you need to use jdk9+.
For jdk8, you can use parser.js
load("nashorn:parser.js");
and call "parse" function from script. This function returns a JSON object that represents AST of the script parsed.
See this sample: http://hg.openjdk.java.net/jdk8u/jdk8u-dev/nashorn/file/a6d0aec77286/samples/astviewer.js
Related
I have a pkb file. It contain a package and under that package it has multiple functions.
I have to get the following details out of it:
package name
function names (for all functions one by one)
params in function
return type of function
Approach: I am parsing the pkb file. I have taken the grammar from these sources:
Presto
Antlrv4 Grammer for plsql
After getting these grammar I downloaded the jar from antlr-4.5.3-complete.jar. Then using
java -cp org.antlr.v4.Tool grammar.g
one by one I execute this command on these grammars separately to generate listener, lexer, parser and other files.
After this I created two project in eclipse one for each grammar. I imported these generated file into the respective and set antlr-4.5.3-complete.jar file into the path. After this I used following code to check if my .pkb file is correct or not?
public static void parse(String file) {
try {
SqlBaseLexer lex = new SqlBaseLexer(new org.antlr.v4.runtime.ANTLRInputStream(file));
CommonTokenStream tokens = new CommonTokenStream(lex);
SqlBaseParser parser = new SqlBaseParser(tokens);
System.err.println(parser.getNumberOfSyntaxErrors()+" Errors");
} catch (RecognitionException e) {
System.err.println(e.toString());
} catch (java.lang.OutOfMemoryError e) {
System.err.println(file + ":");
System.err.println(e.toString());
} catch (java.lang.ArrayIndexOutOfBoundsException e) {
System.err.println(file + ":");
System.err.println(e.toString());
}
}
I am not getting any error in parsing the file.
But after this I am stuck with next steps. I need to get all the package name, functions, params etc.
How to get these details?
Also is my approach is correct to attain the required output.
The Presto grammar is a generic SQL grammar which is not suitable for parsing Oracle packages. The ANTLRv4 grammar for PL/SQL is the right tool for your task.
Generally an ANTLR grammar as such works as a validator. When you want to make some additional processing while parsing you should use ANTLR actions (see overview slide in this presentation). These are blocks of text written in the target language (e.g. Java) and enclosed in curly braces (see documentation).
There are at least two ways to solve your task with ANTLR actions.
Stdout output
The simplest way is to add println()s for certain rules.
To print package name modify package_body rule in plsql.g4 as follows:
package_body
: BODY package_name (IS | AS) package_obj_body*
(BEGIN seq_of_statements | END package_name?)
{System.out.println("Package name is "+$package_name.text);}
;
Similarly to print information about function's arguments and return type: add prinln()s in create_function_body rule. But there is an issue whith printing of parameters. If you use $parameter.text it will return name, type specification and default value according to parameter rule without spaces (as token sequence). If you add println() to parameter rule and use $parameter_name.text it will print all parameter's names (including parameters of procedures, not only functions). So you can add an ANTLR return value for parameter rule and assign $parameter_name.text to the return value:
parameter returns [String p_name]
: parameter_name (IN | OUT | INOUT | NOCOPY)*
type_spec? default_value_part?
{$p_name=$parameter_name.text;}
;
Thus is context of create_function_body we can access the parameter's name by $parameter.p_name:
create_function_body
: (CREATE (OR REPLACE)?)? FUNCTION function_name
{System.out.println("Parameters of function "+$function_name.text+":");}
('(' parameter {System.out.println($parameter.p_name);}
(',' parameter {System.out.println($parameter.p_name);})* ')')?
RETURN type_spec
(invoker_rights_clause|parallel_enable_clause|result_cache_clause|DETERMINISTIC)*
((PIPELINED? (IS | AS) (DECLARE? declare_spec* body | call_spec))
| (PIPELINED | AGGREGATE) USING implementation_type_name) ';'
{System.out.println("Return type of function "
+$function_name.text+" is "
+ $type_spec.text);}
;
Accumulation
Also you can save some calculations to variables and access them as parser class members. E.g. you can accumulate function's name in variable func_name. For this add #members section at beginning of the grammar:
grammar plsql;
#members{
String func_name = "";
}
And modify function_name rule as follows:
function_name
: id ('.' id_expression)? {func_name = func_name+$id.text + " ";}
;
Using lexer and parser classes
Here is an example of application to run your parser parse.java:
import org.antlr.v4.runtime.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class parse {
static String readFile(String path) throws IOException
{
byte[] encoded = Files.readAllBytes(Paths.get(path));
return new String(encoded, "UTF-8");
}
public static void main(String[] args) throws Exception {
// create input stream `in`
ANTLRInputStream in = new ANTLRInputStream( readFile(args[0]) );
// create lexer `lex` with `in` at input
plsqlLexer lex = new plsqlLexer(in);
// create token stream `tokens` with `lex` at input
CommonTokenStream tokens = new CommonTokenStream(lex);
// create parser with `tokens` at input
plsqlParser parser = new plsqlParser(tokens);
// call start rule of parser
parser.sql_script();
// print func_name
System.out.println("Function names: "+parser.func_name);
}
}
Compile and run
After this generate java code by ANTLR:
java org.antlr.v4.Tool plsql.g4
and compile your Java code:
javac plsqlLexer.java plsqlParser.java plsqlListener.java parse.java
then run it for some .pkb file:
java parse green_tools.pkb
You can find modified parse.java, plsql.g4 and green_tools.pkb here.
Im tring to run this example from the Renjin website, http://www.renjin.org/documentation/developer-guide.html , im tring to run the first "A simple primer" example.
following is my directory layout:
And here is my code:
package stackoverflow;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import org.renjin.sexp.*; // <-- import Renjin's object classes
/**
*
* #author yschellekens
*/
public class StackOverflow {
public static void main(String[] args) throws Exception {
ScriptEngineManager factory = new ScriptEngineManager();
// create a Renjin engine
ScriptEngine engine = factory.getEngineByName("Renjin");
// evaluate R code from String, cast SEXP to a DoubleVector and store in the 'res' variable
DoubleVector res = (DoubleVector)engine.eval("a <- 2; b <- 3; a*b");
System.out.println("The result of a*b is: " + res);
}
}
Why am i getting the following Exception? (i should get of 6)
run:
Exception in thread "main" java.lang.NullPointerException
at stackoverflow.StackOverflow.main(StackOverflow.java:22)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)
thanks in advance
The exception is throw because your application can't find the Renjin ScriptEngine. You have provided renjin-studio as a library, but you need the renjin-script-engine library which is available from http://build.bedatadriven.com/job/renjin/lastSuccessfulBuild/org.renjin$renjin-script-engine/ (use the JAR with dependencies).
Unfortunately ScriptEngineManager.getEngineByName() only returns null if it can't find the engine so you can add the following check to ensure that the engine has loaded:
// check if the engine has loaded correctly:
if(engine == null) {
throw new RuntimeException("Renjin Script Engine not found on the classpath.");
}
Also note: it is called Renjin, not Rengin!
I am creating a parser using ANTLR 3.x that targets java. I have written both parser grammar (for creating Abstract Syntax Tree, AST) and Tree Grammar (for performing operations on AST). Finally, to test both grammar files, I have written a test file in Java.
Have a look at the below code,
protocol grammar
grammar protocol;
options {
language = Java;
output = AST;
}
tokens{ //imaginary tokens
PROT;
INITIALP;
PROC;
TRANSITIONS;
}
#header {
import twoprocess.Configuration;
package com.javadude.antlr3.x.tutorial;
}
#lexer::header {
package com.javadude.antlr3.x.tutorial;
}
/*
parser rules, in lowercase letters
*/
program
: declaration+
;
declaration
:protocol
|initialprocess
|process
|transitions
;
protocol
:'protocol' ID ';' -> ^(PROT ID)
;
initialprocess
:'pin' '=' INT ';' -> ^(INITIALP INT)
;
process
:'p' '=' INT ';' -> ^(PROC INT)
;
transitions
:'transitions' '=' INT ('(' INT ',' INT ')') + ';' -> ^(TRANSITIONS INT INT INT*)
;
/*
lexer rules (tokens), in upper case letters
*/
ID
: (('a'..'z' | 'A'..'Z'|'_')('a'..'z' | 'A'..'Z'|'0'..'9'|'_'))*;
INT
: ('0'..'9')+;
WHITESPACE
: ('\t' | ' ' | '\r' | '\n' | '\u000C')+ {$channel = HIDDEN;};
protocolWalker
grammar protocolWalker;
options {
language = Java;
//Error, eclipse can't access tokenVocab named protocol
tokenVocab = protocol; //import tokens from protocol.g i.e, from protocol.tokens file
ASTLabelType = CommonTree;
}
#header {
import twoprocess.Configuration;
package com.javadude.antlr3.x.tutorial;
}
program
: declaration+
;
declaration
:protocol
|initialprocess
|process
|transitions
;
protocol
:^(PROT ID)
{System.out.println("create protocol " +$ID.text);}
;
initialprocess
:^(INITIALP INT)
{System.out.println("");}
;
process
:^(PROC INT)
{System.out.println("");}
;
transitions
:^(TRANSITIONS INT INT INT*)
{System.out.println("");}
;
Protocoltest.java
package com.javadude.antlr3.x.tutorial;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.CommonTreeNodeStream;
public class Protocoltest {
/**
* #param args
*/
public static void main(String[] args) throws Exception {
//create input stream from standard input
ANTLRInputStream input = new ANTLRInputStream(System.in);
//create a lexer attached to that input stream
protocolLexer lexer = new protocolLexer(input);
//create a stream of tokens pulled from the lexer
CommonTokenStream tokens = new CommonTokenStream(lexer);
//create a pareser attached to teh token stream
protocolParser parser = new protocolParser(tokens);
//invoke the program rule in get return value
protocolParser.program_return r =parser.program();
CommonTree t = (CommonTree)r.getTree();
//output the extracted tree to the console
System.out.println(t.toStringTree());
//walk resulting tree; create treenode stream first
CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
//AST nodes have payloads that point into token stream
nodes.setTokenStream(tokens);
//create a tree walker attached to the nodes stream
//Error, can't create TreeGrammar object called walker
protocolWalker walker = new protocolWalker(nodes);
//invoke the start symbol, rule program
walker.program();
}
}
Problems:
In protocolWalker, I can't access the tokens (protocol.tokens)
//Error, eclipse can't access tokenVocab named protocol
tokenVocab = protocol; //import tokens from protocol.g i.e, from protocol.tokens file
In In protocolWalker, can I create the object of java class, called Configuration, in the action list?
protocol
:^(PROT ID)
{System.out.println("create protocol " +$ID.text);
Configuration conf = new Configuration();
}
;
In Protocoltest.java
//create a tree walker attached to the nodes stream
//Error, can't create TreeGrammar object called walker
protocolWalker walker = new protocolWalker(nodes);
Object of protocolWalker can't be created. I have seen in the examples and the tutorials that such object is created.
In protocolWalker, I can't access the tokens (protocol.tokens)...
It seems to be accessing protocol.tokens fine: changing tokenVocab to something else produces an error that it doesn't produce now. The problem with protocolWalker.g is that it's defined as a token parser (grammar protocolWalker) but it's being used like a tree parser. Defining the grammar as tree grammar protocolWalker took away the errors that I was seeing about the undefined tokens.
In protocolWalker, can I create the object of java class, called Configuration, in the action list?
Yes, you can. The normal Java programming caveats apply about importing the class and so on, but it's as available to you as code like System.out.println.
In Protocoltest.java ... Object of protocolWalker can't be created.
protocolWalker.g (as it is now) produces a token parser named protocolWalkerParser. When you change it to a tree grammar, it'll produce a tree parser named protocolWalker instead.
Thanks a lot for posting the whole grammars. That made answering the question much easier.
Thank you for your reply, that was a silly mistake.
Tokens problem and creating object of protocolWalker is resolved now but whenever, I change the grammar whether, protocol.g or protocolWalker.g, I had to write package name again(every time) in protocolParser.java and protocolWalker.java. I had the same problem with lexer file before but that was overcomed by the following declaration.
#header {
package com.javadude.antlr3.x.tutorial;
}
but I don't know how to overcome this problem?
Also, I have developed a GUI in Java using SWING where I have a textarea. In that text area,
user will write the input, like for my grammar user will write,
protocol test;
pin = 5;
p = 3;
transitions = 2(5,0) (5,1);
How can I process this input in Java Swing GUI and produce output there?
Moreover, if I give the following section of protocolWalker.g to
protocol
:^(PROT ID)
{
System.out.println("create protocol " +$ID.text);
Configuration conf = new Configuration();
conf.showConfiguration();
}
;
initialprocess
:^(INITIALP INT)
{System.out.println("create initial process (with state) ");}
;
process
:^(PROC INT)
{System.out.println("create processes ");}
;
and run the test file with the following input,
protocol test;
pin = 5;
p = 3;
transitions = 2(5,0) (5,1);
I get the following output
(PROT test) (INITIALP 5) (PROC 3) (TRANSITIONS 2 5 0 5 1)
create protocol test
why the second and the third println in the protocolWalker.g are not shown in the output?
Any thoughts/help?
Thank you once again.
So far I have tried the sling implementation for jsr223 scripting for scala, but was not able to get it set up correctly.
when I do this:
public static void main(String[] args) {
try {
new ScriptEngineManager().getEngineByName("scala").
eval("object HelloWorld {def main(args: Array[String]) {
println(\"Hello, world!\") }}");
} catch (ScriptException e) {
e.printStackTrace();
}
}
I got nothing but:
javax.script.ScriptException: ERROR
org.apache.sling.scripting.scala.Script line 13 : not found: type
Script at org.apache.sling.scripting.scala.ScalaScriptEngine.eval(ScalaScriptEngine.scala:117)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:247)
similar Problems are discussed here:
http://scala-programming-language.1934581.n4.nabble.com/How-to-compile-Scala-code-from-java-using-the-current-ClassLoader-instead-of-a-string-based-classpat-td1955873.html#a1955873
and
http://dev.day.com/discussion-groups/content/lists/sling-dev/2009-12/2009-12-01_Scala_scripting_support_was_Re_And_another_one____Michael_D_rig.html
maybe there is another Implementation that I'm not aware of.
Any help appreciated
Have a look at the test cases in the scala/script module of Apache Sling for a working example. The script and its entry point (that is the object) need to follow certain conventions. I'll provide more information on these if required later.
For a general overview of the scripting engine see my session slides from Scala Days 2010.
Update: Scripts must be of the following form:
package my.cool.script {
class foo(args: fooArgs) {
import args._ // import the bindings
println("bar:" + bar)
}
}
The type of args is generated by the script engine and is named after the simple class name of the script appended with 'Args'. Further the example assumes, that the Bindings passed for script evaluation contains a value for the name 'bar'. For further details see the class comment on ScalaScriptEngine.
You need to pass the name of your script class to the script engine. You do this by putting the fully qualified script name (i.e. my.cool.script.foo) into the ScriptContext by the name 'scala.script.class'.
With the conclusion of https://issues.scala-lang.org/browse/SI-874 in version 2.11, it should be as easy as what is shown in the ticket:
import javax.script.*;
ScriptEngine e = new ScriptEngineManager().getEngineByName("scala");
e.getContext().setAttribute("label", new Integer(4), ScriptContext.ENGINE_SCOPE);
try {
engine.eval("println(2+label)");
} catch (ScriptException ex) {
ex.printStackTrace();
}
Unfortunately my comment was unreadable without linebreaks - so...
To be able to run the Codesnippet mentioned I needed to make the following changes.
I used Scala 2.11.0-M4
public static void main(String args[]){
ScriptEngine engine = new ScriptEngineManager().getEngineByName("scala");
// Set up Scriptenvironment to use the Java classpath
List nil = Nil$.MODULE$;
$colon$colon vals = $colon$colon$.MODULE$.apply((String) "true", nil);
((IMain)engine).settings().usejavacp().tryToSet(vals);ScriptContext.ENGINE_SCOPE);
engine.getContext().setAttribute("labelO", new Integer(4), ScriptContext.ENGINE_SCOPE);
try {
engine.eval("val label = labelO.asInstanceOf[Integer]\n"+
"println(\"ergebnis: \" + (2 + label ))");
} catch (ScriptException ex) {
ex.printStackTrace();
}
}
In all dbpedia pages, e.g.
http://dbpedia.org/page/Ireland
there's a link to a RDF file.
In my application I need to analyse the rdf code and run some logic on it.
I could rely on the dbpedia SPARQL endpoint, but I prefer to download the rdf code locally and parse it, to have full control over it.
I installed JENA and I'm trying to parse the code and extract for example a property called: "geo:geometry".
I'm trying with:
StringReader sr = new StringReader( node.rdfCode )
Model model = ModelFactory.createDefaultModel()
model.read( sr, null )
How can I query the model to get the info I need?
For example, if I wanted to get the statement:
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<geo:geometry xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" rdf:datatype="http://www.openlinksw.com/schemas/virtrdf#Geometry">POINT(-7 53)</geo:geometry>
</rdf:Description>
Or
<rdf:Description rdf:about="http://dbpedia.org/resource/Ireland">
<dbpprop:countryLargestCity xmlns:dbpprop="http://dbpedia.org/property/" xml:lang="en">Dublin</dbpprop:countryLargestCity>
</rdf:Description>
What is the right filter?
Many thanks!
Mulone
Once you have the file parsed in a Jena model you can iterate and filter with something like:
//Property to filter the model
Property geoProperty =
model. createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
"geometry");
//Iterator based on a Simple selector
StmtIterator iter =
model.listStatements(new SimpleSelector(null, geoProperty, (RDFNode)null));
//Loop to traverse the statements that match the SimpleSelector
while (iter.hasNext()) {
Statement stmt = iter.nextStatement();
System.out.print(stmt.getSubject().toString());
System.out.print(stmt.getPredicate().toString());
System.out.println(stmt.getObject().toString());
}
The SimpleSelector allows you to pass any (subject,predicate,object) pattern to match statements in the model. In your case if you only care about a specific predicate then first and third parameters of the constructor are null.
Allowing filtering two different properties
To allow more complex filtering you can implement the selects method in the
SimpleSelector interface like here:
Property geoProperty = /* like before */;
Property countryLargestCityProperty =
model. createProperty("http://dbpedia.org/property/",
"countryLargestCity");
SimpleSelector selector = new SimpleSelector(null, null, (RDFNode)null) {
public boolean selects(Statement s)
{ return s.getPredicate().equals(geoProperty) ||
s.getPredicate().equals(countryLargestCityProperty) ;}
}
StmtIterator iter = model.listStatements(selector);
while(it.hasNext()) {
/* same as in the previous example */
}
Edit: including a full example
This code includes a full example that works for me.
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.SimpleSelector;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.rdf.model.Statement;
public class TestJena {
public static void main(String[] args) {
FileManager fManager = FileManager.get();
fManager.addLocatorURL();
Model model = fManager.loadModel("http://dbpedia.org/data/Ireland.rdf");
Property geoProperty =
model. createProperty("http://www.w3.org/2003/01/geo/wgs84_pos#",
"geometry");
StmtIterator iter =
model.listStatements(new SimpleSelector(null, geoProperty,(RDFNode) null));
//Loop to traverse the statements that match the SimpleSelector
while (iter.hasNext()) {
Statement stmt = iter.nextStatement();
if (stmt.getObject().isLiteral()) {
Literal obj = (Literal) stmt.getObject();
System.out.println("The geometry predicate value is " +
obj.getString());
}
}
}
}
This full example prints out:
The geometry predicate value is POINT(-7 53)
Notes on Linked Data
http://dbpedia.org/page/Ireland is the HTML document version of the resource http://dbpedia.org/resource/Ireland
In order to get the RDF you should resolve :
http://dbpedia.org/data/Ireland.rdf
or
http://dbpedia.org/resource/Ireland + Accept: application/rdfxml in the HTTP header.
With curl it'd be something like:
curl -L -H 'Accept: application/rdf+xml' http://dbpedia.org/resource/Ireland