Java Parser for JavaScript: List all Functions/Variables

I am planning to implement a JavaScript parser in Java. I know that there are several ways to do it. There are a few frameworks/engines/parsers which could help to do it right, like:
ANTLR 3/4: it seems like there is only a JS grammar for v3
Mozilla Rhino: at the moment I can parse variable names in the initial (top-level) namespace, but I am not able to parse nested scopes, e.g. object members.
Nashorn: maybe I should give it a try?
Maybe:
closure-compiler: IMHO this is very nice, but not for "non-Google" JS code :) e.g. you have to apply several coding conventions to your JavaScript sources to get it working properly.
Maybe it is possible to adapt Packer to do it? Is there a Java implementation of Packer?
There is EcmaScript 5.1, related to this article. It seems to be very comfortable, but it is not exactly what I'm looking for. And still no Java :)
My question is:
What could/would be the best way to parse JavaScript for:
(object-)function names
(object-)member names e.g. variables
Is it even possible to do it?
What would be your approach? For me it is not essential to parse ALL special constructs of JavaScript. The important thing is to parse functions/variables in a consistent context for typical code like this:
// Avoid `console` errors in browsers that lack a console.
function Object() {
var method;
var noop = function() {
};
var methods = ['assert', 'clear', 'count', 'debug', 'dir', 'dirxml', 'error', 'exception', 'group', 'groupCollapsed', 'groupEnd', 'info', 'log', 'markTimeline', 'profile', 'profileEnd', 'table', 'time', 'timeEnd', 'timeStamp', 'trace', 'warn'];
var length = methods.length;
var console = (window.console = window.console || {});
while (length--) {
method = methods[length];
// Only stub undefined methods.
if (!console[method]) {
console[method] = noop;
}
}
};
var obj = new Object();
var test = "Hello World";
The parser should be able to deliver this information:
Node: Object
Node: Object.method
Node: Object.noop
Node: Object.length
Node: Object.console
Node: Object
Node: obj
Node: test
There is no need to determine whether a node is a function or a variable.
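Since the question says full language coverage is not essential, the requested output can be approximated without any parser library. The following is only a rough sketch under strong assumptions: it is a token scan, not a parser, it only recognises `function NAME`, `var NAME`, `{` and `}`, and it does not skip braces or keywords inside strings and comments. The class name `NaiveJsNameLister` and its method are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a naive token scan, NOT a real JavaScript parser.
final class NaiveJsNameLister {

    // Group 1: named function, group 2: var declaration, otherwise a brace.
    private static final Pattern TOKEN = Pattern.compile(
            "\\bfunction\\s+([A-Za-z_$][\\w$]*)|\\bvar\\s+([A-Za-z_$][\\w$]*)|[{}]");

    static List<String> listNames(String source) {
        List<String> names = new ArrayList<>();
        Deque<String> scopes = new ArrayDeque<>(); // "" marks an unnamed block
        String pendingFunction = null;             // named function waiting for its "{"
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            if (m.group(1) != null) {
                names.add(qualify(scopes, m.group(1)));
                pendingFunction = m.group(1);
            } else if (m.group(2) != null) {
                names.add(qualify(scopes, m.group(2)));
            } else if (m.group().equals("{")) {
                scopes.push(pendingFunction != null ? pendingFunction : "");
                pendingFunction = null;
            } else if (!scopes.isEmpty()) {
                scopes.pop(); // closing brace ends the innermost block
            }
        }
        return names;
    }

    // Prefix the name with the enclosing named functions, outermost first.
    private static String qualify(Deque<String> scopes, String name) {
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = scopes.descendingIterator();
        while (it.hasNext()) {
            String scope = it.next();
            if (!scope.isEmpty()) {
                sb.append(scope).append('.');
            }
        }
        return sb.append(name).toString();
    }

    public static void main(String[] args) {
        String js = "function Object() { var method; var noop = function() { }; } var obj = new Object();";
        for (String name : listNames(js)) {
            System.out.println("Node: " + name);
        }
    }
}
```

Variables declared inside an anonymous function are attributed to the nearest enclosing named function, and object-literal members are not reported at all. Anything beyond this kind of simple code would need a real AST from Rhino, Nashorn or similar.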

Related

How do I track variable dependencies in Nashorn?

I would like to use the Nashorn engine as a general computation engine. It is powerful and fast, has plenty of built-in functions, and new functions are very easy to add using @FunctionalInterface or static methods. Even better, it also provides value-adds like cyclic dependency checking, syntax checking, etc.
However I need to automatically update "output" variables when a dependency changes.
The general idea is that in Java, I'll have something like:
class CalculationEngine {
Data addData(String name, Number value){
...
}
Data addData(String name, String formula){
...
}
String getScript(){
...
}
}
CalculationEngine engine = new CalculationEngine();
Data datum1 = engine.addData("datum1", 1); // Constant integer 1
Data datum2 = engine.addData("datum2", 2); // Constant integer 2
Data datum3 = engine.addData("datum3", "datum1*10");
Data datum4 = engine.addData("datum4", "datum3+datum2");
The CalculationEngine service class knows how to use Nashorn to create a script string out of the Data objects that looks like this:
final String script = engine.getScript(); // "var datum1=1; var datum2=2; var datum3=datum1*10; var datum4=datum3+datum2;"
I know I can parse the script with the Nashorn Parser:
final CompilationUnitTree tree = parser.parse("test", script, null);
But how do I extract the dependencies:
List<Data> whatDependsOn(Data input){
// Process the parsed tree
return list;
}
such that whatDependsOn(datum2) returns [datum4] and whatDependsOn(datum1) returns [datum3, datum4] ?
Or the inverse function getReferencedVariables such that getReferencedVariables(datum3) returns [datum1] and getReferencedVariables(datum4) returns [datum2, datum3] (and I can recursively query getReferencedVariables until all referenced variables have been found).
Basically, when the "value" of one of my Data objects changes (due to an external event), how do I determine which of my script formulae are affected and need to be recomputed?
I know that the Nashorn script can be parsed but I can not figure out how to use the SimpleTreeVisitorES6 to build up a variable dependency graph:
final CompilationUnitTree tree = parser.parse("test", script, null);
if (tree != null) {
    tree.accept(new SimpleTreeVisitorES6<Void, Void>() {
        @Override
        public Void visitVariable(VariableTree tree, Void v) {
            final Kind kind = tree.getKind();
            System.out.println("Found a variable: " + kind);
            System.out.println(" name: " + kind.toString());
            IdentifierTree binding = (IdentifierTree) tree.getBinding();
            System.out.println(" kind: " + binding.getKind().name());
            System.out.println(" name: " + binding.getName());
            System.out.println(" val: " + kind.name());
            return null;
        }
    }, null);
}
One of the Nashorn devs here. What you are trying to do is compute the so-called def-use relations on source code (well, more likely their transitive closure, but I digress). That's a well-understood compiler theory concept. The good news is that CompilationUnitTree and friends should give you enough information to implement an algorithm for computing this information. The bad news is you'll have to roll up your sleeves and roll your own implementation, I'm afraid. You'll basically have to gather this information and produce merges at control flow join points (back edges and exits of loops, ends of if statements, but you'll also have to handle more exotic stuff like switch/case with their fallthrough semantics and also try/catch/finally, which is the least fun of these, as control can basically transfer from anywhere in the try block to a catch block). Your algorithm will also have to repeatedly evaluate loop bodies until the static information you're gathering reaches a fixpoint.
FWIW, while writing Nashorn I had to implement these kinds of things a few times using Nashorn's internal parser API (which is different from, but similar to, the public one). If you want some inspiration, you can look into the source code for the Nashorn static type analyzer for inferring types of local variables in a JavaScript function, which is something I wrote some years ago. If nothing else, it'll give you an idea of how to walk an AST and keep track of control flow edges and partially computed static analysis data at the edges.
I wish there were an easier way to do this… FWIW, a generalized static analyzer that helps you with the bookkeeping of flow control could be possible. Good luck.
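For the special case in the question, where the generated script is a flat list of `var name = formula;` statements with no loops or branches, the def-use bookkeeping collapses to a simple graph walk. The sketch below assumes exactly that, and it also sidesteps the Nashorn tree API by pulling identifiers out of each formula with a regular expression, which is a simplification and not how a real analysis would do it. The class `FormulaDependencies` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: dependency tracking for straight-line formulas only.
final class FormulaDependencies {

    // An identifier that is not part of a number or a property access (a.b).
    private static final Pattern IDENTIFIER =
            Pattern.compile("(?<![\\w$.])[A-Za-z_$][\\w$]*");

    // name -> names its formula refers to directly
    private final Map<String, Set<String>> references = new LinkedHashMap<>();

    void addConstant(String name) {
        references.put(name, new LinkedHashSet<>());
    }

    void addFormula(String name, String formula) {
        Set<String> used = new LinkedHashSet<>();
        Matcher m = IDENTIFIER.matcher(formula);
        while (m.find()) {
            used.add(m.group());
        }
        references.put(name, used);
    }

    // Direct "uses": the variables read by the formula of the given name.
    Set<String> getReferencedVariables(String name) {
        return references.getOrDefault(name, Collections.emptySet());
    }

    // Transitive closure: everything affected when the input changes.
    List<String> whatDependsOn(String input) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(input);
        while (!work.isEmpty()) {
            String current = work.remove();
            for (Map.Entry<String, Set<String>> e : references.entrySet()) {
                if (e.getValue().contains(current) && affected.add(e.getKey())) {
                    work.add(e.getKey());
                }
            }
        }
        return new ArrayList<>(affected);
    }

    public static void main(String[] args) {
        FormulaDependencies deps = new FormulaDependencies();
        deps.addConstant("datum1");
        deps.addConstant("datum2");
        deps.addFormula("datum3", "datum1*10");
        deps.addFormula("datum4", "datum3+datum2");
        System.out.println(deps.whatDependsOn("datum1"));
        System.out.println(deps.whatDependsOn("datum2"));
    }
}
```

The result comes back in discovery order, which is not guaranteed to be a valid recomputation order in general; a topological sort over the same map would be needed for that. As soon as formulas contain control flow or reassignments, the merging and fixpoint iteration described above becomes necessary.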

Groovy/Java - JSON - Update JSON through variable path

Does anyone know how to efficiently set JSON values in Groovy with variable paths?
Context: I am working with SoapUI, a testing tool. Some tests are candidates to be data-driven. I have a lot of variables. To make something sustainable that is easily implementable in similar circumstances, I would like a Groovy script that enables me to set variables.
I would name the variables 'parent.subParent.child'.
What I found:
http://groovy-lang.org/json.html
Referencing groovy variable as part of JSON path
I did find other things, but did not record them all.
The straightforward thing I found was evaluation. With evaluation it was possible to get the values, but not to set them.
Eval.x(jsonbuilder, 'x.content.' + path) = 'newValue'
will return an error. But like I said, no problem retrieving the values in the json this way.
What I tried:
I have got an implementation which works for one level.
I can say:
jsonbuilder.content.parent.subParent[child] = 'newValue'
This will set the value of the requested entity.
Then I tried to expand this to an undefined number of levels.
//Assuming there is a jsonbuilder initialized
import groovy.json.*

def jsonString = '{"parent":{"subParent":{"child":"oldValue"}}}'
def json = new JsonSlurper().parseText(jsonString)
def jsonbuilder = new JsonBuilder(json)
def path = 'parent.subParent.child'
def listPath = path.split("\\.")
def element = jsonbuilder.content
for (int i = 0; i < listPath.size(); i++) {
    element = element[listPath[i]]
}
// Rebinds the local variable only; the JSON tree is not modified.
element = 'newValue'
// Fails: the value in the tree is still "oldValue".
assert jsonbuilder.toString() == '{"parent":{"subParent":{"child":"newValue"}}}'
The issue: the value in the original json is not updated. Likely because I leave the jsonbuilder variable once I assign it to 'element' and continue with that entity.
That leaves me with two questions:
How do I get the element value in the original json?
More general: How do I update json with a variable path?
The rudimentary JSON assignment with jsonbuilder, like jsonbuilder.content.parent.subParent.child = 'newValue' as given in one of the answers below, is not what I am looking for. I am looking for a way to make the entire thing dynamic. I don't want to build a simple assignment; that already exists and works well. I am looking to build a machine that does the assignment for me, with the variable names parsed as the paths. Preferably within the groovy.json.* environment, but if I have to involve external libraries, so be it.
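The loop in the question walks all the way down to the leaf and then reassigns the local variable, which cannot change the tree. One way around that, sketched here in plain Java on the assumption that the parsed JSON is a tree of nested `Map` objects, is to stop one level above the leaf and assign through the parent map. `PathSetter` and `setByPath` are hypothetical names, not part of groovy.json.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: set a value in nested maps via a dotted path.
final class PathSetter {

    @SuppressWarnings("unchecked")
    static void setByPath(Map<String, Object> root, String path, Object newValue) {
        String[] keys = path.split("\\.");
        Map<String, Object> parent = root;
        // Walk to the parent of the leaf, not to the leaf itself.
        for (int i = 0; i < keys.length - 1; i++) {
            Object next = parent.get(keys[i]);
            if (!(next instanceof Map)) {
                throw new IllegalArgumentException(
                        "No object at '" + keys[i] + "' in path " + path);
            }
            parent = (Map<String, Object>) next;
        }
        // Assigning through the parent map mutates the original tree.
        parent.put(keys[keys.length - 1], newValue);
    }

    public static void main(String[] args) {
        Map<String, Object> subParent = new LinkedHashMap<>();
        subParent.put("child", "oldValue");
        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("subParent", subParent);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("parent", parent);

        setByPath(root, "parent.subParent.child", "newValue");
        System.out.println(root);
    }
}
```

The same idea should carry over to the Groovy loop: iterate to `listPath.size() - 1` and then assign with `element[listPath[-1]] = 'newValue'`. Array indices in the path are not handled by this sketch.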
I was staring myself blind at a specific implementation of Eval. The solution would have been simple had I read the docs from the start.
You can find the docs for Eval here: http://docs.groovy-lang.org/2.4.7/html/api/groovy/util/Eval.html
Instead of trying to assign a value to an evaluated method/function, which is not logical now that I think of it, you need to integrate everything into the evaluated expression. From what I found, you can use up to three variables in your Eval call.
I only need two. I need the jsonbuilder object as the source of information, and I need the value to set. The path itself can be used as is, because it is already what the evaluation needs: a String.
The code:
import groovy.json.*
def jsonString = '{"parent":{"child":"oldValue"}}'
def newValue = 'newValue'
def stringPath = 'parent.child'
def json = new JsonSlurper().parseText(jsonString)
def jsonbuilder = new JsonBuilder(json)
Eval.xy(jsonbuilder, newValue, 'x.content.' + stringPath + '= y')
System.out.println(jsonbuilder.toString()=='{"parent":{"child":"newValue"}}')
System.out.println(jsonbuilder.content.parent.child == 'newValue')
With Eval.xy(objectOne, objectTwo, stringExpression), I pass a string to be evaluated as an expression in which x represents objectOne and y represents objectTwo.
The code can be viewed in an online groovy script engine here: https://groovyconsole.appspot.com/edit/5202721384693760
Small disclaimer: I can't imagine using an evaluated expression in a code base that lets variables be randomly manipulated by the outside world. This expression, if used, will sit comfortably inside the context of my SoapUI project.
Since you are willing to use a library, json-path does that.
Credits to @kalle from here
Download the zip files from here
Extract the libraries and its dependencies from above zip
Copy them under SOAPUI_HOME/bin/ext directory
Restart SoapUI
Here you go:
import com.jayway.jsonpath.Configuration
import com.jayway.jsonpath.JsonPath
import com.jayway.jsonpath.spi.json.JacksonJsonNodeJsonProvider
import com.jayway.jsonpath.spi.mapper.JacksonMappingProvider
Configuration configuration = Configuration.builder()
.jsonProvider(new JacksonJsonNodeJsonProvider())
.mappingProvider(new JacksonMappingProvider())
.build()
// Prepend "$." to the path to make it a valid JsonPath expression
def path = '$.parent.subParent.child'
def originalJson = """{
"parent": {
"subParent": {
"child": "oldValue"
}
}
}"""
def updatedJson = JsonPath.using(configuration).parse(originalJson).set(path, 'newValue').json()
println(updatedJson.toString())
Here you go:
import groovy.json.JsonSlurper
import groovy.json.JsonBuilder
def jsonString = """{ "parent": {
"subParent": {
"child": "oldValue"
}
}
}"""
def json = new JsonSlurper().parseText(jsonString)
def jsonbuilder = new JsonBuilder(json)
//Assign the value for child with new value
jsonbuilder.content.parent.subParent.child = 'newValue'
println jsonbuilder.toPrettyString()
You can try online Demo

Static analysis of Javascript with Java

I need to do a static analysis of JavaScript files using Java. I need to check whether a JavaScript file has any function calls to document.write() or references to properties like innerHTML etc. Can I use the javax.script.* package to achieve this? Or which Java API do I need to use for parsing? Can you also provide examples?
You can't statically analyze Javascript in the way you intend because Javascript is not a statically typed language.
You can check for document.write() but what if my code was this:
var whatever = document; whatever.write()
Or do you want to reject any function named write() even if it didn't write to the document?
Furthermore, Javascript has an eval function so you could always do:
var m = "ment";
eval("docu" + m + ".wri" + "te('hahahaha')");
How are you going to check for that?
Similarly, property access can be done in many ways.
Imagine this piece of code:
var x = document.children[0];
x.innerHTML = ...;
x["inner" + "HTML"] = ...;
var y = "inner";
x[y + "HTML"] = ...;
You're not going to be able to detect all those variants, and the hundreds more variants that you could make, using static analysis.
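To make the limitation concrete, here is a minimal sketch of a purely textual check. It is not a parser and not a recommended approach; `NaiveWriteScanner` is a made-up name, and the pattern is just one plausible way to look for a literal document.write( call.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a textual scan for literal "document.write(" calls.
final class NaiveWriteScanner {

    private static final Pattern DOCUMENT_WRITE =
            Pattern.compile("\\bdocument\\s*\\.\\s*write\\s*\\(");

    static boolean mentionsDocumentWrite(String js) {
        return DOCUMENT_WRITE.matcher(js).find();
    }

    public static void main(String[] args) {
        // The literal call is found ...
        System.out.println(mentionsDocumentWrite("document.write('x');"));
        // ... but the aliased and eval-based variants are not.
        System.out.println(mentionsDocumentWrite("var whatever = document; whatever.write()"));
        System.out.println(mentionsDocumentWrite(
                "var m = \"ment\"; eval(\"docu\" + m + \".wri\" + \"te('hahahaha')\");"));
    }
}
```

A scan like this catches the direct spelling and nothing else. Working on an AST instead of raw text would remove false hits in comments and strings, but it would still not see through aliases or eval.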

interoperation between Rhino and Java via JSR223: working with Javascript Object instances

This is very similar to this other SO question about arrays.
If I evaluate:
y = {a: 1, b: 2, "momomomo": function() { return "hi"; }, zz: "wham"}
in a JavaScript script instantiated via JSR223 (ScriptEngine), I get a NativeObject of some sort (I see this in Eclipse's debugger) and have no idea how to access its properties. Furthermore, I don't even know which .jar file, if any, I need to add to my build path to be able to work with the class in question, and if I find an approach that works in Rhino JavaScript, it is useless for Jython.
It seems like JSR223 should have included language-agnostic access methods on ScriptEngine to provide the ability to wrap a returned object as a List<Object> for arrays or a Map<String, Object> for associative arrays.
Any suggestions?
I too am trying to embed different scripting languages with more features than JSR223 or BSF offer. For that I have had to define my own interfaces and implement these around each different scripting engine.
One feature I wanted was the ability to pass a Function (a Java interface with a single method) to my scripting engine and have it just work when passed parameters. Each of my embedded scripting engines has a layer where I wrap/unwrap values between Java and the scripting environment.
I would suggest the best way to solve the problem is for your wrapper around the scripting engine to provide a getValue(String name) and have it fix up JavaScript arrays, converting them to Java Lists. Naturally, setValue(String, Object) would check if the value is a List and convert it back to a JS array, and so on. It's tedious :(
Convert it to a java object and return it. You can then work with the java object as you would normally.
The following is an example conversion function
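function convertToJava(o) {
    var rval;
    if (Array.isArray(o)) {
        // JS array -> java.util.ArrayList, converting elements recursively
        rval = new java.util.ArrayList();
        for (var key in o) {
            rval.add(convertToJava(o[key]));
        }
    }
    else if (o !== null && typeof o === 'object') {
        // JS object -> java.util.HashMap (null is excluded: typeof null is 'object')
        rval = new java.util.HashMap();
        for (var key in o) {
            rval.put(key, convertToJava(o[key]));
        }
    }
    else if (typeof o === 'function') {
        // skip
    }
    else if (typeof o === 'undefined') {
        // skip
    }
    else {
        // primitives (and null) are passed through unchanged
        rval = o;
    }
    return rval;
}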

Rhino: How to return a string from Java to Javascript?

How do I use Rhino to return a string from Java to JavaScript? All I get is org.mozilla.javascript.NativeJavaObject when I use
var jsString = new java.lang.String("test");
inside my js file.
Is this the right way to do it?
var jsString = String(new java.lang.String("test"));
The goal is to have a Java method to return the String object instead of creating it on the fly like above.
In general, you would call Context.javaToJS which converts a Java object to its closest representation in Javascript. However, for String objects, that function returns the string itself without needing to wrap it. So if you're always returning a string, you don't need to do anything special.
Although in most cases the returned Java String type can be used just like the JS String type within the JS code, it does not have the same methods!
In particular, I found it cannot be used in a JS object passed to JSON.stringify(), as it does not have the toJSON() method.
The only solution I found is to explicitly append "" in the JS, to convert the Java String to a JS String. I found no way to code the Java method to return a proper JS string directly (as Context.javaToJS() doesn't convert a Java String).
Eg:
var jstr = MyJavaObj.methodReturningAString();
JSON.stringify({ "toto":jstr}); // Fails
JSON.stringify({ "toto": ""+jstr}); // OK
Turn off the wrapping of Primitives and then the value returned in your expression will be a JS string:
Context cx = Context.enter();
cx.getWrapFactory().setJavaPrimitiveWrap(false);
For me this is a Rhino bug. The s+"" trick inside JavaScript works, but here's a quick patch to fix it Java-side - after this line in NativeJavaMethod.call()
Object retval = meth.invoke(javaObject, args);
add this check to convert it to a native JavaScript string (ie typeof returns "string" not "object")
if (retval instanceof String) {
return NativeJavaObject.coerceTypeImpl(String.class, retval);
}
This is important because otherwise s.replace() calls the Java version, which gives the wrong result for e.g. "h e l l o".replace(" ", "")
https://github.com/mozilla/rhino/issues/638
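The difference behind that last remark can be shown in plain Java. Java's `String.replace(CharSequence, CharSequence)` replaces every occurrence, while JavaScript's `replace` with a string pattern replaces only the first one. The sketch below mimics the JavaScript behaviour with a hypothetical helper, `jsStyleReplace`, and ignores JavaScript's special `$` sequences in the replacement string.

```java
// Hypothetical demo class comparing the two replace semantics.
final class ReplaceSemantics {

    // Java semantics: all occurrences of the target are replaced.
    static String javaReplace(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    // JavaScript semantics for a string pattern: first occurrence only.
    static String jsStyleReplace(String s, String target, String replacement) {
        int i = s.indexOf(target);
        if (i < 0) {
            return s;
        }
        return s.substring(0, i) + replacement + s.substring(i + target.length());
    }

    public static void main(String[] args) {
        System.out.println(javaReplace("h e l l o", " ", ""));    // hello
        System.out.println(jsStyleReplace("h e l l o", " ", "")); // he l l o
    }
}
```

So a script that receives a wrapped Java String and calls replace on it silently gets the Java result, which is why converting to a native JavaScript string matters.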
