How do I track variable dependencies in Nashorn? - java

I would like to use the Nashorn engine as a general computation engine. It is powerful, fast, has plenty of built-in functions, and new functions are very easy to add using @FunctionalInterface or static methods. Even better, it also provides value-adds like cyclic dependency checking, syntax checking, etc.
However I need to automatically update "output" variables when a dependency changes.
The general idea is that in Java, I'll have something like:
class CalculationEngine {
Data addData(String name, Number value){
...
}
Data addData(String name, String formula){
...
}
String getScript(){
...
}
}
CalculationEngine engine = new CalculationEngine();
Data datum1 = engine.addData("datum1", 1); // Constant integer 1
Data datum2 = engine.addData("datum2", 2); // Constant integer 2
Data datum3 = engine.addData("datum3", "datum1*10");
Data datum4 = engine.addData("datum4", "datum3+datum2");
The CalculationEngine service class knows how to use Nashorn to create a script string out of the Data objects that looks like this:
final String script = engine.getScript(); // "var datum1=1; var datum2=2; var datum3=datum1*10; var datum4=datum3+datum2;"
I know I can parse the script with the Nashorn Parser:
final CompilationUnitTree tree = parser.parse("test", script, null);
But how do I extract the dependencies:
List<Data> whatDependsOn(Data input){
// Process the parsed tree
return list;
}
such that whatDependsOn(datum2) returns [datum4] and whatDependsOn(datum1) returns [datum3, datum4] ?
Or the inverse function getReferencedVariables such that getReferencedVariables(datum3) returns [datum1] and getReferencedVariables(datum4) returns [datum2, datum3] (and I can recursively query getReferencedVariables until all referenced variables have been found).
Basically, when the "value" of one of my Data objects changes (due to an external event), how do I determine which of my script formulae are affected and need to be recomputed?
I know that the Nashorn script can be parsed, but I cannot figure out how to use the SimpleTreeVisitorES6 to build up a variable dependency graph:
final CompilationUnitTree tree = parser.parse("test", script, null);
if (tree != null) {
tree.accept(new SimpleTreeVisitorES6<Void, Void>() {
@Override
public Void visitVariable(VariableTree tree, Void v) {
final Kind kind = tree.getKind();
System.out.println("Found a variable: " + kind);
System.out.println(" name: " + kind.toString());
IdentifierTree binding = (IdentifierTree) tree.getBinding();
System.out.println(" kind: " + binding.getKind().name());
System.out.println(" name: " + binding.getName());
System.out.println(" val: " + kind.name());
return null;
}
}, null);
}

One of the Nashorn devs here. What you are trying to do is compute the so-called def-use relations on source code (well, more likely their transitive closure, but I digress). That's a well-understood compiler theory concept. The good news is that CompilationUnitTree and friends should give you enough information to implement an algorithm for computing this information. The bad news is you'll have to roll up your sleeves and roll your own implementation, I'm afraid. You'll basically have to gather this information and produce merges at control flow join points (back edges and exits of loops, ends of if statements, but you'll also have to handle more exotic stuff like switch/case with their fallthrough semantics and also try/catch/finally, which is the least fun of these as control can basically transfer from anywhere in the try block to a catch block). Your algorithm will also have to repeatedly evaluate loop bodies until the static information you're gathering reaches a fixpoint.
FWIW, while writing Nashorn I had to implement these kinds of things a few times using Nashorn's internal parser API (which is different from but similar to the public one). If you want some inspiration, you can look into the source code for the Nashorn static type analyzer for inferring types of local variables in a JavaScript function, which is something I wrote some years ago. If nothing else, it'll give you an idea of how to walk an AST and keep track of control flow edges and partially computed static analysis data at the edges.
I wish there were an easier way to do this… FWIW, a generalized static analyzer that helps you with the bookkeeping of flow control could be possible. Good luck.
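For the straight-line script in the question (only `var x = expr;` statements, no loops or branches), no merging at join points is needed, so the transitive closure can be sketched directly. The following is a minimal sketch, not the Nashorn analyzer mentioned above. It assumes the identifiers referenced by each formula have already been collected, for instance by a tree visitor walking each initializer. The class `DependencyGraph` and its methods `define`, `getReferencedVariables` and `whatDependsOn` are hypothetical names for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: stores, per variable, the variables its formula reads.
final class DependencyGraph {
    private final Map<String, Set<String>> references = new LinkedHashMap<>();

    // Records that the formula of 'name' reads the given variables.
    void define(String name, String... referenced) {
        references.put(name, new LinkedHashSet<>(Arrays.asList(referenced)));
    }

    // Direct references only (the "use" side of def-use).
    Set<String> getReferencedVariables(String name) {
        return references.getOrDefault(name, Collections.<String>emptySet());
    }

    // Transitive closure: every variable affected, directly or indirectly,
    // when 'input' changes. Results are in discovery order.
    List<String> whatDependsOn(String input) {
        Set<String> affected = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(input);
        while (!pending.isEmpty()) {
            String current = pending.remove();
            for (Map.Entry<String, Set<String>> e : references.entrySet()) {
                // 'affected.add' returns false for already-seen names,
                // which also keeps the loop finite if the graph has a cycle.
                if (e.getValue().contains(current) && affected.add(e.getKey())) {
                    pending.add(e.getKey());
                }
            }
        }
        return new ArrayList<>(affected);
    }
}
```

With `define("datum3", "datum1")` and `define("datum4", "datum3", "datum2")`, `whatDependsOn("datum1")` yields datum3 and datum4, and `whatDependsOn("datum2")` yields datum4. The discovery order is not guaranteed to be a valid recomputation order in general; a topological sort would be needed for that.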

Related

Parse a single POJO from multiple YAML documents representing different classes

I want to use a single YAML file which contains several different objects - for different applications. I need to fetch one object to get an instance of MyClass1, ignoring the rest of docs for MyClass2, MyClass3, etc. Some sort of selective de-serializing: now this class, then that one... The structure of MyClass2, MyClass3 is totally unknown to the application working with MyClass1. The file is always a valid YAML, of course.
The YAML may be of any structure we need to implement such a multi-class container. The preferred parsing tool is snakeyaml.
Is it sensible? How can I ignore all but one object?
UPD: replaced all "document" with "object". I think we have to speak about the single YAML document containing several objects of different structure. More of it, the parser knows exactly only 1 structure and wants to ignore the rest.
UPD2: I think it is impossible with snakeyaml. We have to read all objects anyway - and select the needed one later. But maybe I'm wrong.
UPD2: sample config file
---
-
  exportConfiguration781:
    attachmentFieldName: "name"
    baseSftpInboxPath: /home/user/somedir/
    somebool: false
    days: 9999
    expected:
      - ABC w/o quotes
      - "Cat ABC"
      - "Some string"
    dateFormat: yyyy-MMdd-HHmm
    user: someuser
-
  anotherConfiguration:
    k1: v1
    k2:
      - v21
      - v22
This is definitely possible with SnakeYAML, albeit not trivial. Here's a general rundown of what you need to do:
First, let's have a look at what loading with SnakeYAML does. Here's the important part of the Yaml class:
private Object loadFromReader(StreamReader sreader, Class<?> type) {
Composer composer = new Composer(new ParserImpl(sreader), resolver, loadingConfig);
constructor.setComposer(composer);
return constructor.getSingleData(type);
}
The composer parses YAML input into Nodes. To do that, it doesn't need any knowledge about the structure of your classes, since every node is either a ScalarNode, a SequenceNode or a MappingNode and they just represent the YAML structure.
The constructor takes a root node generated by the composer and generates native POJOs from it. So what you want to do is to throw away parts of the node graph before they reach the constructor.
The easiest way to do that is probably to derive from Composer and override two methods like this:
public class MyComposer extends Composer {
private final int objIndex;
public MyComposer(Parser parser, Resolver resolver, int objIndex) {
super(parser, resolver);
this.objIndex = objIndex;
}
public MyComposer(Parser parser, Resolver resolver, LoaderOptions loadingConfig, int objIndex) {
super(parser, resolver, loadingConfig);
this.objIndex = objIndex;
}
@Override
public Node getNode() {
return strip(super.getNode());
}
private Node strip(Node input) {
return ((SequenceNode)input).getValue().get(objIndex);
}
}
The strip implementation is just an example. In this case, I assumed your YAML looks like this (object content is arbitrary):
- {first: obj}
- {second: obj}
- {third: obj}
And you simply select the object you actually want to deserialize by its index in the sequence. But you can also have something more complex like a searching algorithm.
Now that you have your own composer, you can do
Constructor constructor = new Constructor();
// assuming we want to get the object at index 1 (i.e. second object)
Composer composer = new MyComposer(new ParserImpl(sreader), new Resolver(), 1);
constructor.setComposer(composer);
MyObject result = (MyObject)constructor.getSingleData(MyObject.class);
The answer from @flyx was very helpful to me: it showed a way to work around the library's (in our case, snakeyaml's) limitations by overriding some methods. Thanks a lot! It may well lead to a final solution, but not yet. Besides, the simple solution below is robust and worth considering even if we had found a complete solution that reaches into the library.
I've decided to solve the task by double distilling, sorry, processing the configuration file. Imagine the latter consisting of several parts, each marked by a unique token-delimiter. For the sake of keeping the YAML-likeness, it may be
---
#this is a unique key for the configuration A
<some YAML document>
---
#this is another key for the configuration B
<some YAML document>
The first pass is pre-processing. For the given String fileString and String key (and DELIMITER = "\n---\n", for example) we select a substring with the key-defined configuration:
int begIndex;
do {
begIndex= fileString.indexOf(DELIMITER);
if (begIndex == -1) {
break;
}
if (fileString.startsWith(DELIMITER + key, begIndex)) {
fileString = fileString.substring(begIndex + DELIMITER.length() + key.length());
break;
}
// spoil alien delimiter and repeat search
fileString = fileString.replaceFirst(DELIMITER, " ");
} while (true);
int endIndex = fileString.indexOf(DELIMITER);
if (endIndex != -1) {
fileString = fileString.substring(0, endIndex);
}
Now we feed the fileString to the simple YAML parsing
ExportConfiguration configuration = new Yaml(new Constructor(ExportConfiguration.class))
.loadAs(fileString, ExportConfiguration.class);
This time we have a single document that must correspond to the ExportConfiguration class.
Note 1: The structure and even the very content of the rest of the configuration file play absolutely no role. This was the main idea: to get independent configurations in a single file.
Note 2: The rest of the configurations may be JSON or XML or whatever. We have a pre-processor method that returns a String configuration, and the next processor parses it properly.
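The pre-processing pass can also be written as one self-contained method. This is a sketch of a variation, not the loop above: it splits the file on lines that contain only `---` and returns the text of the section whose first line equals the key. The names `SectionExtractor` and `extract` are hypothetical, and it assumes that no section body contains a bare `---` line of its own (for example inside a block scalar).

```java
import java.util.Optional;

// Hypothetical helper for selecting one keyed section out of a multi-part file.
final class SectionExtractor {

    // Returns the text following the key line of the matching section,
    // or an empty Optional when no section starts with the key.
    static Optional<String> extract(String fileString, String key) {
        // Split on lines that consist of '---' (plus optional trailing blanks).
        for (String chunk : fileString.split("(?m)^---[ \\t]*\\R?")) {
            int lineEnd = chunk.indexOf('\n');
            String firstLine = (lineEnd < 0 ? chunk : chunk.substring(0, lineEnd)).trim();
            // Compare the whole first line, so key "#A" does not match "#AB".
            if (firstLine.equals(key)) {
                return Optional.of(lineEnd < 0 ? "" : chunk.substring(lineEnd + 1));
            }
        }
        return Optional.empty();
    }
}
```

The returned string can then be handed to whichever parser fits that section, as described in Note 2.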

Best way to parse commands in a java text-based game

I'm developing a text based game in java and I'm looking for the best way to deal with player's commands. Commands allow the player to interact with the environment, like :
"look north" : to have a full description of what you have in the north direction
"drink potion" : to pick an object named "potion" in your inventory and drink it
"touch 'strange button'" : touch the object called 'strange button' and trigger an action if there is one attached to it, like "oops you died..."
"inventory" : to have a full description of your inventory
etc...
My objective is now to develop a complete set of those simple commands, but I'm having trouble finding an easy way to parse them. I would like to develop a flexible and extensible parser which could dispatch on the main command like "look", "use", "attack", etc., each of which would have a specific syntax and actions in the game.
I found a lot of tools to parse command line arguments like -i -v --verbose, but none of them seems flexible enough to fit my needs. They can parse arguments one by one, but without taking into account a specific syntax for each of them. I tried JCommander, which seems to be perfect, but I'm lost between what is an argument, what is a parameter, who calls whom, etc.
So if someone could help me to pick the correct java library to do that, that would be great :)
Unless you're dealing with complex command strings that involve, for instance, arithmetic expressions or well-balanced parentheses, I would suggest you go with a plain Scanner.
Here's an example that I would find readable and easy to maintain:
interface Action {
void run(Scanner args);
}
class Drink implements Action {
@Override
public void run(Scanner args) {
if (!args.hasNext())
throw new IllegalArgumentException("What should I drink?");
System.out.println("Drinking " + args.next());
}
}
class Look implements Action {
@Override
public void run(Scanner args) {
if (!args.hasNext())
throw new IllegalArgumentException("Where should I look?");
System.out.println("Looking " + args.next());
}
}
And use it as
Map<String, Action> actions = new HashMap<>();
actions.put("look", new Look());
actions.put("drink", new Drink());
String command = "drink coke";
// Parse
Scanner cmdScanner = new Scanner(command);
actions.get(cmdScanner.next()).run(cmdScanner);
You could even make it fancier and use annotations instead as follows:
@Retention(RetentionPolicy.RUNTIME)
@interface Command {
String value();
}
@Command("drink")
class Drink implements Action {
...
}
@Command("look")
class Look implements Action {
...
}
And use it as follows:
List<Action> actions = Arrays.asList(new Drink(), new Look());
String command = "drink coke";
// Parse
Scanner cmdScanner = new Scanner(command);
String cmd = cmdScanner.next();
for (Action a : actions) {
if (a.getClass().getAnnotation(Command.class).value().equals(cmd))
a.run(cmdScanner);
}
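One case from the question that a plain `Scanner.next()` would not handle is `touch 'strange button'`, because the quoted name contains a space and would be split into two tokens. A minimal sketch of a tokenizer that keeps quoted names together follows. `CommandTokenizer` and `tokenize` are hypothetical names, and the sketch assumes quotes are never nested or escaped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits a command line into words, treating
// '...' and "..." as single tokens with the quotes removed.
final class CommandTokenizer {
    private static final Pattern TOKEN =
            Pattern.compile("'([^']*)'|\"([^\"]*)\"|(\\S+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                tokens.add(m.group(1));      // single-quoted text
            } else if (m.group(2) != null) {
                tokens.add(m.group(2));      // double-quoted text
            } else {
                tokens.add(m.group(3));      // bare word
            }
        }
        return tokens;
    }
}
```

The first token can then be used as the key into the `actions` map, and the remaining tokens passed to the action in place of the `Scanner`.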
I don't think you want to parse command line arguments. That would mean each "move" in your game would require running a new JVM instance to run a different program and extra complexity of saving state between JVM sessions etc.
This looks like a text based game where you prompt users for what to do next. You probably just want to have users enter input on STDIN.
Example, let's say your screen says:
You are now in a dark room. There is a light switch
what do you want to do?
1. turn on light
2. Leave room back the way you came.
Please choose option:
Then the user types 1 or 2 or, if you want to be fancy, "turn on light" etc. You then read a line from STDIN and parse the String to see what the user chose. I recommend you look at java.util.Scanner to see how to easily parse text:
Scanner scanner = new Scanner(System.in);
String userInput = scanner.nextLine(); // Scanner has nextLine(), not readLine()
//parse userInput string here
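For the numbered menu above, the "parse the String" step could look like the following sketch. `MenuInput` and `parseChoice` are hypothetical names; the method only validates a numeric choice and leaves free-text commands such as "turn on light" to other code.

```java
import java.util.OptionalInt;

// Hypothetical helper: turns one line of user input into a menu choice.
final class MenuInput {

    // Returns the chosen option (1..optionCount), or empty when the line
    // is not a number or is out of range.
    static OptionalInt parseChoice(String line, int optionCount) {
        try {
            int choice = Integer.parseInt(line.trim());
            return choice >= 1 && choice <= optionCount
                    ? OptionalInt.of(choice)
                    : OptionalInt.empty();
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

An empty result is the cue to print the prompt again rather than to fail.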
The fun part is to have commands that are human readable and, at the same time, machine parsable.
First of all, you need to define the syntax of your language, for example:
look (north|south|east|west)
But that is a regular expression, which is generally not the best way to express a syntactical rule, so I would say this is better:
Sequence("look", Xor("north", "south", "east", "west"));
I think you get the idea. You need to define something like:
public abstract class Syntax { public abstract boolean match(String cmd); }
then
public class Atom extends Syntax { private String keyword; }
public class Sequence extends Syntax { private List<Syntax> atoms; }
public class Xor extends Syntax { private List<Syntax> atoms; }
Use a bunch of factory functions to wrap the constructors, each returning Syntax. Then you will eventually have something like this:
class GlobeSyntax
{
Syntax syntax = Xor( // exclusive or
Sequence(Atom("look"),
Xor(Atom("north"), Atom("south"), Atom("east"), Atom("west"))),
Sequence(Atom("drink"),
Or(Atom("Wine"), Atom("Drug"), Atom("Portion"))), // may drink multiple at the same time
/* ... */
);
}
or so.
Now all you need is a recursive parser that follows these rules.
As you can see, it's a recursive structure, very easy to code up and very easy to maintain. By doing this, your commands are not only human readable, but also machine parsable.
Sure, it's not finished yet: you need to define the actions. But that's easy, right? It's a typical OO trick. All you need to do is perform something when an Atom is matched.
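The recursive matcher that the answer leaves out can be sketched as follows. This is one possible shape under stated assumptions, not the answerer's code: `match` works on a token list and a position instead of the `match(String cmd)` signature above, the factory names `atom`, `sequence` and `xor` are illustrative, `xor` simply takes the first alternative that matches, and a sequence does not backtrack into earlier choices.

```java
import java.util.Arrays;
import java.util.List;

abstract class Syntax {

    // Returns the token index after a successful match starting at pos,
    // or -1 when this rule does not match there.
    abstract int match(List<String> tokens, int pos);

    // A command matches when the rule consumes every token.
    boolean matches(String cmd) {
        List<String> tokens = Arrays.asList(cmd.trim().split("\\s+"));
        return match(tokens, 0) == tokens.size();
    }

    // Matches exactly one keyword, ignoring case.
    static Syntax atom(String keyword) {
        return new Syntax() {
            @Override
            int match(List<String> tokens, int pos) {
                boolean ok = pos < tokens.size()
                        && tokens.get(pos).equalsIgnoreCase(keyword);
                return ok ? pos + 1 : -1;
            }
        };
    }

    // Matches each part in order, feeding the end of one into the next.
    static Syntax sequence(Syntax... parts) {
        return new Syntax() {
            @Override
            int match(List<String> tokens, int pos) {
                for (Syntax part : parts) {
                    pos = part.match(tokens, pos);
                    if (pos < 0) {
                        return -1;
                    }
                }
                return pos;
            }
        };
    }

    // Tries the alternatives in order and returns the first success.
    static Syntax xor(Syntax... alternatives) {
        return new Syntax() {
            @Override
            int match(List<String> tokens, int pos) {
                for (Syntax alternative : alternatives) {
                    int next = alternative.match(tokens, pos);
                    if (next >= 0) {
                        return next;
                    }
                }
                return -1;
            }
        };
    }
}
```

With `sequence(atom("look"), xor(atom("north"), atom("south"), atom("east"), atom("west")))`, the input "look north" matches while "look up" and a bare "look" do not. Hooking actions in would mean having a successful match also run a callback.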

Static analysis of Javascript with Java

I need to do a static analysis of Javascript files using Java. I need to check whether a Javascript file has any function calls to document.write() or references to properties like innerHTML etc. Can I use the javax.script.* package to achieve this, or which Java API do I need to use for parsing? Also, can you provide examples for the same?
You can't statically analyze Javascript in the way you intend because Javascript is not a statically typed language.
You can check for document.write() but what if my code was this:
var whatever = document; whatever.write()
Or do you want to reject any function named write() even if it didn't write to the document?
Furthermore, Javascript has an eval function so you could always do:
var m = "ment";
eval("docu" + m + ".wri" + "te('hahahaha')");
How are you going to check for that?
Similarly, property access can be done in many ways.
Imagine this piece of code:
var x = document.children[0];
x.innerHTML = ...;
x["inner" + "HTML"] = ...;
var y = "inner";
x[y + "HTML"] = ...;
You're not going to be able to detect all those variants, and the hundreds more variants that you could make, using static analysis.
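To make the limitation concrete, here is a sketch of the kind of naive textual check one might try first, written in Java since that is what the question asks for. `NaiveJsScan` and `mentionsDocumentWrite` are hypothetical names. It flags the direct call but, as argued above, it cannot see the aliased or eval-built variants.

```java
import java.util.regex.Pattern;

// Hypothetical helper: a purely textual scan, shown to illustrate what it misses.
final class NaiveJsScan {
    private static final Pattern DOCUMENT_WRITE =
            Pattern.compile("\\bdocument\\s*\\.\\s*write\\s*\\(");

    // True only when the source literally contains "document.write(".
    static boolean mentionsDocumentWrite(String source) {
        return DOCUMENT_WRITE.matcher(source).find();
    }
}
```

Running it on `var whatever = document; whatever.write()` or on the eval example reports nothing, even though both end up calling document.write.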

How to merge two ASTs?

I'm trying to implement a tool for merging different versions of some source code. Given two versions of the same source code, the idea would be to parse them, generate the respective Abstract Syntax Trees (AST), and finally merge them into a single output source keeping grammatical consistency - the lexer and parser are those of question ANTLR: How to skip multiline comments.
I know there is class ParserRuleReturnScope that helps... but getStop() and getStart() always return null :-(
Here is a snippet that illustrates how I modified my parser to get rules printed:
parser grammar CodeTableParser;
options {
tokenVocab = CodeTableLexer;
backtrack = true;
output = AST;
}
@header {
package ch.bsource.ice.parsers;
}
@members {
private void log(ParserRuleReturnScope rule) {
System.out.println("Rule: " + rule.getClass().getName());
System.out.println(" getStart(): " + rule.getStart());
System.out.println(" getStop(): " + rule.getStop());
System.out.println(" getTree(): " + rule.getTree());
}
}
parse
: codeTabHeader codeTable endCodeTable eof { log(retval); }
;
codeTabHeader
: comment CodeTabHeader^ { log(retval); }
;
...
Assuming you have the ASTs (often difficult to get in the first place, parsing real languages is often harder than it looks), you first have to determine what they have in common, and build a mapping collecting that information. That's not as easy as it looks; do you count a block of code that has moved, but is the same exact subtree, as "common"? What about two subtrees that are the same except for consistent renaming of an identifier? What about changed comments? (most ASTs lose the comments; most programmers will think this is a really bad idea).
You can build a variation of the "Longest Common Substring" algorithm to compare trees. I've used that in tools that I have built.
Finally, after you've merged the trees, you need to regenerate the text, ideally preserving most of the layout of the original code. (Programmers hate it when you change the layout they so lovingly produced.) So your ASTs need to capture position information, and your regeneration has to honor that where it can.
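As a starting point for the "what do they have in common" mapping, identical subtrees can be found by giving every subtree a canonical string and intersecting the two sets. This is a simple sketch for exact matches only, not the Longest Common Substring variation mentioned above. `AstNode` and `CommonSubtrees` are hypothetical types, and the canonical form assumes labels contain no parentheses or commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical minimal AST node: a label plus ordered children.
final class AstNode {
    final String label;
    final List<AstNode> children;

    AstNode(String label, AstNode... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }

    // Canonical text such as "assign(x,1)"; equal text means equal subtree.
    String canonical() {
        if (children.isEmpty()) {
            return label;
        }
        List<String> parts = new ArrayList<>();
        for (AstNode child : children) {
            parts.add(child.canonical());
        }
        return label + "(" + String.join(",", parts) + ")";
    }
}

final class CommonSubtrees {

    // Canonical forms of the root and all of its descendants.
    static Set<String> shapes(AstNode root) {
        Set<String> out = new TreeSet<>();
        collect(root, out);
        return out;
    }

    private static void collect(AstNode node, Set<String> out) {
        out.add(node.canonical());
        for (AstNode child : node.children) {
            collect(child, out);
        }
    }

    // Subtrees that occur in both trees, regardless of where they sit.
    static Set<String> common(AstNode a, AstNode b) {
        Set<String> shared = shapes(a);
        shared.retainAll(shapes(b));
        return shared;
    }
}
```

Because position is ignored, a block that merely moved still counts as common; whether that is the desired answer to the questions raised above is a design decision for the merge tool. Renamed identifiers and comments are not handled.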
The call to log(retval) in your parser code looks like it's going to happen at the end of the rule, but it's not. You'll want to move the call into an @after block.
I changed log to spit out a message as well as the scope information and added calls to it to my own grammar like so:
script
@init {log("@init", retval);}
@after {log("@after", retval);}
: statement* EOF {log("after last rule reference", retval);}
-> ^(STMTS statement*)
;
Parsing test input produced the following output:
Logging from @init
getStart(): [#0,0:4='Print',<10>,1:0]
getStop(): null
getTree(): null
Logging from after last rule reference
getStart(): [#0,0:4='Print',<10>,1:0]
getStop(): null
getTree(): null
Logging from @after
getStart(): [#0,0:4='Print',<10>,1:0]
getStop(): [#4,15:15='<EOF>',<-1>,1:15]
getTree(): STMTS
The call in the after block has both the stop and tree fields populated.
I can't say whether this will help you with your merging tool, but I think this will at least get you past the problem with the half-populated scope object.

How to use an array value as field in Java? a1.section[2] = 1;

New to Java, and can't figure out what I hope to be a simple thing.
I keep "sections" in an array:
//Section.java
public static final String[] TOP = {
"Top News",
"http://www.mysite.com/RSS/myfeed.csp",
"top"
};
I'd like to do something like this:
Article a1 = new Article();
a1.["s_" + section[2]] = 1; //should resolve to a1.s_top = 1;
But it won't let me, as it doesn't know what "section" is. (I'm sure seasoned Java people will cringe at this attempt... but my searches have come up empty on how to do this)
Clarification:
My article mysqlite table has fields for the "section" of the article:
s_top
s_sports
...etc
When doing my import from an XML file, I'd like to set that field to a 1 if it's in that category. I could have a switch statement:
//whatever the Java version of this is
switch(section[2]) {
case "top": a1.s_top = 1; break;
case "sports": a1.s_sports = 1; break;
//...
}
But I thought it'd be a lot easier to just write it as a single line:
a1["s_"+section[2]] = 1;
In Java, it's a pain to do what you want to do in the way that you're trying to do it.
If you don't want to use the switch/case statement, you could use reflection to pull up the member attribute you're trying to set:
Class articleClass = a1.getClass();
Field field = articleClass.getField("s_top");
field.set(a1, 1);
It'll work, but it may be slow and it's an atypical approach to this problem.
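Applied to the question's `"s_" + section[2]` idea, the reflection approach could be wrapped as in the sketch below. The `Article` class here is a stand-in with two public int fields as described in the question, and `SectionFlags.mark` is a hypothetical helper. `Class.getField` only finds public fields, which is why they are declared public.

```java
import java.lang.reflect.Field;

// Stand-in for the question's Article, with one public flag per section.
class Article {
    public int s_top;
    public int s_sports;
}

// Hypothetical helper: sets the flag field named "s_" + sectionKey to 1.
final class SectionFlags {

    static void mark(Article article, String sectionKey) {
        try {
            Field field = Article.class.getField("s_" + sectionKey);
            field.setInt(article, 1);
        } catch (ReflectiveOperationException e) {
            // Covers both a missing field and an inaccessible one.
            throw new IllegalArgumentException(
                    "No section field for: " + sectionKey, e);
        }
    }
}
```

With the question's array, the call would be `SectionFlags.mark(a1, Section.TOP[2])`. An unknown section name surfaces as an exception at run time rather than as a compile error, which is part of why this approach is atypical.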
Alternatively, you could store either a Set<String> or a Map<String,Boolean> inside your Article class, and have a public method in Article called putSection(String section). As you iterate, you would put the various section strings (or string/value mappings) into that collection for each Article. So, instead of statically defining which sections may exist and giving each Article a yes or no, you'd allow the list of possible sections to be dynamic and based on your XML import.
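A minimal sketch of that collection-based design, assuming a set of section names is enough. The class is called `SectionedArticle` here only to keep it distinct from the other examples, and `inSection` and `getSections` are illustrative additions next to the suggested `putSection`.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical article that records its sections dynamically
// instead of having one field per section.
final class SectionedArticle {
    private final Set<String> sections = new LinkedHashSet<>();

    // Called once per section found during the import.
    void putSection(String section) {
        sections.add(section);
    }

    boolean inSection(String section) {
        return sections.contains(section);
    }

    // Read-only view, in insertion order.
    Set<String> getSections() {
        return Collections.unmodifiableSet(sections);
    }
}
```

When writing to the database, each name in `getSections()` can be mapped to its `s_` column at that point.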
Java variables are not "dynamic", unlike ActionScript, for example. You cannot read or assign a variable without knowing it at compile time (well, with reflection you could, but it's far too complex).
So yes, the solution is to have a switch/case (only possible on strings from Java 1.7 onwards), or to use a HashMap or equivalent.
Or, if it's about importing XML, maybe you should take a look at JAXB.
If you are trying to get an attribute from an object, you need to make sure that you have "getters" and "setters" in your object. You also have to make sure you define Section in your article class.
Something like:
class Article {
    private String section;

    // constructor
    public Article() {
    }

    // set section
    public void setSection(String section) {
        this.section = section;
    }

    // get section
    public String getSection() {
        return this.section;
    }
}
