I'm trying to restrict the use of loops (the FOR and WHILE keywords) in a Groovy script.
I tried http://groovy-sandbox.kohsuke.org/ but it does not seem to be possible to restrict loops with this library.
Code:
final String script = "while(true){}";
final ImportCustomizer imports = new ImportCustomizer();
imports.addStaticStars("java.lang.Math");
imports.addStarImports("groovyx.net.http");
imports.addStaticStars("groovyx.net.http.ContentType", "groovyx.net.http.Method");
final SecureASTCustomizer secure = new SecureASTCustomizer();
secure.setClosuresAllowed(true);
List<Integer> tokensBlacklist = new ArrayList<>();
tokensBlacklist.add(Types.KEYWORD_WHILE);
secure.setTokensBlacklist(tokensBlacklist);
final CompilerConfiguration config = new CompilerConfiguration();
config.addCompilationCustomizers(imports, secure);
Binding intBinding = new Binding();
GroovyShell shell = new GroovyShell(intBinding, config);
final Object eval = shell.evaluate(script);
What's wrong with my code? Or does someone know another way to restrict loops or other operators?
WHILE and FOR loops are statements, so you should add them to the statements blacklist instead of the tokens blacklist:
List<Class> statementBlacklist = new ArrayList<>();
statementBlacklist.add( org.codehaus.groovy.ast.stmt.WhileStatement.class );
secure.setStatementsBlacklist( statementBlacklist );
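If you also want to block for and do/while loops, a minimal sketch of the complete setup (untested, using the statement classes from org.codehaus.groovy.ast.stmt) could look like this:
final SecureASTCustomizer secure = new SecureASTCustomizer();
secure.setClosuresAllowed(true);
// loops are statements in the AST, so blacklist the statement classes
final List<Class> statementBlacklist = new ArrayList<>();
statementBlacklist.add(org.codehaus.groovy.ast.stmt.WhileStatement.class);
statementBlacklist.add(org.codehaus.groovy.ast.stmt.DoWhileStatement.class);
statementBlacklist.add(org.codehaus.groovy.ast.stmt.ForStatement.class);
secure.setStatementsBlacklist(statementBlacklist);
final CompilerConfiguration config = new CompilerConfiguration();
config.addCompilationCustomizers(secure);
// "while(true){}" should now be rejected at compile time instead of hanging
final GroovyShell shell = new GroovyShell(new Binding(), config);
shell.evaluate("while(true){}"); // throws a compilation error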
I wonder what the proper way is to reuse a normalizer in ND4J/DL4J. Currently, I save it as follows:
final DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit( trainingData );
normalizer.transform( trainingData );
normalizer.transform( testData );
try {
final NormalizerSerializer normalizerSerializer = new NormalizerSerializer();
normalizerSerializer.addStrategy( new StandardizeSerializerStrategy() );
normalizerSerializer.write( normalizer, path );
} catch ( final IOException e ) {
// ...
}
And load it via:
try {
final NormalizerSerializer normalizerSerializer = new NormalizerSerializer();
normalizerSerializer.addStrategy( new StandardizeSerializerStrategy() );
final DataNormalization normalizer = normalizerSerializer.restore( path );
} catch ( final Exception e ) { // Throws Exception instead of IOException.
// ...
}
Is that OK? Unfortunately, I wasn't able to find more information in the docs.
This is what I do...
DataNormalization normalizer = new NormalizerStandardize();
normalizer.fit(trainingData);
normalizer.transform(trainingData);
Save it:
NormalizerSerializer saver = NormalizerSerializer.getDefaults();
File normalsFile = new File("fileName");
saver.write(normalizer,normalsFile);
Restore it:
NormalizerSerializer loader = NormalizerSerializer.getDefaults();
DataNormalization restoredNormalizer = loader.restore(normalsFile);
restoredNormalizer.transform(testData);
The ND4J Javadocs say that getDefaults() gets a serializer configured with strategies for the built-in normalizer implementations. As you are using NormalizerStandardize, getDefaults() offers a shorthand way of achieving the same end without explicitly adding the strategy.
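Side by side, using only the calls already shown above, the explicit registration from the question and the getDefaults() shorthand from this answer do the same thing for NormalizerStandardize:
// explicit registration, as in the question
NormalizerSerializer serializer = new NormalizerSerializer();
serializer.addStrategy(new StandardizeSerializerStrategy());
serializer.write(normalizer, normalsFile);
// shorthand, as in this answer; covers all built-in normalizers
NormalizerSerializer saver = NormalizerSerializer.getDefaults();
saver.write(normalizer, normalsFile);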
I want to know how to get all free variables of a Groovy script from Java code.
Groovy script:
Integer v = a - b;
Integer k = 6 * v;
return k > 0;
Call from Java:
Binding binding = new Binding();
GroovyShell groovyShell = new GroovyShell(binding);
Script script = groovyShell.parse(...);
script.getFreeVariables(); // == set with "a","b". Want something like this.
I know a crude way: call script.run() and catch the exception; the exception gives me the name of a variable that I didn't pass to the script.
Groovy:
def s0=a+b+2
s1=a+b+1
a=b*2
a+b+3 //the last expression will be returned. the same as return a+b+3
Java:
GroovyShell groovyShell = new GroovyShell();
Script script = groovyShell.parse(...);
Map bindings = script.getBinding().getVariables();
bindings.put("a",new Long(1));
bindings.put("b",new Long(2));
Object ret = script.run(); //a+b+3
//and if you changed variables in script you can get their values
Object aAfter = bindings.get("a"); //4 because `a=b*2` in groovy
Object bAfter = bindings.get("b"); //2 not changed
//also bindings will have all undeclared variables
Object s1 = bindings.get("s1"); //a+b+1
//however declared variables will not be visible - they are local
Object s0 = bindings.get("s0"); //null
By default, Groovy resolves variables dynamically at runtime, not at compile time.
You can intercept access to those properties:
1/
import org.codehaus.groovy.control.CompilerConfiguration;
abstract class MyScript extends groovy.lang.Script{
public Object getProperty(String name){
System.out.println("getProperty "+name);
return new Long(5);
}
}
CompilerConfiguration cc = new CompilerConfiguration();
cc.setScriptBaseClass( MyScript.class.getName() );
GroovyShell groovyShell = new GroovyShell(this.getClass().getClassLoader(),cc);
Script script = groovyShell.parse("1+2+a");
Object ret = script.run();
2/
GroovyShell groovyShell = new GroovyShell();
Script script = groovyShell.parse("1+2+a");
Map bindings = new HashMap(){
public Object get(Object key){
System.out.println ("get "+key);
return new Long(5);
}
};
script.setBinding(new Binding(bindings));
Object ret = script.run();
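Building on approach 2, a small untested sketch that records every name the script asks for; after run() the set contains the free variables:
final Set<String> freeVariables = new HashSet<>();
Map bindings = new HashMap(){
public Object get(Object key){
freeVariables.add(String.valueOf(key)); // remember every name the script requests
return new Long(5); // dummy value so evaluation can continue
}
};
GroovyShell groovyShell = new GroovyShell();
Script script = groovyShell.parse("1+2+a");
script.setBinding(new Binding(bindings));
script.run();
System.out.println(freeVariables); // [a]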
I had the following dataset as input
id,name,gender
asinha161,Aniruddha,Male
vic,Victor,Male
day1,Daisy,Female
jazz030,Jasmine,Female
Mic002,Michael,Male
I aimed at segregating the males and females into two separate output files as follows
Dataset for males
id,name,gender
asinha161,Aniruddha,Male
vic,Victor,Male
Mic002,Michael,Male
Dataset for females
id,name,gender
day1,Daisy,Female
jazz030,Jasmine,Female
Now, I attempted to write Cascading Framework code that is supposed to do the above task; the code is as follows:
public class Main {
public static void main(String[] args) {
Tap sourceTap = new FileTap(new TextDelimited(true, ","), "inputFile.txt");
Tap sink_one = new FileTap(new TextDelimited(true, ","), "maleFile.txt");
Tap sink_two = new FileTap(new TextDelimited(true, ","), "FemaleFile.txt");
Pipe assembly = new Pipe("inputPipe");
// ...split into two pipes
Pipe malePipe = new Pipe("for_male", assembly);
malePipe=new Each(malePipe,new CustomFilterByGender("male"));
Pipe femalePipe = new Pipe("for_female", assembly);
femalePipe=new Each(femalePipe, new CustomFilterByGender("female"));
// create the flow
List<Pipe> pipes = new ArrayList<Pipe>(2);
pipes.add(malePipe);
pipes.add(femalePipe);
Tap outputTap=new MultiSinkTap<>(sink_one,sink_two);
FlowConnector flowConnector = new LocalFlowConnector();
Flow flow = flowConnector.connect(sourceTap, outputTap, pipes);
flow.complete();
}
}
where CustomFilterByGender(String gender) is a custom filter that keeps tuples according to the gender value passed as an argument.
Please note that, for the sake of efficiency, I have not used a custom Buffer.
Using MultiSinkTap, I am not able to get the desired output, since the connect() method of the LocalFlowConnector does not accept the MultiSinkTap object, which results in a compile-time error.
I would appreciate suggestions for changes to the above code to make it work, or for the correct way to use MultiSinkTap.
Thank you for patiently going through the question :)
I think you want to write the output of different pipes into different output files. I made some changes to your code that should solve your problem:
public class Main {
public static void main(String[] args) {
Tap sourceTap = new FileTap(new TextDelimited(true, ","), "inputFile.txt");
Tap sink_one = new FileTap(new TextDelimited(true, ","), "maleFile.txt");
Tap sink_two = new FileTap(new TextDelimited(true, ","), "FemaleFile.txt");
Pipe assembly = new Pipe("inputPipe");
Pipe malePipe = new Pipe("for_male", assembly);
malePipe=new Each(malePipe,new CustomFilterByGender("male"));
Pipe femalePipe = new Pipe("for_female", assembly);
femalePipe=new Each(femalePipe, new CustomFilterByGender("female"));
List<Pipe> pipes = new ArrayList<Pipe>(2);
pipes.add(malePipe);
pipes.add(femalePipe);
Map<String, Tap> sinks = new HashMap<String, Tap>();
sinks.put("for_male", sink_one);
sinks.put("for_female", sink_two);
FlowConnector flowConnector = new LocalFlowConnector();
Flow flow = flowConnector.connect(sourceTap, sinks, pipes);
flow.complete();
}
}
Instead of using MultiSinkTap, you can directly pass a Map of sinks, keyed by pipe name, to connect the output pipes (in this case malePipe and femalePipe) to their respective taps.
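Since CustomFilterByGender is not shown in the question, here is a hypothetical sketch of what such a Cascading filter might look like (the field name "gender" is an assumption based on the input header):
public class CustomFilterByGender extends BaseOperation implements Filter {
    private final String gender;

    public CustomFilterByGender(String gender) {
        this.gender = gender;
    }

    @Override
    public boolean isRemove(FlowProcess flowProcess, FilterCall filterCall) {
        // remove tuples whose "gender" field does not match the requested value
        String value = filterCall.getArguments().getString("gender");
        return !gender.equalsIgnoreCase(value);
    }
}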
We were able to successfully integrate Drools with Spark. When we apply Drools rules to a batch file that lives in HDFS it works, but we also tried to use Drools on a streaming source so that we can make decisions instantly, and we couldn't figure out how to do it. Below is a snippet of the code showing what we are trying to achieve.
Case 1:
SparkConf conf = new SparkConf().setAppName("sample");
JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> javaRDD = sc.textFile("/user/root/spark/sample.dat");
List<String> store = new ArrayList<String>();
store = javaRDD.collect();
Case 2: when we use the streaming context
SparkConf sparkconf = new SparkConf().setAppName("sparkstreaming");
JavaStreamingContext ssc =
new JavaStreamingContext(sparkconf, new Duration(1));
JavaDStream<String> lines = ssc.socketTextStream("xx.xx.xx.xx", xxxx);
In the first case we were able to apply our rules to the variable store, but in the second case we were not able to apply rules to the DStream lines.
If someone has an idea of how this can be done, it would be a great help.
Here is one way to get it done.
Create your knowledge session with business rules first.
//Create knowledge and session here
KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();
KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
kbuilder.add( ResourceFactory.newFileResource( "rulefile.drl"),
ResourceType.DRL );
Collection<KnowledgePackage> pkgs = kbuilder.getKnowledgePackages();
kbase.addKnowledgePackages( pkgs );
final StatelessKnowledgeSession ksession = kbase.newStatelessKnowledgeSession();
Create JavaDStream using StreamingContext.
SparkConf sparkconf = new SparkConf().setAppName("sparkstreaming");
JavaStreamingContext ssc =
new JavaStreamingContext(sparkconf, new Duration(1));
JavaDStream<String> lines = ssc.socketTextStream("xx.xx.xx.xx", xxxx);
Call DStream's foreachRDD to create facts and fire your rules.
lines.foreachRDD(new Function<JavaRDD<String>, Void>() {
@Override
public Void call(JavaRDD<String> rdd) throws Exception {
List<String> facts = rdd.collect();
//Apply rules on facts here
ksession.execute(facts);
return null;
}
});
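As with any Spark Streaming job, remember to start the streaming context after wiring up the DStream operations, otherwise nothing is processed:
ssc.start();
ssc.awaitTermination(); // newer Spark versions declare throws InterruptedException here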
Once a 10-fold cross-validation is done with a classifier, how can I print out the predicted class of every instance and the class distribution for each instance?
J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
eval.crossValidateModel(j48, newData, 10, new Random(1));
When I tried something similar to below, it said that the classifier is not built.
for (int i=0; i<data.numInstances(); i++){
System.out.println(j48.distributionForInstance(newData.instance(i)));
}
What I'm trying to do is the same as in the WEKA GUI: once a classifier is trained, I can click on "Visualize classifier errors" > Save, and I will find the predicted class in the saved file. But now I need it to work in my own Java code.
I have tried something like below:
J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
StringBuffer forPredictionsPrinting = new StringBuffer();
weka.core.Range attsToOutput = null;
Boolean outputDistribution = new Boolean(true);
eval.crossValidateModel(j48, newData, 10, new Random(1), forPredictionsPrinting, attsToOutput, outputDistribution);
Yet it gives me the error:
Exception in thread "main" java.lang.ClassCastException: java.lang.StringBuffer cannot be cast to weka.classifiers.evaluation.output.prediction.AbstractOutput
The crossValidateModel() method can take a forPredictionsPrinting varargs parameter that is a weka.classifiers.evaluation.output.prediction.AbstractOutput instance.
The important part of that is a StringBuffer that holds a string representation of all the predictions. The following code is untested JRuby, but you should be able to convert it for your needs.
j48 = J48.new
eval = Evaluation.new(newData)
predictions = java.lang.StringBuffer.new
eval.crossValidateModel(j48, newData, 10, Random.new(1), predictions, Range.new('1'), true)
# the variable predictions now holds a string of all the individual predictions
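If you want to stay in Java, and given the ClassCastException above, newer Weka versions expect an AbstractOutput instead of a raw StringBuffer; an untested sketch using PlainText might look like this:
J48 j48 = new J48();
Evaluation eval = new Evaluation(newData);
// PlainText is an AbstractOutput that writes each prediction into a buffer
StringBuffer buffer = new StringBuffer();
PlainText output = new PlainText();
output.setBuffer(buffer);
output.setHeader(newData);
output.setOutputDistribution(true); // also print the class distribution per instance
eval.crossValidateModel(j48, newData, 10, new Random(1), output);
System.out.println(buffer); // actual vs. predicted class (and distribution) per instance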
I was stuck on this some days ago. I wanted to evaluate a Weka classifier in MATLAB using a matrix instead of loading from an ARFF file. I used http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface and the following source code. I hope this helps someone else.
import weka.classifiers.*;
import java.util.*
wekaClassifier = javaObject('weka.classifiers.trees.J48');
wekaClassifier.buildClassifier(processed);%Loaded from loadARFF
e = javaObject('weka.classifiers.Evaluation',processed);%Loaded from loadARFF
myrand = Random(1);
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer)
bool = javaObject('java.lang.Boolean',true);
range = javaObject('weka.core.Range','1');
array = javaArray('java.lang.Object',3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
e.crossValidateModel(wekaClassifier,testing,10,myrand,array)
e.toClassDetailsString
Asdrúbal López-Chau
clc
clear
%Load from disk
fileDataset = 'cm1.arff';
myPath = 'C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\AvTh\MyPapers\Papers2014\UnderOverSampling\data\Skewed\datasetsKeel\';
javaaddpath('C:\Users\Asdrubal\Google Drive\Respaldo\DoctoradoALCPC\Doctorado ALC PC\AlcMobile\JarsForExperiments\weka.jar');
wekaOBJ = loadARFF([myPath fileDataset]);
%Transform from data into Matlab
[data, featureNames, targetNDX, stringVals, relationName] = ...
weka2matlab(wekaOBJ,'[]');
%Create testing and training sets in matlab format (this can be improved)
[tam, dim] = size(data);
idx = randperm(tam);
testIdx = idx(1 : tam*0.3);
trainIdx = idx(tam*0.3 + 1:end);
trainSet = data(trainIdx,:);
testSet = data(testIdx,:);
%Transform the training and the testing sets into the Weka format
testingWeka = matlab2weka('testing', featureNames, testSet);
trainingWeka = matlab2weka('training', featureNames, trainSet);
%Now evaluate classifier
import weka.classifiers.*;
import java.util.*
wekaClassifier = javaObject('weka.classifiers.trees.J48');
wekaClassifier.buildClassifier(trainingWeka);
e = javaObject('weka.classifiers.Evaluation',trainingWeka);
myrand = Random(1);
plainText = javaObject('weka.classifiers.evaluation.output.prediction.PlainText');
buffer = javaObject('java.lang.StringBuffer');
plainText.setBuffer(buffer)
bool = javaObject('java.lang.Boolean',true);
range = javaObject('weka.core.Range','1');
array = javaArray('java.lang.Object',3);
array(1) = plainText;
array(2) = range;
array(3) = bool;
e.crossValidateModel(wekaClassifier,testingWeka,10,myrand,array)%U
e.toClassDetailsString