Reformatting code with Regular Expressions

We have an ArrayList of items in several classes which gives me trouble every time I'd like to insert a new item into the list. It was a mistake on my part to have designed the classes the way I did, but changing the design now would be more headache than it's worth (bureaucratic waterfall model). I should have anticipated format changes to the documents the customer was supplying us, waterfall be damned.
I'd like to write a simple script in Python which goes into a class, adds the item to the list, and then increments all retrievals for the following items. That probably isn't very clear, so here is an example:
Foo extends Bar{
public Foo(){
m_Tags.add("Jane");
m_Tags.add("Bob");
m_Tags.add("Jim");
}
public String GetJane() { return m_ParsedValue.get( m_Tags.get(1) ); }
public String GetBob() { return m_ParsedValue.get( m_Tags.get(2) ); }
public String GetJim() { return m_ParsedValue.get( m_Tags.get(3) ); }
}
You see if I want to add a value between "Jane" and "Bob" I then have to increment the integers in the Get* functions. I just want to write a simple script in Python that does the work for me. Someone I very much respect suggested regex.
Edit:
Yes, LinkedHashMap. So simple, so easy and so not in the design specs now. I hate waterfall. Hate it with a passion. This whole bit was a "small" and "easy" part that "shouldn't take much time to design." I made mistakes. It's stuck in stone now.

You want your regular expression to be as flexible as the compiler will be with respect to whitespace between tokens. Doing so and mimicking whitespace usage makes the pattern pretty messy. The code below (sorry: Perl, not Python) edits your source files in-place.
#! /usr/bin/perl -i.bak
use warnings;
use strict;
my $template =
'^( public
String
Get)(\w+)( \( \) { return
m_ParsedValue . get \( m_Tags . get \( )(\d+)( \) \) ; } )$';
$template =~ s/ +/\\s*/g;
$template =~ s/(\r?\n)+/\\s+/g;
my $getter = qr/$template/x;
die "Usage: $0 after new-name source ..\n" unless @ARGV >= 3;
my $after = shift;
my $add = shift;
my $index;
while (<>) {
unless (/$getter/) {
print;
next;
}
my($abc,$name,$lmno,$i,$xyz) = ($1,$2,$3,$4,$5);
if (defined $index) {
print join "" => $abc, $name, $lmno, ++$index, $xyz;
}
else {
if ($name eq $after) {
$index = $i;
print; print join "" => $abc, $add, $lmno, ++$index, $xyz;
}
else { print; }
}
}
For example,
$ ./add-after Jane Foo code.java
$ cat code.java
Foo extends Bar{
public Foo(){
m_Tags.add("Jane");
m_Tags.add("Bob");
m_Tags.add("Jim");
}
public String GetJane() { return m_ParsedValue.get( m_Tags.get(1) ); }
public String GetFoo() { return m_ParsedValue.get( m_Tags.get(2) ); }
public String GetBob() { return m_ParsedValue.get( m_Tags.get(3) ); }
public String GetJim() { return m_ParsedValue.get( m_Tags.get(4) ); }
}

Don't do this with regexp. Create symbolic constants (using for example an enum) that map the names to numbers.
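A minimal sketch of the enum idea, assuming the tag order is the only thing that ever changes. The names Tag, TagDemo, buildTags and tagFor are hypothetical and not part of the question's code; a plain list stands in for m_Tags.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical enum: one constant per tag, declared in list order.
enum Tag {
    JANE("Jane"),
    BOB("Bob"),
    JIM("Jim");

    final String label;

    Tag(String label) {
        this.label = label;
    }
}

final class TagDemo {
    // Builds the tag list from the enum, so list positions always match ordinal().
    static List<String> buildTags() {
        List<String> tags = new ArrayList<>();
        for (Tag t : Tag.values()) {
            tags.add(t.label);
        }
        return tags;
    }

    // Looks a tag up by symbolic name instead of a hard-coded integer.
    static String tagFor(List<String> tags, Tag tag) {
        return tags.get(tag.ordinal());
    }

    public static void main(String[] args) {
        List<String> tags = buildTags();
        System.out.println(tagFor(tags, Tag.BOB));
    }
}
```

With this layout, adding a constant between JANE and BOB shifts the later ordinals automatically, so no getter needs to be edited by hand.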

Comments about bad practices aside, here is the code you asked for, in the language you asked for.
If you are keeping the system this way, the best approach would probably be to generate these Java files automatically in the build process itself: you'd just keep a list of names in a .txt file in the directory. This script is suitable for that.
(It won't edit your files in place; it generates new ones based on the template you posted here.)
import re, sys

# Skeleton of the generated class; the two %s slots take the tag lines and the getter lines.
template = """Foo extends Bar{
public Foo(){
%s
}
%s
}
"""
tag_templ = """ m_Tags.add("%s");"""
# Both the getter name and the index are filled in per tag.
getter_templ = """ public String Get%s() { return m_ParsedValue.get( m_Tags.get(%d) ); }"""

def parse_names(filename):
    # Collect the tag names from the existing m_Tags.add("...") calls.
    with open(filename) as f:
        data = f.read()
    return re.findall(r'm_Tags\.add\("(.*?)"', data)

def create_file(filename, names):
    tag_lines = [tag_templ % name for name in names]
    # Getter indexes start at 1, matching the original class.
    getter_lines = [getter_templ % (name, i + 1) for i, name in enumerate(names)]
    code = template % ("\n".join(tag_lines), "\n".join(getter_lines))
    with open(filename, "wt") as f:
        f.write(code)

def insert_name(after, new_name, names):
    names.insert(names.index(after) + 1, new_name)

if __name__ == "__main__":
    if len(sys.argv) != 4:
        sys.stderr.write("Usage: changer.py <filename> <name-before-insertion> <new-name>\n")
        sys.exit(1)
    filename, name_before, new_name = sys.argv[1:]
    names = parse_names(filename)
    insert_name(name_before, new_name, names)
    create_file(filename, names)

I'm doing this (well, something very similar) right now, but using Excel and VBA macros. All of the Business Values are organized and ordered in a spreadsheet. I just have to click a button to generate the appropriate code for the selected cells, then copy-paste to the IDE. Better yet, I have several "code columns" for each row. Some of them generate queries, some XSL transformations, and some procedures. For one row of business data, I can get all three types of generated code very easily.
I found this (re-generate) to be MUCH easier than re-formatting the existing code I had.

Related

Checking whether any of the string from a list of strings is present in a paragraph(string)?

I have a list of strings and a string value in my bean file.
Here is my bean file
public class DataBean {
List<String> combinations;
private String story;
public String getStory() {
return story;
}
public void setStory(String story) {
this.story = story;
}
public List<String> getCombinations() {
return combinations;
}
public void setCombinations(List<String> combinations) {
this.combinations = combinations;
}
public void addString(String value) {
if (combinations == null) {
combinations = new ArrayList<String>();
}
combinations.add(value);
}
}
I want to check whether the story consists of any of the strings from the list combinations.
If yes, print true; if no, print false.
I want to create this rule in my DRL file, but I couldn't understand the syntax. I am totally new to Drools; kindly help me with this.
I can create and execute simple rules, but I am not able to figure out rules of this nature.

Unless you have grossly mis-stated the nature of your problem, there is no simple way of writing a rule telling you whether the story is made up from the strings in combinations. I suggest that you write a
public static boolean isComposed(){
// ...
}
in class DataBean implementing the algorithm (see my comment) and use this on the RHS to check whether a fact of type DataBean contains data according to this condition.
Edit: According to the comment below my answer (but not in agreement with the phrase "whether the story consists of any of the strings from the list") you can write the following rule:
rule "check for word"
when
Wordlist( $words: words )
$word: String() from $words
$story: Story( $text: text, $text.contains( $word ) )
then
// String $story.text has $word as a substring
end
with classes
class Wordlist { List<String> words; ... }
class Story { String text; ... }
Note that the contains in the rule is java.lang.String.contains, which simply tests whether the argument is a substring of the String object. The rule will fire once for each word from the list words that occurs in text.
Also, this will produce misleading results. If, for instance, the story contains the word "portmanteau" and the list is made up from "port", "or", "man" and "ant", you'll get four incorrect firings. You can "cook" the preceding rule by using
rule "check for word"
when
Wordlist( $words: words )
$word: String() from $words
$story: Story( $text: text,
$text.matches( ".*\\b" + $word + "\\b.*" ) )
then
// String $story.text contains the $word
end
I add this to emphasize that a question must be stated precisely in order to elicit useful answers. Maybe neither of my proposals is what you are looking for.
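The difference between the two rules can be checked outside Drools with plain Java. This is only a sketch: the class WordMatchDemo and its method names are made up for illustration, and Pattern.quote is an addition (not in the rule above) so that words containing regex metacharacters are matched literally.

```java
import java.util.regex.Pattern;

final class WordMatchDemo {
    // Plain substring test, the same check String.contains performs in the first rule.
    static boolean containsSubstring(String text, String word) {
        return text.contains(word);
    }

    // Whole-word test: \b anchors both ends; Pattern.quote escapes regex metacharacters.
    static boolean containsWord(String text, String word) {
        return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
    }

    public static void main(String[] args) {
        String story = "He packed his portmanteau and left";
        System.out.println(containsSubstring(story, "port"));   // substring hit
        System.out.println(containsWord(story, "port"));        // no whole-word hit
        System.out.println(containsWord(story, "portmanteau")); // whole-word hit
    }
}
```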

Get last evaluated expression inside function

This is related to this other question:
Last evaluated expression in Javascript
But I wanted to provide more details about what I wanted to do and show how I finally solved the problem as some users requested in the comments.
I have snippets of Javascript that are written by users of my app. These snippets need to go into a kind of template like this:
var foo1 = function(data, options) {
<snippet of code written by user>
}
var foo2 = function(data, options) {
<snippet of code written by user>
}
...
Expressions can be very different, from simple things like this:
data.price * data.qty
To more complex things like this:
if (data.isExternal) {
data.email;
} else {
data.userId;
}
The value returned by the function should be always the last evaluated expression.
Before we had something like this:
var foo1 = function(data, options) {
return eval(<snippet of code written by user>);
}
But due to optimizations and changes we are making, we cannot keep using eval, but we need to return the last evaluated expression.
Just adding a 'return' keyword won't work because expressions can have several statements. So I need to make those functions return the last evaluated expressions.
Restrictions and clarification:
I cannot force users to add the 'return' keyword to all the scripts they have because there are many scripts written already and it is not very intuitive for simple expressions like 'a * b'.
I'm using Java and Rhino to run Javascripts on server side.

As people pointed out in Last evaluated expression in Javascript, getting the last evaluated expression is not possible in standard Javascript.
What I finally ended up doing, as suggested by FelixKling, was to manipulate the AST of the script written by the user. This way I store the user written script and the modified version, which is the one I finally run.
For manipulating the AST I used Rhino and basically modify all EXPR_RESULT nodes to store the result in a variable that I finally return at the end of the script. Here is the code to do that:
public class ScriptsManipulationService {
private static final Logger logger = LoggerFactory.getLogger(ScriptsManipulationService.class);
public String returnLastExpressionEvaluated(String script) {
Parser jsParser = new Parser();
try {
AstRoot ast = jsParser.parse(script, "script", 1);
ast.getType();
ast.visitAll(new NodeVisitor() {
@Override
public boolean visit(AstNode node) {
if (node.getType() == Token.EXPR_RESULT) {
ExpressionStatement exprNode = (ExpressionStatement) node;
Assignment assignmentNode = createAssignmentNode("_returnValue", exprNode.getExpression());
assignmentNode.setParent(exprNode);
exprNode.setExpression(assignmentNode);
}
return true;
}
});
StringBuilder result = new StringBuilder();
result.append("var _returnValue;\n");
result.append(ast.toSource());
result.append("return _returnValue;\n");
return result.toString();
} catch (Exception e) {
logger.debug(LogUtils.format("Error parsing script"), e);
return script;
}
}
private Assignment createAssignmentNode(String varName, AstNode rightNode) {
Assignment assignmentNode = new Assignment();
assignmentNode.setType(Token.ASSIGN);
Name leftNode = new Name();
leftNode.setType(Token.NAME);
leftNode.setIdentifier(varName);
leftNode.setParent(assignmentNode);
assignmentNode.setLeft(leftNode);
rightNode.setParent(assignmentNode);
assignmentNode.setRight(rightNode);
return assignmentNode;
}
}
This way, if you pass the following script:
data.price * data.qty;
You will get back:
var _returnValue;
_returnValue = data.price * data.qty;
return _returnValue;
Or if you pass:
if (data.isExternal) {
data.email;
} else {
data.userId;
}
You will get back:
var _returnValue;
if (data.isExternal) {
_returnValue = data.email;
} else {
_returnValue = data.userId;
}
return _returnValue;
Please keep in mind that I haven't tested this exhaustively and will be polishing it over time, but this should show the general idea.

How to select subjects with specific properties from RDF with Jena?

This is a follow-up question to my last one, since I am still struggling with this topic. I need to select some subjects from my model that meet specific requirements.
If I list my statements (this is only short part of the output), I get something like this:
WorkOrder2 hasType Workorder .
WorkOrder2 hasResult Fuselage22 .
WorkOrder2 type NamedIndividual .
Now, I would like to select and iterate through all subjects that have hasType Workorder. My idea was something like this:
public static ArrayList<String> listAllWorkorders(Model model) {
ArrayList<String> workorders = new ArrayList<String>();
// list of all work orders associated with given fuselage and work
// station
ResIterator it = model.listSubjectsWithProperty(
ResourceFactory.createProperty(ArumCorePrefix + "hasType"), ArumCorePrefix + "Workorder");
while (it.hasNext()) {
Resource r = it.next();
String workorder = trimPrefix(r.toString());
workorders.add(workorder);
}
// sort the result alphabetically
Collections.sort(workorders);
return workorders;
}
However, it does not return anything. If I use listSubjectsWithProperty without the second argument (String), it works, but it returns not only Workorders but also other things with a hasType property, which I do not want. What is wrong with my code? Can I use something like this and make it work?
Don't worry about the static use of this function (I will take care of this inelegant approach as soon as I understand what's wrong).
Also, I would like to implement some more complex filtering, for example selecting subjects with multiple properties that all have to match in order to be returned, like hasType Workorder, hasResult someResult, inStation station, etc. Does Jena support something like this? If not, what is the common approach?
Thanks for any tips!
And a follow-up: How do I check whether some statement is present in my model? I know that there is a model.contains(Statement s) method, but do I have to create the statement argument in order to call this method? Isn't there some more elegant way, like model.contains(Resource r, Property p, Resource o)?

There are a number of ways you can do this in Jena, but they mostly come down to calling
StmtIterator Model.listStatements(Resource,Property,RDFNode)
with the resource as the first argument, and null as the second and third arguments, as a wildcard. The other methods that do similar things are really just special cases of this. For instance,
listObjectsOfProperty(Property p) — listStatements(null,p,null) and take the object from each statement.
listObjectsOfProperty(Resource s, Property p) — listStatements(s,p,null) and take the object from each statement.
listResourcesWithProperty(Property p) — listStatements(null,p,null) and take the subject from each statement
listResourcesWithProperty(Property p, RDFNode o) — listStatements(null,p,o) and take the subject from each statement
For convenience, you might prefer to use the method
StmtIterator Resource.listProperties()
which returns an iterator over all the statements in the resource's model with the given resource as a subject.
Here's some example code that includes your model and uses each of these methods:
import java.io.ByteArrayInputStream;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.StmtIterator;
public class ResourcePropertiesExample {
final static String NS = "http://example.org/";
final static String modelText = "" +
"@prefix : <"+NS+"> .\n" +
":WorkOrder2 :hasType :Workorder .\n" +
":WorkOrder2 :hasResult :Fuselage22 .\n" +
":WorkOrder2 :type :NamedIndividual .\n" +
"";
public static void main(String[] args) {
final Model model = ModelFactory.createDefaultModel();
model.read( new ByteArrayInputStream( modelText.getBytes()), null, "TTL" );
final Resource workOrder2 = model.getResource( NS+"WorkOrder2" );
System.out.println( "Using Model.listStatements()" );
StmtIterator stmts = model.listStatements( workOrder2, null, (RDFNode) null );
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
System.out.println( "Using Resource.listProperties()" );
stmts = workOrder2.listProperties();
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
}
}
The output is:
Using Model.listStatements()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
Using Resource.listProperties()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
As for checking whether a model contains certain statements, you can, as you noted, use Model.contains, and I don't think there's anything the matter with that. You can also use the various Resource has* methods, such as
hasLiteral( Property p, [various types of literal] )
hasProperty( Property p, RDFNode / String / String, String )
Continuing the example above, and assuming you'd defined the property hasResult and the resources fuselage21 and fuselage22, you could write:
workOrder2.hasProperty( hasResult, fuselage21 ); // false
workOrder2.hasProperty( hasResult, fuselage22 ); // true

Evaluate logical expression at runtime

How do I go about evaluating a logical expression like "VERB1 OR (VERB2 AND VERB3) OR (VERB4)" entered at runtime? VERB* are placeholders that evaluate certain conditions. For example, VERB1 might mean checking for the existence of a record in a database.
In the expression "VERB1 OR (VERB2 AND VERB3) OR (VERB4)", the other verbs should not be executed if VERB1 is true.
EDIT: Example described at http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language/ seems very similar to what I am trying to do. However, the optimization step (other verbs should not be executed if VERB1 is true) doesn't seem to be there.

If you can use || and && in place of OR and AND, you can just use Groovy's missing property methods and the GroovyShell base class setting like so:
import org.codehaus.groovy.control.CompilerConfiguration
// The command to be executed
def command = "VERB1 || (VERB2 && VERB3) || (VERB4)"
// Set a base class for the GroovyShell
new CompilerConfiguration().with { compiler ->
compiler.scriptBaseClass = 'VerbHandlingBaseClass'
new GroovyShell( this.class.classLoader, new Binding(), compiler ).with { shell ->
// and evaluate the command
shell.evaluate( command )
}
}
abstract class VerbHandlingBaseClass extends Script {
boolean VERB1() {
System.out.println( 'CHECK THE DATABASE, RETURN FALSE' )
false
}
boolean VERB2() {
System.out.println( 'WRITE A LOG ENTRY RETURN TRUE' )
true
}
boolean VERB3() {
System.out.println( 'VALIDATE SOMETHING, RETURN TRUE' )
true
}
boolean VERB4() {
System.out.println( 'THIS WONT BE REACHED, AS VERB2 && VERB3 == true' )
true
}
def propertyMissing( String name ) {
"$name"()
}
}
That should print:
CHECK THE DATABASE, RETURN FALSE
WRITE A LOG ENTRY RETURN TRUE
VALIDATE SOMETHING, RETURN TRUE

You mentioned ANTLR in your tags: have you given this a go? You can create a full boolean grammar in ANTLR but it gets much harder when you get down to the level of how to evaluate the verbs.
If there is a small, fixed set of verbs which may be queried you can easily create a mapping between the verbs and the functions.
If there is a larger list of verbs, you may be able to use reflection to call specific methods to evaluate them.
If your verbs can include mathematical comparisons, this all gets a bit harder, as you need to create a mathematical lexer and parser as well.
Without a more specific question and knowledge of what you have tried in ANTLR I'm not sure I can give you much more advice.
EDIT: Based on your comments, I'll add some more.
You can add parsing rules to your grammar:
boolean_or returns [boolean b]
: b1=boolean_and {$b = $b1.b;}
(OR b2=boolean_and {$b = $b || $b2.b;})*
;
boolean_atom returns [boolean b]
:
((numeric_comparison)=> b1=numeric_comparison {$b = $b1.b;}
| TRUE {$b = true;} | FALSE {$b = false;}
| s1=VERB {$b = evalVerb($s1.s);}
| LPAREN b1=boolean_expr RPAREN {$b = $b1.b;}
)
;
That's a small part of a boolean parser I'm currently using. You can fill in the blanks.
And then call the parser using something like
ANTLRStringStream in = new ANTLRStringStream(booleanString);
ActionLexer lexer = new ActionLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
BooleanParser parser = new BooleanParser(tokens);
try {
return parser.eval();
} catch (Exception e) {
}
This doesn't account for your requirement of returning early, but I'm sure you can figure out how to do that.
This might not be the best way to do things, but it's the way that I've gotten it to work for me in the past. Hope this helps.
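For the early-return requirement, one possible approach without ANTLR is a small hand-written recursive-descent evaluator that keeps parsing the skipped operand but stops executing verbs once the result is known. The sketch below is an assumption-laden illustration, not the grammar from this answer: the class LazyBoolEval, its method names, and the use of BooleanSupplier for verbs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LazyBoolEval {
    private final List<String> tokens = new ArrayList<>();
    private final Map<String, BooleanSupplier> verbs;
    private int pos = 0;

    private LazyBoolEval(String input, Map<String, BooleanSupplier> verbs) {
        this.verbs = verbs;
        // Tokens are parentheses and words; any other character is ignored in this sketch.
        Matcher m = Pattern.compile("\\(|\\)|\\w+").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    static boolean evaluate(String input, Map<String, BooleanSupplier> verbs) {
        LazyBoolEval e = new LazyBoolEval(input, verbs);
        boolean result = e.parseOr(true);
        if (e.pos != e.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + e.tokens.get(e.pos));
        }
        return result;
    }

    // or := and (OR and)*
    private boolean parseOr(boolean live) {
        boolean value = parseAnd(live);
        while (peek("OR")) {
            pos++;
            // Once the left side is true, the right side is parsed but its verbs are not run.
            boolean right = parseAnd(live && !value);
            value = value || right;
        }
        return value;
    }

    // and := atom (AND atom)*
    private boolean parseAnd(boolean live) {
        boolean value = parseAtom(live);
        while (peek("AND")) {
            pos++;
            // Once the left side is false, the right side is parsed but its verbs are not run.
            boolean right = parseAtom(live && value);
            value = value && right;
        }
        return value;
    }

    // atom := VERB | '(' or ')'
    private boolean parseAtom(boolean live) {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Unexpected end of expression");
        }
        String token = tokens.get(pos++);
        if (token.equals("(")) {
            boolean value = parseOr(live);
            if (!peek(")")) {
                throw new IllegalArgumentException("Missing closing parenthesis");
            }
            pos++;
            return value;
        }
        BooleanSupplier verb = verbs.get(token);
        if (verb == null) {
            throw new IllegalArgumentException("Unknown verb: " + token);
        }
        // In a skipped branch the verb is never called; the placeholder value is ignored.
        return live && verb.getAsBoolean();
    }

    private boolean peek(String expected) {
        return pos < tokens.size() && tokens.get(pos).equals(expected);
    }
}
```

A caller would register each verb as a BooleanSupplier in a map (for example a database check under the key "VERB1") and call LazyBoolEval.evaluate(expression, verbs).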

complex if( ) or enum?

In my app, I need to branch out if the input matches some specific 20 entries.
I thought of using an enum
public enum dateRule { is_on, is_not_on, is_before,...}
and a switch on the enum constant to do a function
switch(dateRule.valueOf(input))
{
case is_on :
case is_not_on :
case is_before :
.
.
.
// function()
break;
}
But the input strings will be like 'is on', 'is not on', 'is before' etc without _ between words.
I learnt that an enum cannot have constants containing space.
Possible ways I could make out:
1. Using an if statement to compare against the 20 possible inputs, which gives a long condition like
if(input.equals("is on") ||
input.equals("is not on") ||
input.equals("is before") ...) { // function() }
2. Working on the input to insert _ between words, but other input strings that are not among these 20 can also have multiple words.
Is there a better way to implement this?

You can define your own version of valueOf method inside the enum (just don't call it valueOf).
public enum State {
IS_ON,
IS_OFF;
public static State translate(String value) {
return valueOf(value.toUpperCase().replace(' ', '_'));
}
}
Simply use it like before.
State state = State.translate("is on");
The earlier switch statement would still work.

It is possible to separate the enum identifier from the value. Something like this:
public enum MyEnumType
{
IS_BEFORE("is before"),
IS_ON("is on"),
IS_NOT_ON("is not on");
public final String value;
MyEnumType(final String value)
{
this.value = value;
}
}
You can also add methods to the enum-type (the method can have arguments as well), something like this:
public boolean isOnOrNotOn()
{
return (this == IS_ON || this == IS_NOT_ON);
}
Use in switch (note that valueOf expects the identifier, such as "IS_ON", not the display string):
switch(MyEnumType.valueOf(input))
{
case IS_ON: ...
case IS_NOT_ON: ...
case IS_BEFORE: ...
}
And when you get the value of IS_ON, for example with System.out.println(MyEnumType.IS_ON.value), it will show is on.
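Because valueOf only understands identifiers, looking a constant up by its display string needs a helper. The following is a sketch under that assumption; DateRuleKind, fromValue and DateRuleDemo are hypothetical names, and the returned labels are placeholders for whatever the real branches would do.

```java
import java.util.HashMap;
import java.util.Map;

enum DateRuleKind {
    IS_ON("is on"),
    IS_NOT_ON("is not on"),
    IS_BEFORE("is before");

    // Reverse index from display string to constant, built once when the enum loads.
    private static final Map<String, DateRuleKind> BY_VALUE = new HashMap<>();

    static {
        for (DateRuleKind kind : values()) {
            BY_VALUE.put(kind.value, kind);
        }
    }

    final String value;

    DateRuleKind(String value) {
        this.value = value;
    }

    // Returns null when the input is not one of the known rule strings.
    static DateRuleKind fromValue(String input) {
        return BY_VALUE.get(input);
    }
}

final class DateRuleDemo {
    static String describe(String input) {
        DateRuleKind kind = DateRuleKind.fromValue(input);
        if (kind == null) {
            return "not a rule"; // switching on null would throw, so handle it first
        }
        switch (kind) {
            case IS_ON:
            case IS_NOT_ON:
                return "equality rule";
            case IS_BEFORE:
                return "ordering rule";
            default:
                return "not a rule";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("is on"));
        System.out.println(describe("something else"));
    }
}
```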

If you're using Java 7, you can also choose the middle road here, and do a switch statement with Strings:
switch (input) {
case "is on":
// do stuff
break;
case "is not on":
// etc
}

You're not really breaking the concept up enough, both solutions are brittle...
Look at your syntax
"is", can remove, seems to be ubiquitous
"not", optional, apply a ! to the output comparison
on, before, after, apply comparisons.
So split on spaces. Parse the split words to ensure they exist in the syntax definition, and then do a step-by-step evaluation of the expression passed in. This will allow you to easily extend the syntax (without having to add an "is" and "is not" for each combination) and keep your code easy to read.
Having multiple conditions munged into one for the purposes of switch statements leads to huge bloat over time.
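The word-by-word evaluation described above might look roughly like the sketch below. It assumes the rules compare two dates and that java.time.LocalDate is available; the class DateRuleParser and the method matches are hypothetical names, and only the words listed in this answer are recognized.

```java
import java.time.LocalDate;

final class DateRuleParser {
    // Evaluates rules such as "is on", "is not before" or "after" for two dates.
    static boolean matches(String rule, LocalDate actual, LocalDate reference) {
        boolean negate = false;
        Boolean result = null;
        for (String word : rule.trim().toLowerCase().split("\\s+")) {
            switch (word) {
                case "is":
                    break; // filler word, carries no meaning
                case "not":
                    negate = !negate; // flip the final comparison
                    break;
                case "on":
                    result = actual.isEqual(reference);
                    break;
                case "before":
                    result = actual.isBefore(reference);
                    break;
                case "after":
                    result = actual.isAfter(reference);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown word: " + word);
            }
        }
        if (result == null) {
            throw new IllegalArgumentException("No comparison in rule: " + rule);
        }
        return negate ? !result : result;
    }

    public static void main(String[] args) {
        LocalDate first = LocalDate.of(2020, 1, 1);
        LocalDate second = LocalDate.of(2020, 1, 2);
        System.out.println(matches("is before", first, second));
        System.out.println(matches("is not on", first, second));
    }
}
```

Adding a new comparison then means adding one case, rather than a new pair of "is X" and "is not X" constants.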

Thanks for the suggestions. They guided me here.
This is almost same as other answers, just a bit simplified.
To summarize, I need to compare the input string with a set of 20 strings and
if they match, do something. Else, do something else.
Static set of strings to which input needs to be compared :
is on,is not on,is before,is after, etc 20 entries
I created an enum
public enum dateRule
{
is_on
,is_not_on
,is_before
,is_after
.
.
.
}
and switching on formatted value of input
if(isARule(in = input.replace(" ","_")))
{
switch(dateRule.valueOf(in))
{
case is_on:
case is_not_on:
case is_before: ...
}
}
I copied the formatted value of 'input' to 'in' so that I can reuse input without another replace of '_' with ' '.
private static boolean isARule(String value)
{
for(dateRule rule : dateRule.values())
{
if(rule.toString().equals(value))
{
return true;
}
}
return false;
}
Problem solved.
Reference : https://stackoverflow.com/a/4936895/1297564
