Translating logic into Kotlin (or Java) code

I have a use-case where I want to enable users to write simple logic, and behind the scenes, convert this logic into a condition in the code.
For example, the user might write:
someFieldName > 10 AND otherFieldName is NULL
And I'd like that to generate the following code:
if (data["someFieldName"] > 10 && data["otherFieldName"] == null) {
// Do something
}
After doing some research, I saw that one of the options is using eval (by leveraging a JS engine), although it doesn't fit all use cases.
I also saw that it's possible to use tools like ANTLR, which seems a bit like overkill.
Are there any simple off-the-shelf products we can use for such purposes? Or would creating a simple parser ourselves be the best way to handle it?

Your use case can be adequately addressed by MVEL2.
There is no need for you to write a parser and AST with ANTLR or to convert to Java code; just evaluate the expression with the appropriate parameters.
In fact, any Java expression language library would do. You could also look at JUEL.
However, looking at the expression, I would say it aligns more towards MVEL2.
Give both the libraries a try.
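For comparison, if pulling in an expression library is not an option, the grammar in the question is small enough to evaluate by hand. The following is only a minimal sketch under stated assumptions: the class name SimpleCondition and its evaluate method are hypothetical, it supports only numeric comparisons, IS [NOT] NULL, AND and OR (no parentheses), and it does not use MVEL2 or JUEL.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: evaluates expressions such as
// "someFieldName > 10 AND otherFieldName is NULL" against a map of field values.
final class SimpleCondition {
    // field <op> number, e.g. "someFieldName > 10"
    private static final Pattern COMPARISON =
            Pattern.compile("(\\w+)\\s*(>=|<=|>|<|=)\\s*(-?\\d+(?:\\.\\d+)?)");
    // field is [not] null, case-insensitive
    private static final Pattern NULL_CHECK =
            Pattern.compile("(\\w+)\\s+is\\s+(not\\s+)?null", Pattern.CASE_INSENSITIVE);

    static boolean evaluate(String expression, Map<String, ?> data) {
        // OR binds less tightly than AND, so split on OR first.
        for (String orBranch : expression.trim().split("(?i)\\s+OR\\s+")) {
            boolean allTrue = true;
            for (String term : orBranch.split("(?i)\\s+AND\\s+")) {
                allTrue &= evaluateTerm(term.trim(), data);
            }
            if (allTrue) {
                return true;
            }
        }
        return false;
    }

    private static boolean evaluateTerm(String term, Map<String, ?> data) {
        Matcher nullCheck = NULL_CHECK.matcher(term);
        if (nullCheck.matches()) {
            boolean isNull = data.get(nullCheck.group(1)) == null;
            return nullCheck.group(2) == null ? isNull : !isNull;
        }
        Matcher cmp = COMPARISON.matcher(term);
        if (!cmp.matches()) {
            throw new IllegalArgumentException("Unsupported term: " + term);
        }
        Object value = data.get(cmp.group(1));
        if (!(value instanceof Number)) {
            return false; // missing or non-numeric fields never satisfy a comparison
        }
        double left = ((Number) value).doubleValue();
        double right = Double.parseDouble(cmp.group(3));
        switch (cmp.group(2)) {
            case ">":  return left > right;
            case ">=": return left >= right;
            case "<":  return left < right;
            case "<=": return left <= right;
            default:   return left == right; // "="
        }
    }
}
```

Calling SimpleCondition.evaluate("someFieldName > 10 AND otherFieldName is NULL", data) then plays the role of the if statement in the question. Anything beyond this (parentheses, string literals, functions) is where a real expression library or a proper parser starts to pay off.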

Related

Is it bad practice to create XML files directly without using a class to store the structure? [duplicate]

In the thread What’s your favorite “programmer ignorance” pet peeve?, the following answer appears, with a large number of upvotes:
Programmers who build XML using string concatenation.
My question is, why is building XML via string concatenation (such as a StringBuilder in C#) bad?
I've done this several times in the past, as it's sometimes the quickest way for me to get from point A to point B when it comes to the data structures/objects I'm working with. So far, I have come up with a few reasons why this isn't the greatest approach, but is there something I'm overlooking? Why should this be avoided?
Probably the biggest reason I can think of is you need to escape your strings manually, and most new programmers (and even some experienced programmers) will forget this. It will work great for them when they test it, but then "randomly" their apps will fail when someone throws an & symbol in their input somewhere. Ok, I'll buy this, but it's really easy to prevent the problem (SecurityElement.Escape to name one).
When I do this, I usually omit the XML declaration (i.e. <?xml version="1.0"?>). Is this harmful?
Performance penalties? If you stick with proper string concatenation (i.e. StringBuilder), is this anything to be concerned about? Presumably, a class like XmlWriter will also need to do a bit of string manipulation...
There are more elegant ways of generating XML, such as using XmlSerializer to automatically serialize/deserialize your classes. Ok sure, I agree. C# has a ton of useful classes for this, but sometimes I don't want to make a class for something really quick, like writing out a log file or something. Is this just me being lazy? If I am doing something "real" this is my preferred approach for dealing w/ XML.

You can end up with invalid XML, but you will not find out until you parse it again - and then it is too late. I learned this the hard way.

I think readability, flexibility and scalability are important factors. Consider the following piece of Linq-to-Xml:
XDocument doc = new XDocument(new XDeclaration("1.0","UTF-8","yes"),
    new XElement("products", from p in collection
        select new XElement("product",
            new XAttribute("guid", p.ProductId),
            new XAttribute("title", p.Title),
            new XAttribute("version", p.Version))));
Can you find an easier way to do it than this? I can output it to a browser, save it to a document, add attributes/elements in seconds and so on ... just by adding a couple of lines of code. I can do practically everything with it without much effort.

Actually, I find the biggest problem with string concatenation is not getting it right the first time, but rather keeping it right during code maintenance. All too often, a perfectly-written piece of XML using string concat is updated to meet a new requirement, and string concat code is just too brittle.
As long as the alternatives were XML serialization and XmlDocument, I could see the simplicity argument in favor of string concat. However, ever since XDocument et al., there is just no reason to use string concat to build XML anymore. See Sander's answer for the best way to write XML.
Another benefit of XDocument is that XML is actually a rather complex standard, and most programmers simply do not understand it. I'm currently dealing with a person who sends me "XML", complete with unquoted attribute values, missing end tags, improper case sensitivity, and incorrect escaping. But because IE accepts it (as HTML), it must be right! Sigh... Anyway, the point is that string concatenation lets you write anything, but XDocument will force standards-compliant XML.

I wrote a blog entry back in 2006 moaning about XML generated by string concatenation; the simple point is that if an XML document fails to validate (encoding issues, namespace issues and so on) it is not XML and cannot be treated as such.
I have seen multiple problems with XML documents that can be directly attributed to generating XML documents by hand using string concatenation, and nearly always around the correct use of encoding.
Ask yourself this: what character set am I currently encoding my document with ('ascii7', 'ibm850', 'iso-8859-1' etc.)? What will happen if I write a UTF-16 string value into an XML document that has been manually declared as 'ibm850'?
Given the richness of the XML support in .NET with XmlDocument and now especially with XDocument, there would have to be a seriously compelling argument for not using these libraries over basic string concatenation IMHO.

I think the problem is that you aren't looking at the XML file as a logical data store, but as a simple text file where you write strings.
It's obvious that those libraries do string manipulation for you, but reading/writing XML should be treated like saving data into a database or something logically similar.

If you need trivial XML then it's fine. It's just that the maintainability of string concatenation breaks down when the XML becomes larger or more complex. You pay either at development or at maintenance time. The choice is always yours, but history suggests that maintenance is always more costly, and thus anything that makes it easier is generally worthwhile.

You need to escape your strings manually. That's right. But is that all? Sure, you can put the XML spec on your desk and double-check every time that you've considered every possible corner-case when you're building an XML string. Or you can use a library that encapsulates this knowledge...
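The thread is about C# and .NET, but the same point can be sketched in Java with the JDK's StAX API (javax.xml.stream). This is only an illustration under stated assumptions: the class XmlEscapingDemo and its method names are hypothetical, and the "product"/"title" names are borrowed from the LINQ to XML answer above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class contrasting string concatenation with a writer API.
final class XmlEscapingDemo {
    // Naive approach: the caller is responsible for escaping, and here nobody does it.
    static String byConcatenation(String title) {
        return "<product title=\"" + title + "\"/>";
    }

    // Library approach: the writer escapes attribute values itself.
    static String byWriter(String title) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeEmptyElement("product");
            xml.writeAttribute("title", title);
            xml.writeEndDocument();
            xml.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    // Parses the text again and reports whether it is well-formed XML.
    static boolean isWellFormed(String xmlText) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xmlText));
            while (reader.hasNext()) {
                reader.next();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }
}
```

With a title such as "Fish & Chips", the concatenated string only fails once something parses it again, whereas the writer output contains the escaped "&amp;" and parses cleanly.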

Another point against using string concatenation is that the hierarchical structure of the data is not clear when reading the code. In @Sander's example of Linq-to-XML for example, it's clear to what parent element the "product" element belongs, to what element the "title" attribute applies, etc.

As you said, it's just awkward to build XML correctly using string concatenation, especially now that you have LINQ to XML, which allows for simple construction of an XML graph and will get namespaces, etc. correct.
Obviously context and how it is being used matters, such as in the logging example string.Format can be perfectly acceptable.
But too often people ignore these alternatives when working with complex XML graphs and just use a StringBuilder.

The main reason is DRY: Don't Repeat Yourself.
If you use string concat to do XML, you will constantly be repeating the functions that keep your string as a valid XML document. All the validation would be repeated, or not present. Better to rely on a class that is written with XML validation included.

I've always found creating an XML to be more of a chore than reading in one. I've never gotten the hang of serialization - it never seems to work for my classes - and instead of spending a week trying to get it to work, I can create an XML file using strings in a mere fraction of the time and write it out.
And then I load it in using an XMLReader tree. And if the XML file doesn't read as valid, I go back and find the problem within my saving routines and correct it. I refuse to perform mission-critical work until I have a working save/load system and know my tools are solid.
I guess it comes down to programmer preference. Sure, there are different ways of doing things, but for developing/testing/researching/debugging, this would be fine. However, I would also clean up my code and comment it before handing it off to another programmer.
Because regardless of whether you're using StringBuilder or XMLNodes to save/read your file, if it is all a gibberish mess, nobody is going to understand how it works.

Maybe it won't ever happen, but what if your environment switches to XML 2.0 someday? Your string-concatenated XML may or may not be valid in the new environment, but XDocument will almost certainly do the right thing.
Okay, that's a reach, but especially if your not-quite-standards-compliant XML doesn't specify an XML version declaration... just saying.

How to script input for a Java program

I'm writing a Java program that requires its (technical) users to write scripts that it uses as input; it interprets these scripts into a series of actions and executes them. I am currently looking for the cleanest way to implement the script/configuration language. I was originally thinking of heading down the XML route, but the nature of the required input really is a procedural, linear flow of actions that need to be executed:
function move(Block b, Position p) {
// user-defined algorithm for moving block "b" to position "p"
}
Block a = getBlockA();
Position p = getPositionP();
move(a, p);
Etc. Please note: the above is an example only and does not constitute the exact syntax I am looking to achieve. I am still in the "30,000 ft view"-design phase, and don't know what my concrete scripting language will ultimately look like. I only provide this example to show that it is a flow/procedural script that the users must write, and that XML is probably not the best candidate for its implementation.
XML, perfect for hierarchical data, just doesn't feel like the best choice for such an implementation (although I could force it to work if need be).
Not knowing a lick about DSLs, I've begun to read up on Groovy DSLs and they feel like a perfect match for what I need.
My understanding is that I could write, say, a Groovy (I'm stronger in Groovy than Scala, JRuby, etc.) DSL that would allow users to write scripts (.groovy files) that my program could then execute as input at runtime.
Is this correct, or am I misunderstanding the intent of DSLs altogether? If I am mistaken, does anybody have any suggestions for me? And if I am correct then how would a Java program read and execute a .groovy file (in other words, how would my program "consume" their script)?
Edit: I'm beginning to like ANTLR. Although I would love to roll up my sleeves and write a Groovy DSL, I don't want my users to be able to write any old Groovy program they want. I want my own "micro-language" and if users step outside of it I want the interpreter to invalidate the script. It's beginning to seem like Groovy/DSLs aren't the right choice, and maybe ANTLR could be the solution I need...?

I think you are on a really good path. Your users can write their files using your simple DSL and then you can run them by evaluating them at runtime. Your biggest challenge will be helping them to use the API of your DSL correctly. Unless they use an IDE this will be pretty tough.
Equivalent of eval() in Groovy

Yes, you can write a Groovy program that will accept a script as input and execute it. I recently wrote a BASIC DSL/interpreter in this way using Groovy:
http://cartesianproduct.wordpress.com/binsic-is-not-sinclair-instruction-code/
(In the end it was more interpreter than DSL but that was to do with a peculiarity of Groovy that likely won't affect you - BASIC insists on UPPER CASE keywords which Groovy finds hard to parse - hence they have to be converted to lower case).
Groovy allows you to extend the script environment in various ways (e.g. injecting variables into the binding and transferring execution from the current script to a different, dynamically loaded script), which makes this relatively simple.
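On the question of how a Java program would "consume" such a script: one option, sketched here with the JDK's standard javax.script (JSR-223) API rather than Groovy's own classes, is to look up a script engine by name and evaluate the script text with injected variables. The class name ScriptRunner is hypothetical, and the sketch assumes that a JSR-223 engine for the chosen language (for example one registered under the name "groovy") is on the classpath; without it, the lookup returns null.

```java
import java.util.Map;
import javax.script.Bindings;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper: evaluates script text with whatever JSR-223 engine is installed.
final class ScriptRunner {
    static Object run(String engineName, String script, Map<String, Object> variables) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            // No engine registered under that name (e.g. the Groovy jar is missing).
            throw new IllegalStateException("No script engine named '" + engineName + "' found");
        }
        // Variables injected here are visible to the script by name.
        Bindings bindings = engine.createBindings();
        bindings.putAll(variables);
        try {
            return engine.eval(script, bindings);
        } catch (ScriptException e) {
            throw new IllegalArgumentException("Script failed: " + e.getMessage(), e);
        }
    }
}
```

Note that this runs arbitrary code in the chosen language, which is exactly the concern raised in the edit to the question; it does not restrict users to a "micro-language".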

Recommended strategy for parsing ad-hoc if/else syntax in Java?

(Sorry, not sure if ad-hoc is the right word here ... open for a better suggestion)
I'm trying to parse the Galaxy ToolConfig XML CLI tool wrapper format in a Java app, for replicating (in part) the behaviour of the Galaxy software itself.
The format includes some "free-text" if/else clauses, inside the command tag (that's the only place they occur, AFAIK):
...
<command interpreter="python">
sam_to_bam.py
--input1=$source.input1
--dbkey=${input1.metadata.dbkey}
#if $source.index_source == "history":
--ref_file=$source.ref_file
#else
--ref_file="None"
#end if
--output1=$output1
--index_dir=${GALAXY_DATA_INDEX_DIR}
</command>
...
What would be a recommended strategy for parsing this if/else structure into something that can be used to remodel the if/else logic in Java?
Is BNF/ANTLR overkill, better just to parse into some object structure, or? Any design patterns that would fit here? (Haven't worked with BNF/ANTLR before, but am willing to look into it if it will be worth it).

If you want to capture all the structure of your input, a parser is the only way to go. One can code a top-down recursive descent parser manually, but there is little point in doing that, which is why parser generator tools exist; use them.
Regarding the #if #then #else: if that's the only structure you want to capture, then you need only a pretty primitive grammar that also allows tokens containing arbitrary text to pick up the goo between the #if#then#else constructs as a blob of text.
If you want to capture all code structure, and the conditionals are only allowed in certain places, then their existence can be simply integrated into whatever BNF you are using.
If, as I suspect, these can occur anywhere ("ad hoc"? the #if follows C preprocessor style, and those conditionals can occur virtually anywhere in the input stream), then parsing the text and retaining the conditionals is presently at the bleeding edge of what state of the art parsing can do. This is the standard C-preprocessing disease, and there have been no good solutions to this. Standard parser generators pretty much can't help in this case. (Hand coded parsers don't fare better here either; the same kind of solution has to be used in either case).
One of the recent schemes (just reported as PhD research results in the last few months) to handle this is to fork the parse whenever a #if token is found to handle #if, and #else, and join when #endif is found; then you need a way to fuse the generated subtrees, typically as ambiguous subtrees each marked with the arm of the conditional it belongs to.
If you want to get on with your life, I suggest you simply insist that these conditionals occur in well-defined places in your grammar, and put up with the occasional complaint from people that write unstructured preprocessor directives. ("You wrote crazy code? Sorry, my tool doesn't handle it").
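The "primitive grammar" option mentioned earlier in this answer, where everything between the directives is kept as a blob of text, can be sketched by hand for the line-oriented sample in the question. This is only a rough illustration under stated assumptions: the class ConditionalBlocks and its node types are hypothetical, directives are assumed to sit on their own lines exactly as "#if ...:", "#else" and "#end if", and the condition itself is stored as an unparsed string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical line-based parser for "#if ...:", "#else", "#end if" blocks.
final class ConditionalBlocks {
    interface Node {}

    // A plain line of text outside any directive.
    static final class Text implements Node {
        final String line;
        Text(String line) { this.line = line; }
    }

    // One #if block with its two arms; the condition is kept as raw text.
    static final class Conditional implements Node {
        final String condition;
        final List<Node> thenBranch = new ArrayList<>();
        final List<Node> elseBranch = new ArrayList<>();
        Conditional(String condition) { this.condition = condition; }
    }

    private final List<String> lines;
    private int pos = 0;

    private ConditionalBlocks(List<String> lines) { this.lines = lines; }

    static List<Node> parse(String input) {
        ConditionalBlocks parser = new ConditionalBlocks(Arrays.asList(input.split("\\R")));
        List<Node> nodes = parser.parseBlock();
        if (parser.pos < parser.lines.size()) {
            throw new IllegalArgumentException("Unexpected " + parser.lines.get(parser.pos).trim());
        }
        return nodes;
    }

    // Reads nodes until the input ends or an #else / #end if belongs to the caller.
    private List<Node> parseBlock() {
        List<Node> nodes = new ArrayList<>();
        while (pos < lines.size()) {
            String line = lines.get(pos).trim();
            if (line.equals("#else") || line.equals("#end if")) {
                break;
            }
            pos++;
            if (line.startsWith("#if ")) {
                nodes.add(parseConditional(line));
            } else if (!line.isEmpty()) {
                nodes.add(new Text(line));
            }
        }
        return nodes;
    }

    private Conditional parseConditional(String ifLine) {
        String condition = ifLine.substring(4).trim();
        if (condition.endsWith(":")) {
            condition = condition.substring(0, condition.length() - 1).trim();
        }
        Conditional node = new Conditional(condition);
        node.thenBranch.addAll(parseBlock());
        if (pos < lines.size() && lines.get(pos).trim().equals("#else")) {
            pos++;
            node.elseBranch.addAll(parseBlock());
        }
        if (pos >= lines.size() || !lines.get(pos).trim().equals("#end if")) {
            throw new IllegalArgumentException("Missing #end if for: " + ifLine);
        }
        pos++;
        return node;
    }
}
```

Nested #if blocks fall out of the recursion, but directives appearing mid-line, or other directive spellings, are outside what this sketch handles.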

JAVA and how to execute user-code

I am building a tool which should produce a diagnosis based on some values.
It should be user-extensible, so hardcoding the conditions isn't a solution.
Suppose that we have a blood test with values such as WBC, ALDO ...
I want the user to be able to write scripts along these lines:
if (WBC.between(4,10) && ALDO.greater(5) || SOMETHINGELESE.isTrue()) ..... diagnosis="MPLAMPLA"...
The options I see are:
1) Write my own parser
2) Find something that executes user conditionals at runtime and customize it
3) Some other way
Please help, ideas needed!

Use scripting (you can use JavaScript, bsh, Groovy, etc.). See tutorial here
Use a workflow engine, e.g. jBPM
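Whichever scripting engine is chosen, the host program still has to expose values that support calls such as WBC.between(4,10) and ALDO.greater(5) from the question. A minimal sketch of such a value type follows; the class name LabValue is hypothetical, and treating both bounds of between as inclusive is an assumption, since the question does not say.

```java
// Hypothetical value type that user scripts could call methods on.
final class LabValue {
    private final double value;

    LabValue(double value) {
        this.value = value;
    }

    // True when the value lies within [low, high]; inclusive bounds are an assumption.
    boolean between(double low, double high) {
        return value >= low && value <= high;
    }

    // True when the value is strictly greater than the threshold.
    boolean greater(double threshold) {
        return value > threshold;
    }
}
```

Instances would be handed to the script under names like WBC and ALDO, so that a user-written condition reads the same as the example in the question.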

Which templating engine is this?

At my current place of work we have to use a web-service which works with a text template we feed to it. Now we would like to use those templates at other places in code and so I wondered what template language that could possibly be and if it's some off-the-shelf or off-the-net software. It's presumably something from Java or .NET world, but can be essentially anything.
It's got the following tokens:
$$variable$$
##function##
##function_call[$$parameter$$]##
Condition: ##IF[$$boolen$$]##THEN##Text##[ELSE##Text##]ENDIF##
Does someone recognize this?

It seems to be a home-grown templating system. At least it is not:
Velocity (Java)
StringTemplate (Java)
FreeMarker (Java)

##Looks like a really grim one##, $$made$$ by someone with ###no sense### of making things loook attractive $when$ written:######$$$$[][]
I would be surprised if it was the syntax of an actual product or open-source template engine.
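If the engine cannot be identified, the simplest token from the question, $$variable$$, is at least easy to reproduce elsewhere in code. The following is only a sketch under stated assumptions: the class DollarTemplate is hypothetical, variable names are assumed to consist of word characters, unknown variables are left untouched, and the ##function## and ##IF[...]## forms are not handled at all.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical renderer for the $$variable$$ token only.
final class DollarTemplate {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\$(\\w+)\\$\\$");

    static String render(String template, Map<String, String> values) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            // Keep the original token when no value is known for it.
            String replacement = value != null ? value : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Whether the real web-service treats unknown variables this way is not known from the question, so that behavior should be checked against it before relying on the sketch.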
