Statically Typing a Scripting Language in Java - java

I'm building a scripting language in Java for a game, and I'm currently working on the parser. The language is to be utilized by players/modders/myself to create custom spells and effects. However, I'm having difficulty imagining how to smoothly implement static typing in the current system (a painful necessity driven by performance needs). I don't care so much if compilation is fast, but actual execution needs to be as fast as I can get it (within reason, at least. I'm hoping to get this done pretty soon.)
So the parser has next() and peek() methods to iterate through the stream of tokens. It's currently built of a hierarchy of methods that call each other in a fashion that preserves operator precedence (the "bottom-most" method returning a constant, variable, etc). Each method returns an IResolve that has a generic type <T> it "resolves" to. For example, here's a method that handles "or" expressions, with "and" binding more tightly:
protected final IResolve checkGrammar_Or() throws ParseException
{
IResolve left = checkGrammar_And();
if (left == null)
return null;
if (peek().type != TokenType.IDENTIFIER || !"or".equals((String)peek().value))
return left;
next();
IResolve right = checkGrammar_Or();
if (right == null)
throwExpressionException();
return new BinaryOperation(left, right, new LogicOr());
}
The problem arises when I need to implement a function that depends on the type. As you probably noticed, the generic type isn't being specified by the parser, which is part of the design problem. In this function, I was hoping to do something like the following (though this wouldn't work due to generic types' erasure...)
protected final IResolve checkGrammar_Comparison() throws ParseException
{
IResolve left = checkGrammar_Term();
if (left == null)
return null;
IBinaryOperationType op;
switch (peek().type)
{
default:
return left;
case LOGIC_LT:
//This ain't gonna work because of erasure
if (left instanceof IResolve<Double>)
op = new LogicLessThanDouble();
break;
//And the same for these
case LOGIC_LT_OR_EQUAL:
case LOGIC_GT:
case LOGIC_GT_OR_EQUAL:
}
next();
IResolve right = checkGrammar_Comparison();
if (right == null)
throwExpressionException();
return new BinaryOperation(left, right, op);
}
The problem spot, where I'm wishing I could make the connection, is in the switch statement. I'm already certain I'll need to make IResolve non-generic and give it a "getType()" method that returns an int or something, especially if I want to support user-defined classes in the future.
The question is:
What's the best way to achieve static typing given my current structure and the desire for mixed inheritance (user-defined classes and interfaces, like Java and C#)? If there is no good way, how can I alter or even rebuild my structure to achieve it?
Note: I don't claim to have any idea what I've gotten myself into, constructive criticism is more than welcome. If I need to clarify anything, let me know!
Another note: I know you're thinking "Why static typing?", and normally I'd agree with you. However, the game world is composed of voxels (it's a Minecraft mod to be precise) and working with them needs to be fast. Imagine a script that's a O(n^2) algorithm iterating over 100 blocks twenty times a second, for 30+ players on a cheap server that's already barely squeaking by... or, a single, massive explosion affecting thousands of blocks, inevitably causing a horrendous lag spike. Hence, backend type checking or any form of duck-typing ain't gonna cut it (though I'm desperately aching for it atm.) The low level benefits are a necessity in this particular case, painful though it is.

You can get the best of both worlds by adding a method Class<T> getType() to IResolve; its implementers should simply return the appropriate Class object. (If the implementers themselves are generic, you need to get a reference to that object in the constructor or something.)
You can then do left.getType().equals(Double.class), etc.
This is entirely separate from the question of whether you should build your own parser with static typing, which is very much worth asking.
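A minimal sketch of the getType() suggestion, under stated assumptions: Constant, TypeDispatchSketch, pickLessThan and the string it returns are hypothetical stand-ins, and only IResolve and getType() come from the discussion above. The Class object is an ordinary value, so it is still available at parse time even though the generic parameter is erased.

```java
// Hypothetical, simplified stand-ins for the parser's node types.
interface IResolve<T> {
    T resolve();

    // An ordinary object reference, so it survives generic type erasure.
    Class<T> getType();
}

final class Constant<T> implements IResolve<T> {
    private final T value;
    private final Class<T> type;

    // Generic implementers receive the Class object through the constructor.
    Constant(T value, Class<T> type) {
        this.value = value;
        this.type = type;
    }

    @Override
    public T resolve() {
        return value;
    }

    @Override
    public Class<T> getType() {
        return type;
    }
}

final class TypeDispatchSketch {
    // Picks an operator implementation (named by a string here, for brevity)
    // from the operand's declared type while parsing.
    static String pickLessThan(IResolve<?> left) {
        if (left.getType().equals(Double.class)) {
            return "LogicLessThanDouble";
        }
        throw new IllegalArgumentException(
                "'<' is not defined for " + left.getType().getSimpleName());
    }
}
```

In the question's checkGrammar_Comparison(), the same comparison would replace the instanceof check that erasure rules out.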

The solution I'm going with, as some have suggested in the comments, was to separate parsing and typing into separate phases, along with using an enum to represent type as I originally felt I should.
While I appreciate Taymon's answer, I can't use it if I hope to support user defined classes in the future.
If someone has a better solution, I'd be more than happy to accept it!
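For illustration, here is one way the two-phase idea could look. This is a sketch under assumptions, not the code actually used: ScriptType, Node, check() and the node classes are all hypothetical names. The parser only builds the tree; a separate pass then computes and verifies types once, before anything executes.

```java
// Hypothetical names throughout; only the "parse first, type afterwards" split
// and the enum-for-types idea come from the answer above.
enum ScriptType { NUMBER, BOOLEAN }

interface Node {
    // Typing phase: called once after parsing, before any execution.
    ScriptType check();
}

final class NumberLiteral implements Node {
    final double value;

    NumberLiteral(double value) {
        this.value = value;
    }

    @Override
    public ScriptType check() {
        return ScriptType.NUMBER;
    }
}

final class BooleanLiteral implements Node {
    final boolean value;

    BooleanLiteral(boolean value) {
        this.value = value;
    }

    @Override
    public ScriptType check() {
        return ScriptType.BOOLEAN;
    }
}

final class LessThan implements Node {
    final Node left;
    final Node right;

    LessThan(Node left, Node right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public ScriptType check() {
        if (left.check() != ScriptType.NUMBER || right.check() != ScriptType.NUMBER) {
            throw new IllegalStateException("'<' needs two NUMBER operands");
        }
        return ScriptType.BOOLEAN;
    }
}

final class Or implements Node {
    final Node left;
    final Node right;

    Or(Node left, Node right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public ScriptType check() {
        if (left.check() != ScriptType.BOOLEAN || right.check() != ScriptType.BOOLEAN) {
            throw new IllegalStateException("'or' needs two BOOLEAN operands");
        }
        return ScriptType.BOOLEAN;
    }
}
```

An enum can only name built-in types, so user-defined classes would presumably need a richer type object later; that part is not sketched here.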

Related

Java - need help untangling compact Java notations: orElse, Optional, Lazy

I'm attempting to understand what's happening in this bit of Java code, as its owner is no longer around, and possibly to fix or simplify it. I'm guessing these blocks had a lot more in them at some point and what's left in place was not cleaned up properly.
It seems all occurrences of orElse(false) don't set anything to false and can be removed.
Then the second removeDiscontinued method is returning a boolean that I don't think is used anywhere. Is it just me, or is this written in a way that makes it hard to read?
I'm hesitant to remove anything from it since I haven't used much of the syntax like orElse, Lazy, Optional. Some help would be much appreciated.
private void removeDiscontinued(Optional<Map<String, JSONArrayCache>> dptCache, Lazy<Set<String>> availableTps) {
dptCache.map(pubDpt -> removeDiscontinued(pubDpt.keySet(), availableTps)).orElse(false);
}
private boolean removeDiscontinued(Set<String> idList, Lazy<Set<String>> availableTps) {
if (availableTps.get().size() > 0) {
Optional.ofNullable(idList).map(trIds -> trIds.removeIf(id -> !availableTps.get().contains(id)))
.orElse(false);
}
return true;
}
This code is indeed extremely silly. I know why - there's a somewhat common, extremely misguided movement around. This movement makes claims that are generally interpreted as 'write it 'functional' and then it is just better'.
That interpretation is obvious horse exhaust. It's just not true.
We can hold a debate on who is to blame for this - is it folks hearing the arguments / reading the blogposts and drawing the wrong conclusions, or is it the 'functional fanfolks' fanning the flames, so to speak, making ridiculous claims that do not hold up?
Point is: This code is using functional style when it is utterly inappropriate to do so and it has turned into a right mess as a result. The code is definitely bad; the author of this code is not a great programmer, but perhaps most of the blame goes to the functional evangelists. At any rate, it's very difficult to read; no wonder you're having a hard time figuring out what this stuff does.
The fundamental issue
The fundamental issue is that this functional style strongly likes being a side-effect free process: You start with some data, then the functional pipeline (a chain of stream map, orElse, etc operations) produces some new result, and then you do something with that. Nothing within the pipeline should be changing anything, it's just all in service of calculating new things.
Both of your methods fail to do so properly - the return value of the 'pipeline' is ignored in both of them, it's all about the side effects.
You don't want this: The primary point of the pipelines is that they can skip steps, and will aggressively do so if they think they can, and the pipeline assumes no side-effects, so it makes wrong calls.
That orElse is not actually optional - it doesn't seem to do anything, except: It forces the pipeline to run, except the spec doesn't quite guarantee that it will, so this code is in that sense flat out broken, too.
These methods also take in Optional as an argument type which is completely wrong. Optional is okay as a return value for a functional pipeline (such as Stream's own max() etc methods). It's debatable as a return value anywhere else, and it's flat out silly and a style error so bad you should configure your linter to aggressively flag it as not suitable for production code if they show up in a field declaration or as a method argument.
So get rid of that too.
Let's break down what these methods do
Both of your methods call map on an Optional. An Optional is either 'NONE' (Optional.empty() in Java), which is like null (as in, there is no value), or it is a 'SOME', which means there is exactly one value.
In these specific methods, the map call more or less boils down to:
If the optional is NONE, do nothing, silently. Otherwise, perform the operation in the parens.
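A small sketch of that behaviour, using only java.util.Optional. The class and method names are hypothetical. The mapping function records its argument, so it is easy to see whether it ran at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class OptionalMapSketch {
    // Returns the values the mapping function was actually invoked with.
    static List<String> applyTo(Optional<String> input) {
        List<String> seen = new ArrayList<>();
        // The result of map is discarded, just like in the question's code:
        // only the side effect (seen.add) matters here.
        input.map(seen::add);
        return seen;
    }
}
```

With Optional.empty() the returned list stays empty and nothing signals that the step was skipped; with Optional.of("x") it contains "x".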
Thus, to get rid of the Optional in the argument of your first method, just remove that, and then update the calling code so that it decides what to do in case of no value, instead of this pair of methods (which decided: If passing in an optional.NONE, silently do nothing. "Silently do nothing" is an extremely stupid default behaviour mode, which is a large part of why Optional is not great). Clearly the caller has an Optional from somewhere: either it made it (with e.g. Optional.ofNullable, in which case undo that too), or it got one from elsewhere, for example because it does a stream operation and that returned an optional, in which case, replace:
Optional<Map<String, JSONArrayCache>> optional = ...;
removeDiscontinued(optional, availableTps);
with:
optional.ifPresent(v -> removeDiscontinued(v, availableTps));
or perhaps simply:
if (optional.isPresent()) {
removeDiscontinued(optional.get(), availableTps);
} else {
// code to run otherwise
}
If you don't see how it could be null, great! Optional is significantly worse than NullPointerException in many cases, and so it is here as well: You do NOT want your code to silently do nothing when some value is absent in a place where the programmer of said code wasn't aware of that possibility - an exception is vastly superior: You then know there is a problem, and the exception tells you where. In contrast to the 'silently do not do anything' approach, where it's much harder to tell something is off, and once you realize something is off, you have no idea where to look. Takes literally hundreds of times longer to find the problem.
Thus, then just go with:
removeDiscontinued(optional.get(), availableTps);
which will throw a NoSuchElementException if the unexpected happens, which is good.
The methods themselves
Get rid of those pipelines, functional is not the right approach here, as you're only interested in the side effects:
private void removeDiscontinued(Map<String, JSONArrayCache> dptCache, Lazy<Set<String>> availableTps) {
Set<String> keys = dptCache.keySet();
if (availableTps.get().size() > 0) {
keys.removeIf(id -> !availableTps.get().contains(id));
}
}
That's it - that's all you need, that's what that code does in a very weird, sloppy, borderline broken way.
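A self-contained version to experiment with, as a sketch under assumptions: java.util.function.Supplier stands in for the Lazy type, the map's value type is left generic because JSONArrayCache is not shown, and the class name is made up. It keeps the original code's behaviour of removing the ids that are not in the available set.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

final class RemoveDiscontinuedSketch {
    // Supplier is an assumed replacement for Lazy; V replaces JSONArrayCache.
    static <V> void removeDiscontinued(Map<String, V> dptCache,
                                       Supplier<Set<String>> availableTps) {
        Set<String> available = availableTps.get();
        if (!available.isEmpty()) {
            // keySet() is a live view: removing a key removes the map entry too.
            dptCache.keySet().removeIf(id -> !available.contains(id));
        }
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("a", "1");
        cache.put("b", "2");
        cache.put("c", "3");
        removeDiscontinued(cache, () -> new HashSet<>(Arrays.asList("a", "c")));
        // Only the keys "a" and "c" remain in the cache at this point.
        System.out.println(cache.keySet());
    }
}
```

Note that an empty available set leaves the cache untouched, which is what the size() > 0 check in the original does as well.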
Specifically:
That boolean return value is just a red herring - the author needed that code to return something so that they could use it as argument in their map operation. The value is completely meaningless. If a styleguide that promises: "Your code will be better if you write it using this style" ends up with extremely confusing pointless variables whose values are irrelevant, get rid of the style guide, I think.
The ofNullable wrap is pointless: That method is private and its only caller cannot possibly pass null there, unless dptCache is an instance of some bizarro broken implementation of the Map interface that deigns to return null when its keySet() method is invoked: If that's happening, definitely fix the problem at the source, don't work around it in your codebase, no sane java reader would expect .keySet to return null there. That ofNullable is just making this stuff hard to read, it doesn't do anything here.
Note that the if (availableTps.get().size() > 0) check is just an optimization. You can leave it out if you want. That optimization isn't going to have any impact unless that dptCache object is a large map (thousands of keys at least).

Return optional value, depending on exception thrown by method used inside stream

I'm trying to implement a validation module used for handling events. The validation module is based on a simple interface:
public interface Validator {
Optional<ValidationException> validate(Event event);
}
Existing code base in my team relies on the wrapping exception mechanism - I cannot really play with it.
I have encountered problems when implementing new validator, that is responsible for validating single event, in two terms.
Assume the event is PlayWithDogEvent, and it contains Toys a dog can play with.
Flow of validation of such event:
For each toy,
Check if it's a ball
If it's a ball, it should not be too large.
If any of the toys is either not a ball or too big a ball, my validate(Event event) method should return Optional.of(new ValidationException("some msg")).
I have implemented my validator the following way:
public class ValidBallsOnlyValidator implements Validator {
@Override
public Optional<ValidationException> validate(Event event) {
try {
event.getToys().forEach(this::validateSingleToy);
return Optional.empty();
} catch (InvalidToyException ex) {
return Optional.of(new ValidationException(ex.getMessage()));
}
}
private void validateSingleToy(Toy toy) {
// In real code the optional here is kinda mandatory
Optional<Toy> potentialBall = castToyToBall(toy);
// Im using Java 8
if(potentialBall.isPresent()) {
checkIfBallIsOfValidSize(potentialBall.get(), "exampleSize");
} else {
throw new InvalidToyException("The toy is not a ball!");
}
}
private void checkIfBallIsOfValidSize(Toy toy, String size) {
if(toyTooLarge(toy, size)) throw new InvalidToyException("The ball is too big!");
}
}
The piece seems to work just fine, but I'm uncomfortable with the way it looks. My biggest concern is whether it is good practice to place the whole stream processing inside a single try. Moreover, I don't think such mixing of exception-catching + returning optionals is elegant.
I could use some advice and/or best practices for such scenarios.
but I'm uncomfortable with the way it looks.
The API you're working against is crazy design. The approach to dealing with silly APIs is generally the same:
Try to fix it 'upstream': Make a pull request, talk to the team that made it, etc.
If and only if that option has been exhausted, then [A] write whatever ugly hackery you have to, to make it work, [B] restrict the ugliness to as small a snippet of code as you can; this may involve writing a wrapper that 'contains' the ugly, and finally [C] do not worry about code elegance within the restricted 'ugly is okay here' area.
The reason the API is bizarre is that it is both getting validation wrong, and not capitalizing on the benefits of their mistake (as in, if I'm wrong about their approach being wrong, then at least they aren't doing the best job at their approach).
Specifically, an exception is a return value, in the sense that it is a way to return from a method. Why isn't that interface:
public interface Validator {
void validate(Event event) throws ValidationException;
}
More generally, validation is not a 'there is at most one thing wrong' situation, and that goes towards your problem with 'it feels weird to write a try/catch around the whole thing'.
Multiple things can be wrong. There could be 5 toys, one of which is a ball but too large, and one of which is a squeaky toy. It is weird to report only one error (and presumably, an arbitrarily chosen one).
If you're going to go with the route of not throwing validation exceptions but returning validation issues, then the issues should presumably not be exceptions in the first place, but some other object, and you should be working with a List<ValidationIssue> and not with an Optional<ValidationIssue>. You've gotten rid of an optional, which is always a win, and you now can handle multiple issues in one go. If the 'end point' that processes all this is fundamentally incapable of dealing with more than one problem at a time, that's okay: They can just treat that list as an effective optional, with list.isEmpty() serving as the 'all is well' indicator, and list.get(0) otherwise used to get the first problem (that being the only problem this one-error-at-a-time system can deal with).
This goes to code elegance, the only meaningful way to define that word 'elegance': It's code that is easier to test, easier to understand, and more flexible. It's more flexible: If later on the endpoint code that deals with validation errors is updated to be capable of dealing with more than one, you can now do that without touching the code that makes validation issue objects.
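A rough sketch of the list-based variant. Everything in it is an assumption made for illustration: ValidationIssue, ToySketch, BallsOnlyListValidator and the MAX_SIZE limit are invented, since the question's Toy and Event types are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// A plain value object: it is never thrown, so it does not extend Throwable.
final class ValidationIssue {
    final String message;

    ValidationIssue(String message) {
        this.message = message;
    }
}

// Hypothetical stand-in for the question's Toy type.
final class ToySketch {
    final boolean ball;
    final int size;

    ToySketch(boolean ball, int size) {
        this.ball = ball;
        this.size = size;
    }
}

final class BallsOnlyListValidator {
    static final int MAX_SIZE = 10; // assumed limit, for illustration only

    // Collects every problem instead of stopping at the first one.
    static List<ValidationIssue> validate(List<ToySketch> toys) {
        List<ValidationIssue> issues = new ArrayList<>();
        for (ToySketch toy : toys) {
            if (!toy.ball) {
                issues.add(new ValidationIssue("The toy is not a ball!"));
            } else if (toy.size > MAX_SIZE) {
                issues.add(new ValidationIssue("The ball is too big!"));
            }
        }
        return issues;
    }
}
```

A caller that can only handle one problem checks isEmpty() and reads get(0); a caller that can handle several simply iterates the list.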
Thus, rewrite it all. Either:
Make the API design such that the point is to THROW that exception, not to shove it into an optional, -or-
Make the API list-based, also get rid of optional (yay!) and probably don't work with a validation issue object that extends SomeException. If you're not gonna throw it, don't make it a throwable.
If that's not okay, mostly just don't worry about elegance so much - elegance is off the table once you're forced to work with badly designed APIs.
However, there's of course almost always some style notes to provide for any code.
return Optional.of(new ValidationException(ex.getMessage()));
Ordinarily, this is extremely bad exception handling and your linter tool SHOULD be flagging this down as unacceptable. If wrapping exceptions, you want the cause to remain to preserve both the stack trace and any exception-type-specific information. You're getting rid of all that by ignoring everything about ex, except for its message. Ordinarily, this should be new ValidationException("Some string that adds appropriate context", ex) - thus preserving the chain. If there is no context to add / it is hard to imagine what this might be, then you shouldn't be wrapping at all, and instead throwing the original exception onwards.
However, given that exceptions are being abused here, perhaps this code is okay - this again goes to the central point: Once you're committed to working with a badly designed API, rules of thumb on proper code style go right out the window.
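For reference, a sketch of what preserving the chain looks like. The two exception classes below are hypothetical re-declarations that mirror the names in the question, and the message text is made up; the (message, cause) constructor they delegate to is the standard one on java.lang.Exception.

```java
// Hypothetical exception classes mirroring the names used in the question.
class InvalidToyException extends RuntimeException {
    InvalidToyException(String message) {
        super(message);
    }
}

class ValidationException extends Exception {
    ValidationException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class WrapSketch {
    static ValidationException wrap(InvalidToyException ex) {
        // Passing ex as the cause keeps its stack trace and its concrete type
        // reachable through getCause().
        return new ValidationException("Toy validation failed", ex);
    }
}
```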
private void checkIfBallIsOfValidSize(Toy toy, String size) {
if(toyTooLarge(toy, size)) throw new InvalidToyException("The ball is too big!");
}
Yes, this is a good idea - whilst the API expects you not to throw exceptions but to wrap them in optionals, that part is bad, and you should usually not perpetuate a mistake even if that means your code starts differing in style.
event.getToys().forEach(this::validateSingleToy);
Generally speaking, using the forEach method directly, or .stream().forEach(), is a code smell. forEach should be used in only two cases:
It's the terminal on a bunch of stream ops (.stream().filter().flatMap().map()....forEach - that'd be fine).
You already have a Consumer<T> object and want it to run for each element in a list.
You have neither. This code is best written as:
for (Toy toy : event.getToys()) validateSingleToy(toy);
Lambdas have 3 downsides (which turn into upsides if using lambdas as they were fully intended, namely as code that may run in some different context):
Not control flow transparent.
Not mutable local var transparent.
Not checked exception type transparent.
3 things you lose, and you gain nothing in return. When there are 2 equally succinct and clear ways to do the same thing, but one of the two is applicable in a strict superset of scenarios, always write it in the superset style, because code consistency is a worthwhile goal, and that leads to more consistency (it's worthwhile in that it reduces style friction and lowers learning curves).
That rule applies here.
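To make the three kinds of transparency concrete, here is a small sketch with invented names (LoopTransparencySketch, countUntilBlank). A plain loop body can mutate a local variable, return from the enclosing method and throw a checked exception. Inside a lambda passed to forEach, the local would have to be effectively final, a return would only leave the lambda, and the checked exception would not compile.

```java
import java.io.IOException;
import java.util.List;

final class LoopTransparencySketch {
    // Counts the lines that appear before the first empty line.
    static int countUntilBlank(List<String> lines) throws IOException {
        int count = 0; // mutable local variable, updated inside the loop body
        for (String line : lines) {
            if (line == null) {
                // Checked exception: propagates through the loop as usual.
                throw new IOException("null line");
            }
            if (line.isEmpty()) {
                // Control flow: returns from countUntilBlank itself.
                return count;
            }
            count++;
        }
        return count;
    }
}
```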
Returning exceptions instead of throwing them is weird, but whatever. (Why not return a ValidationResult object instead? Exceptions are usually intended to be thrown and caught).
But you could change your private methods to also return Optional instances which would make it easier to combine them. It would also avoid mixing throwing and returning and streams. Not sure if that is what you are looking for?
public class ValidBallsOnlyValidator implements Validator {
@Override
public Optional<ValidationException> validate(Event event) {
return event.getToys()
.stream()
.map(this::validateSingleToy)
.filter(Optional::isPresent)
.map(Optional::get)
.findFirst()
.map(ex -> new ValidationException(ex.getMessage()));
}
private Optional<InvalidToyException> validateSingleToy(Toy toy) {
// In real code the optional here is kinda mandatory
Optional<Toy> potentialBall = castToyToBall(toy);
if(potentialBall.isPresent()) {
return checkIfBallIsOfValidSize(potentialBall.get(), "exampleSize");
} else {
return Optional.of(new InvalidToyException("The toy is not a ball!"));
}
}
private Optional<InvalidToyException> checkIfBallIsOfValidSize(Toy toy, String size) {
if(toyTooLarge(toy, size)) return Optional.of(new InvalidToyException("The ball is too big!"));
return Optional.empty();
}
}

Is there a name for the difference of these two code styles?

When I see code from others, I mainly see two types of method-styling.
One looks like this, having many nested ifs:
void doSomething(Thing thing) {
if (thing.hasOwner()) {
Entity owner = thing.getOwner();
if (owner instanceof Human) {
Human humanOwner = (Human) owner;
if (humanOwner.getAge() > 20) {
//...
}
}
}
}
And the other style, looks like this:
void doSomething(Thing thing) {
if (!thing.hasOwner()) {
return;
}
Entity owner = thing.getOwner();
if (!(owner instanceof Human)) {
return;
}
Human humanOwner = (Human) owner;
if (humanOwner.getAge() <= 20) {
return;
}
//...
}
My question is, are there names for these two code styles? And if so, what are they called?
The early-returns in the second example are known as guard clauses.
Prior to the actual thing the method is going to do, some preconditions are checked, and if they fail, the method immediately returns. It is a kind of fail-fast mechanism.
There's a lot of debate around those return statements. Some think that it's bad to have multiple return statements within a method. Others think that it avoids wrapping your code in a bunch of if statements, like in the first example.
My own humble opinion is in line with this post: minimize the number of returns, but use them if they enhance readability.
Related:
Should a function have only one return statement?
Better Java syntax: return early or late?
Guard clauses may be all you need
I don't know if there is a recognized name for the two styles, but in structured programming terms, they can be described as "single exit" versus "multiple exit" control structures. (This also includes continue and break statements in loop constructs.)
The classical structured programming paradigm advocated single exit over multiple exit, but most programmers these days are happy with either style, depending on the context. Even classically, relaxation of the "single exit" rule was acceptable when the resulting code was more readable.
(One needs to remember that structured programming was viewed as the antidote to "spaghetti" programming, particularly in assembly language, where the sole control constructs were conditional and non-conditional branches.)
I would say it's about readability. The 2nd style, which I prefer, gives you the opportunity to send, for example, messages to the user/program for any check that should stop the program.
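As a sketch of that point, with invented names (Human here is a hypothetical stand-in for the question's class, and check and its messages are made up), each guard clause can report its own reason for stopping:

```java
// Hypothetical stand-in for the question's Human type.
final class Human {
    final int age;

    Human(int age) {
        this.age = age;
    }
}

final class GuardClauseSketch {
    // Each failed precondition returns immediately with its own message.
    static String check(Object owner) {
        if (owner == null) {
            return "rejected: no owner";
        }
        if (!(owner instanceof Human)) {
            return "rejected: owner is not human";
        }
        Human human = (Human) owner;
        if (human.age <= 20) {
            return "rejected: owner is too young";
        }
        return "accepted";
    }
}
```

The nested-if version would need an else branch at every level to produce the same messages.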
One could call it "multiple returns" and "single return". But I wouldn't call it a style, you may want to use both approaches, depending on readability in any particular case.
Single return is considered a better practice in general, since it allows you to write more readable code with the least surprise for the reader. In a complex method, it may be quite complicated to understand at which point the program will exit for any particular arguments, and what side effects may occur.
But if in any particular case you feel multiple returns improve readability of your code, there's nothing wrong with using them.

Overriding abstract method or using one single method in enums?

Consider the below enums, which is better? Both of them can be used exactly the same way, but what are their advantages over each other?
1. Overriding abstract method:
public enum Direction {
UP {
@Override
public Direction getOppposite() {
return DOWN;
}
@Override
public Direction getRotateClockwise() {
return RIGHT;
}
@Override
public Direction getRotateAnticlockwise() {
return LEFT;
}
},
/* DOWN, LEFT and RIGHT skipped */
;
public abstract Direction getOppposite();
public abstract Direction getRotateClockwise();
public abstract Direction getRotateAnticlockwise();
}
2. Using a single method:
public enum Orientation {
UP, DOWN, LEFT, RIGHT;
public Orientation getOppposite() {
switch (this) {
case UP:
return DOWN;
case DOWN:
return UP;
case LEFT:
return RIGHT;
case RIGHT:
return LEFT;
default:
return null;
}
}
/* getRotateClockwise and getRotateAnticlockwise skipped */
}
Edit: I really hope to see some well-reasoned/elaborated answers, with evidence/sources for particular claims. Most existing answers regarding performance aren't really convincing due to the lack of proof.
You can suggest alternatives, but it has to be clear how they are better than the ones stated and/or how the stated ones are worse, and provide evidence when needed.
Forget about performance in this comparison; it would take a truly massive enum for there to be a meaningful performance difference between the two methodologies.
Let's focus instead on maintainability. Suppose you finish coding your Direction enum and eventually move on to a more prestigious project. Meanwhile, another developer is given ownership of your old code including Direction - let's call him Jimmy.
At some point, requirements dictate that Jimmy add two new directions: FORWARD and BACKWARD. Jimmy is tired and overworked and does not bother to fully research how this would affect existing functionality - he just does it. Let's see what happens now:
1. Overriding abstract method:
Jimmy immediately gets a compiler error (actually he probably would've spotted the method overrides right below the enum constant declarations). In any case, the problem is spotted and fixed at compile time.
2. Using a single method:
Jimmy doesn't get a compiler error, or even an incomplete switch warning from his IDE, since your switch already has a default case. Later, at runtime, a certain piece of code calls FORWARD.getOpposite(), which returns null. This causes unexpected behavior and at best quickly causes a NullPointerException to be thrown.
Let's back up and pretend you added some future-proofing instead:
default:
throw new UnsupportedOperationException("Unexpected Direction!");
Even then the problem wouldn't be discovered until runtime. Hopefully the project is properly tested!
Now, your Direction example is pretty simple so this scenario might seem exaggerated. In practice though, enums can grow into a maintenance problem as easily as other classes. In a larger, older code base with multiple developers resilience to refactoring is a legitimate concern. Many people talk about optimizing code but they can forget that dev time needs to be optimized too - and that includes coding to prevent mistakes.
Edit: A note under JLS Example §8.9.2-4 seems to agree:
Constant-specific class bodies attach behaviors to the constants. [This] pattern is much safer than using a switch statement in the base type... as the pattern precludes the possibility of forgetting to add a behavior for a new constant (since the enum declaration would cause a compile-time error).
I actually do something different. Your solutions have drawbacks: abstract overridden methods introduce quite a lot of overhead, and switch statements are pretty hard to maintain.
I suggest the following pattern (applied to your problem):
public enum Direction {
UP, RIGHT, DOWN, LEFT;
static {
Direction.UP.setValues(DOWN, RIGHT, LEFT);
Direction.RIGHT.setValues(LEFT, DOWN, UP);
Direction.DOWN.setValues(UP, LEFT, RIGHT);
Direction.LEFT.setValues(RIGHT, UP, DOWN);
}
private void setValues(Direction opposite, Direction clockwise, Direction anticlockwise){
this.opposite = opposite;
this.clockwise = clockwise;
this.anticlockwise = anticlockwise;
}
Direction opposite;
Direction clockwise;
Direction anticlockwise;
public final Direction getOppposite() { return opposite; }
public final Direction getRotateClockwise() { return clockwise; }
public final Direction getRotateAnticlockwise() { return anticlockwise; }
}
With such design you:
never forget one of a direction's values, because setValues (or a constructor, in case you could use one) requires all three
have little method call overhead, because the method is final, not virtual
clean and short code
you can however forget to set one direction's values
The first variant is faster and is probably more maintainable, because all properties of the direction are described where the direction itself is defined. Nevertheless, putting non-trivial logic into enums looks odd to me.
The second variant will probably be a little bit faster as the >2-ary polymorphism will force a full virtual function call on the interface, vs a direct call and index for the latter.
The first form is the object-oriented approach.
The second form is a pattern-matching approach.
As such, the first form, being object-oriented, makes it easy to add new enums, but hard to add new operations. The second form does the opposite.
Most experienced programmers I know would recommend using pattern-matching over object-orientation. As enums are closed, adding new enums is not an option; therefore, I would definitely go with the latter approach myself.
The enum values can be considered as independent classes. So, following object-oriented concepts, each enum value should define its own behaviour. So I would recommend the first approach.
You could also simply implement it once like this (you need to keep the enum constants in the appropriate order):
public enum Orientation {
UP, RIGHT, DOWN, LEFT; //Order is important: must be clock-wise
public Orientation getOppposite() {
int position = ordinal() + 2;
return values()[position % 4];
}
public Orientation getRotateClockwise() {
int position = ordinal() + 1;
return values()[position % 4];
}
public Orientation getRotateAnticlockwise() {
int position = ordinal() + 3; //Not -1 to avoid negative position
return values()[position % 4];
}
}
The first version is probably much faster. The Java JIT compiler can apply aggressive optimizations to it because enums are final (so all methods in them are final, too). The code:
Orientation o = Orientation.UP.getOppposite();
should actually become (at runtime):
Orientation o = Orientation.DOWN;
i.e. the compiler can remove the overhead for the method call.
From a design perspective, it's the proper way to do these things with OO: Move knowledge close to the object that needs it. So UP should know about its opposite, not some code elsewhere.
The advantage of the second method is that it's more readable since all related things are grouped better (i.e. all the code related to "opposite" is in one place instead of a bit here and a bit there).
EDIT My first argument depends on how smart the JIT compiler is. My solution for the problem would look like this:
public enum Orientation {
UP, DOWN, LEFT, RIGHT;
private static Orientation[] opposites = {
DOWN, UP, RIGHT, LEFT
};
public Orientation getOpposite() {
return opposites[ ordinal() ];
}
}
This code is compact and fast, no matter what the JIT can or could do. It clearly communicates intent and, given the rules of ordinals, it will always work.
I would also suggest to add a test which makes sure that when calling getOpposite() for each value of the enum, you always get a different result and none of the results is null. That way, you can be sure that you got every case.
The only problem left is when you change the order of values. To prevent problems in this case, assign each value an index and use that to look up values in an array or even in Orientation.values().
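The suggested test could be sketched roughly as follows. To stay self-contained it repeats the array-based enum and uses a plain boolean check instead of a test framework; OrientationCheck and oppositesAreConsistent are hypothetical names, and the "opposite of the opposite" check is an extra assumption beyond what is described above.

```java
import java.util.EnumSet;
import java.util.Set;

enum Orientation {
    UP, DOWN, LEFT, RIGHT;

    // Indexed by ordinal(): the entry at position i is the opposite of constant i.
    private static final Orientation[] OPPOSITES = { DOWN, UP, RIGHT, LEFT };

    Orientation getOpposite() {
        return OPPOSITES[ordinal()];
    }
}

final class OrientationCheck {
    // True when every constant has a non-null opposite, no two constants share
    // an opposite, and taking the opposite twice leads back to the start.
    static boolean oppositesAreConsistent() {
        Set<Orientation> seen = EnumSet.noneOf(Orientation.class);
        for (Orientation o : Orientation.values()) {
            Orientation opposite = o.getOpposite();
            if (opposite == null || !seen.add(opposite) || opposite.getOpposite() != o) {
                return false;
            }
        }
        return true;
    }
}
```

If a new constant is added without extending the array, getOpposite() throws ArrayIndexOutOfBoundsException for it, so the check fails loudly rather than passing silently.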
Here is another way to do it:
public enum Orientation {
UP(1), DOWN(0), LEFT(3), RIGHT(2);
private int opposite;
private Orientation( int opposite ) {
this.opposite = opposite;
}
public Orientation getOpposite() {
return values()[ opposite ];
}
}
I don't like this approach, though.
It's too hard to read (you have to count the index of each value in your head) and too easy to get wrong. It would need a unit test per value in the enum and per method that you can call (so 4*3 = 12 in your case).
Answer: It Depends
IF your method definitions are simple
This is the case with your very simple example methods, which just hard-code an enum output for each enum input
implement definitions specific to an enumeration value right next to that enumeration value
implement definitions common to all enumeration values at the bottom of the class in the "common area"; if the same method signature is to be available for all enum values but none/part of logic is common, use abstract method definitions in the common area
i.e. Option 1
Why?
readability, consistency, maintainability: the code directly related to a definition is right next to the definition
compile-time checking: if an abstract method is declared in the common area but not implemented for an enum value, the code does not compile
Note that the North/South/East/West example could be considered to represent a very simple state (of current direction) and the methods opposite/rotateClockwise/rotateAnticlockwise could be considered to represent user commands to change state. Which raises the question: what do you do for a real-life, typically complex state machine?
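As a rough illustration of the layout described above (not the asker's actual code), the sketch below puts each value-specific definition next to its value and declares the shared signature as an abstract method in the common area. The isVertical() method is a made-up example of logic common to all values.

```java
enum Orientation {
    UP {
        @Override
        Orientation getOpposite() { return DOWN; }
    },
    DOWN {
        @Override
        Orientation getOpposite() { return UP; }
    },
    LEFT {
        @Override
        Orientation getOpposite() { return RIGHT; }
    },
    RIGHT {
        @Override
        Orientation getOpposite() { return LEFT; }
    };

    // Common area: the signature is declared once. A value that does not
    // implement it is rejected by the compiler.
    abstract Orientation getOpposite();

    // Common area: logic shared by every value (hypothetical example method).
    boolean isVertical() {
        return this == UP || this == DOWN;
    }
}
```

Deleting any one of the getOpposite() bodies turns the omission into a compile error, which is the compile-time checking mentioned above.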
IF your method definitions are complex:
State machines are often complex, relying on the current (enumerated value) state, command input, timers, and a fairly large number of rules and business exceptions to determine the new (enumerated value) state. At other, rarer times, methods may even determine the enumerated value output via calculations (e.g. scientific/engineering/insurance rating categorisation), or they could use data structures such as a map or a more complex structure suited to an algorithm. When the logic is complex, extra care is required and the balance between "common" logic and "enum value-specific" logic changes.
avoid putting excessive code volume, complexity, and repeated 'cut & paste' sections right next to the enum value
try to refactor as much logic as possible into the common area - possibly putting 100% of logic here, but if not possible, employing the Gang Of Four "Template Method" pattern to maximise the amount of common logic, but flexibly allow a small amount of specific logic against each enum value.
i.e. As much as possible of Option 1, with a little of Option 2 allowed
Why?
readability, consistency, maintainability: avoids code bloat, duplication, poor textual formatting with masses of code interspersed amongst enum values, allows the full set of enum values to be quickly seen and understood
compile-time checking when using the Template Method pattern: an abstract method declared in the common area but not implemented for an enum value is a compile error
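A minimal sketch of the Template Method idea applied to an enum. Everything here (the RiskCategory enum, its factors, and the rounding rule) is invented for illustration; only the shape matters: one final method in the common area holds the shared logic and delegates a single small step to each value.

```java
enum RiskCategory {
    LOW {
        @Override
        double factor() { return 0.8; }
    },
    MEDIUM {
        @Override
        double factor() { return 1.0; }
    },
    HIGH {
        @Override
        double factor() { return 1.5; }
    };

    // The only value-specific step; each value must supply it.
    abstract double factor();

    // Template method: validation and rounding live here once,
    // instead of being copied next to every enum value.
    final double premium(double base) {
        if (base < 0) {
            throw new IllegalArgumentException("base must be non-negative");
        }
        double raw = base * factor();
        return Math.round(raw * 100.0) / 100.0; // round to two decimals
    }
}
```

The list of enum values stays short enough to read at a glance, while the bulk of the logic sits in one place.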
Note: you could put ALL logic into a separate helper class, but I personally don't see any advantages to this (not performance/maintainability/readability). It breaks encapsulation a little and once you have all the logic in one place, what difference does it make to add a simple enum definition back to the top of the class? Splitting code across multiple classes is a different matter and is to be encouraged where appropriate.

Refactoring advice and tools

I have some code that consists of a lot (several hundred LOC) of ugly conditionals, i.e.
SomeClass someClass = null;
if("foo".equals(fooBar)) {
// do something possibly involving more if-else statements
// and possibly modify the someClass variable among others...
} else if("bar".equals(fooBar)) {
// Same as above but with some slight variations
} else if("baz".equals(fooBar)) {
// and yet again as above
//... lots more else-ifs
} else {
// and if nothing matches it is probably an error...
// so there is some error handling here
}
// Some code that acts on someClass
GenerateOutput(someClass);
Now I had the idea of refactoring this kind of code something along the lines of:
abstract class CheckPerform<S,T,Q> {
// The next link must use the same three type parameters as this one
private CheckPerform<S,T,Q> next;
CheckPerform(CheckPerform<S,T,Q> next) {
this.next = next;
}
protected abstract T perform(S arg);
protected abstract boolean check(Q toCheck);
public T checkPerform(S arg, Q toCheck) {
if(check(toCheck)) {
return perform(arg);
}
// Check if this CheckPerform is the last in the chain;
// otherwise pass both arguments on to the next link
return next == null ? null : next.checkPerform(arg, toCheck);
}
}
And for each if statement generate a subclass of CheckPerform, e.g.
class CheckPerformFoo extends CheckPerform<SomeInput, SomeClass, String> {
CheckPerformFoo(CheckPerform<SomeInput, SomeClass, String> next) {
super(next);
}
protected boolean check(String toCheck) {
// same check as in the if-statement with "foo" above
return "foo".equals(toCheck);
}
protected SomeClass perform(SomeInput arg) {
// Perform same actions (as in the "foo" if-statement)
// and return a SomeClass instance (that is in the
// same state as in the "foo" if-statement)
}
}
I could then inject the different CheckPerforms into each other so that the checks are made in the same order and the corresponding actions are taken. In the original class I would then only need to inject one CheckPerform object. Is this a valid approach to this type of problem? The number of classes in my project is likely to explode, but at least I will get more modular and testable code. Should I do this some other way?
Since these if-else-if-...-else-if-else statements are what I would call a recurring theme of the code base, I would like to do this refactoring as automagically as possible. So what tools could I use to automate this?
a) Some customizable refactoring feature hidden somewhere in an IDE that I have missed (either in Eclipse or IDEA preferably)
b) Some external tool that can parse Java code and give me fine grained control of transformations
c) Should I hack it myself using Scala?
d) Should I manually go over each class and do the refactoring using the features I am familiar with in my IDE?
Ideally the output of the refactoring should also include some basic test code template that I can run (preferably also test cases for the original code that can be run on both new and old as a kind of regression test... but that I leave for later).
Thanks for any input and suggestions!
What you have described is the Chain of Responsibility Pattern and this sounds like it could be a good choice for your refactor. There could be some downsides to this.
Readability: Because you are going to be injecting the order of the CheckPerformers using Spring or some such, it is difficult to see what the code will actually do at first glance.
Maintenance: If someone after you wants to add a new condition, they have to edit some Spring config as well as add a whole new class. Choosing the correct place to add their new CheckPerformer could be difficult and error prone.
Many classes: Depending on how many conditions you have and how much repeated code there is within those conditions, you could end up with a lot of new classes. Even though the long list of if-else isn't very pretty, the logic is in one place, which again aids readability.
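One way to soften the readability concern is to wire the chain by hand in a single factory method, so the order of checks stays visible in code. The sketch below assumes a CheckPerform base class like the one in the question, with the type parameters spelled out in full. KeyedHandler and Chains are hypothetical names, and String stands in for SomeInput and SomeClass.

```java
abstract class CheckPerform<S, T, Q> {
    private final CheckPerform<S, T, Q> next;

    CheckPerform(CheckPerform<S, T, Q> next) {
        this.next = next;
    }

    protected abstract boolean check(Q toCheck);

    protected abstract T perform(S arg);

    public T checkPerform(S arg, Q toCheck) {
        if (check(toCheck)) {
            return perform(arg);
        }
        // End of chain: nothing matched, so return null as in the question.
        return next == null ? null : next.checkPerform(arg, toCheck);
    }
}

// Hypothetical handler that matches one key; real handlers would each
// carry the logic of one branch of the original if-else chain.
final class KeyedHandler extends CheckPerform<String, String, String> {
    private final String key;

    KeyedHandler(String key, CheckPerform<String, String, String> next) {
        super(next);
        this.key = key;
    }

    @Override
    protected boolean check(String toCheck) {
        return key.equals(toCheck);
    }

    @Override
    protected String perform(String arg) {
        return key + ":" + arg;
    }
}

final class Chains {
    // The order of checks is readable here, top to bottom.
    static CheckPerform<String, String, String> defaultChain() {
        return new KeyedHandler("foo",
               new KeyedHandler("bar",
               new KeyedHandler("baz", null)));
    }
}
```

Adding a condition then means adding one class and one line in defaultChain(), without touching external configuration.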
To answer the more general part of your question, I don't know of any tools for automatic refactoring beyond basic IDE support, but if you want to know what to look for when refactoring, have a look at the Refactoring catalog. The specifics of your question are covered by Replace Conditional with Polymorphism and Replace Conditional with Visitor.
To me the easiest approach would involve a Map<String, Action>, i.e. mapping various strings to specific actions to perform. This way the lookup would be simpler and more performant than the manual comparison in your CheckPerform* classes, getting rid of much duplicated code.
The actions can be implemented similar to your design, as subclasses of a common interface, but it may be easier and more compact to use an enum with overridden method(s). You may see an example of this in an earlier answer of mine.
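A minimal sketch of the map-based lookup, under a few assumptions: Action is a hypothetical single-method interface, Dispatcher is an invented class name, and String stands in for both the input type and SomeClass. Lambdas are used for brevity and require Java 8 or later; on older versions anonymous classes do the same job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical action interface; one implementation per former if-branch.
interface Action {
    String perform(String input);
}

final class Dispatcher {
    private final Map<String, Action> actions = new HashMap<>();

    Dispatcher() {
        // Each entry replaces one "else if" branch of the original code.
        actions.put("foo", input -> "foo handled " + input);
        actions.put("bar", input -> "bar handled " + input);
        actions.put("baz", input -> "baz handled " + input);
    }

    String dispatch(String key, String input) {
        Action action = actions.get(key);
        if (action == null) {
            // Corresponds to the final "else" error branch.
            throw new IllegalArgumentException("No action registered for: " + key);
        }
        return action.perform(input);
    }
}
```

The string comparison happens once, inside the map lookup, instead of being repeated in every handler class.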
Unfortunately I don't know of any automatic refactoring which could help you much in this. Earlier when I did somewhat similar refactorings, I wrote unit tests and did the refactoring step-by-step, manually, using automated support at the level of Move Method et al. Of course since the unit tests were pretty similar to each other in their structure, I could reuse part of the code there.
Update
@Sebastien pointed out in his comment that I missed the possible sub-ifs within the bigger if blocks. One can indeed use a hierarchy of maps to resolve this. However, if the hierarchy starts to be really complex with a lot of duplicated functionality, a further improvement might be to implement a DSL, to move the whole mapping out of code into a config file or DB. In its simplest form it might look something like
foo -> com.foo.bar.SomeClass.someMethod
    biz -> com.foo.bar.SomeOtherClass.someOtherMethod
    baz -> com.foo.bar.YetAnotherClass.someMethod
bar -> com.foo.bar.SomeOtherClass.someMethod
    biz -> com.foo.bar.DifferentClass.aMethod
    baz -> com.foo.bar.AndAnotherClass.anotherMethod
where the indented lines configure the sub-conditions for each bigger case.
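To make the idea a little more concrete, here is a rough sketch of reading such a file into a two-level map. It assumes that sub-conditions are marked by leading whitespace and that every line has the form "key -> target". MappingConfig and Entry are invented names, and resolving the target strings to actual methods (for example via reflection) is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MappingConfig {
    /** One top-level case: its own target plus any sub-condition targets. */
    static final class Entry {
        final String target;
        final Map<String, String> subConditions = new LinkedHashMap<>();

        Entry(String target) {
            this.target = target;
        }
    }

    static Map<String, Entry> parse(List<String> lines) {
        Map<String, Entry> result = new LinkedHashMap<>();
        Entry current = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.trim().split("\\s*->\\s*", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            boolean indented = Character.isWhitespace(line.charAt(0));
            if (!indented) {
                // Unindented line: starts a new top-level case.
                current = new Entry(parts[1]);
                result.put(parts[0], current);
            } else if (current != null) {
                // Indented line: sub-condition of the most recent top-level case.
                current.subConditions.put(parts[0], parts[1]);
            } else {
                throw new IllegalArgumentException("Sub-condition without a parent: " + line);
            }
        }
        return result;
    }
}
```

The same key (such as biz) can appear under several top-level cases because each case keeps its own sub-condition map.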
