I have come across 2 samples of code from a legacy system, and I'm at a loss to understand why someone would code like this. The app is in Java and is about 10-15 years old.
It seems so inefficient and hard to understand when done like this.
if(condition) {
String[] hdtTmp = { "Range (Brand):", "Build Region:", "Design:", "Plan (Size):", "Facade:", "Region:" , "Internal Colour", "External Colour"};
hdt = hdtTmp;
String[] hddTmp = { p.RangeName, brName, p.HomeName, p.Name, f.Name, "North", "Red", "Blue"};
hdd = hddTmp;
hddTmp = null;
hdtTmp = null;
}
I do not understand why you would not just assign it to the attribute in the first place. And since hdtTmp and hddTmp are inside the block, why make them null?
max = hdt.length -1;
for(int i=0; ; i++) {
// do some stuff here
if(i == max)
break;
}
Again, it seems that the original programmer didn't know how for loops worked?
They never taught this when I did my degree, so my question is: why would anyone write code like this?
Setting local variables to null... Well back in the 90s, some old Sun documentation suggested that this would help, but it's been so long and that information is probably no longer available as it is no longer correct. I have run into this in a lot of old code and at this point it does nothing, since local variables lose the reference to the object as soon as the method exits and the GC is smart enough to figure that out.
The for() loop question is more of someone building a loop with exit criterion inside the actual loop. It's just unfortunate coding style and probably written by a relatively junior developer.
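To make the comparison concrete, here is a minimal sketch of the loop from the question next to the conventional form. The class and method names (LoopForms, legacy, conventional) are made up for illustration, and the "do some stuff" body is replaced by recording the current element. The two forms visit the same elements for a non-empty array, but they differ on an empty one: the legacy form runs its body before testing anything, so it fails on hdt[0].

```java
import java.util.ArrayList;
import java.util.List;

class LoopForms {
    // The form from the question: no condition in the header, exit test at the end of the body.
    static List<String> legacy(String[] hdt) {
        List<String> seen = new ArrayList<>();
        int max = hdt.length - 1;
        for (int i = 0; ; i++) {
            seen.add(hdt[i]);   // stand-in for "do some stuff here"
            if (i == max)
                break;
        }
        return seen;
    }

    // The conventional form: the bound is checked in the loop header before each iteration.
    static List<String> conventional(String[] hdt) {
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < hdt.length; i++) {
            seen.add(hdt[i]);
        }
        return seen;
    }
}
```

So besides being harder to read, the legacy form silently assumes the array is never empty.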
This code definitely seems like someone coding Java after learning C/C++. I have seen enough of it: C/C++ people were taught to clean up after allocation, and assigning to null was something that made them happy (at least the ones I have worked with back in the day). Another thing they did was override the finalize method, which is bad form in Java but was the closest thing they had to destructors.
You may also see infinite loops with stopping condition inside like:
for (;;) {
// Do stuff
if (something_happened)
break;
}
Old (bad) habits die hard.
There is a narrow use case for nulling variables in the general case.
If the code between assigning null and the end of the variable's scope (often the end of the method) is longish running, nulling gives an earlier opportunity to GC the object previously referenced.
On the surface this is a micro optimisation, but it can have security benefits: if a memory dump or other similar snapshot (e.g. by a profiler) were to occur, it could (very slightly) reduce the chance that sensitive data was part of the dump.
These "benefits" are pretty tenuous though.
The code in the question however is utterly useless.
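As for the temporaries in the first snippet, one possible explanation (a guess, not something the legacy code confirms) is that the brace shorthand for arrays only compiles in a declaration, so the author declared a temporary to be able to use it. A small sketch with a made-up Headings class shows the direct assignment that makes both the temporary and the nulling unnecessary:

```java
class Headings {
    String[] hdt;

    void assignDirectly() {
        // hdt = { "Design:", "Facade:" };            // does not compile: the shorthand is only valid in a declaration
        hdt = new String[] { "Design:", "Facade:" };  // array creation expression, no temporary needed
    }
}
```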
Related
I'm reading a blog post and trying to understand what's going on.
This is the blogpost.
it has this code:
if (validation().hasErrors())
throw new IllegalArgumentException(validation().errorMessage());
In the validation() method we have some object initialization and calculations, so let's say it's an expensive call. Is it going to be executed twice? Or will it be optimized by the compiler to be something like this?
var validation = validation();
if (validation.hasErrors())
throw new IllegalArgumentException(validation.errorMessage());
Thanks!
The validation method will be called twice, and it will do the same work each time. First, the method is relatively big, and so it won't get inlined. Without being inlined, the compiler doesn't know what it does. Therefore, it safely assumes that the method has side effects, and so it cannot optimize away the second call.
Even if the method was inlined, and the compiler could examine it, it would see that there are in fact side effects. Calling LocalDate.now() returns a different result each time. For this reason, the code that you linked to is defective, although it's not likely to experience a problem in practice.
It's safer to capture the validation result in a local variable not for performance reasons, but for stability reasons. Imagine the odd case in which the initial validation call fails, but the second call passes. You'd then throw an exception with no message in it.
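A quick way to convince yourself is to count the calls. The sketch below uses a made-up ValidationDemo class whose validation() is only a stand-in for the one in the blog post: it always reports an error and increments a counter as its side effect. Because that side effect is observable, no compiler is allowed to drop the second call.

```java
class ValidationDemo {
    static int calls = 0;

    // Hypothetical stand-in for the blog post's validation(): records how often it runs.
    static Result validation() {
        calls++;
        return new Result("name must not be blank");
    }

    static final class Result {
        private final String message;
        Result(String message) { this.message = message; }
        boolean hasErrors() { return message != null; }
        String errorMessage() { return message; }
    }

    // The form from the question: validation() appears twice and runs twice.
    static int callsWhenInvokedTwice() {
        calls = 0;
        try {
            if (validation().hasErrors())
                throw new IllegalArgumentException(validation().errorMessage());
        } catch (IllegalArgumentException expected) {
            // ignored: only the call count matters here
        }
        return calls;
    }

    // The captured form: validation() runs once and the result is reused.
    static int callsWhenCaptured() {
        calls = 0;
        try {
            Result validation = validation();
            if (validation.hasErrors())
                throw new IllegalArgumentException(validation.errorMessage());
        } catch (IllegalArgumentException expected) {
            // ignored: only the call count matters here
        }
        return calls;
    }
}
```

Under these assumptions callsWhenInvokedTwice() reports 2 and callsWhenCaptured() reports 1.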
The Java to Bytecode compiler has a limited set of optimization techniques (e.g. 9*9 in the condition would turn into 81).
The real optimization happens by the JIT (Just In Time) compiler. This compiler is the result of over a decade and a half of extensive research and there is no simple answer to tell what it is capable of in every scenario.
With that being said, as a good practice, I always handle repetitive identical method calls by storing their result before approaching any loop structure where that result is needed. Example:
int[] grades = new int[500];
int countOfGrades = grades.length;
for (int i = 0; i < countOfGrades; i++) {
// Some code here
}
For your code (which is only run twice), you shouldn't worry as much about such optimization. But if you're looking for the ultimate – guaranteed – optimization on the account of a fraction of space (which is cheap), then you're better off using a variable to store any identical method result when needed more than once:
var validation = validation();
if (validation.hasErrors())
throw new IllegalArgumentException(validation.errorMessage());
However, I must simply question ... "these days," does it even actually matter anymore? Simply write the source-code "in the most obvious manner available," as the original programmer certainly did.
"Microseconds" really don't matter anymore. But, "clarity still does." To me, the first version of the code is frankly more understandable than the second, and "that's what matters to me most." Please don't bother to try to "out-smart" the compiler, if it results in source-code that is in any way harder to understand.
If I have a string that is currently empty
String s = "";
and reassign it
s = "";
is it bad if I don't do it like this?
if(!s.isEmpty()){
s = "";
}
or will the compiler pick up on it and optimize for me?
The cost of calling the method isEmpty() (allocating new space in the thread stack etc.) negates any gains. If you want to assign an empty String to the variable, it's most efficient to do so without the if statement.
Do not micro-pessimize your code.
s = "";
The assignment gets translated by JIT into a move into register, where the source most probably resides in register as well (in a loop, such an optimization gets done unless you run out of registers). So it's the fastest instruction taking just one cycle and current Intel can execute 4 of them in parallel.
if(!s.isEmpty()){
s = "";
}
The function call is not a problem, as it gets inlined. But there's still a memory indirection from the String to its value.length. Not bad, but already way more expensive than the simple way. Then a conditional branch, which can take tens of cycles, unless the outcome can be predicted by the CPU. If not, you still may be lucky and the JIT may replace it by a conditional move. Whatever happens, it can never be faster than the simpler code.
Addendum
In theory, you could hope that the JIT finds out that the two variants are equivalent and replaces the pessimized one by the simple one. But these two variants are not equivalent. You'd need to use
if (s != "") {
s = "";
}
instead, as there may be other empty strings than the interned one (i.e., the compile time constant ""). Anyway, I hope that we agree that the above snippet is something nobody should ever use.
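The point about "other empty strings" is easy to check. In this small sketch (the EmptyStrings class and its method names are invented for the example), a string created with new is empty but is not the same object as the literal, while interning it yields the literal again:

```java
class EmptyStrings {
    // An empty string created with `new` is a distinct object from the literal "".
    static boolean isEmptyButNotLiteral() {
        String literal = "";
        String built = new String("");
        return built.isEmpty() && built != literal;
    }

    // intern() returns the canonical instance, which is the one the literal refers to.
    static boolean internedIsLiteral() {
        return new String("").intern() == "";
    }
}
```

Both methods return true, which is why !s.isEmpty() and s != "" are not interchangeable conditions.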
If there's any real logic between the initialization and the point at which you assign "" again, the compiler probably won't be able to optimize it out. But it's fine if it doesn't, because the reassignment isn't going to take any significant time. In theory, if it did, the Just-In-Time compiler (JIT) in the JVM (well, Oracle's JVM, anyway) would try to optimize it (if it could) if it ended up being a "hot spot" in the code.
I have the following piece of code:
Player player = (Player)Main.getInstance().getPlayer();
player.setSpeedModifier(keyMap[GLFW_KEY_LEFT_SHIFT] ? 1.8f : 1);
if (keyMap[GLFW_KEY_W]) {
player.moveForward();
}
if (keyMap[GLFW_KEY_S]) {
player.moveBackward();
}
player.rotateTowards(getMousePositionInWorld());
I was wondering if the usage of a local variable (for the player) to make the code more readable has any impact on performance, or whether it would be optimised during compilation to replace the uses of the variable, seeing as it is just a straight copy of another variable. Whilst it is possible to keep the long version in place, I prefer the readability of having the shorter version. I understand that the performance impact, if there was any, would be minuscule, but was just interested if there would be any whatsoever.
Thanks, -Slendy.
For any modern compiler, this will most likely be optimized away and it will not have any performance implications. The few additional bytes used for storage are well worth the added readability.
Consider these 2 pieces of code:
final Player player = (Player)Main.getInstance().getPlayer();
player.callmethod1();
player.callmethod2();
and:
((Player)Main.getInstance().getPlayer()).callmethod1();
((Player)Main.getInstance().getPlayer()).callmethod2();
There are reasons why the first variant is preferable:
The first one is more readable, if only because of line length.
The Java compiler cannot assume that the same object will be returned by Main.getInstance().getPlayer(), which is why the second variant will actually call getPlayer twice, which could be a performance penalty.
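To see why the compiler cannot merge the calls, here is a deliberately contrived sketch in which the getter hands out a fresh object every time. PlayerLookup, its nested Player and the steps counter are all made up for the example; the real getPlayer() from the question presumably returns the same player each time, but the compiler has no way to know that in general.

```java
class PlayerLookup {
    static final class Player {
        int steps = 0;
        void moveForward() { steps++; }
    }

    static int lookups = 0;

    // Hypothetical getter: counts its calls and returns a new object on each one.
    static Player getPlayer() {
        lookups++;
        return new Player();
    }

    // One lookup; both moves land on the same object.
    static int stepsViaLocal() {
        Player player = getPlayer();
        player.moveForward();
        player.moveForward();
        return player.steps;
    }

    // Three lookups; each move lands on a different object and is lost.
    static int stepsViaRepeatedCalls() {
        getPlayer().moveForward();
        getPlayer().moveForward();
        return getPlayer().steps;
    }
}
```

With this getter the two variants are not just different in cost, they give different results, so replacing one with the other would change the program.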
Apart from the probably unneeded (Player) cast, I even find your version to be superior to having long worms of calls.
IMHO if you need one particular object more than once or twice, it is worth saving in a local variable.
The local variable will need some bytes on the stack, but on the other hand, several calls are omitted, so your version clearly wins.
Your biggest performance hit will likely be the function lookup of the objects:
(Player)Main.getInstance().getPlayer();
Otherwise, you want to minimize these function calls if possible. In this case, a local var could save CPU, though if you have a global var, it might be a hair faster to use it.
It really depends on how many times this is done in a loop though. Quite likely you will see no difference either way in normal usage. :)
In my program I have a long if condition which looks something like this:
if((items.size() > 0 && !k.getText().equals(last)) || cr.isLast() == true)
Now I thought it is easier to read if I use a variable for the first statement so I changed my code into this:
boolean textChanged = items.size() > 0 && !k.getText().equals(last);
if(textChanged == true || cr.isLast() == true)
Now my question is: Does the second code need more memory because I used a temporary variable or is this optimized from the compiler? I think today it is not so important if one boolean less or more is stored in the memory but there is the wish to create an optimized and memory friendly program.
"The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet." — Michael A. Jackson
The compiler will optimize what can be optimized, you should care about the more difficult task: try to write clean and readable code.
if((items.size() > 0 && !k.getText().equals(last)) || cr.isLast() == true)
This is resolved to a boolean value by a compiler, and that boolean has to go somewhere. Once the if statement is finished, the boolean is disposed of. If you create a local variable, that boolean is maintained in memory for its life time (here it looks like up until the end of the method).
That said, the compiler may notice this, and provided that boolean isn't used anywhere else, it will probably evaluate to your first example anyway. Either way, this is quite a stringent optimisation, and something that Java can definitely handle either way.
Use your extracted solution - it is more readable, you can debug it properly, and the compiler optimizes it anyway. Also, locally scoped variables (and boolean primitives above all) won't get stored on the heap and are therefore quickly disposed of anyway.
I 100% agree with all the other answers so far, but I'm also compelled to actually answer the question.
my question is: Does the second code need more memory because I used a
temporary variable
Probably not. It's most likely that in both cases, the compiler will put the boolean in a register and it will never hit memory at all. It depends on what other code is happening around the code you've provided.
If you reference that variable later on, you might run out of registers and it would end up on the stack. In either case it would never go in the heap, so your memory profile will be identical either way.
Currently I am working on a bit of code which (I believe) requires quite a few embedded if statements. Is there some standard to how many if statements to embed? Most of my googling has turned up things dealing with Excel... don't know why.
If there is a standard, why? Is it for readability or is it to keep code running more smoothly? In my mind, it makes sense that it would be mainly for readability.
An example of my if-structure:
if (!all_fields_are_empty):
if (id_search() && validId()):
// do stuff
else if (name_search):
if (name_exists):
if (match < 1):
// do stuff
else:
// do stuff
else if (name_search_type_2):
if (exists):
if (match < 1):
// do stuff
else:
// do stuff
else:
// you're stupid
I have heard that there's a limit to 2-3 nested for/while loops, but is there some standard for if-statements?
Update:
I have some years under my belt now. Please don't use this many if statements. If you need this many, your design is probably bad. Today, I LOVE when I can find an elegant way to do these things with minimal if statements or switch cases. The code ends up cleaner, easier to test, and easier to maintain. Normally.
As Randy mentioned, the cause of this kind of code is in most cases a poor design of an application. In a case like yours I usually try to use "processor" classes.
For example, given that there is some generic parameter named "operation" and 30 different operations with different parameters, you could make an interface:
interface OperationProcessor {
boolean validate(Map<String, Object> parameters);
boolean process(Map<String, Object> parameters);
}
Then implement lots of processors for each operation you need, for example:
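class PrinterProcessor implements OperationProcessor {
    // Interface methods are implicitly public, so the implementations must be public too.
    public boolean validate(Map<String, Object> parameters) {
        return (parameters.get("outputString") != null);
    }
    public boolean process(Map<String, Object> parameters) {
        System.out.println(parameters.get("outputString"));
        return true; // report success; the method is declared to return boolean
    }
}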
Next step - you register all your processors in a map when the application is initialized:
public void init() {
this.processors = new HashMap<String, OperationProcessor>();
this.processors.put("print",new PrinterProcessor());
this.processors.put("name_search", new NameSearchProcessor());
....
}
So your main method becomes something like this:
String operation = (String) parameters.get("operation"); //For example it could be 'name_search'
OperationProcessor processor = this.processors.get(operation);
if (processor != null && processor.validate(parameters)) { //Such operation is registered, and it validated all parameters as appropriate
    processor.process(parameters);
} else {
    System.out.println("You are dumb");
}
Sure, this is just an example, and your project would require a slightly different approach, but I guess it could be similar to what I described.
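Put together, the pieces above could look roughly like the following self-contained sketch. The Dispatcher class and its handle method are names I made up to hold the registry and the lookup, only the "print" operation is registered, and the parameters are assumed to arrive as a Map<String, Object> as in the interface.

```java
import java.util.HashMap;
import java.util.Map;

interface OperationProcessor {
    boolean validate(Map<String, Object> parameters);
    boolean process(Map<String, Object> parameters);
}

class PrinterProcessor implements OperationProcessor {
    @Override
    public boolean validate(Map<String, Object> parameters) {
        return parameters.get("outputString") != null;
    }

    @Override
    public boolean process(Map<String, Object> parameters) {
        System.out.println(parameters.get("outputString"));
        return true;
    }
}

// Hypothetical holder for the registry; one map lookup replaces the chain of if/else branches.
class Dispatcher {
    private final Map<String, OperationProcessor> processors = new HashMap<>();

    Dispatcher() {
        processors.put("print", new PrinterProcessor());
    }

    // Returns true only if the operation is registered, validates, and processes successfully.
    boolean handle(Map<String, Object> parameters) {
        Object operation = parameters.get("operation");
        OperationProcessor processor = processors.get(operation);
        if (processor != null && processor.validate(parameters)) {
            return processor.process(parameters);
        }
        return false;
    }
}
```

Adding a new operation then means writing one more processor class and one more put call, without touching handle.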
I don't think there is a limit, but I wouldn't recommend embedding more than two - it's too hard to read, difficult to debug and hard to unit test. Consider taking a look at a couple of great books like Refactoring, Design Patterns, and maybe Clean Code.
Technically, I am not aware of any limitation to nesting.
It might be an indicator of poor design if you find yourself going very deep.
Some of what you posted looks like it may be better served as a case statement.
I would be concerned with readability, and code maintenance for the next person which really means it will be difficult - even for the first person (you) - to get it all right in the first place.
edit:
You may also consider having a class that is something like SearchableObject(). You could make a base class of this with common functionality, then inherit for ID, Name, etc, and this top level control block would be drastically simplified.
Technically you can have as many as you like but if you have a lot it can quickly make the code unreadable.
What I'd normally do is something like:
if(all_fields_are_empty) {
    abuseuser();
    return;
}
if(id_search() && validId()) {
    //do stuff
    return;
}
if(name_search)
{
    if(name_exists) {
        //do stuff
        return;
    } else {
        //do stuff
        return;
    }
}
I'm sure you get the picture
Tl;Dr You don't really want any more than 10-15 paths through any one method
What you're essentially referring to here is Cyclomatic complexity.
Cyclomatic complexity is a software metric (measurement), used to
indicate the complexity of a program. It is a quantitative measure of
the number of linearly independent paths through a program's source
code. It was developed by Thomas J. McCabe, Sr. in 1976.
So every if statement is potentially a new path through your code and increases its Cyclomatic complexity. There are tools that will measure this for you and highlight areas of high complexity for potential refactoring.
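As a rough worked example, take a cut-down version of the search logic from the question. The ComplexityExample class and describe method are invented for illustration, and the count assumes the common convention of treating each if and each && as one decision point; individual tools may count slightly differently.

```java
class ComplexityExample {
    // Decision points: the first if (1), the && inside it (2), the else-if (3).
    // Cyclomatic complexity under the assumed convention: 3 + 1 = 4.
    static String describe(boolean idSearch, boolean validId, boolean nameSearch) {
        if (idSearch && validId) {
            return "search by id";
        } else if (nameSearch) {
            return "search by name";
        }
        return "nothing to search";
    }
}
```

Each further nested if in the original structure adds one more to that number, which is how a method drifts past the 10-15 mark.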
Is there some standard to how many if statements to embed?
Yes and no. It's generally regarded (and McCabe himself argued) that a Cyclomatic complexity of over about 10 or 15 is too high and a sign that the code should be refactored.
One of McCabe's original applications was to limit the complexity of
routines during program development; he recommended that programmers
should count the complexity of the modules they are developing, and
split them into smaller modules whenever the cyclomatic complexity of
the module exceeded 10.[2] This practice was adopted by the NIST
Structured Testing methodology, with an observation that since
McCabe's original publication, the figure of 10 had received
substantial corroborating evidence, but that in some circumstances it
may be appropriate to relax the restriction and permit modules with a
complexity as high as 15. As the methodology acknowledged that there
were occasional reasons for going beyond the agreed-upon limit, it
phrased its recommendation as: "For each module, either limit
cyclomatic complexity to [the agreed-upon limit] or provide a written
explanation of why the limit was exceeded."[7]
This isn't really a hard rule though and can be disregarded in some circumstances. See this question What is the highest Cyclomatic Complexity of any function you maintain? And how would you go about refactoring it?.
why? Is it for readability or is it to keep code running more
smoothly?
Essentially this is for readability, which should make your code run smoothly. To quote Martin Fowler
Any fool can write code that a computer can understand. Good
programmers write code that humans can understand.
The only technical limit to the number of nested if/else blocks in Java will probably be the size of your stack. Style is another matter.
Btw: What's with the colons?