Can anyone show me how to control Java's order of execution while still organizing the code in a simplified, easy-to-maintain structure?
I am programming a rather complicated algorithm in Java. One method grew to several hundred lines, to the point where isolating critical parts of the code for fine-tuning the algorithm became too labor-intensive. To simplify the code, I identified each section that could be resolved to a single variable and moved each of those sections into its own method, which is then called from the formerly complicated method. This is good, because it makes the code readable and much easier to maintain.
The problem is that now I am getting some errors that indicate that the calling method is continuing to execute subsequent code before some of the earlier methods have returned their values.
Here is an example in code:
void myMethod(Double numb) {
    double first = getFirst(numb);
    double second = getSecond(numb);
    double third = getThird(numb);
    double anAside = getAnAside(first, second, third);
    double fourth = getFourth(numb);
}
The error messages that come up have to do with things happening in getFourth(numb) at the same time as I am getting System.out.println() results in the Eclipse console indicating that getFirst(numb) is still running. Back when all the contents of getFirst(numb), getSecond(numb), getThird(numb), getAnAside(first, second, third), and getFourth(numb) were inside myMethod(numb), I was not getting the same evidence that code blocks were running out of order. (Because there were no sub-methods, the code was all in one long block.) However, the code was difficult to read. How can I change myMethod(numb) above so that each method must fully return before moving on to the next, while still keeping the code easy to read?
This is a bit of guesswork, but I have known IDEs to display stdout output in the wrong order. I suggest you eliminate this first (even if it is unlikely), either by running Java from a proper terminal or writing the logs to disk.
Try flushing your System.out between each method like this:
void myMethod(Double numb) {
    double first = getFirst(numb);
    System.out.flush();
    double second = getSecond(numb);
    System.out.flush();
    double third = getThird(numb);
    System.out.flush();
    double anAside = getAnAside(first, second, third);
    System.out.flush();
    double fourth = getFourth(numb);
}
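If you want to take the IDE console out of the equation entirely, you could also redirect System.out to a file with autoflush enabled, so that every println is written to disk immediately and in execution order. This is a minimal sketch; the file name debug.log is just a placeholder:
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;

public class LogToDisk {
    public static void main(String[] args) throws FileNotFoundException {
        // Redirect stdout to a file, with autoflush enabled so every println
        // hits the disk immediately and in the order it was executed.
        PrintStream fileOut = new PrintStream(new FileOutputStream("debug.log"), true);
        System.setOut(fileOut);

        System.out.println("This line goes to debug.log, not the console");
    }
}
If the lines in the file appear in the expected order, the problem is the console display rather than the execution order.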
As Matt B mentions, if you are only using one thread then it is probably something faulty with your logging.
I usually find that, in terms of program structure, algorithms are never as complicated as people make them out to be. All you have to do is talk it out and write as you go. Say my fridge needs milk, so I need an algorithm to go get milk from the store:
public void goGetMilk() {
    getMoney();
    getCarKey();
    driveToStore();
    findMilk();
    buyMilk();
    driveBack();
    putMilkInFridge();
}
Then each nested method can in turn be broken into chunks until I have a complete program.
As far as order of execution goes, if you have one thread, it's impossible to have race conditions. If you are using multiple threads, you need to synchronize access to shared resources and set up rendezvous points.
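For the multi-threaded case, here is a minimal sketch of what a rendezvous point can look like (the worker's calculation is just a stand-in); Thread.join() blocks until the worker finishes and also guarantees that its writes are visible afterwards:
public class Rendezvous {
    // Shared resource: written by the worker thread, read by the main thread.
    private static double result;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            // Stand-in for an expensive calculation.
            result = 42.0;
        });

        worker.start();
        worker.join(); // rendezvous point: block until the worker has finished

        // join() establishes a happens-before edge, so the write above is visible here.
        System.out.println("Worker produced: " + result);
    }
}
If several threads keep reading and writing the same data while they run, you would additionally guard that data with synchronized blocks or a lock.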
public void bad() {
final ConcurrentMap<String, Integer> chm = new ConcurrentHashMap<>();
final String key = "1";
chm.computeIfAbsent(key, __ -> {
chm.remove(key);
return 1;
});
}
Of course I understand this looks very silly. It is just a super-simplified version of some problematic code I was dealing with. I understand it makes no sense to do this, but I am trying to understand the behaviour it causes.
When running this code you get stuck in an infinite loop at line 1107 of the JDK 8 ConcurrentHashMap source, after invoking computeIfAbsent: http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/concurrent/ConcurrentHashMap.java#l1096
I am finding it very difficult to understand exactly what is happening to cause this. The same behaviour occurs when the removal is done on a separate thread that the mapping function waits for:
public void bad2() {
final ConcurrentMap<String, Integer> chm = new ConcurrentHashMap<>();
final String key = "1";
Thread worker = new Thread(() -> chm.remove("1"));
chm.computeIfAbsent(key, __ -> {
worker.start();
try {
worker.join();
} catch (InterruptedException e) {
e.printStackTrace();
}
return 1;
});
}
Why, in both cases, doesn't chm.remove(key) simply complete normally (returning null), after which the value 1 is inserted?
Interestingly, this was addressed at some point: the first example throws java.lang.IllegalStateException: Recursive update when I run it with Java 17.
This is called a 'contract error'. This happens when javadoc explicitly tells you NOT to do thing X, and leaves it at that; it does not specify what happens if you do X, just that you shouldn't do that. In this case, X is 'update the map in your passed-in function', and the javadoc explicitly spells out: DO NOT.
so the computation should be short and simple,
and must not attempt to update any other mappings of this map.
When you perform such a contract error, anything can happen. The spec is clear: Don't. So, if you do, the spec essentially claims the ability to do anything. Hard-crash, whistle a dixie tune from the speakers, you name it.
That is also why the behaviour changed. Ordinarily, Java does not change its behaviour without quite a big ordeal about breaking compatibility, but here the specified behaviour has not changed: the spec merely says 'do not do this; this method does not perform according to spec if you fail to heed the warning'. Both 'loop endlessly' and 'throw an exception' are just 'doing unspecified broken stuff' in this regard.
Okay, but why does it endlessly loop?
Because ConcurrentHashMap is 'smart' and uses a retry/CAS update model. Instead of acquiring a bunch of locks, it just tries the operation without them, but then checks during/afterwards whether it actually succeeded, or whether, because other threads were modifying the same map at the same time in the same general area, its write got overwritten or otherwise didn't apply, in which case it does it again. In this case, removing the key essentially eliminates a marker, which makes CHM think it updated the bin concurrently with another thread and therefore that it should try again. Forever and ever.
That's what the 'cas' in casTabAt (line 1656 in your linked source file) stands for: Compare-And-Set. This is a concurrency primitive that can be a lot faster than locks: "If the current value is X, then set it to Y. Otherwise, do not set it at all. Tell me whether you set it or not" - all of that in one atomic operation, which is speedy because CPUs tend to support it directly in machine code. No lock acquisition required. The general principle is to check what the current value is, do some bookkeeping, then set the new value, using CAS to ensure that the 'state' you checked is still the state you're in. If not, some other thread happened to also be updating things, so start over.
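To see the retry/CAS pattern in isolation, here is a simplified sketch using AtomicLong rather than ConcurrentHashMap's internals:
import java.util.concurrent.atomic.AtomicLong;

public class CasCounter {
    private final AtomicLong counter = new AtomicLong();

    public long incrementWithRetry() {
        while (true) {
            long current = counter.get();   // observe the current state
            long next = current + 1;        // do the bookkeeping
            // Install the new value only if nobody changed it in the meantime.
            if (counter.compareAndSet(current, next)) {
                return next;                // success
            }
            // Another thread won the race; observe the new state and try again.
        }
    }
}
AtomicLong.incrementAndGet() does essentially this loop for you; the point here is only to show why a CAS-based algorithm retries until its view of the state stops being invalidated - and why destroying that state from inside the operation can make it retry forever.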
That's just one implementation. Tomorrow, it can change. You cannot rely on 'it will loop endlessly' because the spec does not guarantee it, and indeed, in JDK 17, you get an exception instead.
I'm reading a blog post and trying to understand what's going on.
This is the blog post.
It has this code:
if (validation().hasErrors())
throw new IllegalArgumentException(validation().errorMessage());
In the validation() method we have some object initialization and calculations, so let's say it's an expensive call. Is it going to be executed twice? Or will it be optimized by the compiler into something like this?
var validation = validation();
if (validation.hasErrors())
throw new IllegalArgumentException(validation.errorMessage());
Thanks!
The validation method will be called twice, and it will do the same work each time. First, the method is relatively big, and so it won't get inlined. Without being inlined, the compiler doesn't know what it does. Therefore, it safely assumes that the method has side effects, and so it cannot optimize away the second call.
Even if the method was inlined, and the compiler could examine it, it would see that there are in fact side effects. Calling LocalDate.now() returns a different result each time. For this reason, the code that you linked to is defective, although it's not likely to experience a problem in practice.
It's safer to capture the validation result in a local variable not for performance reasons, but for stability reasons. Imagine the odd case in which the initial validation call fails, but the second call passes. You'd then throw an exception with no message in it.
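To make that concrete, here is a hypothetical, self-contained sketch of a time-dependent validation() (the blog's actual code is not reproduced here, and the sketch uses a record, so Java 16+ is assumed). Capturing the result once means the test and the message always come from the same object:
import java.time.LocalDate;

class ExpiryCheck {
    // Accessors hasErrors() and errorMessage() are generated by the record.
    record Validation(boolean hasErrors, String errorMessage) {}

    private final LocalDate expiry;

    ExpiryCheck(LocalDate expiry) {
        this.expiry = expiry;
    }

    // Hypothetical time-dependent validation: the result can differ between
    // calls because it depends on LocalDate.now().
    Validation validation() {
        boolean expired = expiry.isBefore(LocalDate.now());
        return new Validation(expired, expired ? "value has expired" : "");
    }

    void check() {
        // Call validation() once; the test and the message use the same result.
        Validation v = validation();
        if (v.hasErrors())
            throw new IllegalArgumentException(v.errorMessage());
    }
}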
The Java-to-bytecode compiler (javac) has a limited set of optimization techniques (e.g. a constant expression like 9*9 in the condition would be folded into 81).
The real optimization happens in the JIT (Just-In-Time) compiler. This compiler is the result of over a decade and a half of extensive research, and there is no simple answer to what it is capable of in every scenario.
With that being said, as a good practice, I always handle repetitive identical method calls by storing their result before approaching any loop structure where that result is needed. Example:
int[] grades = new int[500];
int countOfGrades = grades.length;
for (int i = 0; i < countOfGrades; i++) {
    // Some code here
}
For your code (which is only run twice), you shouldn't worry so much about such optimization. But if you're looking for a guaranteed optimization at the cost of a fraction of space (which is cheap), then you're better off using a variable to store any identical method result that is needed more than once:
var validation = validation();
if (validation.hasErrors())
throw new IllegalArgumentException(validation.errorMessage());
However, I must simply question ... "these days," does it even actually matter anymore? Simply write the source-code "in the most obvious manner available," as the original programmer certainly did.
"Microseconds" really don't matter anymore. But, "clarity still does." To me, the first version of the code is frankly more understandable than the second, and "that's what matters to me most." Please don't bother to try to "out-smart" the compiler, if it results in source-code that is in any way harder to understand.
As part of my AP curriculum I am learning Java, and while working on a project I wondered: which of the following is the best way to return a value?
public double getQuarters(){
return quarters;
}
or
public void getQuarters(){
System.out.println(quarters);
}
Note: I know that the second option is not technically returning a value, but it still shows me the value, so why bother?
Your answer would be correct. The second method doesn't return any value at all, so while you might be able to see the output, your program can't. The second method could still be useful for testing or even for a command line application, but it should be named something like printQuarters instead.
public double getQuarters(){
return quarters;
}
Use this in order to encapsulate quarters and hide it from being accessed directly by other classes. That means you have to declare the field as private. Let's look at the second option:
public void getQuarters(){
System.out.println(quarters);
}
However, this seems wrong, as getQuarters is not returning anything. Hence it would make more sense to refactor it as:
public void printQuarters(){
System.out.println(quarters);
}
You answered your own question. For most definitions of the word "best", you should go with the first option.
Your question, however, does touch on the object-oriented programming topic of accessors and mutators. In your example, "getQuarters" is an accessor. It is usually best to use accessors to retrieve your values. This is one way to adhere to the Open/Closed Principle.
Also, the Java community has a coding convention for this and many tools and libraries depend on code following those conventions.
If all you need to do is display the value when this method is called, and you are ok with console output, then your System.out.println method will do the job. HOWEVER, a function that actually returns the variable is much more semantically correct and useful.
For example, while you may only need to print the variable for your current project, what if you came back later and decided that you were instead going to output your variable to a file? If you wrote your getQuarters function with a println statement, you would need to rewrite the whole thing. On the other hand, if you wrote the function as a return, you wouldn't need to change anything. All you'd have to do is add new code for the file output, and consume the function where needed.
A returning function is therefore much more versatile, although more so in larger code projects.
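As a rough sketch (the class and file names here are made up), the same getter can feed the console today and a file tomorrow without any change to the class that owns the value:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class CoinJar {
    private double quarters = 7;

    public double getQuarters() {
        return quarters;
    }
}

class Report {
    public static void main(String[] args) throws IOException {
        CoinJar jar = new CoinJar();

        // The same getter serves both outputs; CoinJar never changes.
        System.out.println(jar.getQuarters());
        Files.writeString(Path.of("quarters.txt"), String.valueOf(jar.getQuarters()));
    }
}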
You return values to a specific point in your program, so that the program can use it to function.
You print values at a specific point in your program, so that you as an end user can see what value you got back for some function.
Depending on the function - for instance, yours - the result of quarters is no longer regarded in the program; all it did was print a value to the screen, and the application doesn't have a clean or easy way to get that value back in order to use it.
If your program needs the value to function, then it must be a return. If you need to debug, then you can use System.out.println() where necessary.
However, more times than not, you will be using the return statement.
Option 1 is far superior.
It can be easily Unit Tested.
What if the spec changes and sometimes you want to print the result, other times put it into a database? Option 1 splits apart the logic of obtaining the value from what to do with it. Now, for a single method getQuarters no big deal, but eventually you may have getDimes, getEuros, etc...
What if there may be an error condition on quarters, like the value is illegal? In option 1, you could return a "special" value, like -1.0, or throw an Exception. The client then decides what to do.
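A rough sketch of the exception route (the validity rule here is only an example):
class CoinCounter {
    private double quarters;

    public double getQuarters() {
        // Reject an illegal state instead of returning a sentinel like -1.0;
        // the caller decides how to handle it.
        if (quarters < 0) {
            throw new IllegalStateException("quarters must not be negative: " + quarters);
        }
        return quarters;
    }
}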
Class A
class A {
    public HashMap<Integer, Double> myHashMap;

    public A() {
        myHashMap = new HashMap<>();
    }
}
class B
class B {
    private A anInstanceOfA;

    public B(A a) {
        this.anInstanceOfA = a;
    }

    void aMethod() {
        anInstanceOfA.myHashMap.get(1); // getting the hashmap value for key = 1
        // proceed to use this value, but instead of storing it in a variable
        // I use anInstanceOfA.myHashMap.get(1) each time I need that value.
    }
}
In aMethod() I use anInstanceOfA.myHashMap.get(1) to get the value for key = 1. I do that multiple times in aMethod() and I'm wondering if there is any difference in efficiency between using anInstanceOfA.myHashMap.get(1) multiple times or just assigning it to a variable and using the assigned variable multiple times.
I.e.:
void aMethod() {
    Double theValue = anInstanceOfA.myHashMap.get(1);
    // proceed to use theValue in my calculations. Is there a difference in efficiency?
}
In theory the JVM can optimise the difference away to be very small (compared to what the rest of the program is doing). However, I prefer to make it a local variable, as I believe it makes the code clearer (and I can give it a meaningful name).
I suggest you do what you believe is simpler and clearer, unless you have measured a performance difference.
The question seems to be that you want to know whether it is more expensive to call get(1) multiple times instead of just once.
The answer to this is yes. The question is whether it is enough to matter. The definitive answer is to ask the JVM by profiling. You can, however, guess by looking at the get method in your chosen implementation and considering whether you want to do all that work every time.
Note, that there is another reason that you might want to put the value in a variable, namely that you can give it a telling name, making your program easier to maintain in the future.
This seems like a micro-optimization, that really doesn't make much difference in the scheme of things.
As @Peter already suggested, 'optimizing' for style/readability is a better rationale for choosing the second option over the first one. Optimizing for speed only starts making sense if you really make a lot of calls, or if the call is very expensive - both are probably not the case in your current example.
Put it in a local variable (see the sketch after this list), for multiple reasons:
It will be much faster. Reading a local variable is definitely cheaper than a HashMap lookup, probably by a factor of 10-100x.
You can give the local variable a good, meaningful name
Your code will probably be shorter / simpler overall, particularly if you use the local variable many times.
You may get bugs during future maintenance if someone modifies one of the get calls but forgets to change the others. This is a problem whenever you are duplicating code. Using a local variable minimises this risk.
In concurrent situations, the value could theoretically change if the HashMap is modified by some other code. You normally want to get the value once and work with the same value. Although if you are running into problems of this nature you should probably be looking at other solutions first (locking, concurrent collections etc.)
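As a sketch, reusing class B from the question (the names conversionRate and amount are made up for illustration):
double aMethod(double amount) {
    // Look the value up once and give it a meaningful name.
    Double conversionRate = anInstanceOfA.myHashMap.get(1);

    double converted = amount * conversionRate;
    double withMargin = converted * 1.05;
    return converted + withMargin; // every use shares the single lookup
}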
I have a variable that very rarely gets an incorrect value. Since the system is quite complex, I'm having trouble tracing all the code paths that value goes through - there are multiple threads involved, it can be saved and then loaded from a DB, and so on. I'm going to try to use a code graph generator to see if I can spot the problem by looking at the ways the setter can be called, but maybe there's some other technique. Perhaps wrapping the value with a class that traces the places and changes it goes through? I'm not sure the question is clear enough, but I'd appreciate input from somebody who has encountered such a situation.
[Edit] The problem is not easily reproducible and I can't catch it in a debugger. I'm looking for a static analysis or logging technique to help track down the issue.
[Edit 2] Just to make things clearer, the value I'm talking about is a timestamp represented as the number of milliseconds from the Unix epoch (01/01/1970) in a 64-bit long variable. At some unknown point the top 32 bits of the value are truncated generating completely incorrect (and unrecoverable) timestamps.
[Edit 3] OK, thanks to some of your suggestions and to a couple of hours of poring over the code, I found the culprit. The millisecond-based timestamp was converted into a second-based timestamp by dividing it by 1000 and was stored in an int variable. At a later point in the code, the second-based timestamp (an int) was multiplied by 1000 and stored into a new long variable. Since both 1000 and the second-based timestamp were int values, the result of the multiplication was truncated before being converted to long. This was a subtle one; thanks to everyone who helped.
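For reference, here is the overflow in a minimal form (the timestamp is whatever System.currentTimeMillis() happens to return):
public class TruncationDemo {
    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        int seconds = (int) (millis / 1000);    // still fits in an int (until 2038)

        long broken = seconds * 1000;    // int * int overflows BEFORE the widening to long
        long correct = seconds * 1000L;  // the long literal forces a long multiplication

        System.out.println("broken  = " + broken);
        System.out.println("correct = " + correct);
    }
}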
If you are using a setter, and only a setter, to set your value, you can add these lines in order to track the thread and stack trace:
public void setTimestamp(long value) {
    if (log.isDebugEnabled()) {
        log.debug("Setting the value to " + value + ". Old value is " + this.timestamp);
        log.debug("Thread is " + Thread.currentThread().getName());
        log.debug("Stacktrace is", new Throwable()); // we could also iterate over Thread.currentThread().getStackTrace()
    }
    // check for a bad value (top 32 bits all zero)
    if ((value & 0xffffffff00000000L) == 0L) {
        log.warn("Danger Will Robinson", new IllegalValueException());
    }
    this.timestamp = value;
}
Also, go over the class that contains the field and make sure that every reference to it goes through the setter (even in private/protected methods).
Edit
Perhaps FindBugs can help in terms of static analysis, I'll try to find the exact rule later.
The fact that 32 bits of the long get changed, rather than the whole value, suggests strongly that this is a threading problem (two threads updating the variable at the same time). Since Java does not guarantee atomic access to a long value (unless it is declared volatile), if two threads update it at the same time, it could end up with half the bits set one way and half the other. This means that the best way to approach the issue is from a threading point of view. Odds are that there is nothing setting the variable in a way that a static analysis tool will show you is an incorrect value; rather, the synchronization and locking strategy around this variable needs to be examined for potential holes.
As a quick fix, you could wrap that value in an AtomicLong.
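A minimal sketch of that quick fix (the class and field names are placeholders):
import java.util.concurrent.atomic.AtomicLong;

class TimestampHolder {
    // Reads and writes of an AtomicLong are always atomic, so no thread can
    // ever observe a half-updated 64-bit value.
    private final AtomicLong timestamp = new AtomicLong();

    public void setTimestamp(long value) {
        timestamp.set(value);
    }

    public long getTimestamp() {
        return timestamp.get();
    }
}
Declaring the field as a volatile long would also make single reads and writes atomic; AtomicLong additionally gives you atomic read-modify-write operations.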
I agree - if the value is only changed via a setter (no matter what the origin), and it had better be, then the best way is to modify the setter to do the tracking for you (print a stack trace at every set, possibly only when the value being set is a specific one, if that cuts down on the clutter).
Multithreaded programming is just hard, but there are IDE tools to help. If you have IntelliJ IDEA, you can use the analyze dataflow feature to work out where things get changed. It won't show you a live flow (it's a static analysis tool), but it can give you a great start.
Alternatively, you can use some Aspects and just print out the value of the variable everywhere, but the resulting debugging info will be too overwhelming to be that meaningful.
The solution is to avoid state shared between threads. Use immutable objects, and program functionally.
Two things:
First, to me, it smells as though some caller is treating their timestamp in an integer context, losing your high 32 bits. It may be, as Yishai surmised, threading-related, but I'd look first at the operations being performed. However, naturally, you need to ensure that your value is being updated atomically - whether with an AtomicLong, as he suggested, or with some other mechanism.
That speculation aside, given that what you're losing is the high 32 bits, and you know it's milliseconds since the epoch, your setter can enforce validity: if the supplied value is less than the timestamp at program start, it's wrong, so reject it, and of course, print a stack trace.
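A rough sketch of that sanity check (it assumes the timestamps being stored should never predate program start, which may not hold for your data):
class TimestampGuard {
    // Any timestamp written after startup is expected to be at least this large.
    private static final long PROGRAM_START = System.currentTimeMillis();

    private volatile long timestamp;

    public void setTimestamp(long value) {
        if (value < PROGRAM_START) {
            // Reject the suspicious value and record who tried to set it.
            new IllegalArgumentException("suspicious timestamp: " + value).printStackTrace();
            return;
        }
        this.timestamp = value;
    }
}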
1) Supposing that foo is the name of your variable, you could add something like this to the setter method:
try {
    throw new Exception();
}
catch (Exception e) {
    System.out.println("foo == " + foo);
    e.printStackTrace();
}
How well this will work depends on how frequently the setter is being called. If it's being called thousands of times over the run of your program, you might have trouble finding the bad value in all the stack traces. (I've used this before to troubleshoot a problem like yours. It worked for me.)
2) If you can run your app in a debugger and you can identify programmatically bad values for your variable, then you could set a breakpoint in the setter, conditional on whatever it is that makes the value bad. But this requires that you can write a test for badness, which maybe you can't do.
3) Since you said (in a subsequent edit) that the problem is the high 32 bits being zeroed, you can specifically test for that before printing your stack trace. That should cut down the amount of debugging output enough to be manageable.
In your question, you speak of a "variable" that has an incorrect value and suggest that you could try "wrapping the value with a class". Perhaps I'm reading too much into your choice of words, but I would like to see a bit more about the design.
Is the value in question a primitive? Is it a field of a large, complex object that is shared between threads? If it is a field of some object, is that object a DTO or does it implement domain-specific behavior?
In general, I'd agree with the previous comments re instrumenting the object of which the "variable" is a field, but more information about the nature and usage of this variable would help guide more precise suggestions.
Based on your description, I don't know whether it's feasible to actually debug the app in real time, but if it is, then depending on your IDE there are a number of debugging options available.
I know that with Eclipse, you can set conditional breakpoints in the setter method for example. You can specify to suspend only when the value gets set to a specific value, and you can also filter by thread, in case you want to focus on a specific thread.
I would rather keep a breakpoint inside the setter. Eclipse allows you to do that.
There are some IDEs which allow you to halt the program (wait before executing the next instruction) when the value of a variable changes.
IMO the best way to debug this type of problem is using a field modification breakpoint (especially if you're using reflection extensively).
I'm not sure how to do this in Eclipse, but in IntelliJ you can just right-click on the field and do an "add breakpoint".