Java AtomicInteger vs one element array for streams - java

int[] arr = new int[]{0};
l.stream().forEach(x -> {if (x > 10 && x < 15) { arr[0] += 1;}});
l is List<Integer>. Here I use one element arr array to store value that is changed inside the stream. An alternative solution is to use an instance of AtomicInteger class. But I don't understand what is the difference between these two approaches in terms of memory usage, time complexity, safety...
Please note: I am not trying to use AtomicInteger (or array) in this particular piece of code. This code is used only as an example. Thanks!

Knowing which is the best way is important and @rzwitserloot's explanation covers that in great detail. In your specific example, you could avoid the issue by doing it like this.
List<Integer> list = List.of(1,2,11,12,15,11,11,9,10,2,3);
int count = list.stream().filter(x->x > 10 && x < 15).reduce(0, (a,b)->a+1);
// or
int count = list.stream().filter(x->x > 10 && x < 15).mapToInt(x->1).sum();
Both return the value 4
In the first example, reduce sets an initial value of 0 and then adds 1 to it (b is syntactically required but not used). To sum the actual elements rather than 1, replace 1 with b in the reduce method.
In the second example, the values are replaced with 1 in the stream and then summed. Since the method sum() doesn't exist for streams of objects, the 1 needs to be mapped to an int to create an IntStream. To sum the actual elements here, use mapToInt(x->x).
As suggested in the comments, you can also do it like this.
long count = list.stream().filter(x->x > 10 && x < 15).count();
count() returns a long so it would have to be down cast to an int if that is what you want.

You should always use AtomicInteger:
The performance impact is negligible. Technically, new int[1] is 'faster', but they are the same size, or, the array is actually larger in heap (but unlikely; depends on your OS architecture, usually they'd end up being the same size), and the array does not spend any cycles on guaranteeing proper concurrency protections, but there are really only two options: [A] the concurrency protections are required (because it's a lambda that runs in another thread), and thus the int array is a non-starter; it would result in hard to find bugs, quite horrible, or [B] they aren't required, and the hotspot engine is likely going to figure that out and eliminate this cost entirely. Even if it doesn't, the overhead of concurrency protection when there is no contention is low in any case.
It is more readable. Only slightly so, but new int[1] is weirder than new AtomicInteger(), I'd say. AtomicInteger at least suggests: I want a mutable int that I'm going to mess with from other contexts.
It is more convenient. System.out.println-ing an AtomicInteger prints the value. Printing an array prints garbage (its type and identity hash code, not its contents).
The convenience methods in AtomicInteger might be relevant. Maybe compareAndSet is useful.
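To make the safety argument above concrete, here is a minimal sketch comparing the two holders. The class and method names are hypothetical, and the array variant is deliberately kept sequential because arr[0] += 1 is a plain read-modify-write that can lose updates on a parallel stream.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class CounterHolders {
    // Counts values strictly between 10 and 15 using an AtomicInteger holder.
    // incrementAndGet() is atomic, so the result is correct even on a parallel stream.
    static int countAtomic(List<Integer> values) {
        AtomicInteger counter = new AtomicInteger();
        values.parallelStream().forEach(x -> {
            if (x > 10 && x < 15) {
                counter.incrementAndGet();
            }
        });
        return counter.get();
    }

    // Same logic with a one-element array. This is only safe because the
    // stream is sequential and the lambda runs on the calling thread.
    static int countArraySequential(List<Integer> values) {
        int[] arr = {0};
        values.stream().forEach(x -> {
            if (x > 10 && x < 15) {
                arr[0] += 1;
            }
        });
        return arr[0];
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 11, 12, 15, 11, 11, 9, 10, 2, 3);
        System.out.println(countAtomic(values));
        System.out.println(countArraySequential(values));
    }
}
```

Both methods agree for this input; the difference only shows up once the lambda can run on more than one thread.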
But why?
Lambdas are not transparent in the following 3 things:
Checked exceptions (you cannot throw a checked exception inside a lambda even if the context around your lambda catches it).
Mutable local vars (you cannot touch, let alone change, any variable declared outside of the lambda, unless it is (effectively) final).
Control flow. You can't use break, continue, or return from inside a lambda and have it act like it wasn't: You can't break or continue a loop located outside of your lambda and you can't return from the method outside of your lambda (you can only return from the lambda itself).
These are all very bad things when the lambda runs 'in context', but they are all very good things when the lambda doesn't run in context.
Here is an example:
new TreeSet<Integer>((a, b) -> a - b);
Here I have created a TreeSet (which is a set that keeps its elements sorted automatically). To make one, you pass in code that determines for any 2 elements which one is 'the higher one', and TreeSet takes care of everything else. That TreeSet can survive your method (just store it in a field or pass it to a method that ends up storing it in a field) and could even escape your thread (have another thread read that field). That means when that code (a - b in this code) is invoked, we could be 5 days from the creation of that TreeSet, in another thread, with the code that 'surrounds' your new TreeSet statement having loooong gone.
In this scenario, all those transparencies make no sense at all:
What does it mean to break back to a loop that has long since completed and the system doesn't even know what it is about anymore?
That catch block uses context that is long gone, such as local vars or the parameters. It can't survive, so if your a - b were to throw something that is checked, the fact that you've wrapped your new TreeSet<> in a try/catch block is meaningless.
What does it mean to access a variable that no longer exists? For that matter, if it still does exist but the lambda runs in a separate thread, do we now start making local vars volatile and declare them on heap instead of stack just in case?
Of course, if your lambda runs within context, as in, you pass the lambda to some method and that method 'uses it or loses it': Runs your lambda a certain amount of times and then forgets all about it, then those lacking transparencies are really annoying.
It's annoying that you can't do this:
public List<String> toLines(List<Path> files) throws IOException {
    return files.stream()
        .filter(x -> x.toString().endsWith(".txt"))
        .flatMap(x -> Files.readAllLines(x).stream()) // does not compile: unhandled IOException
        .toList();
}
The only reason the above code fails is that Files.readAllLines() throws IOException. We declared that we throw this onwards but that won't work. You have to kludge up this code, make it bad, by trying to somehow transit that exception out of the lambda or otherwise work around it (the right answer is NOT to use the stream API at all here, write it with a normal for loop!).
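For reference, a minimal sketch of the plain-loop version the paragraph above recommends. The class name LineReader is hypothetical; the method keeps the signature from the failing example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class LineReader {
    // Plain loop: the IOException from Files.readAllLines propagates normally
    // through the method's throws clause, with no lambda in the way.
    static List<String> toLines(List<Path> files) throws IOException {
        List<String> allLines = new ArrayList<>();
        for (Path file : files) {
            if (file.toString().endsWith(".txt")) {
                allLines.addAll(Files.readAllLines(file));
            }
        }
        return allLines;
    }
}
```

The caller decides how to handle the IOException, which is exactly what the stream version could not express.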
Whilst trying to dance around checked exceptions in lambdas is generally just not worth it, you CAN work around the problem of wanting to share a variable with outer context:
int sum = 0;
listOfInts.forEach(x -> sum += x);
The above doesn't work - sum is from the outer scope and thus must be effectively final, and it isn't. There's no particular reason it can't work, but java won't let you. The right answer here is to use int sum = listOfInts.stream().mapToInt(Integer::intValue).sum(); instead, but you can't always find a terminal op that just does what you want. Sometimes you need to kludge around it.
That's where new int[1] and AtomicInteger come in. These are references - and the reference is final, so you CAN use them in the lambda. But the reference points at an object and you can change it at will, hence, you can use this 'trick' to 'share' a variable:
AtomicInteger sum = new AtomicInteger();
listOfInts.forEach(x -> sum.addAndGet(x));
That DOES work.

Related

Mutating `free variables` of Lambda Expressions

I'm reading this fantastic article about Lambda Expressions and the following is unclear to me:
Does a lambda expression save the value of the free variables or a reference/pointer to each of them? (I guess the answer is the latter, because otherwise mutating free variables would be valid.)
Don't count on the compiler to catch all concurrent access errors. The
prohibition against mutation holds only for local variables.
I'm not sure that self experimenting would cover all the cases so I'm searching for a well defined rules about:
What free variables can be mutated inside the lambda expression (static/properties/local variables/parameters), and which can be mutated outside while being used inside a lambda expression?
Can I mutate every free variable after the end of the block of a lambda expression, after I used it (read it or called one of its methods) inside the lambda expression?
Don't count on the compiler to catch all concurrent access errors. The
prohibition against mutation holds only for local variables.
If matches is an instance or static variable of an enclosing class, then
no error is reported, even though the result is just as undefined.
Is the result of the mutation undefined even when I use a synchronization algorithm?
Update 1:
free variables - that is, the variables that are not parameters and not defined inside the code.
In simple words I can conclude that Free variables are all the variables that are not parameters of the Lambda Expression and are not defined inside the same Lambda Expression ?
This looks like complicated "words" on a simpler topic. The rules are pretty much the same as for anonymous classes.
For example the compiler catches this:
int x = 3;
Runnable r = () -> {
x = 6; // Local variable x defined in an enclosing scope must be final or effectively final
};
But at the same time it is perfectly legal to do this(from a compiler point of view):
final int x[] = { 0 };
Runnable r = () -> {
x[0] = 6;
};
The example that you provided and uses matches:
List<Path> matches = new ArrayList<>();
List<Path> files = List.of();
for (Path p : files) {
new Thread(() -> {
if (1 == 1) {
matches.add(p);
}
}).start();
}
has the same problem. The compiler does not complain about you modifying matches (you are not reassigning the reference matches, so it is effectively final), but at the same time this can have undefined results. This operation has side effects and is discouraged in general.
The undefined results come from the fact that matches is an ArrayList, which is not a thread-safe collection.
And your last point: "Is the result of the mutation undefined even when I use a synchronization algorithm?" Of course not. With proper synchronization, updating a variable outside a lambda (or a stream) will work, but it is discouraged, mainly because there are usually other ways to achieve the same result.
EDIT
OK, so free variables are those that are not defined within the lambda code itself or are not the parameters of the lambda itself.
In this case the answer to 1) would be: lambda expressions are de-sugared to methods and the rules for free variables are the same as for anonymous classes. This has been discussed numerous times, like here. This actually answers the second question as well, since the rules are the same. Anything that is final or effectively final can be captured. For primitives this means they can't be mutated; for objects you can't reassign the references (but you can change the underlying data, as shown in my example). For 3): yes.
Your term “free variables” is misleading at best. If you’re not talking about local variables (which must be effectively final to be captured), you are talking about heap variables.
Heap variables might be instance fields, static fields or array elements. For unqualified access to instance variables from the surrounding context, the lambda expression may (and will) access them via the captured this reference. For other instance fields, as well as array elements, you need an explicit access via a variable anyway, so it’s clear, how the heap variable will be accessed. Only static fields are accessed directly.
The rules are simple, unless being declared final, you can modify all of them, inside or outside the lambda expression. Keep in mind that lambda expressions can call arbitrary methods, containing arbitrary code anyway. Whether this will cause problems, depends on how you use the lambda expressions. You can even create problems with functions not directly modifying a variable, without any concurrency, e.g.
ArrayList<String> list=new ArrayList<>(Arrays.asList("foo", "bar"));
list.removeIf(s -> list.remove("bar"));
may throw a java.util.ConcurrentModificationException due to the list modification in an ongoing iteration.
Likewise, modifying a variable or resource in a concurrent context might break it, even if you made sure that the modification of the variable itself has been done in a thread-safe manner. It’s all about the contracts of the API you are using.
Most notably, when using parallel Streams, you have to be aware that functions are not only evaluated by different threads, they are also evaluating arbitrary elements of the Stream, regardless of their encounter order. For the final result of the Stream processing, the implementation will assemble partial results in a way that reestablishes the encounter order, if necessary, but the intermediate operations evaluate the elements in an arbitrary order, hence your functions must not only be thread safe, but also not rely on a particular processing order. In some cases, they may even process elements not contributing to the final result.
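The point about encounter order can be illustrated with a small sketch (hypothetical class and method names). The first method has no side effects and lets collect() reassemble the result; the second is thread safe thanks to a synchronized list, yet still makes no promise about element order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class EncounterOrderDemo {
    // Side-effect free: collect() assembles partial results in encounter order,
    // even though the mapping function runs on several threads in arbitrary order.
    static List<Integer> doubledCollected(List<Integer> input) {
        return input.parallelStream().map(x -> x * 2).collect(Collectors.toList());
    }

    // Side-effecting variant: the synchronized list makes add() thread safe,
    // but the order in which elements arrive is unspecified.
    static List<Integer> doubledViaSideEffect(List<Integer> input) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        input.parallelStream().forEach(x -> out.add(x * 2));
        return out;
    }
}
```

Callers of the second method may only rely on the contents of the returned list, not on its ordering.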
Since your bullet 3 refers to “after the end of a block”, I want to emphasize that it is irrelevant at which place inside your lambda expression the modification (or perceivable side effect) happens.
Generally, you are better off with functions not having such side effects. But this doesn’t imply that they are forbidden in general.

How to increment a value in Java Stream?

I want to increment the value of index by 1 with each iteration. This is easily achieved in a for-loop. The variable image is an array of ImageView.
Here is my for-loop.
for (Map.Entry<String, Item> entry : map.entrySet()) {
image[index].setImage(entry.getValue().getImage());
index++;
}
In order to practise Stream, I have tried to rewrite it to the Stream:
map.entrySet().stream()
.forEach(e -> image[index++].setImage(e.getValue().getImage()));
Causing me the error:
error: local variables referenced from a lambda expression must be final or effectively final
How can I rewrite the Stream so that the variable index is incremented as it is used?
You shouldn't. These two look similar, but they are conceptually different. The loop is just a loop, but a forEach instructs the library to perform the action on each element, without specifying either the order of actions (for parallel streams) or the threads which will execute them. If you use forEachOrdered, then there are still no guarantees about threads, but at least you have the guarantee of happens-before relationship between actions on subsequent elements.
Note especially that the docs say:
For any given element, the action may be performed at whatever time
and in whatever thread the library chooses. If the action accesses
shared state, it is responsible for providing the required
synchronization.
As @Marko noted in the comments below, though, it only applies to parallel streams, even if the wording is a bit confusing. Nevertheless, using a loop means that you don't even have to worry about all this complicated stuff!
So the bottom line is: use loops if that logic is a part of the function it's in, and use forEach if you just want to tell Java to “do this and that” to elements of the stream.
That was about forEach vs loops. Now on the topic of why the variable needs to be final in the first place, and why you can do that to class fields and array elements. It's because, like it says, Java has the limitation that anonymous classes and lambdas can't access a local variable unless it never changes. Meaning not only they can't change it themselves, but you can't change it outside them as well. But that only applies to local variables, which is why it works for everything else like class fields or array elements.
The reason for this limitation, I think, is lifetime issues. A local variable exists only while the block containing it is executing. Everything else exists while there are references to it, thanks to garbage collection. And that everything else includes lambdas and anonymous classes too, so if they could modify local variables which have different lifetime, that could lead to problems similar to dangling references in C++. So Java took the easy way out: it simply copies the local variable at the time the lambda / anonymous class is created. But that would lead to confusion if you could change that variable (because the copy wouldn't change, and since the copy is invisible it would be very confusing). So Java just forbids any changes to such variables, and that's that.
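The copy-at-creation behaviour described above can be observed indirectly. In this minimal sketch (hypothetical names), the first lambda captures an effectively final copy and never sees later changes to the original variable, while the second reads an array element that lives on the heap and therefore does see the update.

```java
import java.util.function.IntSupplier;

final class CaptureDemo {
    // The lambda captures the value of 'snapshot' when it is created.
    // 'counter' changes afterwards, but the lambda never sees that.
    static int capturedCopy() {
        int counter = 1;
        final int snapshot = counter;   // effectively final copy, safe to capture
        IntSupplier supplier = () -> snapshot;
        counter = 2;                    // legal: 'counter' itself is not captured
        return supplier.getAsInt();     // still 1
    }

    // The array reference is captured by value too, but the element it points
    // at is on the heap, so the lambda observes the later write.
    static int sharedHolder() {
        int[] holder = {1};
        IntSupplier supplier = () -> holder[0];
        holder[0] = 2;
        return supplier.getAsInt();     // 2
    }
}
```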
There are many questions on the final variables and anonymous classes discussed already, like this one.
Some kind of "zip" operation would be helpful here, though standard Stream API lacks it. Some third-party libraries extending Stream API provide it, including my free StreamEx library:
IntStreamEx.ints() // get stream of numbers 0, 1, 2, ...
.boxed() // box them
.zipWith(StreamEx.ofValues(map)) // zip with map values
.forKeyValue((index, item) -> image[index].setImage(item.getImage()));
See zipWith documentation for more details. Note that your map should have meaningful order (like LinkedHashMap), otherwise this would be pretty useless...
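If adding a library is not an option, one possible standard-library workaround is to stream over the indices instead of the entries. This is only a sketch: it uses String values in place of the question's ImageView and Item types, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.IntStream;

final class IndexedCopy {
    // Copies the map's values into 'target' by position. The index comes from
    // the IntStream itself, so no mutable local variable is captured.
    static void copyValues(Map<String, String> map, String[] target) {
        List<String> values = new ArrayList<>(map.values());
        IntStream.range(0, Math.min(values.size(), target.length))
                 .forEach(i -> target[i] = values.get(i));
    }
}
```

As with the zip approach, the result is only meaningful when the map has a defined iteration order, such as a LinkedHashMap.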

Calling a method vs assigning the return type

I would like to know which one is good. I am writing a for loop. In the condition part I am using str.length(). I wonder is this a good idea. I can also assign the value to an integer variable and use it in the loop.
Which one is the suitable/better way?
If you use str.length() more than once or twice in the code, it's logical to extract it to a local var simply for brevity's sake. As for performance, it will most probably be exactly the same because the JIT compiler will inline that call, so the native code will be as if you have used a local variable.
There is no distinct downside to calling a function in the loop condition expression in the sense that "you really should never do it". You want to watch out when calling functions that have side effects, but even that can be acceptable in some circumstances.
There are three major reasons for moving function calls out of the loop (including the loop condition expressions):
Performance. The function may (depending on the JIT compiler) get called for every iteration of the loop, which costs you execution time. Particularly if the function's code has a higher order of complexity than O(1) after the first execution, this will increase the execution time. By how much depends entirely on exactly what the function in question does and how it is implemented.
Side effects. If the function has any side effects, those may (will) be executed repeatedly. This might be exactly what you want, but you need to be aware of it. A side effect is basically something that is observable outside of the function that is being called; for example, disk or network I/O are often considered to be side effects. A function that simply performs calculations on already available data is generally a pure function.
Code clarity. Admittedly str.length() isn't very long, but if you have a complex calculation based around a function call in the loop conditional, code clarity can very easily suffer. For this reason it may be advantageous to move the loop termination condition calculation out of the loop condition expression itself. Beware of awakening the sleeping beast, however; make very sure that the refactored code actually is more readable.
For str.length() it doesn't really matter unless you are really after the last bit of performance you can get, particularly because, as has been pointed out by other answerers, String#length() is an O(1) operation. In the general case, if you need the additional performance, consider introducing a variable to hold the result of the function call and comparing against that rather than making the function call repeatedly.
Personally, I'd consider code clarity before worrying about micro-optimizations like exactly where to place a specific function call. But if you have everything else down and still need to ooze a little bit more performance out of the code, moving the function call out of the condition expression and using a local variable (preferably of a primitive type) is something worth considering. Chances are, though, that if you are worried about that, you'll see bigger gains by considering a different algorithm. (Do you really need to iterate over the string the way you are doing? Is there no other way to do what you are after?)
It usually doesn't matter. Use whichever makes your code clearer.
If a value is going to be used more than once, then there are two advantages to assigning it to a local variable:
You can give the variable a good name, which makes your code easier to read and understand
You can sometimes avoid a small amount of overhead by calling the method only once. This helps performance (although the difference is often too small to be noticeable - if in doubt you should benchmark)
Note: This advice only applies to pure functions. You need to be much more careful if the function has side effects, or might return a different value each time (like Math.random()) - in these cases you need to think much more carefully about the effect of multiple function calls.
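The side-effect caveat can be made visible with a deliberately impure bound. In this sketch, bound() is a hypothetical method that counts its own invocations, so the difference between calling it in the loop condition and hoisting it into a local variable is observable.

```java
final class LoopConditionDemo {
    private static int calls = 0;

    // Hypothetical bound with a visible side effect: it counts its own invocations.
    static int bound() {
        calls++;
        return 3;
    }

    // Returns how many times bound() ran when it sits in the loop condition.
    static int callsWhenInCondition() {
        calls = 0;
        for (int i = 0; i < bound(); i++) {
            // loop body intentionally empty
        }
        return calls;
    }

    // Same loop with the bound hoisted into a local variable.
    static int callsWhenHoisted() {
        calls = 0;
        int limit = bound();
        for (int i = 0; i < limit; i++) {
            // loop body intentionally empty
        }
        return calls;
    }
}
```

With three iterations the condition is checked four times, so the first variant invokes bound() four times and the second only once.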
Calling length costs O(1) since the length is stored as a member - It's a constant operation, don't waste your time thinking about complexity and performance of this thing.
There is no difference at all between the two.
But suppose the length of str changes; with a hard-coded bound you need to manually change the value in the for loop.
For example:
String str="hi";
so in the for loop you write it this way
for (int i=0;i<str.length();i++)
{
}
or
for (int i=0;i<2;i++)
{
}
Now suppose you want to change str to String str="hi1";
so in the for loop
for (int i=0;i<3;i++)
{
}
So I would suggest you go for str.length()
If you use str.length() in the loop condition, it is evaluated on every iteration. You can assign this value to a variable and use that in the for loop instead.
for(int i=0; i<str.length();i++){ // str.length() evaluated on every iteration
}
int k=str.length(); // evaluated only once
for(int i=0;i<k;i++){
}
If you are concerned about performance, you may use the second approach.
If you are using str.length() in the code more than one time then you need to assign it to another variable and use it. Otherwise you can use str.length() itself.
Reason for this
Each time we call a method, the current position is stored in a data structure (the call stack), control goes to the called method, and it performs its operations.
Then control comes back, the current position is retrieved from that data structure, and normal execution continues.
That is what actually happens. So when we do it many times in a program, this overhead is incurred each time.
Therefore we can create a local variable, assign the result to it, and use it wherever it is needed in the program.

How bad is declaring arrays inside a for loop in Java?

I come from a C background, so I admit that I'm still struggling with letting go of memory management when writing in Java. Here's one issue that's come up a few times that I would love to get some elaboration on. Here are two ways to write the same routine, the only difference being when double[] array is declared:
Code Sample 1:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Code Sample 2:
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
if (someFunctionOnArrays(array)) {
// DO ONE THING
} else {
// DO SOME OTHER THING
}
}
Here, private double[] calculateSomethingAndReturnAnArray(int i) always returns an array of the same length. I have a strong aversion to Code Sample 2 because it creates a new array for each iteration when it could just overwrite the existing array. However, I think this might be one of those times when I should just sit back and let Java handle the situation for me.
What are the reasons to prefer one of the ways over the other or are they truly identical in Java?
There's nothing special about arrays here because you're not allocating for the array, you're just creating a new variable, it's equivalent to:
Object foo;
for(...){
foo = func(...);
}
In the case where you create the variable outside the loop, the variable (which will hold the location of the thing it refers to) will only ever be allocated once. In the case where you create the variable inside the loop, the variable may be reallocated in each iteration, but my guess is the compiler or the JIT will fix that in an optimization step.
I'd consider this a micro-optimization. If you're running into problems with this segment of your code, you should be making decisions based on measurements rather than on the specs alone. If you're not running into issues with this segment of code, you should do the semantically correct thing and declare the variable in the scope that makes sense.
See also this similar question about best practices.
A declaration of a local variable without an initializing expression will do NO work whatsoever. The work happens when the variable is initialized.
Thus, the following are identical with respect to semantics and performance:
double[] array;
for (int i=0; i<n; ++i) {
array = calculateSomethingAndReturnAnArray(i);
// ...
}
and
for (int i=0; i<n; ++i) {
double[] array = calculateSomethingAndReturnAnArray(i);
// ...
}
(You can't even quibble that the first case allows the array to be used after the loop ends. For that to be legal, array has to have a definite value after the loop, and it doesn't unless you add an initializer to the declaration; e.g. double[] array = null;)
To elaborate on @Mark Elliot's point about micro-optimization:
This is really an attempt to optimize rather than a real optimization, because (as I noted) it should have no effect.
Even if the Java compiler actually emitted some non-trivial executable code for double[] array;, the chances are that the time to execute would be insignificant compared with the total execution time of the loop body, and of the application as a whole. Hence, this is most likely to be a pointless optimization.
Even if this is a worthwhile optimization, you have to consider that you have optimized for a specific target platform; i.e. a particular combination of hardware and JVM version. Micro-optimizations like this may not be optimal on other platforms, and could in theory be anti-optimizations.
In summary, you are most likely wasting your time if you focus on things like this when writing Java code. If performance is a concern for your application, focus on the MACRO level performance; e.g. things like algorithmic complexity, good database / query design, patterns of network interactions, and so on.
Both create a new array for each iteration. They have the same semantics.

Java - Common Gotchas [closed]

Closed 11 years ago.
In the same spirit of other platforms, it seemed logical to follow up with this question: What are common non-obvious mistakes in Java? Things that seem like they ought to work, but don't.
I won't give guidelines as to how to structure answers, or what's "too easy" to be considered a gotcha, since that's what the voting is for.
See also:
Perl - Common gotchas
.NET - Common gotchas
"a,b,c,d,,,".split(",").length
returns 4, not 7 as you might (and I certainly did) expect. split ignores all trailing empty Strings returned. That means:
",,,a,b,c,d".split(",").length
returns 7! To get what I would think of as the "least astonishing" behaviour, you need to do something quite astonishing:
"a,b,c,d,,,".split(",",-1).length
to get 7.
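A small sketch of the three cases, wrapped in hypothetical helper methods so the behaviour of the limit argument is easy to compare:

```java
final class SplitDemo {
    // Default split: trailing empty strings are dropped from the result.
    static int defaultLimit(String s) {
        return s.split(",").length;
    }

    // A negative limit keeps trailing empty strings.
    static int negativeLimit(String s) {
        return s.split(",", -1).length;
    }

    public static void main(String[] args) {
        System.out.println(defaultLimit("a,b,c,d,,,"));   // 4
        System.out.println(defaultLimit(",,,a,b,c,d"));   // 7
        System.out.println(negativeLimit("a,b,c,d,,,"));  // 7
    }
}
```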
Comparing equality of objects using == instead of .equals() -- which behaves completely differently for primitives.
This gotcha ensures newcomers are befuddled when "foo" == "foo" but new String("foo") != new String("foo").
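The difference can be reproduced with two tiny helpers (hypothetical names). String literals are interned, so two identical literals share one object, while each new String(...) creates a separate one.

```java
import java.util.Objects;

final class StringEqualityDemo {
    // Reference comparison: true only if both variables point at the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return Objects.equals(a, b);
    }
}
```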
I think a very sneaky one is the String.substring method. In older JDKs (before Java 7 update 6), this re-uses the same underlying char[] array as the original string with a different offset and length.
This can lead to very hard-to-see memory problems. For example, you may be parsing extremely large files (XML perhaps) for a few small bits. If you have converted the whole file to a String (rather than used a Reader to "walk" over the file) and use substring to grab the bits you want, you are still carrying around the full file-sized char[] array behind the scenes. I have seen this happen a number of times and it can be very difficult to spot.
In fact this is a perfect example of why interface can never be fully separated from implementation. And it was a perfect introduction (for me) a number of years ago as to why you should be suspicious of the quality of 3rd party code.
Overriding equals() but not hashCode()
It can have really unexpected results when using maps, sets or lists.
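As an illustration, here is a minimal value class that overrides both methods consistently. GridPoint is a hypothetical name; the comment marks the override whose absence causes the surprising behaviour in hash-based collections.

```java
import java.util.Objects;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    // Without this override, two equal points would almost always land in
    // different hash buckets, and HashSet.contains or HashMap.get would miss them.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With both overrides in place, a HashSet containing new GridPoint(1, 2) reports that it contains another new GridPoint(1, 2).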
SimpleDateFormat is not thread safe.
There are two that annoy me quite a bit.
Date/Calendar
First, the Java Date and Calendar classes are seriously messed up. I know there are proposals to fix them, I just hope they succeed.
Calendar.get(Calendar.DAY_OF_MONTH) is 1-based
Calendar.get(Calendar.MONTH) is 0-based
Auto-boxing preventing thinking
The other one is Integer vs int (this goes for any primitive version of an object). This is specifically an annoyance caused by not thinking of Integer as different from int (since you can treat them the same much of the time due to auto-boxing).
int x = 5;
int y = 5;
Integer z = new Integer(5);
Integer t = new Integer(5);
System.out.println(5 == x); // Prints true
System.out.println(x == y); // Prints true
System.out.println(x == z); // Prints true (auto-boxing can be so nice)
System.out.println(5 == z); // Prints true
System.out.println(z == t); // Prints false: z and t are two distinct objects
Since z and t are objects, even though they hold the same value, they are different objects (always with new Integer; with auto-boxing it depends on the Integer cache). What you really meant is:
System.out.println(z.equals(t)); // Prints true
This one can be a pain to track down. You go debugging something, everything looks fine, and you finally end up finding that your problem is that 5 != 5 when both are objects.
Being able to say
List<Integer> stuff = new ArrayList<Integer>();
stuff.add(5);
is so nice. It made Java so much more usable to not have to put all those "new Integer(5)"s and "((Integer) list.get(3)).intValue()" lines all over the place. But those benefits come with this gotcha.
Try reading Java Puzzlers which is full of scary stuff, even if much of it is not stuff you bump into every day. But it will destroy much of your confidence in the language.
List<Integer> list = new java.util.ArrayList<Integer>();
list.add(1);
list.remove(1); // throws...
The old APIs were not designed with boxing in mind, so they overload on both primitives and objects: here remove(int index) is chosen over remove(Object o), and index 1 is out of bounds.
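A short sketch of how to pick each overload on purpose (hypothetical helper names; both methods work on a copy so the input list is left alone):

```java
import java.util.ArrayList;
import java.util.List;

final class RemoveOverloadDemo {
    // remove(Object): removes the first element equal to the given value.
    static List<Integer> removeByValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    // remove(int): removes the element at the given position.
    static List<Integer> removeByIndex(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }
}
```

Boxing the argument explicitly with Integer.valueOf is what steers the call to remove(Object).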
This one I just came across:
double[] aList = new double[400];
List l = Arrays.asList(aList);
//do intense stuff with l
Anyone see the problem?
What happens is, Arrays.asList() expects an array of object types (Double[], for example). It'd be nice if it just threw an error for the previous code. However, asList() can also take arguments like so:
Arrays.asList(1, 9, 4, 4, 20);
So what the code does is create a List with one element - a double[].
I should've figured when it took 0ms to sort a 750000 element array...
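The effect, and one way around it, can be sketched as follows (hypothetical class and method names). Boxing through a DoubleStream is just one option; it produces one list element per array element.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class AsListDemo {
    // Arrays.asList treats the double[] as a single vararg element of type double[].
    static int sizeViaAsList(double[] values) {
        List<double[]> wrapped = Arrays.asList(values);
        return wrapped.size();
    }

    // Boxing through a DoubleStream yields one list element per array element.
    static List<Double> toBoxedList(double[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }
}
```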
This one has stumped me a few times, and I've heard of quite a few experienced Java devs wasting a lot of time on it.
ClassNotFoundException --- you know that the class is in the classpath BUT you are NOT sure why the class is NOT getting loaded.
Actually, this class has a static block. There was an exception in the static block and someone swallowed the exception. They should NOT. They should let it surface as an ExceptionInInitializerError. So always look out for static blocks tripping you up. It also helps to move any code in static blocks into static methods, so that debugging is much easier with a debugger.
Floats
I don't know how many times I've seen
floata == floatb
where the "correct" test should be
Math.abs(floata - floatb) < 0.001
I really wish BigDecimal with a literal syntax was the default decimal type...
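A minimal sketch of both approaches mentioned above. The tolerance EPSILON is a hypothetical value chosen for illustration; a suitable tolerance depends on the magnitude of the numbers being compared.

```java
import java.math.BigDecimal;

final class FloatCompareDemo {
    // Hypothetical tolerance; choose one that fits the magnitude of your data.
    static final double EPSILON = 1e-9;

    // Tolerance-based comparison for binary floating point values.
    static boolean nearlyEqual(double a, double b) {
        return Math.abs(a - b) < EPSILON;
    }

    // BigDecimal built from strings represents decimal fractions exactly.
    static boolean exactDecimalSum() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }
}
```

The classic case is 0.1 + 0.2, which is not == 0.3 as a double but passes both checks above.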
Not really specific to Java, since many (but not all) languages implement it this way, but the % operator isn't a true modulo operator in the way it works with negative numbers: the result takes the sign of the dividend. This makes it a remainder operator, and can lead to some surprises if you aren't aware of it.
The following code would appear to print either "even" or "odd" but it doesn't.
public static void main(String[] args)
{
String a = null;
int n = "number".hashCode();
switch( n % 2 ) {
case 0:
a = "even";
break;
case 1:
a = "odd";
break;
}
System.out.println( a );
}
The problem is that the hash code for "number" is negative, so the n % 2 operation in the switch is also negative. Since there's no case in the switch to deal with the negative result, the variable a never gets set. The program prints out null.
Make sure you know how the % operator works with negative numbers, no matter what language you're working in.
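One possible fix, assuming Java 8 or later, is Math.floorMod, whose result takes the sign of the divisor. The class and method names below are hypothetical.

```java
public class ModuloDemo {
    // With a divisor of 2, Math.floorMod always returns 0 or 1,
    // even when n is negative.
    static String parity(int n) {
        return Math.floorMod(n, 2) == 0 ? "even" : "odd";
    }

    public static void main(String[] args) {
        System.out.println(-7 % 2);                 // -1
        System.out.println(Math.floorMod(-7, 2));   // 1
        System.out.println(parity(-7));             // odd
        System.out.println(parity("number".hashCode()));
    }
}
```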
Manipulating Swing components from outside the event dispatch thread can lead to bugs that are extremely hard to find. This is something even we (as seasoned programmers with 3 and 6 years of Java experience, respectively) forget frequently! Sometimes these bugs sneak in after having written the code correctly and then refactoring carelessly afterwards...
See this tutorial for why you must.
Immutable strings, which means that certain methods don't change the original object but instead return a modified object copy. When starting with Java I used to forget this all the time and wondered why the replace method didn't seem to work on my string object.
String text = "foobar";
text.replace("foo", "super");
System.out.print(text); // still prints "foobar" instead of "superbar"
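I think a big gotcha that would always stump me when I was a young programmer was the ConcurrentModificationException when removing from a list that you are iterating over:
List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
Iterator<String> it = list.iterator();
while (it.hasNext()) {
String s = it.next(); // the second call throws ConcurrentModificationException
// some code that does some stuff
list.remove(0); // BOOM! (reported by the next call to it.next())
}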
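One way around it is to remove through the iterator itself. This is only a sketch; dropShort and its parameters are made-up names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SafeRemovalDemo {
    // Returns a copy of input without the strings shorter than minLength.
    static List<String> dropShort(List<String> input, int minLength) {
        List<String> list = new ArrayList<>(input);
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().length() < minLength) {
                it.remove(); // removes the element just returned by next()
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(dropShort(Arrays.asList("a", "bbb", "cc", "dddd"), 3)); // [bbb, dddd]
    }
}
```

On Java 8 and later, list.removeIf(s -> s.length() < minLength) does the same thing in one call.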
If you have a method that has the same name as the class BUT also has a return type, it looks like a constructor (to a noob), but it is NOT one.
passing arguments to the main method -- it takes some time for noobs to get used to.
passing . as the argument to classpath for executing a program in the current directory.
Realizing that the name of an Array of Strings is not obvious
hashCode and equals : a lot of java developers with more than 5 years experience don't quite get it.
Set vs List
Till JDK 6, Java did not have NavigableSets to let you easily iterate through a Set and Map.
Integer division
1/2 == 0 not 0.5
Using the ? generics wildcard.
People see it and think they have to, e.g. use a List<?> when they want a List they can add anything to, without stopping to think that a List<Object> already does that. Then they wonder why the compiler won't let them use add(), because a List<?> really means "a list of some specific type I don't know", so the only thing you can do with that List is get Object instances from it.
(un)Boxing and Long/long confusion. Contrary to pre-Java 5 experience, you can get a NullPointerException on the 2nd line below.
Long msec = getSleepMsec();
Thread.sleep(msec);
If getSleepMsec() returns null, unboxing throws.
The default hashCode() is non-deterministic, so if it is used for objects in a HashMap, the ordering of entries in that map can change from run to run.
As a simple demonstration, the following program can give different results depending on how it is run:
public static void main(String[] args) {
System.out.println(new Object().hashCode());
}
How much memory is allocated to the heap, or whether you're running it within a debugger, can both alter the result.
When you create a duplicate or slice of a ByteBuffer, it does not inherit the value of the order property from the parent buffer, so code like this will not do what you expect:
ByteBuffer buffer1 = ByteBuffer.allocate(8);
buffer1.order(ByteOrder.LITTLE_ENDIAN);
buffer1.putInt(2, 1234);
ByteBuffer buffer2 = buffer1.duplicate();
System.out.println(buffer2.getInt(2));
// Output is "-771489792", not "1234" as expected
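A possible workaround is to copy the order over explicitly after duplicating. The sketch below wraps the snippet above in a hypothetical helper so both behaviours can be compared.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteBufferOrderDemo {
    // Writes 1234 in little-endian order, then reads it back through a duplicate.
    static int readViaDuplicate(boolean copyOrder) {
        ByteBuffer buffer1 = ByteBuffer.allocate(8);
        buffer1.order(ByteOrder.LITTLE_ENDIAN);
        buffer1.putInt(2, 1234);
        ByteBuffer buffer2 = buffer1.duplicate(); // the duplicate starts out BIG_ENDIAN
        if (copyOrder) {
            buffer2.order(buffer1.order());       // carry the parent's order over by hand
        }
        return buffer2.getInt(2);
    }

    public static void main(String[] args) {
        System.out.println(readViaDuplicate(false)); // -771489792
        System.out.println(readViaDuplicate(true));  // 1234
    }
}
```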
Among the common pitfalls, well known but still occasionally biting programmers, there is the classic if (a = b), which is found in all C-like languages.
In Java, it compiles only if a and b are boolean, of course. But too often I see newbies testing with if (a == true) (while if (a) is shorter, more readable and safer...) and occasionally writing if (a = true) by mistake, wondering why the test doesn't work.
For those not getting it: the last statement first assigns true to a, then does the test, which always succeeds!
-
One that bites a lot of newbies, and even some distracted, more experienced programmers (I found it in our code), is if (str == "foo"). Note that I have always wondered why Sun overloaded the + sign for strings but not ==, at least for simple cases (case sensitive).
For newbies: == compares references, not the content of the strings. You can have two strings with the same content stored in different objects (different references), so == will be false.
Simple example:
final String F = "Foo";
String a = F;
String b = F;
assert a == b; // Works! They refer to the same object
String c = "F" + F.substring(1); // Still "Foo"
assert c.equals(a); // Works
assert c == a; // Fails
-
And I also saw if (a == b & c == d) or something like that. It works (curiously), but we lose the short-circuit behavior of the logical operator (don't try to write: if (r != null & r.isSomething())!).
For newbies: when evaluating a && b, Java doesn't evaluate b if a is false. In a & b, Java evaluates both parts and then does the operation, so the second part can fail.
[EDIT] Good suggestion from J Coombs, I updated my answer.
The non-unified type system contradicts the object orientation idea. Even though everything doesn't have to be heap-allocated objects, the programmer should still be allowed to treat primitive types by calling methods on them.
The generic type system implementation with type erasure is horrible, and throws most students off when they learn about generics for the first time in Java: Why do we still have to typecast if the type parameter is already supplied? Yes, they ensured backward compatibility, but at a rather silly cost.
Going first, here's one I caught today. It had to do with Long/long confusion.
public void foo(Object obj) {
if (grass.isGreen()) {
Long id = grass.getId();
foo(id);
}
}
private void foo(long id) {
Lawn lawn = bar.getLawn(id);
if (lawn == null) {
throw new IllegalStateException("grass should be associated with a lawn");
}
}
Obviously, the names have been changed to protect the innocent :)
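For anyone who doesn't spot it: with a Long argument, the compiler prefers the overload that needs no unboxing, so foo(id) above resolves to foo(Object) and the public method calls itself forever instead of reaching foo(long). A stripped-down sketch with hypothetical names:

```java
public class OverloadDemo {
    static String describe(Object obj) {
        return "Object";
    }

    static String describe(long id) {
        return "long";
    }

    public static void main(String[] args) {
        Long boxed = 42L;
        long primitive = 42L;
        // A Long is already an Object, so no unboxing is attempted.
        System.out.println(describe(boxed));              // Object
        System.out.println(describe(primitive));          // long
        // Unboxing by hand selects the primitive overload.
        System.out.println(describe(boxed.longValue()));  // long
    }
}
```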
Another one I'd like to point out is the (too prevalent) drive to make APIs generic. Using well-designed generic code is fine. Designing your own is complicated. Very complicated!
Just look at the sorting/filtering functionality in the new Swing JTable. It's a complete nightmare. It's obvious that you are likely to want to chain filters in real life but I have found it impossible to do so without just using the raw typed version of the classes provided.
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("Asia/Hong_Kong")).getTime());
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("America/Jamaica")).getTime());
The output is the same.
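The reason is that getTime() returns a java.util.Date, which holds only an instant and no time zone, so printing it uses the JVM default zone. One way to see the zones differ is to put the zone on the formatter; the class name, helper name and pattern below are just assumptions for the sketch.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class TimeZoneFormatDemo {
    // Formats the same instant as wall-clock time in the given zone.
    static String format(Date instant, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm z");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(instant);
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(format(now, "Asia/Hong_Kong"));
        System.out.println(format(now, "America/Jamaica"));
    }
}
```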
I had some fun debugging a TreeSet once, as I was not aware of this information from the API:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/TreeSet.html
Objects with correct equals/hashcode implementations were being added and never seen again as the compareTo implementation was inconsistent with equals.
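A well-known case from the JDK itself is BigDecimal, where equals() takes the scale into account but compareTo() does not. The sketch below (class and method names are hypothetical) shows a TreeSet silently dropping an element that a HashSet keeps.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class TreeSetConsistencyDemo {
    // Adds 1.0 and 1.00 to the given set and reports how many elements it holds.
    static int sizeAfterAdding(Set<BigDecimal> set) {
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00")); // equals() says different, compareTo() says same
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(new HashSet<>())); // 2
        System.out.println(sizeAfterAdding(new TreeSet<>())); // 1
    }
}
```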
IMHO
1. Using vector.add(Collection) instead of vector.addAll(Collection). The first adds the collection object itself to the vector, and the second adds the contents of the collection.
2. Though not related to programming exactly, the use of xml parsers that come from multiple sources like xerces, jdom. Relying on different parsers and having their jars in the classpath is a nightmare.
