Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
Improve this question
I understand that every time I type the string literal "", the same String object is referenced in the string pool.
But why doesn't the String API include a public static final String Empty = "";, so I could use references to String.Empty?
It would save on compile time, at the very least, since the compiler would know to reference the existing String, and not have to check if it had already been created for reuse, right? And personally I think a proliferation of string literals, especially tiny ones, in many cases is a "code smell".
So was there a Grand Design Reason behind no String.Empty, or did the language creators simply not share my views?
String.EMPTY is 12 characters, and "" is two, and they would both be referencing exactly the same instance in memory at runtime. I'm not entirely sure why String.EMPTY would save on compile time, in fact I think it would be the latter.
Especially considering Strings are immutable, it's not like you can first get an empty String, and perform some operations on it - best to use a StringBuilder (or StringBuffer if you want to be thread-safe) and turn that into a String.
Update
From your comment to the question:
What inspired this is actually
TextBox.setText("");
I believe it would be totally legitimate to provide a constant in your appropriate class:
private static final String EMPTY_STRING = "";
And then reference it as in your code as
TextBox.setText(EMPTY_STRING);
As this way at least you are explicit that you want an empty String, rather than you forgot to fill in the String in your IDE or something similar.
Use org.apache.commons.lang.StringUtils.EMPTY
If you want to compare with empty string without worrying about null values you can do the following.
if ("".equals(text))
Ultimately you should do what what you believe is clearest. Most programmers assume "" means empty string, not a string someone forgot to put anything into.
If you think there is a performance advantage, you should test it. If you don't think its worth testing for yourself, its a good indication it really isn't worth it.
It sounds like to you try to solve a problem which was solved when the language was designed more than 15 years ago.
Don't just say "memory pool of strings is reused in the literal form, case closed". What compilers do under the hood is not the point here. The question is reasonable, specially given the number of up-votes it received.
It's about the symmetry, without it APIs are harder to use for humans. Early Java SDKs notoriously ignored the rule and now it's kind of too late. Here are a few examples on top of my head, feel free to chip in your "favorite" example:
BigDecimal.ZERO, but no AbstractCollection.EMPTY, String.EMPTY
Array.length but List.size()
List.add(), Set.add() but Map.put(), ByteBuffer.put() and let's not forget StringBuilder.append(), Stack.push()
Apache StringUtils addresses this problem too.
Failings of the other options:
isEmpty() - not null safe. If the
string is null, throws an NPE
length() == 0 - again not null safe.
Also does not take into account
whitespace strings.
Comparison to EMPTY constant - May
not be null safe. Whitespace problem
Granted StringUtils is another library to drag around, but it works very well and saves loads of time and hassle checking for nulls or gracefully handling NPEs.
Seems like this is the obvious answer:
String empty = org.apache.commons.lang.StringUtils.EMPTY;
Awesome because "empty initialization" code no longer has a "magic string" and uses a constant.
If you really want a String.EMPTY constant, you can create an utility static final class named "Constants" (for example) in your project. This class will maintain your constants, including the empty String...
In the same idea, you can create ZERO, ONE int constants... that don't exist in the Integer class, but like I commented, it would be a pain to write and to read :
for(int i=Constants.ZERO; ...) {
if(myArray.length > Constants.ONE) {
System.out.println("More than one element");
}
}
Etc.
All those "" literals are the same object. Why make all that extra complexity? It's just longer to type and less clear (the cost to the compiler is minimal). Since Java's strings are immutable objects, there's never any need at all to distinguish between them except possibly as an efficiency thing, but with the empty string literal that's not a big deal.
If you really want an EmptyString constant, make it yourself. But all it will do is encourage even more verbose code; there will never be any benefit to doing so.
To add on to what Noel M stated, you can look at this question, and this answer shows that the constant is reused.
http://forums.java.net/jive/message.jspa?messageID=17122
String constant are always "interned"
so there is not really a need for such
constant.
String s=""; String t=""; boolean b=s==t; // true
I understand that every time I type the String literal "", the same String object is referenced in the String pool.
There's no such guarantee made. And you can't rely on it in your application, it's completely up to jvm to decide.
or did the language creators simply not share my views?
Yep. To me, it seems very low priority thing.
Late answer, but I think it adds something new to this topic.
None of the previous answers has answered the original question. Some have attempted to justify the lack of a constant, while others have showed ways in which we can deal with the lack of the constant. But no one has provided a compelling justification for the benefit of the constant, so its lack is still not properly explained.
A constant would be useful because it would prevent certain code errors from going unnoticed.
Say that you have a large code base with hundreds of references to "". Someone modifies one of these while scrolling through the code and changes it to " ". Such a change would have a high chance of going unnoticed into production, at which point it might cause some issue whose source will be tricky to detect.
OTOH, a library constant named EMPTY, if subject to the same error, would generate a compiler error for something like EM PTY.
Defining your own constant is still better. Someone could still alter its initialization by mistake, but because of its wide use, the impact of such an error would be much harder to go unnoticed than an error in a single use case.
This is one of the general benefits that you get from using constants instead of literal values. People usually recognize that using a constant for a value used in dozens of places allows you to easily update that value in just one place. What is less often acknowledged is that this also prevents that value from being accidentally modified, because such a change would show everywhere. So, yes, "" is shorter than EMPTY, but EMPTY is safer to use than "".
So, coming back to the original question, we can only speculate that the language designers were probably not aware of this benefit of providing constants for literal values that are frequently used. Hopefully, we'll one day see string constants added in Java.
For those claiming "" and String.Empty are interchangeable or that "" is better, you are very wrong.
Each time you do something like myVariable = ""; you are creating an instance of an object.
If Java's String object had an EMPTY public constant, there would only be 1 instance of the object ""
E.g: -
String.EMPTY = ""; //Simply demonstrating. I realize this is invalid syntax
myVar0 = String.EMPTY;
myVar1 = String.EMPTY;
myVar2 = String.EMPTY;
myVar3 = String.EMPTY;
myVar4 = String.EMPTY;
myVar5 = String.EMPTY;
myVar6 = String.EMPTY;
myVar7 = String.EMPTY;
myVar8 = String.EMPTY;
myVar9 = String.EMPTY;
10 (11 including String.EMPTY) Pointers to 1 object
Or: -
myVar0 = "";
myVar1 = "";
myVar2 = "";
myVar3 = "";
myVar4 = "";
myVar5 = "";
myVar6 = "";
myVar7 = "";
myVar8 = "";
myVar9 = "";
10 pointers to 10 objects
This is inefficient and throughout a large application, can be significant.
Perhaps the Java compiler or run-time is efficient enough to automatically point all instances of "" to the same instance, but it might not and takes additional processing to make that determination.
Related
This question already has answers here:
Calling getters on an object vs. storing it as a local variable (memory footprint, performance)
(6 answers)
Closed 5 years ago.
To check a string input before using it, I often use code as shown in the first block. The second block does the same, but is there a difference in performance? (for example, is the textfield read twice in the first code block?) Also is one considered as bad coding or can I use both of them?
if (!textField.getText().equals("")) {
name = textField.getText();
}
or
String ipt = textField.getText();
if (!ipt.equals("")) {
name = ipt;
In terms of performance, the second is faster as it spares a duplicated processing.
But it is a so cheap task that you should execute it million times to see a difference.
Besides, modern JVMs performs many optimizations. Which could result in similar performance in both cases.
But in these two ways, I notice bad smells :
This way :
if (!textField.getText().equals("")) {
name = textField.getText();
}
introduces duplication.
It is generally unadvised as if we forget to change the duplicate everywhere we may introduce issues.
Duplicating one line of code is often not a serious issue but generally textfields are not single. So you perform very probably this task for multiple textfields. Duplication becomes so a problem.
This way :
String ipt = textField.getText();
if (!ipt.equals("")) {
name = ipt;
introduces a local variable that has a scope broader than required. It is also discouraged to make the code less error prone as we may reuse the variable after the processing where it is required while we should not be able to.
For your requirement, I dislike both solutions.
A better quality and efficient way would be introduce a method :
private String getValueOrDefault(TextField textField, String defaultValue) {
String value = textField.getText();
return !value.equals("") ? value : defaultValue;
}
You can invoke it in this way :
name = getValueOrDefault(textField, name);
You don't have any longer a variable that lives beyond its requirement and you don't have duplication either.
The second one is likely faster. That is because the first one calls getText() twice. The second one only stores it once, then can be accessed. However, since it is defined in a variable, it might be the same. This is likely the fastest way to do it:
if (!(String value=textField.getText().equals(""))
name = value;
EDIT: As said by someone in the comments, the above is NOT faster. Sorry for the answer, I didn't run it through a timer.
This question already has answers here:
Why can't strings be mutable in Java and .NET?
(17 answers)
Closed 7 years ago.
I've always wondered why does JAVA and C# has String (immutable & threadsafe) class, if they have StringBuilder (mutable & not threadsafe) or StringBuffer (mutable & threadsafe) class. Isn't StringBuilder/StringBuffer superset of String class? I mean, why should I use String class, if I've option of using StringBuilder/StringBuffer?
For example, Instead of using following,
String str;
Why can't I always use following?
StringBuilder strb; //or
StringBuffer strbu;
In short, my question is, How will my code get effected if I replace String with StringBuffer class? Also, StringBuffer has added advantage of mutability.
I mean, why should I use String class, if I've option of using StringBuilder/StringBuffer?
Precisely because it's immutable. Immutability has a whole host of benefits, primarily that it makes it much easier to reason about your code without creating copies of the data everywhere "just in case" something decides to mutate the value. For example:
private readonly String name;
public Person(string name)
{
if (string.IsNullOrEmpty(name)) // Or whatever
{
// Throw some exception
}
this.name = name;
}
// All the rest of the code can rely on name being a non-null
// reference to a non-empty string. Nothing can mutate it, leaving
// evil reflection aside.
Immutability makes sharing simple and efficient. That's particularly useful for multi-threaded code. It makes "modifying" (i.e. creating a new instance with different data) more painful, but in many situations that's absolutely fine, because values pass through the system without ever being modified.
Immutability is particularly useful for "simple" types such as strings, dates, numbers (BigDecimal, BigInteger etc). It allows them to be used within maps more easily, it allows a simple equality definition, etc.
1) StringBuilder as well as StringBuffer both are mutable. So it will cause a few problems like using in collections like keys in hashMap. See this link.
Another example of advantage of immutability will be what Jon has mentioned in his comments. I am just pasting here.
Someone can call Person p = new Person(builder); with a builder which initially passes my validation criteria - and then modify it afterwards, without the Person class having any say in it. In order to avoid that, the Person class would need to copy the validated data.
Immutabilty assures this does not happen.
2) As string is most extensively used object in java, the string pool offers to resuse same string, thus saving memory.
I completely agree with Jon Skeet that immutability is one reason to use String. Another reason (from a C# perspective) is that String is actually lighter weight than StringBuilder. If you look at reference source for both String and String Builder you will see that StringBuilder actually has a number of String constants in it. As a developer, you should only use what you need so unless you need the added benefits provided from StringBuilder you should use String.
Many answers have already outlined that there are shortcomings from using mutable variants such as StringBuilder. To illustrate the problem, one thing that you cannot achieve with StringBuilder is associative memory, i.e. hash tables. Sure, most implementations will allow you to use StringBuilder as a key for hashtables, but they will only find the values for the exact same instance of StringBuilder. However, the typical behavior that you would want to achieve is that it does not matter where the string comes from as only the characters are important, as you e.g. reade the string from a database or file (or any other external resource).
However, as far as I understood your question, you were mainly asking about field types. And indeed, I see your point particularly taking into account that we are doing the exact same thing with collections of other objects which are usually not immutable objects but mutable collections, such as List or ArrayList in C# or Java, respectively. In the end, a string is only a collection of characters, so why not making it mutable?
The answer I would give here is that the usual behavior of how such a string is changed is very different to usual collections. If you have a collection of subsequent elements, it is very common to only add a single element to the collection, leaving most of the collection untouched, i.e. you would not discard a list to insert an item, at least unless you are programming in Haskell :). For many strings like names, this is different as you typically replace the whole string. Given the importance of a string data type, the platforms usually offer a lot of optimization for strings such as interned strings, making the choice even more biased towards strings.
However, in the end, every program is different and you might have requirements that make it more reasonable to use StringBuilder by default, but for the given reasons, I think that these cases are rather rare.
EDIT: As you were asking for examples. Consider the following code:
stopwatch.Start();
var s = "";
for (int i = 0; i < 100000; i++)
{
s = "." + s;
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
stopwatch.Restart();
var s2 = new StringBuilder();
for (int i = 0; i < 100000; i++)
{
s2.Insert(0, ".");
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Technically, both bits are doing a very similar thing, they will insert a character at the first position and shift whatever comes after. Both versions will involve copying the whole string that has been there before. The version with string completes in 1750ms on my machine whereas StringBuilder took 2245ms. However, both versions are reasonably fast, making the performance impact negligible in this case.
I would like to add some differences between String and StringBuilder classes:
Yes, as mentioned above String is immutable class and content cannot be changed after string has been created. It is allow to work with the same string objects from different threads without locking.
If you need to concatenate a lot of strings together, use StringBuilder class. When you use "+" operator it creates a lot of string objects on managed heap and hurts performance.
StringBuilder is mutable class. StringBuilder stores characters in array and can manipulate with characters without creating a new string object (such as add, remove, replace, append).
If you know approximate length of result string you should set capacity. Default capacity is 16 (.NET 4.5). It gives you performance improvements because StringBuilder has inner array of chars. Array of chars recreates when count of characters exceeds current capacity.
String:
is immutable (so you can use it in collections)
every operation creates a new instance on the Heap. Technically speaking really depends on the code.
For performance and memory consumption purposes it makes sense to use StringBuilder.
As we all know, String is immutable in java. however, one can change it using reflection, by getting the Field and setting access level. (I know it is unadvised, I am not planning to do so, this question is pure theoretical).
my question: assuming I know what I am doing (and modify all fields as needed), will the program run properly? or does the jvm makes some optimizations that rely on String being immutable? will I suffer performance loss? if so, what assumption does it make? what will go wrong in the program
p.s. String is just an example, I am interested actually in a general answer, in addition to the example.
thanks!
After compilation some strings may refer to the one instance, so, you will edit more than you want and never know what else are you editing.
public static void main(String args[]) throws Exception {
String s1 = "Hello"; // I want to edit it
String s2 = "Hello"; // It may be anywhere and must not be edited
Field f = String.class.getDeclaredField("value");
f.setAccessible(true);
f.set(s1, "Doesn't say hello".toCharArray());
System.out.println(s2);
}
Output:
Doesn't say hello
You are definitely asking for trouble if you do this. Does that mean you will definitely see bugs right away? No. You might get away with it in a lot of cases, depending on what you're doing.
Here are a couple of cases where it would bite you:
You modify a string that happens to have been declared as literal somewhere within the code. For example you have a function and somewhere it is being called like function("Bob"); in this scenario the string "Bob" is changed throughout your app (this will also be true of string constants declared as final).
You modify a string which is used in substring operations, or which is the result of a substring operation. In Java, taking a substring of a string actually uses the same underlying character array as the source string, which means modifications to the source string will affect substrings (and vice versa).
You modify a string that happens to be used as a key in a map somewhere. It will no longer compare equal to its original value, so lookups will fail.
I know this question is about Java, but I wrote a blog post a while back illustrating just how insane your program may behave if you mutate a string in .NET. The situations are really quite similar.
The thing that jumps to mind for me is string interning - literals, anything in the constant pool and anything manually intern()ed points to the same string object. If you start messing around with the contents of an interned string literal, you may well see the exact same alterations on all the other literals using the same underlying object.
I'm not sure whether the above actually happens since I've never tried (in theory it will, I don't know if something happens under the scene to stop it but I doubt it) but it's things like that that could throw up potential problems. Of course, it could also throw up problems at the Java level through just passing multiple references to the same string around and then using a reflection attack to alter the object from one of the references. Most people (me included!) won't explicitly guard against that sort of thing in code, so using that attack with any code that's not your own, or your own code if you haven't guarded against that either, could cause all sorts of bizarre, horrible bugs.
It's an interesting area theoretically, but the more you dig around the more you see why anything along these lines is a bad idea!
Speaking outside of string, there's no performance enhancements I know of for an object being immutable (indeed I don't think the JVM can even tell at the moment whether an object is immutable, reflection attacks aside.) It could throw things like the checker-framework off though or anything that tries to statically analyse the code to guarantee it's immutable.
I'm pretty sure The JVM itself makes no assumptions about the immutability of Strings, as "immutability" in Java is not a language-level construct; it's a trait implied by a class's implementation, but cannot, as you note, be actually guaranteed in the presence of reflection. Thus, it also shouldn't be relevant to performance.
However, pretty much all Java code in existence (including the Standard API implementation) relies on Strings being immutable, and if you break that expectation, you'll see all kinds of bugs.
The private fields in the String class are the char[], the offset and length. Changing any of these should not have any adverse effect on any other object. But if you can somehow change the contents of the char[], then you could probably see some surprising side effects.
public static void main(String args[]){
String a = "test213";
String s = new String("test213");
try {
System.out.println(s);
System.out.println(a);
char[] value = (char[])getFieldValue(s, "value");
value[1] = 'a';
System.out.println(s);
System.out.println(a);
} catch (Exception e) {
e.printStackTrace();
}
}
static Object getFieldValue(String s,String fieldName) throws SecurityException, NoSuchFieldException, IllegalArgumentException, IllegalAccessException {
Object chars = null;
Field innerCharArray = String.class.getDeclaredField(fieldName);
innerCharArray.setAccessible(true);
chars = innerCharArray.get(s);
return chars;
}
Changing value of S will change the literal of a as mentioned in all answers.
To demonstrate how can it screw up a program:
System.out.print("Initial: "); System.out.println(addr);
editIntStr("ADDR_PLACEH", "192.168.1.1");
System.out.print("From var: "); System.out.println(addr);//
System.out.print("Hardcoded: "); System.out.println("ADDR_PLACEH");
System.out.print("Substring: "); System.out.println("ADDR_PLACE" + "H".substring(0));
System.out.print("Equals test: "); System.out.println("ADDR_PLACEH".equals("192.168.1.1"));
System.out.print("Equals test with substring: "); System.out.println(("ADDR_PLACE" + "H".substring(0)).equals("192.168.1.1"));
Output:
Initial: ADDR_PLACEH
From var: 192.168.1.1
Hardcoded: 192.168.1.1
Substring: ADDR_PLACEH
Equals test: true
Equals test with substring: false
The result of the first Equals test is weird, isn't it? You can't expect your fellow programmers to figure out why is Java thinking they are equal...
Full test code: http://pastebin.com/vbstfWX1
Do you use StringUtils.EMPTY instead of ""?
I mean either as a return value or if you set a the value of a String variable. I don't mean for comparison, because there we use StringUtils.isEmpty()
Of course not.
Do you really think "" is not clear enough ?
Constants have essentially 3 use cases:
Document the meaning of a value (with constant name + javadoc)
Synchronize clients on a common value.
Provide a shortcut to a special value to avoid some init costs
None apply here.
I use StringUtils.EMPTY, for hiding the literal and also to express that return StringUtils.EMPTY was fully expected and there should return an empty string, "" can lead to the assumption that "" can be easily changed into something else and that this was maybe only a mistake. I think the EMPTY is more expressive.
No, just use "".
The literal "" is clear as crystal. There is no misunderstanding as to what was meant. I wouldn't know why you would need a class constant for that. I can only assume that this constant is used throughout the package containing StringUtils instead of "". That doesn't mean you should use it, though.
If there's a rock on the sidewalk, you don't have to throw it.
I'm amazed at how many people are happy to blindly assume that "" is indeed an empty string, and doesn't (accidentally?) contain any of Unicode's wonderful invisible and non-spacing characters. For the love of all that is good and decent, use EMPTY whenever you can.
I will add my two cents here because I don't see anybody talking about String interning and Class initialization:
All String literals in Java sources are interned, making any "" and StringUtils.EMPTY the same object
Using StringUtils.EMPTY can initialize StringUtils class, as it accesses its static member EMPTY only if it is not declared final (the JLS is specific on that point). However, org.apache.commons.lang3.StringUtils.EMPTY is final, so it won't initialize the class.
See a related answer on String interning and on Class initialization, referring to the JLS 12.4.1.
I don't really like to use it, as return ""; is shorter than return StringUtils.EMPTY.
However, one false advantage of using it is that if you type return " "; instead of return "";, you may encounter different behavior (regarding if you test correctly an empty String or not).
If your class doesn't use anything else from commons then it'd be a pity to have this dependency just for this magic value.
The designer of the StringUtils makes heavy use of this constant, and it's the right thing to do, but that doesn't mean that you should use it as well.
I find StringUtils.EMPTY useful in some cases for legibility. Particularly with:
Ternary operator eg.
item.getId() != null ? item.getId() : StringUtils.EMPTY;
Returning empty String from a method, to confirm that yes I really wanted to do that.
Also by using a constant, a reference to StringUtils.EMPTY is created. Otherwise if you try to instantiate the String literal "" each time the JVM will have to check if it exists in the String pool already (which it likely will, so no extra instance creation overhead). Surely using StringUtils.EMPTY avoids the need to check the String pool?
No, because I have more to write. And an empty String is plattform independent empty (in Java).
File.separator is better than "/" or "\".
But do as you like. You can't get an typo like return " ";
Honestly, I don't see much use of either. If you want to compare egainst an empty string, just use StringUtils.isNotEmpty(..)
I am recommending to use this constant as one of the building stones of a robust code, to lower the risk of accidently have nonvisible characters sneak in when assigning an empty string to a variable.
If you have people from all around the world in your team and maybe some of them not so experienced, then it might be a good idea to insist on using this constant in the code.
There are lots of different languages around and people are using their own local alphabet settings on their computers. Sometimes they just forget to switch back when coding and after they switch and delete with backspace, then text editor can leave some junk inside of "". Using StringUtils.EMPTY just eliminate that risk.
However, this does not have any significant impact on the performance of the code, nor on the code readability. Also it does not resolve some fundamental problem you might experience, so it is totally up to your good judgement weather you will use this constant or not.
Yes, it makes sense.
It might not be the only way to go but I can see very little in the way of saying this "doesn't make sense".
In my opinion:
It stands out more than "".
It explains that you meant empty, and that blank will likely not do.
It will still require changing everywhere if you don't define your own variable and use it in multiple places.
If you don't allow free string literals in code then this helps.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
In the same spirit of other platforms, it seemed logical to follow up with this question: What are common non-obvious mistakes in Java? Things that seem like they ought to work, but don't.
I won't give guidelines as to how to structure answers, or what's "too easy" to be considered a gotcha, since that's what the voting is for.
See also:
Perl - Common gotchas
.NET - Common gotchas
"a,b,c,d,,,".split(",").length
returns 4, not 7 as you might (and I certainly did) expect. split ignores all trailing empty Strings returned. That means:
",,,a,b,c,d".split(",").length
returns 7! To get what I would think of as the "least astonishing" behaviour, you need to do something quite astonishing:
"a,b,c,d,,,".split(",",-1).length
to get 7.
Comparing equality of objects using == instead of .equals() -- which behaves completely differently for primitives.
This gotcha ensures newcomers are befuddled when "foo" == "foo" but new String("foo") != new String("foo").
I think a very sneaky one is the String.substring method. This re-uses the same underlying char[] array as the original string with a different offset and length.
This can lead to very hard-to-see memory problems. For example, you may be parsing extremely large files (XML perhaps) for a few small bits. If you have converted the whole file to a String (rather than used a Reader to "walk" over the file) and use substring to grab the bits you want, you are still carrying around the full file-sized char[] array behind the scenes. I have seen this happen a number of times and it can be very difficult to spot.
In fact this is a perfect example of why interface can never be fully separated from implementation. And it was a perfect introduction (for me) a number of years ago as to why you should be suspicious of the quality of 3rd party code.
Overriding equals() but not hashCode()
It can have really unexpected results when using maps, sets or lists.
SimpleDateFormat is not thread safe.
There are two that annoy me quite a bit.
Date/Calendar
First, the Java Date and Calendar classes are seriously messed up. I know there are proposals to fix them, I just hope they succeed.
Calendar.get(Calendar.DAY_OF_MONTH) is 1-based
Calendar.get(Calendar.MONTH) is 0-based
Auto-boxing preventing thinking
The other one is Integer vs int (this goes for any primitive version of an object). This is specifically an annoyance caused by not thinking of Integer as different from int (since you can treat them the same much of the time due to auto-boxing).
int x = 5;
int y = 5;
Integer z = new Integer(5);
Integer t = new Integer(5);
System.out.println(5 == x); // Prints true
System.out.println(x == y); // Prints true
System.out.println(x == z); // Prints true (auto-boxing can be so nice)
System.out.println(5 == z); // Prints true
System.out.println(z == t); // Prints SOMETHING
Since z and t are objects, even they though hold the same value, they are (most likely) different objects. What you really meant is:
System.out.println(z.equals(t)); // Prints true
This one can be a pain to track down. You go debugging something, everything looks fine, and you finally end up finding that your problem is that 5 != 5 when both are objects.
Being able to say
List<Integer> stuff = new ArrayList<Integer>();
stuff.add(5);
is so nice. It made Java so much more usable to not have to put all those "new Integer(5)"s and "((Integer) list.get(3)).intValue()" lines all over the place. But those benefits come with this gotcha.
Try reading Java Puzzlers which is full of scary stuff, even if much of it is not stuff you bump into every day. But it will destroy much of your confidence in the language.
List<Integer> list = new java.util.ArrayList<Integer>();
list.add(1);
list.remove(1); // throws...
The old APIs were not designed with boxing in mind, so overload with primitives and objects.
This one I just came across:
double[] aList = new double[400];
List l = Arrays.asList(aList);
//do intense stuff with l
Anyone see the problem?
What happens is, Arrays.asList() expects an array of object types (Double[], for example). It'd be nice if it just threw an error for the previous ocde. However, asList() can also take arguments like so:
Arrays.asList(1, 9, 4, 4, 20);
So what the code does is create a List with one element - a double[].
I should've figured when it took 0ms to sort a 750000 element array...
this one has trumped me a few times and I've heard quite a few experienced java devs wasting a lot of time.
ClassNotFoundException --- you know that the class is in the classpath BUT you are NOT sure why the class is NOT getting loaded.
Actually, this class has a static block. There was an exception in the static block and someone ate the exception. they should NOT. They should be throwing ExceptionInInitializerError. So, always look for static blocks to trip you. It also helps to move any code in static blocks to go into static methods so that debugging the method is much more easier with a debugger.
Floats
I don't know many times I've seen
floata == floatb
where the "correct" test should be
Math.abs(floata - floatb) < 0.001
I really wish BigDecimal with a literal syntax was the default decimal type...
Not really specific to Java, since many (but not all) languages implement it this way, but the % operator isn't a true modulo operator, as it works with negative numbers. This makes it a remainder operator, and can lead to some surprises if you aren't aware of it.
The following code would appear to print either "even" or "odd" but it doesn't.
public static void main(String[] args)
{
String a = null;
int n = "number".hashCode();
switch( n % 2 ) {
case 0:
a = "even";
break;
case 1:
a = "odd";
break;
}
System.out.println( a );
}
The problem is that the hash code for "number" is negative, so the n % 2 operation in the switch is also negative. Since there's no case in the switch to deal with the negative result, the variable a never gets set. The program prints out null.
Make sure you know how the % operator works with negative numbers, no matter what language you're working in.
Manipulating Swing components from outside the event dispatch thread can lead to bugs that are extremely hard to find. This is a thing even we (as seasoned programmers with 3 respective 6 years of java experience) forget frequently! Sometimes these bugs sneak in after having written code right and refactoring carelessly afterwards...
See this tutorial why you must.
Immutable strings, which means that certain methods don't change the original object but instead return a modified object copy. When starting with Java I used to forget this all the time and wondered why the replace method didn't seem to work on my string object.
String text = "foobar";
text.replace("foo", "super");
System.out.print(text); // still prints "foobar" instead of "superbar"
I think i big gotcha that would always stump me when i was a young programmer, was the concurrent modification exception when removing from an array that you were iterating:
List list = new ArrayList();
Iterator it = list.iterator();
while(it.hasNext()){
//some code that does some stuff
list.remove(0); //BOOM!
}
if you have a method that has the same name as the constructor BUT has a return type. Although this method looks like a constructor(to a noob), it is NOT.
passing arguments to the main method -- it takes some time for noobs to get used to.
passing . as the argument to classpath for executing a program in the current directory.
Realizing that the name of an Array of Strings is not obvious
hashCode and equals : a lot of java developers with more than 5 years experience don't quite get it.
Set vs List
Till JDK 6, Java did not have NavigableSets to let you easily iterate through a Set and Map.
Integer division
1/2 == 0 not 0.5
Using the ? generics wildcard.
People see it and think they have to, e.g. use a List<?> when they want a List they can add anything to, without stopping to think that a List<Object> already does that. Then they wonder why the compiler won't let them use add(), because a List<?> really means "a list of some specific type I don't know", so the only thing you can do with that List is get Object instances from it.
(un)Boxing and Long/long confusion. Contrary to pre-Java 5 experience, you can get a NullPointerException on the 2nd line below.
Long msec = getSleepMsec();
Thread.sleep(msec);
If getSleepTime() returns a null, unboxing throws.
The default hash is non-deterministic, so if used for objects in a HashMap, the ordering of entries in that map can change from run to run.
As a simple demonstration, the following program can give different results depending on how it is run:
public static void main(String[] args) {
System.out.println(new Object().hashCode());
}
How much memory is allocated to the heap, or whether you're running it within a debugger, can both alter the result.
When you create a duplicate or slice of a ByteBuffer, it does not inherit the value of the order property from the parent buffer, so code like this will not do what you expect:
ByteBuffer buffer1 = ByteBuffer.allocate(8);
buffer1.order(ByteOrder.LITTLE_ENDIAN);
buffer1.putInt(2, 1234);
ByteBuffer buffer2 = buffer1.duplicate();
System.out.println(buffer2.getInt(2));
// Output is "-771489792", not "1234" as expected
Among the common pitfalls, well known but still biting occasionally programmers, there is the classical if (a = b) which is found in all C-like languages.
In Java, it can work only if a and b are boolean, of course. But I see too often newbies testing like if (a == true) (while if (a) is shorter, more readable and safer...) and occasionally writing by mistake if (a = true), wondering why the test doesn't work.
For those not getting it: the last statement first assign true to a, then do the test, which always succeed!
-
One that bites lot of newbies, and even some distracted more experienced programmers (found it in our code), the if (str == "foo"). Note that I always wondered why Sun overrode the + sign for strings but not the == one, at least for simple cases (case sensitive).
For newbies: == compares references, not the content of the strings. You can have two strings of same content, stored in different objects (different references), so == will be false.
Simple example:
final String F = "Foo";
String a = F;
String b = F;
assert a == b; // Works! They refer to the same object
String c = "F" + F.substring(1); // Still "Foo"
assert c.equals(a); // Works
assert c == a; // Fails
-
And I also saw if (a == b & c == d) or something like that. It works (curiously) but we lost the logical operator shortcut (don't try to write: if (r != null & r.isSomething())!).
For newbies: when evaluating a && b, Java doesn't evaluate b if a is false. In a & b, Java evaluates both parts then do the operation; but the second part can fail.
[EDIT] Good suggestion from J Coombs, I updated my answer.
The non-unified type system contradicts the object orientation idea. Even though everything doesn't have to be heap-allocated objects, the programmer should still be allowed to treat primitive types by calling methods on them.
The generic type system implementation with type-erasure is horrible, and throws most students off when they learn about generics for the first in Java: Why do we still have to typecast if the type parameter is already supplied? Yes, they ensured backward-compatibility, but at a rather silly cost.
Going first, here's one I caught today. It had to do with Long/long confusion.
public void foo(Object obj) {
if (grass.isGreen()) {
Long id = grass.getId();
foo(id);
}
}
private void foo(long id) {
Lawn lawn = bar.getLawn(id);
if (lawn == null) {
throw new IllegalStateException("grass should be associated with a lawn");
}
}
Obviously, the names have been changed to protect the innocent :)
Another one I'd like to point out is the (too prevalent) drive to make APIs generic. Using well-designed generic code is fine. Designing your own is complicated. Very complicated!
Just look at the sorting/filtering functionality in the new Swing JTable. It's a complete nightmare. It's obvious that you are likely to want to chain filters in real life but I have found it impossible to do so without just using the raw typed version of the classes provided.
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("Asia/Hong_Kong")).getTime());
System.out.println(Calendar.getInstance(TimeZone.getTimeZone("America/Jamaica")).getTime());
The output is the same.
I had some fun debugging a TreeSet once, as I was not aware of this information from the API:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/TreeSet.html
Objects with correct equals/hashcode implementations were being added and never seen again as the compareTo implementation was inconsistent with equals.
IMHO
1. Using vector.add(Collection) instead of vector.addall(Collection). The first adds the collection object to vector and second one adds the contents of collection.
2. Though not related to programming exactly, the use of xml parsers that come from multiple sources like xerces, jdom. Relying on different parsers and having their jars in the classpath is a nightmare.