How to address Java String instantiation issue reported by Sonarqube


I used the statement below to declare an empty string:
String temp = new String();
SonarQube flags this as an issue.
What is an efficient way to fix it?
Is the declaration below a good alternative?
String temp = "";

Sonar is correct in that you shouldn't be using new String(). Initializing to empty string (String temp = "") is better. But if you do not use the value of empty string in any case, you should not initialize the variable to anything. You should only initialize a variable to a value you intend to use.
This is perfectly acceptable, and usually preferable:
String temp;
Your conditional logic must then assign a value on every path before the variable is read; otherwise the compiler rejects the code.

/**
 * Initializes a newly created {@code String} object so that it represents
 * an empty character sequence. Note that use of this constructor is
 * unnecessary since Strings are immutable.
 */
public String() {
    this.value = new char[0];
}
The source code of the String class above states that this constructor is unnecessary. Each call creates a new object on the heap; it is better to use a literal, which is served from the string pool.
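A minimal sketch of that difference, using a hypothetical class name (EmptyStringIdentity is not from the original answers): each constructor call allocates a distinct object, while every occurrence of the literal "" resolves to the same interned instance.

```java
// Hypothetical demo class; the name and methods are illustrative only.
class EmptyStringIdentity {
    // Two constructor calls always yield two distinct objects.
    static boolean constructorsShareInstance() {
        String a = new String();
        String b = new String();
        return a == b; // reference comparison
    }

    // Two occurrences of the literal "" refer to one interned object.
    static boolean literalsShareInstance() {
        String a = "";
        String b = "";
        return a == b; // reference comparison
    }

    public static void main(String[] args) {
        System.out.println(constructorsShareInstance()); // false
        System.out.println(literalsShareInstance());     // true
        System.out.println(new String().equals(""));     // true: same content
    }
}
```

The content is identical in all cases, so equals() returns true throughout; only the number of objects differs.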

The second example is the correct one.
Use
String temp = "";

Yes, Sonar is correct. Using new String() is pretty much never a good idea.
The reason is that the JVM interns string literals: identical literals share one cached instance, so no new object is created on the heap each time. (This is why wrongly comparing strings with == sometimes appears to work: both variables refer to the same cached instance.)
Constructing a String yourself bypasses that built-in cache, allocates a needless object, and can lead to performance problems down the line. If all you need is an empty String, just assign the literal:
String temp = "";
Or, if you just want to declare a String since you want to assign it later, just don't assign anything to it!
String temp;
if (condition()) {
    temp = "hello!";
} else {
    temp = "bye!";
}
If you plan on concatenating to your empty String in a loop, though, read this question about the associated performance issues and how to handle them.

Related

What is the difference between String initializations by new String() and new String("") in Java?

What is the difference between the following two initializations in Java?
String a = new String();
String b = new String("");

Well, they are almost the same.
public static void main(String[] args) {
    String s1 = new String();
    String s2 = new String("");
    System.out.println(s1.equals(s2)); // prints true
}
Minor (rather insignificant) differences:
new String() takes slightly less time to execute than new String("") because the copy constructor does more work.
new String("") adds the empty String ("") to the String constant pool if it is not already present.
Other than this, there are no differences.
Note: using new String("abc") is almost always bad because you create two Strings with the same value, one in the String constant pool and another on the heap.

The Javadoc explains it well.
These are two different constructor calls:
public String()
Initializes a newly created String object so that it represents an
empty character sequence. Note that use of this constructor is
unnecessary since Strings are immutable.
public String(String original)
Initializes a newly created String object so that it represents the
same sequence of characters as the argument; in other words, the newly
created string is a copy of the argument string. Unless an explicit
copy of original is needed, use of this constructor is unnecessary
since Strings are immutable.

Internally, different constructors will be invoked.
However, the resulting String objects will be identical by their content and equal (a.equals(b) will return true)

TheLostMind is mostly correct, but I'd like to add that the copy constructor doesn't actually do that much:
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8-b132/java/lang/String.java#137
public String() {
    this.value = new char[0];
}
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
The literal "" is resolved from the constant pool to a single interned instance rather than constructed at each use, and both constructors are cheap, so it doesn't matter too much which one you use.
In any case, I would recommend using the string literal "" because you can save an object reference if you use that string elsewhere. Only use the String constructor if you really need a copy of that string that isn't used anywhere else.

In the first case you create only one String object; in the second case you create two, the "" literal and the new String, if the "" object does not already exist in the string pool.
Initializes a newly created String object so that it represents an
empty character sequence.
Initializes a newly created String object so that it represents the
same sequence of characters as the argument; in other words, the newly
created string is a copy of the argument string.

The first is calling the default constructor and the second is calling the copy constructor in order to create a new string in each case.

From a purely practical point of view, there is zero difference between those constructions, as there is never any reason to ever use either of them. They are both wasteful and over-complicated, and thus equally pointless.
To initialize a variable with the empty string, do:
String s = "";
That is shorter and plainer to type, and avoids creating any String objects, since the one shared "" instance in the intern pool will certainly have already been loaded by some other class anyway.

Can't set a variable inside a while loop

New to Java, sorry if this is a stupid question. I'd like to assign new values to a couple of variables from inside a while loop, but that's not working. I'm getting an error "The local variable newString may not have been initialized" when trying to compile. Here's an example:
public class Test {
    public static String example() {
        String first;
        String second;
        String newString;
        int start = 1;
        while (start < 5) {
            if (start < 4) {
                first = "hello";
                second = "world";
                newString = first + second;
            }
            start += 1;
        }
        return newString;
    }
    public static void main(String[] args) {
        System.out.println(example());
    }
}

When you return a variable from a method, the compiler needs to be sure that a value has been assigned to that variable (in your case newString).
Although it is clear to you that the condition start < 4 will be true at some point, the compiler does not evaluate conditions like this one, so it cannot prove that the assignment runs. You may therefore only return variables that are definitely assigned on every path.
Depending on the purpose of your method, you have the following options:
String newString = "";
In this case you can be sure that your method never returns null, which can otherwise make errors harder to track down.
Another option is
String newString = null;
if you want to allow your method to return null.
Here it is obvious that you will eventually enter the if block and assign a value to newString. In other cases it will not be that obvious, so you need to decide whether your method may return null or not.

You are getting this error because the compiler does not know if newString will ever be initialized before it gets returned.
Although you know that both start<5 and start<4 will be true at some point, so the assignment will execute, the compiler doesn't perform this kind of calculation during compilation. Hence, for all it knows, those statements will never execute, and thus newString may get returned before it is initialized.
Hence, you should initialize it (e.g. to null) when you declare it to avoid this error.

There is a rule in Java that all local variables must be initialized before they are first read. You are using newString as a return value, which is a read operation. Although you are assigning a value to newString, the assignment happens conditionally (start<5 && start<4). At compile time, the compiler does not know what the result of running the code will be, so it conservatively complains about this situation. The simple solution is to initialize the string:
String newString = "";
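As a sketch of the fix applied to the method from the question (the class name DefiniteAssignmentDemo is made up for illustration), initializing newString at its declaration is enough to satisfy the compiler:

```java
// Hypothetical class name; the method body follows the question's example.
class DefiniteAssignmentDemo {
    static String example() {
        // Initialized up front, so every path to the return has a value.
        String newString = "";
        int start = 1;
        while (start < 5) {
            if (start < 4) {
                String first = "hello";
                String second = "world";
                newString = first + second;
            }
            start += 1;
        }
        return newString;
    }

    public static void main(String[] args) {
        System.out.println(example()); // helloworld
    }
}
```

Declaring first and second inside the if block is a small design choice here: they are only needed there, so they never exist in an unassigned state.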

Strings - How do they work?

How do String objects work in Java? How exactly does the term "immutable" apply to String objects? Why don't we see a modified string after passing it through some method, even though we operate on the original String object's value?

A String has a private final char[] (a byte[] since Java 9). When a new String object is created, the array is also created and filled. It cannot later be accessed from outside or modified (actually it can be done with reflection, but we'll leave this aside).
It is "immutable" because once a string is created, its value cannot be changed; a "cow" string will always have the value "cow".
We don't see a modified string because it is immutable: the same object will never be changed, no matter what you do with it (besides modifying it with reflection). So "cow" + " horse" will create a new String object, and NOT modify the original object.
if you define:
void foo(String arg) {
arg= arg + " horse";
}
and you call:
String str = "cow";
foo(str);
then str at the call site is not modified, since it still holds the original reference to the same object. When you changed arg, you simply made it reference another String object and did NOT change the original object. So str still refers to the same, unchanged object containing "cow".
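A runnable sketch of this behaviour follows. The class name ReassignParameterDemo and the helper withHorse are assumptions added for illustration; only foo comes from the answer.

```java
// Hypothetical demo class; withHorse is an illustrative helper, not from the answer.
class ReassignParameterDemo {
    // Rebinds the local parameter only; the caller's variable is untouched.
    static void foo(String arg) {
        arg = arg + " horse";
    }

    // Returning the new String is the usual way to hand the result back.
    static String withHorse(String arg) {
        return arg + " horse";
    }

    public static void main(String[] args) {
        String str = "cow";
        foo(str);
        System.out.println(str); // cow
        str = withHorse(str);
        System.out.println(str); // cow horse
    }
}
```

The caller only sees a different value once it reassigns its own variable to the returned String.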
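If you still want to change a String object, you can do it with reflection. However, it is ill-advised and can have serious side effects. The snippet below relies on the private fields of very old JDKs (count and offset were removed in Java 7u6, value became a byte[] in Java 9, and recent JDKs block this reflective access), so treat it as an illustration only:
import java.lang.reflect.Field;

String str = "cow";
try {
    // Look up the private fields of the old String layout.
    Field value = str.getClass().getDeclaredField("value");
    Field count = str.getClass().getDeclaredField("count");
    Field hash = str.getClass().getDeclaredField("hash");
    Field offset = str.getClass().getDeclaredField("offset");
    value.setAccessible(true);
    count.setAccessible(true);
    hash.setAccessible(true);
    offset.setAccessible(true);
    char[] newVal = { 'c','o','w',' ','h','o','r','s','e' };
    value.set(str, newVal);
    count.set(str, newVal.length);
    hash.set(str, 0); // reset the cached hash code
    offset.set(str, 0);
} catch (NoSuchFieldException e) {
    // The field layout differs on this JDK; str is left unchanged.
} catch (IllegalAccessException e) {
    // Access was denied; str is left unchanged.
}
System.out.println(str);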

From the tutorial:
The String class is immutable, so that once it is created a String object cannot be changed. The String class has a number of methods, some of which will be discussed below, that appear to modify strings. Since strings are immutable, what these methods really do is create and return a new string that contains the result of the operation.

Strings in Java are immutable (state cannot be modified once created). This offers opportunities for optimization. One example is string interning, where string literals are maintained in a string pool and new String objects are only created if the particular string literal doesn't already exist in the pool. If the string literal already exists, a reference is returned. This can only be accomplished because strings are immutable, so you don't have to worry that some object holding a reference will change it.
Methods that appear to modify a string actually return a new instance. One example is string concatenation:
String s = "";
for (int i = 0; i < 5; i++) {
    s = s + "hi";
}
Conceptually, this is what happens internally (compilers before Java 9 generate roughly this, using StringBuilder since Java 5; newer compilers emit an invokedynamic call instead):
String s = "";
for (int i = 0; i < 5; i++) {
    StringBuilder sb = new StringBuilder();
    sb.append(s);
    sb.append("hi");
    s = sb.toString();
}
You can clearly see that a new instance is created by each toString call (note that this can be made more efficient by using a single StringBuilder directly). StringBuilder and StringBuffer are mutable, unlike String.
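The more efficient variant mentioned above could look like the following sketch. It uses StringBuilder, the unsynchronized counterpart of StringBuffer; the class and method names are hypothetical.

```java
// Hypothetical demo class; repeatHi is an illustrative name.
class RepeatWithBuilder {
    // Appends into one mutable buffer and creates the String once at the end.
    static String repeatHi(int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append("hi");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeatHi(5)); // hihihihihi
    }
}
```

Only one String is created here, by the final toString call, instead of one per loop iteration.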

Every object has state. The state of a String object is the array of characters that make up the String, for example, the String "foo" contains the array ['f', 'o', 'o']. Because a String is immutable, this array can never be changed in any way, shape, or form.
Every method in every class that wants to change a String must instead return a new String that represents the altered state of the old String. That is, if you try to reverse "foo" you will get a new String object with internal state ['o', 'o', 'f'].

I think this link will help you to understand how Java String really works
Now consider the following code -
String s = "ABC";
s.toLowerCase();
The method toLowerCase() will not change the data "ABC" that s contains. Instead, a new String object is instantiated and given the data "abc" during its construction. A reference to this String object is returned by the toLowerCase() method. To make the String s contain the data "abc", a different approach is needed.
Again consider the following - s = s.toLowerCase();
Now the String s references a new String object that contains "abc". There is nothing in the syntax of the declaration of the class String that enforces it as immutable; rather, none of the String class's methods ever affect the data that a String object contains, thus making it immutable.
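The two cases can be compared side by side in a small sketch (the class and method names are made up for illustration):

```java
// Hypothetical demo class contrasting a discarded and a reassigned result.
class LowerCaseDemo {
    static String ignoredResult() {
        String s = "ABC";
        s.toLowerCase(); // result discarded; s still refers to "ABC"
        return s;
    }

    static String reassignedResult() {
        String s = "ABC";
        s = s.toLowerCase(); // s now refers to the new String
        return s;
    }

    public static void main(String[] args) {
        System.out.println(ignoredResult());    // ABC
        System.out.println(reassignedResult()); // abc
    }
}
```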
I didn't really understand your third question. Providing a chunk of code and describing your problem may be a better option. Hope this helps.
You can also look into this blogpost for more understanding
[code samples are taken from the wiki. you can also look in there for more information]

Is there any benefit in wrapping a `String` with a `new String (...)` before running its methods?

I saw a few old code snippets in a piece of software that no one remembers who wrote that instead of doing something like:
String abc = SOME_CONSTANT.toLowerCase()
They do:
String abc = new String(SOME_CONSTANT).toLowerCase()
I can't see any value in this - seems like plain old bad programming (e.g. not understanding that String is immutable). Can anyone see a good reason for this?
Note: SOME_CONSTANT is defined as -
public static final String SOME_CONSTANT = "Some value";

No, it just creates more objects (unless the compiler optimizes it away)

The only point in wrapping a String inside another String is to force a copy, e.g.
String str = "A very very long string .......";
// On JDKs before 7u6, the substring shares the large underlying array, which we might not need after this.
String str1 = str.substring(0, 1);
// Take a copy that holds only the one char.
String str2 = new String(str1);
I would just do
public static final String SOME_CONSTANT = "Some value";
public static final String LOWER_CONSTANT = SOME_CONSTANT.toLowerCase();

I agree with you: it's bad programming

No good reason. As you said, String is immutable, so calling toLowerCase() on it will always produce a new string anyway.
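A quick check of that claim, as a sketch with hypothetical class and method names: both forms produce equal results and leave the constant untouched, so the wrapper only adds an allocation.

```java
// Hypothetical demo class; SOME_CONSTANT mirrors the question's definition.
class WrapBeforeCallDemo {
    static final String SOME_CONSTANT = "Some value";

    static String direct() {
        return SOME_CONSTANT.toLowerCase();
    }

    static String wrapped() {
        // Creates a throwaway copy before lower-casing it.
        return new String(SOME_CONSTANT).toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(direct().equals(wrapped())); // true
        System.out.println(SOME_CONSTANT);              // Some value
    }
}
```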

new String(someString)
only makes sense in one important case:
String s = incrediblyLongString.substring(1000, 1005);
String t = new String(s);
Assume incrediblyLongString is 1,000,000 chars long (e.g. an XML file) and you just want 5 chars of it. On JDKs before 7u6, where substring shares the parent's character array, the String s will still keep at least ONE MEGABYTE of memory alive, but String t will be created from scratch and so will take up only the necessary memory.

I'm not sure, but I think when you use new String() you force the JVM to create a new object for that String. If you use SOME_CONSTANT.toLowerCase(), the JVM will search the pool of Strings and just make a reference if the same String is there.
Maybe using new String() here could be good practice, just to make clear that toLowerCase() affects only the newly generated String, not the constant.
But anyway, the effect is the same.

Are Strings also static: String creation within Methods

I know that at compile time when a String is created, that String will be THE string used by any objects of that particular signature.
String s = "foo"; <--Any other identical strings will simply be references to this object.
Does this hold for strings created during methods at runtime? I have some code where an object holds a piece of string data. The original code is something like
for (var datum : data) {
    String a = datum.getD(); // getD does nothing but return a field
    String toAppend = new StringBuffer(a).append(stuff).toString();
    someData = someObject.getMethod(a);
    // do stuff
}
Since the String was already created in data, it seems better to just call datum.getD() instead of creating a string on every iteration of the loop.
Unless there's something I'm missing?

String instances are shared when they are the result of a compile-time constant expression. As a result, in the example below a and c will point to the same instance, but b will be a different instance, even though they all represent the same value:
String a = "hello";
String b = hell() + o();
String c = "hell" + "o";
public String hell() {
    return "hell";
}
public String o() {
    return "o";
}
You can explicitly intern the String however:
String b = (hell() + o()).intern();
In which case they'll all point to the same object.
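The following sketch puts the three cases into one runnable class. The class name InternDemo and the boolean helper methods are assumptions added here, and hell() and o() are made static for convenience.

```java
// Hypothetical demo class built around the answer's hell() and o() helpers.
class InternDemo {
    static String hell() {
        return "hell";
    }

    static String o() {
        return "o";
    }

    // "hell" + "o" is a compile-time constant, so it is the same object as "hello".
    static boolean constantExpressionIsShared() {
        String a = "hello";
        String c = "hell" + "o";
        return a == c;
    }

    // A concatenation computed at run time yields a new object.
    static boolean runtimeResultIsShared() {
        String a = "hello";
        String b = hell() + o();
        return a == b;
    }

    // intern() returns the pooled instance for the same content.
    static boolean internedResultIsShared() {
        String a = "hello";
        String b = (hell() + o()).intern();
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(constantExpressionIsShared()); // true
        System.out.println(runtimeResultIsShared());      // false
        System.out.println(internedResultIsShared());     // true
    }
}
```

In ordinary code, equals() remains the right way to compare string content; the reference comparisons here only make the sharing visible.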

The line
String a = datum.getD();
means: assign the result of evaluating datum.getD() to the reference a. It doesn't create a new String.

You are correct that strings are immutable, but only interned strings (such as literals) are guaranteed to share one object per value.
As far as being static, I do not think Strings are static in the way you describe. The Class class is like that, but I think it is the only object that does that.
I think it would be better to just call datum.getD(), since pulling the value out into its own String variable gains you nothing.
If you use datum.getD() several times in the loop, then it might make sense to store the value in a local String variable, because a single assignment might cost less than calling getD() multiple times.
