I was reading an article that said that Java strings are not completely immutable. However, the article's sample code that modifies the string calls string.toUpperCase().toCharArray(), and toUpperCase() already returns a new string. So what's the purpose of going through the process of changing the string if you call toUpperCase() anyway? Here is the code:
public static void toUpperCase(String orig)
{
    try
    {
        Field stringValue = String.class.getDeclaredField("value");
        stringValue.setAccessible(true);
        stringValue.set(orig, orig.toUpperCase().toCharArray());
    }
    catch (Exception ex){}
}
Also, I noticed that string.toUpperCase() by itself doesn't work. It needs to be string.toUpperCase().toCharArray(). Is there a reason for this?
What he's doing:
He's acquiring a character array that he knows is the right length (such as the uppercase version of the String) and installing it as the backing array of the String. (The backing array is called value inside the String class.)
Why he's doing it:
To illustrate that you could put any char array there you wanted.
Why this is useful:
String is immutable, and this allows you to circumvent the immutability. Of course, this is not recommended to do - EVER. On the contrary, I would not be surprised if he was saying "Watch out, because people could potentially do this to YOUR code. Even if you think your stuff is safe, it might not be!"
The implications of this are wide reaching. Immutable variables are no longer immutable. Final variables are no longer final. Thread safe objects are no longer thread safe. Contracts you thought you could rely upon can no longer be relied upon. All because some engineer somewhere had a problem he couldn't fix by normal means, so he delved into reflection to solve it. Don't be 'that guy'.
You'll also note that the hashCode for that String would now be different. If you've never calculated the hashCode for that String, the cached value is still 0, so you're okay. On the other hand, if you have calculated it, the String keeps the stale cached hash: a String you put in a HashMap or HashSet stays in its old bucket, and an equal String with the correct hash won't find it.
Consider the following:
import java.util.*;
import java.lang.reflect.*;

class HashTest {
    /** Results:
    C:\Documents and Settings\glowcoder\My Documents>java HashTest
    Orig hash: -804322678
    New value: STACKOVERFLOW
    Contains orig: true
    Contains copy: false
    */
    public static void main(String[] args) throws Exception {
        Set<String> set = new HashSet<String>();
        String str = "StackOverflow";
        System.out.println("Orig hash: " + str.hashCode());
        set.add(str);
        Field stringValue = String.class.getDeclaredField("value");
        stringValue.setAccessible(true);
        stringValue.set(str, str.toUpperCase().toCharArray());
        System.out.println("New value: " + str);
        String copy = new String(str); // force a copy
        System.out.println("Contains orig: " + set.contains(str));
        System.out.println("Contains copy: " + set.contains(copy));
    }
}
I would bet he is doing this as a warning against bad behavior rather than showing a 'cool' trick.
EDIT: I found the article you're referring to, and the article it is based on. The original article states: "This means that if a class in another package "fiddles" with an interned String, it can cause havoc in your program. Is this a good thing? (You don't need to answer ;-) " I think that makes it quite clear this is more of a protection guide than advice on how to code.
So if you walk away from this thread with only one piece of information, it is that reflection is dangerous, unreliable, and not to be trifled with!
Don't try this at home!
You are subverting String's immutability. There is no good reason to do this.
I think I cannot add to the explanations already provided, so perhaps I can add to the discussion by suggesting how this can be prevented.
You can prevent somebody tampering with your code in these and other unintended ways by means of using a security manager.
public static void main(String args[]){
    System.setSecurityManager(new SecurityManager());
    String jedi1 = "jedi";
    toUpperCase(jedi1);
    System.out.println(jedi1);
}
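This will generate an exception in the toUpperCase method, provided that you are not granting all privileges to all code bases in the default policy files. (In your current code the exception is swallowed by the empty catch block.) Note that SecurityManager is deprecated for removal as of Java 17, so this protection applies to older runtimes.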
What is the purpose? I'm not sure, ask the one that wrote this stuff. You normally should not do something like this. There is a reason String is immutable.
Here is how this method would look if the fields were public, i.e. without reflection:
public static void toUpperCase(String orig) {
    orig.value = orig.toUpperCase().toCharArray();
}
As value is of type char[], you can't assign a String to this field; this is why you need the toCharArray call after .toUpperCase(). If you try, Field.set throws an exception (an IllegalArgumentException, since a String cannot be converted to char[]), but the try-catch block there eats it. (This gives us another lesson: never use such empty catch blocks.)
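To see the type check in isolation, here is a minimal sketch that uses a stand-in class instead of java.lang.String, so it does not depend on JDK internals. The names FieldTypeMismatchDemo, Holder and trySet are made up for illustration; only the Field calls are the real reflection API.

```java
import java.lang.reflect.Field;

class FieldTypeMismatchDemo {
    // Hypothetical stand-in for String: a class with a private char[] field named "value".
    static class Holder {
        private char[] value = {'a', 'b'};
    }

    /** Tries to store newValue into Holder.value; returns "ok" or the exception's simple name. */
    static String trySet(Object newValue) {
        try {
            Holder holder = new Holder();
            Field field = Holder.class.getDeclaredField("value");
            field.setAccessible(true);
            field.set(holder, newValue); // rejects values that are not assignable to char[]
            return "ok";
        } catch (Exception ex) {
            // Report the failure instead of swallowing it.
            return ex.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySet("AB"));               // a String is not a char[]
        System.out.println(trySet("AB".toCharArray())); // a char[] is accepted
    }
}
```

The first call reports IllegalArgumentException and the second reports ok, which is why the original code needs toCharArray().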
Pay attention: this code might not do the correct thing, since the actual data of the original string might not start at the start of the char[] (this applies to older JDKs, where String has an offset field and substrings share the parent's array). Since you don't update the offset field, you can get IndexOutOfBoundsExceptions when using such a modified String. Also, the String object caches its hashCode, so this will be wrong, too.
Here would be a correct way:
public static void toUpperCase(String orig) {
    orig.value = orig.toUpperCase().toCharArray();
    orig.offset = 0;
    orig.hash = 0; // will be recalculated on next `.hashCode()` call.
}
With reflection, it looks like this:
public static void toUpperCase(String orig)
{
    try
    {
        Field stringValue = String.class.getDeclaredField("value");
        stringValue.setAccessible(true);
        stringValue.set(orig, orig.toUpperCase().toCharArray());
        Field stringOffset = String.class.getDeclaredField("offset");
        stringOffset.setAccessible(true);
        stringOffset.setInt(orig, 0);
        Field stringHash = String.class.getDeclaredField("hash");
        stringHash.setAccessible(true);
        stringHash.setInt(orig, 0);
    }
    catch (Exception ex){
        // at least print the output
        ex.printStackTrace();
    }
}
1.) Read Bohemian's answer.
2.) Strings are internally stored in a char array, that's why you need to call toCharArray to set the field.
By default, String.toUpperCase() leaves the original string intact, whilst returning a new string object.
The function you defined above edits the contents of the original string object in place.
You change a final string with reflection for testing. Sometimes that string contains the path to a default location used in the production environment but not suitable for testing. Yet that variable is referenced by several objects/methods you trigger in your test, and hence during your tests you might want to set it to a particular value.
As others said, it's probably something you don't want to be doing (often/ever).
I'm not sure if that is the correct title to use but...
I have been given a program with test cases eg:
private void Test(){
    CompareString string1 = new CompareString("String here");
    CompareString string2 = new CompareString("String there");
    assertEquals(expected, string1.compareTo(string2));
}
(I cannot change the Tests)
And a class
public class CompareString{
    CompareString(String stringIn){
        //your code here
    }
    public boolean compareTo(CompareString aString){
        //your code here
        return true;
    }
}
How do I get each string stored separately in order to do the comparison (and further work that is not described here)?
With the test creating a 'new' instance each time, any variables I try to store and create are reset.
I tried to use this.stringVariable = stringIn, but this obviously will only store the last string instance ("String there" in this case). How can I store the strings separately?
My understanding is that you have a test which you cannot modify and you have a class used by this test which is partially implemented. You are asking about the implementation by asking "How do I get each string stored separately...". (Hint: this is called encapsulation, but read further)
I would like to point out a good resource for you: The Java™ Tutorials.
There you will learn about objects, data encapsulation, and constructors.
Your task is to implement a constructor and a method. After you learn the essentials I outlined, you will be able to solve the problem. We could help you by writing the code, but then you would not learn how to do it yourself.
If you answer "yes", then give an example of how you can change a String.
If you answer "no", then explain why the Java designers don't let us modify Strings
No.
Strings are immutable by design. If you want to alter a string, I suggest you look at the StringBuffer class.
Immutable is a design pattern. As multiple entities can reference an object, immutability ensures that the object you reference has not been altered by some other object that references it.
No, you cannot modify a string after you create it. String is an example of an immutable class. If you have the following code:
String a = "hello";
String b = a;
a and b are now referencing the same object, but there is nothing that you can do to a that will change the string that b is referencing. You can only assign a to a different string.
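A minimal sketch of that point, reusing the a and b from above (the class name StringAliasDemo and the reassign method are just for illustration): reassigning a makes it refer to a different String, while b still sees the original.

```java
class StringAliasDemo {
    /** Returns {a, b} after a has been reassigned to an upper-case copy. */
    static String[] reassign() {
        String a = "hello";
        String b = a;          // b refers to the same String object as a
        a = a.toUpperCase();   // toUpperCase() builds a new String; a now refers to that one
        return new String[] { a, b };
    }

    public static void main(String[] args) {
        String[] result = reassign();
        System.out.println("a = " + result[0]); // a = HELLO
        System.out.println("b = " + result[1]); // b = hello
    }
}
```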
I can only speculate about why the Java designers chose to have it this way, but I suspect a major reason is to facilitate code predictability. Suppose you had something like this:
String h = "house";
String hcopy = h;
and at some later point you were able to modify h in some way like h.splitInHalf() that actually changed h's string, that would also affect hcopy and make things very frustrating when simple String variables changed values without warning.
If you would like a String-like structure that you can modify, you can use StringBuffer.
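For contrast, here is a small sketch of what the h/hcopy scenario looks like with a mutable StringBuffer (the class name BufferAliasDemo and the mutate method are made up for illustration). Because both variables refer to one buffer, a change made through h is visible through hcopy, which is exactly the surprise String avoids.

```java
class BufferAliasDemo {
    /** Returns the text seen through h and through hcopy after mutating via h. */
    static String[] mutate() {
        StringBuffer h = new StringBuffer("house");
        StringBuffer hcopy = h;   // not a copy: both names refer to the same buffer
        h.setLength(3);           // truncates in place to "hou"
        h.append("nd");           // the buffer now holds "hound"
        return new String[] { h.toString(), hcopy.toString() };
    }

    public static void main(String[] args) {
        String[] result = mutate();
        System.out.println("h     = " + result[0]); // h     = hound
        System.out.println("hcopy = " + result[1]); // hcopy = hound
    }
}
```

StringBuilder offers the same methods without synchronization, so the sketch would work unchanged with it.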
No, you cannot. Each time you try to change it, a new String object is actually created. If you want a changeable String, use StringBuffer or StringBuilder instead.
There is a String constant pool in memory for better performance.
In my experience, that's a popular interview question. The objective is to figure out whether the interviewee knows the internals of the java.lang.String class.
So to answer the question: by design, Java does not allow you to change a String object; those objects are immutable. However, though you should never do it in real production code, it is technically possible to change a String object using the Reflection API.
import java.lang.reflect.Field;

public class App {
    public static void main(String[] args) throws Exception {
        String str = "Hello world!";
        Field valueField = String.class.getDeclaredField("value");
        valueField.setAccessible(true);
        char[] c = (char[]) valueField.get(str);
        c[1] = 'a';
        System.out.println(str); // prints Hallo world! (on JDKs where value is a char[])
    }
}
As is often said, Java will let you shoot yourself in the foot after you use a magic spell of "import java.lang.reflect.*;"
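Whether that trick still works depends on the JDK. From Java 9 on, String keeps its data in a byte[] rather than a char[], and from Java 16 on, reflective access to java.lang internals is blocked by default unless the JVM is started with something like --add-opens java.base/java.lang=ALL-UNNAMED. The sketch below only probes the field without modifying anything; StringReflectionProbe and probe are names invented for this example.

```java
import java.lang.reflect.Field;

class StringReflectionProbe {
    /** Reports whether String's private "value" field can be read reflectively on this JVM. */
    static String probe() {
        String str = new String("Hello world!");
        try {
            Field valueField = String.class.getDeclaredField("value");
            valueField.setAccessible(true);           // may be refused on newer JDKs
            Object backing = valueField.get(str);     // read only, nothing is modified
            return "accessible, backing array type: " + backing.getClass().getSimpleName();
        } catch (Exception ex) {
            // Newer JDKs throw a runtime exception from setAccessible here.
            return "blocked: " + ex.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

The output differs by runtime, so no single expected line is given here.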
I wrote the following Java code, and I expected the compiler to complain about it, but I didn't get any errors. Why is this?
public class Ba {
    public static void main(String[] args) {
        Ba ba = new Ba();
        ba.fetchSomeValues();
    }

    public String fetchSomeValues(){
        return "Hello";
    }
}
I am calling the method fetchSomeValues(), which should return "Hello" (a String), and in the main method I have included ba.fetchSomeValues(); without assigning the result to a String variable. The compiler doesn't complain. Why is this?
You don't have to assign return values to variables. You don't have to do anything at all with them.
That said, it is usually not recommended to just drop the return value of a method.
A counterexample is Map's put method, which returns the previous value associated with the key. If you don't care whether there was a previous value or not, you simply ignore the return value.
Map<Integer, String> map = new HashMap<Integer, String>();
map.put(1, "one"); // we don't assign the return value since we don't care
So in a nutshell the compiler cannot tell whether you care about return values or not. It is only a problem if the value is significant in the context you are using the method and you ignore it.
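A slightly fuller sketch of the same idea, showing one call where the return value of put is ignored and one where it is used (the class name IgnoredReturnDemo and the run method are just for this example):

```java
import java.util.HashMap;
import java.util.Map;

class IgnoredReturnDemo {
    /** Returns {previous value reported by the second put, value now stored under key 1}. */
    static String[] run() {
        Map<Integer, String> map = new HashMap<>();
        map.put(1, "one");                    // returns null (no previous mapping); we ignore it
        String previous = map.put(1, "uno");  // this time we keep the previous value
        return new String[] { previous, map.get(1) };
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("previous = " + result[0]); // previous = one
        System.out.println("current  = " + result[1]); // current  = uno
    }
}
```

Both calls compile; the compiler has no way to know which caller cares about the result.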
It is absolutely valid to ignore the return value of a method (although not always recommended).
In Java, you can ignore return values as you just found out.
There's nothing wrong in this code.
fetchSomeValues() returns a string but you don't have to assign the value.
Normally you can write String returnedValue = a.fetchSomeValues() but it is not necessary
This behavior is not specific to Java, it is in fact the norm for all the mainstream (and even those less so) languages of today. Even in FP-languages, where the focus is on side-effect-free functions whose only point is the return value, this is allowed.
You should really ask yourself, Do I want to use a language that forces me to assign every return value? Would that be a convenient language to use?
ba.fetchSomeValues(); does return the string "Hello", but since there is no variable on the left-hand side (for example String s = ba.fetchSomeValues();), the String object holding "Hello" that was created in the fetchSomeValues() method simply goes unused.
As we all know, String is immutable in Java. However, one can change it using reflection, by getting the Field and setting its access level. (I know it is ill-advised, and I am not planning to do so; this question is purely theoretical.)
My question: assuming I know what I am doing (and modify all fields as needed), will the program run properly? Or does the JVM make some optimizations that rely on String being immutable? Will I suffer a performance loss? If so, what assumptions does it make, and what will go wrong in the program?
p.s. String is just an example, I am interested actually in a general answer, in addition to the example.
thanks!
After compilation, several string literals may refer to the same instance, so you will edit more than you intended and never know what else you are editing.
public static void main(String args[]) throws Exception {
    String s1 = "Hello"; // I want to edit it
    String s2 = "Hello"; // It may be anywhere and must not be edited
    Field f = String.class.getDeclaredField("value");
    f.setAccessible(true);
    f.set(s1, "Doesn't say hello".toCharArray());
    System.out.println(s2);
}
Output:
Doesn't say hello
You are definitely asking for trouble if you do this. Does that mean you will definitely see bugs right away? No. You might get away with it in a lot of cases, depending on what you're doing.
Here are a couple of cases where it would bite you:
You modify a string that happens to have been declared as literal somewhere within the code. For example you have a function and somewhere it is being called like function("Bob"); in this scenario the string "Bob" is changed throughout your app (this will also be true of string constants declared as final).
You modify a string which is used in substring operations, or which is the result of a substring operation. In older JDKs (Java 6 and early Java 7 releases), taking a substring of a string actually uses the same underlying character array as the source string, which means modifications to the source string will affect substrings (and vice versa).
You modify a string that happens to be used as a key in a map somewhere. It will no longer compare equal to its original value, so lookups will fail.
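The map-key case can be reproduced without touching String at all. The sketch below uses a deliberately mutable key class (Name, a hypothetical type invented for this example) to show what happens to a hash-based lookup once the key's contents change after insertion; a mutated String would run into the same mismatch between the stored hash and its new contents.

```java
import java.util.HashSet;
import java.util.Set;

class MutableKeyDemo {
    /** Hypothetical key type whose equals/hashCode depend on a mutable field. */
    static final class Name {
        String text;
        Name(String text) { this.text = text; }
        @Override public boolean equals(Object o) {
            return o instanceof Name && ((Name) o).text.equals(text);
        }
        @Override public int hashCode() { return text.hashCode(); }
    }

    /** Returns {found before mutation, found after mutation, found via a fresh "Bob" key}. */
    static boolean[] run() {
        Set<Name> set = new HashSet<>();
        Name key = new Name("Bob");
        set.add(key);                                      // stored under the hash of "Bob"
        boolean before = set.contains(key);
        key.text = "Alice";                                // mutate the key after insertion
        boolean after = set.contains(key);                 // looks up the hash of "Alice"
        boolean viaFresh = set.contains(new Name("Bob"));  // right hash, but equals now fails
        return new boolean[] { before, after, viaFresh };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("before mutation: " + r[0]); // before mutation: true
        System.out.println("after mutation:  " + r[1]); // after mutation:  false
        System.out.println("fresh Bob key:   " + r[2]); // fresh Bob key:   false
    }
}
```

The element is still in the set, but neither the mutated key nor an equal-looking fresh key can find it any more.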
I know this question is about Java, but I wrote a blog post a while back illustrating just how insane your program may behave if you mutate a string in .NET. The situations are really quite similar.
The thing that jumps to mind for me is string interning - literals, anything in the constant pool and anything manually intern()ed points to the same string object. If you start messing around with the contents of an interned string literal, you may well see the exact same alterations on all the other literals using the same underlying object.
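The sharing itself is easy to observe without any reflection. A small sketch (the class name InternDemo and the run method are made up for illustration) that compares references with ==, which is normally the wrong way to compare strings but is exactly what we want to look at here:

```java
class InternDemo {
    /** Returns {literal == literal, literal == new String, literal == new String interned}. */
    static boolean[] run() {
        String a = "Bob";
        String b = "Bob";                 // same literal, same pooled object as a
        String c = new String("Bob");     // explicitly a distinct String object
        return new boolean[] { a == b, a == c, a == c.intern() };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("a == b:          " + r[0]); // a == b:          true
        System.out.println("a == c:          " + r[1]); // a == c:          false
        System.out.println("a == c.intern(): " + r[2]); // a == c.intern(): true
    }
}
```

Since a and b are one object, anything that alters that object's contents is visible through every reference to the literal.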
I'm not sure whether the above actually happens since I've never tried (in theory it will; I don't know if something happens behind the scenes to stop it, but I doubt it), but it's things like that that could throw up potential problems. Of course, it could also throw up problems at the Java level through just passing multiple references to the same string around and then using a reflection attack to alter the object from one of the references. Most people (me included!) won't explicitly guard against that sort of thing in code, so using that attack with any code that's not your own, or your own code if you haven't guarded against that either, could cause all sorts of bizarre, horrible bugs.
It's an interesting area theoretically, but the more you dig around the more you see why anything along these lines is a bad idea!
Speaking outside of String, there are no performance enhancements I know of for an object being immutable (indeed, I don't think the JVM can even tell at the moment whether an object is immutable, reflection attacks aside). It could throw things like the checker-framework off though, or anything that tries to statically analyse the code to guarantee it's immutable.
I'm pretty sure the JVM itself makes no assumptions about the immutability of Strings, as "immutability" in Java is not a language-level construct; it's a trait implied by a class's implementation, but cannot, as you note, be actually guaranteed in the presence of reflection. Thus, it also shouldn't be relevant to performance.
However, pretty much all Java code in existence (including the Standard API implementation) relies on Strings being immutable, and if you break that expectation, you'll see all kinds of bugs.
The private fields in the String class are the char[], the offset and length. Changing any of these should not have any adverse effect on any other object. But if you can somehow change the contents of the char[], then you could probably see some surprising side effects.
public static void main(String args[]){
    String a = "test213";
    String s = new String("test213");
    try {
        System.out.println(s);
        System.out.println(a);
        char[] value = (char[])getFieldValue(s, "value");
        value[1] = 'a';
        System.out.println(s);
        System.out.println(a);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

static Object getFieldValue(String s,String fieldName) throws SecurityException, NoSuchFieldException, IllegalArgumentException, IllegalAccessException {
    Object chars = null;
    Field innerCharArray = String.class.getDeclaredField(fieldName);
    innerCharArray.setAccessible(true);
    chars = innerCharArray.get(s);
    return chars;
}
Changing the value array of s will also change the literal a, as mentioned in all the answers.
To demonstrate how it can screw up a program:
System.out.print("Initial: "); System.out.println(addr);
editIntStr("ADDR_PLACEH", "192.168.1.1");
System.out.print("From var: "); System.out.println(addr);//
System.out.print("Hardcoded: "); System.out.println("ADDR_PLACEH");
System.out.print("Substring: "); System.out.println("ADDR_PLACE" + "H".substring(0));
System.out.print("Equals test: "); System.out.println("ADDR_PLACEH".equals("192.168.1.1"));
System.out.print("Equals test with substring: "); System.out.println(("ADDR_PLACE" + "H".substring(0)).equals("192.168.1.1"));
Output:
Initial: ADDR_PLACEH
From var: 192.168.1.1
Hardcoded: 192.168.1.1
Substring: ADDR_PLACEH
Equals test: true
Equals test with substring: false
The result of the first Equals test is weird, isn't it? You can't expect your fellow programmers to figure out why Java thinks they are equal...
Full test code: http://pastebin.com/vbstfWX1
I have a helper class with this static variable, which is used for passing data between two classes.
public class Helper{
    public static String paramDriveMod;//this is the static variable in first class
}
This variable is used in the following method of a second class:
public void USB_HandleMessage(char []USB_RXBuffer){
    int type=USB_RXBuffer[2];
    MESSAGES ms=MESSAGES.values()[type];
    switch(ms){
        case READ_PARAMETER_VALUE: // read parameter values
            switch(prm){
                case PARAMETER_DRIVE_MODE: // parameter drive mode
                    Helper.paramDriveMod =(Integer.toString(((USB_RXBuffer[4]<< 8)&0xff00)));
                    System.out.println(Helper.paramDriveMod+"drive mode is selectd ");
                    //here it shows the value that I need...........
            }
    }
    //let say end switch and method
}
And the following is a method of a third class that uses the above class's method:
public void buttonSwitch(int value) throws InterruptedException{
    boolean bool=true;
    int c=0;
    int delay=(int) Math.random();
    while(bool){
        int param=3;
        PARAMETERS prm=PARAMETERS.values()[param];
        switch(value){
            case 0:
                value=1;
                while(c<5){
                    Thread.sleep(delay);
                    protocol.onSending(3,prm.PARAMETER_DRIVE_MODE.ordinal(),dataToRead,dataToRead.length);//read drive mode
                    System.out.println(Helper.paramDriveMod+" drive mode is ..........in wile loop");//here it shows null value
                }
                //break; ?
        }
    }
    //let say end switch and method
}
What is the reason that this variable loses its value?
Could I suggest that to pass data between classes, you use separate objects instead of a global variable?
It's not at all clear how you expect the code in protocolImpl to get executed - as templatetypedef mentions, you haven't shown valid Java code in either that or the param class (neither of which follows Java naming conventions).
A short but complete example would really help, but in general I would suggest you avoid using this pattern in the first place. Think in terms of objects, not global variables.
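As a rough sketch of what "objects, not global variables" could look like here: the state lives in one small object that is handed to both sides. Everything below (DriveModeState, Receiver, Poller and their methods) is a hypothetical stand-in, not the asker's real classes; only the bit-shifting expression is taken from the question.

```java
class SharedStateDemo {
    /** Hypothetical holder that replaces the static Helper.paramDriveMod field. */
    static final class DriveModeState {
        // volatile because the value may be written and read on different threads
        private volatile String driveMode;
        String get() { return driveMode; }
        void set(String value) { driveMode = value; }
    }

    /** Hypothetical stand-in for the class that parses incoming USB messages. */
    static final class Receiver {
        private final DriveModeState state;
        Receiver(DriveModeState state) { this.state = state; }
        void handleMessage(char[] rxBuffer) {
            state.set(Integer.toString((rxBuffer[4] << 8) & 0xff00));
        }
    }

    /** Hypothetical stand-in for the class that reads the drive mode. */
    static final class Poller {
        private final DriveModeState state;
        Poller(DriveModeState state) { this.state = state; }
        String currentDriveMode() { return state.get(); }
    }

    /** Returns {value seen before any message arrived, value seen after one message}. */
    static String[] run() {
        DriveModeState state = new DriveModeState();   // one instance shared by both sides
        Receiver receiver = new Receiver(state);
        Poller poller = new Poller(state);
        String before = poller.currentDriveMode();     // null: nothing received yet
        receiver.handleMessage(new char[] { 0, 0, 0, 0, 2 });
        String after = poller.currentDriveMode();      // "512", i.e. (2 << 8) & 0xff00
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("before: " + r[0]); // before: null
        System.out.println("after:  " + r[1]); // after:  512
    }
}
```

Written this way, the null in the "before" case is visibly a question of ordering (the reader ran before the writer), which is easier to reason about than a static field that anything may touch.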
As I understand it, a "Class" (not just an instance but the entire class object) can be garbage collected just like any other unreferenced object; a static variable in that class won't prevent the GC from collecting your class.
I just came here because I think I'm seeing this behavior in a singleton and I wanted to see if anyone else noticed it (I've never had to research the problem before, and this knowledge is like a decade old from the back of my brain, so I'm unsure of its reliability at this point).
Going to go continue research now.
Just found this question, check the accepted answer--looks like it's unlikely that a static will be lost due to GC, but possible.
Are static fields open for garbage collection?
A variable never "loses" its value. Either you set it to null somewhere or it was never assigned, but your code here is not enough to tell what's going on. The only place here where you set it is this line:
Helper.paramDriveMod =(Integer.toString(((USB_RXBuffer[4]<< 8)&0xff00)));
But Integer.toString(int) never returns null, and a null USB_RXBuffer would throw a NullPointerException instead, so I would assume that this line never gets hit and you see "null" because paramDriveMod is never initialized with some other value.
Don't use a static variable unless you really have to. You can use a getter and setter instead.
Could it be that you may be confusing static with final? Static variables' values can change. Final variables' values can not.
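A tiny sketch of the difference (the class StaticVsFinalDemo and its fields are invented for this example): a static field is shared by the whole class but can be reassigned freely, and a reference-typed static field that was never assigned reads as null.

```java
class StaticVsFinalDemo {
    static String mode;                         // static, not final: may be reassigned
    static final String DEFAULT_MODE = "idle";  // final: assigning to it again would not compile

    /** Returns {value before assignment in this call, after first assignment, after second}. */
    static String[] run() {
        String before = mode;     // null on first use: static reference fields default to null
        mode = DEFAULT_MODE;
        String middle = mode;
        mode = "drive";           // reassigning a static field is fine
        // DEFAULT_MODE = "drive"; // compile error: cannot assign a value to final variable
        return new String[] { before, middle, mode };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " -> " + r[1] + " -> " + r[2]); // null -> idle -> drive
    }
}
```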
The execution flow is not shown. Maybe the third snippet:
while(c<5){
    Thread.sleep(delay);
    protocol.onSending(3,prm.PARAMETER_DRIVE_MODE.ordinal(),dataToRead,dataToRead.length);//read drive mode
    System.out.println(Helper.paramDriveMod+" drive mode is ..........in wile loop");//here it shows null value
is executed before the second snippet:
switch(ms)
{
    case READ_PARAMETER_VALUE: // read parameter values
        switch(prm){
            case PARAMETER_DRIVE_MODE: // parameter drive mode
                Helper.paramDriveMod =(Integer.toString(((USB_RXBuffer[4]<< 8)&0xff00)));