I am unable to get what are the scenarios where we need an immutable class.
Have you ever faced any such requirement? or can you please give us any real example where we should use this pattern.
The other answers seem too focused on explaining why immutability is good. It is very good and I use it whenever possible. However, that is not your question. I'll take your question point by point to try to make sure you're getting the answers and examples you need.
I am unable to get what are the scenarios where we need an immutable class.
"Need" is a relative term here. Immutable classes are a design pattern that, like any paradigm/pattern/tool, is there to make constructing software easier. Similarly, plenty of code was written before the OO paradigm came along, but count me among the programmers that "need" OO. Immutable classes, like OO, aren't strictly needed, but I'm going to act like I need them.
Have you ever faced any such requirement?
If you aren't looking at the objects in the problem domain with the right perspective, you may not see a requirement for an immutable object. It might be easy to think that a problem domain doesn't require any immutable classes if you're not familiar with when to use them advantageously.
I often use immutable classes where I think of a given object in my problem domain as a value or fixed instance. This notion is sometimes dependent on perspective or viewpoint, but ideally, it will be easy to switch into the right perspective to identify good candidate objects.
You can get a better sense of where immutable objects are really useful (if not strictly necessary) by making sure you read up on various books/online articles to develop a good sense of how to think about immutable classes. One good article to get you started is Java theory and practice: To mutate or not to mutate?
I'll try to give a couple of examples below of how one can see objects in different perspectives (mutable vs immutable) to clarify what I mean by perspective.
... can you please give us any real example where we should use this pattern.
Since you asked for real examples I'll give you some, but first, let's start with some classic examples.
Classic Value Objects
Strings and integers are often thought of as values. Therefore it's not surprising to find that the String class and the Integer wrapper class (as well as the other wrapper classes) are immutable in Java. A color is usually thought of as a value, thus the immutable Color class.
Counterexample
In contrast, a car is not usually thought of as a value object. Modeling a car usually means creating a class that has changing state (odometer, speed, fuel level, etc). However, there are some domains where a car may be a value object. For example, a car (or specifically a car model) might be thought of as a value object in an app to look up the proper motor oil for a given vehicle.
Playing Cards
Ever write a playing card program? I did. I could have represented a playing card as a mutable object with a mutable suit and rank. A draw-poker hand could be 5 fixed instances where replacing the 5th card in my hand would mean mutating the 5th playing card instance into a new card by changing its suit and rank ivars.
However, I tend to think of a playing card as an immutable object that has a fixed, unchanging suit and rank once created. My draw poker hand would be 5 instances, and replacing a card in my hand would involve discarding one of those instances and adding a new random instance to my hand.
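As a rough sketch of the immutable playing card described above (the names `DrawPokerSketch`, `PlayingCard`, `Suit`, `Rank` and `replaceCard` are made up for illustration, not taken from the original program), replacing a card swaps a reference in the hand and never touches an existing card:

```java
import java.util.ArrayList;
import java.util.List;

public class DrawPokerSketch {
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }
    enum Rank { TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }

    // Immutable: final class, final fields, no setters.
    static final class PlayingCard {
        private final Suit suit;
        private final Rank rank;

        PlayingCard(Suit suit, Rank rank) {
            this.suit = suit;
            this.rank = rank;
        }

        Suit suit() { return suit; }
        Rank rank() { return rank; }

        @Override
        public String toString() { return rank + " of " + suit; }
    }

    // Replacing a card changes which instance the hand refers to; no card is mutated.
    static void replaceCard(List<PlayingCard> hand, int index, PlayingCard drawn) {
        hand.set(index, drawn);
    }

    public static void main(String[] args) {
        List<PlayingCard> hand = new ArrayList<>();
        PlayingCard discarded = new PlayingCard(Suit.HEARTS, Rank.TWO);
        hand.add(discarded);
        replaceCard(hand, 0, new PlayingCard(Suit.SPADES, Rank.ACE));
        System.out.println(hand.get(0)); // ACE of SPADES
        System.out.println(discarded);   // TWO of HEARTS (the old card is unchanged)
    }
}
```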
Map Projection
One last example is when I worked on some map code where the map could display itself in various projections. The original code had the map use a fixed but mutable projection instance (like the mutable playing card above). Changing the map projection meant mutating the map's projection instance's ivars (projection type, center point, zoom, etc).
However, I felt the design was simpler if I thought of a projection as an immutable value or fixed instance. Changing the map projection meant having the map reference a different projection instance rather than mutating the map's fixed projection instance. This also made it simpler to capture named projections such as MERCATOR_WORLD_VIEW.
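A minimal sketch of that design might look like the following. The field set (type, center, zoom), the `withZoom` helper and the `MapView` class are assumptions for illustration; only the idea of a named instance such as MERCATOR_WORLD_VIEW comes from the description above.

```java
public class ProjectionSketch {
    enum Type { MERCATOR, ORTHOGRAPHIC }

    // Immutable value: every field is final and a "change" produces a new instance.
    static final class Projection {
        // A named, freely shareable instance.
        static final Projection MERCATOR_WORLD_VIEW =
                new Projection(Type.MERCATOR, 0.0, 0.0, 1.0);

        final Type type;
        final double centerLat;
        final double centerLon;
        final double zoom;

        Projection(Type type, double centerLat, double centerLon, double zoom) {
            this.type = type;
            this.centerLat = centerLat;
            this.centerLon = centerLon;
            this.zoom = zoom;
        }

        // Returns a modified copy instead of mutating this projection.
        Projection withZoom(double newZoom) {
            return new Projection(type, centerLat, centerLon, newZoom);
        }
    }

    // The map is mutable, but only in which projection instance it references.
    static final class MapView {
        private Projection projection = Projection.MERCATOR_WORLD_VIEW;

        void setProjection(Projection p) { this.projection = p; }
        Projection projection() { return projection; }
    }

    public static void main(String[] args) {
        MapView map = new MapView();
        map.setProjection(map.projection().withZoom(4.0));
        System.out.println(map.projection().zoom);               // 4.0
        System.out.println(Projection.MERCATOR_WORLD_VIEW.zoom); // 1.0
    }
}
```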
Immutable classes are in general much simpler to design, implement and use correctly. An example is String: the implementation of java.lang.String is significantly simpler than that of std::string in C++, mostly due to its immutability.
One particular area where immutability makes an especially big difference is concurrency: immutable objects can safely be shared among multiple threads, whereas mutable objects must be made thread-safe via careful design and implementation - usually this is far from a trivial task.
Update: Effective Java 2nd Edition tackles this issue in detail - see Item 15: Minimize mutability.
See also these related posts:
non-technical benefits of having string-type immutable
Downsides to immutable objects in Java?
Effective Java by Joshua Bloch outlines several reasons to write immutable classes:
Simplicity - each instance is in one state only
Thread Safe - because the state cannot be changed, no synchronization is required
Writing in an immutable style can lead to more robust code. Imagine if Strings weren't immutable: any getter method that returned a String would require the implementation to create a defensive copy before the String was returned; otherwise a client might accidentally or maliciously break the state of the object.
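To make the defensive-copy point concrete, here is a small sketch using java.util.Date, a mutable JDK class, next to a String field. The `Appointment` class is hypothetical.

```java
import java.util.Date;

public class DefensiveCopySketch {
    static final class Appointment {
        private final String title; // String is immutable: safe to hand out as-is
        private final Date start;   // Date is mutable: copy on the way in and on the way out

        Appointment(String title, Date start) {
            this.title = title;
            this.start = new Date(start.getTime());
        }

        String getTitle() { return title; }

        // Without this copy, a caller could change the appointment's start time.
        Date getStart() { return new Date(start.getTime()); }
    }

    public static void main(String[] args) {
        Appointment a = new Appointment("Review", new Date(0L));
        a.getStart().setTime(999L);                 // mutates only the returned copy
        System.out.println(a.getStart().getTime()); // 0
    }
}
```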
In general it is good practise to make an object immutable unless there are severe performance problems as a result. In such circumstances, mutable builder objects can be used to build immutable objects e.g. StringBuilder
Hashmaps are a classic example. It's imperative that the key to a map be immutable. If the key is not immutable, and you change a value on the key such that hashCode() would result in a new value, the map is now broken (a key is now in the wrong location in the hash table).
In Java, practically everything is a reference. Sometimes an instance is referenced from multiple places. If you change such an instance, the change is visible through all of its references. Sometimes you simply don't want this, to improve robustness and thread safety. An immutable class is then useful, because one is forced to create a new instance and reassign it to the current reference. This way the original instance seen through the other references remains untouched.
Imagine what Java would look like if String were mutable.
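The broken-map scenario can be reproduced in a few lines. `MutableKey` is a made-up class whose hashCode() depends on a field that can change after insertion:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeySketch {
    // Hypothetical key whose equals()/hashCode() depend on mutable state.
    static final class MutableKey {
        String name;

        MutableKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
        }

        @Override
        public int hashCode() { return Objects.hashCode(name); }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("alice");
        map.put(key, "value");

        key.name = "bob"; // the entry stays filed under the hash of "alice"

        System.out.println(map.get(key));                     // null
        System.out.println(map.get(new MutableKey("alice"))); // null
        System.out.println(map.size());                       // 1 (entry still there, but unreachable by lookup)
    }
}
```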
Let's take an extreme case: integer constants. If I write a statement like "x=x+1" I want to be 100% confident that the number "1" will not somehow become 2, no matter what happens anywhere else in the program.
Now okay, integer constants are not a class, but the concept is the same. Suppose I write:
String customerId=getCustomerId();
String customerName=getCustomerName(customerId);
String customerBalance=getCustomerBalance(customerId);
Looks simple enough. But if Strings were not immutable, then I would have to consider the possibility that getCustomerName could change customerId, so that when I call getCustomerBalance, I am getting the balance for a different customer. Now you might say, "Why in the world would someone writing a getCustomerName function make it change the id? That would make no sense." But that's exactly where you could get in trouble. The person writing the above code might take it as just obvious that the functions would not change the parameter. Then someone comes along who has to modify another use of that function to handle the case where a customer has multiple accounts under the same name. And he says, "Oh, here's this handy getCustomerName function that's already looking up the name. I'll just make that automatically change the id to the next account with the same name, and put it in a loop ..." And then your program starts mysteriously not working. Would that be bad coding style? Probably. But it's precisely a problem in cases where the side effect is NOT obvious.
Immutability simply means that a certain class of objects are constants, and we can treat them as constants.
(Of course the user could assign a different "constant object" to a variable. Someone can write
String s="hello";
and then later write
s="goodbye";
Unless I make the variable final, I can't be sure that it's not being changed within my own block of code. Just like integer constants assure me that "1" is always the same number, but not that "x=1" will never be changed by writing "x=2". But I can be confident that if I have a handle to an immutable object, no function I pass it to can change it on me, and that if I make two copies of it, a change to the variable holding one copy will not change the other. Etc.)
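The difference can be seen by letting StringBuilder stand in for a hypothetical mutable string. The `lookupName` methods below are invented for the sketch and simply return a fixed name.

```java
public class SideEffectSketch {
    // Hypothetical lookup that "helpfully" advances a mutable id as a side effect.
    static String lookupName(StringBuilder mutableId) {
        mutableId.append("-2"); // visible to the caller
        return "Alice";
    }

    // With an immutable String, the callee can only rebind its own parameter.
    static String lookupName(String id) {
        id = id + "-2"; // creates a new String; the caller's String is untouched
        return "Alice";
    }

    public static void main(String[] args) {
        StringBuilder mutableId = new StringBuilder("C100");
        lookupName(mutableId);
        System.out.println(mutableId); // C100-2

        String id = "C100";
        lookupName(id);
        System.out.println(id);        // C100
    }
}
```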
We don't need immutable classes, per se, but they can certainly make some programming tasks easier, especially when multiple threads are involved. You don't have to perform any locking to access an immutable object, and any facts that you've already established about such an object will continue to be true in the future.
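As an illustration of lock-free sharing, the sketch below hands one immutable value to several threads. `Money` and `runWorkers` are hypothetical names, and the only coordination is joining the threads before reading their results.

```java
import java.util.Arrays;

public class SharedImmutableSketch {
    static final class Money {
        private final long cents;

        Money(long cents) { this.cents = cents; }

        long cents() { return cents; }

        // Returns a new value; the receiver is never modified.
        Money plus(long extraCents) { return new Money(cents + extraCents); }
    }

    // Every worker reads the same shared instance without any locking.
    static long[] runWorkers(Money shared, int workers) throws InterruptedException {
        long[] results = new long[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> results[id] = shared.plus(id).cents());
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join(); // join() makes each worker's write visible to this thread
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Money shared = new Money(100);
        System.out.println(Arrays.toString(runWorkers(shared, 4))); // [100, 101, 102, 103]
        System.out.println(shared.cents());                         // 100
    }
}
```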
There are various reasons for immutability:
Thread Safety: An immutable object cannot be changed, nor can its internal state change, so there's no need to synchronise access to it.
It also guarantees that whatever I send through (through a network) has to come in the same state as previously sent. It means that nobody (eavesdropper) can come and add random data in my immutable set.
It's also simpler to develop. Immutable classes are typically declared final, so you are guaranteed that no subclasses will exist, e.g. the String class.
So, if you want to send data through a network service, and you want a sense of guarantee that you will have your result exactly the same as what you sent, set it as immutable.
My 2 cents for future visitors:
2 scenarios where immutable objects are good choices are:
In multi-threading
Concurrency issues in a multi-threaded environment can very well be solved by synchronization, but synchronization is a costly affair (I won't dig into the "why" here). If you use immutable objects, no synchronization is needed to solve the concurrency issue: the state of an immutable object cannot be changed, so all threads can access the object seamlessly. Immutable objects therefore make a great choice for shared objects in a multi-threaded environment.
As key for hash based collections
One of the most important things to note when working with a hash-based collection is that a key's hashCode() should always return the same value for the lifetime of the object. If that value changes, the old entry made in the collection using that object can no longer be retrieved, which causes a memory leak. Since the state of immutable objects cannot be changed, they make a great choice as keys in hash-based collections. If you use an immutable object as a key, you can be sure there will be no memory leak because of that (of course there can still be a memory leak when the object used as the key is not referenced from anywhere else, but that's not the point here).
I'm going to attack this from a different perspective. I find immutable objects make life easier for me when reading code.
If I have a mutable object I am never sure what its value is if it's ever used outside of my immediate scope. Let's say I create MyMutableObject in a method's local variables, fill it out with values, then pass it to five other methods. ANY ONE of those methods can change my object's state, so one of two things has to occur:
I have to keep track of the bodies of five additional methods while thinking about my code's logic.
I have to make five wasteful defensive copies of my object to ensure that the right values get passed to each method.
The first makes reasoning about my code difficult. The second makes my code suck in performance -- I'm basically mimicking an immutable object with copy-on-write semantics anyway, but doing it all the time whether or not the called methods actually modify my object's state.
If I instead use MyImmutableObject, I can be assured that what I set is what the values will be for the life of my method. There's no "spooky action at a distance" that will change it out from under me and there's no need for me to make defensive copies of my object before invoking the five other methods. If the other methods want to change things for their purposes they have to make the copy – but they only do this if they really have to make a copy (as opposed to my doing it before each and every external method call). I spare myself the mental resources of keeping track of methods which may not even be in my current source file, and I spare the system the overhead of endlessly making unnecessary defensive copies just in case.
(If I go outside of the Java world and into, say, the C++ world, among others, I can get even trickier. I can make the objects appear as if they're mutable, but behind the scenes make them transparently clone on any kind of state change—that's copy-on-write—with nobody being the wiser.)
Immutable objects are instances whose state does not change once they have been initialized.
The use of such objects is requirement specific.
An immutable class is good for caching purposes and it is thread safe.
By virtue of immutability you can be sure that the behavior/state of the underlying immutable object does not change, and with that you get the added advantage of performing additional operations:
You can use multiple cores (concurrent/parallel processing) with ease, as the sequence of operations will no longer matter.
You can cache expensive operations, as you are sure of the same result.
You can debug with ease, as the history of the run will no longer be a concern.
Using the final keyword doesn't necessarily make something immutable:
import java.lang.reflect.Field;

public class Scratchpad {
    public static void main(String[] args) throws Exception {
        SomeData sd = new SomeData("foo");
        System.out.println(sd.data); //prints "foo"
        voodoo(sd, "data", "bar");
        System.out.println(sd.data); //prints "bar"
    }

    // Overwrites a final instance field of obj via reflection.
    private static void voodoo(Object obj, String fieldName, Object value) throws Exception {
        Field f = obj.getClass().getDeclaredField(fieldName);
        // For a non-static final field, setAccessible(true) is enough to allow set().
        // The older trick of clearing Modifier.FINAL through Field's private
        // "modifiers" field fails on newer JDKs (12 and later).
        f.setAccessible(true);
        f.set(obj, value);
    }
}

class SomeData {
    final String data;

    SomeData(String data) {
        this.data = data;
    }
}
Just an example to demonstrate that the "final" keyword is there to prevent programmer error, and not much more. Whereas reassigning a value lacking a final keyword can easily happen by accident, going to this length to change a value would have to be done intentionally. It's there for documentation and to prevent programmer error.
Immutable data structures can also help when coding recursive algorithms. For example, say that you're trying to solve a 3SAT problem. One way is to do the following:
Pick an unassigned variable.
Give it the value of TRUE. Simplify the instance by taking out clauses that are now satisfied, and recur to solve the simpler instance.
If the recursion on the TRUE case failed, then assign that variable FALSE instead. Simplify this new instance, and recur to solve it.
If you have a mutable structure to represent the problem, then when you simplify the instance in the TRUE branch, you'll either have to:
Keep track of all changes you make, and undo them all once you realize the problem can't be solved. This has large overhead because your recursion can go pretty deep, and it's tricky to code.
Make a copy of the instance, and then modify the copy. This will be slow because if your recursion is a few dozen levels deep, you'll have to make many many copies of the instance.
However if you code it in a clever way, you can have an immutable structure, where any operation returns an updated (but still immutable) version of the problem (similar to String.replace - it doesn't replace the string, just gives you a new one). The naive way to implement this is to have the "immutable" structure just copy and make a new one on any modification, reducing it to the 2nd solution when having a mutable one, with all that overhead, but you can do it in a more efficient way.
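One common way to get the "more efficient" immutable structure is structural sharing, which may or may not be what the answer had in mind. In the sketch below (all names are illustrative), a new version reuses the old one as its tail, so branching on TRUE and FALSE costs one small allocation each rather than a full copy:

```java
public class PersistentListSketch {
    // Immutable singly linked list; null stands for the empty list.
    static final class Node<T> {
        final T head;
        final Node<T> tail;

        Node(T head, Node<T> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // O(1): allocates one node and reuses the existing list as its tail.
    static <T> Node<T> prepend(T value, Node<T> list) {
        return new Node<>(value, list);
    }

    static int size(Node<?> list) {
        int n = 0;
        for (Node<?> cur = list; cur != null; cur = cur.tail) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Node<String> base = prepend("y=FALSE", null); // assignments made so far
        Node<String> tryTrue = prepend("x=TRUE", base);
        Node<String> tryFalse = prepend("x=FALSE", base);

        System.out.println(size(base));                    // 1 (untouched by either branch)
        System.out.println(size(tryTrue));                 // 2
        System.out.println(tryTrue.tail == tryFalse.tail); // true (both branches share base)
    }
}
```

Abandoning the TRUE branch then needs no undo log: the caller simply keeps using `base`.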
One of the reasons for the "need" for immutable classes is the combination of passing everything by reference and having no support for read-only views of an object (i.e. C++'s const).
Consider the simple case of a class having support for the observer pattern:
class Person {
public String getName() { ... }
public void registerForNameChange(NameChangedObserver o) { ... }
}
If string were not immutable, it would be impossible for the Person class to implement registerForNameChange() correctly, because someone could write the following, effectively modifying the person's name without triggering any notification.
void foo(Person p) {
p.getName().prepend("Mr. ");
}
In C++, getName() returning a const std::string& has the effect of returning by reference and preventing access to mutators, meaning immutable classes are not necessary in that context.
They also give us a guarantee. The guarantee of immutability means that we can expand on them and create new patterns for efficiency that are otherwise not possible.
http://en.wikipedia.org/wiki/Singleton_pattern
One feature of immutable classes which hasn't yet been called out: storing a reference to a deeply-immutable class object is an efficient means of storing all of the state contained therein. Suppose I have a mutable object which uses a deeply-immutable object to hold 50K worth of state information. Suppose, further, that on 25 occasions I wish to make a "copy" of my original (mutable) object (e.g. for an "undo" buffer); the state could change between copy operations, but usually doesn't. Making a "copy" of the mutable object would simply require copying a reference to its immutable state, so 25 copies would simply amount to 25 references. By contrast, if the state were held in 50K worth of mutable objects, each of the 25 copy operations would have to produce its own copy of 50K worth of data; holding all 25 copies would require holding over a meg worth of mostly-duplicated data. Even though the first copy operation would produce a copy of the data that will never change, and the other 24 operations could in theory simply refer back to that, in most implementations there would be no way for the second object asking for a copy of the information to know that an immutable copy already exists(*).
(*) One pattern that can sometimes be useful is for mutable objects to have two fields to hold their state--one in mutable form and one in immutable form. Objects can be copied as mutable or immutable, and would begin life with one or the other reference set. As soon as the object wants to change its state, it copies the immutable reference to the mutable one (if it hasn't been done already) and invalidates the immutable one. When the object is copied as immutable, if its immutable reference isn't set, an immutable copy will be created and the immutable reference pointed to that. This approach will require a few more copy operations than would a "full-fledged copy on write" (e.g. asking to copy an object which has been mutated since the last copy would require a copy operation, even if the original object is never again mutated) but it avoids the threading complexities that FFCOW would entail.
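A minimal sketch of the simple case (a mutable object whose state lives in one deeply-immutable object, not the two-field pattern from the footnote) could look like this. `Editor`, `State` and `checkpoint` are hypothetical names:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class UndoBufferSketch {
    // Deeply immutable state holder.
    static final class State {
        final String text;

        State(String text) { this.text = text; }
    }

    // Mutable object: the only thing that changes is which State it references.
    static final class Editor {
        private State state = new State("");
        private final Deque<State> undoStack = new ArrayDeque<>();

        // "Copying" for the undo buffer stores a reference; no data is duplicated.
        void checkpoint() { undoStack.push(state); }

        void type(String s) { state = new State(state.text + s); }

        void undo() {
            if (!undoStack.isEmpty()) {
                state = undoStack.pop();
            }
        }

        String text() { return state.text; }
        State state() { return state; }
    }

    public static void main(String[] args) {
        Editor e = new Editor();
        e.type("hello");
        State before = e.state();
        e.checkpoint();
        e.type(" world");
        System.out.println(e.text());            // hello world
        e.undo();
        System.out.println(e.text());            // hello
        System.out.println(e.state() == before); // true (the very same instance is restored)
    }
}
```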
Why Immutable class?
Once an object is instantiated, its state cannot be changed during its lifetime, which also makes it thread safe.
Examples :
Obviously String, Integer, BigDecimal, etc. Once these values are created, they cannot be changed during their lifetime.
Use-case :
Once a database connection object is created with its configuration values, you might not need to change its state, so you can use an immutable class.
From Effective Java:
An immutable class is simply a class whose instances cannot be modified. All of
the information contained in each instance is provided when it is created and is
fixed for the lifetime of the object. The Java platform libraries contain many
immutable classes, including String, the boxed primitive classes, and BigInteger
and BigDecimal. There are many good reasons for this: Immutable classes
are easier to design, implement and use than mutable classes. They are less prone
to error and are more secure.
An immutable class is good for caching purposes because you don't have to worry about the value changes. Another benefit of an immutable class is that it is inherently thread-safe, so you don't have to worry about thread safety in case of a multi-threaded environment.
Integer, Character, Double, etc. -- all these are immutable classes like String. String has a string pool to save memory, but why don't these wrappers have similar pools?
I have checked: Integer has a similar pool only for small values (-128 to 127), but not beyond that.
Unless someone can find a design document from Gosling, et al., circa 1994 or so that specifically addresses this, it's impossible to say for certain.
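The small-value cache is easy to observe through Integer.valueOf, which is documented to always cache values from -128 to 127. The helper name `sameBoxedInstance` below is made up for the sketch.

```java
public class IntegerCacheSketch {
    // True when two separate boxing calls hand back the same Integer instance.
    static boolean sameBoxedInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sameBoxedInstance(127));  // true
        System.out.println(sameBoxedInstance(-128)); // true

        // Outside the guaranteed range the identity result is not specified,
        // so boxed values should be compared with equals().
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000))); // true
    }
}
```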
One likely reason is that the complexity and overhead weren't deemed worth the benefit. Strings are A) a lot bigger and B) a lot more common than Integer, Long, and such, as mostly people use primitives whenever they can, only using the wrappers where they can't avoid it.
IMO, String is the most commonly used type in Java: as an argument to load a class, as a parameter for DB/network connections, to store (almost) each and every thing; the list is long. The usage of all the other primitive/wrapper types combined would be negligible compared to String in any application.
If Strings were used in an un-optimized manner (e.g. implemented without a string pool), performance would suffer badly, so it makes sense to have a pool (only) for String.
Respected Sir!
I have not learnt Java yet, but most people say that C++ has more OOP features than Java. I would like to know which features C++ has that Java doesn't. Please explain.
From java.sun.com
Java omits many rarely used, poorly understood, confusing features of C++ that in our experience bring more grief than benefit. These omitted features primarily consist of operator overloading (although the Java language does have method overloading), multiple inheritance, and extensive automatic coercions.
For a more detailed comparison check out this Wikipedia page.
This might be controversial, but some authors say that using free functions might be more object oriented than writing methods for everything. So from those authors' point of view, free functions in C++ make it more OO than Java (which doesn't have them).
The explanation is that there are some operations that are not really performed on an instance of an object, but rather externally, and that having externally defined operations for those cases improves the OO design. Some of the cases are operations on two objects that are not naturally an operation of either one. Incrementing a value is clearly an operation on the value, but creating a new value with the sum of two others (or concatenating) are not really operations on the instance. When you write:
String a = "Hello";
String b = " World";
String c = a.concat( b );
The concat operation is not performed on a: after the operation a is still "Hello". The operation is not performed on b either; it is an external operation that is performed on both a and b. In this particular example, the most OO way of implementing the operation would be providing a new constructor that takes two arguments (after all, the operation is performed on the new string), but another solution would be providing an external function append that takes two strings and returns a third one.
In this case, where both instances are of the same type, the operation can naturally be performed as a static method of the type, but when you mix different types the operation is not really part of either one, and in some cases it might end up belonging to a completely different type. In some cases free functions are faked in Java, as in the java.util.Collections class: it does not represent any OO element, but is rather simple glue that ties free functions together as static methods because the language has no support for the former. Note that none of those algorithms is performed on the collection or on an instance of the contained type.
Multiple inheritance
Template Metaprogramming
C++ is a huge language and it is common for C++ developers to only use a small subset during development. These language features are often cited as being the most dangerous/difficult part of C++ to master and are often avoided.
In C++ you can bypass the OO model and make up your own stuff, whereas in Java, the VM decides that you cannot. Very simplified, but you know... who has the time.
I suppose some would consider operator overloading an object oriented feature (if you view binary operators as not much different than class methods).
Some links, that give some good answers:
Java is not pure a OOP language (... but I don't care ;) )
Comparing C++ and Java (Java Coffee Break article)
Comparing Java and C++ (Wikipedia comprehensive comparison)
Be careful. There are multiple definitions of OOP out there. For example, the definitions in Wegner 87 and Booch et al 91 are different to what people say in Java is not pure a OOP language.
All this "my language is more OO than your language" stuff is a bit pointless, IMO.
I'm looking at some Java code that is maintained by other parts of the company, incidentally by some former C and C++ devs. One thing that is ubiquitous is the use of static integer constants, such as
class Engine {
private static int ENGINE_IDLE = 0;
private static int ENGINE_COLLECTING = 1;
...
}
Besides the missing 'final' qualifier, I'm a bit bothered by this kind of code. What I would have liked to see, being trained primarily in Java from school, would be something more like
class Engine {
private enum State { Idle, Collecting };
...
}
However, the arguments fail me. Why, if at all, is the latter better than the former?
Why, if at all, is the latter better than the former?
It is much better because it gives you type safety and is self-documenting. With integer constants, you have to look at the API doc to find out what values are valid, and nothing prevents you from using invalid values (or, perhaps worse, integer constants that are completely unrelated). With Enums, the method signature tells you directly what values are valid (IDE autocompletion will work) and it's impossible to use an invalid value.
The "integer constant enums" pattern is unfortunately very common, even in the Java Standard API (and widely copied from there) because Java did not have Enums prior to Java 5.
An excerpt from the official docs, http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html:
This pattern has many problems, such as:
Not typesafe - Since a season is just an int you can pass in any other int value where a season is required, or add two seasons together (which makes no sense).
No namespace - You must prefix constants of an int enum with a string (in this case SEASON_) to avoid collisions with other int enum types.
Brittleness - Because int enums are compile-time constants, they are compiled into clients that use them. If a new constant is added between two existing constants or the order is changed, clients must be recompiled. If they are not, they will still run, but their behavior will be undefined.
Printed values are uninformative - Because they are just ints, if you print one out all you get is a number, which tells you nothing about what it represents, or even what type it is.
And this just about covers it. A one word argument would be that enums are just more readable and informative.
One more thing is that enums, like classes, can have fields and methods. This gives you the option to encompass some additional information about each type of state in the enum itself.
Because enums provide type safety. In the first case, you can pass any integer and if you use enum you are restricted to Idle and Collecting.
FYI : http://www.javapractices.com/topic/TopicAction.do?Id=1.
By using an int to refer to a constant, you're not forcing someone to actually use that constant. So, for example, you might have a method which takes an engine state, which someone might happily invoke with:
engine.updateState(1);
Using an enum forces the user to stick with the explanatory label, so it is more legible.
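For comparison, here is a sketch of the enum-based version. The `Engine` and `State` names follow the question, while the method bodies are assumed. The int call from above is kept as a comment because it no longer compiles.

```java
public class EngineSketch {
    enum State { Idle, Collecting }

    static final class Engine {
        private State state = State.Idle;

        // Only a State (or null, which is rejected here) can be passed in.
        void updateState(State newState) {
            if (newState == null) {
                throw new IllegalArgumentException("state must not be null");
            }
            this.state = newState;
        }

        State state() { return state; }
    }

    public static void main(String[] args) {
        Engine engine = new Engine();
        engine.updateState(State.Collecting);
        // engine.updateState(1); // does not compile: an int is not a State
        System.out.println(engine.state()); // Collecting
    }
}
```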
There is one situation where static constants are preferred (other than when the code is legacy with tons of dependencies), and that is when the set of values is not, or may later not be, finite.
Imagine that you may later add a new state like Collected. The only way to do it with an enum is to edit the original code, which can be a problem if the modification is made when there is already a lot of code manipulating it. Other than this, I personally see no reason not to use an enum.
Just my thought.
Readability - When you use enums and write State.Idle, the reader immediately knows that you are talking about an idle state. Compare this with 4 or 5.
Type Safety - When you use an enum, the user cannot pass a wrong value even by mistake, as the compiler will force him to use one of the pre-declared values in the enum. In the case of simple integers, he could even pass -3274.
Maintainability - If you wanted to add a new state Waiting, it would be very easy to do so by adding a constant Waiting to your enum State without causing any confusion.
The reasons from the spec, which Lajcik quotes, are explained in more detail in Josh Bloch's Effective Java, Item 30. If you have access to that book, I'd recommend perusing it. Java Enums are full-fledged classes which is why you get compile-time type safety. You can also give them behavior, giving you better encapsulation.
The former is common in code that started pre-1.5. Actually, another common idiom was to define your constants in an interface, because they didn't have any code.
Enums also give you a great deal of flexibility. Since Enums are essentially classes, you can augment them with useful methods (such as providing an internationalized resource string corresponding to a certain value in the enumeration, converting back and forth between instances of the enum type and other representations that may be required, etc.)
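A sketch of an enum carrying extra data and a conversion method follows. The legacy codes 0 and 1 mirror ENGINE_IDLE and ENGINE_COLLECTING from the question; the `label` field and `fromLegacyCode` method are assumptions, and a real application might look the label up in a resource bundle instead.

```java
public class EngineStateSketch {
    enum State {
        IDLE(0, "Idle"),
        COLLECTING(1, "Collecting");

        private final int legacyCode; // the old int constant this state replaces
        private final String label;   // display text (could come from a resource bundle)

        State(int legacyCode, String label) {
            this.legacyCode = legacyCode;
            this.label = label;
        }

        int legacyCode() { return legacyCode; }
        String label() { return label; }

        // Converts an old-style int constant back into the enum, rejecting unknown codes.
        static State fromLegacyCode(int code) {
            for (State s : values()) {
                if (s.legacyCode == code) {
                    return s;
                }
            }
            throw new IllegalArgumentException("Unknown engine state code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(State.fromLegacyCode(1)); // COLLECTING
        System.out.println(State.IDLE.label());      // Idle
    }
}
```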
Why is it that they decided to make String immutable in Java and .NET (and some other languages)? Why didn't they make it mutable?
According to Effective Java, chapter 4, page 73, 2nd edition:
There are many good reasons for this: Immutable classes are easier to design, implement, and use than mutable classes. They are less prone to error and are more secure.
[...]
Immutable objects are simple. An immutable object can be in exactly one state, the state in which it was created. If you make sure that all constructors establish class invariants, then it is guaranteed that these invariants will remain true for all time, with no effort on your part.
[...]
Immutable objects are inherently thread-safe; they require no synchronization. They cannot be corrupted by multiple threads accessing them concurrently. This is far and away the easiest approach to achieving thread safety. In fact, no thread can ever observe any effect of another thread on an immutable object. Therefore, immutable objects can be shared freely.
[...]
Other small points from the same chapter:
Not only can you share immutable objects, but you can share their internals.
[...]
Immutable objects make great building blocks for other objects, whether mutable or immutable.
[...]
The only real disadvantage of immutable classes is that they require a separate object for each distinct value.
There are at least two reasons.
First - security http://www.javafaq.nu/java-article1060.html
The main reason why String made immutable was security. Look at this example: We have a file open method with login check. We pass a String to this method to process authentication which is necessary before the call will be passed to OS. If String was mutable it was possible somehow to modify its content after the authentication check before OS gets request from program then it is possible to request any file. So if you have a right to open text file in user directory but then on the fly when somehow you manage to change the file name you can request to open "passwd" file or any other. Then a file can be modified and it will be possible to login directly to OS.
Second - Memory efficiency http://hikrish.blogspot.com/2006/07/why-string-class-is-immutable.html
JVM internally maintains the "String
Pool". To achive the memory
efficiency, JVM will refer the String
object from pool. It will not create
the new String objects. So, whenever
you create a new string literal, JVM
will check in the pool whether it
already exists or not. If already
present in the pool, just give the
reference to the same object or create
the new object in the pool. There will
be many references point to the same
String objects, if someone changes the
value, it will affect all the
references. So, sun decided to make it
immutable.
Actually, the reasons strings are immutable in Java don't have much to do with security. The two main reasons are the following:
Thread Safety:
Strings are an extremely widely used type of object. They are therefore more or less guaranteed to be used in a multi-threaded environment. Strings are immutable to make sure that it is safe to share them among threads. Having immutable strings ensures that when passing a string from thread A to another thread B, thread B cannot unexpectedly modify thread A's string.
Not only does this help simplify the already pretty complicated task of multi-threaded programming, but it also helps with performance of multi-threaded applications. Access to mutable objects must somehow be synchronized when they can be accessed from multiple threads, to make sure that one thread doesn't attempt to read the value of your object while it is being modified by another thread. Proper synchronization is both hard to do correctly for the programmer, and expensive at runtime. Immutable objects cannot be modified and therefore do not need synchronization.
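A minimal sketch of that claim: several threads read the same String with no locking at all. The class and method names are invented for this example, and the thread count is arbitrary.

```java
import java.util.Locale;

class SharedStringDemo {
    // Starts several threads that all read one shared String without any
    // synchronization and reports whether every thread saw the same contents.
    static boolean readFromManyThreads(int threadCount) {
        final String shared = "hello";
        boolean[] ok = new boolean[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> {
                // toUpperCase() builds a new String; 'shared' itself is never modified.
                String upper = shared.toUpperCase(Locale.ROOT);
                ok[slot] = upper.equals("HELLO") && shared.equals("hello");
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // join() makes each thread's write to ok[] visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        for (boolean b : ok) {
            if (!b) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(readFromManyThreads(8));
    }
}
```

The only coordination here is join(), which is needed for the result array, not for the String.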
Performance:
While string interning has been mentioned, it only represents a small gain in memory efficiency for Java programs. Only string literals (more precisely, compile-time constant strings) are interned automatically. This means that only the strings which are the same in your source code will share the same String object. If your program dynamically creates strings that are the same, they will be represented by different objects unless you call intern() explicitly.
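Here is a small sketch of the interning behaviour, using reference comparison (==) on purpose. The sameObject helper is just a name made up for the demo.

```java
class InternDemo {
    // Reference comparison: true only if both variables point at one object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "dog";
        String b = "dog";              // same literal, same pooled object
        String c = new String("dog");  // explicitly a distinct object
        String d = String.valueOf(new char[] {'d', 'o', 'g'}); // built at run time

        System.out.println(sameObject(a, b));          // true
        System.out.println(sameObject(a, c));          // false
        System.out.println(sameObject(a, d));          // false
        System.out.println(sameObject(a, d.intern())); // true
        System.out.println(a.equals(d));               // true: same contents
    }
}
```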
More importantly, immutability allows strings to share their internal data. For many string operations, this means that the underlying array of characters does not need to be copied. For example, say you want to take the first five characters of a String. In Java, you would call myString.substring(0,5). In the JDK implementations this answer describes (before Java 7 update 6; later releases copy the characters instead), what the substring() method does is simply create a new String object that shares myString's underlying char[] but knows that it starts at index 0 and ends at index 5 of that char[]. To put this in graphical form, you would end up with the following:
 |                myString                 |
 v                                         v
"The quick brown fox jumps over the lazy dog"   <-- shared char[]
 ^   ^
 |   |  myString.substring(0,5)
This makes this kind of operation extremely cheap: it is O(1), since it depends neither on the length of the original string nor on the length of the substring we need to extract. This behavior also has some memory benefits, since many strings can share their underlying char[].
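The sharing technique itself can be sketched with a small stand-alone class. CharSlice is a hypothetical type written for this illustration; it is not how java.lang.String is implemented in current JDKs.

```java
class SliceDemo {
    public static void main(String[] args) {
        CharSlice whole = CharSlice.of("The quick brown fox jumps over the lazy dog");
        CharSlice first = whole.slice(0, 5);
        System.out.println(first);                          // The q
        System.out.println(first.sharesStorageWith(whole)); // true
    }
}

// Hypothetical immutable slice over a shared character array.
final class CharSlice {
    private final char[] data; // never modified after construction, so safe to share
    private final int offset;
    private final int length;

    private CharSlice(char[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    static CharSlice of(String s) {
        char[] copy = s.toCharArray(); // private copy; no outside reference exists
        return new CharSlice(copy, 0, copy.length);
    }

    // O(1): no characters are copied, only the window (offset, length) changes.
    CharSlice slice(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("begin=" + begin + ", end=" + end);
        }
        return new CharSlice(data, offset + begin, end - begin);
    }

    boolean sharesStorageWith(CharSlice other) {
        return data == other.data;
    }

    int length() { return length; }

    @Override
    public String toString() {
        return new String(data, offset, length);
    }
}
```

Sharing is only safe because nothing can write to data after construction. The trade-off is that a tiny slice keeps the whole array alive.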
Thread safety and performance. If a string cannot be modified it is safe and quick to pass a reference around among multiple threads. If strings were mutable, you would always have to copy all of the bytes of the string to a new instance, or provide synchronization. A typical application will read a string 100 times for every time that string needs to be modified. See wikipedia on immutability.
One should really ask, "why should X be mutable?" It's better to default to immutability, because of the benefits already mentioned by Princess Fluff. It should be an exception that something is mutable.
Unfortunately, most current programming languages default to mutability, but hopefully in the future the default will lean more toward immutability (see A Wish List for the Next Mainstream Programming Language).
Wow! I can't believe the misinformation here. Strings being immutable has nothing to do with security. If someone already has access to the objects in a running application (which would have to be assumed if you are trying to guard against someone 'hacking' a String in your app), there would certainly be plenty of other opportunities available for hacking.
It's quite a novel idea that the immutability of String is addressing threading issues. Hmmm ... I have an object that is being changed by two different threads. How do I resolve this? Synchronize access to the object? Naawww ... let's not let anyone change the object at all -- that'll fix all of our messy concurrency issues! In fact, let's make all objects immutable, and then we can remove the synchronized construct from the Java language.
The real reason (pointed out by others above) is memory optimization. It is quite common in any application for the same string literal to be used repeatedly. It is so common, in fact, that decades ago, many compilers made the optimization of storing only a single instance of a string literal. The drawback of this optimization is that runtime code that modifies a string literal introduces a problem because it is modifying the instance for all other code that shares it. For example, it would not be good for a function somewhere in an application to change the string literal "dog" to "cat". A printf("dog") would result in "cat" being written to stdout. For that reason, there needed to be a way of guarding against code that attempts to change string literals (i.e., make them immutable). Some compilers (with support from the OS) would accomplish this by placing string literals into a special read-only memory segment that would cause a memory fault if a write attempt was made.
In Java this is known as interning. The Java compiler here is just following a standard memory optimization done by compilers for decades. And to address the same issue of these string literals being modified at runtime, Java simply makes the String class immutable (i.e., gives you no setters that would allow you to change the String content). Strings would not have to be immutable if interning of string literals did not occur.
String is not a primitive type, yet you normally want to use it with value semantics, i.e. like a value.
A value is something you can trust won't change behind your back.
If you write: String str = someExpr();
You don't want it to change unless YOU do something with str.
String, as an Object, naturally has pointer semantics; to get value semantics as well, it needs to be immutable.
One factor is that, if Strings were mutable, objects storing Strings would have to be careful to store copies, lest their internal data change without notice. Given that Strings are a fairly primitive type like numbers, it is nice when one can treat them as if they were passed by value, even if they are passed by reference (which also helps to save on memory).
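To see what "store copies, lest their internal data change without notice" means in practice, here is a sketch that uses StringBuilder as a stand-in for a mutable string. Both account classes are hypothetical and exist only for this comparison.

```java
class AccountDemo {
    public static void main(String[] args) {
        StringBuilder name = new StringBuilder("alice");
        MutableNameAccount risky = new MutableNameAccount(name);
        StringAccount safe = new StringAccount(name.toString());

        // The caller keeps using its own buffer after handing it over.
        name.setLength(0);
        name.append("mallory");

        System.out.println(risky.owner()); // mallory: changed behind the object's back
        System.out.println(safe.owner());  // alice
    }
}

// Hypothetical class that stores a mutable reference without copying it.
final class MutableNameAccount {
    private final StringBuilder owner;

    MutableNameAccount(StringBuilder owner) {
        this.owner = owner; // no defensive copy, so the caller still controls the data
    }

    String owner() { return owner.toString(); }
}

// Hypothetical class that stores an immutable String; no copy is needed.
final class StringAccount {
    private final String owner;

    StringAccount(String owner) {
        this.owner = owner;
    }

    String owner() { return owner; }
}
```

With String, storing the reference is already safe, which is why it can be treated as if it were passed by value.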
I know this is a bump, but...
Are they really immutable?
Consider the following.
public static unsafe void MutableReplaceIndex(string s, char c, int i)
{
    // Pin the string and write straight into its character buffer.
    fixed (char* ptr = s)
    {
        *((char*)(ptr + i)) = c;
    }
}
...
string s = "abc";
MutableReplaceIndex(s, '1', 0);
MutableReplaceIndex(s, '2', 1);
MutableReplaceIndex(s, '3', 2);
Console.WriteLine(s); // Prints 123
You could even make it an extension method.
public static class Extensions
{
    public static unsafe void MutableReplaceIndex(this string s, char c, int i)
    {
        fixed (char* ptr = s)
        {
            *((char*)(ptr + i)) = c;
        }
    }
}
Which makes the following work
s.MutableReplaceIndex('1', 0);
s.MutableReplaceIndex('2', 1);
s.MutableReplaceIndex('3', 2);
Conclusion: They're in an immutable state which is known by the compiler. Of course, the above only applies to .NET strings, as Java doesn't have pointers. However, a string can be made entirely mutable using pointers in C#. This is not how pointers are intended to be used, it has no practical use, and it is not safe; it is, however, possible, thus bending the whole "immutable" rule. You normally cannot modify a character of a string directly by index, and this is the only way. This could be prevented by disallowing pointers to strings or by making a copy when a string is pointed to, but neither is done, which makes strings in C# not entirely immutable.
For most purposes, a "string" is (used/treated as/thought of/assumed to be) a meaningful atomic unit, just like a number.
Asking why the individual characters of a string are not mutable is therefore like asking why the individual bits of an integer are not mutable.
You should know why. Just think about it.
I hate to say it, but unfortunately we're debating this because our language sucks, and we're trying to use a single word, string, to describe a complex, contextually situated concept or class of object.
We perform calculations and comparisons with "strings" similar to how we do with numbers. If strings (or integers) were mutable, we'd have to write special code to lock their values into immutable local forms in order to perform any kind of calculation reliably. Therefore, it is best to think of a string like a numeric identifier, but instead of being 16, 32, or 64 bits long, it could be hundreds of bits long.
When someone says "string", we all think of different things. Those who think of it simply as a set of characters, with no particular purpose in mind, will of course be appalled that someone just decided that they should not be able to manipulate those characters. But the "string" class isn't just an array of characters. It's a STRING, not a char[]. There are some basic assumptions about the concept we refer to as a "string", and it can generally be described as a meaningful, atomic unit of coded data, like a number. When people talk about "manipulating strings", perhaps they're really talking about manipulating characters to build strings, and a StringBuilder is great for that. Just think a bit about what the word "string" truly means.
Consider for a moment what it would be like if strings were mutable. The following API function could be tricked into returning information for a different user if the mutable username string is intentionally or unintentionally modified by another thread while this function is using it:
string GetPersonalInfo( string username, string password )
{
    string stored_password = DBQuery.GetPasswordFor( username );
    if (password == stored_password)
    {
        //another thread modifies the mutable 'username' string
        return DBQuery.GetPersonalInfoFor( username );
    }
    return null; // password did not match
}
Security isn't just about 'access control', it's also about 'safety' and 'guaranteeing correctness'. If a method can't be easily written and depended upon to perform a simple calculation or comparison reliably, then it's not safe to call it, but it would be safe to call into question the programming language itself.
Immutability is not so closely tied to security. For that, at least in .NET, you get the SecureString class.
Later edit: In Java you will find GuardedString, a similar implementation.
The decision to make strings mutable in C++ causes a lot of problems; see this excellent article by Kevlin Henney about Mad COW Disease.
COW = Copy On Write.
It's a trade off. Strings go into the String pool and when you create multiple identical Strings they share the same memory. The designers figured this memory saving technique would work well for the common case, since programs tend to grind over the same strings a lot.
The downside is that concatenations create a lot of extra Strings that are only transitional and just become garbage, actually harming memory performance. You have StringBuffer and StringBuilder in Java (StringBuilder also exists in .NET) to preserve memory in these cases.
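A short sketch of the two approaches follows. JoinDemo and its method names are made up for illustration, and no timing claims are made here.

```java
class JoinDemo {
    // Each += creates a new String and copies everything built so far;
    // the intermediate Strings become garbage immediately.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // One growable buffer is reused; a single String is created at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"im", "mut", "able"};
        System.out.println(joinWithConcat(parts));  // immutable
        System.out.println(joinWithBuilder(parts)); // immutable
    }
}
```

Both methods return the same text; the difference is only in how many temporary objects are created along the way.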
Strings in Java are not truly immutable; you can change their values using reflection and/or class loading. You should not depend on that property for security.
For examples see: Magic Trick In Java
Immutability is good. See Effective Java. If you had to copy a String every time you passed it around, then that would be a lot of error-prone code. You also have confusion as to which modifications affect which references. In the same way that Integer has to be immutable to behave like int, Strings have to behave as immutable to act like primitives. In C++ passing strings by value does this without explicit mention in the source code.
There is an exception to nearly every rule:
using System;
using System.Runtime.InteropServices;
namespace Guess
{
    class Program
    {
        static void Main(string[] args)
        {
            const string str = "ABC";
            Console.WriteLine(str);
            Console.WriteLine(str.GetHashCode());
            var handle = GCHandle.Alloc(str, GCHandleType.Pinned);
            try
            {
                Marshal.WriteInt16(handle.AddrOfPinnedObject(), 4, 'Z');
                Console.WriteLine(str);
                Console.WriteLine(str.GetHashCode());
            }
            finally
            {
                handle.Free();
            }
        }
    }
}
It's largely for security reasons. It's much harder to secure a system if you can't trust that your Strings are tamperproof.