Why can't strings be mutable in Java and .NET? - java

Why is it that they decided to make String immutable in Java and .NET (and some other languages)? Why didn't they make it mutable?

According to Effective Java, chapter 4, page 73, 2nd edition:
"There are many good reasons for this: Immutable classes are easier to
design, implement, and use than mutable classes. They are less prone
to error and are more secure.
[...]
"Immutable objects are simple. An immutable object can be in
exactly one state, the state in which it was created. If you make sure
that all constructors establish class invariants, then it is
guaranteed that these invariants will remain true for all time, with
no effort on your part.
[...]
Immutable objects are inherently thread-safe; they require no synchronization. They cannot be corrupted by multiple threads
accessing them concurrently. This is far and away the easiest approach
to achieving thread safety. In fact, no thread can ever observe any
effect of another thread on an immutable object. Therefore,
immutable objects can be shared freely
[...]
Other small points from the same chapter:
Not only can you share immutable objects, but you can share their internals.
[...]
Immutable objects make great building blocks for other objects, whether mutable or immutable.
[...]
The only real disadvantage of immutable classes is that they require a separate object for each distinct value.

There are at least two reasons.
First - security http://www.javafaq.nu/java-article1060.html
The main reason why String made
immutable was security. Look at this
example: We have a file open method
with login check. We pass a String to
this method to process authentication
which is necessary before the call
will be passed to OS. If String was
mutable it was possible somehow to
modify its content after the
authentication check before OS gets
request from program then it is
possible to request any file. So if
you have a right to open text file in
user directory but then on the fly
when somehow you manage to change the
file name you can request to open
"passwd" file or any other. Then a
file can be modified and it will be
possible to login directly to OS.
Second - Memory efficiency http://hikrish.blogspot.com/2006/07/why-string-class-is-immutable.html
JVM internally maintains the "String
Pool". To achieve the memory
efficiency, JVM will refer the String
object from pool. It will not create
the new String objects. So, whenever
you create a new string literal, JVM
will check in the pool whether it
already exists or not. If already
present in the pool, just give the
reference to the same object or create
the new object in the pool. There will
be many references point to the same
String objects, if someone changes the
value, it will affect all the
references. So, Sun decided to make it
immutable.

Actually, the reasons strings are immutable in Java don't have much to do with security. The two main reasons are the following:
Thread Safety:
Strings are an extremely widely used type of object, so they are more or less guaranteed to be used in a multi-threaded environment. Strings are immutable to make sure that it is safe to share them among threads. Immutability ensures that when a string is passed from thread A to thread B, thread B cannot unexpectedly modify thread A's string.
Not only does this help simplify the already pretty complicated task of multi-threaded programming, but it also helps with performance of multi-threaded applications. Access to mutable objects must somehow be synchronized when they can be accessed from multiple threads, to make sure that one thread doesn't attempt to read the value of your object while it is being modified by another thread. Proper synchronization is both hard to do correctly for the programmer, and expensive at runtime. Immutable objects cannot be modified and therefore do not need synchronization.
Performance:
While string interning has been mentioned, it only represents a small gain in memory efficiency for Java programs. Only string literals (more precisely, compile-time constant strings) are interned automatically. This means that only the strings which are the same in your source code will share the same String object. If your program dynamically creates strings that are the same, they will be represented by different objects unless you call intern() explicitly.
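A minimal sketch of the interning behaviour described above. The class and method names (InternDemo and friends) are made up for illustration, and the comparisons deliberately use == to test reference identity, which is not how string contents should normally be compared.

```java
// Hypothetical demo class; the names are not from the original answer.
class InternDemo {
    // Two identical literals in source code refer to one pooled String object.
    static boolean literalsShareObject() {
        String a = "dog";
        String b = "dog";
        return a == b; // reference comparison on purpose, not equals()
    }

    // A string built at run time is a separate object, even with equal contents.
    static boolean runtimeStringSharesObject() {
        String literal = "dog";
        String built = new StringBuilder("do").append('g').toString();
        return literal == built;
    }

    // intern() hands back the pooled instance that has the same contents.
    static boolean internedSharesObject() {
        String literal = "dog";
        String built = new StringBuilder("do").append('g').toString();
        return literal == built.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());       // true
        System.out.println(runtimeStringSharesObject()); // false
        System.out.println(internedSharesObject());      // true
    }
}
```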
More importantly, immutability allows strings to share their internal data. For many string operations, this means that the underlying array of characters does not need to be copied. For example, say you want to take the first five characters of a String. In Java, you would call myString.substring(0,5). In this case, what the substring() method does is simply create a new String object that shares myString's underlying char[] but knows that it starts at index 0 and ends at index 5 of that char[]. To put this in graphical form, you would end up with the following:
 |                myString                 |
 v                                         v
"The quick brown fox jumps over the lazy dog"   <-- shared char[]
 ^   ^
 |   |  myString.substring(0,5)
This makes this kind of operation extremely cheap: it is O(1), since it depends neither on the length of the original string nor on the length of the substring we need to extract. This behavior also has some memory benefits, since many strings can share their underlying char[]. (This describes older JDKs; since Java 7 update 6, substring() copies the characters into a new array instead of sharing the original one.)

Thread safety and performance. If a string cannot be modified it is safe and quick to pass a reference around among multiple threads. If strings were mutable, you would always have to copy all of the bytes of the string to a new instance, or provide synchronization. A typical application will read a string 100 times for every time that string needs to be modified. See wikipedia on immutability.

One should really ask, "why should X be mutable?" It's better to default to immutability, because of the benefits already mentioned by Princess Fluff. It should be an exception that something is mutable.
Unfortunately, most current programming languages default to mutability, but hopefully in the future the default will shift toward immutability (see A Wish List for the Next Mainstream Programming Language).

Wow! I can't believe the misinformation here. Strings being immutable has nothing to do with security. If someone already has access to the objects in a running application (which would have to be assumed if you are trying to guard against someone 'hacking' a String in your app), there would certainly be plenty of other opportunities available for hacking.
It's quite a novel idea that the immutability of String is addressing threading issues. Hmmm ... I have an object that is being changed by two different threads. How do I resolve this? Synchronize access to the object? Naawww ... let's not let anyone change the object at all -- that'll fix all of our messy concurrency issues! In fact, let's make all objects immutable, and then we can remove the synchronized construct from the Java language.
The real reason (pointed out by others above) is memory optimization. It is quite common in any application for the same string literal to be used repeatedly. It is so common, in fact, that decades ago, many compilers made the optimization of storing only a single instance of a String literal. The drawback of this optimization is that runtime code that modifies a String literal introduces a problem, because it is modifying the instance for all other code that shares it. For example, it would not be good for a function somewhere in an application to change the String literal "dog" to "cat". A printf("dog") would result in "cat" being written to stdout. For that reason, there needed to be a way of guarding against code that attempts to change String literals (i.e., make them immutable). Some compilers (with support from the OS) would accomplish this by placing String literals into a special read-only memory segment that would cause a memory fault if a write attempt was made.
In Java this is known as interning. The Java compiler here is just following a standard memory optimization done by compilers for decades. And to address the same issue of these String literals being modified at runtime, Java simply makes the String class immutable (i.e., gives you no setters that would allow you to change the String content). Strings would not have to be immutable if interning of String literals did not occur.

String is not a primitive type, yet you normally want to use it with value semantics, i.e. like a value.
A value is something you can trust won't change behind your back.
If you write: String str = someExpr();
You don't want it to change unless YOU do something with str.
As an Object, String naturally has pointer semantics; to get value semantics as well, it needs to be immutable.

One factor is that, if Strings were mutable, objects storing Strings would have to be careful to store copies, lest their internal data change without notice. Given that Strings are a fairly primitive type like numbers, it is nice when one can treat them as if they were passed by value, even if they are passed by reference (which also helps to save on memory).
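A rough sketch of the point about storing copies. StringBuilder stands in for a hypothetical "mutable string" here, and all class names are invented for the example.

```java
// Hypothetical classes for illustration only.
class DefensiveCopyDemo {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("alice");
        MutableNameHolder m = new MutableNameHolder(sb);
        sb.setLength(0);
        sb.append("mallory");         // changes m's internal data behind its back
        System.out.println(m.name()); // mallory

        String s = "alice";
        ImmutableNameHolder i = new ImmutableNameHolder(s);
        s = s.toUpperCase();          // creates a new String; i is unaffected
        System.out.println(i.name()); // alice
    }
}

class MutableNameHolder {
    private final StringBuilder name; // stand-in for a mutable string

    MutableNameHolder(StringBuilder name) {
        this.name = name; // no defensive copy: the caller keeps a live handle
    }

    String name() {
        return name.toString();
    }
}

class ImmutableNameHolder {
    private final String name;

    ImmutableNameHolder(String name) {
        this.name = name; // safe to store the reference as-is
    }

    String name() {
        return name;
    }
}
```

To make MutableNameHolder safe, its constructor would have to copy the builder's contents, which is exactly the bookkeeping that an immutable String makes unnecessary.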

I know this is a bump, but...
Are they really immutable?
Consider the following.
public static unsafe void MutableReplaceIndex(string s, char c, int i)
{
    fixed (char* ptr = s)
    {
        *((char*)(ptr + i)) = c;
    }
}
...
string s = "abc";
MutableReplaceIndex(s, '1', 0);
MutableReplaceIndex(s, '2', 1);
MutableReplaceIndex(s, '3', 2);
Console.WriteLine(s); // Prints 123
You could even make it an extension method.
public static class Extensions
{
    public static unsafe void MutableReplaceIndex(this string s, char c, int i)
    {
        fixed (char* ptr = s)
        {
            *((char*)(ptr + i)) = c;
        }
    }
}
Which makes the following work
s.MutableReplaceIndex('1', 0);
s.MutableReplaceIndex('2', 1);
s.MutableReplaceIndex('3', 2);
Conclusion: They're in an immutable state which is known by the compiler. Of course, the above only applies to .NET strings, as Java doesn't have pointers. However, a string can be made entirely mutable using pointers in C#. This is not how pointers are intended to be used, it has no practical use, and it is not safe; it is, however, possible, thus bending the whole "immutable" rule. You normally cannot modify a character of a string directly, and this is one of the few ways to do it. This could be prevented by disallowing pointers to strings or by making a copy when a string is pointed to, but neither is done, which makes strings in C# not entirely immutable.

For most purposes, a "string" is (used/treated as/thought of/assumed to be) a meaningful atomic unit, just like a number.
Asking why the individual characters of a string are not mutable is therefore like asking why the individual bits of an integer are not mutable.
You should know why. Just think about it.
I hate to say it, but unfortunately we're debating this because our language sucks, and we're trying to use a single word, string, to describe a complex, contextually situated concept or class of object.
We perform calculations and comparisons with "strings" similar to how we do with numbers. If strings (or integers) were mutable, we'd have to write special code to lock their values into immutable local forms in order to perform any kind of calculation reliably. Therefore, it is best to think of a string like a numeric identifier, but instead of being 16, 32, or 64 bits long, it could be hundreds of bits long.
When someone says "string", we all think of different things. Those who think of it simply as a set of characters, with no particular purpose in mind, will of course be appalled that someone just decided that they should not be able to manipulate those characters. But the "string" class isn't just an array of characters. It's a STRING, not a char[]. There are some basic assumptions about the concept we refer to as a "string", and it generally can be described as a meaningful, atomic unit of coded data, like a number. When people talk about "manipulating strings", perhaps they're really talking about manipulating characters to build strings, and a StringBuilder is great for that. Just think a bit about what the word "string" truly means.
Consider for a moment what it would be like if strings were mutable. The following API function could be tricked into returning information for a different user if the mutable username string is intentionally or unintentionally modified by another thread while this function is using it:
string GetPersonalInfo( string username, string password )
{
    string stored_password = DBQuery.GetPasswordFor( username );
    if (password == stored_password)
    {
        //another thread modifies the mutable 'username' string
        return DBQuery.GetPersonalInfoFor( username );
    }
    return null; // authentication failed
}
Security isn't just about 'access control', it's also about 'safety' and 'guaranteeing correctness'. If a method can't be easily written and depended upon to perform a simple calculation or comparison reliably, then it's not safe to call it, but it would be safe to call into question the programming language itself.

Immutability is not so closely tied to security. For that, at least in .NET, you get the SecureString class.
Later edit: In Java you will find GuardedString, a similar implementation.

The decision to make strings mutable in C++ causes a lot of problems; see this excellent article by Kevlin Henney about Mad COW Disease.
COW = Copy On Write.

It's a trade off. Strings go into the String pool and when you create multiple identical Strings they share the same memory. The designers figured this memory saving technique would work well for the common case, since programs tend to grind over the same strings a lot.
The downside is that concatenations create a lot of extra Strings that are only transitional and just become garbage, actually harming memory performance. You have StringBuffer and StringBuilder (both in Java; StringBuilder also exists in .NET) to preserve memory in these cases.
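A small sketch of the trade-off just described, using hypothetical method names. The first version leaves an intermediate String behind on each pass through the loop; the second appends into one mutable buffer and creates a single String at the end.

```java
// Hypothetical demo class; both methods return the same result.
class ConcatDemo {
    // Each += allocates a new String, leaving the previous one as garbage.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String p : parts) {
            result += p;
        }
        return result;
    }

    // One mutable buffer is appended to; a single String is created at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithConcat(parts));  // abc
        System.out.println(joinWithBuilder(parts)); // abc
    }
}
```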

Strings in Java are not truly immutable: you can change their values using reflection and/or class loading. You should not depend on that property for security.
For examples see: Magic Trick In Java

Immutability is good. See Effective Java. If you had to copy a String every time you passed it around, then that would be a lot of error-prone code. You also have confusion as to which modifications affect which references. In the same way that Integer has to be immutable to behave like int, Strings have to behave as immutable to act like primitives. In C++ passing strings by value does this without explicit mention in the source code.

There is an exception to nearly every rule:
using System;
using System.Runtime.InteropServices;
namespace Guess
{
    class Program
    {
        static void Main(string[] args)
        {
            const string str = "ABC";
            Console.WriteLine(str);
            Console.WriteLine(str.GetHashCode());
            var handle = GCHandle.Alloc(str, GCHandleType.Pinned);
            try
            {
                Marshal.WriteInt16(handle.AddrOfPinnedObject(), 4, 'Z');
                Console.WriteLine(str);
                Console.WriteLine(str.GetHashCode());
            }
            finally
            {
                handle.Free();
            }
        }
    }
}

It's largely for security reasons. It's much harder to secure a system if you can't trust that your Strings are tamperproof.

Related

Why factory methods for Collections produce immutable instances? [duplicate]

I am unable to get what are the scenarios where we need an immutable class.
Have you ever faced any such requirement? or can you please give us any real example where we should use this pattern.
The other answers seem too focused on explaining why immutability is good. It is very good and I use it whenever possible. However, that is not your question. I'll take your question point by point to try to make sure you're getting the answers and examples you need.
I am unable to get what are the scenarios where we need an immutable class.
"Need" is a relative term here. Immutable classes are a design pattern that, like any paradigm/pattern/tool, is there to make constructing software easier. Similarly, plenty of code was written before the OO paradigm came along, but count me among the programmers that "need" OO. Immutable classes, like OO, aren't strictly needed, but I'm going to act like I need them.
Have you ever faced any such requirement?
If you aren't looking at the objects in the problem domain with the right perspective, you may not see a requirement for an immutable object. It might be easy to think that a problem domain doesn't require any immutable classes if you're not familiar with when to use them advantageously.
I often use immutable classes where I think of a given object in my problem domain as a value or fixed instance. This notion is sometimes dependent on perspective or viewpoint, but ideally, it will be easy to switch into the right perspective to identify good candidate objects.
You can get a better sense of where immutable objects are really useful (if not strictly necessary) by making sure you read up on various books/online articles to develop a good sense of how to think about immutable classes. One good article to get you started is Java theory and practice: To mutate or not to mutate?
I'll try to give a couple of examples below of how one can see objects in different perspectives (mutable vs immutable) to clarify what I mean by perspective.
... can you please give us any real example where we should use this pattern.
Since you asked for real examples I'll give you some, but first, let's start with some classic examples.
Classic Value Objects
Strings and integers are often thought of as values. Therefore it's not surprising to find that String class and the Integer wrapper class (as well as the other wrapper classes) are immutable in Java. A color is usually thought of as a value, thus the immutable Color class.
Counterexample
In contrast, a car is not usually thought of as a value object. Modeling a car usually means creating a class that has changing state (odometer, speed, fuel level, etc). However, there are some domains where a car may be a value object. For example, a car (or specifically a car model) might be thought of as a value object in an app to look up the proper motor oil for a given vehicle.
Playing Cards
Ever write a playing card program? I did. I could have represented a playing card as a mutable object with a mutable suit and rank. A draw-poker hand could be 5 fixed instances where replacing the 5th card in my hand would mean mutating the 5th playing card instance into a new card by changing its suit and rank ivars.
However, I tend to think of a playing card as an immutable object that has a fixed, unchanging suit and rank once created. My draw poker hand would be 5 instances, and replacing a card in my hand would involve discarding one of those instances and adding a new random instance to my hand.
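The playing-card design above might look roughly like this. The Card and DrawPokerDemo names, the string suits, and the integer ranks are assumptions for the sketch, not the author's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo: the hand is a mutable list of immutable cards.
class DrawPokerDemo {
    // Replacing a card swaps the reference held by the hand; no Card is mutated.
    static void replace(List<Card> hand, int index, Card drawn) {
        hand.set(index, drawn);
    }

    public static void main(String[] args) {
        List<Card> hand = new ArrayList<>(Arrays.asList(
                new Card("HEARTS", 10), new Card("SPADES", 3)));
        Card discarded = hand.get(1);
        replace(hand, 1, new Card("CLUBS", 12));
        System.out.println(hand);      // [10 of HEARTS, 12 of CLUBS]
        System.out.println(discarded); // 3 of SPADES
    }
}

// Immutable value object: all fields final, no setters.
final class Card {
    private final String suit;
    private final int rank;

    Card(String suit, int rank) {
        this.suit = suit;
        this.rank = rank;
    }

    String suit() { return suit; }

    int rank() { return rank; }

    @Override
    public String toString() {
        return rank + " of " + suit;
    }
}
```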
Map Projection
One last example is when I worked on some map code where the map could display itself in various projections. The original code had the map use a fixed, but mutable projection instance (like the mutable playing card above). Changing the map projection meant mutating the map's projection instance's ivars (projection type, center point, zoom, etc).
However, I felt the design was simpler if I thought of a projection as an immutable value or fixed instance. Changing the map projection meant having the map reference a different projection instance rather than mutating the map's fixed projection instance. This also made it simpler to capture named projections such as MERCATOR_WORLD_VIEW.
Immutable classes are in general much simpler to design, implement and use correctly. An example is String: the implementation of java.lang.String is significantly simpler than that of std::string in C++, mostly due to its immutability.
One particular area where immutability makes an especially big difference is concurrency: immutable objects can safely be shared among multiple threads, whereas mutable objects must be made thread-safe via careful design and implementation - usually this is far from a trivial task.
Update: Effective Java 2nd Edition tackles this issue in detail - see Item 15: Minimize mutability.
See also these related posts:
non-technical benefits of having string-type immutable
Downsides to immutable objects in Java?
Effective Java by Joshua Bloch outlines several reasons to write immutable classes:
Simplicity - each class is in one state only
Thread Safe - because the state cannot be changed, no synchronization is required
Writing in an immutable style can lead to more robust code. Imagine if Strings weren't immutable: any getter method that returned a String would have to create a defensive copy before returning it; otherwise a client could accidentally or maliciously break the state of the object.
In general it is good practise to make an object immutable unless there are severe performance problems as a result. In such circumstances, mutable builder objects can be used to build immutable objects e.g. StringBuilder
Hashmaps are a classic example. It's imperative that the key to a map be immutable. If the key is not immutable, and you change a value on the key such that hashCode() would result in a new value, the map is now broken (a key is now in the wrong location in the hash table.).
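A short sketch of how a mutable key breaks a map, using an ArrayList as the key (the class and method names are invented for the example). The entry is still inside the map after the mutation, but it can no longer be found through the key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class.
class MutableKeyDemo {
    // Returns whether the entry is still reachable after its key has been mutated.
    static boolean foundAfterKeyMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value"); // filed under the hash of ["a"]
        key.add("b");          // the key's hashCode() is now different
        return map.containsKey(key);
    }

    // A String key cannot change after insertion, so the lookup keeps working.
    static boolean foundWithStringKey() {
        Map<String, String> map = new HashMap<>();
        String key = "a";
        map.put(key, "value");
        key = key + "b"; // a new String; the key stored in the map is untouched
        return map.containsKey("a");
    }

    public static void main(String[] args) {
        System.out.println(foundAfterKeyMutation()); // false
        System.out.println(foundWithStringKey());    // true
    }
}
```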
In Java, practically everything is a reference. Sometimes an instance is referenced from multiple places. If you change such an instance, the change is visible through all of its references. Sometimes you simply don't want this, for the sake of robustness and thread safety. Then an immutable class is useful, because one is forced to create a new instance and reassign it to the current reference. This way, the original instance seen through the other references remains untouched.
Imagine what Java would look like if String were mutable.
Let's take an extreme case: integer constants. If I write a statement like "x=x+1" I want to be 100% confident that the number "1" will not somehow become 2, no matter what happens anywhere else in the program.
Now okay, integer constants are not a class, but the concept is the same. Suppose I write:
String customerId=getCustomerId();
String customerName=getCustomerName(customerId);
String customerBalance=getCustomerBalance(customerId);
Looks simple enough. But if Strings were not immutable, then I would have to consider the possibility that getCustomerName could change customerId, so that when I call getCustomerBalance, I am getting the balance for a different customer. Now you might say, "Why in the world would someone writing a getCustomerName function make it change the id? That would make no sense." But that's exactly where you could get in trouble. The person writing the above code might take it as just obvious that the functions would not change the parameter. Then someone comes along who has to modify another use of that function to handle the case where a customer has multiple accounts under the same name. And he says, "Oh, here's this handy getCustomerName function that's already looking up the name. I'll just make that automatically change the id to the next account with the same name, and put it in a loop ..." And then your program starts mysteriously not working. Would that be bad coding style? Probably. But it's precisely a problem in cases where the side effect is NOT obvious.
Immutability simply means that a certain class of objects are constants, and we can treat them as constants.
(Of course the user could assign a different "constant object" to a variable. Someone can write
String s="hello";
and then later write
s="goodbye";
Unless I make the variable final, I can't be sure that it's not being changed within my own block of code. Just like integer constants assure me that "1" is always the same number, but not that "x=1" will never be changed by writing "x=2". But I can be confident that if I have a handle to an immutable object, no function I pass it to can change it on me, and that if I make two copies of it, a change to the variable holding one copy will not change the other. Etc.)
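To make the distinction concrete, here is a hedged sketch with made-up names. Passing a String to a method cannot change what the caller sees, while passing a mutable StringBuilder can, even though both local variables are final.

```java
// Hypothetical demo class; the appendWorld overloads are invented for illustration.
class FinalVsImmutableDemo {
    // Rebinds only the method's own parameter; the caller's String is untouched.
    static void appendWorld(String s) {
        s = s + " world";
    }

    // Mutates the object the caller also refers to.
    static void appendWorld(StringBuilder sb) {
        sb.append(" world");
    }

    static String afterPassingString() {
        final String s = "hello";
        appendWorld(s);
        return s;
    }

    static String afterPassingBuilder() {
        final StringBuilder sb = new StringBuilder("hello");
        appendWorld(sb); // final protects the variable, not the object
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(afterPassingString());  // hello
        System.out.println(afterPassingBuilder()); // hello world
    }
}
```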
We don't need immutable classes, per se, but they can certainly make some programming tasks easier, especially when multiple threads are involved. You don't have to perform any locking to access an immutable object, and any facts that you've already established about such an object will continue to be true in the future.
There are various reason for immutability:
Thread Safety: An immutable object's internal state cannot change, so there's no need to synchronise access to it.
It also guarantees that whatever I send through (through a network) has to come in the same state as previously sent. It means that nobody (eavesdropper) can come and add random data in my immutable set.
It's also simpler to develop. You guarantee that no subclasses will exist if an object is immutable. E.g. a String class.
So, if you want to send data through a network service, and you want a sense of guarantee that you will have your result exactly the same as what you sent, set it as immutable.
My 2 cents for future visitors:
2 scenarios where immutable objects are good choices are:
In multi-threading
Concurrency issues in a multi-threaded environment can very well be solved by synchronization, but synchronization is a costly affair (I won't dig into the "why" here). If you use immutable objects, no synchronization is needed to solve the concurrency issue: the state of an immutable object cannot be changed, so all threads can access the object seamlessly. Immutable objects therefore make a great choice for shared objects in a multi-threaded environment.
As key for hash based collections
One of the most important things to note when working with a hash-based collection is that a key's hashCode() should always return the same value for the lifetime of the object. If that value changes, the old entry made in the collection using that object cannot be retrieved, which causes a memory leak. Since the state of immutable objects cannot be changed, they make a great choice as keys in a hash-based collection. If you use an immutable object as a key, you can be sure that there will not be any memory leak because of that (of course, there can still be a memory leak when the object used as the key is not referenced from anywhere else, but that's not the point here).
I'm going to attack this from a different perspective. I find immutable objects make life easier for me when reading code.
If I have a mutable object I am never sure what its value is if it's ever used outside of my immediate scope. Let's say I create MyMutableObject in a method's local variables, fill it out with values, then pass it to five other methods. ANY ONE of those methods can change my object's state, so one of two things has to occur:
I have to keep track of the bodies of five additional methods while thinking about my code's logic.
I have to make five wasteful defensive copies of my object to ensure that the right values get passed to each method.
The first makes reasoning about my code difficult. The second makes my code suck in performance -- I'm basically mimicking an immutable object with copy-on-write semantics anyway, but doing it all the time whether or not the called methods actually modify my object's state.
If I instead use MyImmutableObject, I can be assured that what I set is what the values will be for the life of my method. There's no "spooky action at a distance" that will change it out from under me and there's no need for me to make defensive copies of my object before invoking the five other methods. If the other methods want to change things for their purposes they have to make the copy – but they only do this if they really have to make a copy (as opposed to my doing it before each and every external method call). I spare myself the mental resources of keeping track of methods which may not even be in my current source file, and I spare the system the overhead of endlessly making unnecessary defensive copies just in case.
(If I go outside of the Java world and into, say, the C++ world, among others, I can get even trickier. I can make the objects appear as if they're mutable, but behind the scenes make them transparently clone on any kind of state change—that's copy-on-write—with nobody being the wiser.)
Immutable objects are instances whose states do not change once initiated.
The use of such objects is requirement specific.
Immutable class is good for caching purpose and it is thread safe.
By virtue of immutability, you can be sure that the behavior/state of the underlying immutable object does not change. With that, you get the added advantage of performing additional operations:
You can use multiple cores (concurrent/parallel processing) with ease, as the sequence of operations no longer matters.
You can cache the results of expensive operations, as you are sure of the same result.
You can debug with ease, as the history of the run is no longer a concern.
Using the final keyword doesn't necessarily make something immutable:
import java.lang.reflect.Field;

public class Scratchpad {
    public static void main(String[] args) throws Exception {
        SomeData sd = new SomeData("foo");
        System.out.println(sd.data); //prints "foo"
        voodoo(sd, "data", "bar");
        System.out.println(sd.data); //prints "bar"
    }

    // Overwrites a final instance field of obj via reflection.
    private static void voodoo(Object obj, String fieldName, Object value) throws Exception {
        Field f = obj.getClass().getDeclaredField(fieldName);
        // For a non-static final field, setAccessible(true) is enough to permit writes.
        // The older trick of clearing Modifier.FINAL through Field's private "modifiers"
        // field is not needed here and no longer works on recent JDKs.
        f.setAccessible(true);
        f.set(obj, value);
    }
}

class SomeData {
    final String data;

    SomeData(String data) {
        this.data = data;
    }
}
Just an example to demonstrate that the "final" keyword is there to prevent programmer error, and not much more. Whereas reassigning a value lacking a final keyword can easily happen by accident, going to this length to change a value would have to be done intentionally. It's there for documentation and to prevent programmer error.
Immutable data structures can also help when coding recursive algorithms. For example, say that you're trying to solve a 3SAT problem. One way is to do the following:
Pick an unassigned variable.
Give it the value of TRUE. Simplify the instance by taking out clauses that are now satisfied, and recur to solve the simpler instance.
If the recursion on the TRUE case failed, then assign that variable FALSE instead. Simplify this new instance, and recur to solve it.
If you have a mutable structure to represent the problem, then when you simplify the instance in the TRUE branch, you'll either have to:
Keep track of all changes you make, and undo them all once you realize the problem can't be solved. This has large overhead because your recursion can go pretty deep, and it's tricky to code.
Make a copy of the instance, and then modify the copy. This will be slow because if your recursion is a few dozen levels deep, you'll have to make many many copies of the instance.
However if you code it in a clever way, you can have an immutable structure, where any operation returns an updated (but still immutable) version of the problem (similar to String.replace - it doesn't replace the string, just gives you a new one). The naive way to implement this is to have the "immutable" structure just copy and make a new one on any modification, reducing it to the 2nd solution when having a mutable one, with all that overhead, but you can do it in a more efficient way.
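One common way to get that efficiency is structural sharing. Below is a minimal sketch, not a 3SAT solver: an immutable singly linked list (the hypothetical PList) where adding an element creates one new node and reuses the entire existing list as its tail. Two branches of a recursion can then extend the same base without copying it and without undoing anything afterwards.

```java
// Hypothetical demo: two branches extend the same base list without copying it.
class PersistentListDemo {
    public static void main(String[] args) {
        PList<String> base = PList.<String>empty().prepend("x1=TRUE");
        PList<String> branchTrue = base.prepend("x2=TRUE");
        PList<String> branchFalse = base.prepend("x2=FALSE");
        System.out.println(branchTrue.tail() == branchFalse.tail()); // true: shared tail
        System.out.println(base.size());                             // 1: base is unchanged
    }
}

// Immutable singly linked list; every "modification" returns a new list.
final class PList<T> {
    private static final PList<?> EMPTY = new PList<>(null, null, 0);

    private final T head;        // null only in the empty list
    private final PList<T> tail; // null only in the empty list
    private final int size;

    private PList(T head, PList<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> PList<T> empty() {
        @SuppressWarnings("unchecked")
        PList<T> e = (PList<T>) EMPTY; // safe: the empty list holds no elements
        return e;
    }

    // O(1): allocates one node and shares this list as the new tail.
    PList<T> prepend(T value) {
        return new PList<>(value, this, size + 1);
    }

    T head() { return head; }

    PList<T> tail() { return tail; }

    int size() { return size; }
}
```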
One of the reasons for the "need" for immutable classes is the combination of passing everything by reference and having no support for read-only views of an object (i.e. C++'s const).
Consider the simple case of a class having support for the observer pattern:
class Person {
public string getName() { ... }
public void registerForNameChange(NameChangedObserver o) { ... }
}
If string were not immutable, it would be impossible for the Person class to implement registerForNameChange() correctly, because someone could write the following, effectively modifying the person's name without triggering any notification.
void foo(Person p) {
p.getName().prepend("Mr. ");
}
In C++, getName() returning a const std::string& has the effect of returning by reference and preventing access to mutators, meaning immutable classes are not necessary in that context.
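As a rough Java illustration of the same point, here is a sketch with a hypothetical ObservablePerson class that uses java.util.function.Consumer in place of the NameChangedObserver from the answer. Because String has no mutators, getName() can return the field directly and setName() remains the only way the name can change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; Consumer<String> stands in for NameChangedObserver.
final class ObservablePerson {
    private String name;
    private final List<Consumer<String>> observers = new ArrayList<>();

    ObservablePerson(String name) {
        this.name = name;
    }

    // Safe to hand out: callers cannot alter the String this field refers to.
    String getName() {
        return name;
    }

    void registerForNameChange(Consumer<String> observer) {
        observers.add(observer);
    }

    // The only path by which the name changes, so every change is observed.
    void setName(String newName) {
        this.name = newName;
        for (Consumer<String> observer : observers) {
            observer.accept(newName);
        }
    }
}
```

A caller that writes "Mr. " + p.getName() gets a new String; the person's name is unaffected and no notification is missed.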
They also give us a guarantee. The guarantee of immutability means that we can build on them and create new patterns for efficiency that would otherwise not be possible.
http://en.wikipedia.org/wiki/Singleton_pattern
One feature of immutable classes which hasn't yet been called out: storing a reference to a deeply-immutable class object is an efficient means of storing all of the state contained therein. Suppose I have a mutable object which uses a deeply-immutable object to hold 50K worth of state information. Suppose, further, that I wish on 25 occasions to make a "copy" of my original (mutable) object (e.g. for an "undo" buffer); the state could change between copy operations, but usually doesn't. Making a "copy" of the mutable object would simply require copying a reference to its immutable state, so 25 copies would simply amount to 25 references. By contrast, if the state were held in 50K worth of mutable objects, each of the 25 copy operations would have to produce its own copy of 50K worth of data; holding all 25 copies would require holding over a meg worth of mostly-duplicated data. Even though the first copy operation would produce a copy of the data that will never change, and the other 24 operations could in theory simply refer back to that, in most implementations there would be no way for the second object asking for a copy of the information to know that an immutable copy already exists(*).
(*) One pattern that can sometimes be useful is for mutable objects to have two fields to hold their state--one in mutable form and one in immutable form. Objects can be copied as mutable or immutable, and would begin life with one or the other reference set. As soon as the object wants to change its state, it copies the immutable reference to the mutable one (if it hasn't been done already) and invalidates the immutable one. When the object is copied as immutable, if its immutable reference isn't set, an immutable copy will be created and the immutable reference pointed to that. This approach will require a few more copy operations than would a "full-fledged copy on write" (e.g. asking to copy an object which has been mutated since the last copy would require a copy operation, even if the original object is never again mutated) but it avoids the threading complexities that FFCOW would entail.
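A minimal sketch of the undo-buffer case, assuming hypothetical DocState and Editor classes (neither is from the answer). Each snapshot stores a reference to the current immutable state, so repeated snapshots of an unchanged document cost one reference each rather than a copy of the data.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of an undo buffer over deeply immutable state.
final class UndoDemo {

    // Deeply immutable: the only field is final and refers to a String.
    static final class DocState {
        final String text;

        DocState(String text) {
            this.text = text;
        }

        // "Modification" produces a new state object.
        DocState withText(String newText) {
            return new DocState(newText);
        }
    }

    static final class Editor {
        private DocState state = new DocState("");
        private final Deque<DocState> undoStack = new ArrayDeque<>();

        // Copies a reference, not the underlying data.
        void snapshot() {
            undoStack.push(state);
        }

        void type(String more) {
            state = state.withText(state.text + more);
        }

        void undo() {
            if (!undoStack.isEmpty()) {
                state = undoStack.pop();
            }
        }

        String text() {
            return state.text;
        }

        int snapshots() {
            return undoStack.size();
        }
    }

    public static void main(String[] args) {
        Editor editor = new Editor();
        editor.type("draft");
        editor.snapshot();
        editor.type(" v2");
        editor.undo();
        System.out.println(editor.text());
    }
}
```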
Why Immutable class?
Once an object is instantiated, its state cannot be changed for its lifetime, which also makes it thread-safe.
Examples:
Obviously String, Integer, BigDecimal, etc. Once these values are created, they cannot be changed.
Use case:
Once a database connection object has been created with its configuration values, you may never need to change its state, so an immutable class is a good fit.
from Effective Java;
An immutable class is simply a class whose instances cannot be modified. All of
the information contained in each instance is provided when it is created and is
fixed for the lifetime of the object. The Java platform libraries contain many
immutable classes, including String, the boxed primitive classes, and BigInteger
and BigDecimal. There are many good reasons for this: Immutable classes
are easier to design, implement and use than mutable classes. They are less prone
to error and are more secure.
An immutable class is good for caching purposes because you don't have to worry about the value changes. Another benefit of an immutable class is that it is inherently thread-safe, so you don't have to worry about thread safety in case of a multi-threaded environment.

Why do we have String class, if StringBuilder or StringBuffer can do what a String does? [duplicate]

This question already has answers here:
Why can't strings be mutable in Java and .NET?
I've always wondered why Java and C# have a String class (immutable and thread-safe) when they also have StringBuilder (mutable and not thread-safe) or StringBuffer (mutable and thread-safe). Isn't StringBuilder/StringBuffer a superset of the String class? I mean, why should I use the String class if I have the option of using StringBuilder/StringBuffer?
For example, Instead of using following,
String str;
Why can't I always use following?
StringBuilder strb; //or
StringBuffer strbu;
In short, my question is: how will my code be affected if I replace String with StringBuffer? Also, StringBuffer has the added advantage of mutability.
I mean, why should I use String class, if I've option of using StringBuilder/StringBuffer?
Precisely because it's immutable. Immutability has a whole host of benefits, primarily that it makes it much easier to reason about your code without creating copies of the data everywhere "just in case" something decides to mutate the value. For example:
private readonly String name;
public Person(string name)
{
if (string.IsNullOrEmpty(name)) // Or whatever
{
// Throw some exception
}
this.name = name;
}
// All the rest of the code can rely on name being a non-null
// reference to a non-empty string. Nothing can mutate it, leaving
// evil reflection aside.
Immutability makes sharing simple and efficient. That's particularly useful for multi-threaded code. It makes "modifying" (i.e. creating a new instance with different data) more painful, but in many situations that's absolutely fine, because values pass through the system without ever being modified.
Immutability is particularly useful for "simple" types such as strings, dates, numbers (BigDecimal, BigInteger etc). It allows them to be used within maps more easily, it allows a simple equality definition, etc.
1) StringBuilder and StringBuffer are both mutable, which causes problems when they are used in collections, for example as keys in a HashMap. See this link.
Another example of the advantage of immutability is what Jon mentioned in his comments. I am just pasting it here:
Someone can call Person p = new Person(builder); with a builder which initially passes my validation criteria - and then modify it afterwards, without the Person class having any say in it. In order to avoid that, the Person class would need to copy the validated data.
Immutability ensures this does not happen.
2) As String is the most extensively used object in Java, the string pool allows the same string to be reused, thus saving memory.
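To make point 1 concrete, here is a sketch with a hypothetical ValidatedPerson class (not from the answers). Because its constructor accepts any CharSequence, including a mutable StringBuilder, it has to take a snapshot with toString() before validating; with a plain String parameter that copy would be unnecessary.

```java
import java.util.Objects;

// Hypothetical sketch of the defensive copy a mutable argument forces.
final class ValidatedPerson {
    private final String name;

    ValidatedPerson(CharSequence name) {
        // Snapshot first: later changes to the caller's builder cannot reach us.
        String copy = Objects.requireNonNull(name, "name").toString();
        if (copy.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        this.name = copy;
    }

    String getName() {
        return name;
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("Ann");
        ValidatedPerson person = new ValidatedPerson(builder);
        builder.setLength(0); // caller empties the builder after validation
        System.out.println(person.getName());
    }
}
```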
I completely agree with Jon Skeet that immutability is one reason to use String. Another reason (from a C# perspective) is that String is actually lighter weight than StringBuilder. If you look at reference source for both String and String Builder you will see that StringBuilder actually has a number of String constants in it. As a developer, you should only use what you need so unless you need the added benefits provided from StringBuilder you should use String.
Many answers have already outlined the shortcomings of using mutable variants such as StringBuilder. To illustrate the problem, one thing that you cannot achieve with StringBuilder is associative memory, i.e. hash tables. Sure, most implementations will allow you to use StringBuilder as a key for hash tables, but they will only find the values for the exact same instance of StringBuilder. However, the behavior you typically want is that it does not matter where the string comes from, because only the characters are important, e.g. when you read the string from a database or file (or any other external resource).
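A small Java sketch of that difference (the KeyDemo class is hypothetical). It relies on the fact that java.lang.StringBuilder does not override equals() or hashCode(), so a HashMap compares builder keys by identity, while String keys are compared by content.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only standard library calls are used.
final class KeyDemo {

    // Two distinct String objects with equal characters find the same entry.
    static boolean foundByEqualString() {
        Map<String, Integer> map = new HashMap<>();
        map.put(new String("id"), 1);
        return map.containsKey(new String("id"));
    }

    // Two distinct builders with equal characters are different keys.
    static boolean foundByEqualBuilder() {
        Map<StringBuilder, Integer> map = new HashMap<>();
        map.put(new StringBuilder("id"), 1);
        return map.containsKey(new StringBuilder("id"));
    }

    public static void main(String[] args) {
        System.out.println(foundByEqualString());
        System.out.println(foundByEqualBuilder());
    }
}
```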
However, as far as I understood your question, you were mainly asking about field types. And indeed, I see your point particularly taking into account that we are doing the exact same thing with collections of other objects which are usually not immutable objects but mutable collections, such as List or ArrayList in C# or Java, respectively. In the end, a string is only a collection of characters, so why not making it mutable?
The answer I would give here is that the usual behavior of how such a string is changed is very different to usual collections. If you have a collection of subsequent elements, it is very common to only add a single element to the collection, leaving most of the collection untouched, i.e. you would not discard a list to insert an item, at least unless you are programming in Haskell :). For many strings like names, this is different as you typically replace the whole string. Given the importance of a string data type, the platforms usually offer a lot of optimization for strings such as interned strings, making the choice even more biased towards strings.
However, in the end, every program is different and you might have requirements that make it more reasonable to use StringBuilder by default, but for the given reasons, I think that these cases are rather rare.
EDIT: As you were asking for examples. Consider the following code:
stopwatch.Start();
var s = "";
for (int i = 0; i < 100000; i++)
{
s = "." + s;
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
stopwatch.Restart();
var s2 = new StringBuilder();
for (int i = 0; i < 100000; i++)
{
s2.Insert(0, ".");
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedMilliseconds);
Technically, both bits are doing a very similar thing, they will insert a character at the first position and shift whatever comes after. Both versions will involve copying the whole string that has been there before. The version with string completes in 1750ms on my machine whereas StringBuilder took 2245ms. However, both versions are reasonably fast, making the performance impact negligible in this case.
I would like to add some differences between String and StringBuilder classes:
Yes, as mentioned above, String is an immutable class and its content cannot be changed after the string has been created. This allows the same string objects to be used from different threads without locking.
If you need to concatenate a lot of strings together, use the StringBuilder class. When you use the "+" operator repeatedly, it creates a lot of string objects on the managed heap and hurts performance.
StringBuilder is a mutable class. It stores characters in an array and can manipulate them (add, remove, replace, append) without creating a new string object.
If you know the approximate length of the result string, you should set the capacity. The default capacity is 16 (.NET 4.5). This improves performance because StringBuilder has an inner array of chars, which is reallocated when the character count exceeds the current capacity.
String:
is immutable (so you can use it in collections)
every operation creates a new instance on the Heap. Technically speaking really depends on the code.
For performance and memory consumption purposes it makes sense to use StringBuilder.
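The answer above describes .NET. As a hedged Java counterpart, the sketch below uses java.lang.StringBuilder, whose constructor also accepts an initial capacity; the JoinDemo class and its method name are hypothetical.

```java
// Hypothetical demo class showing a pre-sized StringBuilder in Java.
final class JoinDemo {

    // Builds a string of n dots with a single pre-sized buffer.
    static String dots(int n) {
        StringBuilder sb = new StringBuilder(n); // capacity hint avoids regrowth
        for (int i = 0; i < n; i++) {
            sb.append('.');
        }
        return sb.toString(); // one immutable String at the end
    }

    public static void main(String[] args) {
        System.out.println(dots(10).length());
    }
}
```

Appending in a loop and calling toString() once avoids the intermediate String objects that repeated "+" would create.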

"Immutable" strings in Java - actually, it's a lie [duplicate]

We all know that String is immutable in Java, but check the following code:
String s1 = "Hello World";
String s2 = "Hello World";
String s3 = s1.substring(6);
System.out.println(s1); // Hello World
System.out.println(s2); // Hello World
System.out.println(s3); // World
Field field = String.class.getDeclaredField("value");
field.setAccessible(true);
char[] value = (char[])field.get(s1);
value[6] = 'J';
value[7] = 'a';
value[8] = 'v';
value[9] = 'a';
value[10] = '!';
System.out.println(s1); // Hello Java!
System.out.println(s2); // Hello Java!
System.out.println(s3); // World
Why does this program operate like this? And why is the value of s1 and s2 changed, but not s3?
String is immutable* but this only means you cannot change it using its public API.
What you are doing here is circumventing the normal API, using reflection. The same way, you can change the values of enums, change the lookup table used in Integer autoboxing etc.
Now, the reason s1 and s2 change value, is that they both refer to the same interned string. The compiler does this (as mentioned by other answers).
The reason s3 does not was actually a bit surprising to me, as I thought it would share the value array (it did in earlier version of Java, before Java 7u6). However, looking at the source code of String, we can see that the value character array for a substring is actually copied (using Arrays.copyOfRange(..)). This is why it goes unchanged.
You can install a SecurityManager, to avoid malicious code to do such things. But keep in mind that some libraries depend on using these kind of reflection tricks (typically ORM tools, AOP libraries etc).
*) I initially wrote that Strings aren't really immutable, just "effective immutable". This might be misleading in the current implementation of String, where the value array is indeed marked private final. It's still worth noting, though, that there is no way to declare an array in Java as immutable, so care must be taken not to expose it outside its class, even with the proper access modifiers.
As this topic seems overwhelmingly popular, here's some suggested further reading: Heinz Kabutz's Reflection Madness talk from JavaZone 2009, which covers a lot of the issues in the OP, along with other reflection... well... madness.
It covers why this is sometimes useful. And why, most of the time, you should avoid it. :-)
In Java, if two String variables are initialized to the same literal, the same reference is assigned to both variables:
String test1 = "Hello World";
String test2 = "Hello World";
System.out.println(test1 == test2); // true
That is the reason the comparison returns true. The third string is created using substring(), which makes a new string instead of pointing to the same one.
When you access a string using reflection, you get the actual pointer:
Field field = String.class.getDeclaredField("value");
field.setAccessible(true);
So a change made through it changes every string that points to the same array, but as s3 was created as a new string by substring(), it does not change.
You are using reflection to circumvent the immutability of String - it's a form of "attack".
There are lots of examples you can create like this (eg you can even instantiate a Void object too), but it doesn't mean that String is not "immutable".
There are use cases where this type of code may be used to your advantage and be "good coding", such as clearing passwords from memory at the earliest possible moment (before GC).
Depending on the security manager, you may not be able to execute your code.
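For the password case specifically, a commonly used alternative that needs no reflection is to keep the secret in a char[] and overwrite it after use. This is a swapped-in technique, not what the answer above does, and the SecretDemo class is hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: wipe a secret held in a mutable char[] after use.
final class SecretDemo {

    // Compares the password, then overwrites the caller's array either way.
    static boolean checkAndWipe(char[] password, char[] expected) {
        try {
            return Arrays.equals(password, expected);
        } finally {
            Arrays.fill(password, '\0'); // the secret no longer lingers in this array
        }
    }

    public static void main(String[] args) {
        char[] typed = { 's', '3', 'c' };
        System.out.println(checkAndWipe(typed, new char[] { 's', '3', 'c' }));
    }
}
```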
You are using reflection to access the "implementation details" of string object. Immutability is the feature of the public interface of an object.
Visibility modifiers and final (i.e. immutability) are not a measurement against malicious code in Java; they are merely tools to protect against mistakes and to make the code more maintainable (one of the big selling points of the system). That is why you can access internal implementation details like the backing char array for Strings via reflection.
The second effect you see is that all Strings change while it looks like you only change s1. It is a certain property of Java String literals that they are automatically interned, i.e. cached. Two String literals with the same value will actually be the same object. When you create a String with new it will not be interned automatically and you will not see this effect.
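The interning behaviour described above can be checked with a short sketch (the InternDemo class is hypothetical). Equal literals refer to one object, new String(...) produces a distinct object, and intern() returns the pooled one.

```java
// Hypothetical demo class; reference comparison (==) is intentional here.
final class InternDemo {

    // Equal literals are interned, so both variables hold the same reference.
    static boolean literalsShared() {
        String a = "Hello World";
        String b = "Hello World";
        return a == b;
    }

    // new String(...) always creates a separate object.
    static boolean newIsDistinct() {
        String a = "Hello World";
        String c = new String("Hello World");
        return a != c;
    }

    // intern() hands back the pooled instance with the same characters.
    static boolean internRestoresSharing() {
        String a = "Hello World";
        String c = new String("Hello World");
        return a == c.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalsShared() && newIsDistinct() && internRestoresSharing());
    }
}
```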
#substring until recently (Java 7u6) worked in a similar way, which would have explained the behaviour in the original version of your question. It didn't create a new backing char array but reused the one from the original String; it just created a new String object that used an offset and a length to present only a part of that array. This generally worked as Strings are immutable - unless you circumvent that. This property of #substring also meant that the whole original String couldn't be garbage collected when a shorter substring created from it still existed.
As of current Java and your current version of the question there is no strange behaviour of #substring.
String immutability is from the interface perspective. You are using reflection to bypass the interface and directly modify the internals of the String instances.
s1 and s2 are both changed because they are both assigned to the same "intern" String instance. You can find out a bit more about that part from this article about string equality and interning. You might be surprised to find out that in your sample code, s1 == s2 returns true!
Which version of Java are you using? From Java 1.7.0_06, Oracle has changed the internal representation of String, especially the substring.
Quoting from Oracle Tunes Java's Internal String Representation:
In the new paradigm, the String offset and count fields have been removed, so substrings no longer share the underlying char [] value.
With this change, it may happen without reflection (???).
There are really two questions here:
Are strings really immutable?
Why is s3 not changed?
To point 1: Except for ROM there is no immutable memory in your computer. Nowadays even ROM is sometimes writable. There is always some code somewhere (whether it's the kernel or native code sidestepping your managed environment) that can write to your memory address. So, in "reality", no they are not absolutely immutable.
To point 2: This is because substring is probably allocating a new string instance, which is likely copying the array. It is possible to implement substring in such a way that it won't do a copy, but that doesn't mean it does. There are tradeoffs involved.
For example, should holding a reference to reallyLargeString.substring(reallyLargeString.length - 2) cause a large amount of memory to be held alive, or only a few bytes?
That depends on how substring is implemented. A deep copy will keep less memory alive, but it will run slightly slower. A shallow copy will keep more memory alive, but it will be faster. Using a deep copy can also reduce heap fragmentation, as the string object and its buffer can be allocated in one block, as opposed to 2 separate heap allocations.
In any case, it looks like your JVM chose to use deep copies for substring calls.
To add to @haraldK's answer: this is a security hack which could have a serious impact on the app.
The first thing is the modification of a constant string stored in the String pool. When a string is declared as String s = "Hello World";, it is placed into a special object pool for potential reuse. The issue is that the compiler places a reference to the pooled string at compile time, and once the user modifies the string stored in this pool at runtime, all references in the code will see the modified version. This would result in the following bug:
System.out.println("Hello World");
Will print:
Hello Java!
There was another issue I experienced when I was implementing a heavy computation over such risky strings. There was a bug which happened roughly 1 in 1,000,000 times during the computation and made the result nondeterministic. I was able to find the problem by switching off the JIT: I always got the same result with the JIT turned off. My guess is that the cause was this String security hack, which broke some of the JIT optimization contracts.
According to the concept of pooling, all the String variables containing the same value will point to the same memory address. Therefore s1 and s2, both containing the same value of “Hello World”, will point towards the same memory location (say M1).
On the other hand, s3 contains “World”, hence it will point to a different memory allocation (say M2).
So what's happening now is that the value of s1 is being changed (through the char[] value array). The value at memory location M1, pointed to by both s1 and s2, has therefore been changed.
Hence as a result, memory location M1 has been modified which causes change in the value of s1 and s2.
But the value of location M2 remains unaltered, hence s3 contains the same original value.
The reason s3 does not actually change is because in Java when you do a substring the value character array for a substring is internally copied (using Arrays.copyOfRange()).
s1 and s2 are the same because in Java they both refer to the same interned string. It's by design in Java.
String is immutable, but through reflection you're allowed to change the String class. You've just redefined the String class as mutable in real-time. You could redefine methods to be public or private or static if you wanted.
Strings are created in permanent area of the JVM heap memory. So yes, it's really immutable and cannot be changed after being created.
Because in the JVM, there are three types of heap memory:
1. Young generation
2. Old generation
3. Permanent generation.
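When an object is created, it goes into the young generation heap area; the PermGen area is reserved for String pooling (in older JVMs; since Java 7 the string pool lives in the main heap, and PermGen itself was removed in Java 8).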
Here is more detail you can go and grab more information from:
How Garbage Collection works in Java .
[Disclaimer this is a deliberately opinionated style of answer as I feel a more "don't do this at home kids" answer is warranted]
The sin is the line field.setAccessible(true); which says to violate the public API by allowing access to a private field. That's a giant security hole which can be locked down by configuring a security manager.
The phenomenon in the question are implementation details which you would never see when not using that dangerous line of code to violate the access modifiers via reflection. Clearly two (normally) immutable strings can share the same char array. Whether a substring shares the same array depends on whether it can and whether the developer thought to share it. Normally these are invisible implementation details which you should not have to know unless you shoot the access modifier through the head with that line of code.
It is simply not a good idea to rely upon such details which cannot be experienced without violating the access modifiers using reflection. The owner of that class only supports the normal public API and is free to make implementation changes in the future.
Having said all that, the line of code is really very useful when you have a gun held to your head forcing you to do such dangerous things. Using that back door is usually a code smell that you need to upgrade to better library code where you don't have to sin. Another common use of that dangerous line of code is to write a "voodoo framework" (ORM, injection container, ...). Many folks get religious about such frameworks (both for and against them), so I will avoid inviting a flame war by saying nothing other than that the vast majority of programmers don't have to go there.
String is immutable by nature because there is no method that modifies a String object.
That is the reason the StringBuilder and StringBuffer classes were introduced.
This is a quick guide to everything
// Character array
char[] chr = {'O', 'K', '!'};
// this is String class
String str1 = new String(chr);
// this is concat
str1 = str1.concat("another string's ");
// this is format
System.out.println(String.format(str1 + " %s ", "string"));
// this is equals
System.out.println(str1.equals("another string"));
//this is split
for(String s: str1.split(" ")){
System.out.println(s);
}
// this is length
System.out.println(str1.length());
// this is compareTo: returns the difference between the first mismatching characters
System.out.println(str1.compareTo("OK!another string string's"));
// trim
System.out.println(str1.trim());
// intern
System.out.println(str1.intern());
// character at
System.out.println(str1.charAt(5));
// substring
System.out.println(str1.substring(5, 12));
// to uppercase
System.out.println(str1.toUpperCase());
// to lowerCase
System.out.println(str1.toLowerCase());
// replace
System.out.println(str1.replace("another", "hello"));
// output
// OK!another string's string
// false
// OK!another
// string's
// 20
// 7
// OK!another string's
// OK!another string's
// o
// other s
// OK!ANOTHER STRING'S
// ok!another string's
// OK!hello string's

Why is String immutable in Java?

I was asked in an interview why String is immutable
I answered like this:
When we create a string in java like String s1="hello"; then an
object will be created in string pool(hello) and s1 will be
pointing to hello.Now if again we do String s2="hello"; then
another object will not be created but s2 will point to hello
because JVM will first check if the same object is present in
string pool or not.If not present then only a new one is created else not.
Now suppose Java allowed strings to be mutable: if we changed s1 to hello world, then the value of s2 would also be hello world. That is why Java's String is immutable.
Can any body please tell me if my answer is right or wrong?
String is immutable for several reasons, here is a summary:
Security: parameters are typically represented as String in network connections, database connection urls, usernames/passwords etc. If it were mutable, these parameters could be easily changed.
Synchronization and concurrency: making String immutable automatically makes them thread safe thereby solving the synchronization issues.
Caching: when the compiler optimizes your String objects, it sees that if two objects have the same value (a="test" and b="test"), only one String object is needed (both a and b will point to the same object).
Class loading: String is used as arguments for class loading. If mutable, it could result in wrong class being loaded (because mutable objects change their state).
That being said, immutability of String only means you cannot change it using its public API. You can in fact bypass the normal API using reflection. See the answer here.
In your example, if String was mutable, then consider the following example:
String a="stack";
System.out.println(a);//prints stack
a.setValue("overflow");
System.out.println(a);//if mutable it would print overflow
The Java developers decided that Strings should be immutable for reasons of design, efficiency, and security.
Design
String literals are stored in a special memory area of the Java heap known as the "String intern pool". When you create a new String variable from a literal, the JVM searches the pool to check whether an equal string already exists. (This does not apply when you use the String() constructor or any other String function that internally uses it: the constructor always creates a new object on the heap, outside the pool, unless you call intern() on the result.)
If it exists, a reference to the existing String object is returned.
If String were not immutable, changing the String through one reference would lead to the wrong value for the other references.
According to this article on DZone:
Security
String is widely used as parameter for many java classes, e.g. network connection, opening files, etc. Were String not immutable, a connection or file would be changed and lead to serious security threat.
Mutable strings could cause security problem in Reflection too, as the parameters are strings.
Efficiency
The hashcode of a string is frequently used in Java, for example in a HashMap. Being immutable guarantees that the hashcode will always be the same, so it can be cached without worrying about changes. That means there is no need to calculate the hashcode every time it is used.
We cannot be sure what the Java designers were actually thinking while designing String; we can only infer these reasons from the advantages we get out of string immutability, some of which are:
1. Existence of String Constant Pool
As discussed in the article Why String is Stored in String Constant Pool, every application creates a great many string objects. To save the JVM from first creating lots of string objects and then garbage collecting them, the JVM stores string literals in a separate memory area called the String constant pool (SCP) and reuses objects from that cached pool.
Whenever we create a string literal, the JVM first checks whether that literal is already present in the constant pool; if it is, the new reference will point to the same object in the SCP.
String a = "Naresh";
String b = "Naresh";
String c = "Naresh";
In the above example, the string object with value Naresh is created in the SCP only once and all of the references a, b and c point to the same object. But what if we try to make a change through a, e.g. a.replace("a", "")?
Ideally, a should end up with the value Nresh while b and c remain unchanged, because as an end user we are changing a only. But we know a, b and c all point to the same object, so if the object itself could be changed, the others would reflect the change too.
String immutability saves us from this scenario: the string object Naresh will never change. When we make a change through a, the JVM creates a new object containing the changed value instead of modifying the string object Naresh, and we can assign that new object to a.
So the String pool is only possible because of String's immutability. If String were not immutable, caching and reusing string objects would not be possible, because any variable could change the value and corrupt the others.
And that's why String is handled specially by the JVM and has been given a special memory area.
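The Naresh example can be sketched as follows (the PoolDemo class and its method name are hypothetical). String.replace returns a new String, so the shared pooled object that a and b refer to is untouched.

```java
// Hypothetical demo class for the shared-literal example.
final class PoolDemo {

    // Returns { a, b, changed } after calling replace on a.
    static String[] replaceLeavesOriginal() {
        String a = "Naresh";
        String b = "Naresh";                 // same pooled object as a
        String changed = a.replace("a", ""); // a new String; a itself is not modified
        return new String[] { a, b, changed };
    }

    public static void main(String[] args) {
        for (String s : replaceLeavesOriginal()) {
            System.out.println(s);
        }
    }
}
```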
2. Thread Safety
An object is called thread-safe when multiple threads can operate on it without any of them being able to corrupt its state, so that the object holds the same state for every thread at any point in time.
An immutable object cannot be modified by anyone after its creation, which makes every immutable object thread-safe by default. We do not need to apply any thread safety measures to it, such as creating synchronized methods.
So, due to its immutable nature, a string object can be shared by multiple threads, and even if many threads are using it, its value will not change.
3. Security
In every application, we need to pass around several secrets, e.g. user names and passwords or connection URLs, and in general all of this information is passed as string objects.
If String were not immutable, this would pose a serious security threat to the application, because these values could be changed, either by wrongly written code or by anyone else who has access to our variable references.
4. Class Loading
As discussed in Creating objects through Reflection in Java with Example, we can use the Class.forName("class_name") method to load a class into memory, which in turn calls other methods to do so. Even the JVM uses these methods to load classes.
If you look closely, all of these methods accept the class name as a string object, so Strings are used in Java class loading, and immutability provides the assurance that the correct class is loaded by the ClassLoader.
Suppose String were not immutable and we tried to load java.lang.Object, but the name was changed to org.theft.OurObject in between: now all of our objects would have behavior that someone could use to do unwanted things.
5. HashCode Caching
If we are going to perform any hashing-related operations on an object, we must override the hashCode() method and generate the hashcode from the state of the object. If an object's state changes, its hashcode should change as well.
Because String is immutable, the value a string object holds will never change, which means its hashcode will never change either. This gives the String class the opportunity to cache its hashcode.
The String object computes its hashcode once (in OpenJDK, lazily on the first call to hashCode() rather than at object creation) and then reuses it, which makes it a great candidate for hashing-related operations because the hashcode does not need to be calculated again. This is one reason String is so often used for HashMap keys.
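A minimal sketch of hash caching in an immutable class, using a hypothetical Label class. Computing the hash lazily and treating 0 as "not computed yet" are assumptions of this sketch, chosen to resemble the general idea rather than to reproduce any library's exact code.

```java
import java.util.Objects;

// Hypothetical immutable class that caches its hash code.
final class Label {
    private final String text;
    private int hash; // 0 means "not computed yet"; safe because text never changes

    Label(String text) {
        this.text = Objects.requireNonNull(text, "text");
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = 31 * 17 + text.hashCode(); // derived only from immutable state
            hash = h;                      // cache for later calls
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Label && ((Label) other).text.equals(text);
    }
}
```

If text could change after construction, the cached value would go stale and any HashMap holding the object as a key would stop finding it.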
Read More on Why String is Immutable and Final in Java.
Most important reason according to this article on DZone:
String Constant Pool
...
If string is mutable, changing the string with one reference will lead to the wrong value for the other references.
Security
String is widely used as parameter for many java classes, e.g. network connection, opening files, etc. Were String not immutable, a connection or file would be changed and lead to serious security threat.
...
Hope it will help you.
IMHO, this is the most important reason:
String is immutable in Java because String objects are cached in the
String pool. Since cached String literals are shared between multiple
clients, there is always a risk that one client's action would affect
all other clients.
Ref: Why String is Immutable or Final in Java
You are right. String in Java uses the concept of a string literal pool. When a string literal is created and an equal string already exists in the pool, a reference to the existing string is returned instead of creating a new object and returning its reference. If strings were not immutable, changing the string through one reference would lead to the wrong value for the other references.
I would add one more thing: since String is immutable, it is safe for multithreading, and a single String instance can be shared across different threads. This avoids the need for synchronization; Strings are implicitly thread safe.
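The pooling behaviour described above can be observed with reference comparison. This is a small sketch; the class name and the sameObject helper are made up for illustration:

```java
public class StringPoolDemo {
    // True only when both references point at the very same object.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = "hello";              // same literal, so the same pooled object
        String c = new String("hello");  // explicitly creates a separate object

        System.out.println(sameObject(a, b));           // true
        System.out.println(sameObject(a, c));           // false
        System.out.println(a.equals(c));                // true: same contents
        System.out.println(sameObject(a, c.intern()));  // true: intern() returns the pooled instance
    }
}
```

Because a and b are the same object, any in-place change made through a would also be visible through b. Immutability is what makes this sharing safe.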
String was made immutable by Sun Microsystems because strings are used as keys in map collections.
StringBuffer is mutable and does not override equals() and hashCode(). That is the reason it is not suitable as a key in a map.
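Strictly speaking, a StringBuffer or StringBuilder can be put into a map as a key, but the lookup does not behave the way a content-based key would. A minimal sketch (the class and method names are invented for this example):

```java
import java.util.HashMap;
import java.util.Map;

public class BuilderKeyDemo {
    static Integer lookupWithBuilderKey() {
        Map<StringBuilder, Integer> map = new HashMap<>();
        map.put(new StringBuilder("id"), 1);
        // A second builder with the same characters is a different key,
        // because StringBuilder inherits equals() and hashCode() from Object.
        return map.get(new StringBuilder("id"));
    }

    static Integer lookupWithStringKey() {
        Map<String, Integer> map = new HashMap<>();
        map.put("id", 1);
        // String compares by content, so an equal String finds the entry.
        return map.get(new String("id"));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithBuilderKey()); // prints null
        System.out.println(lookupWithStringKey());  // prints 1
    }
}
```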
The most important reason for making String immutable in Java is security. Next would be caching.
I believe the other reasons given here, such as efficiency, concurrency, design and the string pool, follow from the fact that String was made immutable. For example, the string pool could be created because String was immutable, and not the other way around.
Check Gosling interview transcript here
From a strategic point of view, they tend to more often be trouble free. And there are usually things you can do with immutables that you can't do with mutable things, such as cache the result. If you pass a string to a file open method, or if you pass a string to a constructor for a label in a user interface, in some APIs (like in lots of the Windows APIs) you pass in an array of characters. The receiver of that object really has to copy it, because they don't know anything about the storage lifetime of it. And they don't know what's happening to the object, whether it is being changed under their feet.
You end up getting almost forced to replicate the object because you don't know whether or not you get to own it. And one of the nice things about immutable objects is that the answer is, "Yeah, of course you do." Because the question of ownership, who has the right to change it, doesn't exist.
One of the things that forced Strings to be immutable was security. You have a file open method. You pass a String to it. And then it's doing all kind of authentication checks before it gets around to doing the OS call. If you manage to do something that effectively mutated the String, after the security check and before the OS call, then boom, you're in. But Strings are immutable, so that kind of attack doesn't work. That precise example is what really demanded that Strings be immutable.
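The check-then-use problem from the interview can be simulated in a single thread. This is only a sketch: the policy in isAllowed, the open stand-in and the paths are all invented for the example, and a real attack would mutate the value from another thread between the check and the OS call.

```java
public class CheckThenUseDemo {
    // Hypothetical policy: only paths under /home/user/ are allowed.
    static boolean isAllowed(CharSequence path) {
        return path.toString().startsWith("/home/user/");
    }

    // Stand-in for the OS call; it only reports which path it would open.
    static String open(CharSequence path) {
        return "opening " + path;
    }

    static String withMutablePath() {
        StringBuilder path = new StringBuilder("/home/user/notes.txt");
        if (!isAllowed(path)) {
            return "denied";
        }
        // After the check, the same object is changed in place.
        path.setLength(0);
        path.append("/etc/passwd");
        return open(path);   // uses a value that was never checked
    }

    static String withImmutablePath() {
        String path = "/home/user/notes.txt";
        if (!isAllowed(path)) {
            return "denied";
        }
        // "Changing" a String only produces a new object; path is untouched.
        String other = path.replace("/home/user/notes.txt", "/etc/passwd");
        return open(path);   // still the checked value
    }

    public static void main(String[] args) {
        System.out.println(withMutablePath());    // prints opening /etc/passwd
        System.out.println(withImmutablePath());  // prints opening /home/user/notes.txt
    }
}
```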
The String class is final, which means you can't create a subclass that inherits from it, changes its basic structure and makes the String mutable.
In addition, the instance variables and methods that the String class provides are such that you can't change a String object once it has been created.
The reason you have added doesn't make the String immutable at all. It only says how the String is stored in the heap. Also, the string pool makes a huge difference in performance.
In addition to the great answers, I wanted to add a few points. Like a String variable, an array variable holds a reference to the array, so if you create two arrays arr1 and arr2 and do something like arr2 = arr1, arr2 will refer to the same array as arr1. Changing a value through one of them therefore changes it for the other one too, for example:
public class Main {
    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = a;   // b refers to the same array as a
        a[0] = 8;
        b[1] = 7;
        System.out.println("A: " + a[0] + ", B: " + b[0]);
        System.out.println("A: " + a[1] + ", B: " + b[1]);
        // outputs
        // A: 8, B: 8
        // A: 7, B: 7
    }
}
Not only would this cause bugs in the code, it also can (and will) be exploited by a malicious user. Suppose you have a system that changes the admin password. The user first enters newPassword and then oldPassword; if oldPassword is the same as adminPass, the program changes the password with adminPass = newPassword. Now suppose strings were mutable and the new password shared its reference with the admin password. A careless programmer might store the admin password in a temp variable before the user enters any data, so that if oldPassword equals temp the password is changed, and otherwise adminPass = temp restores it. Because temp and adminPass would refer to the same object, someone who knows this could enter the new password, never enter the old one, and abracadabra, he has admin access.
Another thing I didn't understand when learning about Strings: why doesn't the JVM create a new string for every object, each with its own place in memory? You can do exactly that with new String("str"). The reason you wouldn't want to always use new is that it is not memory efficient and it is slower in most cases.
If HELLO is your String, then you can't change HELLO to HILLO. This property is called immutability.
You can have multiple String variables pointing to the same HELLO String.
But if HELLO is a char array, then you can change HELLO to HILLO. E.g.,
char[] charArr = "HELLO".toCharArray();
charArr[1] = 'I'; // you can do this; charArr now holds HILLO
Answer:
Programming languages have immutable data types so that their values can be used as keys in key-value pairs. Strings are used as keys and indices, so they are immutable.
This probably has little to do with security because, on the contrary, security practices recommend using character arrays for passwords, not strings. This is because an array can be erased immediately when it is no longer needed. A string, by contrast, cannot be erased, because it is immutable. It may take a long time before it is garbage collected, and even longer before its content gets overwritten.
I think that immutability was chosen to allow sharing strings and their fragments easily. String assignment and picking a substring become constant time operations, and string comparison is often cheap too, because the reusable hash codes that are part of the string data structure can be compared first, as hash-based collections do.
On the other hand, if the original string is huge (say, a large XML document), picking a few characters from it could prevent the whole document from being garbage collected. Because of that, later Java versions moved away from this sharing of fragments (since Java 7 update 6, substring copies the characters it needs), although String itself is still immutable. Modern C++ has both a mutable string (std::string) and, from C++17, a read-only, non-owning view (std::string_view).
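The recommendation about character arrays can be sketched as follows. The class name, the checkAndWipe helper and the sample password are invented for this example; real code would read the password from a console or UI component instead of a literal.

```java
import java.util.Arrays;

public class PasswordWipeDemo {
    // Compares the entered password with the expected one, then erases the entered copy.
    static boolean checkAndWipe(char[] entered, char[] expected) {
        try {
            return Arrays.equals(entered, expected);
        } finally {
            // Overwrite the secret so it does not linger in memory until GC.
            Arrays.fill(entered, '\0');
        }
    }

    public static void main(String[] args) {
        // A literal is used here only to keep the demo self-contained.
        char[] entered = {'s', '3', 'c', 'r', '3', 't'};
        boolean ok = checkAndWipe(entered, "s3cr3t".toCharArray());
        System.out.println(ok);                      // prints true
        System.out.println(entered[0] == '\0');      // prints true: the array was wiped
    }
}
```

Nothing comparable is possible with a String, since its characters cannot be overwritten through the public API.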
From the Security point of view we can use this practical example:
DBCursor makeConnection(String IP, String PORT, String USER, String PASS, String TABLE) {
    // if strings were mutable IP, PORT, USER, PASS could be changed by the validate function
    boolean validated = validate(IP, PORT, USER, PASS);
    // here we would not be sure whether IP, PORT, USER, PASS changed or not
    if (validated) {
        DBConnection conn = doConnection(IP, PORT, USER, PASS);
    }
    // rest of the code goes here ....
}

Is a new string created every time replaceAll() is used on a String?

So I do know that Java Strings are immutable.
There are a bunch of methods that replace characters in a string in Java.
So every time these methods are called, does that involve creating a brand new String, thereby increasing the space complexity, or is the replacement done in the original String itself? I'm a little confused about whether each of these replace statements in my code generates a new String each time, and thus consumes more memory.
You noted correctly that String objects in Java are immutable. The only case when replacement, substring, etc. methods do not create a new object is when the replacement is a no-op. For example, if you ask to replace all 'x' characters in a "Hello, world!" string, there would be no new String object created. Similarly, there would be no new object when you call str.substring(0), because the entire string is returned. In all other cases, when the return value differs from the original, a new object is created.
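The no-op case can be checked with reference comparison. This sketch relies on the documented behaviour of String.replace(char, char), which returns the same object when the old character does not occur; the class and helper names are made up for illustration.

```java
public class NoOpReplaceDemo {
    // True when replace() handed back the original object instead of a new one.
    static boolean replaceReturnsSameObject(String s, char from, char to) {
        return s.replace(from, to) == s;
    }

    public static void main(String[] args) {
        String s = "Hello, world!";
        System.out.println(replaceReturnsSameObject(s, 'x', 'y')); // true: there is no 'x' to replace
        System.out.println(replaceReturnsSameObject(s, 'o', '0')); // false: a new String was created
        System.out.println(s);                                     // prints Hello, world! (unchanged)
    }
}
```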
Yes. You have noted it correctly. The immutability of the String type has some consequences.
That is why the designers of Java introduced another type that should be used when you perform operations on character sequences.
A class called StringBuilder should be used when you perform a lot of operations that involve character manipulation, like replace. Of course it is more involved and requires more attention to detail, but that only matters when you care about performance.
So the immutability of the String type does not by itself increase memory usage. What increases it is wrong usage of the String type.
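As a rough illustration of the difference, the following sketch builds the same text twice. The class and method names are invented for this example; the point is only that the concatenation loop creates a new String on every iteration, while the builder appends into one buffer.

```java
public class BuilderLoopDemo {
    // Appends into a single growing buffer and creates one String at the end.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');
        }
        return sb.toString();
    }

    // Produces the same text, but each += creates a new String object.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(joinWithBuilder(3)); // prints 0,1,2,
        System.out.println(joinWithConcat(3));  // prints 0,1,2,
    }
}
```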
They generate new ones each time; that is the corollary to being immutable.
It's true, in some sense, that it increases the 'space complexity', in that it uses more memory than the most efficient possible algorithms for replacement, but it's not as bad as it sounds; the transient objects created during the replaceAll operation and others like it are garbage collected very quickly; Java is very efficient at garbage collecting transient objects. See http://www.infoq.com/articles/Java_Garbage_Collection_Distilled for an interesting writeup on some garbage collection basics.
It is true that it will return a new String, but unless the call is part of some giant loop or recursive function, one need not worry too much.
But if you purposefully wanted to crash your system, I'm sure you can think up some way.
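The JDK lacks some mutating operations for character sequences: StringBuilder offers replace(int start, int end, String str), which works on an index range, but for some reason it has no counterpart to String's replace or replaceAll that searches by content or regular expression.
A possible option would be to use third party libraries, e.g. a MutableString. It is available in Maven Central.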
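Without a third-party library, a content-based replacement can be built from the two StringBuilder methods that do exist, indexOf(String, int) and replace(int, int, String). The replaceAll helper below is a made-up sketch for literal (non-regex) targets, not a JDK method:

```java
public class BuilderReplaceDemo {
    // Replaces every occurrence of target in sb, modifying sb in place.
    static void replaceAll(StringBuilder sb, String target, String replacement) {
        if (target.isEmpty()) {
            return; // an empty target would match everywhere and never terminate
        }
        int from = 0;
        int idx;
        while ((idx = sb.indexOf(target, from)) != -1) {
            sb.replace(idx, idx + target.length(), replacement);
            // Continue after the inserted text so it is not matched again.
            from = idx + replacement.length();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a-b-c");
        replaceAll(sb, "-", "--");
        System.out.println(sb); // prints a--b--c
    }
}
```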
