Regarding Java String Constant Pool - java

This is regarding the Java String Constant Pool. In one of my programs I am decrypting the password for the database and storing it in a String. I heard that Java Strings are stored in a constant pool and won't be destroyed until the VM restarts or the ClassLoader that loaded the String quits.
If that is the case, my passwords will be stored in the String pool. I am very concerned about this issue. Is there any way to destroy these literals, or anything else I can do?
Please suggest on this,
Regards,
Sunny.

There are several different issues at play here. First, the term "constant pool" refers to a very specific part of class files for string and numerical literals, or to the data structures generated from this part of class files that reside in the JVM. Passwords won't be stored here unless they're part of class files.
However, some String objects are indeed stored and shared throughout the program through String interning. Any string literal is automatically interned, as are any strings that you invoke the intern() method on. To the best of my knowledge, though, no other strings are stored this way, so unless you explicitly intern the strings holding passwords yourself I don't think you need to worry about this.
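A minimal sketch of the distinction described above, assuming a standard JVM: a literal is interned automatically, a String built at run time is an ordinary heap object, and intern() returns the pooled instance. The class and method names are made up for illustration.

```java
class InternDemo {
    // True only when both references point at the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "secret";                          // literal: interned automatically
        String copy = new String("secret".toCharArray());   // built at run time: not in the pool

        System.out.println(sameObject(literal, "secret"));      // same pooled object
        System.out.println(sameObject(literal, copy));          // equal contents, different objects
        System.out.println(sameObject(literal, copy.intern())); // intern() hands back the pooled object
    }
}
```

A decrypted password behaves like copy here: it only ends up in the pool if some code calls intern() on it.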
One other issue to be aware of is that if you don't want the passwords residing in memory, you may need to be careful about garbage collection, since a String that is no longer referenced could still be in memory. Similarly, if you use certain string methods like substring that (in older JDKs) share backing representations between strings, you may keep around the full password string after you're done using it.
If what you're worried about is other Java code being able to see old passwords that have been interned or that still live in memory, though, you mostly don't need to worry. There is no public API to iterate over or look at the elements of the interned string pool, and a String's backing array is not reachable through its public API (only through reflection, where that is permitted).

This only applies to String literals and Strings that you have called the intern() method on. Think about it: if it applied to all strings then you would quickly run out of memory in e.g. a servlet application that handles request parameters with different (String) values.

Dude, use a StringBuffer for getting the password and doing your (presumably) string-type operations on.
StringBuffer stores its characters in a char[], which can be overwritten and cleared.
Say sb.delete(0, sb.length()); where sb is your StringBuffer. Note that delete only resets the length, so overwrite the characters first (for example with setCharAt) if you want them gone from the array.
Or just get the db password as a char[] and work directly on that, but I think the StringBuffer way (assuming you've written all your code using String methods) will be an easier leap.

Related

Using char array instead of String

We have a security recommendation to use a char array instead of a String when storing a password, and to clear the char array afterwards. The problem is that some of the jars accept only a String as an argument.
For example, org.apache.http.auth.UsernamePasswordCredentials needs two String arguments: one for the username and one for the password. How do I call this constructor without creating a String for the password?
httpClient.getCredentialsProvider().setCredentials(
new AuthScope(AuthScope.ANY_HOST, AuthScope.ANY_PORT),
new UsernamePasswordCredentials(user.getUsername(), new String(user
.getPassword())));
How do I resolve this? Is there any way I can store the password safely? I understand that String is immutable and that storing passwords in a String is not recommended. So what alternative do I have?
The reason for the security recommendation to store the password as a character array is that, unlike arrays, Strings are immutable. Once you've created the String, it stays in memory, even after you drop the reference, until the garbage collector reclaims it. This means that another process can dump memory before the GC runs and potentially get your password. With an array, on the other hand, you can specifically overwrite the contents so that nothing reading memory afterwards will find them there.
With an array, you can explicitly wipe the data after you're done with it. You can overwrite the array with anything you like, and the password won't be present anywhere in the system, even before garbage collection.
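A small sketch of that wipe, assuming the password is already held in a char[]. The class name, the wipe helper and the sample password are illustrative only, and any copies made elsewhere (for example a String created from the array) are not affected.

```java
import java.util.Arrays;

class PasswordWipeDemo {
    // Overwrite every element so the secret no longer sits in this array.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'h', 'u', 'n', 't', 'e', 'r', '2'}; // stands in for a decrypted password
        try {
            // ... use the password here, e.g. pass it to an API that accepts char[] ...
        } finally {
            wipe(password); // runs even if the code above throws
        }
        System.out.println(Arrays.toString(password)); // every element is now '\0'
    }
}
```

Putting the wipe in a finally block keeps the window in which the secret is readable as short as the code path allows.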
I had a look at org.apache.http.auth.UsernamePasswordCredentials, and it only supports String. So I would store the password as a char array as per your security recommendation and convert it to a String only when calling this class. Then, if you are that paranoid, drop all references to the credentials object once you're done with it and immediately request a GC (this may affect performance, and the GC is not guaranteed to run right away).
Also, if security is such a serious concern, your administrators should look at other measures, such as disabling core dumps.

Will the String passed from outside the java application be saved in String pool?

I read many answers but none of them really answers my question exactly.
If I've a java service running on some port and a client connects to it and calls a method like:
String data = getServiceData("clientKey");
Now my question is: will this key (clientKey) be stored in the String literal pool on the service side? Generally, the literals to be stored in constant pools are determined at compile time, but what happens to strings that are passed in from outside the JVM, or read from a file?
String Object is serialized at your client side and deserialized and is kept in the Heap memory. If you want it to be stored in your String Pool memory, you should use the intern() method.
String value;
String data = (value =getServiceData("clientKey"))==null?null:value.intern();
Most methods which read strings from external sources (especially BufferedReader.readLine() or Java serialisation) will not intern the strings, so the answer is no.
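This can be checked with a short experiment. The sketch below uses a StringReader as a stand-in for a socket or file; the class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ExternalStringDemo {
    // Reads one line from the text, the way a service might read a key from a socket or file.
    static String readFirstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String key = readFirstLine("clientKey\n");
        System.out.println(key.equals("clientKey"));     // equal contents
        System.out.println(key == "clientKey");          // but not the pooled literal
        System.out.println(key.intern() == "clientKey"); // pooled only after intern()
    }
}
```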
However, if you use third-party libraries, they might do it: for example, some XML/DOM parsers are known to do that (at least for element names, less often for values). Some high-performance frameworks (servlet containers) also do that for certain strings (for example HTTP header names).
But generally it is used very seldom in good(!) implementations, as it is much less desirable than one might think. Don't forget: before you can intern a string, it must exist as an object, which needs to be collected anyway, so from the point of view of avoiding garbage, using intern() does not help. It might only reduce the working-set memory if those strings survive a long time (which they do not in OLTP) and might speed up equality checks slightly. But typically this only helps if you do thousands of them on the same string object.
You can check yourself if the string is already interned (you should of course not do it in production code as it interns your string and it might not work in all implementations) with:
input == input.intern() ? "yes" : "no"
And yes (as asked in a comment), you can end up with a million instances of the same API key this way. But don't be fooled into thinking this is a bad thing. Interning them would require searching for the value and dealing with a growing string pool. This can take longer than processing (and freeing) the string, especially when the JVM can optimize the string allocation with generational allocation and escape analysis.
BTW: Java 8u20 has a feature (-XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics) to detect duplicate strings in the background while doing the garbage collection in G1. It will combine those string arrays to reduce the memory consumption. (JEP192)

Why is String immutable in Java?

I was asked in an interview why String is immutable
I answered like this:
When we create a string in java like String s1="hello"; then an
object will be created in string pool(hello) and s1 will be
pointing to hello.Now if again we do String s2="hello"; then
another object will not be created but s2 will point to hello
because JVM will first check if the same object is present in
string pool or not.If not present then only a new one is created else not.
Now suppose Java allowed strings to be mutable: if we changed s1 to hello world, then the value of s2 would also be hello world. That is why Java Strings are immutable.
Can any body please tell me if my answer is right or wrong?
String is immutable for several reasons, here is a summary:
Security: parameters are typically represented as String in network connections, database connection urls, usernames/passwords etc. If it were mutable, these parameters could be easily changed.
Synchronization and concurrency: making String immutable automatically makes them thread safe thereby solving the synchronization issues.
Caching: when the compiler optimizes your String objects, it sees that two objects have the same value (a="test" and b="test"), so only one String object is needed (a and b both point to the same object).
Class loading: String is used as arguments for class loading. If mutable, it could result in wrong class being loaded (because mutable objects change their state).
That being said, immutability of String only means you cannot change it using its public API. You can in fact bypass the normal API using reflection. See the answer here.
In your example, if String was mutable, then consider the following example:
String a="stack";
System.out.println(a);//prints stack
a.setValue("overflow");
System.out.println(a);//if mutable it would print overflow
The Java designers decided to make Strings immutable for reasons of design, efficiency, and security.
Design
String literals are kept in a special memory area of the Java heap known as the "String intern pool". When you create a String from a literal, the JVM searches the pool to check whether it already exists. (This does not apply to the String() constructors or to String methods that use them internally: a constructor always creates a new object outside the pool, and only a call to intern() returns the pooled instance.)
If it exists, a reference to the existing String object is returned.
If String were not immutable, changing the String through one reference would lead to a wrong value for the other references.
According to this article on DZone:
Security
String is widely used as parameter for many java classes, e.g. network connection, opening files, etc. Were String not immutable, a connection or file would be changed and lead to serious security threat.
Mutable strings could cause security problem in Reflection too, as the parameters are strings.
Efficiency
The hashcode of a string is frequently used in Java, for example in a HashMap. Being immutable guarantees that the hashcode will always be the same, so it can be cached without worrying about changes. That means there is no need to calculate the hashcode every time it is used.
We cannot be sure what the Java designers were actually thinking when they designed String; we can only infer these reasons from the advantages that string immutability gives us, some of which are:
1. Existence of String Constant Pool
As discussed in the article Why String is Stored in String Constant Pool, every application creates a great many string objects. To save the JVM from first creating lots of string objects and then garbage collecting them, the JVM stores string literals in a separate memory area called the String constant pool (SCP) and reuses objects from that cached pool.
Whenever we create a string literal, the JVM first checks whether that literal is already present in the constant pool; if it is, the new reference will point to the same object in the SCP.
String a = "Naresh";
String b = "Naresh";
String c = "Naresh";
In the above example, the string object with value Naresh is created in the SCP only once, and all the references a, b, c point to the same object. But what if we try to make a change through a, e.g. a.replace("a", "")?
Ideally, a should then have the value Nresh while b and c remain unchanged, because as an end user we are making the change to a only. But we know a, b, c all point to the same object, so if we changed that object through a, the others would reflect the change too.
String immutability saves us from this scenario: because String objects are immutable, the string object Naresh will never change. When we make any change through a, the JVM does not change the string object Naresh; it creates a new object with the change applied, which we can then assign to a.
So the String pool is only possible because of String's immutability. If String were not immutable, caching string objects and reusing them would not be possible, because any variable could have changed the value and corrupted the others.
And that's why String is handled very specially by the JVM and has been given a special memory area.
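The Naresh scenario can be tried directly. This is just a sketch with a made-up class name and helper; it shows that replace() hands back a new String and leaves the shared literal alone.

```java
class SharedLiteralDemo {
    // replace() never modifies its receiver; it returns a new String with the change applied.
    static String withoutA(String s) {
        return s.replace("a", "");
    }

    public static void main(String[] args) {
        String a = "Naresh";
        String b = "Naresh";
        System.out.println(a == b); // true: both names refer to the one pooled object

        a = withoutA(a);            // a now refers to a different, new String
        System.out.println(a);      // Nresh
        System.out.println(b);      // Naresh
    }
}
```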
2. Thread Safety
An object is called thread-safe when multiple threads can operate on it but none of them is able to corrupt its state, so the object holds the same state for every thread at any point in time.
An immutable object cannot be modified by anyone after its creation, which makes every immutable object thread-safe by default. We do not need to apply any thread-safety measures to it, such as creating synchronized methods.
So, due to its immutable nature, a string object can be shared by multiple threads, and even if many threads use it at once, its value will not change.
3. Security
In every application we need to pass around several secrets, e.g. usernames and passwords or connection URLs, and in general all of this information is passed as string objects.
If String were not immutable, this would be a serious security threat to the application, because these values could then be changed, whether by badly written code or by anyone else who has access to our variable references.
4. Class Loading
As discussed in Creating objects through Reflection in Java with Example, we can use the Class.forName("class_name") method to load a class into memory, which in turn calls other methods to do so. Even the JVM uses these methods to load classes.
If you look closely, all of these methods accept the class name as a string object. So Strings are used in Java class loading, and immutability ensures that the correct class is loaded by the ClassLoader.
Suppose String were not immutable and we tried to load java.lang.Object, but the name got changed to org.theft.OurObject along the way; now all of our objects would have behavior that someone could use for unwanted things.
5. HashCode Caching
If we are going to perform any hashing-related operations on an object, we must override the hashCode() method and generate the hashcode from the state of the object. If an object's state changes, its hashcode should change too.
Because String is immutable, the value a string object holds will never change, which means its hashcode will not change either. This gives the String class the opportunity to cache its hashcode.
The String object computes its hashcode the first time hashCode() is called and caches it, which makes it a great candidate for hashing-related operations because the hashcode doesn't need to be calculated again, saving us some time. This is why String is so often used for HashMap keys.
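The link between immutability and hashing is easiest to see with a key that can change. The sketch below is an illustration with made-up names, using an ArrayList as the mutable key: once the key is mutated, its hashcode changes and the HashMap can no longer find the entry, something that cannot happen with a String key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Stores a value under a mutable key, mutates the key, then looks it up again.
    static String lookupAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value"); // filed under the hash of ["a"]
        key.add("b");          // the key's hashCode() is different now
        return map.get(key);   // searches with the new hash and misses the entry
    }

    // The same steps with a String key: "changing" it only creates another String.
    static String lookupWithStringKey() {
        Map<String, String> map = new HashMap<>();
        String key = "a";
        map.put(key, "value");
        String longer = key + "b"; // a new object; key itself still holds "a"
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
        System.out.println(lookupWithStringKey()); // value
    }
}
```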
Read More on Why String is Immutable and Final in Java.
Most important reason according to this article on DZone:
String Constant Pool
...
If string is mutable, changing the string with one reference will lead to the wrong value for the other references.
Security
String is widely used as parameter for many java classes, e.g. network connection, opening files, etc. Were String not immutable, a connection or file would be changed and lead to serious security threat.
...
Hope it will help you.
IMHO, this is the most important reason:
String is Immutable in Java because String objects are cached in
String pool. Since cached String literals are shared between multiple
clients there is always a risk, where one client's action would affect
all another client.
Ref: Why String is Immutable or Final in Java
You are right. String in Java uses the concept of the String literal pool. When a string is created and the string already exists in the pool, the reference to the existing string is returned instead of creating a new object and returning its reference. If strings were not immutable, changing the string through one reference would lead to a wrong value for the other references.
I would add one more thing: since String is immutable, it is safe for multithreading, and a single String instance can be shared across different threads. This avoids the need for synchronization; Strings are implicitly thread-safe.
String was made immutable by Sun Microsystems because strings are used as keys in map collections.
StringBuffer is mutable, which is why it is not suitable as a key in a map.
The most important reason for String being made immutable in Java is the security consideration. Next would be caching.
I believe the other reasons given here, such as efficiency, concurrency, design and the string pool, follow from the fact that String is immutable. For example, the String pool could be created because String was immutable, not the other way around.
Check Gosling interview transcript here
From a strategic point of view, they tend to more often be trouble free. And there are usually things you can do with immutables that you can't do with mutable things, such as cache the result. If you pass a string to a file open method, or if you pass a string to a constructor for a label in a user interface, in some APIs (like in lots of the Windows APIs) you pass in an array of characters. The receiver of that object really has to copy it, because they don't know anything about the storage lifetime of it. And they don't know what's happening to the object, whether it is being changed under their feet.
You end up getting almost forced to replicate the object because you don't know whether or not you get to own it. And one of the nice things about immutable objects is that the answer is, "Yeah, of course you do." Because the question of ownership, who has the right to change it, doesn't exist.
One of the things that forced Strings to be immutable was security. You have a file open method. You pass a String to it. And then it's doing all kind of authentication checks before it gets around to doing the OS call. If you manage to do something that effectively mutated the String, after the security check and before the OS call, then boom, you're in. But Strings are immutable, so that kind of attack doesn't work. That precise example is what really demanded that Strings be immutable
The String class is final, which means you can't create a subclass that inherits from it, changes its basic structure and makes the String mutable.
In addition, the instance variables and methods that the String class provides are such that you can't change a String object once it is created.
The reason you have given doesn't make String immutable at all. It only describes how a String is stored in the heap. The string pool does make a huge difference in performance, though.
In addition to the great answers, I wanted to add a few points. Like a String variable, an array variable holds a reference, so if you create two arrays arr1 and arr2 and do something like arr2 = arr1, arr2 will refer to the same array as arr1. Changing a value through one of them therefore changes what the other one sees, for example:
public class Main {
public static void main(String[] args) {
int[] a = {1, 2, 3, 4};
int[] b = a;
a[0] = 8;
b[1] = 7;
System.out.println("A: " + a[0] + ", B: " + b[0]);
System.out.println("A: " + a[1] + ", B: " + b[1]);
//outputs
//A: 8, B: 8
//A: 7, B: 7
}
}
Not only would that cause bugs in the code, it could also be exploited by a malicious user. Suppose you have a system that changes the admin password. The user first has to enter the newPassword and then the oldPassword; if the oldPassword is the same as adminPass, the program changes the password with adminPass = newPassword. Now say the new password has the same reference as the admin password, so a bad programmer creates a temp variable to hold the admin password before the user inputs data: if the oldPassword is equal to temp the password is changed, otherwise adminPass = temp. Someone who knows this could enter the new password, never enter the old password, and abracadabra, he has admin access.
Another thing I didn't understand when learning about Strings is why the JVM doesn't create a new string for every object, each with its own unique place in memory. You can do exactly that with new String("str"); the reason you wouldn't want to always use new is that it is not memory efficient and is slower in most cases.
If HELLO is your String, then you can't change HELLO to HILLO. This property is called immutability.
You can have multiple String variables pointing to the same HELLO String.
But if HELLO is a char array, then you can change HELLO to HILLO. E.g.,
char[] charArr = "HELLO".toCharArray();
charArr[1] = 'I'; // you can do this
Answer:
Programming languages have immutable data types so that they can be used as keys in key/value pairs. String variables are used as keys/indices, so they are immutable.
This probably has little to do with security because, on the contrary, security practices recommend using character arrays for passwords, not strings. This is because an array can be erased immediately when it is no longer needed. A string, by contrast, cannot be erased, because it is immutable. It may take a long time before it is garbage collected, and even longer before the content gets overwritten.
I think that immutability was chosen to allow sharing strings and their fragments easily. String assignment and picking a substring become constant-time operations, and string comparison largely does too, because of the reusable hash codes that are part of the string data structure and can be compared first.
On the other hand, if the original string is huge (say, a large XML document), picking a few symbols from it may prevent the whole document from being garbage collected. Because of that, later Java versions (from 7u6 on) moved away from this sharing: substring now copies the characters. Modern C++ has both a mutable string (std::string) and, from C++17, a read-only view (std::string_view).
From the Security point of view we can use this practical example:
DBCursor makeConnection(String IP,String PORT,String USER,String PASS,String TABLE) {
// if strings were mutable IP,PORT,USER,PASS can be changed by validate function
Boolean validated = validate(IP,PORT,USER,PASS);
// here we are not sure if IP, PORT, USER, PASS changed or not ??
if (validated) {
DBConnection conn = doConnection(IP,PORT,USER,PASS);
}
// rest of the code goes here ....
}
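The concern in the snippet above (and in the file-open example from the Gosling interview) is a check-then-use race. The sketch below makes it concrete with a StringBuilder standing in for a hypothetical mutable string; the host names, the validate rule and the method names are all invented for illustration.

```java
class CheckThenUseDemo {
    // Hypothetical check: only hosts inside the internal domain are allowed.
    static boolean validate(CharSequence host) {
        return host.toString().endsWith(".internal.example");
    }

    // Unsafe: keeps reading the caller's mutable buffer after the check.
    static String connectUnsafe(StringBuilder host, Runnable betweenCheckAndUse) {
        if (!validate(host)) return "rejected";
        betweenCheckAndUse.run();       // stands in for another thread or a callback
        return "connected to " + host;  // reads the buffer again, possibly changed
    }

    // Safe: take an immutable String snapshot first, then check and use the snapshot.
    static String connectSafe(StringBuilder host, Runnable betweenCheckAndUse) {
        String snapshot = host.toString();
        if (!validate(snapshot)) return "rejected";
        betweenCheckAndUse.run();
        return "connected to " + snapshot;
    }

    public static void main(String[] args) {
        StringBuilder h1 = new StringBuilder("db.internal.example");
        System.out.println(connectUnsafe(h1, () -> h1.replace(0, h1.length(), "evil.example")));

        StringBuilder h2 = new StringBuilder("db.internal.example");
        System.out.println(connectSafe(h2, () -> h2.replace(0, h2.length(), "evil.example")));
    }
}
```

With String parameters the snapshot step is unnecessary, because the value that was validated cannot change before it is used.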

Safely using String for passwords by using reflection to scrub contents prior to garbage collection

Does using reflection to scrub a String make using String as safe as using char[] for passwords?
From a security aspect, it is generally considered best practice to use char[] for storing/passing passwords, because one can zero-out its contents as soon as possible in code, which may be significantly before garbage collection cleans it up and the memory is reused (wiping all trace), limiting the window of time for a memory attack.
However, char[] is not as convenient as String, so it would be handy if one could "scrub" a String if needed, thus making String as safe as char[].
Below is a method that uses reflection to zero-out the fields of String.
Is this method "OK", and does it achieve the goal of making String as safe as char[] for passwords?
public static void scrub(String str) throws NoSuchFieldException, IllegalAccessException {
Field valueField = String.class.getDeclaredField("value");
Field offsetField = String.class.getDeclaredField("offset");
Field countField = String.class.getDeclaredField("count");
Field hashField = String.class.getDeclaredField("hash");
valueField.setAccessible(true);
offsetField.setAccessible(true);
countField.setAccessible(true);
hashField.setAccessible(true);
char[] value = (char[]) valueField.get(str);
// overwrite only this string's slice of the array; Arrays.fill's toIndex is exclusive
int offset = offsetField.getInt(str);
Arrays.fill(value, offset, offset + countField.getInt(str), '\0');
countField.set(str, 0); // scrub password length too
hashField.set(str, 0); // the hash could be used to crack a password
valueField.setAccessible(false);
offsetField.setAccessible(false);
countField.setAccessible(false);
hashField.setAccessible(false);
}
Here's a simple test:
String str = "password";
scrub(str);
System.out.println('"' + str + '"');
Output:
""
Note: You may assume that passwords are not String constants and thus calling this method will have no adverse effect on interned Strings.
Also, I have left the method in a fairly "raw" state for simplicity's sake. If I were to use it, I would not declare the exceptions as thrown (I would try/catch and ignore them), and I would refactor the repeated code.
There are two potential safety concerns:
The String may share its backing array with other Strings; e.g. if the String was created by calling substring on a larger String. So when you zero the entire value array you could be overwriting the state of other strings ... that don't contain passwords.
The cure is to only zero the part of the backing array that is used by the password string.
The JLS (17.5.3) warns that the effects of using reflection to change final variables are undefined.
However, the context for this is the Java Memory Model, and the fact that the compiler is allowed to aggressively cache final variables. In this case:
you would expect the String to be thread-confined, and
you shouldn't be using any of those variables again.
I wouldn't expect either of these to be real problems ... modulo fixing the over-aggressive zeroing of value.
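As a sketch of "only zero the part of the backing array that is used", here is the range form of Arrays.fill applied to a plain char[] that stands in for a shared backing array. The buffer contents, class name and helper name are invented for illustration.

```java
import java.util.Arrays;

class RangeWipeDemo {
    // Zero only buf[offset .. offset + count), leaving the rest of a shared buffer alone.
    static void wipeRange(char[] buf, int offset, int count) {
        Arrays.fill(buf, offset, offset + count, '\0'); // toIndex is exclusive
    }

    public static void main(String[] args) {
        char[] shared = "user=bob;pw=abc;".toCharArray();
        wipeRange(shared, 12, 3); // "abc" starts at index 12
        // Show the wiped characters as '*' so they are visible when printed.
        System.out.println(new String(shared).replace('\0', '*')); // user=bob;pw=***;
    }
}
```

Note that the second index passed to Arrays.fill is an end position, not a length, so it has to be offset + count.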
But the real concern is Velociraptors. :-)
I'm puzzled that you would actually bother to zap passwords like this. When you think about it, what you are protecting against is the possibility that someone can read process memory ... or a core dump or swap file ... to retrieve passwords. But if someone can do that, your system security has to have already been compromised ... because those things most likely require root access (or equivalent). And if they have root access they can "debug" your program and catch the passwords before your application zaps them.
One argument I have against String is that it's just too easy to inadvertently make a copy. Using strings safely is possible in theory, but the whole library ecosystem is based on the assumption that it's perfectly OK to copy strings. In the end, considering all the restrictions, strings may not be as convenient for this use case as they generally are.

Flyweight : Strings already use String pool : Does it make sense to pool String objects for Flyweight?

Strings already use the Flyweight design pattern. Would it be beneficial for performance to pool common String objects, given that Strings are already pulled from the String pool?
Strings can come from many places, and by default only string literals are in the string pool. For example, when you call BufferedReader.readLine(), the string that it returns is not in the string pool.
Whether it makes sense to pool such strings, either using String.intern() or a canonicalizing map, depends on how much duplication you have, and how much memory you can spare to reduce that duplication.
For example, if you're reading an XML file, it might be very useful to canonicalize element names. If you're reading a file of address data, it might be useful to canonicalize zip codes and/or city names. However, in both cases I'd look at using a Map rather than calling intern(), because on Java 6 and earlier the latter consumes permgen memory (which is a scarcer resource than normal heap memory).
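A canonicalizing map of the kind mentioned above could look roughly like this. It is a minimal, single-threaded sketch that assumes Java 8 or later for putIfAbsent; the Canonicalizer class and its method name are made up.

```java
import java.util.HashMap;
import java.util.Map;

class Canonicalizer {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the first instance seen with these contents, so later duplicates can be collected.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        Canonicalizer zipCodes = new Canonicalizer();
        // Two equal strings built at run time, as if parsed from two address records.
        String first = zipCodes.canonical(new String("90210".toCharArray()));
        String second = zipCodes.canonical(new String("90210".toCharArray()));
        System.out.println(first == second); // true: both records now share one object
    }
}
```

Unlike intern(), the map lives on the ordinary heap and can simply be dropped once the file has been processed.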
Without any other info about your system, I would say that creating a specific purpose pool of Strings would fall in the premature optimization category. If your system is indeed very String operation heavy and profiling shows that String objects are the reason that major garbage collections occur, then I would recommend looking at StringBuilder as a replacement, as well as understanding in depth the best practices of working with Strings, instead of creating a cache for them.
