This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why is char[] preferred over string for passwords?
Reading the java documentation, i found this statement about Console class
First, it suppresses echoing, so the password is not visible on the user's screen. Second, readPassword returns a character array, not a String, so the password can be overwritten, removing it from memory as soon as it is no longer needed.
Why a character array can be overwritten and a String not?
Or maybe a character array can be overwritted in a more simple way?
A String could be kept in something called a String pool by the JVM to manage memory usage for Strings more efficiently. A side effect of this however, is that it may be kept in memory even after you overwrite the reference with a new String.
A character array however can be directly overwritten, and is therefore safer in this respect.
From the Sun Certified Java Programmer for Java 6 Study Guide:
the readPassword method doesn't return a string: it returns a character array. Here's the reason for this: Once you've got the password, you can verify it and then absolutely remove it from memory. If a string was returned, it could exist in a pool somewhere in memory and perhaps some nefarious hacker could find it.
Related
As far as I understand, Java Strings are just an array of characters, with the maximum length being an integer value.
If I understand this answer correctly, it is possible to cause an overflow with a String - albeit in "unusual circumstances".
Since Java Strings are based on char arrays and Java automatically checks array bounds, buffer overflows are only possible in unusual scenarios:
If you call native code via JNI
In the JVM itself (usually written in C++)
The interpreter or JIT compiler does not work correctly (Java bytecode mandated bounds checks)
Correct me if I'm wrong, but I believe this means that you can write outside the bounds of the array, without triggering the ArrayIndexOutOfBounds (or similar) exception.
I've encountered issues in C++ with buffer overflows, and I can find plenty of advice about other languages, but none specifically answering what would happen if you caused a buffer overflow with a String (or any other array type) in Java.
I know that Java Strings are bounds-checked, and can't be overflowed by native Java code alone (unless issues are present in the compiler or JVM, as per points 2 and 3 above), but the first point implies that it is technically possible to get a char[] into an... undesirable position.
Given this, I have two specific questions about the behaviour of such issues in Java, assuming the above is correct:
If a String can overflow, what happens when it does?
What would the implications of this behaviour be?
Thanks in advance.
To answer you first question, I had the luck of actually causing a error of such, and the execution just stopped throwing one of these errors:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
So that was my case, I don't know if that represents a security problem as buffer overflow in C and C++.
A String in Java is immutable, so once created there is no writing to the underlying array of char or array of byte (it depends on the Java version and contents of the String whether one or the other is used). Ok, using JNI could circumvent that, but with pure Java, it is impossible to leave the bounds of the array without causing an ArrayOutOfBoundsException or alike.
The only way to cause a kind of an overflow in the context of String handling would be to create a new String that is too long. Make sure that your JVM will have enough heap (around 36 GB), create a char array of Integer.MAX_VALUE - 1, populate that appropriately, call new String( byte [] ) with that array, and then execute
var string = string.concat( new String( array ) );
But the result is just an exception telling you that it was attempted to create a too large array.
I feel strings can replace character array in all the scenarios. Even considering the immutability characteristic of Strings, declaration of strings in appropriate scope and java's garbage collection feature should help us avoid any memory leaks. I want to know if there is any corner case where character array should be used instead of Strings in Java.
Character arrays have some slight advantage over plain strings when it comes to storing security sensitive data. There's a lot of resources on that, for example this question: Why is char[] preferred over String for passwords? (with an answer by Jon Skeet himself).
In general it boils down to two things:
You have very little influence on how long a String stays in memory. Because of that you might leak sensitive data through a memory dump.
Leaking sensitive data accidentally in application logs as clear text is much more likely with plain strings
More reading:
Why we read password from console in char array instead of String
https://www.codebyamir.com/blog/use-character-arrays-to-store-sensitive-data-java
https://www.geeksforgeeks.org/use-char-array-string-storing-passwords-java/amp/
https://www.baeldung.com/java-storing-passwords
https://javarevisited.blogspot.com/2012/03/why-character-array-is-better-than.html
https://javainsider.wordpress.com/2012/12/10/character-array-is-better-than-string-for-storing-password-in-java/amp/
String is a class, not a build in type. It most likely does what it does by using a char array underneath, but there is no guarantee. "We dont care how it is implemented". It has methods that make sense for strings, like comparing strings. Comparing arrays?? Hmm. Doesn't really make sense to do it. You could check if they are equal sure, but less or greater than...
Back in point. One scenario is you want to operate with chars, not a string. For example you have letters of the alphabet and want to sort them. Or grades in A-F system and you want to sort them. Generally where it makes sense having chars that are not connected to have some meaning together (like in a message string, or a text message). You would not generally need to sort the chars of a text message now, would you? So, you use an array.
To sort, you can take advantage of the Arrays.sort() method for example, while i dont think there is a method that does it for strings. Perhaps 3rd part libraries.
On another note(unrelated to question) , you can use StringBuilder to if you want to modify strings often. Its better at performace.
You don't have to look much further than at methods in the JDK core API that use char[].
Such as this one (java.io.Reader):
public int read(char[] cbuf)
throws IOException
Reads characters into an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached.
Parameters:
cbuf - Destination buffer
Returns:
The number of characters read, or -1 if the end of the stream has been reached
Throws:
IOException - If an I/O error occurs
Instead of returning a String they ask you to pass in a char[] to use as a buffer to write the result into. The reason is efficiency.
You might be knowing String is immutable and how Substring can cause memory leak in Java.
Since Strings are immutable in Java if you store password as plain text it will be available in memory until Garbage collector clears it and since String are used in String pool for reusability there is pretty high chance that it will be remain in memory for long duration, which pose a security threat. Since any one who has access to memory dump can find the password in clear text. Since Strings are immutable there is no way contents of Strings can be changed because any change will produce new String, while if you char[] you can still set all his element as blank or zero. So Storing password in character array clearly mitigates security risk of stealing password.
Whilst handling passwords in Java, its my understanding that they should always be handled in char[]'s to allow GC and remove hanging references.
My question is would,
char[] password = String.valueOf(authentication.getCredentials()).toCharArray();
Could the value of authentication.getCredentials() to be interned or not?
It's not a question of interning the String, any security concerns around using Strings to store passwords arise from the amount of time they are present in memory.
With a char array you have the ability to wipe the contents once you've finished reading them. With a String (which is immutable) you're left relying on the garbage collector, this means that if someone has access to your server and dumps the memory there may be password visible.
String.valueOf() doesn't intern Strings. The only way to intern Strings during runtime is with password.intern(). There's no need to use char[] for passwords. Using char[] allows you to clear the array directly after use, narrowing the attacker's timeframe to dump the memory and retrieve the plaintext password.
A String by itself is nothing special to the GC. Interning affects it a bit, but in regular use you wouldn't encounter anything out of the ordinary.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
hiding strings in Obfuscated code
I'm trying to hide a little some static Strings of my app in order to make it harder to decompile, this way like the constants like cipher algorithms names are harder to find in the obfuscated code.
I've considered things like:
String CONCAT= "concat"+"string";
String RAW_STRING= "raw_string";
String FROM_BYTES=new String("from_bytes".getBytes());
String FROM_CHARS=new String(new char[]{'f','r','o','m','_','c','h','a','r','s'});
String FROM_CHAR2=new String(new char[]{102,114,111,109,95,99,104,97,114,115,95,50});
And the last two options seems to be "darker" than the raw option but I imagine there are better ways for doing this.
How can I improve this? Thanks
For one, you shouldn't just write
String FROM_CHAR2=new String(new char[]{102,114,111,109,95,99,104,97,114,115,95,50});
It's a dead give-away that the char array is actually a String.
You can do a combination of the followings:
put your "String" in an int[] array
or even better, break your String into several int arrays
calculate/manipulate the array's values at various stage of the application, so its value will only become valid at a certain interval during a runtime, guaranteeing that it won't be deciphered at a curious glance by decompiling your code
passes the array(s) back and forth, through local variables, back to instance variables, etc, before finally converting the arrays to a single array to be passed to the String constructor
immediately set the String to null after use, just to reduce the amount of time the actual String exist at runtime
I would prefer to set the value in the static (class) initializer using an decryption algo
Something like
class ...
String CONCAT;
static {
CONCAT = uncrypt ("ahgsdhagcf");
}
where uncrypt might be really a good unencryption algo or somewhat weaker a base64 decode.
In any case you need a simple program to encode your string first.
In reference to the link: File IO Tuning, last section titled "Further Tuning" where the author suggests using char[] to avoid generating String objects for n lines in the file, I need to understand how does
char[] arr = new char{'a','u','t','h', 'o', 'r'}
differ with
String s = "author"
in terms of memory consumption or any other performance factor? Isn't String object internally stored as a character array? I feel silly since I never thought of this before. :-)
In Oracle's JDK a String has four instance-level fields:
A character array
An integral offset
An integral character count
An integral hash value
That means that each String introduces an extra object reference (the String itself), and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.
If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.
In the example you've referred to, it's because there's only a single character array being allocated for the whole loop. It's repeatedly reading into that same array, and processing it in place.
Compare that with using readLine which needs to create a new String instance on each iteration. Each String instance will contain a few int fields and a reference to a char[] containing the actual data - so it would need two new instances per iteration.
I'd usually expect the differences to be insignificant (with a decent GC throwing away unused "young" objects very efficiently) compared with the IO involved in reading the data - assuming it's from disk - but I believe that's the point the author was trying to make.
The author didn't get the reason right. The real overhead in in.readLine() is the copying a char[] buffer when making a String out of it. The additional copying is the most damning cost when dealing with large data.
It is possible to optimize this within JDK so that the additional copying is not needed.
Here are few reasons which makes sense to believe that character array is better choice in Java than String:
Say for Storing the Password
1) Since Strings are immutable in Java, if you store password as plain text it will be available in memory until Garbage collector clears it and since String are used in String pool for reusability there is pretty high chance that it will be remain in memory for long duration, which pose a security threat.
Since any one who has access to memory dump can find the password in clear text and that's another reason you should always used an encrypted password than plain text.
Since Strings are immutable there is no way contents of Strings can be changed because any change will produce new String, while if you char[] you can still set all his element as blank or zero. So Storing password in character array clearly mitigates security risk of stealing password.
2) Java itself recommends using getPassword() method of JPasswordField which returns a char[] and deprecated getText() method which returns password in clear text stating security reason. Its good to follow advice from Java team and adhering to standard rather than going against it.
3) With String there is always a risk of printing plain text in log file or console but if use Array you won't print contents of array instead its memory location get printed. though not a real reason but still make sense.
For this simple program
String strPassword="Unknown";
char[] charPassword= new char[]{'U','n','k','n','o','w','n'};
System.out.println("String password: " + strPassword);
System.out.println("Character password: " + charPassword);
Output:
String password: Unknown
Character password: [C#110b053
That's all on why character array is better choice than String for storing passwords in Java. Though using char[] is not just enough you need to erase content to be more secure.
Hope this will help.
My answer is going to focus on other stack questions along this similar line, others have already posted more direct answers.
There have been other questions similar to this, advice seems to go along the lines of using StringBuilder.
If you're concerned with string concentenation this have a look at the performance as described here between three different implementations. With another stack post which can give you some additional pointers and examples you could try yourself to see the performance.