Java provides the URLEncoder class for URL-encoding Strings. But it is considered insecure to store passwords as Strings. Is this code to send a password via POST over an HttpsURLConnection output stream wr secure enough?
try (DataOutputStream wr = new DataOutputStream(con.getOutputStream())) {
    // Write some post params
    for (char c : pass) {
        wr.writeBytes(URLEncoder.encode(String.valueOf(c), "UTF-8"));
    }
    // Write some more
}
On the one hand, it is using Strings. On the other hand, those Strings are 1 character long, and conceptually the same after encoding. Also, it seems to me that this could fail on multi-byte characters. Would an attacker be able to locate these 1-char Strings in memory and reconstruct the original password? Is there a better way to do this?
It's true that when using Strings for passwords, you can't reliably discard them since you can't zero out the String contents manually (except through reflection) and the contents will stay in memory for an unknown time. Therefore char[] is often recommended to be used for any password input, followed by a manual zeroing of the said char[].
However, the attack is by no means easy to mount, since it requires access to the process memory and luck with timing. The password is unlikely to stay in memory for very long, as the GC does its work and the memory gets reused. In most cases this attack vector is impractical compared to other, simpler attacks.
A plain OutputStream has no writeBytes method (that one comes from DataOutputStream), but it offers a write(byte[]) method that could be used to write all bytes of the encoded password with one call.
And beyond that: the whole idea of using HTTPS is that the connection itself is securely encrypted; so it really should not matter whether you transmit such content with single byte or multi byte bursts.
Then: keep in mind that all these classes are abstractions, that are layered upon each other. HTTPS uses TCP; and TCP packets have a certain size; in that sense there aren't single bytes going over the network anyway.
And regarding the second part of your question: you are iterating an array of char values. So it really depends how that char array was created; but normally, you don't need to worry (see here on that topic).
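To make the "is there a better way" part concrete, here is a minimal sketch that percent-encodes a char[] password straight into a byte[] without ever building a String. The class name PasswordFormEncoder and its helper methods are made up for illustration; the set of characters left unescaped mirrors what URLEncoder documents (letters, digits, ".", "-", "*", "_", and space as "+"). It works on the UTF-8 bytes of the whole array, so multi-byte characters and surrogate pairs are encoded as a unit.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the JDK.
final class PasswordFormEncoder {
    private static final byte[] HEX =
            "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);

    /** Percent-encodes the UTF-8 bytes of the password. The caller should wipe the result. */
    static byte[] encode(char[] password) {
        ByteBuffer utf8 = StandardCharsets.UTF_8.encode(CharBuffer.wrap(password));
        // Worst case: every byte becomes a three-byte %XX escape.
        byte[] scratch = new byte[utf8.remaining() * 3];
        int n = 0;
        while (utf8.hasRemaining()) {
            int b = utf8.get() & 0xFF;
            if (isSafe(b)) {
                scratch[n++] = (byte) b;
            } else if (b == ' ') {
                scratch[n++] = '+';
            } else {
                scratch[n++] = '%';
                scratch[n++] = HEX[b >> 4];
                scratch[n++] = HEX[b & 0x0F];
            }
        }
        byte[] result = Arrays.copyOf(scratch, n);
        // Wipe the intermediate copies that this method created.
        Arrays.fill(scratch, (byte) 0);
        if (utf8.hasArray()) {
            Arrays.fill(utf8.array(), (byte) 0);
        }
        return result;
    }

    private static boolean isSafe(int b) {
        return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z') || (b >= '0' && b <= '9')
                || b == '.' || b == '-' || b == '*' || b == '_';
    }
}
```

In the question's loop this would replace the per-character calls with a single wr.write(encoded), followed by Arrays.fill(encoded, (byte) 0) and Arrays.fill(pass, '\0'). Copies may still exist in stream and TLS buffers further down the stack, so this only narrows the window rather than closing it.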
Related
I feel strings can replace character array in all the scenarios. Even considering the immutability characteristic of Strings, declaration of strings in appropriate scope and java's garbage collection feature should help us avoid any memory leaks. I want to know if there is any corner case where character array should be used instead of Strings in Java.
Character arrays have some slight advantage over plain strings when it comes to storing security sensitive data. There's a lot of resources on that, for example this question: Why is char[] preferred over String for passwords? (with an answer by Jon Skeet himself).
In general it boils down to two things:
You have very little influence on how long a String stays in memory. Because of that you might leak sensitive data through a memory dump.
Leaking sensitive data accidentally in application logs as clear text is much more likely with plain strings
More reading:
Why we read password from console in char array instead of String
https://www.codebyamir.com/blog/use-character-arrays-to-store-sensitive-data-java
https://www.geeksforgeeks.org/use-char-array-string-storing-passwords-java/amp/
https://www.baeldung.com/java-storing-passwords
https://javarevisited.blogspot.com/2012/03/why-character-array-is-better-than.html
https://javainsider.wordpress.com/2012/12/10/character-array-is-better-than-string-for-storing-password-in-java/amp/
String is a class, not a built-in type. It most likely does what it does by using a char array underneath, but there is no guarantee: "we don't care how it is implemented". It has methods that make sense for strings, like comparing strings. Comparing arrays? That doesn't really make sense. You could check whether they are equal, sure, but less than or greater than...
Back on point. One scenario is that you want to operate on chars, not on a string. For example, you have letters of the alphabet and want to sort them, or grades in an A-F system and want to sort them. In general, it applies wherever the chars are not connected to form some meaning together (as they would be in a message string or a text message). You would not generally need to sort the chars of a text message, would you? So you use an array.
To sort, you can take advantage of the Arrays.sort() method, for example, while I don't think there is a method that does it for strings. Perhaps third-party libraries offer one.
On another note (unrelated to the question), you can use StringBuilder if you want to modify strings often. It performs better.
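As a small illustration of the sorting point above, this sketch sorts grade letters by going through a char[] and Arrays.sort. The class name GradeSorter is made up for the example.

```java
import java.util.Arrays;

// Hypothetical helper for illustration.
final class GradeSorter {
    /** Returns the characters of the input in ascending order. */
    static String sorted(String grades) {
        char[] chars = grades.toCharArray(); // work on a mutable copy
        Arrays.sort(chars);                  // sorts the chars in place
        return new String(chars);
    }
}
```

Calling GradeSorter.sorted("FADBC") returns "ABCDF".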
You don't have to look much further than at methods in the JDK core API that use char[].
Such as this one (java.io.Reader):
public int read(char[] cbuf)
throws IOException
Reads characters into an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached.
Parameters:
cbuf - Destination buffer
Returns:
The number of characters read, or -1 if the end of the stream has been reached
Throws:
IOException - If an I/O error occurs
Instead of returning a String they ask you to pass in a char[] to use as a buffer to write the result into. The reason is efficiency.
You probably know that String is immutable, and how substring can cause a memory leak in Java.
Since Strings are immutable in Java, a password stored as plain text in a String stays in memory until the garbage collector clears it. Because Strings may also be kept in the String pool for reuse, there is a fair chance the password will remain in memory for a long time, which poses a security threat: anyone who has access to a memory dump can find the password in clear text. The contents of a String cannot be changed, because any change produces a new String, whereas with a char[] you can set every element to blank or zero once you are done. So storing a password in a character array mitigates the risk of the password being stolen.
I am developing a Java-based downloader for binary data. This data is transferred via a text-based protocol (UU-encoded). For the networking task the netty library is used. The binary data is split by the server into many thousands of small packets and sent to the client (i.e. the Java application).
From netty I receive a ChannelBuffer object every time a new message (data) is received. Now I need to process that data, beside other tasks I need to check the header of the package coming from the server (like the HTTP status line). To do so I call ChannelBuffer.array() to receive a byte[] array. This array I can then convert into a string via new String(byte[]) and easily check (e.g. compare) its content (again, like comparison to the "200" status message in HTTP).
The software I am writing is using multiple threads/connections, so that I receive multiple packets from netty in parallel.
This usually works fine, however, while profiling the application I noticed that when the connection to the server is good and data comes in very fast, then this conversion to the String object seems to be a bottleneck. The CPU usage is close to 100% in such cases, and according to the profiler very much time is spent in calling this String(byte[]) constructor.
I searched for a better way to get from the ChannelBuffer to a String, and noticed the former also has a toString() method. However, that method is even slower than the String(byte[]) constructor.
So my question is: Does anyone of you know a better alternative to achieve what I am doing?
Perhaps you could skip the String conversion entirely? You could have constants holding byte arrays for your comparison values and check array-to-array instead of String-to-String.
Here's some quick code to illustrate. Currently you're doing something like this:
String http200 = "200";
// byte[] -> String conversion happens every time
String input = new String(ChannelBuffer.array());
return input.equals(http200);
Maybe this is faster:
// Ideally only convert String->byte[] once. Store these
// arrays somewhere and look them up instead of recalculating.
final byte[] http200 = "200".getBytes("UTF-8"); // Select the correct charset!
// Input doesn't have to be converted!
byte[] input = ChannelBuffer.array();
return Arrays.equals(input, http200);
1) Some of the checking you are doing might just look at part of the buffer. If you could use the alternate form of the String constructor:
new String(byteArray, startCol, length)
that might mean far fewer bytes get converted to a String. Your example of looking for "200" within the message would be one such case.
2) You might find that you can use the length of the byte array as a clue. If some messages are long and you are looking for a short one, ignore the long ones and don't convert them to characters. Or something like that.
3) Along with what @EricGrunzke said, you could look at part of the byte buffer to filter out some messages and find that you don't need to convert them from bytes to characters at all.
4) If your bytes are ASCII characters, the conversion to characters might be quicker if you use the charset "ASCII" instead of whatever the default is for your server:
new String(bytes, "ASCII")
might be faster in that case. In fact, you might be able to pick the charset for the byte-to-character conversion in some organized fashion that speeds things up.
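The idea of checking only part of the buffer without any conversion can be sketched as a prefix comparison on raw bytes. The class and method names (BytePrefix, startsWith) are invented for this example, and it assumes the status code sits at a known offset in the array.

```java
// Hypothetical helper for illustration.
final class BytePrefix {
    /**
     * Returns true if the region data[offset .. offset+length) begins with prefix.
     * No String or char[] is created.
     */
    static boolean startsWith(byte[] data, int offset, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false; // region too short to contain the prefix
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[offset + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The prefix array (for example the ASCII bytes of "200") would be computed once and kept in a constant, so the hot path does nothing but byte comparisons.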
Depending on what you are trying to do there are a few options:
If you are just trying to get the response status, can't you just call getStatus()? This would probably be faster than getting the string out.
If you are trying to convert the buffer, then, assuming you know it will be ASCII, which it sounds like you do, then just leave the data as byte[] and convert your UUDecode method to work on a byte[] instead of a String.
The biggest cost of the string conversion is most likely the copying of the data from the byte array to the internal char array of the String, this combined with the conversion is most likely just a bunch of work that you don't need to do.
After reading this beautiful question: Why is char[] preferred over String for passwords?, I'm curious as to how this applies to servlet based web applications. Say your UI has some input field for the password, the password will be retrievable with request.getParameter("passwordFieldName") which returns a String. Even if you then convert it to a char[], you have to wait for GC to clear the String object.
Also, many of the Encryption/Hashing libraries I'm looking into using for password hashing have a method such as checkPassword(plaintext, hashed) that takes two Strings and returns true if the entered plain text string gives a hash equal to hashed. With this, even if you had a char[], you would still need to convert the array to a String with the new String(char[]) constructor.
It seems to me like you can't really avoid having your password as a String for comparing it to its stored representation. If you are worried about attacks during that small window, how do you protect yourself?
This is an overreaction and really just "security theater". There is really no scenario in which having a long String as a password in a Java application would be at all desirable to an attacker. If a memory exhaustion attack is a concern, then don't use Strings anywhere.
That being said CWE-521 states that passwords must have a max size. Strings don't really have a max size, and using a char[x] is a good way of enforcing a max size.
In reference to the link: File IO Tuning, last section titled "Further Tuning" where the author suggests using char[] to avoid generating String objects for n lines in the file, I need to understand how does
char[] arr = new char[]{'a','u','t','h', 'o', 'r'}
differ with
String s = "author"
in terms of memory consumption or any other performance factor? Isn't String object internally stored as a character array? I feel silly since I never thought of this before. :-)
In older versions of Oracle's JDK (before Java 7 update 6, which dropped the offset and count fields), a String has four instance-level fields:
A character array
An integral offset
An integral character count
An integral hash value
That means that each String introduces an extra object reference (the String itself), and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.
If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.
In the example you've referred to, it's because there's only a single character array being allocated for the whole loop. It's repeatedly reading into that same array, and processing it in place.
Compare that with using readLine which needs to create a new String instance on each iteration. Each String instance will contain a few int fields and a reference to a char[] containing the actual data - so it would need two new instances per iteration.
I'd usually expect the differences to be insignificant (with a decent GC throwing away unused "young" objects very efficiently) compared with the IO involved in reading the data - assuming it's from disk - but I believe that's the point the author was trying to make.
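The single reused buffer described above can be sketched like this: one char[] is allocated up front and Reader.read(char[]) fills it repeatedly, so no String is created per line. The class name LineCounter and the buffer size of 8192 are arbitrary choices for the example.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper for illustration.
final class LineCounter {
    /** Counts '\n' characters using one reused buffer instead of readLine(). */
    static int countLines(Reader in) throws IOException {
        char[] buf = new char[8192]; // allocated once for the whole loop
        int lines = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') {
                    lines++;
                }
            }
        }
        return lines;
    }
}
```

Whether this beats readLine() in practice depends on the workload, so it is worth profiling before giving up the more readable version.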
The author didn't get the reason right. The real overhead in in.readLine() is the copying of a char[] buffer when making a String out of it. The additional copying is the most damning cost when dealing with large data.
It is possible to optimize this within JDK so that the additional copying is not needed.
Here are a few reasons why a character array can be a better choice than a String in Java:
Say, for storing a password:
1) Since Strings are immutable in Java, a password stored as plain text stays in memory until the garbage collector clears it, and because Strings may be kept in the String pool for reuse, there is a fair chance it will remain in memory for a long time, which poses a security threat.
Anyone who has access to a memory dump can find the password in clear text, which is another reason to always use an encrypted password rather than plain text.
Since Strings are immutable, their contents cannot be changed (any change produces a new String), whereas with a char[] you can set every element to blank or zero. So storing a password in a character array mitigates the risk of the password being stolen.
2) Java itself recommends using the getPassword() method of JPasswordField, which returns a char[], and has deprecated the getText() method, which returns the password in clear text, citing security reasons. It's good to follow the advice of the Java team and adhere to the standard rather than go against it.
3) With a String there is always a risk of printing the plain text to a log file or the console, but if you use an array, its contents are not printed; a type tag and hash code are printed instead. This is not a real reason, but it still makes sense.
For this simple program
String strPassword="Unknown";
char[] charPassword= new char[]{'U','n','k','n','o','w','n'};
System.out.println("String password: " + strPassword);
System.out.println("Character password: " + charPassword);
Output:
String password: Unknown
Character password: [C@110b053
That's all on why a character array is a better choice than a String for storing passwords in Java. Using char[] alone is not enough, though: you also need to erase its contents afterwards to be more secure.
Hope this will help.
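Erasing the array once it is no longer needed could look like the sketch below. The helper name PasswordScope.withPassword is invented for this example; the only JDK call doing the wiping is Arrays.fill, placed in a finally block so it also runs when the action throws.

```java
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical helper for illustration.
final class PasswordScope {
    /** Runs the action with the password, then overwrites the array with zeros. */
    static <T> T withPassword(char[] password, Function<char[], T> action) {
        try {
            return action.apply(password);
        } finally {
            Arrays.fill(password, '\0'); // wipe even if the action throws
        }
    }
}
```

This only clears the one array it is given; any copy made along the way (for example by new String(password)) stays outside its reach.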
My answer is going to focus on other stack questions along this similar line, others have already posted more direct answers.
There have been other questions similar to this, advice seems to go along the lines of using StringBuilder.
If you're concerned with string concatenation, have a look at the performance comparison described here between three different implementations. Another Stack Overflow post gives some additional pointers and examples you could try yourself to see the performance.
I have a Java socket connection that is receiving data intermittently. The number of bytes of data received with each burst varies. The data may or may not be terminated by a well-known character (such as CR or LF). The length of each burst of data is variable.
I'm attempting to build a string out of each burst of data. What is the fastest way (speed, not memory), to build a string that would later need to be parsed?
I began by using a byte array to store the incoming bytes, then converting them to a String with each burst, like so:
byte[] message = new byte[1024];
...
message[i] = //byte from socket
i++;
...
String messageStr = new String(message);
...
//parse the string here
The obvious disadvantage of this is that some bursts may be longer than 1024. I don't want to arbitrarily create a larger byte array (what if my burst is larger?).
What is the best way of doing this? Should I create a StringBuilder object and append() to it? That way I don't have to convert from StringBuilder to String (since the former has most of the methods I need).
Again, speed of execution is my biggest concern.
TIA.
I would probably use an InputStreamReader wrapped around a BufferedInputStream, which in turn wraps the socket. And write code that processes a message at a time, potentially blocking for input. If the input is bursty, I might run on a background thread and use a concurrent queue to hold the messages.
Reading a buffer at a time and trying to convert it to characters is exactly what BufferedInputStream/InputStreamReader does. And it does so while paying attention to encoding, something that (as other people have noted) your solution does not.
I don't know why you're focused on speed, but you'll find that the time to process data coming off a socket is far less than the time it takes to transmit over that socket.
Note that as you're transmitting across network layers, your speed of conversion may not be the bottleneck. It would be worth measuring, if you believe this to be important.
Note (also) that you're not specifying a character encoding in your conversion from bytes to String (via characters). I would enforce that somehow, otherwise your client/server communication can become corrupted if/when your client/server run in different environments. You can enforce that via JVM runtime args, but it's not a particularly safe option.
Given the above, you may want to consider StringBuilder(int capacity) to configure it in advance with an appropriate size, such that it doesn't have to reallocate on the fly.
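Putting these suggestions together, a minimal sketch might decode the stream through an InputStreamReader with an explicit charset and collect the characters in a pre-sized StringBuilder. The class name SocketText, the method readAll, and the 1024 sizes are assumptions for the example; a real protocol would stop at a message delimiter rather than at end of stream.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper for illustration.
final class SocketText {
    /** Reads the stream to its end and decodes it with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(new BufferedInputStream(in), charset);
        StringBuilder sb = new StringBuilder(1024); // initial capacity; grows as needed
        char[] buf = new char[1024];                // reused for every read
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);                   // append only the chars actually read
        }
        return sb.toString();
    }
}
```

With a socket this would be called as SocketText.readAll(socket.getInputStream(), StandardCharsets.UTF_8), using whichever charset the peer actually sends.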
First of all, you are making a lot of assumptions about the character encoding you receive from your client. Is it US-ASCII, ISO-8859-1, UTF-8?
Because a Java String is not a sequence of bytes, you should make explicit decisions about character encoding when building portable String serialization code. For this reason you should NEVER use StringBuilder to convert bytes to a String. If you look at the StringBuilder interface you will notice that it does not even have an append(byte) method, and that's not because the designers just overlooked it.
In your case you should definitely use a ByteArrayOutputStream. The only drawback of using the stock ByteArrayOutputStream is that its toByteArray() method returns a copy of the array held internally by the object. For this reason you may create your own subclass of ByteArrayOutputStream and provide direct access to the protected buf member.
Note that if you don't use the default implementation, you must specify the byte array bounds in your String constructor. Your code should look something like this:
MyByteArrayOutputStream message = new MyByteArrayOutputStream( 1024 );
...
message.write( /* byte from socket */ );
...
String messageStr = new String(message.buf, 0, message.size(), "ISO-8859-1");
Substitute ISO-8859-1 for the character set that's suitable for your needs.
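The answer does not show the subclass itself, so here is one possible shape for it, offered as a sketch. Instead of exposing buf directly, it adds a decode method (a name made up here) that builds the String from the protected buf and count fields, which avoids the extra copy made by toByteArray().

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// One possible version of the subclass; the body is an assumption, not the answerer's code.
final class MyByteArrayOutputStream extends ByteArrayOutputStream {
    MyByteArrayOutputStream(int initialCapacity) {
        super(initialCapacity);
    }

    /** Decodes the bytes written so far without copying the internal array first. */
    synchronized String decode(Charset charset) {
        // buf and count are protected fields of ByteArrayOutputStream;
        // only the first count bytes of buf hold valid data.
        return new String(buf, 0, count, charset);
    }
}
```

Note that the stock ByteArrayOutputStream.toString(String charsetName) also decodes directly from the internal buffer, so the subclass mainly pays off when you need other direct access to the bytes.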
StringBuilder is your friend. Add as many characters as needed, then call toString() to obtain the String.
I would create a "small" array of characters and append characters to it.
When the array is full (or transmission ends), use the StringBuilder.append(char[] str) method to append the content of the array to your string.
Now for the "small" size of the array - you will need to try various sizes and see which one is fastest for your production environment (performance "may" depend on the JVM, OS, processor type and speed and so on)
EDIT: Other people mentioned ByteArrayOutputStream, I agree it is another option as well.
You may wish to look at ByteArrayOutputStream depending if you are dealing with Bytes instead of Characters.
I generally use a ByteArrayOutputStream to assemble a message, then use toString/toByteArray to retrieve it when the message is finished.
Edit: ByteArrayOutputStream can handle various Character set encoding through the toString call.
Personally, independent of language, I would send all characters to an in-memory data stream and once I need the string, I would read all characters from this stream into a string.
As an alternative, you could use a dynamic array, making it bigger whenever you need to add more characters. Even better, keep track of the actual length and increase the array with additional blocks instead of single characters. Thus, you would start with 1 character in an array of 1000 chars. Once you get at 1001, the array needs to be resized to 2000, then 3000, 4000, etc...
Fortunately, several languages including Java have a special built-in class that specializes in this: the StringBuilder classes. Whatever technique they use isn't that important, but they were created to boost performance, so they should be your fastest option.
Have a look at the Text class. It's faster (for the operations you perform) and more deterministic than StringBuilder.
Note: the project containing the class is aimed at RTSJ VMs. It is perfectly usable in standard SE/EE environments though.