What happens when a Java String overflows? - java

As far as I understand, Java Strings are just an array of characters, with the maximum length being an integer value.
If I understand this answer correctly, it is possible to cause an overflow with a String - albeit in "unusual circumstances".
Since Java Strings are based on char arrays and Java automatically checks array bounds, buffer overflows are only possible in unusual scenarios:
If you call native code via JNI
In the JVM itself (usually written in C++)
The interpreter or JIT compiler does not work correctly (Java bytecode mandated bounds checks)
Correct me if I'm wrong, but I believe this means that you can write outside the bounds of the array, without triggering the ArrayIndexOutOfBounds (or similar) exception.
I've encountered issues in C++ with buffer overflows, and I can find plenty of advice about other languages, but none specifically answering what would happen if you caused a buffer overflow with a String (or any other array type) in Java.
I know that Java Strings are bounds-checked, and can't be overflowed by native Java code alone (unless issues are present in the compiler or JVM, as per points 2 and 3 above), but the first point implies that it is technically possible to get a char[] into an... undesirable position.
Given this, I have two specific questions about the behaviour of such issues in Java, assuming the above is correct:
If a String can overflow, what happens when it does?
What would the implications of this behaviour be?
Thanks in advance.

To answer your first question: I happened to trigger such an error myself, and execution simply stopped, throwing one of these errors:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
That was my case. I don't know whether it represents a security problem in the way a buffer overflow does in C and C++.
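A minimal sketch of how such an error can be provoked and observed. The class and method names are made up for illustration, and the exact message depends on the JVM and heap settings (it may be "Java heap space" instead), so the code only reports whatever message it receives.

```java
// Hypothetical demo class; names and the requested length are illustrative only.
class OversizedArrayDemo {

    /** Tries to allocate a long[] of the given length and reports the outcome. */
    static String tryAllocate(int length) {
        try {
            long[] huge = new long[length];   // roughly 8 bytes per element
            return "allocated " + huge.length + " elements";
        } catch (OutOfMemoryError e) {
            // The JVM refuses the allocation instead of writing past any buffer.
            return "OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate(Integer.MAX_VALUE));
    }
}
```

The point is that the failure is a controlled error raised before any memory is written, not a silent overwrite as in C or C++.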

A String in Java is immutable, so once created there is no writing to the underlying array of char or array of byte (which of the two is used depends on the Java version and the contents of the String). OK, using JNI could circumvent that, but with pure Java it is impossible to leave the bounds of the array without causing an ArrayIndexOutOfBoundsException or the like.
The only way to cause a kind of overflow in the context of String handling would be to create a new String that is too long. Make sure that your JVM has enough heap (around 36 GB), create a char array of length Integer.MAX_VALUE - 1, populate it appropriately, create a String from it with var string = new String( array ), and then execute
string = string.concat( new String( array ) );
But the result is just an error telling you that an attempt was made to create an array that is too large.
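To illustrate the bounds checking mentioned above, here is a small sketch (class and method names are hypothetical) that tries to read one position past the end of a String and to write one position past the end of a char[]. In pure Java both attempts end in an exception rather than in a write outside the array.

```java
// Hypothetical demo class showing that out-of-range accesses are rejected.
class BoundsCheckDemo {

    /** Reads one index past the end of the string and reports the exception type. */
    static String readPastEnd(String s) {
        try {
            return String.valueOf(s.charAt(s.length()));
        } catch (IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Writes one index past the end of the array and reports the exception type. */
    static String writePastEnd(char[] chars) {
        try {
            chars[chars.length] = 'x';
            return "written";
        } catch (ArrayIndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd("abc"));
        System.out.println(writePastEnd(new char[3]));
    }
}
```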

Related

Is there any scenario where character array is better than Strings in Java

I feel strings can replace character array in all the scenarios. Even considering the immutability characteristic of Strings, declaration of strings in appropriate scope and java's garbage collection feature should help us avoid any memory leaks. I want to know if there is any corner case where character array should be used instead of Strings in Java.
Character arrays have some slight advantage over plain strings when it comes to storing security sensitive data. There's a lot of resources on that, for example this question: Why is char[] preferred over String for passwords? (with an answer by Jon Skeet himself).
In general it boils down to two things:
You have very little influence on how long a String stays in memory. Because of that you might leak sensitive data through a memory dump.
Leaking sensitive data accidentally in application logs as clear text is much more likely with plain strings
More reading:
Why we read password from console in char array instead of String
https://www.codebyamir.com/blog/use-character-arrays-to-store-sensitive-data-java
https://www.geeksforgeeks.org/use-char-array-string-storing-passwords-java/amp/
https://www.baeldung.com/java-storing-passwords
https://javarevisited.blogspot.com/2012/03/why-character-array-is-better-than.html
https://javainsider.wordpress.com/2012/12/10/character-array-is-better-than-string-for-storing-password-in-java/amp/
String is a class, not a built-in type. It most likely does what it does by using a char array underneath, but there is no guarantee. "We don't care how it is implemented". It has methods that make sense for strings, like comparing strings. Comparing arrays? Hmm. It doesn't really make sense. You could check whether they are equal, sure, but less than or greater than...
Back to the point. One scenario is that you want to operate on chars, not on a string. For example, you have letters of the alphabet and want to sort them. Or grades in an A-F system and you want to sort them. Generally, this is wherever it makes sense to have chars that are not connected to form some meaning together (as they would be in a message string or a text message). You would not generally need to sort the chars of a text message now, would you? So, you use an array.
To sort, you can take advantage of the Arrays.sort() method, for example, while I don't think there is a method that does it for strings. Perhaps third-party libraries offer one.
On another note (unrelated to the question), you can use StringBuilder if you want to modify strings often. It performs better.
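A short sketch of the sorting case described above, with made-up class and method names. It sorts a char[] in place with Arrays.sort, and shows that the characters of a String can be sorted by going through toCharArray() first.

```java
import java.util.Arrays;

// Hypothetical demo class for sorting characters.
class CharSortDemo {

    /** Returns a new string containing the characters of s in ascending order. */
    static String sortChars(String s) {
        char[] chars = s.toCharArray();   // copy, the String itself is untouched
        Arrays.sort(chars);
        return new String(chars);
    }

    public static void main(String[] args) {
        char[] grades = {'C', 'A', 'F', 'B'};
        Arrays.sort(grades);              // sorts the array in place
        System.out.println(Arrays.toString(grades));
        System.out.println(sortChars("dcba"));
    }
}
```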
You don't have to look much further than at methods in the JDK core API that use char[].
Such as this one (java.io.Reader):
public int read(char[] cbuf)
throws IOException
Reads characters into an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached.
Parameters:
cbuf - Destination buffer
Returns:
The number of characters read, or -1 if the end of the stream has been reached
Throws:
IOException - If an I/O error occurs
Instead of returning a String they ask you to pass in a char[] to use as a buffer to write the result into. The reason is efficiency.
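A minimal sketch of calling read(char[]) with a caller-supplied buffer. The helper name readChunk is an assumption for illustration, and a StringReader stands in for a real stream. The same buffer is passed to every call, so no new array is allocated per read.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class around java.io.Reader.read(char[]).
class ReaderBufferDemo {

    /** Reads once into the given buffer; returns the characters read, or "" at end of stream. */
    static String readChunk(Reader reader, char[] buffer) {
        try {
            int n = reader.read(buffer);          // number of chars read, or -1
            return n == -1 ? "" : new String(buffer, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader reader = new StringReader("hello");
        char[] buffer = new char[3];              // reused for every call
        System.out.println(readChunk(reader, buffer));
        System.out.println(readChunk(reader, buffer));
    }
}
```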
You may already know that String is immutable, and how substring can cause a memory leak in Java.
Since Strings are immutable in Java, a password stored as plain text stays in memory until the garbage collector clears it, and since Strings may be kept in the String pool for reuse, there is a good chance that it will remain in memory for a long time, which poses a security threat: anyone who has access to a memory dump can find the password in clear text. Because Strings are immutable, their contents cannot be changed; any change produces a new String. With a char[], on the other hand, you can set every element to blank or zero. So storing a password in a character array mitigates the risk of the password being stolen.
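The zeroing step can be sketched as follows. The class name, method name, and the comparison against an expected value are assumptions for illustration only (a real application would compare against a hash rather than a plain-text password).

```java
import java.util.Arrays;

// Hypothetical demo class: use a password held in a char[] and wipe it afterwards.
class PasswordWipeDemo {

    /** Compares the password with the expected value, then overwrites the password array. */
    static boolean checkAndWipe(char[] password, char[] expected) {
        try {
            return Arrays.equals(password, expected);
        } finally {
            Arrays.fill(password, '\0');   // the clear text is no longer in this array
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', 'e', 'c', 'r', 'e', 't'};
        System.out.println(checkAndWipe(password, "secret".toCharArray()));
        System.out.println(Arrays.toString(password));
    }
}
```

This only clears the one array the caller controls; copies made elsewhere are not affected.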

Nashorn OutOfMemoryError when building large js objects (strings)

I'm using Nashorn on 1.8u60 to create model objects to pass back to the view tier (Thymeleaf). Part of the model object is a somewhat large string (not big enough to cause any issues in plain Java) containing HTML. When trying to convert the object back into Java using ScriptObjectMirror methods, I'm hitting the following exception. Changing the max heap size doesn't seem to have any effect (changed from 900 MB to 1800 MB, same error). I couldn't find much online about this, but are there any restrictions that Nashorn places on object sizes? I'm going to try the latest 1.8 JDK now.
java.lang.OutOfMemoryError: Java heap space
at jdk.nashorn.internal.runtime.ConsString.flatten(ConsString.java:105)
at jdk.nashorn.internal.runtime.ConsString.flattened(ConsString.java:98)
at jdk.nashorn.internal.runtime.ConsString.toString(ConsString.java:69)
at jdk.nashorn.api.scripting.ScriptObjectMirror.wrap(ScriptObjectMirror.java:704)
at jdk.nashorn.api.scripting.ScriptObjectMirror.wrapLikeMe(ScriptObjectMirror.java:721)
at jdk.nashorn.api.scripting.ScriptObjectMirror.wrapLikeMe(ScriptObjectMirror.java:730)
at jdk.nashorn.api.scripting.ScriptObjectMirror.access$300(ScriptObjectMirror.java:64)
at jdk.nashorn.api.scripting.ScriptObjectMirror$13.call(ScriptObjectMirror.java:371)
at jdk.nashorn.api.scripting.ScriptObjectMirror$13.call(ScriptObjectMirror.java:364)
at jdk.nashorn.api.scripting.ScriptObjectMirror.inGlobal(ScriptObjectMirror.java:859)
at jdk.nashorn.api.scripting.ScriptObjectMirror.entrySet(ScriptObjectMirror.java:364)
...
Thanks,Adrian
That line reads
final char[] chars = new char[length];
so it appears there's indeed not enough memory for the final string. Nashorn uses ConsString as a way to amortize concatenation costs by delaying concatenation until the result is used (most JS engines use this optimization; otherwise, e.g., concatenating lots of strings in a loop would require O(n^2) time).
This means that the result of many + operators on strings may be a tree of ConsString objects that get "flattened" at once. The tradeoff for making concatenation linear in time is the need to keep those ConsStrings around, which requires more than twice the memory needed for the string itself (more than twice because of the ConsString objects' own overhead).
One way to get around this is to periodically invoke str.toString(). It is seemingly a no-op but internally it forces flattening of the concatenation tree. Try introducing it into your code at some point and see whether it helps.

Why char[] performs better than String ?- Java

In reference to the link: File IO Tuning, last section titled "Further Tuning" where the author suggests using char[] to avoid generating String objects for n lines in the file, I need to understand how does
char[] arr = new char[]{'a','u','t','h','o','r'};
differ from
String s = "author";
in terms of memory consumption or any other performance factor? Isn't String object internally stored as a character array? I feel silly since I never thought of this before. :-)
In Oracle's JDK (up to Java 7u6; later versions dropped the offset and count fields) a String has four instance-level fields:
A character array
An integral offset
An integral character count
An integral hash value
That means that each String introduces an extra object reference (the String itself), and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.
If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.
In the example you've referred to, it's because there's only a single character array being allocated for the whole loop. It's repeatedly reading into that same array, and processing it in place.
Compare that with using readLine which needs to create a new String instance on each iteration. Each String instance will contain a few int fields and a reference to a char[] containing the actual data - so it would need two new instances per iteration.
I'd usually expect the differences to be insignificant (with a decent GC throwing away unused "young" objects very efficiently) compared with the IO involved in reading the data - assuming it's from disk - but I believe that's the point the author was trying to make.
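The two approaches can be put side by side in a rough sketch. The class name, method names, and the task (counting occurrences of a character) are made up for illustration, and a StringReader replaces the file. The first method creates one String per line through readLine; the second reuses a single char[] for the whole input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical comparison of per-line Strings versus one reused char[] buffer.
class LineVsBufferDemo {

    /** Counts target characters using readLine(), which allocates a String per line. */
    static int countWithReadLine(String text, char target) {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            int count = 0;
            String line;
            while ((line = in.readLine()) != null) {
                for (int i = 0; i < line.length(); i++) {
                    if (line.charAt(i) == target) count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Counts target characters by reading repeatedly into the same char[] buffer. */
    static int countWithBuffer(String text, char target) {
        try (Reader in = new StringReader(text)) {
            char[] buffer = new char[8192];   // allocated once for the whole loop
            int count = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buffer[i] == target) count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "banana\nkiwi\n";
        System.out.println(countWithReadLine(text, 'a'));
        System.out.println(countWithBuffer(text, 'a'));
    }
}
```

Whether the difference is measurable should be checked by profiling with real input.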
The author didn't get the reason right. The real overhead in in.readLine() is the copying of a char[] buffer when making a String out of it. That additional copying is the most significant cost when dealing with large data.
It is possible to optimize this within JDK so that the additional copying is not needed.
Here are a few reasons why a character array can be a better choice than String in Java.
Say, for storing a password:
1) Since Strings are immutable in Java, a password stored as plain text stays in memory until the garbage collector clears it, and since Strings may be kept in the String pool for reuse, there is a good chance that it will remain in memory for a long time, which poses a security threat.
Anyone who has access to a memory dump can find the password in clear text, and that's another reason you should always use an encrypted password rather than plain text.
Because Strings are immutable, their contents cannot be changed; any change produces a new String. With a char[], on the other hand, you can set every element to blank or zero. So storing a password in a character array mitigates the risk of the password being stolen.
2) Java itself recommends using the getPassword() method of JPasswordField, which returns a char[], and has deprecated the getText() method, which returns the password in clear text, citing security reasons. It's good to follow the advice of the Java team and adhere to the standard rather than go against it.
3) With a String there is always a risk of printing plain text to a log file or the console, but if you concatenate an array into a message, its contents are not printed; instead its type and hash code are. This is not a real reason, but it still makes sense.
For this simple program
String strPassword="Unknown";
char[] charPassword= new char[]{'U','n','k','n','o','w','n'};
System.out.println("String password: " + strPassword);
System.out.println("Character password: " + charPassword);
Output:
String password: Unknown
Character password: [C@110b053
That's all on why a character array is a better choice than String for storing passwords in Java. Using char[] alone is not enough, though; you also need to erase its contents to be more secure.
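As a caveat to the logging argument, here is a small sketch (hypothetical class and method names) of which conversions hide the contents of a char[] and which do not. String concatenation uses the array's default toString(), while String.valueOf(char[]) and System.out.println(char[]) print the actual characters.

```java
// Hypothetical demo class: how a char[] appears when converted to text.
class CharArrayPrintDemo {

    /** Concatenation calls Object.toString(), giving the type and a hash code. */
    static String viaConcatenation(char[] chars) {
        return "" + chars;
    }

    /** String.valueOf(char[]) copies the characters into a new String. */
    static String viaValueOf(char[] chars) {
        return String.valueOf(chars);
    }

    public static void main(String[] args) {
        char[] password = {'U', 'n', 'k', 'n', 'o', 'w', 'n'};
        System.out.println(viaConcatenation(password)); // starts with [C@
        System.out.println(viaValueOf(password));       // the clear text
        System.out.println(password);                   // println(char[]) also prints the clear text
    }
}
```

So the protection against accidental logging only holds for the concatenation case.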
Hope this will help.
My answer is going to focus on other Stack Overflow questions along similar lines; others have already posted more direct answers.
There have been other questions similar to this one, and the advice seems to go along the lines of using StringBuilder.
If you're concerned with string concatenation, have a look at the performance comparison described here between three different implementations. There is also another Stack Overflow post that gives some additional pointers and examples you could try yourself to see the performance.
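For reference, a minimal sketch of the two styles being compared in such posts, with hypothetical names. Both methods produce the same result; the += version creates a new String on every iteration, while the StringBuilder version appends to one growing buffer.

```java
// Hypothetical demo class: repeated concatenation versus StringBuilder.
class ConcatDemo {

    /** Builds the result with +=, creating a new String each time round the loop. */
    static String joinWithPlus(int[] values) {
        String s = "";
        for (int v : values) {
            s += v;
        }
        return s;
    }

    /** Builds the result with a single StringBuilder. */
    static String joinWithBuilder(int[] values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            sb.append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3};
        System.out.println(joinWithPlus(values));
        System.out.println(joinWithBuilder(values));
    }
}
```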

How to deal with large arrays in Java [closed]

Closed 11 years ago.
I am reading a file that has 10,000 int values and then trying to store these in an array. There is an exception thrown which says that the array value is too large.
I was wondering whether, rather than writing this array out into a variable, I could just keep it in memory and read it from there. Would this be a suitable way of solving this problem?
edit:
After more examination it appears that the error being thrown is a "code too large for try statement" error. I am reading each array element and appending it to a string; maybe this is what is causing the error?
You could use an ArrayList instead - but an array should be fine with 10,000 values. Can you post more detail? Code, full stack trace etc. Theoretically it should be fine with Integer.MAX_VALUE elements (a LOT more than 10k), but of course you may run out of memory first!
In terms of "just keep it in memory and read it from there", well variables are just kept in memory, so whether you use an array or a list (or any other data structure) you'll always be reading it from memory!
EDIT: Based on your additional explanation then it's not a problem with the array size at all, it's a problem with you generating 10,000 lines of code to put in a single block, which is too many and thus it complains. Alter your code to generate code that uses a loop instead and all should be well, however many elements you have in there (up to Integer.MAX_VALUE of course.)
An array of 10,000 int values is about 40KB.
You could try to reduce the memory used further however I suspect this is not your problem.
Can you give us the actual error message? An array size is only too large if it's a long, e.g. if you used File.length()/4 to determine the size of the array, in which case you need to cast it to an int.
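A small sketch of sizing an int array from a file length. The class and method names are hypothetical, and Math.toIntExact is used here in place of a plain cast so that a value too large for an int raises an ArithmeticException instead of silently wrapping around. It also shows the arithmetic behind the 40 KB figure: 10,000 ints at 4 bytes each is 40,000 bytes.

```java
// Hypothetical demo class: derive an int[] size from a length in bytes.
class ArraySizeDemo {

    /** Allocates an int[] large enough for a file of the given length in bytes. */
    static int[] allocateFor(long fileLengthInBytes) {
        long count = fileLengthInBytes / Integer.BYTES;   // 4 bytes per int
        return new int[Math.toIntExact(count)];           // throws if count exceeds int range
    }

    public static void main(String[] args) {
        int[] values = allocateFor(40_000L);
        System.out.println(values.length);
    }
}
```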
It is strange that you cannot create an array 10,000 elements long. I believe that your problem is not the array length but the value of a particular array element. Anyway, if you need bigger arrays, use Lists instead, specifically java.util.LinkedList.
Your problem is that you are writing each array or String assignment out in full, something like this:
array[0] = 0;
array[1] = 1;
array[2] = 2;
// all the way up to 9999!
or this:
String s = "";
s += array[0];
s += array[1];
s += array[2];
// all the way up to 9999!
instead of in a loop. So you generate more code than Java will allow in a method.
This results in a compilation error as you describe:
$ javac Test.java
Test.java:7: code too large for try statement
try {
^
Test.java:4: code too large
public static void main(String[] args) {
^
2 errors
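The loop-based version of the statements above might look like the following sketch (class and method names are made up). The amount of code stays the same no matter how many elements there are, so the method size limit is not an issue.

```java
// Hypothetical demo class: fill and join an array with loops instead of generated statements.
class LoopFillDemo {

    /** Creates an array where element i holds the value i. */
    static int[] fill(int size) {
        int[] array = new int[size];
        for (int i = 0; i < array.length; i++) {
            array[i] = i;
        }
        return array;
    }

    /** Appends every element to one string. */
    static String join(int[] array) {
        StringBuilder sb = new StringBuilder();
        for (int value : array) {
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] array = fill(10_000);
        System.out.println(array[9_999]);
        System.out.println(join(fill(3)));
    }
}
```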
Following from discussion in comments, the code that you say is producing this compiler error does not have an enormous number of lines. Something doesn't make sense - the error you report does not line up with the code you say is causing it. At this late stage I strongly recommend that you post some code, and the error so that others can try to understand what might be causing this.
(Also, your question isn't likely to get much attention because you have accepted an answer. You might want to reconsider that if your question is not in fact answered.)
An array of 10,000 ints isn't very big at all. I can't think why you would have a problem keeping the data in memory (ie assigned to a variable).
I find it odd that 10,000 ints take up too much memory. It could be that other stuff is eating up your memory. Have you tried increasing the memory available to Java (e.g. -Xmx512m)? If this is not possible, you can always try to use shorts or bytes if the numbers are small enough.
The array will take up just as much space as a plain chunk of memory would (as in C).
This is a known bug in the JVM. It prohibits you from creating an array of integers with size 10,000 (and also 16,384 on Mac OS X). It has to do with the way in which Java translates the byte code into machine code. Changing the size of the array to 10,001 will solve the problem.

What is the max. capacity of byte-Array?

I wrote a Java class that performs addition, subtraction, multiplication, etc.
The numbers look like (155^199 [+,-,*,/] 555^669 [+,-,*,/] ..... [+,-,*,/] x^n);
Each number is stored in a byte array, and a byte array can contain at most 66,442 values.
example:
(byte) array = [1][0] + [9][0] = [1][0][0]
(byte) array = [9][0] * [9][0] = [1][8][0][0]
My class file does not work if the number is bigger than that (example: 999^999).
How can I solve this problem so that I can add much bigger numbers?
When the byte array reaches 66,443 values, the VM gives this error:
Caused by: java.lang.ClassNotFoundException, which is actually not the correct error description.
In other words, if I have a byte array with 66,443 values, the class cannot be read correctly.
Solved:
I used a multidimensional byte array to solve this problem.
array{array, ... nth-array} [+, -, /] nth-array{array, ... nth-array}
It takes only a few seconds to add big numbers.
Thank you!
A single method in Java is limited to 64KB of byte code. When you initialise an array in code, byte code is used to do it, so the size of an array you can define this way is limited accordingly.
If you have a large byte array of values, I suggest you store it in an external file and load it at runtime. This way you can have a byte array of up to 2 GB. If you need more than this, you need an array of arrays.
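A minimal sketch of loading the values from a file at runtime, assuming the data has been written out as raw bytes. The class name, method name, and the temporary file are illustrative only; the example writes its own sample file so that it is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: keep a large byte array in a file instead of in source code.
class ExternalBytesDemo {

    /** Writes size sample bytes to a temporary file and loads them back. */
    static byte[] roundTrip(int size) {
        try {
            byte[] data = new byte[size];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) (i % 10);          // sample digits 0..9
            }
            Path file = Files.createTempFile("digits", ".bin");
            try {
                Files.write(file, data);
                return Files.readAllBytes(file);    // the whole file as one byte[]
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] loaded = roundTrip(100_000);
        System.out.println(loaded.length);
    }
}
```

The array here is well past the size that caused trouble when it was initialised in code, since no byte code is generated per element.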
What does your actual code look like? What error are you getting?
A Java byte array can hold up to 2^31-1 values, if there is that much contiguous memory available.
Each array can hold a maximum of Integer.MAX_VALUE values. If it crashes, I guess you see an OutOfMemoryError. Fix that by starting your Java VM with more heap space:
java -Xmx1024M <...>
(this example gives 1024 MB of heap space)
java.lang.ClassNotFoundException is thrown if the virtual machine needs a class and can't load it, usually because it is not on the class path (sometimes the case when we simply forget to compile a Java source file). This exception is totally unrelated to Java array operations.
To continue the discussion in the comments section:
The name of the missing class is very important. At the line of code, where the exception is thrown, the VM tries to load the class ClassBigMath for the very first time and fails. The classloader can't find a file ClassBigMath.class on the classpath.
Double check first if the compiled java file is really present and double check that you don't have a typo in your source code. Typical reasons for this error:
We simply forget to compile a source file
A class file is on the classpath at compilation time but not at execution time
We do a Class.forName("MyClass") and have a typo in the class name
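The last cause in the list can be reproduced with a short sketch. The class and method names are hypothetical, and java.math.BigInteger is used only because it is always available; the misspelled name stands in for a typo in a real class name.

```java
// Hypothetical demo class: a class name typo leads to ClassNotFoundException.
class ClassLookupDemo {

    /** Tries to load the named class and reports the outcome. */
    static String lookup(String className) {
        try {
            return "loaded " + Class.forName(className).getName();
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("java.math.BigInteger"));
        System.out.println(lookup("java.math.BigIntegr"));   // deliberate typo
    }
}
```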
java.math.BigInteger is a much better solution for handling large numbers. Is there any reason you chose a byte array?
The maximum size of an array in Java is given by Integer.MAX_VALUE. This is 2^31-1 elements. You might get OOM exceptions for less if there is not enough memory free. Besides that, for what you are doing you might want to look at the BigInteger class. It seems you are doing your math in some form of decimal representation, which is not very memory efficient.
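A brief sketch of the BigInteger approach applied to the kind of expression in the question. The class and method names are made up for illustration.

```java
import java.math.BigInteger;

// Hypothetical demo class: arbitrary-precision powers and sums with BigInteger.
class BigPowerDemo {

    /** Computes base^exponent exactly. */
    static BigInteger power(int base, int exponent) {
        return BigInteger.valueOf(base).pow(exponent);
    }

    public static void main(String[] args) {
        BigInteger sum = power(155, 199).add(power(555, 669));
        System.out.println(sum.toString().length() + " digits");
        System.out.println(power(999, 999).multiply(sum).signum());
    }
}
```

BigInteger stores the value in binary internally, which is more compact than one decimal digit per byte.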
