Maximum number of enum elements in Java

What is the maximum number of elements allowed in an enum in Java?
I wanted to find out the maximum number of cases in a switch statement. Since the largest primitive type allowed in a switch is int, we have cases from -2,147,483,648 to 2,147,483,647, plus one default case. However, enums are also allowed in a switch... hence the question.

From the class file format spec:
The per-class or per-interface constant pool is limited to 65535 entries by the 16-bit constant_pool_count field of the ClassFile structure (§4.1). This acts as an internal limit on the total complexity of a single class or interface.
I believe this implies that you cannot have more than 65535 named "things" in a single class, which would also limit the number of enum constants.
If I see a switch with 2 billion cases, I'll probably kill anyone that has touched that code.
Fortunately, that cannot happen:
The amount of code per non-native, non-abstract method is limited to 65536 bytes by the sizes of the indices in the exception_table of the Code attribute (§4.7.3), in the LineNumberTable attribute (§4.7.8), and in the LocalVariableTable attribute (§4.7.9).

The maximum number of enum elements is 2746. Reading the spec was very misleading and caused me to create a flawed design under the assumption that I would never hit the 64K, or even the 32K, high-water mark. Unfortunately, the number is much lower than the spec seems to indicate. As a test, I tried the following with both Java 7 and Java 8: I ran the following code, redirecting the output to a file, then compiled the resulting .java file.
System.out.println("public enum EnumSizeTest {");
int max = 2746;
for ( int i=0; i<max; i++) {
System.out.println("VAR"+i+",");
}
System.out.println("VAR"+max+"}");
Result: 2746 works, and 2747 does not.
After 2746 entries, the compiler throws a 'code too large' error, like:
EnumSizeTest.java:2: error: code too large
Decompiling the resulting enum class file, the restriction appears to be caused (mostly) by the code generated for each enum value in the static initializer.

Enums definitely have limits, with the primary (hard) limit around 32K values. They are subject to the Java class-file maximums, both the constant pool limit (64K entries) and -- in some compiler versions -- a method size limit (64K of bytecode) on the static initializer.
Enum initialization internally uses two constant-pool entries per value -- a FieldRef and a Utf8 string. This puts the "hard limit" at ~32K values.
Older compilers (Eclipse Indigo, at least) also run into an issue with the static initializer's method size: with 24 bytes of bytecode required to instantiate each value and add it to the values array, a limit of around 2730 values may be encountered.
Newer compilers (JDK 7, at least) automatically split large static initializers off into methods named " enum constant initialization$2", " enum constant initialization$3", etc., so they are not subject to the second limit.
You can disassemble bytecode via javap -v -c YourEnum.class to see how this works.
[It might be theoretically possible to write an "old-style" Enum class as handcoded Java, to break the 32K limit and approach close to 64K values. The approach would be to initialize the enum values by reflection to avoid needing string constants in the pool. I tested this approach and it works in Java 7, but the desirability of such an approach (safety issues) are questionable.]
Note to editors: Utf8 is an internal type in the Java class file format, IIRC; it's not a typo to be corrected.

Well, on JDK 1.6 I hit this limit. Someone has a 10,000-value enum in an xsd, and when we generate from it, we get a 60,000-line enum file and a nice Java compiler error of
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile) on project framework: Compilation failure
[ERROR] /Users/dhiller/Space/ifp-core/framework/target/generated-sources/com/framework/util/LanguageCodeSimpleType.java:[7627,4] code too large
so quite possibly the limit is much lower than the other answers here suggest, OR maybe the comments and such that are generated are taking up too much space. Notice, though, that the line number in the compiler error is 7627; if the line limit is 7627, I wonder what the line-length limit is ;) which may be similar. That is, the limits may not be based on the number of enum constants but on line length or the number of lines in the file, so you would have to rename the enum constants to A, B, etc. to be very small in order to fit more constants into the enum.
I can't believe someone wrote an xsd with a 10,000-value enum... they must have generated that portion of the xsd.

Update for Java 15+
In JDK 15, the maximum number of constants in an enum was raised to about 4103: https://bugs.openjdk.java.net/browse/JDK-8241798
This was achieved by splitting the static initializer into two parts:
Before Java 15 (pseudocode):
enum E extends Enum<E> {
    ...
    static {
        C1 = new E(...);
        C2 = new E(...);
        ...
        CN = new E(...);
        $VALUES = new E[N];
        $VALUES[0] = C1;
        $VALUES[1] = C2;
        ...
        $VALUES[N-1] = CN;
    }
}
Since Java 15:
enum E extends Enum<E> {
    ...
    static {
        C1 = new E(...);
        C2 = new E(...);
        ...
        CN = new E(...);
        $VALUES = $values();
    }
    private static E[] $values() {
        E[] array = new E[N];
        array[0] = C1;
        array[1] = C2;
        ...
        array[N-1] = CN;
        return array;
    }
}
This allowed the static initializer to contain more code (until it hits the 64-kilobyte limit) and therefore to initialize more enum constants.

The maximum size of any method in Java is 65536 bytes. While you can theoretically have a large switch or more enum values, it's the maximum size of a method that you are likely to hit first.

The Enum class uses an int to track each value's ordinal, so the max would be the same as int at best, if not much lower.
And as others have said, if you have to ask you're Doing It Wrong

There is no maximum number per se for any practical purpose. If you need to define thousands of enum constants in one class, you need to rewrite your program.

Related

The difference between arrays in Java and C

In my book there is an example which explains the differences between arrays in Java and C.
In Java we can create an array by writing:
int[] a = new int[5];
In C we can instead write:
int a[5] = {0};
This just allocates storage space on the stack for five integers, and we can access them exactly as we would have done in Java:
int i;
for (i = 0; i < 5; i++) {
    printf("%2d: %7d\n", i, a[i]);
}
Then the author says the following
Of course, our program should not use the number 5 as we did in several places in the example; instead, we should use a constant. We can use the C preprocessor to do this:
#define SIZE 5
What are the advantages of defining the constant SIZE as 5?
Using a named constant is generally considered good practice because if it is used in multiple places, you only need to change the definition to change the value, rather than change every occurrence - which is error prone.
For example, as mentioned by stark in the comments, it is likely that you'll want to loop over an array. If the size of the array is defined by a named constant called SIZE, then you can use that in the loop bounds. Changing the size of the array then only requires changing the definition of SIZE.
There is also the question of whether #define is really the right solution.
To borrow another comment, from Jonathan Leffer: see static const vs #define vs enum for a discussion of different ways of naming constants. While modern C does allow using a variable as an array size specifier, this technically results in a variable-length array which may incur a small overhead.
You should use a constant, because embedding magic numbers in code makes it harder to read and maintain. For instance, if you see 52 in some code, you don't know what it is. However, if you write #define DECKSIZE 52, then whenever you see DECKSIZE, you know exactly what it means. In addition, if you want to change the deck size, say 36 for durak, you could simply change one line, instead of changing every instance throughout the code base.
Well, imagine that you create a static array of 5 integers just like you did with int my_arr[5];, and you write a whole program with it, but suddenly you realise that maybe you need more space. If you have written 600-700 lines of code, you MUST replace every occurrence of your array's fixed size with the new number of your choice: every for loop and everything else that is related to the size of this array. You can avoid all of this by using the preprocessor directive #define, which will replace every occurrence of a "keyword" with the content you want; it's like a synonym for something. E.g. #define SIZE 5 will replace every occurrence of the word SIZE in your code with the value 5.
I find the comments here to be superfluous. As long as you use your constant (5 in this case) only once, it doesn't matter where it is. Moreover, having it in place improves readability. And you certainly do not need to use the constant in more than one place - after all, you should infer the size of the array through the sizeof operator anyway. The benefit of the sizeof approach is that it works seamlessly with VLAs.
The drawback of a global #define (or any other global name) is that it pollutes the global namespace. One should understand that global names are a resource to be used conservatively.
#define SIZE 5
This looks like an old, outdated way of declaring constants in C code that was popular in the dinosaur era. I suppose some lovers of this style are still alive.
The preferred way to declare constants in C nowadays is:
const int kSize = 5;
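Since the book's example contrasts C with Java, the Java side of the same idea would usually be a static final field rather than a preprocessor macro. A minimal sketch (the class and constant names are illustrative):
public class ArrayDemo {
    private static final int SIZE = 5; // named constant instead of a magic number

    public static void main(String[] args) {
        int[] a = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            System.out.printf("%2d: %7d%n", i, a[i]);
        }
    }
}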

How many objects are created by using the Integer wrapper class?

Integer i = 3;
i = i + 1;
Integer j = i;
j = i + j;
How many objects are created as a result of the statements in the sample code above and why? Is there any IDE in which we can see how many objects are created (maybe in a debug mode)?
The answer, surprisingly, is zero.
All the Integers from -128 to +127 are pre-computed by the JVM.
Your code creates references to these existing objects.
The strictly correct answer is that the number of Integer objects created is indeterminate. It could be between 0 and 3, or 256 [1], or even more [2], depending on:
the Java platform [3],
whether this is the first time that this code is executed, and
(potentially) whether other code that relies on boxing of int values runs before it [4].
The Integer values for -128 to 127 are not strictly required to be precomputed. In fact, JLS 5.1.7, which specifies the boxing conversion, says this:
If the value p being boxed is an integer literal of type int between -128 and 127 inclusive (§3.10.1) ... then let a and b be the results of any two boxing conversions of p. It is always the case that a == b.
Two things to note:
The JLS only requires this for >>literals<<.
The JLS does not mandate eager caching of the values. Lazy caching also satisfies the JLS's behavioral requirements.
Even the javadoc for Integer.valueOf(int) does not specify that the results are cached eagerly.
If we examine the Java SE source code for java.lang.Integer from Java 6 through 8, it is clear that the current Java SE implementation strategy is to precompute the values. However, for various reasons (see above) that is still not enough to allow us to give a definite answer to the "how many objects" question.
[1] It could be 256 if execution of the above code triggers class initialization for Integer in a version of Java where the cache is eagerly initialized during class initialization.
[2] It could be even more, if the cache is larger than the JVM spec requires. The cache size can be increased via a JVM option in some versions of Java.
[3] In addition to the platform's general approach to implementing boxing, a compiler could spot that some or all of the computation can be done at compile time, or optimize it away entirely.
[4] Such code could trigger either lazy or eager initialization of the integer cache.
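A quick way to see the cache in action is to compare references for small and larger boxed values. A minimal sketch (the result for 300 assumes a default-configured HotSpot JVM with no enlarged cache):
public class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        System.out.println(a == b);   // true: 100 is within the mandatory -128..127 cache

        Integer c = Integer.valueOf(300);
        Integer d = Integer.valueOf(300);
        System.out.println(c == d);   // typically false: 300 is outside the default cache
    }
}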
First of all: The answer you are looking for is 0, as others already mentioned.
But let's go a bit deeper. As Stephen mentioned, it depends on when you execute it, because the cache is actually lazily initialized.
If you look at the documentation of java.lang.Integer.IntegerCache:
The cache is initialized on first usage.
This means that the first time you box any int value, you actually create:
256 Integer objects (or more: see below)
1 object for the array that stores the Integers
Let's ignore the objects needed to store the class (and methods/fields); they are stored in the metaspace anyway.
From the second time on, you create 0 objects.
Things get funnier once you make the numbers a bit higher, e.g. with the following example:
Integer i = 1500;
Valid options here are: 0, 1, or any number between 1629 and 2147483776 (this time counting only the created Integer values).
Why? The answer is given in the next sentence of the IntegerCache definition:
The size of the cache may be controlled by the -XX:AutoBoxCacheMax= option.
So you can actually vary the size of the cache that is implemented.
Which means that, for the line above, you can get:
1 new object, if your cache is smaller than 1500
0 new objects, if your cache has been initialized before and contains 1500
1629 new Integer objects, if your cache is set to exactly 1500 and has not been initialized yet; then the Integer values from -128 to 1500 will be created.
As per the sentence above, you can reach any number of Integer objects here, up to Integer.MAX_VALUE + 129, which is the mentioned 2147483776.
Keep in mind: this is only guaranteed on Oracle / OpenJDK (I checked versions 7 and 8).
As you can see, the completely correct answer is not so easy to get, but just saying 0 will make people happy.
PS: using the mentioned parameter can make the following reference comparison true: Integer.valueOf(1500) == Integer.valueOf(1500)
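A small sketch of that last point, using the flag quoted from the IntegerCache documentation above (the class name is illustrative, and the observed result assumes a HotSpot-based JVM):
// Run with: java -XX:AutoBoxCacheMax=2000 AutoBoxCacheDemo
public class AutoBoxCacheDemo {
    public static void main(String[] args) {
        Integer boxed = 1500;  // autoboxing calls Integer.valueOf(1500)
        // Reference comparison: true only if 1500 falls inside the enlarged cache
        System.out.println(boxed == Integer.valueOf(1500));
    }
}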
The compiler unboxes the Integer objects to ints to do arithmetic with them by calling intValue() on them, and it calls Integer.valueOf to box the int results when they are assigned to Integer variables, so your example is equivalent to:
Integer i = Integer.valueOf(3);
i = Integer.valueOf(i.intValue() + 1);
Integer j = i;
j = Integer.valueOf(i.intValue() + j.intValue());
The assignment j = i; is a completely normal object reference assignment which creates no new objects. It does no boxing or unboxing, and doesn't need to as Integer objects are immutable.
The valueOf method is allowed to cache objects and return the same instance each time for a particular number. It is required to cache ints −128 through +127. For your starting number of i = 3, all the numbers are small and guaranteed to be cached, so the number of objects that need to be created is 0. Strictly speaking, valueOf is allowed to cache instances lazily rather than having them all pre-generated, so the example might still create objects the first time, but if the code is run repeatedly during a program the number of objects created each time on average approaches 0.
What if you start with a larger number whose instances will not be cached (e.g., i = 300)? Then each valueOf call must create one new Integer object, and the total number of objects created each time is 3.
(Or, maybe it's still zero, or maybe it's millions. Remember that compilers and virtual machines are allowed to rewrite code for performance or implementation reasons, so long as its behavior is not otherwise changed. So it could delete the above code entirely if you don't use the result. Or if you try to print j, it could realize that j will always end up with the same constant value after the above snippet, and thus do all the arithmetic at compile time, and print a constant value. The actual amount of work done behind the scenes to run your code is always an implementation detail.)
You can debug the Integer.valueOf(int i) method to find this out for yourself.
This method is called by the autoboxing process, which is inserted by the compiler.

Are 0.0 and 1.0 considered magic numbers?

I know that -1, 0, 1, and 2 are exceptions to the magic number rule. However I was wondering if the same is true for when they are floats. Do I have to initialize a final variable for them or can I just use them directly in my program.
I am using it as a percentage in a class. If the input is less than 0.0 or greater than 1.0, then I want it to set the percentage automatically to zero. So: if (0.0 <= input && input <= 1.0).
Thank you
Those numbers aren't really exceptions to the magic number rule. The common sense rule (as far as there is "one" rule), when it isn't simplified to the level of dogma, is basically, "Don't use numbers in a context that doesn't make their meaning obvious." It just so happens that these four numbers are very commonly used in obvious contexts. That doesn't mean they're the only numbers where this applies, e.g. if I have:
long kilometersToMeters(int km) { return km * 1000L; }
there is really no point in naming the number: it's obvious from the tiny context that it's a conversion factor. On the other hand, if I do this in some low-level code:
sendCommandToDevice(1);
it's still wrong, because that should be a constant kResetCommand = 1 or something like it.
So whether 0.0 and 1.0 should be replaced by a constant completely depends on the context.
It really depends on the context. The whole point of avoiding magic numbers is to maintain the readability of your code. Use your best judgement, or provide us with some context so that we may use ours.
Magic numbers are [u]nique values with unexplained meaning or multiple occurrences which could (preferably) be replaced with named constants.
http://en.wikipedia.org/wiki/Magic_number_(programming)
Edit: When to document code with variable names vs. when to just use a number is a hotly debated topic. My opinion is that of the author of the Wiki article linked above: if the meaning is not immediately obvious and it occurs multiple times in your code, use a named constant. If it only occurs once, just comment the code.
If you are interested in other people's (strongly biased) opinions, read
What is self-documenting code and can it replace well documented code?
Usually, every rule has exceptions (and this one too). It is a matter of style to use some mnemonic names for these constants.
For example:
int Rows = 2;
int Cols = 2;
is a pretty valid example where using raw values would be misleading.
The meaning of the magic number should be obvious from the context. If it is not - give the thing a name.
Attaching a name for something creates an identity. Given the definitions
const double Moe = 2.0;
const double Joe = 2.0;
...
double Larry = Moe;
double Harry = Moe;
double Garry = Joe;
the use of symbols for Moe and Joe suggests that the default values of Larry and Harry are related to each other in a way that the default value of Garry is not. The decision of whether or not to define a name for a particular constant shouldn't depend upon the value of that constant, but rather on whether it will non-coincidentally appear in multiple places in the code. If one is communicating with a remote device which requires that a particular byte value be sent to it to trigger a reset, I would consider:
void ResetDevice()
{
// The 0xF9 command is described in the REMOTE_RESET section of the
// Frobnitz 9000 manual
transmitByte(0xF9);
}
... elsewhere
myDevice.ResetDevice();
...
otherDevice.ResetDevice();
to be in many cases superior to
// The 0xF9 command is described in the REMOTE_RESET section of the
// Frobnitz 9000 manual
const int FrobnitzResetCode = 0xF9;
... elsewhere
myDevice.transmitByte(FrobnitzResetCode );
...
otherDevice.transmitByte(FrobnitzResetCode );
The value 0xF9 has no real meaning outside the context of resetting the Frobnitz 9000 device. Unless there is some reason why outside code should prefer to send the necessary value itself rather than calling a ResetDevice method, the constant should have no value to any code outside the method. While one could perhaps use
void ResetDevice()
{
// The 0xF9 command is described in the REMOTE_RESET section of the
// Frobnitz 9000 manual
int FrobnitzResetCode = 0xF9;
transmitByte(FrobnitzResetCode);
}
there's really not much point to defining a name for something which is in such a narrow context.
The only thing "special" about values like 0 and 1 is that used significantly more often than other constants like e.g. 23 in cases where they have no domain-specific identity outside the context where they are used. If one is using a function which requires that the first parameter indicates the number of additional parameters (somewhat common in C) it's better to say:
output_multiple_strings(4, "Bob", Joe, Larry, "Fred"); // There are 4 arguments
...
output_multiple_strings(4, "George", Fred, "James", Lucy); // There are 4 arguments
than
#define NUMBER_OF_STRINGS 4 // There are 4 arguments
output_multiple_strings(NUMBER_OF_STRINGS, "Bob", Joe, Larry, "Fred");
...
output_multiple_strings(NUMBER_OF_STRINGS, "George", Fred, "James", Lucy);
The latter statement implies a stronger connection between the value passed to the first method and the value passed to the second, than exists between the value passed to the first method and anything else in that method call. Among other things, if one of the calls needs to be changed to pass 5 arguments, it would be unclear in the second code sample what should be changed to allow that. By contrast, in the former sample, the constant "4" should be changed to "5".

Switch statement in Java

How many cases are possible for a switch statement in Java? For example, if we are switching on an integer, how many case blocks are possible?
The bound you will most likely meet first is the maximum number of entries in the constant pool per class, which is 65535. This allows for a few thousand case blocks of small complexity. The constant pool contains one entry for each numeric or string literal that is used at least once in the class, but also one or more entries for each field, method, and/or class reference, as these entries are composed of other constants that must also be present in the constant pool. E.g. a method reference entry consists of a reference to a string entry for the signature of the method and a reference to the class entry of the declaring class; the class entry itself again references a string entry for the class name.
See: Limitations of the Java virtual machine and The Constant Pool in the Java Virtual Machine Specification
The absolute upper bound for a switch, ignoring or reusing the code in the case blocks, is slightly less than 2^30 cases, since each case has a jump target which is a signed 32-bit integer (see the tableswitch and lookupswitch instructions) and thus needs 4 bytes per case, and the byte code size for each method is limited to slightly less than 2^32 bytes. This is because the byte code is wrapped in a Code attribute and the length of an attribute is given as an unsigned 32-bit integer. This size is further reduced because the Code attribute has some header information, the method needs some entry and exit code, and the tableswitch instruction needs some bytes for itself with its min/max values and at most 3 bytes of padding.
There is no limit, except the size of your JVM to accommodate all the bytecode
16377. At least for a simple code like:
public class SwitchLimit {
public static void main(String[] args) {
int x = 0;
switch(x) {
case 0:
...
case 16376:
default:
}
System.out.println("done.");
}
}
You can have 16377 case statements in this example (not counting default), and if you add a case 16377:, the code won't compile, failing with the following error:
The code of method main(String[]) is exceeding the 65535 bytes limit
As others pointed out, this number will probably be significantly lower if your method actually does anything that makes sense.
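If you want to probe the limit on your own compiler, a generator in the same spirit as the enum test earlier on this page works. A minimal sketch (the class names and the starting count are arbitrary; raise count until javac reports "code too large"):
// Redirect the output to SwitchSizeTest.java, then compile it.
public class GenerateSwitch {
    public static void main(String[] args) {
        int count = 16000; // adjust and regenerate to find the limit empirically
        System.out.println("public class SwitchSizeTest {");
        System.out.println("  public static void main(String[] args) {");
        System.out.println("    int x = Integer.parseInt(args[0]);");
        System.out.println("    switch (x) {");
        for (int i = 0; i < count; i++) {
            System.out.println("      case " + i + ": break;");
        }
        System.out.println("      default: break;");
        System.out.println("    }");
        System.out.println("  }");
        System.out.println("}");
    }
}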
It depends on your requirement. You can have as many cases as the int range allows. Since the range of the int type is finite, beyond that the concept of integer wrap-around comes into the picture.
As the range of int is -2,147,483,648 to 2,147,483,647, you can have a case for each of those numbers, so the number of cases is limited when switching on an integer.
But if you want to use String in the switch, then you can have an unlimited number of cases, as Bohemian said.
The total number of cases will be the maximum number of values that an int can take. Have a look at the data types in Java.
So, you will have the entire int range as the possible number of case blocks.
There is no limit on case statements in a switch. At worst you can run out of heap space, but not in an easy way.
Reading the question, the answers, and the comments, I don't see why it is relevant. You can certainly have more cases than you can manually write. And, in the improbable case that you machine-generate your code, there are better choices than switches in Java.
Infinite!! There is no such restriction.

What is the max. capacity of a byte array?

I made a Java class which does addition, subtraction, multiplication, etc.
The numbers are like (155^199 [+,-,*,/] 555^669 [+,-,*,/] ..... [+,-,*,/] x^n);
each number is stored in a byte array, and the byte array can contain at most 66,442 values.
example:
(byte) array = [1][0] + [9][0] = [1][0][0]
(byte) array = [9][0] * [9][0] = [1][8][0][0]
My class file does not work if the number is bigger than that (for example: 999^999).
How can I solve this problem so that I can add much bigger numbers?
When the byte array reaches 66,443 values, the VM gives this error:
Caused by: java.lang.ClassNotFoundException, which is actually not the correct error description.
Well, it means that if I have a byte array with 66,443 values, the class cannot be read correctly.
Solved:
I used a multidimensional byte array to solve this problem:
array{array, ... nth-array} [+,-,*,/] nth-array{array, ... nth-array}
It takes only a few seconds to do an addition between big numbers.
Thank you!
A single method in Java is limited to 64KB of byte code. When you initialise an array in code, byte code is used to do this, which limits the maximum size of an array you can define in source code to roughly that size.
If you have a large byte array of values, I suggest you store it in an external file and load it at runtime. This way you can have a byte array of up to 2 GB. If you need more than this, you need an array of arrays.
What does your actual code look like? What error are you getting?
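A minimal sketch of the 'store it in a file and load it at runtime' suggestion above (the file name data.bin is a placeholder; Files.readAllBytes can read a file of up to roughly 2 GB into a single array):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LoadBytes {
    public static void main(String[] args) throws IOException {
        // Reads the whole file into one byte[]; this avoids the 64 KB bytecode limit
        // that applies to array initialisers written directly in source code.
        byte[] data = Files.readAllBytes(Paths.get("data.bin"));
        System.out.println("Loaded " + data.length + " bytes");
    }
}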
A Java byte array can hold up to 2^31-1 values, if there is that much contiguous memory available.
Each array can hold a maximum of Integer.MAX_VALUE values. If it crashes, I guess you see an OutOfMemoryError. Fix that by starting your Java VM with more heap space:
java -Xmx1024M <...>
(this example gives 1024 MB of heap space)
java.lang.ClassNotFoundException is thrown if the virtual machine needs a class and can't load it - usually because it is not on the class path (sometimes the case when we simply forget to compile a java source file..). This exception is totally unrelated to java array operations.
To continue the discussion in the comments section:
The name of the missing class is very important. At the line of code, where the exception is thrown, the VM tries to load the class ClassBigMath for the very first time and fails. The classloader can't find a file ClassBigMath.class on the classpath.
Double check first if the compiled java file is really present and double check that you don't have a typo in your source code. Typical reasons for this error:
We simply forget to compile a source file
A class file is on the classpath at compilation time but not at execution time
We do a Class.forName("MyClass") and have a typo in the class name
java.math.BigInteger is a much better solution for handling large numbers. Is there any reason you chose a byte array?
The maximum size of an array in Java is given by Integer.MAX_VALUE. This is 2^31-1 elements. You might get OOM exceptions for less if there is not enough memory free. Besides that, for what you are doing you might want to look at the BigInteger class. It seems you are doing your math in some form of decimal representation, which is not very memory efficient.
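If the goal is arithmetic on numbers like 999^999 rather than a hand-rolled digit array, a sketch using java.math.BigInteger (as suggested above; the operands are taken from the question) looks like this:
import java.math.BigInteger;

public class BigMathDemo {
    public static void main(String[] args) {
        BigInteger a = BigInteger.valueOf(155).pow(199);
        BigInteger b = BigInteger.valueOf(555).pow(669);
        System.out.println(a.add(b));       // addition of arbitrarily large integers
        System.out.println(a.multiply(b));  // multiplication works the same way
    }
}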
