Difference between packed switch and sparse switch dalvik opcode

Difference between packed switch and sparse switch dalvik opcode - java

I want to know the difference between packed switch and sparse switch opcodes in dalvik. Please if you can provide examples. The explanation provided by google is unclear to me.
packed-switch
sparse switch
Thanks.

It sounds as if packed-switch is equivalent to Java's tableswitch, and sparse-switch to lookupswitch.
A packed-switch uses a simple jump table, indexed by the form low + n, where low is the lowest test value among the case labels, and n is the input to the switch. The values at each index represent the bytecode offsets for each case. Finding the correct jump address is a constant-time operation.
A sparse-switch uses a sorted list of key-value pairs, where each key is a test value from a case label, and the values are jump offsets. Finding the correct jump target for a lookupswitch requires a binary search on the key, so it is a logarithmic-time operation.
The compiler will choose which to use. If the keys tend to be clustered or packed closely together, then a packed-switch (or, in Java terms, a tableswitch) can be emitted efficiently. But if the keys are sparse, and the range of values (high - low + 1) is large, then using a jump table would require a large block of bytecode, as all values in that range must exist in the jump table regardless of whether there is a corresponding case label. In these scenarios, the compiler will emit a sparse-switch (lookupswitch).
Interestingly, the Dalvik engineers chose to name these opcodes in a way that describes the key distributions for which they should be used, whereas the Java engineers chose names which describe the conceptual data structures that the bytecode operands resemble.
Let's look at some examples. Consider the following Java code, which will produce a tableswitch (and, when converted to Dalvik, a packed-switch):
static String packedSwitch(final int n) {
switch (n) {
case 5:
return "Five";
case 3:
return "Three";
case 1:
return "One";
default:
return "Other";
}
}
Conceptually, the payload for the packed-switch opcode would look something like this:
As you can see, it's fairly compact. Three out of the five slots point to actual case targets, with the remaining two jumping to the default target. But what if our test values were more spread out?
static String sparseSwitch(final int n) {
switch (n) {
case 500:
return "Five Hundred";
case 300:
return "Three Hundred";
case 100:
return "One Hundred";
default:
return "Other";
}
}
If the compiler tried to emit this as a packed-switch, the payload would look something like this:
Notice how only three out of a few hundred slots actually point to case labels from the original code. The rest are there simply to fill up the jump table. Not very space efficient, is it? That's why the compiler would emit a sparse-switch, which has a far more compact bytecode footprint for this particular example:
Now, that's much more reasonable, don't you think? The downside, however, is that instead of knowing exactly which index to jump to based on the input, we have to perform a binary search on the table until we find a matching test value. The larger the switch, the more significant the impact on performance, though the effect has a logarithmic curve.

Related

Can you connect two cases of a switch statement using a comma [duplicate]

This question already has answers here:
What are switch expressions and how are they different from switch statements?
(4 answers)
Closed 1 year ago.
I've been working with switch statements in my latest project and wanted to have two possible values execute the same code. So I looked up methods on how do that. Quite the only answer I found looks like this:
switch (element) {
case hi:
case hello:
//do something here
break;
}
But I experimented a bit more and found that my IDE doesn't complain when I did:
switch (element) {
case hi, hello:
//do something here
break;
}
So I tried it out and found that it works. So I was wondering if there is something wrong with using this method as I found nothing on it online. I would just love to be able to use it as it looks so much cleaner.

Recent versions of java (12 and up) have been putting in a ton of work on switch. There are now quite a few forms. Some of these forms have an underlying principle that strongly suggests some new syntactical feature is important, and for consistency's sake, if that feature can also be allowed for the other forms, the other forms are updated too.
History
The original java 1.0 switch is a straight copy from C (which explains their utterly bizarre and silly syntax; it's what C had, and java was designed to be comfortable to C coders. We can whinge about the fact that this means the syntax is daft, but it's hard to deny that the C style syntax, no matter how crazy it seems, has taken over the planet, with C, C#, Java, and Javascript added together taking up a huge chunk of the 'market' so to speak).
It:
Required that the thing you switch is on an int and only an int
That each case is a specific and constant number (no case 10-100, and definitely no case <5 or case x:).
It's a statement (The switch itself doesn't have a value, it's a way to conditionally jump to a statement).
Unlike the C version, treating the switch's nature as a weird GOTO-esque construct and thus allowing you to interleave a control structure such as a do/while loop through it is not actually allowed (fortunately, I guess)? - So, Duff's Device is not and has never been legal in java.
Recent updates:
Together with enums: Enums in switches
Java1.5 brought enums, and as enums are intended to replace int constants, they needed to be usable in switches. Given that enum values have an 'ordinal' which is an int, the implementation is trivial: switch (someEnum) { case ENUMVAL: } is light sugar around switch(someEnum.ordinal()) { case 1: }, where 1 is the ordinal value of ENUMVAL.
Only semi-recent: Strings
In JDK7, strings-in-switch support was added. The same principles apply: Only constants, null is not okay - and it's 'fast', in that its implemented as a jump table (contrast to a sequence of if/elseif statements, which requires a comparison for each one. A jump table just jumps, using a single comparison, straight to the right line or to the end if none match.
Now the new stuff (12+): As an expression
These days switch can be an expression. As in, the switch returns a thing. However, in order to make that work, each and every 'path' throughout your switch must return something, and it cannot possibly be the case (heh) that there is a missing case statement; after all, what should the expression resolve to then? Thus, in this 'mode', you must have exhausted all options. Interesting subcase of this is enums, where you can be 'exhaustive', and yet at runtime this can never be guaranteed; what if someone adds a value to an enum and recompiles JUST the enum and drops that in there? Now an erstwhile exhaustive (covers every option) switch no longer is:
int y = switch(textyNumber) {
case "one": yield 1;
case "two": yield 2;
default: yield 0;
};
yield is a new approach here. It's a lot like 'return' - it 'returns' from the switch, returning that value. It is a compiler error if that default clause is missing.
The weirdness with enums means that if you're exhaustive (cover every enum value), it compiles, but if you then add an enum value, do not recompile the code with the switch, and attempt to toss a newly created enum value through, the switch throws a java.lang.IncompatibleClassChangeError. The more you know.
arrow syntax
Because 'yield' is a tad unwieldy and it can be useful to just one-liner the desired expression, similar to how you can write e.g.:
Comparator<String> byLength = (a, b) -> a.length() - b.length();
that same arrow syntax is now legal in switches, and for consistency, it's even legal in statement-form switches:
int x = 1;
int y = switch(x) {
case 1 -> 2;
default -> 0;
};
switch (y) {
case 0 -> System.out.println("Interesting");
}
The arrow prevents fallthrough (it implies break), and takes only one statement. But a whole block also counts as one statement:
switch (y) {
case 0 -> {
System.out.println("one");
System.out.println("two");
}
case 1 -> {
System.out.println("three");
}
}
would print just 'one', 'two'.
multi-case
Commas can be used to separate values; this is similar to how you can use bars to separate exception types (catch (IOException | SQLException e) { ... }. Applies to all forms.
pattern matching
Now we get to the real intriguing stuff:
record Point(int x, int y) {}
...
Object p = new Point(10, 20);
switch (p) {
case Point(x, y) -> System.out.printf("Point [%d, %d]", x, y);
}
will soon be legal java! Now the thing that follows the 'case' is a 'template'. Can you smash whatever p resolves to into this template? If yes, the case matches, and the variables are filled in for you. This works with 'deconstructable' types (currently: only records; in future any type can provide a deconstructor), as well as an instanceof kinda deal:
Object o = someExpression;
switch (o) {
case Number n -> System.out.println("is a number. intvalue: " + n.intValue());
case String s -> System.out.println("is a string: " + s.toLowerCase());
}
I think that covers all the updates that are available for switch up to java v 17 and a bit beyond it, even.
NB: You find this info in extreme detail and as early as you want it by following the appropriate openjdk dev mailing lists. This stuff would be found mostly on amber-dev, with some of the pattern matching stuff found on valhalla-dev due to being closely associated with records, and records overlaps with value types (valhalla = value types for java, amber = general language updates). At java conferences somebody will usually give a breakdown of where this stands, reddit's /r/java tends to post it when someone from the OpenJDK team (for this stuff, you're looking at Brian Goetz generally) posts a major update on how the feature is envisioned / has been implemented. Follow Brian on twitter, that helps too. The release notes, and the umbrella JEP page associated with any new java release should also mention this stuff with links you can then follow.

Your original code segment uses fall-through to give the same response for both cases; some people find this difficult to read and follow. Java 12 gave us two related features that people find help clean up their switch statements, one of which you've discovered here:
cases can be comma separated rather than relying on fall-through
the arrow label can be used instead of the colon and break; commands.
Therefore, if you find yourself not liking fall-through, and not wanting to worry about whether you remembered your break commands, you can write the switch as follows:
switch (element) {
case hi, hello -> // do something here
case goodbye, bye -> // do something else here
}
Notice that in this case, we don't have any break statements, but there is no fall-through between the hi case and the bye case; they are separate due to the arrow label.

This is not available in Java, However You can do this thing with Kotlin, which is 100% Compatible with Java

Performance of arrays as alternative to switch cases

Usually, in problems that I encounter in school, there are ones that the expects us to use switch cases. Though I always use arrays as alternative for this kind of problem for the sake of shorter codes.
So a solution with Switch-Statement will look like this:
String day;
int dayNum = n%7; //n is an input
switch(dayNum){
case 0:
day = "Sunday";
break;
case 1:
day = "Monday";
break;
case 2:
day = "Tuesday";
break;
case 3:
day = "Wednesday";
break;
case 4:
day = "Thursday";
break;
case 5:
day = "Friday";
break;
case 6:
day = "Saturday";
break;
}
System.out.println("Today is " + day);
And a solution with array as alternative for this looks like this:
int dayNum = n%7; //n is an input
String[] day = {"Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"};
System.out.println("Today is " + day[dayNum]);
The second example is the kind of code I usually prefer to use.
What I'd like to ask is how does the 2 solutions compare when it comes to memory and time complexity?

The code you wrote is just bad use of switches.
switches are great when one, or preferably both, of these things holds:
The values you are triggering on aren't consecutive (i.e. case 384:, then case 30491:, etcetera. Trivial example is when switching on a string for a certain command.
The thing you want to do is a bunch of code (vs. returning a single result, as here).
For what its worth, you've also written the case block in a rather convoluted fashion. If you have a separate method for this, you could write:
switch (foo) {
case 0: return "Sunday";
case 1: return "Monday";
}
Also (new in.. java 14?), you can write this too:
String dayName = switch(foo) {
case 0 -> "Sunday";
case 1 -> "Monday";
default -> throw new IllegalArgumentException("Unknown day");
};
Lastly, this code should not exist. Because java.time.DayOfWeek already exists. You've badly reinvented the wheel.
Performance
When talking about 7 elements it just doesn't matter. Other factors dominate, simple as that.
As a general principle (which doesn't kick in until many many hundreds, maybe thousands, see footnote!), the reason your array-based code is linear as you wrote it is solely because java would not (currently; future java changes are afoot that may change this) realize that the line that 'makes' your string array with day names can be executed once for the lifetime of a JVM boot and never again; in other words, this code, under the hood, compiles down to making a new string array with 7 slots and then assigning 7 constants to the slots (those constants are the references to the strings, which are calculated "once per VM lifetime").
If you move that array into a static field, that goes away and they're both O(1) performance (the same 'speed' whether you have 7 days or 7 million days), because they both just do a single lookup. No looping.
IMPORTANT PERFORMANCE FOOTNOTE: Computers are incredibly complex and act nothing like a von neumann model. Predicting the performance of code based on common sense therefore usually steers you extremely wrong. There are therefore 3 important rules to keep in mind:
Write code the way java programmers the world over tend to write it, and keep things flexible. IF performance issues occur, you will probably need to change the relevant part of your app. This is easier if your code is flexible. Most rules of thumb about performance you read about make your code less flexible and are therefore bad! - also, the JVM optimizes code, and does this essentially by being a giant pattern matching machine, detecting patterns it knows how to run well. The programmers who make this pattern matching optimizing machine tend to write patterns that are commonly used in java code. Thus, idiomatic java code tends to be fast in practice.
Nobody can look at code and just know how it ends up performing in real life, because it's too complex. Don't take my word for it; the very engineers that write the VM say this all the time. If they can't figure it out, you stand no chance whatsoever. Therefore, a profiler report or JMH timing result is required before even thinking about letting code performance ideas determine how you write your code. See the rule above: Without this, the best, fastest code is the cleanest, most flexible, most easy to read code.
The one exception is algorithmic complexity (big O notation). This is a complex topic (generally, university-level informatics and very math heavy), well beyond an SO answer, but plenty of online resources can be found if you want to dive in. But do note that the big-O factor often doesn't start controlling until you get a ton of data. For example, in this case technically your array-based code is O(n) (because you create the array inside the code, every time you run it), whereas the switch is O(1), but given that n is 7 here, that's irrelevant. It probably won't get relevant until you get to hundreds of items.

Cyclomatic Compleity IF ELSE vs SWITCH CASE

Many code analysis tools like sonar throw errors for switch case if used instead of multiple is-else for cyclomatic complexity . Is there a number which should be a threshold of when one should use if else and when one should use switch case ?
Say if its a case of 50 , to reduce cyclomatic complexity one should use switch case ? Would be good if you could support your answer with an example ?

To answer the question with another question.
Would you rather read:
if(cond1)..
else if(cond2)...
else if(cond50)...
else ...
or
value = map.get(key)
If there are so many possible cases, than neither the switch or if seem appropriate and should therefore be replaced with key-value structure or state machine pattern - if branching is really complicated.
You can find an example implementation here:
fsm
As to when to use if or switch, keep in mind that switch cannot accept the following types:
long, double, float, boolean so when you have such an input, switch is obviously not a choice.
I wouldn't use number as a metric to use switch or if statement. There is one exception however, if an input can have only 2 possible values it is better to use if. Even Sonar should suggest you this.
That said, switch is usually more readable than the series of if statements. As an important decision factor, other than readability, within switch statement there can be groups of values which might require same execution path. It is basically safer version of the goto. Unlike if, where each outcome is usually bound to specific execution path.

Switch statement in Java

How many cases are possible for a switch statement in Java? For example if we are checking an integer how many case blocks are possible?

The bound you will most likely meet first is that of the maximum number of entries in the constant pool per class which is 65535. This will allow for a few thousand case blocks of small complexity. The constant pool contains one entry for each numeric or string literal that is used at least once in the class but also one or more entries for all field, method and/or class reference as these entries are composed on behalf of other constants that must be present in the constant pool as well. I.e. a method reference entry consists of a reference to a string entry for the signature of the method and a reference to the class entry of the declaring class. The class entry itself again references a string entry for the class name.
See: Limitations of the Java virtual machine and The Constant Pool in the Java Virtual Machine Specification
The absolute upper bound for a switch ignoring or reusing the code in the case blocks is slightly less than 2^30 cases since each case has a jump target which is a signed 32 bit integer (see tableswitch and lookupswitch instructions) and thus needs 4 bytes per case and the byte code size for each method is limited to slightly less than 2^32 bytes. This is because the byte code is wrapped in a code attribute and the length of a attribute is given as a unsigned 32 bit integer. This size is fruther reduced because the code attribute has some header information, the method needs some entry and exit code and the tableswitch statement needs some bytes for itself with its min/max values and at most 3 bytes of padding.

There is no limit, except the size of your JVM to accommodate all the bytecode

16377. At least for a simple code like:
public class SwitchLimit {
public static void main(String[] args) {
int x = 0;
switch(x) {
case 0:
...
case 16376:
default:
}
System.out.println("done.");
}
}
You can have 16377 case statements in this example (not counting default) and if you add a case 16377:, the code won't compile with the following error:
The code of method main(String[]) is exceeding the 65535 bytes limit
As others pointed out, this number will probably be significantly lower if your method actually does anything that makes sense.

It depends on your requirement. you can have that many cases of range int type. As the range of int type is finite and after that concept of integer cycle will come into the picture.
As the size of int ranges from -2,147,483,648 to 2,147,483,647, so you can have a case for each number of them. So there is a limited number of case in case of integer.
But if you want to use String in case, then you can have unlimited number of cases as said by Bohemian.

The total number of cases will be maximum number that int can take depending on the hardware. Have a look at datatypes in java
So, you will have the entire range as possible number of case blocks.

No limit of case statements in a switch. At worst you can get heap space but not in easy way.

Reading the question, the answers, and the comments, I don't see why it is relevant. You can certainly have more cases than you can manually write. And, in the improbable case that you machine-generate your code, there are better choices than switches in Java.

Infinite!! There is no such restriction.

If vs if-else vs switch when each condition results in a return

Can anyone explain what the trade off (even if it's negligible) would be between using if, if else, or switch in a sizable block of code similar to the following? Is the situation different if it were comparing a String or another Object instead of an int? The examples are in Java but it is meant as a general question.
EDIT
As several answers stated, a switch is going to be faster and should probably be used if there are more than a few cases. However, nobody has commented on if vs if else when in a long chain like this. What sparked this question is that I am frequently creating these blocks where a switch can't be used because most of the cases require multiple expressions. I guess excluding the else feels sloppy, but it isn't really necessary so why include it?
public String getValueString(int x) {
if (x == 1) return "one";
if (x == 2) return "two";
if (x == 3) return "three";
if (x == 4) return "four";
...
return null;
}
VS
public String getValueString(int x) {
if (x == 1) return "one";
else if (x == 2) return "two";
else if (x == 3) return "three";
else if (x == 4) return "four";
...
return null;
}
VS
public String getValueString(int x) {
switch(x) {
case 1: return "one";
case 2: return "two";
case 3: return "three";
case 4: return "four";
...
}
return null;
}

If you have a lot of cases, then the switch approach is the preferred method. The reason is because the first two requires essentially a linear search through all the if-statements. So it is O(N) to the number of cases you have.
On the other hand, switch statements are optimized differently and can be either O(log(N)) or even O(1) for finding that correct case.
How can the compiler achieve O(log(N)) or even O(1)?
Binary search of the case values will allow it to be done in O(log(N)).
If the case values are dense enough, the compiler may even use a jump table indexed by the case variable. In that case it is O(1).

Most compilers will optimize the examples in your question to be nearly or even exactly the same. The issue, therefore, is one of readability.
If you have one or two cases, an if statement usually makes sense. If you have many, especially if the code for each case is small, then a switch statement tends to be more economical in terms of code required and can be easier to read.
But, at least up to a point, readability is a matter of personal preference.

Switch is faster than if/else blocks when it can be used. When there are more than 5 entries, it is implemented as a lookup. This provides some information regarding performance: Is "else if" faster than "switch() case"?
I believe it is also more readable in these cases.

For the fewer items there will be no significant performance difference between if statements and switch statement. In switch statement every item is accessed directly, in same time, so the last item will take the same time as the first item. In if statement accessing the last item will take a longer time than the first one because it has to traverse through all the items before it. Anyway the latency will not be noticeable for fewer items as in your example.
Here is a good discussion on this topic. Have a look.

If you have a large number of conditions similar to the ones in the provided examples, then I would recommend a Map.
Map<Integer,String> map = new HashMap<Integer,String>();
map.put(1,"one");
map.put(2,"two");
map.put(3,"three");
map.put(4,"four");
I am not sure about the trade-off. I would imagine it would behave similarly to a switch statement. It would reduce the cyclomatic complexity of your code which should make it more readable.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.