Why is there no Optional.mapToInt() in Java 8?

In Java 8 streams I can use the mapToInt method to create an IntStream, which returns an OptionalInt for some operations (like findFirst). Why isn't there anything similar in Optional?
int i = Stream
        .of("1")                     // just as an example
        .mapToInt(Integer::parseInt) // mapToInt exists for streams
        .findFirst()                 // this even returns an OptionalInt!
        .getAsInt();                 // quite handy

int j = Optional
        .of("1")                     // same example
        .map(Integer::parseInt)      // no mapToInt available
        .get().intValue();           // not as handy as for streams

Apparently a handful of additional methods will appear in Optionals in Java-9. However, it's unlikely that mapToInt will be added. I discussed this problem a few days ago on core-libs-dev. Here's Paul Sandoz's answer:
I don’t wanna go there, my response is transform Optional* into a *Stream. An argument for adding mapOrElseGet (notice that the primitive variants return U) is that other functionality can be composed from it.
And later:
I think it’s fine to pollute OptionalInt etc with Optional but i want to avoid it for the other direction.
In general I think it's reasonable. The purpose of primitive streams is to improve performance when you process many primitive values. For Optional, however, the performance gain of using the primitive value is marginal if it exists at all (compared to streams, there is a much bigger chance that the extra boxing will be optimized out by the JIT compiler). Also, even though project Valhalla will not appear in Java-9, it's gradually moving forward, and it's possible that in Java-10 we will finally see generics over primitives, so these primitive optionals will become completely unnecessary. In this context adding more interoperability between the object Optional and the primitive OptionalInt seems unnecessary.

It makes sense to have specializations in the Stream API as a stream may represent bulk operations processing millions of elements, thus the performance impact can be dramatic. But as far as I know, even this decision wasn’t without a controversy.
For an Optional, carrying at most one element, the performance impact does not justify additional APIs (if there ever is an impact). It's not quite clear whether OptionalInt, etc. are really necessary at all.
Regarding the convenience, I can’t get your point. The following works:
int j = Optional.of("1").map(Integer::parseInt).get();
Your proposal is to add another API which allows rewriting the above statement as
int j = Optional.of("1").mapToInt(Integer::parseInt).getAsInt();
I don’t see how this raises the convenience…
But following the logic, with Java 9, you can write
int j = Optional.of("1").stream().mapToInt(Integer::parseInt).findFirst().getAsInt();
which raises this kind of “convenience” even more…

The following also works quite nicely:
int j = Optional
        .of("1")
        .map(Integer::parseInt)
        .map(OptionalInt::of)        // -> Optional<OptionalInt>
        .orElse(OptionalInt.empty()) // -> OptionalInt
        .getAsInt();
The trick is to map Optional<Integer> to Optional<OptionalInt> and then unwrap the inner OptionalInt. Thus, as far as the Optional is concerned, no primitive int is involved, so it works with Java generics.
One advantage of this approach is that it doesn't need autoboxing. That's an important point for me, as we enabled warnings for autoboxing in our project, primarily to prevent unnecessary boxing and unboxing operations.
I stumbled upon this problem and the solution while implementing a public method that returns an OptionalInt, where the implementation uses another method returning Optional<Something>.
I didn't want to return Optional<Integer> from my public method, so I searched for something like Optional.mapToInt just like you.
But after reading the responses of Holger and Tagir Valeev, I agree that it's perfectly reasonable to omit such a method in the JDK. There are enough alternatives available.
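For illustration, here is a minimal sketch of that situation (the Something type and the findSomething and findCount names are made up for the example, not taken from the answers above; it assumes java.util.Optional and java.util.OptionalInt are imported):

class Example {
    // hypothetical internal lookup that may find nothing
    private Optional<Something> findSomething(String key) {
        // ... real lookup omitted for the sketch
        return Optional.empty();
    }

    // the public API exposes OptionalInt; the Optional<OptionalInt> trick bridges the two worlds
    public OptionalInt findCount(String key) {
        return findSomething(key)
                .map(s -> OptionalInt.of(s.count())) // -> Optional<OptionalInt>
                .orElse(OptionalInt.empty());        // -> OptionalInt
    }

    record Something(int count) { }
}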

AtomicInteger & lambda expressions in single-threaded app

I need to modify a local variable inside a lambda expression in a JButton's ActionListener and since I'm not able to modify it directly, I came across the AtomicInteger type.
I implemented it and it works just fine but I'm not sure if this is a good practice or if it is the correct way to solve this situation.
My code is the following:
newAnchorageButton.addActionListener(e -> {
    AtomicInteger anchored = new AtomicInteger();
    anchored.set(0);
    cbSets.forEach(cbSet ->
        cbSet.forEach(cb -> {
            if (cb.isSelected())
                anchored.incrementAndGet();
        })
    );
    // more code where I use the 'anchored' variable...
});
I'm not sure if this is the right way to solve this since I've read that AtomicInteger is used mostly for concurrency-related applications and this program is single-threaded, but at the same time I can't find another way to solve this.
I could simply use two nested for-loops to go over those collections, but I'm trying to reduce the method's cognitive complexity as much as I can according to the SonarLint VS Code extension, and keeping those for-loops theoretically increases the method's complexity and therefore hurts its readability and maintainability.
Replacing the for-loops with lambda expressions reduces the cognitive complexity but maybe I shouldn't pay that much attention to it.
While it is safe enough in single-threaded code, it would be better to count them in a functional way, like this:
long anchored = cbSets.stream()         // get a stream of the sets
        .flatMap(List::stream)          // flatten to list of cb's
        .filter(JCheckBox::isSelected)  // only selected ones
        .count();                       // count them
Instead of mutating an accumulator, we limit the flattened stream to only the ones we're interested in and ask for the count.
More generally, though, it is always possible to sum things up or generally aggregate the values without a mutable variable. Consider:
record Country(int population) { }
int totalPopulation = countries.stream()
        .mapToInt(Country::population)
        .reduce(0, Math::addExact);
Note: we never mutate any values; instead, we combine each successive value with the preceding one, producing a new value. One could use sum() but I prefer reduce(0, Math::addExact) to avoid the possibility of overflow.
and leaving those for-loops theoretically increases the method complexity and therefore its readability and maintainability.
This is obvious horsepuckey. x.forEach(foo -> bar) is not 'cognitively simpler' than for (var foo : x) bar; - you can map each AST node straight over from one to the other.
If a definition is being used to define complexity which concludes that one is significantly more complex than the other, then the only correct conclusion is that the definition is silly and should be fixed or abandoned.
To make it practical: Yes, introducing AtomicInteger, whilst performance wise it won't make one iota of difference, does make the code way more complicated. AtomicInteger's simple existence in the code suggests that concurrency is relevant here. It isn't, so you'd have to add a comment to explain why you're using it. Comments are evil. (They imply the code does not speak for itself, and they cannot be tested in any way). They are often the least evil, but evil they are nonetheless.
The general 'trick' for keeping lambda-based code cognitively easily followed is to embrace the pipeline:
You write some code that 'forms' a stream. This can be as simple as list.stream(), but sometimes you do some stream joining or flatmapping a collection of collections.
You have a pipeline of operations that operate on single elements in the stream and do not refer to the whole or to any neighbour.
At the end, you reduce (using collect, reduce, max - some terminator) such that the reducing method returns what you need.
The above model (and the other answer follows it precisely) tends to result in code that is as readable/complex as the 'old style' code, and rarely (but sometimes!) more readable, and significantly less complicated. Deviate from it and the result is virtually always considerably more complicated - a clear loser.
Not all for loops in Java fit the above model. If it doesn't fit, then trying to force that particular square peg into the round hole will take a lot of effort and almost always results in code that is significantly worse: either an order of magnitude slower or considerably more cognitively complicated.
It also means that it is virtually never 'worth' rewriting perfectly fine readable non-stream based code into stream based code; at best it becomes a percentage point more readable according to some personal tastes, with no significant universally agreed upon improvement.
Turn off that silly linter rule. The fact that it considers the above 'less' complex, and that it evidently determines that for (var foo : x) bar; is 'more complicated' than x.forEach(foo -> bar) is proof enough that it's hurting way more than it is helping.
I have the following to add to the two other answers:
Two general good practices in your code are in question:
Lambdas shouldn't be longer than 3-4 lines
Except in some precise cases, lambdas of stream operations should be stateless.
For #1, when a lambda is getting too long, consider extracting its code to a private method, for example.
You will probably gain in readability, and you will also probably gain in better separating UI from business logic.
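As a sketch of that (the onNewAnchorage name is hypothetical, and the counting simply reuses the stream from the other answer):

// in the UI setup code, the listener just delegates:
newAnchorageButton.addActionListener(e -> onNewAnchorage());

// the extracted private method holds the actual logic:
private void onNewAnchorage() {
    long anchored = cbSets.stream()
            .flatMap(List::stream)
            .filter(JCheckBox::isSelected)
            .count();
    // more code where the 'anchored' value is used...
}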
For #2, you are probably not concerned since you are working in a single thread at the moment, but streams can be parallelized, and they may not always execute exactly as you think they do.
For that reason, it's always better to keep the code stateless in stream pipeline operations. Otherwise you might be surprised.
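To make that concrete, here is a small sketch (not from the answer above, assuming the usual java.util and java.util.stream imports) of a stateful lambda going wrong under parallelism, next to the stateless alternative:

List<Integer> results = new ArrayList<>();
IntStream.range(0, 100_000).parallel()
        .forEach(i -> results.add(i));   // stateful: ArrayList is not thread-safe, so elements may be lost or an exception thrown

List<Integer> safe = IntStream.range(0, 100_000).parallel()
        .boxed()
        .collect(Collectors.toList());   // stateless pipeline: always yields 100,000 elements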
More generally, streams are very good, very concise, but sometimes it's just better to do the same with good old loops.
Don't hesitate to come back to classic loops.
When Sonar tells you that the complexity is too high, in fact, you should try to factorize your code: split into smaller methods, improve the model of your objects, etc.

Valid usage of Optional type in Java 8

Is this a valid (intended) usage of Optional type in Java 8?
String process(String s) {
    return Optional.ofNullable(s).orElseGet(this::getDefault);
}
I'll take another swing at this.
Is this a valid usage? Yes, in the narrow sense that it compiles and produces the results that you're expecting.
Is this intended usage? No. Now, sometimes things find usefulness beyond what they were originally for, and if this works out, great. But for Optional, we have found that usually things don't work out very well.
Brian Goetz and I discussed some of the issues with Optional in our JavaOne 2015 talk, API Design With Java 8 Lambdas and Streams:
link to video
link to slides
The primary use of Optional is as follows: (slide 36)
Optional is intended to provide a limited mechanism for library method return types where there is a clear need to represent "no result," and where using null for that is overwhelmingly likely to cause errors.
The ability to chain methods from an Optional is undoubtedly very cool, and in some cases it reduces the clutter from conditional logic. But quite often this doesn't work out. A typical code smell is, instead of the code using method chaining to handle an Optional returned from some method, it creates an Optional from something that's nullable, in order to chain methods and avoid conditionals. Here's an example of that in action (also from our presentation, slide 42):
// BAD
String process(String s) {
    return Optional.ofNullable(s).orElseGet(this::getDefault);
}

// GOOD
String process(String s) {
    return (s != null) ? s : getDefault();
}
The method that uses Optional is longer, and most people find it more obscure than the conventional code. Not only that, it creates extra garbage for no good reason.
Bottom line: just because you can do something doesn't mean that you should do it.
Since this is more or less an opinion-based question, I'll throw mine in. If you're trying to say
if (id == 1) {
    Foo f = new Foo(id, "Bar", "US");
    return "Bar".equals(f.getName()) && "US".equals(f.getCountryCode());
} else {
    return false;
}
then just say that. Making things "functional" doesn't automatically make things clearer or better. By introducing a needless Optional, a couple lambdas, and some Optional methods that I had to look up, you've made the code more convoluted and difficult to understand. I don't think the designers of Java "intended" for people to use Optional to help make code more obscure.
EDIT: After reading some responses, I think it's worth adding some comments. This is not a functional programming idiom I'm familiar with, which would make it harder to understand. The idioms I am familiar with mostly involve Java streams, or (in other languages) functional idioms applied to multiple values in arrays or lists or other collections of multiple values. In those cases, once you get past the unfamiliarity, the functional syntax can be seen as an improvement because it allows some details to be hidden (loop indexes, iterators, running pointers, accumulator variables). So overall, it can simplify things. This example, by itself, doesn't do any such simplification.
However, some of the Optional features are useful in stream contexts. Suppose we had a parseInt() method that returns an Optional<Integer>, which is empty if the input string is invalid. (Java 8 really should have provided this.) This would make it easy to take an array of strings and produce an array of integers in which the strings that don't parse are simply eliminated from the result--use parseInt in a stream map(), and use a stream filter to filter out the empty Optionals. (I've seen multiple StackOverflow questions asking how to do this.) If you want to keep only the positive values, you could use an Optional.filter() to change the nonpositives to Optional.empty() before using the stream filter (although in this case, you could add another stream filter afterwards, but in a more complex case the Optional filter could be more useful). That's what I see as the main benefit of Optional from a functional standpoint. It allows you to work with a collection of values all at once, by giving you a way to represent "non-values" and write a function that will still work with them. So I guess the main use of Optional, besides a replacement for null, would be to represent empty spaces in a sequence of values while you're applying functions to the entire sequence as a whole.
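A sketch of what that could look like (tryParseInt is a hypothetical helper here, since Java 8 does not provide one; it assumes the usual java.util and java.util.stream imports):

static Optional<Integer> tryParseInt(String s) {
    try {
        return Optional.of(Integer.parseInt(s));
    } catch (NumberFormatException e) {
        return Optional.empty();
    }
}

// keep only the strings that parse to a positive int
List<Integer> positives = Stream.of("1", "oops", "-3", "42")
        .map(s -> tryParseInt(s))            // Stream<Optional<Integer>>
        .map(opt -> opt.filter(i -> i > 0))  // Optional.filter: non-positive values become empty
        .filter(Optional::isPresent)         // stream filter: drop the empty Optionals
        .map(Optional::get)
        .collect(Collectors.toList());       // [1, 42]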
Asking whether it's "valid" is rather opinion-based, but as to whether it's the intended use case: no, it's not.
Brian Goetz, Oracle's language architect for Java, has stated that the use case for Optional is for when you need a "no value" marker, and when using null for this is likely to cause errors. Specifically, if a reasonable user of your method is not likely to consider the possibility that its result is null, then you should use Optional. It was explicitly not intended to be a general "Maybe"-type object, as you're using it here.
In your case, the method that returns the Optional is private. That means it can only be used by the implementers of the class, and you can assume that they have good knowledge of the class' methods — including which of them may return null. Since there's no reasonable risk of confusion, Brian Goetz would (probably) say that he would not consider this a valid use case.
It's a little contrived, but 'valid' (as in 'syntactically valid'); however, as @yshavit pointed out, Optional was intended for use in library development.
My previous answer mainly objected that FP-style code is difficult to read. Below is a commented version (a little verbose, because the comments are taken from the Javadoc), which is still much easier to read IMHO. (The second version has no comments, but at least the alignment helps readability.)
private boolean isFooValid(final Integer id) {
    return getFoo(id)
            // filter: if 'f' matches the predicate, return an Optional with f, otherwise an empty Optional
            .filter(f -> "Bar".equals(f.getName()) && "US".equals(f.getCountryCode()))
            // map: if a value is present, apply the provided mapping function to it,
            // and if the result is non-null, return an Optional describing the result
            .map(f -> true)
            // orElse: return the value if present, otherwise return other
            .orElse(false);
}
Or at least line it up so its more apparent what is going on and easier to read.
private boolean isFooValid(final Integer id) {
    return getFoo(id)
            .filter(f -> "Bar".equals(f.getName()) && "US".equals(f.getCountryCode()))
            .map(f -> true)
            .orElse(false);
}

Why does Java CharSequence.chars() return an IntStream? [duplicate]

In Java 8, there is a new method String.chars() which returns a stream of ints (IntStream) that represent the character codes. I guess many people would expect a stream of chars here instead. What was the motivation to design the API this way?
As others have already mentioned, the design decision behind this was to prevent the explosion of methods and classes.
Still, personally I think this was a very bad decision. Given that they did not want to add a CharStream (which is reasonable), I would have preferred different methods instead of chars():
Stream<Character> chars(), which gives a stream of boxed characters and has a slight performance penalty.
IntStream unboxedChars(), which would be used for performance-critical code.
However, instead of focusing on why it is done this way currently, I think this answer should focus on showing a way to do it with the API that we have gotten with Java 8.
In Java 7 I would have done it like this:
for (int i = 0; i < hello.length(); i++) {
    System.out.println(hello.charAt(i));
}
And I think a reasonable method to do it in Java 8 is the following:
hello.chars()
        .mapToObj(i -> (char) i)
        .forEach(System.out::println);
Here I obtain an IntStream and map it to an object via the lambda i -> (char) i; this automatically boxes it into a Stream<Character>, and then we can do what we want, and still use method references as a plus.
Be aware though that you must use mapToObj; if you forget and use map, nothing will complain, but you will still end up with an IntStream, and you might be left wondering why it prints the integer values instead of the strings representing the characters.
Other ugly alternatives for Java 8:
If you stay in an IntStream and want to print the values in the end, you cannot use method references for printing anymore:
hello.chars()
        .forEach(i -> System.out.println((char) i));
Moreover, using a method reference to your own method does not work anymore either! Consider the following:
private void print(char c) {
    System.out.println(c);
}
and then
hello.chars()
        .forEach(this::print);
This will give a compile error, because converting int to char is a possibly lossy conversion that is not applied automatically.
Conclusion:
The API was designed this way to avoid adding a CharStream. I personally think the method should have returned a Stream<Character>, and the current workaround is to use mapToObj(i -> (char) i) on the IntStream to be able to work with the characters properly.
The answer from skiwi covered many of the major points already. I'll fill in a bit more background.
The design of any API is a series of tradeoffs. In Java, one of the difficult issues is dealing with design decisions that were made long ago.
Primitives have been in Java since 1.0. They make Java an "impure" object-oriented language, since the primitives are not objects. The addition of primitives was, I believe, a pragmatic decision to improve performance at the expense of object-oriented purity.
This is a tradeoff we're still living with today, nearly 20 years later. The autoboxing feature added in Java 5 mostly eliminated the need to clutter source code with boxing and unboxing method calls, but the overhead is still there. In many cases it's not noticeable. However, if you were to perform boxing or unboxing within an inner loop, you'd see that it can impose significant CPU and garbage collection overhead.
When designing the Streams API, it was clear that we had to support primitives. The boxing/unboxing overhead would kill any performance benefit from parallelism. We didn't want to support all of the primitives, though, since that would have added a huge amount of clutter to the API. (Can you really see a use for a ShortStream?) "All" or "none" are comfortable places for a design to be, yet neither was acceptable. So we had to find a reasonable value of "some". We ended up with primitive specializations for int, long, and double. (Personally I would have left out int but that's just me.)
For CharSequence.chars() we considered returning Stream<Character> (an early prototype might have implemented this) but it was rejected because of boxing overhead. Considering that a String has char values as primitives, it would seem to be a mistake to impose boxing unconditionally when the caller would probably just do a bit of processing on the value and unbox it right back into a string.
We also considered a CharStream primitive specialization, but its use would seem to be quite narrow compared to the amount of bulk it would add to the API. It didn't seem worthwhile to add it.
The penalty this imposes on callers is that they have to know that the IntStream contains char values represented as ints and that casting must be done at the proper place. This is doubly confusing because there are overloaded API calls like PrintStream.print(char) and PrintStream.print(int) that differ markedly in their behavior. An additional point of confusion possibly arises because the codePoints() call also returns an IntStream but the values it contains are quite different.
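To make that last point concrete, here is a quick sketch (not from the original answer) of how chars() and codePoints() differ for a supplementary character:

String s = "A\uD83D\uDE00";   // "A" followed by U+1F600, a supplementary character encoded as a surrogate pair

s.chars().forEach(c -> System.out.print(c + " "));        // prints: 65 55357 56832   (UTF-16 code units)
System.out.println();
s.codePoints().forEach(cp -> System.out.print(cp + " ")); // prints: 65 128512        (code points)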
So, this boils down to choosing pragmatically among several alternatives:
We could provide no primitive specializations, resulting in a simple, elegant, consistent API, but which imposes a high performance and GC overhead;
we could provide a complete set of primitive specializations, at the cost of cluttering up the API and imposing a maintenance burden on JDK developers; or
we could provide a subset of primitive specializations, giving a moderately sized, high performing API that imposes a relatively small burden on callers in a fairly narrow range of use cases (char processing).
We chose the last one.

Idiomatic Collection iteration in Java 8

What is considered idiomatic iteration of a Collection in Java 8, and why?
for (String foo : foos) {
    String bar = bars.get(foo);
    if (bar != null)
        System.out.println(foo);
}

or

foos.forEach(foo -> {
    String bar = bars.get(foo);
    if (bar != null)
        System.out.println(foo);
});
In the comment thread to this answer, user Bringer128 mentioned these questions regarding a similar issue in C#:
foreach vs someList.Foreach(){}
Generic lists: foreach or list.ForEach?
I would caution against applying the C# discussion to Java. The discussion is interesting, to be sure, and the issues are superficially similar. However, Java and C# are different languages and thus different considerations apply.
For example, this answer mentions that the C# foreach statement is preferable, because the compiler might be able to optimize the loop better in the future. This is not true of Java. In Java, the "enhanced for" loop is defined to be syntactic sugar for getting an Iterator and calling its hasNext and next methods repeatedly. This pretty much guarantees a minimum of two method calls per loop iteration (although there is a possibility for the JIT to inline small methods).
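Roughly, the translation looks like this (a sketch; the JLS defines the exact desugaring):

// the enhanced for loop
for (String foo : foos) {
    System.out.println(foo);
}

// is essentially equivalent to
for (Iterator<String> it = foos.iterator(); it.hasNext(); ) {
    String foo = it.next();
    System.out.println(foo);
}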
Another example is from this answer, which mentions that in C# it is legal for the delegate invoked by a list's ForEach method to modify the list that it's iterating. In Java there is a blanket prohibition of "interference" with the stream source for the Stream.forEach method, whereas for the enhanced-for loop, the behavior of modifying the underlying list (or whatever) is determined by the Iterator. Many are fail-fast and will throw ConcurrentModificationException if the underlying list is modified during iteration. Others will silently give unexpected results.
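For example, with a typical fail-fast collection such as ArrayList (a sketch, not taken from the answer):

List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
for (String s : list) {
    if (s.equals("a")) {
        list.remove(s);   // structural modification while the iterator is live
    }
}
// the next call to the iterator's next() throws ConcurrentModificationException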
In any case, don't read the C# discussion and assume that similar reasoning applies to Java.
Now, to answer the question. :-)
I think it's too early to declare one style to be idiomatic or preferable to another at this point. Java 8 has just been released and very few people have much experience with it. Lambdas are new and unfamiliar, and this will make many programmers uncomfortable. They'll thus want to stick to their tried-and-true for-loops. That's perfectly sensible. In a few years, though, after everyone gets used to lambdas, it might be that for-loops will start to look distinctly old-fashioned. Time will tell.
(I think this happened with generics. When they were new, they were intimidating and scary, especially wildcards. Nowadays, though, non-generic code looks distinctly old-fashioned, and to me it has a musty odor about it.)
I have an early sense of how this might turn out. Of course, I might be wrong though.
I'd say that for short loops where the computation is fixed, such as the question posted initially:
for (String foo : foos)
    System.out.println(foo);
it just doesn't matter. This could be rewritten as
foos.forEach(foo -> System.out.println(foo));
or even
foos.forEach(System.out::println);
But really, this code is so simple that it's hard to argue that one way is clearly better.
There are situations where the scales tip in one direction or another. If the loop body can throw a checked exception, a for-loop is clearly better. If the loop body is pluggable (e.g., the Consumer is passed in as a parameter) or if internal iteration has different semantics (e.g., locking of a synchronized list during the entire call to forEach) then the new forEach approach has the edge.
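To illustrate the checked-exception point (a sketch; the writeAll and writeAllLambda names are made up, and Writer.write may throw IOException):

// with a for-loop, the checked exception simply propagates:
void writeAll(List<String> foos, Writer w) throws IOException {
    for (String foo : foos) {
        w.write(foo);   // IOException is declared on the enclosing method
    }
}

// with forEach, the lambda must handle it, because Consumer.accept declares no checked exceptions:
void writeAllLambda(List<String> foos, Writer w) {
    foos.forEach(foo -> {
        try {
            w.write(foo);
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // forced to wrap or swallow it
        }
    });
}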
The updated example,
for (String foo : foos) {
    String bar = bars.get(foo);
    if (bar != null)
        System.out.println(foo);
}
is a bit more complicated, but only slightly. I would not write this using a multi-line lambda:
foos.forEach(foo -> {
    String bar = bars.get(foo);
    if (bar != null)
        System.out.println(foo);
});
This offers no advantage over the straight for-loop, in my opinion, and the different semantics of the lambda are signaled by the little arrow way up in the corner of the first line. However, (similar to Bringer128's answer) I would recast this from a big forEach block into a stream pipeline:
foos.stream()
        .filter(foo -> bars.get(foo) != null)
        .forEach(System.out::println);
I think the lambda/streams approach starts to show a bit of an advantage here, but only a bit, as this is still a really simple example. Using lambda/streams replaces some conditional control logic with a data filtering operation. This might make sense for some operations, but not for others.
The difference between the approaches starts to become clearer as things get more complicated. The simple examples are so simple that it's obvious what they do. Real-world examples can be considerably more complex. Consider this code from the method Class.getEnclosingMethod of the JDK (scroll to lines 1023-1052):
Class<?> enclosingCandidate = enclosingInfo.getEnclosingClass();
// ...
for (Method m : enclosingCandidate.getDeclaredMethods()) {
    if (m.getName().equals(enclosingInfo.getName())) {
        Class<?>[] candidateParamClasses = m.getParameterTypes();
        if (candidateParamClasses.length == parameterClasses.length) {
            boolean matches = true;
            for (int i = 0; i < candidateParamClasses.length; i++) {
                if (!candidateParamClasses[i].equals(parameterClasses[i])) {
                    matches = false;
                    break;
                }
            }
            if (matches) { // finally, check return type
                if (m.getReturnType().equals(returnType))
                    return m;
            }
        }
    }
}
throw new InternalError("Enclosing method not found");
(Some security checks and comments have been omitted for the sake of the example.)
Here we have a couple nested for-loops with a couple levels of conditional logic and a boolean flag. Read through this code for a while and see if you can figure out what it does.
Using lambda and streams, this code can be rewritten as follows:
return Arrays.stream(enclosingInfo.getEnclosingClass().getDeclaredMethods())
        .filter(m -> Objects.equals(m.getName(), enclosingInfo.getName()))
        .filter(m -> Arrays.equals(m.getParameterTypes(), parameterClasses))
        .filter(m -> Objects.equals(m.getReturnType(), returnType))
        .findFirst()
        .orElseThrow(() -> new InternalError("Enclosing method not found"));
What's going on in the classic version is that the loop control and conditional logic is all about searching a data structure for a match. It's a bit contorted because it breaks early out of the inner loop if it detects a non-match, but returns early from the method if it does find a match. But once you stare at this code long enough, you can see that it's searching for the first element that matches a series of criteria, and returns it; and if it doesn't find one, it throws an error. Once you realize that, the lambda/streams approach just pops right out. Not only is it a lot shorter, it's much easier to understand what it's doing.
There are certainly for-loops that will have weird conditions and side effects that can't be turned easily into streams. But there are a lot of for-loops that are just searching data structures, processing elements conditionally, returning the first match, or accumulating a collection of matches, or accumulating transformed elements. These operations naturally lend themselves to being rewritten into streams, and dare I say, in an idiomatic fashion.
In general the lambda form is more idiomatic for single-statement loops, whereas the non-lambda makes more sense for multi-statement loops. (This ignores composing into a more functional style if possible).
One more style you didn't mention is the method reference:
foos.forEach(System.out::println);
EDIT:
As you're looking for a more general answer: you might find that, since lambdas are new in Java, the List.forEach method is still less used in practice.
In response to "So why is non-lambda more idiomatic for multi-statement?", it's more the reverse, that multi-statement lambdas are not idiomatic in most languages. Lambdas tend to be used for composition, so if I was to take the example from your question and compose it into a functional style:
// Thanks to #skiwi for fixing this code
foos.stream().filter(foo -> bars.get(foo) != null).forEach(System.out::println);
In the above example, using multi-statement lambdas would make it harder to read rather than easier.
You should only use the new stream/list forEach if it really makes your code more concise; otherwise stick with the old version, especially for code that is executed sequentially.
I would rewrite your statement to the following, which does make sense with streams:
foos.stream()
        .filter(foo -> (bars.get(foo) != null))
        .forEach(System.out::println);
This is a functional approach, that will:
Turn your List<String> into a Stream<String>.
Filter the objects so that you retain all elements for which bars.get(foo) is not null; the lambda here is of type Predicate<String>.
Then you consume the Stream<String> with forEach(System.out::println); the method reference resolves to foo -> System.out.println(foo), which is of type Consumer<String>.
So in more normal words:
Obtain a stream.
Filter out all unwanted elements, retain the wanted ones.
Consume all elements from the stream.
