Using Stream API to fill up a Collection - java

I'm just learning the java.util.stream API and I'm looking for a way to quickly fill up a Collection with some data.
I've come up with this code to add 5 random numbers:
List<Integer> lottery = Stream.of(random.nextInt(90), random.nextInt(90), random.nextInt(90),random.nextInt(90),
random.nextInt(90)).collect(Collectors.toList());
That would, however, be a problem if I had to add hundreds of items. Is there a more concise way to do it using the java.util.stream API?
(I could obviously loop in the ordinary way...)
Thanks

In this case the Stream API only makes things more complicated than a regular loop, but you could create an IntStream (or a Stream<Integer>, depending on where you convert the int to Integer) that generates an infinite sequence of random numbers with
IntStream rndStream = IntStream.generate(() -> rnd.nextInt(90));
Now you can use limit() to create non-infinite substreams that can be collected, such as
rndStream.limit(5).boxed().collect(Collectors.toList());
However this is just stream trickery and has no advantages over say having a method called List<Integer> getRandoms(int number) with a regular loop inside.
There's no reason to use the code displayed in this answer as it's less readable and more complex than alternatives. This just demonstrates how to get finite amounts of elements from an infinitely generated stream.

Use Random.ints(long, int, int)
Everyone else on this page is generating the stream because they can, but Java already has such an out of the box method, Random.ints(long, int, int), available since Java 8 (and therefore the beginning of streams).
Just use it like this:
List<Integer> lottery = rnd.ints(5, 0, 90)
.boxed()
.collect(Collectors.toList());
The instruction rnd.ints(5L, 0, 90) means: "create a stream of 5 integers between 0 (included) and 90 (excluded)". So if you want 100 numbers instead of 5, just change 5 to 100.
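As a rough sketch of how this scales to any count, the call can be wrapped in a small helper. The class name RandomFill and the method randomNumbers are made up for this illustration; only Random.ints, boxed() and Collectors.toList() come from the JDK.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class RandomFill {
    // Hypothetical helper: returns `count` random ints in [0, bound) as a List.
    public static List<Integer> randomNumbers(Random rnd, int count, int bound) {
        return rnd.ints(count, 0, bound)      // finite stream of `count` ints
                  .boxed()                    // IntStream -> Stream<Integer>
                  .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> lottery = randomNumbers(new Random(), 100, 90);
        System.out.println(lottery.size()); // prints 100
    }
}
```

Passing the Random in as a parameter is only a design choice for the sketch; it makes the helper easy to test with a seeded generator.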

This should do:
List<Integer> randomList = new ArrayList<>();
SecureRandom random = new SecureRandom();
IntStream.rangeClosed(1,100).forEach(value->randomList.add(random.nextInt(90)));

Related

Java stream: limit doesn't work with Random().ints().sorted()

I have following code:
new Random().ints()
.map(i -> i / 2)
.limit(100)
.toArray()
...and it works as expected.
But this doesn't work:
new Random().ints()
.sorted()
.map(i -> i / 2)
.limit(100)
.toArray()
It throws:
Exception in thread "main" java.lang.IllegalArgumentException: Stream size exceeds max array size
Shouldn't limit shortcircuit the stream and help in restricting it to 100 elements?
The call new Random().ints() returns an effectively infinite stream of pseudorandom numbers. sorted() is a stateful operation: it has to see every element before it can emit the first one, and it buffers them in an array internally. An array cannot hold that many elements, hence the exception; limit never gets a chance to short-circuit because it sits downstream of sorted(). To fix the issue, apply limit first so the stream is finite, and sort afterwards. That way only the 100 elements you actually need are sorted. Here's how it looks.
new Random().ints().limit(100).map(i -> i / 2).sorted().toArray();
Update
As per the suggestion made in the below comment, you can further improve it like so.
new Random().ints(100).map(i -> i / 2).sorted().toArray();
The ints(100) overload produces a stream of exactly 100 pseudorandom numbers, so you can dispense with limit altogether. It is also a bit more succinct than the former approach.
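To make the ordering concrete, here is a minimal sketch that bounds the stream before sorting. The class SortedLimit and the method sortedHalves are names invented for this example.

```java
import java.util.Arrays;
import java.util.Random;

public class SortedLimit {
    // Bounds the stream first, then sorts: sorted() only has to buffer `count` values.
    public static int[] sortedHalves(Random rnd, int count) {
        return rnd.ints(count)
                  .map(i -> i / 2)
                  .sorted()
                  .toArray();
    }

    public static void main(String[] args) {
        int[] result = sortedHalves(new Random(), 100);
        System.out.println(result.length); // prints 100
        System.out.println(Arrays.toString(result));
    }
}
```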

Is there a possibility to make a new Stream out of 2 Streams in Java?

I am quite new to Java 8 Streams and ran into a question while doing homework for university.
If I have 2 IntStreams which hold the same number of integers, such as
IntStream one = Arrays.stream(test.matrix).flatMapToInt(Arrays::stream);
IntStream two = Arrays.stream(test2.matrix).flatMapToInt(Arrays :: stream);
IntStream difference = one - two ???
is it possible to generate a new stream holding the element-wise difference of the two streams?
I have tried a lot, like using stream.iterator(), which worked, but I am not allowed to use "normal" iterators and should solve it with lambda expressions.
Does anyone have a tip on how to solve this?
The operation combining two streams/lists is often called zip.
Unfortunately this operation is not provided in the Java 8 standard library.
You can read more about how users defined their own zip functions here: Zipping streams using JDK8 with lambda (java.util.stream.Streams.zip)
Overall, you'll essentially just be iterating the arrays by index. If you only iterate by elements then you cannot combine the streams in a way analogous to a zip operation. First, define what subtracting matrices looks like. For now, I'll only define same size matrices:
public static int[] subtract(int[] origin, int[] operand) {
    if (origin.length != operand.length) throw new IllegalArgumentException("Array lengths must match");
    // Returns a new array, so the original arrays are left untouched
    return IntStream.range(0, origin.length).map(i -> origin[i] - operand[i]).toArray();
}
Then generating the difference is a matter of passing in those matrices directly, as int[] arrays; the stream is used simply to iterate over the indices of both. Thus the task could just as well be accomplished with a plain for loop. Nonetheless:
int[] result = subtract(one, two);
We can abstract the logic above as well to allow a more flexible set of operations:
public static int[] operate(int[] origin, int[] operand, IntBinaryOperator operation) {
    if (origin.length != operand.length) throw new IllegalArgumentException("Array lengths must match");
    return IntStream.range(0, origin.length)
        .map(i -> operation.applyAsInt(origin[i], operand[i]))
        .toArray();
}
public static int[] subtract(int[] origin, int[] operand) {
    // Producing the result from each index-matched pair of elements
    return operate(origin, operand, (one, two) -> one - two);
}
This is where I'd see more of an advantage in using the functional API.
If you wish to return an IntStream instead of an int[], you can (just drop the final toArray()), but you'll still have to reference the arrays by index. You could really push it and accept an arbitrary number of operations (IntBinaryOperator... operations), but I think you risk overcomplicating your logic.
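Applied to the matrices from the question, the index-based approach could look roughly like the sketch below. The class MatrixDifference and its methods are illustrative names, and it assumes the matrices are plain int[][] arrays, which the question implies but does not show.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class MatrixDifference {
    // Flattens a 2D matrix row by row into a single array.
    public static int[] flatten(int[][] matrix) {
        return Arrays.stream(matrix).flatMapToInt(Arrays::stream).toArray();
    }

    // Element-wise difference of two equally sized matrices, exposed as an IntStream.
    public static IntStream difference(int[][] first, int[][] second) {
        int[] a = flatten(first);
        int[] b = flatten(second);
        if (a.length != b.length) throw new IllegalArgumentException("Matrix sizes must match");
        return IntStream.range(0, a.length).map(i -> a[i] - b[i]);
    }

    public static void main(String[] args) {
        int[][] m1 = { {5, 7}, {9, 11} };
        int[][] m2 = { {1, 2}, {3, 4} };
        System.out.println(Arrays.toString(difference(m1, m2).toArray())); // prints [4, 5, 6, 7]
    }
}
```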

Length of an infinite IntStream?

I have created a random IntStream like this:
final static PrimitiveIterator.OfInt startValue = new Random().ints(0, 60).iterator();
The documentation says this stream is actually endless.
I want to understand what happens there in the background.
ints(0,60) is generating an infinite stream of integers. If this is infinite, why is my machine not leaking any memory?
I wonder how many numbers are actually generated, and whether this implementation can cause an error at the point where the stream eventually ends. Or is the stream constantly filled with new integers on the fly, so that it really never ends?
And while I'm at it: what is the current best practice for generating random numbers?
The stream is infinite¹ so you can generate as many ints as you want without running out. It does not mean that it keeps generating them when you aren't asking for any.
How many numbers are actually generated depends on the code you write. Every time you retrieve a value from the iterator, a value is generated. None is generated in the background, so there's no "extra" memory being used.
¹ as far as your lifetime is concerned, see Eran's answer
To be exact,
IntStream java.util.Random.ints(int randomNumberOrigin, int randomNumberBound) returns:
an effectively unlimited stream of pseudorandom int values, each conforming to the given origin (inclusive) and bound (exclusive).
This doesn't mean infinite. Looking at the Javadoc, you'll see an implementation note stating that it actually limits the returned IntStream to Long.MAX_VALUE elements:
Implementation Note:
This method is implemented to be equivalent to ints(Long.MAX_VALUE, randomNumberOrigin, randomNumberBound).
Of course Long.MAX_VALUE is a very large number, and therefore the returned IntStream can be seen as "effectively" without limit. For example, if you consume 1000000 ints of that stream every second, it will take you about 292471 years to run out of elements.
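The arithmetic behind that figure can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes 365-day years and integer division, and the class and method names are invented for the example.

```java
public class StreamLifetime {
    // Years needed to consume Long.MAX_VALUE elements at `perSecond` elements per second,
    // using 365-day years and integer division.
    public static long yearsToExhaust(long perSecond) {
        long seconds = Long.MAX_VALUE / perSecond;
        long secondsPerYear = 365L * 24 * 60 * 60;
        return seconds / secondsPerYear;
    }

    public static void main(String[] args) {
        System.out.println(yearsToExhaust(1_000_000L)); // prints 292471
    }
}
```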
That said, as mentioned by the other answers, that IntStream only generates as many numbers as are required by its consumer (i.e. the terminal operation that consumes the ints).
Streams do not (in general¹) store all of their elements in any kind of a data structure:
No storage. A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
Instead, each stream element is computed one-by-one each time the stream advances. In your example, each random int would actually be computed when you invoke startValue.nextInt().
So when we do e.g. new Random().ints(0,60), the fact that the stream is effectively infinite isn't a problem, because no random ints are actually computed until we perform some action that traverses the stream. Once we do traverse the stream, ints are only computed when we request them.
Here's a small example using Stream.generate (also an infinite stream) which shows this order of operations:
Stream.generate(() -> {
System.out.println("generating...");
return "hello!";
})
.limit(3)
.forEach(elem -> {
System.out.println(elem);
});
The output of that code is:
generating...
hello!
generating...
hello!
generating...
hello!
Notice that our generator supplier is called once just before every call of our forEach consumer, and no more. If we didn't use limit(3), the program could run forever, but it wouldn't run out of memory.
If we did new Random().ints(0,60).forEach(...), it would work the same way. The stream would do random.nextInt(60) once before every call to the forEach consumer. The elements wouldn't be accumulated anywhere unless we used some action that required it, such as distinct() or a collector instead of forEach.
¹ Some streams probably use a data structure behind the scenes for temporary storage. For example, it's common to use a stack during tree traversal algorithms. Also, some streams such as those created using a Stream.Builder will require a data structure to put their elements in.
As @Kayaman said in his answer, the stream is infinite in the sense that numbers can be generated forever. The point lies in the word can: it only generates numbers when you actually request them. It will not generate some X amount of numbers up front and store them somewhere (unless you tell it to do so).
So if you want to generate n random numbers (where n is an integer), you can call ints(n, 0, 60), the sized overload of ints(0, 60), on the Random instance:
new Random().ints(n, 0, 60)
Even this will not generate n random numbers by itself, because an IntStream is evaluated lazily. Until you invoke a terminal operation (e.g. collect() or forEach()), nothing really happens.
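One way to observe this laziness is to count how many values pass through the pipeline before and after a terminal operation. The sketch below uses peek() with a counter purely for demonstration; the class LazyInts and its method are hypothetical names.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class LazyInts {
    // Returns { values seen before the terminal operation, values seen after it }.
    public static int[] generatedBeforeAndAfter(int n) {
        AtomicInteger generated = new AtomicInteger();

        // Building the pipeline does not pull any values yet.
        IntStream stream = new Random().ints(n, 0, 60)
                                       .peek(i -> generated.incrementAndGet());
        int before = generated.get();

        // The terminal operation drives the generation.
        stream.toArray();
        int after = generated.get();

        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = generatedBeforeAndAfter(5);
        System.out.println(counts[0]); // prints 0
        System.out.println(counts[1]); // prints 5
    }
}
```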
Creating a generator does not generate any numbers. In concept, this generator will continually generate new numbers; there is no point at which it would not return the next value when asked.

java 8 make a stream of the multiples of two

I'm practicing streams in Java 8 and I'm trying to make a Stream<Integer> containing the multiples of 2. There are several tasks in one main class, so I won't post the whole block, but what I've got so far is this:
Integer twoToTheZeroth = 1;
UnaryOperator<Integer> doubler = (Integer x) -> 2 * x;
Stream<Integer> result = ?;
My question is probably less about streams and more about syntax: how should I use the doubler to get the result?
Thanks in advance!
You can use Stream.iterate.
Stream<Integer> result = Stream.iterate(twoToTheZeroth, doubler);
or using the lambda directly
Stream.iterate(1, x -> 2*x);
The first argument is the "seed" (ie first element of the stream), the operator gets applied consecutively with every element access.
EDIT:
As Vinay points out, this will result in the stream being filled with 0s eventually (this is due to int overflow). To prevent that, maybe use BigInteger:
Stream.iterate(BigInteger.ONE,
x -> x.multiply(BigInteger.valueOf(2)))
.forEach(System.out::println);
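To see the overflow concretely, the following sketch collects the first few elements of both variants. The class PowersOfTwo and its methods are names chosen for the illustration; BigInteger.valueOf(2) is used so that the code also compiles on Java 8.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PowersOfTwo {
    // First `count` elements of the Integer-based stream (overflows after 2^30).
    public static List<Integer> intPowers(int count) {
        return Stream.iterate(1, x -> 2 * x).limit(count).collect(Collectors.toList());
    }

    // First `count` elements of the BigInteger-based stream (never overflows).
    public static List<BigInteger> bigPowers(int count) {
        return Stream.iterate(BigInteger.ONE, x -> x.multiply(BigInteger.valueOf(2)))
                     .limit(count)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ints = intPowers(34);
        System.out.println(ints.get(30));          // prints 1073741824
        System.out.println(ints.get(31));          // prints -2147483648 (int overflow)
        System.out.println(ints.get(32));          // prints 0, and 0 from here on
        System.out.println(bigPowers(34).get(32)); // prints 4294967296
    }
}
```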
Arrays.asList(1,2,3,4,5).stream().map(x -> x * x).forEach(x -> System.out.println(x));
So you can also pass the doubler to the map() call.

When should I use IntStream.range in Java?

I would like to know when I can use IntStream.range effectively. I have three reasons why I am not sure how useful IntStream.range is.
(Please think of start and end as integers.)
1. If I want an array, [start, start+1, ..., end-2, end-1], the code below is much faster.
int[] arr = new int[end - start];
int index = 0;
for(int i = start; i < end; i++)
    arr[index++] = i;
This is probably because toArray() in IntStream.range(start, end).toArray() is very slow.
2. I use MersenneTwister to shuffle arrays. (I downloaded MersenneTwister class online.) I do not think there is a way to shuffle IntStream using MersenneTwister.
3. I do not think just getting int numbers from start to end-1 is useful. I can use for(int i = start; i < end; i++), which seems easier and not slow.
Could you tell me when I should choose IntStream.range?
There are several uses for IntStream.range.
One is to use the int values themselves:
IntStream.range(start, end).filter(i -> isPrime(i))....
Another is to do something N times:
IntStream.range(0, N).forEach(this::doSomething);
Your case (1) is to create an array filled with a range:
int[] arr = IntStream.range(start, end).toArray();
You say this is "very slow" but, like other respondents, I suspect your benchmark methodology. For small arrays there is indeed more overhead with stream setup, but this should be so small as to be unnoticeable. For large arrays the overhead should be negligible, as filling a large array is dominated by memory bandwidth.
Sometimes you need to fill an existing array. You can do that this way:
int[] arr = new int[end - start];
IntStream.range(0, end - start).forEach(i -> arr[i] = i + start);
There's a utility method Arrays.setAll that can do this even more concisely:
int[] arr = new int[end - start];
Arrays.setAll(arr, i -> i + start);
There is also Arrays.parallelSetAll which can fill an existing array in parallel. Internally, it simply uses an IntStream and calls parallel() on it. This should provide a speedup for large arrays on a multicore system.
I've found that a fair number of my answers on Stack Overflow involve using IntStream.range. You can search for them using these search criteria in the search box:
user:1441122 IntStream.range
One application of IntStream.range I find particularly useful is to operate on elements of an array, where the array indexes as well as the array's values participate in the computation. There's a whole class of problems like this.
For example, suppose you want to find the locations of increasing runs of numbers within an array. The result is an array of indexes into the first array, where each index points to the start of a run.
To compute this, observe that a run starts at a location where the value is less than the previous value. (A run also starts at location 0). Thus:
int[] arr = { 1, 3, 5, 7, 9, 2, 4, 6, 3, 5, 0 };
int[] runs = IntStream.range(0, arr.length)
.filter(i -> i == 0 || arr[i-1] > arr[i])
.toArray();
System.out.println(Arrays.toString(runs));
[0, 5, 8, 10]
Of course, you could do this with a for-loop, but I find that using IntStream is preferable in many cases. For example, it's easy to store an unknown number of results into an array using toArray(), whereas with a for-loop you have to handle copying and resizing, which distracts from the core logic of the loop.
Finally, it's much easier to run IntStream.range computations in parallel.
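As a small illustration of that last point, the sketch below counts primes in a range either sequentially or in parallel by toggling a single call. The class ParallelRange and the naive isPrime check are assumptions made for the example, not part of any library.

```java
import java.util.stream.IntStream;

public class ParallelRange {
    // Simple trial-division primality check (illustrative, not optimized).
    public static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts primes in [start, end); the only difference between modes is parallel().
    public static long countPrimes(int start, int end, boolean parallel) {
        IntStream range = IntStream.range(start, end);
        if (parallel) {
            range = range.parallel();
        }
        return range.filter(ParallelRange::isPrime).count();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(0, 100, false)); // prints 25
        System.out.println(countPrimes(0, 100, true));  // prints 25
    }
}
```

Whether the parallel version is actually faster depends on the size of the range and the cost of each element, so it is worth measuring before relying on it.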
Here's an example:
public class Test {
public static void main(String[] args) {
System.out.println(sum(LongStream.of(40,2))); // call A
System.out.println(sum(LongStream.range(1,100_000_000))); //call B
}
public static long sum(LongStream in) {
return in.sum();
}
}
So, let's look at what sum() does: it computes the sum of an arbitrary stream of numbers. We call it in two different ways: once with an explicit list of numbers, and once with a range.
If you only had call A, you might be tempted to put the two numbers into an array and pass it to sum() but that's clearly not an option with call B (you'd run out of memory). Likewise you could just pass the start and end for call B, but then you couldn't support the case of call A.
So to sum it up, ranges are useful here because:
We need to pass them around between methods
The target method doesn't just work on ranges but any stream of numbers
But it only operates on individual numbers of the stream, reading them sequentially. (This is why shuffling with streams is a terrible idea in general.)
There is also the readability argument: code using streams can be much more concise than loops, and thus more readable, but I wanted to show an example where a solution relying on IntStreams is functionally superior too.
I used LongStream to emphasise the point, but the same goes for IntStream
And yes, for simple summing this may look like a bit of an overkill, but consider for example reservoir sampling
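For readers who have not met reservoir sampling: it picks k elements uniformly at random from a sequence of unknown length while reading each element only once, which fits the "any stream of numbers" contract described above. A minimal sketch of the classic Algorithm R follows; the class ReservoirSample and its method are names invented here, and the mutable state inside forEachOrdered is a shortcut that assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

public class ReservoirSample {
    // Picks up to k elements uniformly at random from the stream in a single pass.
    public static int[] sample(IntStream in, int k, Random rnd) {
        int[] reservoir = new int[k];
        long[] seen = {0}; // number of elements consumed so far

        in.forEachOrdered(value -> {
            long index = seen[0]++;
            if (index < k) {
                // Fill the reservoir with the first k elements.
                reservoir[(int) index] = value;
            } else {
                // Replace a random slot with probability k / (index + 1).
                long j = (long) (rnd.nextDouble() * (index + 1));
                if (j < k) {
                    reservoir[(int) j] = value;
                }
            }
        });

        // If the stream was shorter than k, return only what was seen.
        return seen[0] < k ? Arrays.copyOf(reservoir, (int) seen[0]) : reservoir;
    }

    public static void main(String[] args) {
        int[] picked = sample(IntStream.range(0, 1_000), 10, new Random());
        System.out.println(picked.length); // prints 10
    }
}
```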
IntStream.range returns a range of integers as a stream so you can do stream processing over it.
like taking square of each element
IntStream.range(1, 10).map(i -> i * i);
Here are a few differences that come to mind between IntStream.range and traditional for loops:
IntStreams are lazily evaluated; the pipeline is only traversed when a terminal operation is called. For loops evaluate at each iteration.
IntStream provides some functions that are commonly applied to a range of ints, such as sum and average.
IntStream allows you to chain multiple operations over a range of ints in a functional way, which reads more fluently, especially if you have a lot of operations.
So basically, use IntStream when one or more of these differences are useful to you.
But please bear in mind that shuffling a Stream sounds quite strange, as a Stream is not a data structure and therefore it does not really make sense to shuffle it (in case you were planning on building a special IntSupplier). Shuffle the result instead.
As for performance, while there may be some overhead, you still iterate N times in both cases, so it should not matter much.
Basically, if you want Stream operations, you can use the range() method. For example, if you want concurrency, or want to use map() or reduce(), then you are better off with IntStream.
For example:
IntStream.range(1, 5).parallel().forEach(i -> heavyOperation());
Or:
IntStream.range(1, 5).reduce(1, (x, y) -> x * y)
// > 24
You can achieve the second example also with a for-loop, but you need intermediate variables etc.
Also, if you only want the first match, for example, you can use findFirst() and its cousins to stop consuming the rest of the stream.
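The short-circuiting can be made visible by counting how many elements are examined before findFirst() returns. In this sketch the class FirstMatch, the method firstMultiple and the upper bound of one million are arbitrary choices for the demonstration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class FirstMatch {
    // Finds the first multiple of `divisor` in [1, 1_000_000) and records
    // in `examined` how many elements the pipeline looked at.
    public static int firstMultiple(int divisor, AtomicInteger examined) {
        return IntStream.range(1, 1_000_000)
                .peek(i -> examined.incrementAndGet())
                .filter(i -> i % divisor == 0)
                .findFirst()
                .orElse(-1);
    }

    public static void main(String[] args) {
        AtomicInteger examined = new AtomicInteger();
        System.out.println(firstMultiple(97, examined)); // prints 97
        // Only a tiny prefix of the range was consumed, not all 999,999 elements.
        System.out.println(examined.get() < 1_000);      // prints true
    }
}
```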
It totally depends on the use case. However, the Stream API adds a lot of easy one-liners which can definitely replace conventional loops.
IntStream is really helpful as syntactic sugar in some cases:
IntStream.range(1, 101).sum();
IntStream.range(1, 101).average();
IntStream.range(1, 101).filter(i -> i % 2 == 0).count();
//... and so on
Whatever you can do with IntStream you can also do with conventional loops, but a one-liner is often easier to understand and maintain.
Still, IntStream#range cannot express a descending loop directly: it only counts upwards in steps of one. So the following has no direct range equivalent:
for(int i = 100; i > 1; i--) {
// Negative loop
}
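If a descending sequence is needed anyway, one possible workaround is to count upwards and map each index to the value you actually want. The helper below is a sketch of that idea; the class Descending and its method are made-up names.

```java
import java.util.stream.IntStream;

public class Descending {
    // Emits from, from - 1, ..., toExclusive + 1 (empty if from <= toExclusive).
    public static IntStream descending(int from, int toExclusive) {
        int count = Math.max(0, from - toExclusive);
        return IntStream.range(0, count).map(i -> from - i);
    }

    public static void main(String[] args) {
        int[] values = descending(100, 1).toArray();
        System.out.println(values.length);            // prints 99
        System.out.println(values[0]);                // prints 100
        System.out.println(values[values.length - 1]); // prints 2
    }
}
```

This visits the same values as the for loop above (100 down to 2), just expressed as a mapped ascending range.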
Case 1: Yes, the conventional loop is faster in this case, as toArray has a bit of overhead.
Case 2: I don't know anything about it, my apologies.
Case 3: IntStream is not slow at all; IntStream.range and a conventional loop are almost the same in terms of performance.
See :
Java 8 nested loops with streams & performance
You could implement your Mersenne Twister as an Iterator and stream from that.
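A sketch of what that could look like follows. Since the downloaded MersenneTwister class is not shown in the question, a trivial linear congruential generator stands in for it here; LcgIterator and GeneratorStream are hypothetical names, and only Spliterators.spliteratorUnknownSize and StreamSupport.intStream are JDK calls.

```java
import java.util.PrimitiveIterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

public class GeneratorStream {
    // Stand-in generator (a simple linear congruential generator), NOT a Mersenne Twister.
    public static final class LcgIterator implements PrimitiveIterator.OfInt {
        private long state;

        public LcgIterator(long seed) {
            this.state = seed;
        }

        @Override
        public boolean hasNext() {
            return true; // endless source
        }

        @Override
        public int nextInt() {
            state = state * 6364136223846793005L + 1442695040888963407L;
            return (int) (state >>> 33); // non-negative 31-bit value
        }
    }

    // Wraps any primitive int iterator in a sequential IntStream.
    public static IntStream stream(PrimitiveIterator.OfInt iterator) {
        return StreamSupport.intStream(
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        int[] values = stream(new LcgIterator(42L)).limit(5).toArray();
        System.out.println(values.length); // prints 5
    }
}
```

A real Mersenne Twister would only need to provide the same nextInt() method to plug into stream().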
