I would like to know when I can use IntStream.range effectively. I have three reasons why I am not sure how useful IntStream.range is.
(Please think of start and end as integers.)
If I want an array, [start, start+1, ..., end-2, end-1], the code below is much faster.
int[] arr = new int[end - start];
int index = 0;
for(int i = start; i < end; i++)
arr[index++] = i;
This is probably because toArray() in IntStream.range(start, end).toArray() is very slow.
I use MersenneTwister to shuffle arrays. (I downloaded MersenneTwister class online.) I do not think there is a way to shuffle IntStream using MersenneTwister.
I do not think just getting int numbers from start to end-1 is useful. I can use for(int i = start; i < end; i++), which seems easier and not slow.
Could you tell me when I should choose IntStream.range?
There are several uses for IntStream.range.
One is to use the int values themselves:
IntStream.range(start, end).filter(i -> isPrime(i))....
Another is to do something N times:
IntStream.range(0, N).forEach(this::doSomething);
Your case (1) is to create an array filled with a range:
int[] arr = IntStream.range(start, end).toArray();
You say this is "very slow" but, like other respondents, I suspect your benchmark methodology. For small arrays there is indeed more overhead with stream setup, but this should be so small as to be unnoticeable. For large arrays the overhead should be negligible, as filling a large array is dominated by memory bandwidth.
Sometimes you need to fill an existing array. You can do that this way:
int[] arr = new int[end - start];
IntStream.range(0, end - start).forEach(i -> arr[i] = i + start);
There's a utility method Arrays.setAll that can do this even more concisely:
int[] arr = new int[end - start];
Arrays.setAll(arr, i -> i + start);
There is also Arrays.parallelSetAll, which can fill an existing array in parallel. Internally, it simply uses an IntStream and calls parallel() on it. This should provide a speedup for large arrays on a multicore system.
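As a rough check of the approaches above, the following sketch fills an array three ways and compares the results. The class name FillRange and the example bounds are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class FillRange {
    // Create a new array directly from the range
    static int[] viaStream(int start, int end) {
        return IntStream.range(start, end).toArray();
    }

    // Fill an existing array with Arrays.setAll
    static int[] viaSetAll(int start, int end) {
        int[] arr = new int[end - start];
        Arrays.setAll(arr, i -> i + start);
        return arr;
    }

    // Same as above, but the generator may be invoked from several threads
    static int[] viaParallelSetAll(int start, int end) {
        int[] arr = new int[end - start];
        Arrays.parallelSetAll(arr, i -> i + start);
        return arr;
    }

    public static void main(String[] args) {
        int start = 3, end = 8; // arbitrary example bounds
        System.out.println(Arrays.toString(viaStream(start, end)));
        System.out.println(Arrays.equals(viaStream(start, end), viaSetAll(start, end)));
        System.out.println(Arrays.equals(viaStream(start, end), viaParallelSetAll(start, end)));
    }
}
```

Which variant is fastest depends on array size and hardware, so it is worth measuring with a proper benchmark harness rather than a hand-rolled timing loop.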
I've found that a fair number of my answers on Stack Overflow involve using IntStream.range. You can search for them using these search criteria in the search box:
user:1441122 IntStream.range
One application of IntStream.range I find particularly useful is to operate on elements of an array, where the array indexes as well as the array's values participate in the computation. There's a whole class of problems like this.
For example, suppose you want to find the locations of increasing runs of numbers within an array. The result is an array of indexes into the first array, where each index points to the start of a run.
To compute this, observe that a run starts at a location where the value is less than the previous value. (A run also starts at location 0). Thus:
int[] arr = { 1, 3, 5, 7, 9, 2, 4, 6, 3, 5, 0 };
int[] runs = IntStream.range(0, arr.length)
.filter(i -> i == 0 || arr[i-1] > arr[i])
.toArray();
System.out.println(Arrays.toString(runs));
[0, 5, 8, 10]
Of course, you could do this with a for-loop, but I find that using IntStream is preferable in many cases. For example, it's easy to store an unknown number of results into an array using toArray(), whereas with a for-loop you have to handle copying and resizing, which distracts from the core logic of the loop.
Finally, it's much easier to run IntStream.range computations in parallel.
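A minimal sketch of what that can look like: the same pipeline runs sequentially or in parallel depending on a single call. The class name ParallelPrimes and the naive isPrime helper are assumptions for illustration, not part of any library.

```java
import java.util.stream.IntStream;

class ParallelPrimes {
    // Hypothetical helper: naive trial-division primality test
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts primes in [start, end); the only difference between the
    // sequential and parallel versions is the call to parallel()
    static long countPrimes(int start, int end, boolean parallel) {
        IntStream range = IntStream.range(start, end);
        if (parallel) {
            range = range.parallel();
        }
        return range.filter(ParallelPrimes::isPrime).count();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(0, 1_000_000, false));
        System.out.println(countPrimes(0, 1_000_000, true));
    }
}
```

Whether the parallel version is actually faster depends on the cost per element and the number of cores, so it should be measured rather than assumed.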
Here's an example:
public class Test {
public static void main(String[] args) {
System.out.println(sum(LongStream.of(40,2))); // call A
System.out.println(sum(LongStream.range(1,100_000_000))); //call B
}
public static long sum(LongStream in) {
return in.sum();
}
}
So, let's look at what sum() does: it counts the sum of an arbitrary stream of numbers. We call it in two different ways: once with an explicit list of numbers, and once with a range.
If you only had call A, you might be tempted to put the two numbers into an array and pass it to sum() but that's clearly not an option with call B (you'd run out of memory). Likewise you could just pass the start and end for call B, but then you couldn't support the case of call A.
So to sum it up, ranges are useful here because:
We need to pass them around between methods
The target method doesn't just work on ranges but any stream of numbers
But it only operates on individual numbers of the stream, reading them sequentially. (This is why shuffling with streams is a terrible idea in general.)
There is also the readability argument: code using streams can be much more concise than loops, and thus more readable, but I wanted to show an example where a solution relying on IntStreams is functionally superior too.
I used LongStream to emphasise the point, but the same goes for IntStream.
And yes, for simple summing this may look like a bit of an overkill, but consider for example reservoir sampling.
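To make the reservoir sampling remark concrete, here is a minimal sketch of the classic "Algorithm R" over a LongStream. The class and method names are hypothetical, and the sketch assumes the stream holds fewer than Integer.MAX_VALUE elements. Like sum(), it reads the stream sequentially, one value at a time, and works equally well for an explicit list or a range.

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;
import java.util.Random;
import java.util.stream.LongStream;

class ReservoirSketch {
    // Keeps k values chosen uniformly at random from a stream whose
    // length is not known in advance, using O(k) memory.
    static long[] sample(LongStream in, int k, Random rnd) {
        long[] reservoir = new long[k];
        int seen = 0; // assumption: fewer than Integer.MAX_VALUE elements
        PrimitiveIterator.OfLong it = in.iterator();
        while (it.hasNext()) {
            long value = it.nextLong();
            if (seen < k) {
                reservoir[seen] = value;        // fill the reservoir first
            } else {
                int j = rnd.nextInt(seen + 1);  // uniform in [0, seen]
                if (j < k) {
                    reservoir[j] = value;       // replace with probability k / (seen + 1)
                }
            }
            seen++;
        }
        // If the stream was shorter than k, return only what was read
        return seen < k ? Arrays.copyOf(reservoir, seen) : reservoir;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        System.out.println(Arrays.toString(sample(LongStream.of(40, 2), 5, rnd)));
        System.out.println(Arrays.toString(sample(LongStream.range(1, 100_000_000), 5, rnd)));
    }
}
```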
IntStream.range returns a range of integers as a stream, so you can do stream processing over it,
such as taking the square of each element:
IntStream.range(1, 10).map(i -> i * i);
Here are a few differences that come to mind between IntStream.range and traditional for loops:
IntStreams are lazily evaluated: the pipeline is only traversed when a terminal operation is called. For loops evaluate at each iteration.
IntStream provides some functions that are commonly applied to a range of ints, such as sum and average.
IntStream allows you to code multiple operations over a range of ints in a functional way, which reads more fluently - especially if you have a lot of operations.
So basically, use IntStream when one or more of these differences are useful to you.
But please bear in mind that shuffling a Stream sounds quite strange, as a Stream is not a data structure and therefore it does not really make sense to shuffle it (in case you were planning on building a special IntSupplier). Shuffle the result instead.
As for performance, while there may be some overhead, you still iterate N times in both cases, so you should not worry about it much.
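A minimal sketch of "shuffle the result instead": materialize the range with toArray() and apply a Fisher-Yates shuffle to the array. Here java.util.Random stands in for the asker's MersenneTwister class, whose API is not shown in the thread; the class name ShuffleRange is also made up.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

class ShuffleRange {
    // Builds [start, end) as an array, then shuffles it in place (Fisher-Yates).
    // Random is a stand-in for any generator offering nextInt(bound).
    static int[] shuffledRange(int start, int end, Random rnd) {
        int[] arr = IntStream.range(start, end).toArray();
        for (int i = arr.length - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1); // uniform index in [0, i]
            int tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
        return arr;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shuffledRange(0, 10, new Random())));
    }
}
```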
Basically, if you want Stream operations, you can use the range() method. For example, if you want concurrency, or want to use map() or reduce(), then you are better off with IntStream.
For example:
IntStream.range(1, 5).parallel().forEach(i -> heavyOperation());
Or:
IntStream.range(1, 5).reduce(1, (x, y) -> x * y)
// > 24
You can achieve the second example also with a for-loop, but you need intermediate variables etc.
Also, if you want the first match for example, you can use findFirst() and cousins to stop consuming the rest of the Stream
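The short-circuiting behaviour can be observed with a small sketch that counts how many elements the pipeline actually visits. The class name, the divisor 7 and the use of peek() with a counter are choices made purely for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class LazyRange {
    // Returns the first multiple of 7 in [start, end), or -1 if there is none.
    // The counter records how many elements were pulled through the pipeline.
    static int firstMultipleOf7(int start, int end, AtomicInteger visited) {
        return IntStream.range(start, end)
                .peek(i -> visited.incrementAndGet())
                .filter(i -> i % 7 == 0)
                .findFirst()
                .orElse(-1);
    }

    public static void main(String[] args) {
        AtomicInteger visited = new AtomicInteger();
        int first = firstMultipleOf7(1, 1_000_000, visited);
        // Only a handful of the one million candidates are ever examined
        System.out.println(first + " found after visiting " + visited.get() + " elements");
    }
}
```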
It totally depends on the use case. However, the stream API and its syntax add a lot of easy one-liners which can definitely replace conventional loops.
IntStream is really helpful, and serves as syntactic sugar, in some cases:
IntStream.range(1, 101).sum();
IntStream.range(1, 101).average();
IntStream.range(1, 101).filter(i -> i % 2 == 0).count();
//... and so on
Whatever you can do with IntStream you can also do with conventional loops, but a one-liner is often easier to understand and maintain.
Still, IntStream#range cannot express a decrementing loop directly, as it only counts upward in steps of one. So the following has no direct range equivalent:
for(int i = 100; i > 1; i--) {
// Negative loop
}
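That said, a descending sequence can still be derived from an ascending range by mapping each value. The following is a minimal sketch; the class and method names are hypothetical, and it assumes the difference between the bounds does not overflow an int.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class DescendingRange {
    // Yields fromInclusive, fromInclusive - 1, ..., toExclusive + 1,
    // i.e. the values visited by: for (int i = fromInclusive; i > toExclusive; i--)
    static IntStream descending(int fromInclusive, int toExclusive) {
        int count = Math.max(0, fromInclusive - toExclusive);
        return IntStream.range(0, count).map(i -> fromInclusive - i);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(descending(5, 1).toArray()));
    }
}
```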
Case 1 : Yes, the conventional loop is much faster in this case, as toArray has a bit of overhead.
Case 2 : I don't know anything about it, my apologies.
Case 3 : IntStream is not slow at all, IntStream.range and conventional loop are almost same in terms of performance.
See :
Java 8 nested loops with streams & performance
You could implement your Mersenne Twister as an Iterator and stream from that.
Say I get 100 records from the DB. I want to loop over 10 of them and perform some action,
then loop over the next 10 and perform the action again, continuing until the last record is read.
The problem I am seeing is
for (int number: numbers) {
add(number);
//after adding 10 items I want to complete a function and continue the loop
}
But in Java, the loop above iterates over the complete list before it exits.
I know that in older versions we can iterate with a counter, like
for(int i=0; i<10;i++)
or something like this.
My question is: if the forEach loop does not provide this flexibility, why did Sun introduce a looping mechanism that always iterates completely?
I am trying to understand the logic of this design.
You can use a forEach or for-in with a nested conditional.
for (int number : numbers) {
    if (number != (multOfTen)) { // multOfTen is a placeholder, see below
        myFunction(number);
        add(number);
    } else {
        add(number);
    }
}
You will need to replace multOfTen with an expression that matches only multiples of ten.
One way to do this is to use a regex to check that the last digit is zero (so long as you're using integers); a simpler check is number % 10 == 0.
As the name suggests, forEach will iterate over the whole collection. This is the same for JavaScript and several other languages. However you can abort / skip the iteration with a condition (which depends on the language implementation details).
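In Java, one way to act after every tenth item while still using the enhanced for loop is to collect items into a batch and hand the batch off when it is full. This is only a sketch: the class and method names are made up, and adding the batch to a result list stands in for the asker's "complete a function" step.

```java
import java.util.ArrayList;
import java.util.List;

class BatchLoop {
    // Splits the input into consecutive batches of at most batchSize items
    static List<List<Integer>> inBatches(List<Integer> numbers, int batchSize) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int number : numbers) {
            current.add(number);
            if (current.size() == batchSize) {
                batches.add(current);        // stand-in for the per-batch action
                current = new ArrayList<>(); // start the next batch
            }
        }
        if (!current.isEmpty()) {
            batches.add(current);            // leftover items that did not fill a batch
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            numbers.add(i);
        }
        System.out.println(inBatches(numbers, 10));
    }
}
```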
There are differences between the for and forEach loops, and there are reasons why Java has both. forEach is the enhanced for loop. Each has its uses, depending on requirements.
for
This is the general-purpose loop and is based entirely on indexes. If you want to work with the data at a particular index, or perform some action based on an element's index, you should use for.
forEach
This is used only with collections and arrays. It iterates over the whole collection, which means you do not have the index of an element while iterating. Use it when you process every element in a list regardless of its index. For example, to print all the elements in a given list, instead of writing a classical for loop
for (int i = 0; i < list.size(); i++) {
    System.out.println(list.get(i));
}
we use forEach loop
list.forEach(e -> {
System.out.println(e);
});
This is more readable, easier to use, and crisper.
Because sometimes the problem you're solving in Java simply doesn't need the index of the array.
So instead of writing the whole for(int i=0;i<arr.length;i++) construct, you can just use foreach, and instead of arr[i] you use a simple variable name.
This is actually possible using the same Stream api but it's beyond the scope of forEach. What you want to do isn't possible by limiting yourself to forEach. The purpose of forEach is to execute some action without bias on all elements of an Iterable or Stream. What you can do is break down your objects into groups of 10 and then for each grouping of 10, do what you want which is add all to the underlying collection and then perform some other action which is what you want as well.
List<Integer> values = IntStream.range(0, 100)
.boxed()
.collect(Collectors.toList());
int batch = 10;
List<List<Integer>> groupsOfTen = IntStream.range(0, (values.size() + batch - 1) / batch) // round up, so no empty trailing group
.map(index -> index * batch)
.mapToObj(index -> values.subList(index,
Math.min(index + batch, values.size())))
.collect(Collectors.toList());
groupsOfTen.forEach(myListOfTen -> myListOfTen.forEach(individual -> {
    // perform the action on each individual element of the group here
}));
I am quite new to Java 8 Streams and have a question that came up while doing homework for university.
If I have two IntStreams which contain the same number of integers, such as
IntStream one = Arrays.stream(test.matrix).flatMapToInt(Arrays::stream);
IntStream two = Arrays.stream(test2.matrix).flatMapToInt(Arrays :: stream);
IntStream difference = one - two ???
is it possible to generate a new stream holding the element-wise difference of the two streams?
I have tried a lot, such as using stream.iterator(), which worked, but I am not allowed to use "normal" iterators and should solve it with lambda expressions.
Has anyone a tip on how to solve this?
The operation combining two streams/lists is often called zip.
Unfortunately this operation is not provided in the Java 8 standard library.
You can read more about how users defined their own zip functions here: Zipping streams using JDK8 with lambda (java.util.stream.Streams.zip)
Overall, you'll essentially just be iterating the arrays by index. If you only iterate by elements then you cannot combine the streams in a way analogous to a zip operation. First, define what subtracting matrices looks like. For now, I'll only define same size matrices:
public static int[] subtract(int[] origin, int[] operand) {
    if (origin.length != operand.length) throw new IllegalArgumentException("Array lengths must match");
    //Make a new result, so as not to muck with the original array
    return IntStream.range(0, origin.length).map(i -> origin[i] - operand[i]).toArray();
}
Then generating the difference is a matter of passing in those matrices directly, and the stream is used simply as a matter of iterating both sets of elements. Thus the task could also be accomplished with a simple vanilla for loop. Nonetheless:
int[] result = subtract(one, two);
We can abstract the logic above as well to allow a more flexible set of operations:
// IntBinaryOperator is java.util.function.IntBinaryOperator
public static int[] operate(int[] origin, int[] operand, IntBinaryOperator operation) {
    if (origin.length != operand.length) throw new IllegalArgumentException("Array lengths must match");
    //Apply the operation to each pair of index matched elements
    return IntStream.range(0, origin.length)
            .map(i -> operation.applyAsInt(origin[i], operand[i]))
            .toArray();
}
public static int[] subtract(int[] origin, int[] operand) {
    //Producing result from each index matched element
    return operate(origin, operand, (one, two) -> one - two);
}
This is where I'd see more of an advantage in using anything related to the functional API.
If you wish to return an IntStream instead of an int[], you can, but you'll run into the original issue of having to reference the arrays by index. You could really push it and supply an arbitrary number of operations (IntBinaryOperator... operations), but I think you risk overcomplicating your logic.
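For the matrices in the question, a sketch that returns an IntStream could flatten both matrices first and then subtract by index. The class and method names are hypothetical, and the flattened arrays are still materialized so that they can be indexed.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class MatrixDifference {
    // Element-wise difference of two matrices, returned as a flat IntStream
    static IntStream difference(int[][] a, int[][] b) {
        int[] left = Arrays.stream(a).flatMapToInt(Arrays::stream).toArray();
        int[] right = Arrays.stream(b).flatMapToInt(Arrays::stream).toArray();
        if (left.length != right.length) {
            throw new IllegalArgumentException("Matrices must hold the same number of elements");
        }
        return IntStream.range(0, left.length).map(i -> left[i] - right[i]);
    }

    public static void main(String[] args) {
        int[][] one = { { 5, 7 }, { 9, 11 } };
        int[][] two = { { 1, 2 }, { 3, 4 } };
        System.out.println(Arrays.toString(difference(one, two).toArray()));
    }
}
```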
I hope this question was not asked before.
In java 8, I have an array of String myArray in input and an integer maxLength.
I want to count the number of string in my array smaller than maxLength. I WANT to use stream to resolve this issue.
For that I thought to do this :
long solution = Arrays.stream(myArray).filter(s -> s.length() <= maxLength).count();
However I'm not sure if it is the right way to do this. It will need to go through first array once and then go through the filtered array to count.
But if I don't use a stream, I could easily write an algorithm that loops once over myArray.
My questions are very easy: Is there a way to resolve this issue with the same time performance as with a loop ? Is it always a "good" solution to use stream ?
However I'm not sure if it is the right way to do this. It will need
to go through first array once and then go through the filtered array
to count.
Your assumption that it will perform multiple passes is wrong. There is something called operation fusion, i.e. multiple operations can be executed in a single pass over the data.
In this case Arrays.stream(myArray) creates a stream object (a cheap operation and a lightweight object), and filter(s -> s.length() <= maxLength).count() is combined into a single pass over the data because there is no stateful operation in the pipeline. The stream does not first filter all the elements and then count those that pass the predicate.
A quote from Brian Goetz post here states:
Stream pipelines, in contrast, fuse their operations into as few
passes on the data as possible, often a single pass. (Stateful
intermediate operations, such as sorting, can introduce barrier points
that necessitate multipass execution.)
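The single pass can be made visible with a small sketch that logs when each element reaches each stage. The class name and the peek() logging are additions for illustration only; the filter itself is the one from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FusionTrace {
    // Records the order in which elements reach each stage of the pipeline
    static List<String> trace(String[] words, int maxLength) {
        List<String> events = new ArrayList<>();
        long count = Arrays.stream(words)
                .peek(s -> events.add("see " + s))      // before the filter
                .filter(s -> s.length() <= maxLength)
                .peek(s -> events.add("keep " + s))     // after the filter
                .count();
        events.add("count " + count);
        return events;
    }

    public static void main(String[] args) {
        // Each element passes through all stages before the next one is read
        trace(new String[] { "a", "bbbb", "cc" }, 2).forEach(System.out::println);
    }
}
```

If the pipeline made two passes, all the "see" events would appear before the first "keep" event; in a fused sequential pipeline they interleave element by element.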
As for:
My questions are very easy: Is there a way to resolve this issue with
the same time performance than with a loop ?
That depends on the amount of data and the cost per element. For a small number of elements, the imperative for loop will almost always win, if not always.
Is it always a "good" solution to use stream ?
No, if you really care about performance then measure, measure and measure.
Use streams because they are declarative, for their abstraction and composition, and for the possibility of benefiting from parallelism, when you know you will actually benefit from it.
You can use range instead of stream and filter the output.
long solution = IntStream.range(0, myArray.length)
.filter(index -> myArray[index].length() <= maxLength)
.count();
I'm just learning the java.util.stream API and I'm looking for a way to quickly fill up a Collection with some data.
I've come up with this code to add 5 random numbers:
List<Integer> lottery = Stream.of(random.nextInt(90), random.nextInt(90), random.nextInt(90),random.nextInt(90),
random.nextInt(90)).collect(Collectors.toList());
That would be however a problem in case I have to add hundreds of items. Is there a more concise way to do it using the java.util.stream API?
(I could obviously loop in the ordinary way...)
Thanks
In this case the stream API only makes things more complicated vs. a regular loop, but you could create an IntStream (or Stream<Integer> depending on where you convert the int to Integer) that generates infinite random numbers with
IntStream rndStream = IntStream.generate(() -> rnd.nextInt(90));
Now you can use limit() to create non-infinite substreams that can be collected, such as
rndStream.limit(5).boxed().collect(Collectors.toList());
However this is just stream trickery and has no advantages over say having a method called List<Integer> getRandoms(int number) with a regular loop inside.
There's no reason to use the code displayed in this answer as it's less readable and more complex than alternatives. This just demonstrates how to get finite amounts of elements from an infinitely generated stream.
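For comparison, the getRandoms method mentioned above could be sketched with a regular loop as follows. The extra Random parameter, the class name and the bound of 90 (taken from the question) are assumptions made so the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class RandomList {
    // Returns `number` random integers in [0, 90) using a plain loop
    static List<Integer> getRandoms(int number, Random rnd) {
        List<Integer> result = new ArrayList<>(number);
        for (int i = 0; i < number; i++) {
            result.add(rnd.nextInt(90));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(getRandoms(5, new Random()));
    }
}
```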
Use Random.ints(long, int, int)
Everyone else on this page is generating the stream because they can, but Java already has such an out of the box method, Random.ints(long, int, int), available since Java 8 (and therefore the beginning of streams).
Just use it like this:
List<Integer> lottery = rnd.ints(5, 0, 90)
.boxed()
.collect(Collectors.toList());
The instruction rnd.ints(5L, 0, 90) means: "create a stream of 5 integers between 0 (included) and 90 (excluded)". So if you want 100 numbers instead of 5, just change 5 to 100.
This should do:
List<Integer> randomList = new ArrayList<>();
SecureRandom random = new SecureRandom();
IntStream.rangeClosed(1,100).forEach(value->randomList.add(random.nextInt(90)));
I am having 2 Lists and want to add them element by element. Like that:
Is there an easier way and probably much more well performing way than using a for loop to iterate over the first list and add it to the result list?
I appreciate your answer!
Depends on what kind of list and what kind of for loop.
Iterating over the elements (rather than indices) would almost certainly be plenty fast enough.
On the other hand, iterating over indices and repeatedly getting the element by index could work rather poorly for certain types of lists (e.g. a linked list).
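A minimal sketch of iterating over the elements rather than the indices: two iterators are advanced in lockstep, so no positional get is needed and the loop stays linear even for linked lists. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ElementwiseAdd {
    // Adds two lists element by element without calling get(index)
    static List<Integer> add(List<Integer> a, List<Integer> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("List sizes must match");
        }
        List<Integer> result = new ArrayList<>(a.size());
        Iterator<Integer> ia = a.iterator();
        Iterator<Integer> ib = b.iterator();
        while (ia.hasNext() && ib.hasNext()) {
            result.add(ia.next() + ib.next());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> first = new LinkedList<>(Arrays.asList(1, 2, 3));
        List<Integer> second = new LinkedList<>(Arrays.asList(10, 20, 30));
        System.out.println(add(first, second));
    }
}
```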
My understanding is that you have List1 and List2 and that you want to find the best performing way to find result[index] = List1[index] + list2[index]
My main suggestion is that before you start optimising for performance is to measure whether you need to optimise at all. You can iterate through the lists as you said, something like:
for(int i = 0; i < listSize; i++)
{
result[i] = List1[i] + List2[i];
}
In most cases this is fine. See NPE's answer for a description of where this might be expensive, i.e. a linked list. Also see this answer and note that each step of the for loop is doing a get - on an array it is done in 1 step, but in a linked list it is done in as many steps as it takes to iterate to the element in the list.
Assuming a standard array, this is O(n) and (depending on array size) will be done so quickly that it will hardly result in a blip on your performance profiling.
As a twist, since the operations are completely independent, that is result[0] = List1[0] + List2[0] is independent of result[1] = List1[1] + List2[1], etc, you can run these operations in parallel. E.g. you could run the first half of the calculations (<= List.Size / 2) on one thread and the other half (> List.Size / 2) on another thread and expect the elapsed time to roughly halve (assuming at least 2 free CPUs). Now, the best number of threads to use depends on the size of your data, the number of CPUs, other operations happening at the same time and is normally best decided by testing and modeling under different conditions. All this adds complexity to your program, so my main recommendation is to start simple, then measure and then decide whether you need to optimise.
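On Java 8 and later, one way to try the parallel variant without managing threads by hand is a parallel IntStream over the indexes. This is a sketch on plain int arrays with hypothetical names; how the work is split across threads is left to the runtime, and whether it pays off should be measured.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class ParallelAdd {
    // Each index is computed independently, so the work can be split across threads
    static int[] add(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Array lengths must match");
        }
        return IntStream.range(0, a.length)
                .parallel()
                .map(i -> a[i] + b[i])
                .toArray(); // results keep index order despite parallel execution
    }

    public static void main(String[] args) {
        int[] a = { 1, 2, 3, 4 };
        int[] b = { 10, 20, 30, 40 };
        System.out.println(Arrays.toString(add(a, b)));
    }
}
```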
Looping is inevitable unless you have a matrix API (e.g. OpenGL). You could implement a List<Integer> which is backed by the original Lists:
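import java.util.AbstractList;
import java.util.List;
// AbstractList supplies the rest of the List interface on top of get() and size()
public class CalcList extends AbstractList<Integer> {
    private final List<Integer> l1, l2;
    public CalcList(List<Integer> l1, List<Integer> l2) { this.l1 = l1; this.l2 = l2; }
    @Override
    public Integer get(int index) {
        return l1.get(index) + l2.get(index); // computed on access, nothing is copied
    }
    @Override
    public int size() { return Math.min(l1.size(), l2.size()); }
}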
This avoids copy operations and defers the calculations until the values are actually read:
CalcList results1 = new CalcList(list1, list2);
CalcList results2 = new CalcList(results1, list3);
// no calculation or memory allocated until now.
for (int result : results2) {
// here happens the calculation, still without unnecessary memory
}
This could give an advantage if the compiler is able to translate it into:
for (int i = 0; i < list1.size; i++) {
int result = list1[i] + list2[i] + list3[i] + …;
}
But I doubt that. You have to run a benchmark for your specific use case to find out if this implementation has an advantage.
Java before version 8 doesn't come with a map-style function, so the way of doing this kind of operation is using a for loop.
Even if you use some other construct, the looping will be done anyway. An alternative is using the GPU for computations but this is not a default Java feature.
Also using arrays should be faster than operating with linked lists.