I have this list (List<String>):
["a", "b", null, "c", null, "d", "e"]
And I'd like something like this:
[["a", "b"], ["c"], ["d", "e"]]
In other words, I want to split my list into sublists using the null value as a separator, in order to obtain a list of lists (List<List<String>>). I'm looking for a Java 8 solution. I've tried Collectors.partitioningBy, but I'm not sure it's what I'm looking for. Thanks!
Although there are several answers already, and an accepted answer, there are still a couple points missing from this topic. First, the consensus seems to be that solving this problem using streams is merely an exercise, and that the conventional for-loop approach is preferable. Second, the answers given thus far have overlooked an approach using array or vector-style techniques that I think improves the streams solution considerably.
First, here's a conventional solution, for purposes of discussion and analysis:
static List<List<String>> splitConventional(List<String> input) {
    List<List<String>> result = new ArrayList<>();
    int prev = 0;
    for (int cur = 0; cur < input.size(); cur++) {
        if (input.get(cur) == null) {
            result.add(input.subList(prev, cur));
            prev = cur + 1;
        }
    }
    result.add(input.subList(prev, input.size()));
    return result;
}
This is mostly straightforward but there's a bit of subtlety. One point is that a pending sublist from prev to cur is always open. When we encounter null we close it, add it to the result list, and advance prev. After the loop we close the sublist unconditionally.
Another observation is that this is a loop over indexes, not over the values themselves, thus we use an arithmetic for-loop instead of the enhanced "for-each" loop. But it suggests that we can stream using the indexes to generate subranges instead of streaming over values and putting the logic into the collector (as was done by Joop Eggen's proposed solution).
Once we've realized that, we can see that each position of null in the input is the delimiter for a sublist: it's the right end of the sublist to the left, and it (plus one) is the left end of the sublist to the right. If we can handle the edge cases, it leads to an approach where we find the indexes at which null elements occur, map them to sublists, and collect the sublists.
The resulting code is as follows:
static List<List<String>> splitStream(List<String> input) {
    int[] indexes = Stream.of(IntStream.of(-1),
                              IntStream.range(0, input.size())
                                       .filter(i -> input.get(i) == null),
                              IntStream.of(input.size()))
                          .flatMapToInt(s -> s)
                          .toArray();
    return IntStream.range(0, indexes.length - 1)
                    .mapToObj(i -> input.subList(indexes[i] + 1, indexes[i + 1]))
                    .collect(toList());
}
Getting the indexes at which null occurs is pretty easy. The stumbling block is adding -1 at the left and size at the right end. I've opted to use Stream.of to do the appending and then flatMapToInt to flatten them out. (I tried several other approaches but this one seemed like the cleanest.)
It's a bit more convenient to use arrays for the indexes here. First, the notation for accessing an array is nicer than for a List: indexes[i] vs. indexes.get(i). Second, using an array avoids boxing.
At this point, each index value in the array (except for the last) is one less than the beginning position of a sublist. The index to its immediate right is the end of the sublist. We simply stream over the array and map each pair of indexes into a sublist and collect the output.
Discussion
The streams approach is slightly shorter than the for-loop version, but it's denser. The for-loop version is familiar, because we do this stuff in Java all the time, but if you're not already aware of what this loop is supposed to be doing, it's not obvious. You might have to simulate a few loop executions before you figure out what prev is doing and why the open sublist has to be closed after the end of the loop. (I initially forgot that closing step, but caught it in testing.)
The streams approach is, I think, easier to conceptualize what's going on: get a list (or an array) that indicates the boundaries between sublists. That's an easy streams two-liner. The difficulty, as I mentioned above, is finding a way to tack the edge values onto the ends. If there were a better syntax for doing this, e.g.,
// Java plus pidgin Scala
int[] indexes =
[-1] ++ IntStream.range(0, input.size())
.filter(i -> input.get(i) == null) ++ [input.size()];
it would make things a lot less cluttered. (What we really need is array or list comprehension.) Once you have the indexes, it's a simple matter to map them into actual sublists and collect them into the result list.
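In standard Java 8, the closest thing to that appending syntax that I can offer as a sketch is nesting IntStream.concat calls (reusing input from the method above; arguably no cleaner than the flatMapToInt version):

int[] indexes = IntStream.concat(IntStream.of(-1),
                IntStream.concat(IntStream.range(0, input.size())
                                          .filter(i -> input.get(i) == null),
                                 IntStream.of(input.size())))
        .toArray();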
And of course this is safe when run in parallel.
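For instance (a sketch, not part of the original code), only the sublist-mapping stage needs a parallel() call to run it as a parallel stream:

return IntStream.range(0, indexes.length - 1)
                .parallel()
                .mapToObj(i -> input.subList(indexes[i] + 1, indexes[i + 1]))
                .collect(toList());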
UPDATE 2016-02-06
Here's a nicer way to create the array of sublist indexes. It's based on the same principles, but it adjusts the index range and adds some conditions to the filter to avoid having to concatenate and flatmap the indexes.
static List<List<String>> splitStream(List<String> input) {
    int sz = input.size();
    int[] indexes =
        IntStream.rangeClosed(-1, sz)
                 .filter(i -> i == -1 || i == sz || input.get(i) == null)
                 .toArray();
    return IntStream.range(0, indexes.length - 1)
                    .mapToObj(i -> input.subList(indexes[i] + 1, indexes[i + 1]))
                    .collect(toList());
}
UPDATE 2016-11-23
I co-presented a talk with Brian Goetz at Devoxx Antwerp 2016, "Thinking In Parallel" (video) that featured this problem and my solutions. The problem presented there is a slight variation that splits on "#" instead of null, but it's otherwise the same. In the talk, I mentioned that I had a bunch of unit tests for this problem. I've appended them below, as a standalone program, along with my loop and streams implementations. An interesting exercise for readers is to run solutions proposed in other answers against the test cases I've provided here, and to see which ones fail and why. (The other solutions will have to be adapted to split based on a predicate instead of splitting on null.)
import java.util.*;
import java.util.function.*;
import java.util.stream.*;
import static java.util.Arrays.asList;

public class ListSplitting {
    static final Map<List<String>, List<List<String>>> TESTCASES = new LinkedHashMap<>();
    static {
        TESTCASES.put(asList(),
                      asList(asList()));
        TESTCASES.put(asList("a", "b", "c"),
                      asList(asList("a", "b", "c")));
        TESTCASES.put(asList("a", "b", "#", "c", "#", "d", "e"),
                      asList(asList("a", "b"), asList("c"), asList("d", "e")));
        TESTCASES.put(asList("#"),
                      asList(asList(), asList()));
        TESTCASES.put(asList("#", "a", "b"),
                      asList(asList(), asList("a", "b")));
        TESTCASES.put(asList("a", "b", "#"),
                      asList(asList("a", "b"), asList()));
        TESTCASES.put(asList("#"),
                      asList(asList(), asList()));
        TESTCASES.put(asList("a", "#", "b"),
                      asList(asList("a"), asList("b")));
        TESTCASES.put(asList("a", "#", "#", "b"),
                      asList(asList("a"), asList(), asList("b")));
        TESTCASES.put(asList("a", "#", "#", "#", "b"),
                      asList(asList("a"), asList(), asList(), asList("b")));
    }

    static final Predicate<String> TESTPRED = "#"::equals;

    static void testAll(BiFunction<List<String>, Predicate<String>, List<List<String>>> f) {
        TESTCASES.forEach((input, expected) -> {
            List<List<String>> actual = f.apply(input, TESTPRED);
            System.out.println(input + " => " + expected);
            if (!expected.equals(actual)) {
                System.out.println("  ERROR: actual was " + actual);
            }
        });
    }

    static <T> List<List<T>> splitStream(List<T> input, Predicate<? super T> pred) {
        int[] edges = IntStream.range(-1, input.size() + 1)
                               .filter(i -> i == -1 || i == input.size() ||
                                            pred.test(input.get(i)))
                               .toArray();
        return IntStream.range(0, edges.length - 1)
                        .mapToObj(k -> input.subList(edges[k] + 1, edges[k + 1]))
                        .collect(Collectors.toList());
    }

    static <T> List<List<T>> splitLoop(List<T> input, Predicate<? super T> pred) {
        List<List<T>> result = new ArrayList<>();
        int start = 0;
        for (int cur = 0; cur < input.size(); cur++) {
            if (pred.test(input.get(cur))) {
                result.add(input.subList(start, cur));
                start = cur + 1;
            }
        }
        result.add(input.subList(start, input.size()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println("===== Loop =====");
        testAll(ListSplitting::splitLoop);
        System.out.println("===== Stream =====");
        testAll(ListSplitting::splitStream);
    }
}
The only solution I can come up with for the moment is to implement your own custom collector.
Before reading the solution, a few notes: I took this question more as a programming exercise, and I'm not sure whether it can be done with a parallel stream.
So you have to be aware that it will silently break if the pipeline is run in parallel.
That is not desirable behavior and should be avoided. This is why I throw an exception in the combiner part (instead of (l1, l2) -> {l1.addAll(l2); return l1;}), since the combiner is used in parallel when combining the two partial lists; that way you get an exception instead of a wrong result.
Also, this is not very efficient due to list copying (although it uses a native method to copy the underlying array).
So here's the collector implementation:
private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    final List<String> current = new ArrayList<>();
    return Collector.of(() -> new ArrayList<List<String>>(),
        (l, elem) -> {
            if (sep.test(elem)) {
                l.add(new ArrayList<>(current));
                current.clear();
            } else {
                current.add(elem);
            }
        },
        (l1, l2) -> {
            throw new RuntimeException("Should not run this in parallel");
        },
        l -> {
            if (current.size() != 0) {
                l.add(current);
            }
            return l;
        });
}
and how to use it:
List<List<String>> ll = list.stream().collect(splitBySeparator(Objects::isNull));
Output:
[[a, b], [c], [d, e]]
Now that Joop Eggen's answer is out, it appears that it can be done in parallel (give him credit for that!). That reduces the custom collector implementation to:
private static Collector<String, List<List<String>>, List<List<String>>> splitBySeparator(Predicate<String> sep) {
    return Collector.of(() -> new ArrayList<List<String>>(Arrays.asList(new ArrayList<>())),
        (l, elem) -> {
            if (sep.test(elem)) {
                l.add(new ArrayList<>());
            } else {
                l.get(l.size() - 1).add(elem);
            }
        },
        (l1, l2) -> {
            l1.get(l1.size() - 1).addAll(l2.remove(0));
            l1.addAll(l2);
            return l1;
        });
}
which makes the paragraph about parallelism above somewhat obsolete; I leave it in, however, as it can be a good reminder.
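As a quick usage sketch (not from the original code), the reduced collector can then also be used with a parallel stream:

List<List<String>> ll = list.parallelStream().collect(splitBySeparator(Objects::isNull));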
Note that the Stream API is not always the right tool. Some tasks are easier and more suitable for streams, and some are not. In your case, you could also create a utility method for this:
private static <T> List<List<T>> splitBySeparator(List<T> list, Predicate<? super T> predicate) {
    final List<List<T>> finalList = new ArrayList<>();
    int fromIndex = 0;
    int toIndex = 0;
    for (T elem : list) {
        if (predicate.test(elem)) {
            finalList.add(list.subList(fromIndex, toIndex));
            fromIndex = toIndex + 1;
        }
        toIndex++;
    }
    if (fromIndex != toIndex) {
        finalList.add(list.subList(fromIndex, toIndex));
    }
    return finalList;
}
and call it like List<List<String>> list = splitBySeparator(originalList, Objects::isNull);.
It can be improved for checking edge-cases.
The solution is to use Stream.collect. Creating a Collector with its builder pattern has already been given as a solution; the alternative is the other overload of collect, which is a tiny bit more primitive.
List<String> strings = Arrays.asList("a", "b", null, "c", null, "d", "e");
List<List<String>> groups = strings.stream()
    .collect(() -> {
                 List<List<String>> list = new ArrayList<>();
                 list.add(new ArrayList<>());
                 return list;
             },
             (list, s) -> {
                 if (s == null) {
                     list.add(new ArrayList<>());
                 } else {
                     list.get(list.size() - 1).add(s);
                 }
             },
             (list1, list2) -> {
                 // Simple merging of partial sublists would
                 // introduce a false level-break at the beginning.
                 list1.get(list1.size() - 1).addAll(list2.remove(0));
                 list1.addAll(list2);
             });
As one sees, I make a list of string lists, where there always is at least one last (empty) string list.
The first function creates a starting list of string lists. It specifies the result (typed) object.
The second function is called to process each element. It is an action on the partial result and an element.
The third is not really used, it comes into play on parallelising the processing, when partial results must be combined.
A solution with an accumulator:
As @StuartMarks points out, the combiner does not fulfill the contract for parallelism.
Prompted by @ArnaudDenoyelle's comment, here is a version using reduce.
List<List<String>> groups = strings.stream()
    .reduce(new ArrayList<List<String>>(),
            (list, s) -> {
                if (list.isEmpty()) {
                    list.add(new ArrayList<>());
                }
                if (s == null) {
                    list.add(new ArrayList<>());
                } else {
                    list.get(list.size() - 1).add(s);
                }
                return list;
            },
            (list1, list2) -> {
                list1.addAll(list2);
                return list1;
            });
The first parameter is the accumulated object.
The second function accumulates.
The third is the aforementioned combiner.
Please do not vote; I just do not have enough room to explain this in the comments.
This is a solution with a Stream and forEach, but it is strictly equivalent to Alexis's solution or a plain for-each loop (and less clear; I could not get rid of the copy constructor):
List<List<String>> result = new ArrayList<>();
final List<String> current = new ArrayList<>();
list.stream().forEach(s -> {
    if (s == null) {
        result.add(new ArrayList<>(current));
        current.clear();
    } else {
        current.add(s);
    }
});
result.add(current);
System.out.println(result);
I understand that you want to find a more elegant solution with Java 8, but I truly think it was not designed for this case. And, as Mr spoon said, I would strongly prefer the naive way here.
Although Stuart Marks' answer is concise, intuitive and parallel-safe (and the best), I want to share another interesting solution that doesn't need the start/end boundary trick.
If we look at the problem domain and think about parallelism, we can easily solve this with a divide-and-conquer strategy. Instead of treating the problem as a serial list we have to traverse, we can look at it as a composition of the same basic problem: splitting a list at a null value. We can intuitively see that the problem breaks down recursively with the following strategy:
split(L):
  - if (no null value found) -> return just the simple list
  - else -> cut L around 'null', naming the resulting sublists L1 and L2,
            and return split(L1) + split(L2)
In this case, we first search for any null value, and the moment we find one, we immediately cut the list and invoke a recursive call on the sublists. If we don't find null (the base case), we are finished with this branch and just return the list. Concatenating all the results gives the list we are searching for.
The algorithm is simple and complete: we don't need any special tricks to handle the start/end of the list, nor for edge cases such as empty lists, lists containing only null values, or lists that start or end with null.
A simple naive implementation of this strategy looks as follows:
public List<List<String>> split(List<String> input) {
    OptionalInt index = IntStream.range(0, input.size())
                                 .filter(i -> input.get(i) == null)
                                 .findAny();
    if (!index.isPresent())
        return asList(input);

    List<String> firstHalf = input.subList(0, index.getAsInt());
    List<String> secondHalf = input.subList(index.getAsInt() + 1, input.size());
    return asList(firstHalf, secondHalf).stream()
        .map(this::split)
        .flatMap(List::stream)
        .collect(toList());
}
We first search for the index of any null value in the list. If we don't find one, we return the list. If we find one, we split the list in 2 sublists, stream over them and recursively call the split method again. The resulting lists of the sub-problem are then extracted and combined for the return value.
Note that the two streams can easily be made parallel() and the algorithm will still work, because of the functional decomposition of the problem.
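For example (just a sketch of that remark, not the answer's original code), the second stream of the implementation above could become:

return asList(firstHalf, secondHalf).parallelStream()
    .map(this::split)
    .flatMap(List::stream)
    .collect(toList());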
Although the code is already pretty concise, it can always be adapted in numerous ways. For the sake of an example, instead of checking the optional value in the base case, we could take advantage of the orElse method on the OptionalInt to return the end-index of the list, enabling us to re-use the second stream and additionally filter out empty lists:
public List<List<String>> split(List<String> input) {
    int index = IntStream.range(0, input.size())
                         .filter(i -> input.get(i) == null)
                         .findAny().orElse(input.size());
    return asList(input.subList(0, index), input.subList(index + 1, input.size())).stream()
        .map(this::split)
        .flatMap(List::stream)
        .filter(list -> !list.isEmpty())
        .collect(toList());
}
The example is only given to indicate the simplicity, adaptability and elegance of a recursive approach. Indeed, this version introduces a small performance penalty and fails on some inputs, such as an empty list (and as such needs extra guards).
In this case, recursion is probably not the best solution (Stuart Marks' algorithm to find indexes is only O(N), and mapping/splitting lists has a significant cost), but it expresses the solution with a simple, intuitive, parallelizable algorithm without any side effects.
I won't dig deeper into the complexity, advantages/disadvantages, or use cases with stop criteria and/or partial result availability. I just felt the need to share this solution strategy, since the other approaches were either purely iterative or used an overly complex algorithm that was not parallelizable.
Here's another approach, using a grouping function that makes use of list indices for grouping.
Here I'm grouping each element by the index of the first null value that follows it. So, in your example, "a" and "b" would be mapped to 2. Also, I'm mapping null values to index -1, which is removed later on.
List<String> list = Arrays.asList("a", "b", null, "c", null, "d", "e");

Function<String, Integer> indexGroupingFunc = (str) -> {
    if (str == null) {
        return -1;
    }
    int index = list.indexOf(str) + 1;
    while (index < list.size() && list.get(index) != null) {
        index++;
    }
    return index;
};

Map<Integer, List<String>> grouped = list.stream()
                                         .collect(Collectors.groupingBy(indexGroupingFunc));
grouped.remove(-1);  // Remove null elements grouped under -1

System.out.println(grouped.values());  // [[a, b], [c], [d, e]]
You can also avoid getting the first index of null element every time, by caching the current min index in an AtomicInteger. The updated Function would be like:
AtomicInteger currentMinIndex = new AtomicInteger(-1);
Function<String, Integer> indexGroupingFunc = (str) -> {
    if (str == null) {
        return -1;
    }
    int index = list.indexOf(str) + 1;
    if (currentMinIndex.get() > index) {
        return currentMinIndex.get();
    } else {
        while (index < list.size() && list.get(index) != null) {
            index++;
        }
        currentMinIndex.set(index);
        return index;
    }
};
Well, after a bit of work I have come up with a one-line stream-based solution. It ultimately uses reduce() to do the grouping, which seemed the natural choice, but it was a bit ugly getting the strings into the List<List<String>> form required by reduce:
List<List<String>> result = list.stream()
    .map(Arrays::asList)
    .map(x -> new LinkedList<String>(x))
    .map(Arrays::asList)
    .map(x -> new LinkedList<List<String>>(x))
    .reduce((a, b) -> {
        if (b.getFirst().get(0) == null)
            a.add(new LinkedList<String>());
        else
            a.getLast().addAll(b.getFirst());
        return a;
    }).get();
It is however 1 line!
When run with input from the question,
System.out.println(result);
Produces:
[[a, b], [c], [d, e]]
This is a very interesting problem. I came up with a one-line solution. It might not be very performant, but it works.
List<String> list = Arrays.asList("a", "b", null, "c", null, "d", "e");

Collection<List<String>> cl = IntStream.range(0, list.size())
    .filter(i -> list.get(i) != null).boxed()
    .collect(Collectors.groupingBy(
        i -> IntStream.range(0, i).filter(j -> list.get(j) == null).count(),
        Collectors.mapping(i -> list.get(i), Collectors.toList()))
    ).values();
It is a similar idea to what @Rohit Jain came up with: I'm grouping by the gaps between the null values.
If you really want a List<List<String>> you may append:
List<List<String>> ll = cl.stream().collect(Collectors.toList());
Group by a different token whenever you find a null (or separator). Here I use an increasing integer as the token (the AtomicInteger is used just as a mutable holder).
Then remap the generated map to transform it into a list of lists.
AtomicInteger i = new AtomicInteger();
List<List<String>> x = Stream.of("A", "B", null, "C", "D", "E", null, "H", "K")
    .collect(Collectors.groupingBy(s -> s == null ? i.incrementAndGet() : i.get()))
    .entrySet().stream()
    .map(e -> e.getValue().stream().filter(v -> v != null).collect(Collectors.toList()))
    .collect(Collectors.toList());

System.out.println(x);
Here is code using abacus-common:
List<String> list = N.asList(null, null, "a", "b", null, "c", null, null, "d", "e");
Stream.of(list).splitIntoList(null, (e, any) -> e == null, null).filter(e -> e.get(0) != null).forEach(N::println);
Declaration: I'm the developer of abacus-common.
In my StreamEx library there's a groupRuns method which can help you to solve this:
List<String> input = Arrays.asList("a", "b", null, "c", null, "d", "e");
List<List<String>> result = StreamEx.of(input)
        .groupRuns((a, b) -> a != null && b != null)
        .remove(list -> list.get(0) == null)
        .toList();
The groupRuns method takes a BiPredicate which for the pair of adjacent elements returns true if they should be grouped. After that we remove groups containing nulls and collect the rest to the List.
This solution is parallel-friendly: you may use it for parallel stream as well. Also it works nice with any stream source (not only random access lists like in some other solutions) and it's somewhat better than collector-based solutions as here you can use any terminal operation you want without intermediate memory waste.
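For instance (a sketch, assuming the same StreamEx API as above), the pipeline can be run in parallel and finished with a different terminal operation, such as counting the groups:

long groupCount = StreamEx.of(input).parallel()
        .groupRuns((a, b) -> a != null && b != null)
        .remove(list -> list.get(0) == null)
        .count();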
With String one can do:
String s = ....;
String[] parts = s.split("sth");
If all sequential collections (as the String is a sequence of chars) had this abstraction this could be doable for them too:
List<T> l = ...
List<List<T>> parts = l.split(condition) (possibly with several overloaded variants)
If we restrict the original problem to a List of Strings (and impose some restrictions on its elements' contents), we could hack it like this:
String als = Arrays.toString(new String[]{"a", "b", null, "c", null, "d", "e"});
String[] sa = als.substring(1, als.length() - 1).split("null, ");
List<List<String>> res = Stream.of(sa).map(s -> Arrays.asList(s.split(", "))).collect(Collectors.toList());
(please don't take it seriously though :))
Otherwise, plain old recursion also works:
List<List<String>> part(List<String> input, List<List<String>> acc, List<String> cur, int i) {
    if (i == input.size()) return acc;
    if (input.get(i) != null) {
        cur.add(input.get(i));
    } else if (!cur.isEmpty()) {
        acc.add(cur);
        cur = new ArrayList<>();
    }
    return part(input, acc, cur, i + 1);
}
(note in this case null has to be appended to the input list)
part(input, new ArrayList<>(), new ArrayList<>(), 0)
I was watching the "Thinking in Parallel" video by Stuart Marks, so I decided to solve it before seeing his solution in the video. I will update my solution over time; for now:
List<String> abc = Arrays.asList("a", "b", "#", "c", "#", "d", "e");  // assumed input (using "#" as separator)
List<List<String>> sublist = new ArrayList<>();                       // assumed result list (not shown in my original draft)

Arrays.asList(IntStream.range(0, abc.size())
                       .filter(index -> abc.get(index).equals("#"))
                       .toArray())
      .stream().forEach(index -> {
          for (int i = 0; i < index.length; i++) {
              if (sublist.size() == 0) {
                  sublist.add(new ArrayList<String>(abc.subList(0, index[i])));
              } else {
                  sublist.add(new ArrayList<String>(abc.subList(index[i - 1] + 1, index[i])));
              }
          }
          sublist.add(new ArrayList<String>(abc.subList(index[index.length - 1] + 1, abc.size())));
      });
Related
I'm trying to construct a map from a list. My goal is to compare two lists and find the differences between those two lists. Then I want to construct a map, in order to know at which indexes I found differences.
I did it in Java, not in a great way I believe, but it's working.
// I compare the two values for a given index; if the values are the same, I set null in my result list
List<String> result = IntStream.range(0, list1.size()).boxed()
        .map(i -> !Objects.equals(list1.get(i), list2.get(i)) ? (list1.get(i) + " != " + list2.get(i)) : null)
        .collect(Collectors.toList());

// I filter all the null values, in order to retrieve only the differences with their index
Map<Integer, String> mapResult =
        IntStream.range(0, result.size())
                 .boxed().filter(i -> null != result.get(i))
                 .collect(Collectors.toMap(i -> i, result::get));
It's not optimal, but it's working. If you have suggestions regarding those lines of code, I will gladly take them.
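For reference, I imagine the two steps could be collapsed into a single pass, something like this sketch (assuming both lists are non-null and the same size):

Map<Integer, String> mapResult = IntStream.range(0, list1.size())
        .filter(i -> !Objects.equals(list1.get(i), list2.get(i)))
        .boxed()
        .collect(Collectors.toMap(i -> i, i -> list1.get(i) + " != " + list2.get(i)));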
I tried to replicate this kind of behavior in Kotlin, but I didn't succeed in using the map() constructor. (I'm still learning Kotlin; I'm not very familiar with it.)
Thank you for your help.
You may use the zip function on collections to join elements pairwise. The withIndex() function helps to turn a list into a list of pairs of element index and value. The full solution may be as follows:
val list1 = listOf("a", "b", "c")
val list2 = listOf("a", "B", "c")
val diff : Map<Int, String> = list1.withIndex()
.zip(list2) { (idx,a), b -> if (a != b) idx to "$a != $b" else null}
.filterNotNull().toMap()
Note that the zip function iterates only while there are elements in both lists; it will skip any leftover elements from the longer list. That can be fixed by padding with the following function:
fun <T> List<T>.addNulls(element: T, toSize: Int) : List<T> {
    val elementsToAdd = (toSize - size)
    return if (elementsToAdd > 0) {
        this + List(elementsToAdd) { element }
    } else {
        this
    }
}
and call the function on both lists before using the zip function
I want to determine whether the strings in a list are all anagrams of each other, using Java 8.
Example input:
"cat", "cta", "act", "atc", "tac", "tca"
I have written the following function that does the job, but I am wondering if there is a better and more elegant way to do this.
boolean isAnagram(String[] list) {
    long count = Stream.of(list)
                       .map(String::toCharArray)
                       .map(arr -> {
                           Arrays.sort(arr);
                           return arr;
                       })
                       .map(String::valueOf)
                       .distinct()
                       .count();
    return count == 1;
}
It seems I can't sort a char array with the Stream.sorted() method, which is why I used a second map operation. If there is some way I can operate directly on a stream of chars instead of a Stream of char arrays, that would also help.
Instead of creating and sorting a char[] or int[], which cannot be done inline and thus "breaks" the stream, you could get a stream of the chars in the Strings and sort those before converting them to arrays. Note that this is an IntStream, though, and String.valueOf(int[]) will include the array's memory address, which is not very useful here, so better use Arrays.toString in this case.
boolean anagrams = Stream.of(words)
        .map(String::chars).map(IntStream::sorted)
        .map(IntStream::toArray).map(Arrays::toString)
        .distinct().count() == 1;
Of course, you can also use map(s -> Arrays.toString(s.chars().sorted().toArray())) instead of the series of four maps. Not sure if there's a (significant) difference in speed, it's probably mainly a matter of taste.
Also, you could use IntBuffer.wrap to make the arrays comparable, which should be considerably faster than Arrays.toString (thanks to Holger in comments).
boolean anagrams = Stream.of(words)
        .map(s -> IntBuffer.wrap(s.chars().sorted().toArray()))
        .distinct().count() == 1;
I would not deal with counting distinct values, as that's not what you are interested in. What you want to know is whether all elements are equal according to a special equality rule.
So when we create a method to convert a String to a canonical key (i.e. all characters sorted)
private CharBuffer canonical(String s) {
    char[] array = s.toCharArray();
    Arrays.sort(array);
    return CharBuffer.wrap(array);
}
we can simply check whether all subsequent elements are equal to the first one:
boolean isAnagram(String[] list) {
    if (list.length == 0) return false;
    return Arrays.stream(list, 1, list.length)
                 .map(this::canonical)
                 .allMatch(canonical(list[0])::equals);
}
Note that for method references of the form expression::name, the expression is evaluated once and the result captured, so canonical(list[0]) is evaluated only once for the entire stream operation and only equals is invoked for every element.
Of course, you can also use the Stream API to create the canonical keys:
private IntBuffer canonical(String s) {
    return IntBuffer.wrap(s.chars().sorted().toArray());
}
(the isAnagram method does not need any change)
Note that CharBuffer and IntBuffer can be used as lightweight wrappers around arrays, like in this answer, and implement equals and hashCode appropriately, based on the actual array contents.
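As a small illustration of that last point (a sketch, not from the original answer; java.nio.CharBuffer assumed imported), two buffers wrapping arrays with the same contents compare equal and hash equally:

char[] a = {'a', 'c', 't'};
char[] b = {'a', 'c', 't'};
System.out.println(CharBuffer.wrap(a).equals(CharBuffer.wrap(b)));                  // true: compares contents, not identity
System.out.println(CharBuffer.wrap(a).hashCode() == CharBuffer.wrap(b).hashCode()); // true as well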
I wouldn't sort the char array, as sorting is O(NlogN), which is not necessary here.
All we need is, for each word of the list, to count the occurrences of each character. For this, we're collecting each word's characters to a Map<Integer, Long>, with the keys being each character and the value being its count.
Then, we check that, for all the words in the array argument, we have the same count of characters, i.e. the same map:
return Arrays.stream(list)
             .map(word -> word.chars()
                              .boxed()
                              .collect(Collectors.groupingBy(c -> c, Collectors.counting())))
             .distinct()
             .count() == 1;
Alternatively an updated version of your implementation that could work would be:
boolean isAnagram(String[] list) {
    return Stream.of(list)             // Stream<String>
                 .map(String::toCharArray)  // Stream<char[]>
                 .peek(Arrays::sort)        // sort each array in place
                 .map(String::valueOf)      // Stream<String>
                 .distinct()                // distinct
                 .count() == 1;
}
Or maybe with a BitSet:
System.out.println(Stream.of(list)    // the String[] from the question
        .map(String::chars)
        .map(x -> {
            BitSet bitSet = new BitSet();
            x.forEach(bitSet::set);
            return bitSet;
        })
        .collect(Collector.of(
                BitSet::new,
                BitSet::xor,
                (left, right) -> {
                    left.xor(right);
                    return left;
                }
        ))
        .cardinality() == 0);
Can I access the index of the object in the list somehow?
myList.stream().sorted((o1, o2) -> 0).collect(Collectors.toList())
e.g.:
I'd like odd indices to be displayed first and even indices at the end.
I wouldn’t consider index based reordering operations to be actual sorting operations. E.g., no-one would consider implementing an operation like Collections.reverse(List) as a sorting operation.
An efficient method for moving elements at odd positions to the front in-place would be
public static <T> void oddFirst(List<T> list) {
    final int size = list.size();
    for (int index1 = 0, index2 = list.size() / 2 | 1; index2 < size; index1 += 2, index2 += 2)
        Collections.swap(list, index1, index2);
}
Alternatively, you may stream over the indices like in this answer, to generate a new List.
A filter may help:
List<Integer> listEvenIndicesMembers = IntStream.range(0, list.size())
        .filter(n -> n % 2 == 0)
        .mapToObj(list::get)
        .collect(Collectors.toList());
List<Integer> listOddIndicesMembers = IntStream.range(0, list.size())
        .filter(n -> n % 2 != 0)
        .mapToObj(list::get)
        .collect(Collectors.toList());

System.out.println(listEvenIndicesMembers);
System.out.println(listOddIndicesMembers);
[1, 3, 5, 7, 9, 11]
[2, 4, 6, 8, 10]
The problem is that now you have two lists; appending one after the other will produce what you want... I am still checking the docs, maybe I'll find something more elegant/optimized.
Edit:
Thanks to @Holger for the neat suggestion: you can concatenate the streams like this:
List<Integer> listOptimal = IntStream
        .concat(IntStream.range(0, list.size()).filter(n -> n % 2 == 0),
                IntStream.range(0, list.size()).filter(n -> n % 2 != 0))
        .mapToObj(list::get)
        .collect(Collectors.toList());
System.out.println(listOptimal);
The accepted answer works, but this works too (as long as there are no duplicates in the list):
// a list to test with
List<String> list = Arrays.asList("abcdefghijklmnopqrstuvwxyz".split(""));
List<String> list2 = list.stream()
        .sorted(Comparator.comparing(x -> list.indexOf(x) % 2).thenComparing(x -> list.indexOf(x)))
        .collect(Collectors.toList());
list2.forEach(System.out::print);
This prints odd indices first, then the even indices
acegikmoqsuwybdfhjlnprtvxz
Just to illustrate the point Holger made in his comment.
The solution in this answer took my machine 75 ms to run.
The solution in this answer took only 3 ms.
And Holger's own answer ends up with an astonishing < 0 ms.
I think the solution of ΦXocę 웃 Пepeúpa ツ should be fine for you; here's just an alternative for the sake of learning (it should not be used in 'real life', it's just to show the possibility):
public static void main(String arg[]) {
    List<String> list = Arrays.asList("A", "B", "C", "D", "E", "F", "G", "H");
    List<String> listCopy = new ArrayList<>(list);
    list.sort((o1, o2) -> Integer.compare(listCopy.indexOf(o2) % 2, listCopy.indexOf(o1) % 2));
    System.out.println(list);
}
Output is: [B, D, F, H, A, C, E, G]
You may use the decorator pattern to store your objects plus extra information for sorting. E.g. if you have a list of strings you want to sort:
class StringWithIndex {
    String text;
    int index;
    int evenIndex;

    StringWithIndex(String text, int index) {
        this.text = text;
        this.index = index;
        this.evenIndex = index % 2 == 0 ? 1 : -1;
    }

    String getText() { return text; }
    int getIndex() { return index; }
    int getEvenIndex() { return evenIndex; }
}
And then you can sort such objects instead of strings:
List<String> strings = Arrays.asList("a", "b", "c", "d");
List<String> sorted = IntStream.range(0, strings.size())
        .mapToObj(i -> new StringWithIndex(strings.get(i), i))
        .sorted(comparingInt(StringWithIndex::getEvenIndex).thenComparingInt(StringWithIndex::getIndex))
        .map(StringWithIndex::getText)
        .collect(Collectors.toList());
This adds some overhead to create temporary objects and requires another class. But it can prove very useful as the sorting rules become more complicated.
If all you want to do is move List values around according to their index, then see my other answer.
However, the original wording of your question suggests you want to use sort() or sorted() with a comparator that takes existing position into account as well as other aspects of the value.
This would also be difficult if you just used Collections.sort(), because the Comparator used there doesn't have access to the index either.
You could map your Stream<T> to a Stream<Entry<Integer,T>> and perhaps map it back to Stream<T> when you've finished:
(This is AbstractMap.SimpleEntry -- because it exists in the standard libs -- you could also write your own Pair or use one from a 3rd party -- see A Java collection of value pairs? (tuples?) )
mystream.map(addIndex()).sorted(compareEntry()).map(stripIndex()) ...

private <T> Function<T, Entry<Integer, T>> addIndex() {
    final int[] index = new int[] { 0 };
    return o -> new SimpleEntry<>(index[0]++, o);
}

private <T> Comparator<Entry<Integer, T>> compareEntry() {
    return (a, b) -> {
        // your comparison code here -- you have access to
        // getKey() and getValue() for both parameters
        return 0; // placeholder
    };
}

private <T> Function<Entry<Integer, T>, T> stripIndex() {
    return e -> e.getValue();
}
Note that addIndex(), at least, is not parallelisable. I suppose that once all the entries are tagged with indices, downstream from there things could be done in parallel.
Bonus answer - if all you want to do is create a new List containing the odd entries followed by the even entries, then using Stream is adding needless complexity.
for (int i = 0; i < inlist.size(); i += 2) {
    outlist.add(inlist.get(i));
}
for (int i = 1; i < inlist.size(); i += 2) {
    outlist.add(inlist.get(i));
}
There's also a pretty simple algorithm that will re-order the list in place -- which you can write for yourself. Think about it -- there's a simple function to get the new index from the old index, as long as you know the list length.
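As a sketch of that hint (my own illustration, not this answer's code; it builds a new list rather than reordering in place, and the method name is made up), the index mapping for the same front-half/back-half order as the loops above would be:

static <T> List<T> reorder(List<T> in) {
    int n = in.size();
    List<T> out = new ArrayList<>(in);   // same size; every slot is overwritten below
    for (int i = 0; i < n; i++) {
        // entries at even old indices fill the front half, odd old indices fill the back half
        int newIndex = (i % 2 == 0) ? i / 2 : (n + 1) / 2 + i / 2;
        out.set(newIndex, in.get(i));
    }
    return out;
}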
I've inherited a bunch of code that makes extensive use of parallel arrays to store key/value pairs. It actually made sense to do it this way, but it's sort of awkward to write loops that iterate over these values. I really like the new Java foreach construct, but it does not seem like there is a way to iterate over parallel lists using this.
With a normal for loop, I can do this easily:
for (int i = 0; i < list1.length; ++i) {
    doStuff(list1[i]);
    doStuff(list2[i]);
}
But in my opinion this is not semantically pure, since we are not checking the bounds of list2 during iteration. Is there some clever syntax similar to the for-each that I can use with parallel lists?
I would use a Map myself. But taking you at your word that a pair of arrays makes sense in your case, how about a utility method that takes your two arrays and returns an Iterable wrapper?
Conceptually:
for (Pair<K,V> p : wrap(list1, list2)) {
    doStuff(p.getKey());
    doStuff(p.getValue());
}
The Iterable<Pair<K,V>> wrapper would hide the bounds checking.
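A minimal sketch of such a wrapper (using Map.Entry as the Pair type; all names are illustrative, and java.util imports are assumed):

static <K, V> Iterable<Map.Entry<K, V>> wrap(K[] keys, V[] values) {
    if (keys.length != values.length) {
        throw new IllegalArgumentException("arrays must have the same length");
    }
    return () -> new Iterator<Map.Entry<K, V>>() {
        private int i = 0;

        public boolean hasNext() {
            return i < keys.length;
        }

        public Map.Entry<K, V> next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            // pair up the elements at the current position of both arrays
            Map.Entry<K, V> entry = new AbstractMap.SimpleImmutableEntry<>(keys[i], values[i]);
            i++;
            return entry;
        }
    };
}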
From the official Oracle page on the enhanced for loop:
Finally, it is not usable for loops that must iterate over multiple collections in parallel. These shortcomings were known by the designers, who made a conscious decision to go with a clean, simple construct that would cover the great majority of cases.
Basically, you're best off using the normal for loop.
If you're using these pairs of arrays to simulate a Map, you could always write a class that implements the Map interface with the two arrays; this could let you abstract away much of the looping.
Without looking at your code, I cannot tell you whether this option is the best way forward, but it is something you could consider.
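A rough sketch of that idea (read-only, no duplicate-key handling; the class name and layout are my own invention, not an existing API):

class TwoArrayMap<K, V> extends AbstractMap<K, V> {
    private final K[] keys;
    private final V[] values;

    TwoArrayMap(K[] keys, V[] values) {
        if (keys.length != values.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.keys = keys;
        this.values = values;
    }

    @Override
    public Set<Entry<K, V>> entrySet() {
        // AbstractMap derives get(), containsKey(), size(), etc. from this view.
        Set<Entry<K, V>> entries = new LinkedHashSet<>();
        for (int i = 0; i < keys.length; i++) {
            entries.add(new SimpleImmutableEntry<>(keys[i], values[i]));
        }
        return entries;
    }
}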
This was a fun exercise. I created an object called ParallelList that takes a variable number of typed lists, and can iterate over the values at each index (returned as a list of values):
public class ParallelList<T> implements Iterable<List<T>> {
    private final List<List<T>> lists;

    public ParallelList(List<T>... lists) {
        this.lists = new ArrayList<List<T>>(lists.length);
        this.lists.addAll(Arrays.asList(lists));
    }

    public Iterator<List<T>> iterator() {
        return new Iterator<List<T>>() {
            private int loc = 0;

            public boolean hasNext() {
                boolean hasNext = false;
                for (List<T> list : lists) {
                    hasNext |= (loc < list.size());
                }
                return hasNext;
            }

            public List<T> next() {
                List<T> vals = new ArrayList<T>(lists.size());
                for (int i = 0; i < lists.size(); i++) {
                    vals.add(loc < lists.get(i).size() ? lists.get(i).get(loc) : null);
                }
                loc++;
                return vals;
            }

            public void remove() {
                for (List<T> list : lists) {
                    if (loc < list.size()) {
                        list.remove(loc);
                    }
                }
            }
        };
    }
}
Example usage:
List<Integer> list1 = Arrays.asList(new Integer[] {1, 2, 3, 4, 5});
List<Integer> list2 = Arrays.asList(new Integer[] {6, 7, 8});
ParallelList<Integer> list = new ParallelList<Integer>(list1, list2);
for (List<Integer> ints : list) {
    System.out.println(String.format("%s, %s", ints.get(0), ints.get(1)));
}
Which would print out:
1, 6
2, 7
3, 8
4, null
5, null
This object supports lists of variable lengths, but clearly it could be modified to be more strict.
Unfortunately I couldn't get rid of one compiler warning on the ParallelList constructor: A generic array of List<Integer> is created for varargs parameters, so if anyone knows how to get rid of that, let me know :)
You can use a second constraint in your for loop:
for (int i = 0; i < list1.length && i < list2.length; ++i)
{
    doStuff(list1[i]);
    doStuff(list2[i]);
}//for
One of my preferred methods for traversing collections is the for-each loop, but as the Oracle tutorial mentions, when dealing with parallel collections you should use an iterator rather than for-each.
The following was an answer by Martin v. Löwis in a similar post:
it1 = list1.iterator();
it2 = list2.iterator();
while (it1.hasNext() && it2.hasNext())
{
    value1 = it1.next();
    value2 = it2.next();
    doStuff(value1);
    doStuff(value2);
}//while
The advantage of the iterator is that it's generic: if you don't know which collections are being used, use the iterator; otherwise, if you know what your collections are, you also know their length/size functions, so the regular for-loop with the additional constraint can be used. (Note that I'm deliberately vague about the collection types in this post, since an interesting possibility is that the two collections are different, e.g. one could be a List and the other an array.)
Hope this helped.
With Java 8, I use these to loop in the sexy way:
//parallel loop
public static <A, B> void loop(Collection<A> a, Collection<B> b, IntPredicate intPredicate, BiConsumer<A, B> biConsumer) {
    Iterator<A> ait = a.iterator();
    Iterator<B> bit = b.iterator();
    if (ait.hasNext() && bit.hasNext()) {
        for (int i = 0; intPredicate.test(i); i++) {
            if (!ait.hasNext()) {
                ait = a.iterator();
            }
            if (!bit.hasNext()) {
                bit = b.iterator();
            }
            biConsumer.accept(ait.next(), bit.next());
        }
    }
}

//nest loop
public static <A, B> void loopNest(Collection<A> a, Collection<B> b, BiConsumer<A, B> biConsumer) {
    for (A ai : a) {
        for (B bi : b) {
            biConsumer.accept(ai, bi);
        }
    }
}
Some examples, with these two lists:
List<Integer> a = Arrays.asList(1, 2, 3);
List<String> b = Arrays.asList("a", "b", "c", "d");
Loop within min size of a and b:
loop(a, b, i -> i < Math.min(a.size(), b.size()), (x, y) -> {
System.out.println(x + " -> " + y);
});
Output:
1 -> a
2 -> b
3 -> c
Loop within max size of a and b (elements in shorter list will be cycled):
loop(a, b, i -> i < Math.max(a.size(), b.size()), (x, y) -> {
System.out.println(x + " -> " + y);
});
Output:
1 -> a
2 -> b
3 -> c
1 -> d
Loop n times (elements will be cycled if n is bigger than the sizes of the lists):
loop(a, b, i -> i < 5, (x, y) -> {
System.out.println(x + " -> " + y);
});
Output:
1 -> a
2 -> b
3 -> c
1 -> d
2 -> a
Loop forever:
loop(a, b, i -> true, (x, y) -> {
System.out.println(x + " -> " + y);
});
Apply to your situation:
loop(list1, list2, i -> i < Math.min(list1.size(), list2.size()), (e1, e2) -> {
    doStuff(e1);
    doStuff(e2);
});
Simple answer: No.
You want sexy iteration and Java byte code? Check out Scala:
Scala for loop over two lists simultaneously
Disclaimer: This is indeed a "use another language" answer. Trust me, I wish Java had sexy parallel iteration, but no one started developing in Java because they want sexy code.
ArrayIterator lets you avoid indexing, but you can't use a for-each loop without writing a separate class or at least a function. As @Alexei Blue remarks, the official recommendation (at The Collection Interface) is: “Use Iterator instead of the for-each construct when you need to: … Iterate over multiple collections in parallel.”:
import static com.google.common.base.Preconditions.checkArgument;
import org.apache.commons.collections.iterators.ArrayIterator;

// …

checkArgument(array1.length == array2.length);
Iterator it1 = new ArrayIterator(array1);
Iterator it2 = new ArrayIterator(array2);
while (it1.hasNext()) {
    doStuff(it1.next());
    doOtherStuff(it2.next());
}
However:
Indexing is natural for arrays – an array is by definition something you index, and a numerical for loop, as in your original code, is perfectly natural and more direct.
Key-value pairs naturally form a Map, as #Isaac Truett remarks, so cleanest would be to create maps for all your parallel arrays (so this loop would only be in the factory function that creates the maps), though this would be inefficient if you just want to iterate over them. (Use Multimap if you need to support duplicates.)
If you have a lot of these, you could (partially) implement ParallelArrayMap<> (i.e., a map backed by parallel arrays), or maybe ParallelArrayHashMap<> (adding a HashMap if you want efficient lookup by key), and use that, which allows iteration in the original order. This is probably overkill, but it allows a sexy answer.
That is:
Map<T, U> map = new ParallelArrayMap<>(array1, array2);
for (Map.Entry<T, U> entry : map.entrySet()) {
    doStuff(entry.getKey());
    doOtherStuff(entry.getValue());
}
Philosophically, Java style is to have explicit, named types, implemented by classes. So when you say “[I have] parallel arrays [that] store key/value pairs.”, Java replies “Write a ParallelArrayMap class that implements Map (key/value pairs) and that has a constructor that takes parallel arrays, and then you can use entrySet to return a Set that you can iterate over, since Set implements Collection.” – make the structure explicit in a type, implemented by a class.
For iterating over two parallel collections or arrays, you want to iterate over a Iterable<Pair<T, U>>, which less explicit languages allow you to create with zip (which #Isaac Truett calls wrap). This is not idiomatic Java, however – what are the elements of the pair? See Java: How to write a zip function? What should be the return type? for an extensive discussion of how to write this in Java and why it’s discouraged.
This is exactly the stylistic tradeoff Java makes: you know exactly what type everything is, and you have to specify and implement it.
//Do you think I'm sexy?
if (list1.length == list2.length) {
    for (int i = 0; i < list1.length; ++i) {
        doStuff(list1[i]);
        doStuff(list2[i]);
    }
}
In Java, I have an ArrayList of Strings like:
[,Hi, ,How,are,you]
I want to remove the null and empty elements, so that it looks like this:
[Hi,How,are,you]
List<String> list = new ArrayList<String>(Arrays.asList("", "Hi", null, "How"));
System.out.println(list);
list.removeAll(Arrays.asList("", null));
System.out.println(list);
Output:
[, Hi, null, How]
[Hi, How]
It's a very late answer, but you can also use Collections.singleton:
List<String> list = new ArrayList<String>(Arrays.asList("", "Hi", null, "How"));
// in one line
list.removeAll(Arrays.asList("", null))
// separately
list.removeAll(Collections.singleton(null));
list.removeAll(Collections.singleton(""));
Another way to do this, now that we have Java 8 lambda expressions:
arrayList.removeIf(item -> item == null || "".equals(item));
If you are using Java 8, then try this, using a lambda expression and org.apache.commons.lang.StringUtils; it will also clear null and blank values from the input array:
public static String[] cleanArray(String[] array) {
    return Arrays.stream(array).filter(x -> !StringUtils.isBlank(x)).toArray(String[]::new);
}
ref - https://stackoverflow.com/a/41935895/9696526
If you were asking how to remove the empty strings, you can do it like this (where l is an ArrayList<String>) - this removes all null references and strings of length 0:
Iterator<String> i = l.iterator();
while (i.hasNext())
{
    String s = i.next();
    if (s == null || s.isEmpty())
    {
        i.remove();
    }
}
Don't confuse an ArrayList with arrays: an ArrayList is a dynamic data structure that resizes according to its contents. If you use the code above, you don't have to do anything else to get the result as you've described it - if your ArrayList was ["", "Hi", "", "How", "are", "you"], after removing as above it's going to be exactly what you need - ["Hi", "How", "are", "you"].
However, if you must have a 'sanitized' copy of the original list (while leaving the original as it is) and by 'store it back' you meant 'make a copy', then krmby's code in the other answer will serve you just fine.
Going to drop this lil nugget in here:
Stream.of("", "Hi", null, "How", "are", "you")
.filter(t -> !Strings.isNullOrEmpty(t))
.collect(ImmutableList.toImmutableList());
I wish with all of my heart that Java had a filterNot.
There are a few approaches that you could use:
Iterate over the list, calling Iterator.remove() for the list elements you want to remove. This is the simplest.
Repeatedly call List.remove(Object). This is simple too, but performs worst of all ... because you repeatedly scan the entire list. (However, this might be an option for a mutable list whose iterator didn't support remove ... for some reason.)
Create a new list, iterate over the old list, adding elements that you want to retain to a new list.
If you can't return a new list, proceed as in 3 above, then clear the old list and use addAll to add the elements of the new list back to it (sketched below).
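A minimal sketch of that last approach (the helper name is made up for illustration):

static void removeNullAndEmptyInPlace(List<String> list) {
    List<String> kept = new ArrayList<>(list.size());
    for (String s : list) {
        if (s != null && !s.isEmpty()) {
            kept.add(s);
        }
    }
    list.clear();        // keep the same list instance for callers holding a reference to it
    list.addAll(kept);
}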
Which of these is fastest depends on the class of the original list, its size, and the number of elements that need to be removed. Here are some of the factors:
For an ArrayList, each individual remove operation is O(N), where N is the list size. It is expensive to remove multiple elements from a large ArrayList using the Iterator.remove() method (or the ArrayList.remove(element) method).
By contrast, the Iterator.remove method for a LinkedList is O(1).
For an ArrayList, creating and copying a list is O(N) and relatively cheap, especially if you can ensure that the destination list's capacity is large enough (but not too large).
By contrast, creating and copying to a LinkedList is also O(N), but considerably more expensive.
All of this adds up to a fairly complicated decision tree. If the lists are small (say 10 or less elements) you can probably get away with any of the approaches above. If the lists could be large, you need to weigh up all of the issues in the list of the expected list size and expected number of removals. (Otherwise you might end up with quadratic performance.)
This code compiles and runs smoothly.
It uses no iterator, so it is more readable.
list is your collection; result is the filtered form (no null, no empty strings).
public static void listRemove() {
    List<String> list = Arrays.asList("", "Hi", "", "How", "are", "you");
    List<String> result = new ArrayList<String>();
    for (String str : list) {
        if (str != null && !str.isEmpty()) {
            result.add(str);
        }
    }
    System.out.println(result);
}
If you get an UnsupportedOperationException from using one of the answers above and your List was created with Arrays.asList(), it is because you can't modify such a List.
To fix, wrap the Arrays.asList() inside new LinkedList<String>():
List<String> list = new LinkedList<String>(Arrays.asList(split));
Source is from this answer.
Regarding the comment of Andrew Mairose - although it is a fine solution, I would just like to add that it will not work on fixed-size lists.
You could attempt doing like so:
Arrays.asList(new String[]{"a", "b", null, "c", " "})
.removeIf(item -> item == null || "".equals(item));
But you'll encounter an UnsupportedOperationException at java.util.AbstractList.remove (since asList returns a non-resizable List).
A different solution might be this:
List<String> collect =
        Stream.of(new String[]{"a", "b", "c", null, ""})
              .filter(item -> item != null && !"".equals(item))
              .collect(Collectors.toList());
Which will produce a nice list of strings :-)
lukastymo's answer seems the best one.
But it may be worth mentioning this approach as well, for its extensibility:
List<String> list = new ArrayList<String>(Arrays.asList("", "Hi", null, "How"));
list = list.stream()
.filter(item -> item != null && !item.isEmpty())
.collect(Collectors.toList());
System.out.println(list);
What I mean by that (extensibility) is you could then add additional filters, such as:
.filter(item -> !item.startsWith("a"))
... although of course that's not specifically relevant to the question.
List<String> list = new ArrayList<String>(Arrays.asList("", "Hi", "", "How"));
Stream<String> stream = list .stream();
Predicate<String> empty = empt->(empt.equals(""));
Predicate<String> emptyRev = empty.negate();
list= stream.filter(emptyRev).collect(Collectors.toList());
OR
list = list .stream().filter(empty->(!empty.equals(""))).collect(Collectors.toList());
private List<String> cleanInputs(String[] inputArray) {
    List<String> result = new ArrayList<String>(inputArray.length);
    for (String input : inputArray) {
        if (input != null) {
            String str = input.trim();
            if (!str.isEmpty()) {
                result.add(str);
            }
        }
    }
    return result;
}