From InputStream to parallel Stream<T> - java

I am getting an InputStream that contains multiple elements. They are scanned, parsed, and iterated by a Stream serially (in the same order they had in the InputStream), and then persisted in a DB. This works fine.
Now I am trying to iterate the Stream in parallel with Stream<T>.parallel(), so that while one thread is blocked persisting, other threads can keep scanning the InputStream and persisting.
So I parallelized the resulting Stream<MyElement> with Stream<T>.parallel(). To check that the parallelization works, I added a map function to the stream that adds a random delay. I was expecting the resulting elements to be printed in a random order.
But the result is not what I expected: the elements are still shown in file order.
Is there a way to properly iterate this stream in parallel?
public class FromInputStreamToParallelStream {
public static Stream<MyElement> getStream(InputStream is) {
try (var scanner = new Scanner(is)) {
return scanner//
.useDelimiter("DELIMITER")
.tokens()
.parallel()
.map(MyElementParser::parse);
}
}
@Test
public void test() throws IOException {
try (InputStream in = Files.newInputStream(Paths.get("my-file.xml"));) {
getStream(in)
.map(FromInputStreamToParallelStream::sleepRandom)
.forEach(System.out::println);
}
}
private static MyElement sleepRandom(MyElement element) {
var randomNumber = new Random().nextInt(10);
System.out.println("wait. " + randomNumber);
try {
TimeUnit.SECONDS.sleep(randomNumber);
} catch (InterruptedException e) {
e.printStackTrace();
}
return element;
}
}
I guess I am going to need to implement my own Spliterator<T>.
Thanks in advance.

Related

How to flatten a list inside a stream of completable futures?

I have this:
Stream<CompletableFuture<List<Item>>>
how can I convert it to
Stream<CompletableFuture<Item>>
Where: the second stream is comprised of each and all the Items inside each of the lists in the first stream.
I looked into thenCompose but that solves a completely different problem which is also referred to as "flattening".
How can this be done efficiently, in a streaming fashion, without blocking or prematurely consuming more stream items than necessary?
Here is my best attempt so far:
ExecutorService pool = Executors.newFixedThreadPool(PARALLELISM);
Stream<CompletableFuture<List<IncomingItem>>> reload = ... ;
@SuppressWarnings("unchecked")
CompletableFuture<List<IncomingItem>> allFutures[] = reload.toArray(CompletableFuture[]::new);
CompletionService<List<IncomingItem>> queue = new ExecutorCompletionService<>(pool);
for(CompletableFuture<List<IncomingItem>> item: allFutures) {
queue.submit(item::get);
}
List<IncomingItem> THE_END = new ArrayList<IncomingItem>();
CompletableFuture<List<IncomingItem>> ender = CompletableFuture.allOf(allFutures).thenApply(whatever -> {
queue.submit(() -> THE_END);
return THE_END;
});
queue.submit(() -> ender.get());
Iterable<List<IncomingItem>> iter = () -> new Iterator<List<IncomingItem>>() {
boolean checkNext = true;
List<IncomingItem> next = null;
@Override
public boolean hasNext() {
if(checkNext) {
try {
next = queue.take().get();
} catch (InterruptedException | ExecutionException e) {
throw new RuntimeException(e);
}
checkNext = false;
}
if(next == THE_END || next == null) {
return false;
}
else {
return true;
}
}
@Override
public List<IncomingItem> next() {
if(checkNext) {
hasNext();
}
if(!hasNext()) {
throw new IllegalStateException();
}
checkNext = true;
return next;
}
};
Stream<IncomingItem> flat = StreamSupport.stream(iter.spliterator(), false).flatMap(List::stream);
This works at first, but unfortunately it has a fatal bug: the resulting stream seems to terminate prematurely, before retrieving all the items.
As I wrote in my comment, this is impossible.
Consider some arbitrary service which returns a CompletableFuture<Integer>:
CompletableFuture<Integer> getDiceRoll();
I can now convert this CompletableFuture<Integer> to a Stream<CompletableFuture<List<Object>>> without any problem:
Stream<CompletableFuture<List<Object>>> futureList = Stream.of(getDiceRoll().thenApply(n -> List.of(new Object[n])));
Let's suppose there would be a general way to turn a Stream<CompletableFuture<List<T>>> into a Stream<CompletableFuture<T>>:
<T> Stream<CompletableFuture<T>> magic(Stream<CompletableFuture<List<T>>> arg);
Then I can do the following:
long diceRoll = magic(Stream.of(getDiceRoll().thenApply(n -> List.of(new Object[n])))).count();
Wait, what?
I am now able to get an arbitrary integer out of a CompletableFuture.
Which means, with some engineering effort I can get all the information out of a CompletableFuture - after all, memory is just some numbers.
So we have to conclude that a method like magic can not exist, without violating the time fabric.
And this is the answer: There is no such method, because it can not exist.
Agreed with Johannes Kuhn. You can't know a future's result while it is still executing, and thus you cannot convert from Stream<CompletableFuture<List<Item>>> to Stream<CompletableFuture<Item>>.
However, the outputs of the stream can be merged once the futures have completed. The following code turns a Stream<CompletableFuture<List<Item>>> (or a List<CompletableFuture<List<Item>>>, after calling stream() on it) into a List<Item>:
List<Item> output = input.map(CompletableFuture::join).collect(toList()).stream()
.flatMap(Collection::stream).collect(toList());
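As a self-contained illustration of that merge, here is a minimal sketch. The class name FlattenFutures and the method joinAndFlatten are made up for this example, and note that join() blocks the calling thread, so this does not meet the question's "without blocking" requirement; it only shows how to collect the results.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original answers.
class FlattenFutures {

    // Waits for each future in encounter order (join() blocks) and
    // concatenates the resulting lists into one flat list.
    static <T> List<T> joinAndFlatten(Stream<CompletableFuture<List<T>>> input) {
        return input
                .map(CompletableFuture::join)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<CompletableFuture<List<String>>> input = Stream.of(
                CompletableFuture.supplyAsync(() -> List.of("a", "b")),
                CompletableFuture.completedFuture(List.of("c")));
        System.out.println(joinAndFlatten(input)); // prints [a, b, c]
    }
}
```

If a future completes exceptionally, join() throws a CompletionException, so a caller would probably want to decide up front whether one failure should abort the whole merge.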

Is it possible to make a catch block that waits until the whole try block is executed?

What I'm doing
I'm trying to make a cleaner version of nested try-catch blocks, and I'm solving a very basic exception problem while doing so. I'm making a calculator that will do great things. Before then, however, it must take user inputs as strings and convert them to either floats or integers. I'm doing this by simply calling Java's built-in parseInt and parseFloat methods. Right now I'm using a nested try-catch block to do this:
String stringToParse = "1.0";
try{Integer.parseInt(stringToParse);}
catch(NumberFormatException n){
try{Float.parseFloat(stringToParse);}
catch(NumberFormatException n2){
System.out.println(n2);
}
}
Why is that a problem?
This is messy to me and I'd rather have a try block that collects the errors but doesn't immediately go to the catch block, rather it executes the entire try and catches any errors after the try has been executed. I've made a runnable example of this myself that shows what I desire:
String num = "1.0";
int i = 0;
ArrayList<Object> listofResults = new ArrayList<>();
ArrayList<Integer> listOfErrorIndices = new ArrayList<>();
try {
listofResults.add(Integer.parseInt(num));
i++;
listofResults.add(Float.parseFloat(num));
i++;
listofResults.add(Integer.parseInt(num));
} catch (NumberFormatException n) {
listOfErrorIndices.add(i);
}
for (Integer element:listOfErrorIndices) {
System.out.println(element);
//this currently prints out 0 and I want it to print out both 0 and
//2 so that it catches both errors.
}
My idea of how to solve the problem/What I've tried otherwise
My plan is to gather a list of the indices (i) of all the NumberFormatExceptions thrown in the try. Each time I try to parse the string, an element is added to the results list. My goal is to then use this theoretical try-catch block to obtain the indices of all the exceptions and remove the corresponding entries from the results list. TL;DR: right now the above code prints out 0 and I want it to print out 0 and 2. Basically, instead of having nested try-catch blocks, I use list comprehension and exception-handling indices with i to remove the error results and keep only the good ones. I don't know if this is possible, hence this question. I've looked at the "better ways to implement nested try catch blocks" question; however, it wasn't useful to me because it provided a solution in Delphi and I didn't understand exactly how it worked, or whether it even worked the way I want mine to work. At first I thought the finally block might be what I needed, but that only runs after the catch is executed or, if there is no exception, after the try. I need something that postpones the catch block until the try is complete, and I can't think of or find anything that does that.
What are you, crazy?
Right now you may be asking, what the hell is the point of this? Well, imagine if you had the above problem, but instead of two ways to parse the string you had 10 or 100. Pretty quickly, handling exceptions with nested try-catch blocks would be nigh impossible. I've seen solutions where the catch block calls a custom method that then at least takes care of the bad formatting. It looked like this:
try{
//bad code
}
catch{
trysomethingelse();
}
trysomethingelse(){
//equally bad code
catch{
//ya done screwed up son
}
}
However, I'm not satisfied, because it means that you need a million different method names just to potentially handle one error. Imagine the error is always the same and you just need to try 100 different string parsing methods. It's always going to be a NumberFormatException if you're trying to convert a string to a number, so why have a million catch blocks just for the same error? I want to do this with one theoretical catch block that specifies one error that happens many times over in the try block.
You build a list/array of parsers, then iterate that list, catching exception for each.
With Java 8 method references, this is real easy. First, define a Parser functional interface that allows exceptions to be thrown:
@FunctionalInterface
public interface Parser {
Object parse(String text) throws Exception;
}
Next, build your array of parsers to try:
Parser[] parsers = {
Integer::valueOf,
Double::valueOf,
BigInteger::new,
BigDecimal::new
};
Finally, try them one at a time:
String text = "45.8";
Object[] results = new Object[parsers.length];
for (int i = 0; i < parsers.length; i++) {
try {
results[i] = parsers[i].parse(text);
} catch (Exception e) {
results[i] = e;
}
}
Now you can go through the results:
for (Object result : results) {
if (result instanceof Exception)
System.out.println("Error: " + result);
else
System.out.println("Parsed as " + result.getClass().getSimpleName() + ": " + result);
}
Output
Error: java.lang.NumberFormatException: For input string: "45.8"
Parsed as Double: 45.8
Error: java.lang.NumberFormatException: For input string: "45.8"
Parsed as BigDecimal: 45.8
Or put the parsed objects and the exceptions into two different lists. Up to you.
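The two-list variant could look roughly like the sketch below. It reuses the Parser interface and the parser array from above; the class name TwoListParsing and the method parseAll are invented for this illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class for the two-list variant.
class TwoListParsing {

    @FunctionalInterface
    interface Parser {
        Object parse(String text) throws Exception;
    }

    static final Parser[] PARSERS = {
        Integer::valueOf,
        Double::valueOf,
        BigInteger::new,
        BigDecimal::new
    };

    // Tries every parser; successes go to 'parsed', failures go to 'errors'.
    static void parseAll(String text, List<Object> parsed, List<Exception> errors) {
        for (Parser parser : PARSERS) {
            try {
                parsed.add(parser.parse(text));
            } catch (Exception e) {
                errors.add(e);
            }
        }
    }

    public static void main(String[] args) {
        List<Object> parsed = new ArrayList<>();
        List<Exception> errors = new ArrayList<>();
        parseAll("45.8", parsed, errors);
        System.out.println(parsed.size() + " parsed, " + errors.size() + " failed");
        // prints: 2 parsed, 2 failed
    }
}
```

With two lists you lose the information about which parser produced which result; if that matters, the single results array indexed by parser position is the simpler choice.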
You can do something like this:
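interface Parser {
Number parse(String text);
}
class IntegerParser implements Parser {
@Override
public Number parse(String text) {
// throws NumberFormatException if text is not a valid int
return Integer.valueOf(text);
}
}
class FloatParser implements Parser {
@Override
public Number parse(String text) {
// throws NumberFormatException if text is not a valid float
return Float.valueOf(text);
}
}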
List<Parser> parsers = asList(new FloatParser(), new IntegerParser(), ...);
Number result = null;
List<NumberFormatException> exceptions = new ArrayList<>();
for (Parser parser : parsers) {
try {
result = parser.parse(stringToParse);
break;
} catch (NumberFormatException e) {
exceptions.add(e);
}
}
if (result != null) {
// parsed ok with some parser
// probably discard exceptions
} else {
// show exceptions from the list
}
Try this:
public static void test() {
final String num = "1.0";
final ArrayList<Object> listofResults = new ArrayList<>();
final java.util.function.Function<String, ?>[] parseMethods = new java.util.function.Function[3];
parseMethods[0] = Integer::parseInt;
parseMethods[1] = Float::parseFloat;
parseMethods[2] = Integer::parseInt;
int[] badIndeces = IntStream.range(0, parseMethods.length).map(i -> {
try {
listofResults.add(parseMethods[i].apply(num));
return -i-1;
} catch (NumberFormatException exc) {
return i;
}
}).filter(i -> i >= 0).toArray();
for (int element : badIndeces) {
System.out.println(element);
}
}

Vector throws ConcurrentModificationException despite being synchronized

I had an ArrayList that was being operated on by multiple threads, which wasn't working as the ArrayList isn't synchronized. I switched the list to a Vector as instructed by my professor. Vector is synchronized, but I'm having exceptions thrown related to synchronization.
Why is this happening, and how can I avoid concurrency exceptions in my code? I don't want to just play around until something works, I want to do the best thing. Thanks!
Exception:
Exception in thread "Thread-3" java.util.ConcurrentModificationException
at java.util.Vector$Itr.checkForComodification(Vector.java:1184)
at java.util.Vector$Itr.next(Vector.java:1137)
at BytePe4D$ReadInts.run(BytePe4D.java:64)
Code:
import java.io.*;
import java.util.Vector;
public class BytePe4D {
private Vector<Integer> numbers;
public static void main(String[] args) {
new BytePe4D();
}
public BytePe4D() {
// Create ArrayList and reset sum
numbers = new Vector<Integer>();
// Call addInts 8 times, with filenames integer1.dat through integer8.dat
for (int i = 1; i <= 8; i++) {
File file = new File("PE Data/integer" + i + ".dat");
ReadInts thread = new ReadInts(file);
thread.start();
}
}
/** Represents a Thread instance */
class ReadInts extends Thread {
File file;
public ReadInts(File _file) {
file = _file;
}
@Override
public void run() {
int count = 0; // track number of records read
int sum = 0;
try {
// Open stream to binary data file integer1.dat
FileInputStream in = new FileInputStream(file);
// Buffer the stream
BufferedInputStream bin = new BufferedInputStream(in);
// Access the primitive data
DataInputStream din = new DataInputStream(bin);
try {
// Read file until end reached
while (true) {
numbers.add(din.readInt());
count++;
}
} catch (EOFException eof) {
// System.out.println("End of file reached.");
} finally {
// Close streams
din.close();
}
} catch (FileNotFoundException fnf) {
System.out.println("File does not exist: " + file.getName());
return;
} catch (IOException ioe) {
ioe.printStackTrace();
}
// Calculate sum of numbers read
for (int num : numbers) {
sum += num;
}
// Write info
System.out.println(
String.format("%s%s%-5s%s%-8d%-5s%s%-12d%-5s%s%d",
"Filename = ", file.getName(), "",
"Count = ", count, "",
"Sum = ", sum, "",
"In List = ", numbers.size()));
}
}
}
From the docs:
if the vector is structurally modified at any time after the iterator
is created, in any way except through the iterator's own remove or add
methods, the iterator will throw a ConcurrentModificationException.
The following code creates an iterator under the covers:
for (int num : numbers) {
sum += num;
}
So when one thread modifies the vector (by adding elements) while another thread is iterating over it, you'll see a ConcurrentModificationException. Vector synchronizes each individual method call, but an iteration spans many calls and is not protected as a whole.
There are different options to solve it. One way is to read from the file into another vector and, when the reading is done, assign this other vector to numbers (reference assignment is an atomic operation). Keep in mind that for the change to be visible to other threads, you'll need to declare numbers as volatile.
Your code seems wrong.
I don't see why you need a shared vector, if each thread is to calculate the sum of records from an individual file. On the other hand, if you want to calculate the sum of records from all files, you should do it after every thread has completed.
Depending on which you want, you can either 1) create a vector for each thread and calculate the sum for each file or, 2) in the main thread, wait for all threads to complete then calculate the sum for all files.
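Both options can be sketched together. In the example below, in-memory int arrays stand in for the eight data files, and the names PerThreadSums, Worker and totalOf are made up for illustration; each worker only touches its own list, and the main thread combines the results after join().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one private list per thread, combined after join().
class PerThreadSums {

    // Stand-in for ReadInts: the int[] plays the role of one data file.
    static class Worker extends Thread {
        private final int[] data;
        private final List<Integer> local = new ArrayList<>(); // never shared while running
        private long sum;

        Worker(int[] data) {
            this.data = data;
        }

        @Override
        public void run() {
            for (int value : data) {
                local.add(value);
            }
            // Safe to iterate: no other thread modifies 'local'.
            for (int value : local) {
                sum += value;
            }
        }

        long sum() {
            return sum;
        }
    }

    // Starts one worker per input and returns the grand total.
    static long totalOf(int[][] inputs) {
        List<Worker> workers = new ArrayList<>();
        for (int[] input : inputs) {
            Worker worker = new Worker(input);
            workers.add(worker);
            worker.start();
        }
        long total = 0;
        for (Worker worker : workers) {
            try {
                worker.join(); // after join(), the worker's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            total += worker.sum();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalOf(new int[][] {{1, 2, 3}, {4, 5}})); // prints 15
    }
}
```

Because no collection is shared between running threads, neither Vector nor any other synchronization is needed here; join() provides the required visibility.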

Java 8 Stream exception handling while adding objects from function return value to list

I'm having a hard time understanding how to handle exceptions when using a Java 8 stream. I would like to add objects to an empty list, and each object gets returned from a function that can potentially throw an exception depending on what the input is. If an exception is thrown for only certain values on the input list, I want it to continue to loop through the remainder of the input list. It seems easy to do with a for loop:
List<Item> itemList = new ArrayList<>();
List<Input> inputs = getInputs(); //returns a list of inputs
int exceptionCount = 0;
// If an exception is thrown
for (Input input : inputs){
try {
itemList.add(getItem(input));
} catch (Exception e ) {
// handle exception from getItem(input)
exceptionCount = exceptionCount + 1;
}
}
It seems like it may be possible to achieve this with a Java stream, but I'm not quite sure how to handle the exception that could be thrown by the getItem() function.
Here's what I have so far:
final List<Item> itemList = new ArrayList<>();
try {
itemList = getInputs().stream()
.map(this::getItem)
.collect(Collectors.toList());
} catch (Exception e ) {
// handle exception from getItem(input)
exceptionCount = exceptionCount + 1;
}
The above obviously won't work because as soon as there is one exception thrown from getItem, the loop will not continue and only one exception will be thrown for the entire stream. Is there any way I can achieve the same implementation as my basic for loop with Java 8 streams?
You should catch the exception within the map operation:
class ExceptionCounter {
private int count = 0;
void inc() { count++; }
int getCount() { return count; }
}
ExceptionCounter counter = new ExceptionCounter();
List<Item> items = getInputs().stream()
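.map(input -> {
try {
return getItem(input);
} catch (Exception e) {
// handle exception here
counter.inc();
return null; // the lambda must return a value on every path
}
})
.filter(Objects::nonNull) // drop the failed inputs (java.util.Objects)
.collect(Collectors.toList());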
While this works as expected for sequential streams, it won't work for parallel streams. Even if the stream is not parallel, we still need the ExceptionCounter holder class because variables referenced from within arguments of stream operations (such as map) must be effectively final. (You can use an array of one element or an AtomicInteger instead of a holder class).
If we add synchronized to the inc method of ExceptionCounter class, then the solution above would support parallel streams. However, there would be a lot of thread contention on the inc method's lock, thus losing the advantages of parallelization. This (along with attempting to not create error-prone code) is the reason why side-effects are discouraged on streams. And counting the number of exceptions is in fact a side-effect.
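For reference, the AtomicInteger variant mentioned above might look like the following sketch. It is still a side effect, just a thread-safe one; Integer.valueOf stands in for getItem, and the class and method names are invented for this example.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

// Hypothetical sketch of counting failures with an AtomicInteger.
class AtomicExceptionCount {

    // Parses every input in parallel; failed inputs are counted and skipped.
    static List<Integer> parseAll(List<String> inputs, AtomicInteger failures) {
        return inputs.parallelStream()
                .map(s -> {
                    try {
                        return Integer.valueOf(s); // stand-in for getItem(input)
                    } catch (NumberFormatException e) {
                        failures.incrementAndGet(); // atomic, safe across threads
                        return null;
                    }
                })
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        AtomicInteger failures = new AtomicInteger();
        List<Integer> items = parseAll(List.of("1", "x", "2", "y", "3"), failures);
        System.out.println(items + " failures=" + failures.get());
        // prints: [1, 2, 3] failures=2
    }
}
```

The counter avoids a lock, but every failing element still updates the same shared variable, so the general advice against side effects in stream operations continues to apply.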
For this particular case, you can avoid the side effect if you use a custom collector:
class Result<R> {
List<R> values = new ArrayList<>();
int exceptionCount = 0;
// TODO getters
}
static <T, R> Collector<T, ?, Result<R>> mappingToListCountingExceptions(Function<T, R> mapper) {
class Acc {
Result<R> result = new Result<>();
void add(T t) {
try {
R res = mapper.apply(t);
result.values.add(res);
} catch (Exception e) {
result.exceptionCount++;
}
}
Acc merge(Acc another) {
// combine partial results produced by different threads
result.values.addAll(another.result.values);
result.exceptionCount += another.result.exceptionCount;
return this;
}
}
return Collector.of(Acc::new, Acc::add, Acc::merge, acc -> acc.result);
}
You can use this custom collector as follows:
Result<Item> items = getInputs().stream()
.collect(mappingToListCountingExceptions(this::getItem));
Now it's up to you to decide whether this approach is better than a traditional for loop.
There are several ways to handle exceptions in a stream:
Catch the exception inside the map operation:
files.stream()
.parallel()
.map(file-> {
try {
return file.getInputStream();
} catch (IOException e) {
e.printStackTrace();
return null;
}
})
.forEach(inputStream -> carService.saveCars(inputStream));
Extract the function passed to map into a method of its own:
files.stream()
.parallel()
.map(file-> extractInputStream(file))
.forEach(inputStream -> carService.saveCars(inputStream));
and
private InputStream extractInputStream(MultipartFile file) {
try {
return file.getInputStream();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
Create another functional interface, similar to Function, whose apply method declares that it throws an exception:
@FunctionalInterface
interface FunctionWithException<T, R, E extends Exception> {
R apply(T t) throws E;
}
and
private <T, R, E extends Exception>
Function<T, R> wrapper(FunctionWithException<T, R, E> fun) {
return arg -> {
try {
return fun.apply(arg);
} catch (Exception e) {
throw new RuntimeException(e);
}
};
}
then use it like this
files.stream()
.parallel()
.map(wrapper(file->file.getInputStream()))
.forEach(inputStream -> carService.saveCars(inputStream));
But if you want to do this conveniently for all the functional interfaces, I suggest using this library:
<dependency>
<groupId>com.pivovarit</groupId>
<artifactId>throwing-function</artifactId>
<version>1.5.0</version>
</dependency>
All of this is explained with an example in the post "Java 8, How to handle exceptions in a stream?".

List of Thread and accessing another list

I already asked another question close to this one several minutes ago, and there were good answers, but they were not what I was looking for, so I'll try to be a bit clearer.
Let's say I have a list of Thread in a class :
class Network {
private List<Thread> tArray = new ArrayList<Thread>();
private List<ObjectInputStream> input = new ArrayList<ObjectInputStream>();
private void aMethod() {
for(int i = 0; i < 10; i++) {
Runnable r = new Runnable() {
public void run() {
try {
String received = (String) input.get(****).readObject(); // I don't know what to put here instead of the ****
showReceived(received); // random method in Network class
} catch (IOException ioException) {
ioException.printStackTrace();
}
}
}
tArray.add(new Thread(r));
tArray.get(i).start();
}
}
}
What should I put instead of the **** ?
The first thread of the tArray list must only access the first input of the input list for example.
EDIT : Let's assume my input list has already 10 elements
It would work if you put i. You also need to add an ObjectInputStream to the list for each thread. I recommend you use input.add for that purpose. You also need to fill the tArray list with some threads, use add again there.
Here's the solution:
private void aMethod() {
for(int i = 0; i < 10; i++) {
final int index = i; // Captures the value of i in a final variable.
Runnable r = new Runnable() {
public void run() {
try {
String received = input.get(index).readObject().toString(); // Use the final variable to access the list.
showReceived(received); // random method in Network class
} catch (Exception exception) {
exception.printStackTrace();
}
}
};
tArray.add(new Thread(r));
tArray.get(i).start();
}
}
As you want each thread to access one element of the input list, you can use the value of the i variable as an index into the list. The problem with using i directly is that an inner class cannot access non-final local variables from the enclosing scope. To overcome this, we assign i to a final variable, index. Being final, index is accessible from the code of your Runnable.
Additional fixes:
readObject().toString()
catch(Exception exception)
tArray.add(new Thread(r))
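The same capture rule applies to lambdas, which need the captured variable to be effectively final. The sketch below shows the pattern with plain strings standing in for the ObjectInputStreams; the names IndexedThreads and readEach are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: thread number i reads only element i of the input list.
class IndexedThreads {

    static List<String> readEach(List<String> input) {
        String[] received = new String[input.size()];
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < input.size(); i++) {
            final int index = i; // copy of i that the lambda is allowed to capture
            Thread thread = new Thread(() -> received[index] = input.get(index));
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait so every slot is filled and visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return Arrays.asList(received);
    }

    public static void main(String[] args) {
        System.out.println(readEach(List.of("a", "b", "c"))); // prints [a, b, c]
    }
}
```

Each thread writes to its own array slot, so no extra synchronization is needed beyond the join() calls.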
