I have a flow of units of work, let's call them "Work Items", that are processed sequentially (for now). I'd like to speed up processing by doing the work on multiple threads.
Constraint: the work items arrive in a specific order. During processing the order is not relevant, but once processing is finished the order must be restored.
Something like this:
    |.|
    |.|
    |4|
    |3|
    |2|      <- incoming queue
    |1|
   / | \
  2  1  3    <- worker threads
   \ | /
    |3|
    |2|      <- outgoing queue
    |1|
I would like to solve this problem in Java, preferably without Executor Services, Futures, etc., but with basic concurrency methods like wait(), notify(), etc.
Reason is: my Work Items are very small and fine-grained; they finish processing in about 0.2 milliseconds each. So I fear that using classes from java.util.concurrent.* might introduce way too much overhead and slow my code down.
The examples I found so far all preserve the order during processing (which is irrelevant in my case) and don't care about the order after processing (which is crucial in my case).
This is how I solved your problem in a previous project (but with java.util.concurrent):
(1) WorkItem class does the actual work/processing:
public class WorkItem implements Callable<WorkItem> {
Object content;
public WorkItem(Object content) {
super();
this.content = content;
}
public WorkItem call() throws Exception {
// getContent() + do your processing
return this;
}
}
(2) This class puts Work Items in a queue and initiates processing:
public class Producer {
...
public Producer() {
super();
workerQueue = new ArrayBlockingQueue<Future<WorkItem>>(THREADS_TO_USE);
completionService = new ExecutorCompletionService<WorkItem>(Executors.newFixedThreadPool(THREADS_TO_USE));
workerThread = new Thread(new Worker(workerQueue));
workerThread.start();
}
public void send(Object o) throws Exception {
WorkItem workItem = new WorkItem(o);
Future<WorkItem> future = completionService.submit(workItem);
workerQueue.put(future);
}
}
(3) Once processing is finished the Work Items are dequeued here:
public class Worker implements Runnable {
private ArrayBlockingQueue<Future<WorkItem>> workerQueue = null;
public Worker(ArrayBlockingQueue<Future<WorkItem>> workerQueue) {
super();
this.workerQueue = workerQueue;
}
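public void run() {
try {
while (true) {
Future<WorkItem> fwi = workerQueue.take(); // dequeue it
fwi.get(); // wait until it has finished processing
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the flag and stop
} catch (ExecutionException e) {
e.printStackTrace(); // a work item failed; stop dequeuing
}
}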
}
(4) This is how you would use the stuff in your code and submit new work:
public class MainApp {
public static void main(String[] args) throws Exception {
Producer p = new Producer();
for (int i = 0; i < 10000; i++)
p.send(i);
}
}
If you allow BlockingQueue, why would you ignore the rest of the concurrency utils in Java?
You could use e.g. a Stream (if you have Java 8) for the above:
List<Type> data = ...;
List<Other> out = data.parallelStream()
.map(t -> doSomeWork(t))
.collect(Collectors.toList());
Because you started from an ordered Collection (List), and collect also to a List, you will have results in the same order as the input.
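To make the ordering claim concrete, here is a minimal self-contained sketch. The class name OrderedParallelMap and the doSomeWork stand-in are made up for illustration; only parallelStream(), map() and Collectors.toList() come from the JDK. Note that parallel streams run on the common fork-join pool by default, so whether this is fast enough for 0.2 ms items is something you would have to measure.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderedParallelMap {
    // Hypothetical stand-in for the real per-item work.
    static String doSomeWork(int item) {
        return "done-" + item;
    }

    static List<String> processAll(List<Integer> input) {
        // map() may run on several threads, but collecting an ordered
        // stream into a List keeps the encounter order of the source list.
        return input.parallelStream()
                .map(OrderedParallelMap::doSomeWork)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 8).boxed().collect(Collectors.toList());
        System.out.println(processAll(input));
    }
}
```

The items may be mapped in any order on any thread; only the final list is guaranteed to line up with the input.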
Give each object an ID before processing, then create a proxy that accepts finished work and releases it only when the IDs are consecutive. Sample code is below. Note how simple it is: it uses an unsynchronized auto-sorting collection and just two simple methods as its API.
public class SequentialPushingProxy {
static class OrderedJob implements Comparable<OrderedJob>{
static AtomicInteger idSource = new AtomicInteger();
int id;
public OrderedJob() {
id = idSource.incrementAndGet();
}
public int getId() {
return id;
}
@Override
public int compareTo(OrderedJob o) {
return Integer.compare(id, o.getId());
}
}
int lastId = OrderedJob.idSource.get();
public Queue<OrderedJob> queue;
public SequentialPushingProxy() {
queue = new PriorityQueue<OrderedJob>();
}
public synchronized void pushResult(OrderedJob job) {
queue.add(job);
}
List<OrderedJob> jobsToReturn = new ArrayList<OrderedJob>();
public synchronized List<OrderedJob> getFinishedJobs() {
while (queue.peek() != null) {
// only one consumer at a time, will be safe
if (queue.peek().getId() == lastId+1) {
jobsToReturn.add(queue.poll());
lastId++;
} else {
break;
}
}
if (jobsToReturn.size() != 0) {
List<OrderedJob> toRet = jobsToReturn;
jobsToReturn = new ArrayList<OrderedJob>();
return toRet;
}
return Collections.emptyList();
}
public static void main(String[] args) {
final SequentialPushingProxy proxy = new SequentialPushingProxy();
int numProducerThreads = 5;
for (int i=0; i<numProducerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
proxy.pushResult(new OrderedJob());
}
}
}).start();
}
int numConsumerThreads = 1;
for (int i=0; i<numConsumerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
List<OrderedJob> ret = proxy.getFinishedJobs();
System.out.println("got "+ret.size()+" finished jobs");
try {
Thread.sleep(200);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}).start();
}
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.exit(0);
}
}
This code could be easily improved to
allow pushing more than one job result at once, to reduce the synchronization costs
introduce a limit to returned collection to get done jobs in smaller chunks
extract an interface for those 2 public methods and switch implementations to perform tests
You could have 3 input and 3 output queues - one of each type for each worker thread.
Now when you want to insert something into the input queue you put it into only one of the 3 input queues. You change the input queues in a round robin fashion. The same applies to the output, when you want to take something from the output you choose the first of the output queues and once you get your element you switch to the next queue.
All the queues need to be blocking.
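A rough sketch of that round-robin idea, assuming exactly one thread submits and exactly one thread collects (the two cursor fields are not synchronized). The class name RoundRobinPipeline and its methods are invented for this illustration. Because the n-th item always goes to queue n mod workers and is read back from the matching output queue, the output order equals the input order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

class RoundRobinPipeline<I, O> {
    private final List<BlockingQueue<I>> inputs = new ArrayList<>();
    private final List<BlockingQueue<O>> outputs = new ArrayList<>();
    private int nextIn = 0;   // touched only by the single submitting thread
    private int nextOut = 0;  // touched only by the single collecting thread

    RoundRobinPipeline(int workers, int capacity, Function<I, O> work) {
        for (int w = 0; w < workers; w++) {
            BlockingQueue<I> in = new ArrayBlockingQueue<>(capacity);
            BlockingQueue<O> out = new ArrayBlockingQueue<>(capacity);
            inputs.add(in);
            outputs.add(out);
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        out.put(work.apply(in.take())); // one worker per queue pair
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit the worker loop
                }
            });
            t.setDaemon(true); // do not keep the JVM alive for idle workers
            t.start();
        }
    }

    // Hands the item to the next input queue in round-robin order.
    void submit(I item) throws InterruptedException {
        inputs.get(nextIn).put(item);
        nextIn = (nextIn + 1) % inputs.size();
    }

    // Reads from the output queues in the same round-robin order.
    O take() throws InterruptedException {
        O result = outputs.get(nextOut).take();
        nextOut = (nextOut + 1) % outputs.size();
        return result;
    }
}
```

One trade-off to be aware of: a slow item stalls its own queue pair, and since the queues are bounded, a submitter that never calls take() will eventually block in submit().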
Pump all your Futures through a BlockingQueue. Here's all the code you need:
public class SequentialProcessor implements Consumer<Task> {
private final ExecutorService executor = Executors.newCachedThreadPool();
private final BlockingDeque<Future<Result>> queue = new LinkedBlockingDeque<>();
public SequentialProcessor(Consumer<Result> listener) {
new Thread(() -> {
while (true) {
try {
listener.accept(queue.take().get());
} catch (InterruptedException | ExecutionException e) {
// handle the exception however you want, perhaps just logging it
}
}
}).start();
}
public void accept(Task task) {
queue.add(executor.submit(callableFromTask(task)));
}
private Callable<Result> callableFromTask(Task task) {
return <how to create a Result from a Task>; // implement this however
}
}
Then to use, create a SequentialProcessor (once):
SequentialProcessor processor = new SequentialProcessor(whatToDoWithResults);
and pump tasks to it:
Stream<Task> tasks; // given this
tasks.forEach(processor); // simply this
I created the callableFromTask() method for illustration, but if getting a Result from a Task is simple you can dispense with it and use a lambda or method reference instead.
For example, if Task had a getResult() method, do this:
queue.add(executor.submit(task::getResult));
or if you need an expression (lambda):
queue.add(executor.submit(() -> task.getValue() + "foo")); // or whatever
Reactive programming could help. During my brief experience with RxJava I found it more intuitive and easier to work with than core language features like Future etc. Your mileage may vary. Here is a helpful starting point: https://www.youtube.com/watch?v=_t06LRX0DV0
The example below shows how this could be done. We have Packets which need to be processed. They are taken through a simple transformation and finally merged into one list. The output appended to this message shows that the Packets are received and transformed at different points in time, but in the end they are output in the order in which they were received.
import static java.time.Instant.now;
import static rx.schedulers.Schedulers.io;
import java.time.Instant;
import java.util.List;
import java.util.Random;
import rx.Observable;
import rx.Subscriber;
public class RxApp {
public static void main(String... args) throws InterruptedException {
List<ProcessedPacket> processedPackets = Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) //
.map(Packet::transform) //
.toSortedList() //
.toBlocking() //
.single();
System.out.println("===== RESULTS =====");
processedPackets.stream().forEach(System.out::println);
}
static Observable<Packet> getPacket(Integer i) {
return Observable.create((Subscriber<? super Packet> s) -> {
// simulate latency
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("packet requested for " + i);
s.onNext(new Packet(i.toString(), now()));
s.onCompleted();
});
}
}
class Packet {
String aString;
Instant createdOn;
public Packet(String aString, Instant time) {
this.aString = aString;
this.createdOn = time;
}
public ProcessedPacket transform() {
System.out.println(" Packet being transformed " + aString);
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
ProcessedPacket newPacket = new ProcessedPacket(this, now());
return newPacket;
}
@Override
public String toString() {
return "Packet [aString=" + aString + ", createdOn=" + createdOn + "]";
}
}
class ProcessedPacket implements Comparable<ProcessedPacket> {
Packet p;
Instant processedOn;
public ProcessedPacket(Packet p, Instant now) {
this.p = p;
this.processedOn = now;
}
@Override
public int compareTo(ProcessedPacket o) {
return p.createdOn.compareTo(o.p.createdOn);
}
@Override
public String toString() {
return "ProcessedPacket [p=" + p + ", processedOn=" + processedOn + "]";
}
}
Deconstruction
Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) // source the input as observables on multiple threads
.map(Packet::transform) // processing the input data
.toSortedList() // sorting to sequence the processed inputs;
.toBlocking() //
.single();
On one particular run Packets were received in the order 2,6,0,1,8,7,5,9,4,3 and processed in order 2,6,0,1,3,4,5,7,8,9 on different threads
packet requested for 2
Packet being transformed 2
packet requested for 6
Packet being transformed 6
packet requested for 0
packet requested for 1
Packet being transformed 0
packet requested for 8
packet requested for 7
packet requested for 5
packet requested for 9
Packet being transformed 1
packet requested for 4
packet requested for 3
Packet being transformed 3
Packet being transformed 4
Packet being transformed 5
Packet being transformed 7
Packet being transformed 8
Packet being transformed 9
===== RESULTS =====
ProcessedPacket [p=Packet [aString=2, createdOn=2016-04-14T13:48:52.060Z], processedOn=2016-04-14T13:48:53.247Z]
ProcessedPacket [p=Packet [aString=6, createdOn=2016-04-14T13:48:52.130Z], processedOn=2016-04-14T13:48:54.208Z]
ProcessedPacket [p=Packet [aString=0, createdOn=2016-04-14T13:48:53.989Z], processedOn=2016-04-14T13:48:55.786Z]
ProcessedPacket [p=Packet [aString=1, createdOn=2016-04-14T13:48:54.109Z], processedOn=2016-04-14T13:48:57.877Z]
ProcessedPacket [p=Packet [aString=8, createdOn=2016-04-14T13:48:54.418Z], processedOn=2016-04-14T13:49:14.108Z]
ProcessedPacket [p=Packet [aString=7, createdOn=2016-04-14T13:48:54.600Z], processedOn=2016-04-14T13:49:11.338Z]
ProcessedPacket [p=Packet [aString=5, createdOn=2016-04-14T13:48:54.705Z], processedOn=2016-04-14T13:49:06.711Z]
ProcessedPacket [p=Packet [aString=9, createdOn=2016-04-14T13:48:55.227Z], processedOn=2016-04-14T13:49:16.927Z]
ProcessedPacket [p=Packet [aString=4, createdOn=2016-04-14T13:48:56.381Z], processedOn=2016-04-14T13:49:02.161Z]
ProcessedPacket [p=Packet [aString=3, createdOn=2016-04-14T13:48:56.566Z], processedOn=2016-04-14T13:49:00.557Z]
You could launch a DoTask thread for every WorkItem. This thread processes the work.
When the work is done, you try to post the item, synchronized on the controlling object, in which you check if it's the right ID and wait if not.
The post implementation can be something like:
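synchronized(controllingObject) {
try {
while(workItem.id != nextId) controllingObject.wait();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return; // give up rather than post out of order
}
// Post the workItem
nextId++;
controllingObject.notifyAll(); // must be called on the object we hold the lock on
}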
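For reference, a self-contained version of this wait()/notifyAll() idea could look roughly like the following. OrderedPoster, post and snapshot are names invented for this sketch, IDs are assumed to start at 0 and be consecutive, and the internal list merely stands in for the outgoing queue.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedPoster<T> {
    private final List<T> posted = new ArrayList<>(); // stands in for the outgoing queue
    private int nextId = 0;                           // id that may be posted next

    // Blocks the calling worker until every item with a smaller id has been posted.
    synchronized void post(int id, T item) throws InterruptedException {
        while (id != nextId) {
            wait();
        }
        posted.add(item);
        nextId++;
        notifyAll(); // wake the workers that are waiting for their turn
    }

    synchronized List<T> snapshot() {
        return new ArrayList<>(posted);
    }

    public static void main(String[] args) throws InterruptedException {
        OrderedPoster<String> poster = new OrderedPoster<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                try {
                    String result = "item-" + id; // the real processing would happen here
                    poster.post(id, result);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(poster.snapshot());
    }
}
```

The processing itself happens outside the lock; only the short post step is serialized. The price is that a worker which finishes early sits idle in wait() until its turn comes.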
I think that you need an extra queue to hold the incoming order.
IncomingOrderQueue.
As you consume the objects, put the results in some storage, for example a Map. Another thread then consumes from the IncomingOrderQueue, picks up the IDs (hashes) of the objects in their original order, and collects the matching results from this HashMap.
This solution can easily be implemented without execution service.
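A minimal sketch of how that could look, assuming a single collecting thread. OrderRestorer and its three methods are hypothetical names; the order queue is a LinkedBlockingQueue for brevity, but a hand-written queue with wait()/notify() would do the same job.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class OrderRestorer<T> {
    private final BlockingQueue<Integer> incomingOrder = new LinkedBlockingQueue<>();
    private final Map<Integer, T> finished = new HashMap<>(); // guarded by "this"

    // Called by the submitting thread, in arrival order.
    void registerArrival(int id) {
        incomingOrder.add(id);
    }

    // Called by any worker once its item is done.
    synchronized void complete(int id, T result) {
        finished.put(id, result);
        notifyAll();
    }

    // Called by the single collecting thread; blocks until the next
    // item in arrival order has been completed.
    T takeNext() throws InterruptedException {
        int id = incomingOrder.take();
        synchronized (this) {
            while (!finished.containsKey(id)) {
                wait();
            }
            return finished.remove(id);
        }
    }
}
```

Unlike the wait-in-the-worker variant, workers never block here; only the collector waits, and results that finish early are parked in the map.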
Preprocess: add an order value to each item, prepare an array if it is not allocated.
Input: queue (concurrent sampling with order values 1, 2, 3, 4; it doesn't matter which thread gets which sample)
Output: array (writing to indexed elements, using a synch point to wait for all threads in the end, doesn't need collision checks since it writes different positions for every thread)
Postprocess: convert array to a queue.
Needs an n-element array for n threads, or some multiple of n so that postprocessing is done only once.
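The steps above can be sketched with plain threads as follows. This is only an illustration under simplifying assumptions: the whole batch is known up front, the items are ints, and an AtomicInteger plays the role of the input queue by handing out each index once. IndexedArrayWorkers and process are made-up names.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntUnaryOperator;

class IndexedArrayWorkers {
    static int[] process(int[] input, int threads, IntUnaryOperator work) throws InterruptedException {
        int[] output = new int[input.length];
        AtomicInteger next = new AtomicInteger(); // hands out each index exactly once
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                int i;
                while ((i = next.getAndIncrement()) < input.length) {
                    output[i] = work.applyAsInt(input[i]); // each slot has a single writer
                }
            });
            pool[t].start();
        }
        for (Thread worker : pool) {
            worker.join(); // sync point; join also makes the workers' writes visible
        }
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = process(new int[] {1, 2, 3, 4, 5}, 3, x -> x * 10);
        System.out.println(java.util.Arrays.toString(result));
    }
}
```

Since every index is written by exactly one thread, no locking is needed on the array itself; the only shared mutable state is the index counter.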
Related
I know this question was answered many times, but I'm struggling to understand how it works.
So in my application the user must be able to select items which will be added to a queue (displayed in a ListView using an ObservableList<Task>) and each item needs to be processed sequentially by an ExecutorService.
Also that queue should be editable (change the order and remove items from the list).
private void handleItemClicked(MouseEvent event) {
if (event.getClickCount() == 2) {
File item = listView.getSelectionModel().getSelectedItem();
Task<Void> task = createTask(item);
facade.getTaskQueueList().add(task); // this list is bound to a ListView, where it can be edited
Future result = executor.submit(task);
// where executor is an ExecutorService of which type?
try {
result.get();
} catch (Exception e) {
// ...
}
}
}
Tried it with executor = Executors.newFixedThreadPool(1) but I don't have control over the queue.
I read about ThreadPoolExecutor and queues, but I'm struggling to understand it as I'm quite new to Concurrency.
I need to run that method handleItemClicked in a background thread, so that the UI does not freeze, how can I do that the best way?
Summed up: How can I implement a queue of tasks, which is editable and sequentially processed by a background thread?
Please help me figure it out
EDIT
Using the SerialTaskQueue class from vanOekel helped me, now I want to bind the List of tasks to my ListView.
ListProperty<Runnable> listProperty = new SimpleListProperty<>();
listProperty.set(taskQueue.getTaskList()); // getTaskList() returns the LinkedList from SerialTaskQueue
queueListView.itemsProperty().bind(listProperty);
Obviously this doesn't work as it's expecting an ObservableList. Is there an elegant way to do it?
The simplest solution I can think of is to maintain the task-list outside of the executor and use a callback to feed the executor the next task if it is available. Unfortunately, it involves synchronization on the task-list and an AtomicBoolean to indicate a task executing.
The callback is simply a Runnable that wraps the original task to run and then "calls back" to see if there is another task to execute, and if so, executes it using the (background) executor.
The synchronization is needed to keep the task-list in order and at a known state. The task-list can be modified by two threads at the same time: via the callback running in the executor's (background) thread and via handleItemClicked method executed via the UI foreground thread. This in turn means that it is never exactly known when the task-list is empty for example. To keep the task-list in order and at a known fixed state, synchronization of the task-list is needed.
This still leaves an ambiguous moment to decide when a task is ready for execution. This is where the AtomicBoolean comes in: a value that has been set is always immediately visible to any other thread, and the compareAndSet method will always ensure only one thread gets an "OK".
Combining the synchronization and the use of the AtomicBoolean allows the creation of one method with a "critical section" that can be called by both foreground- and background-threads at the same time to trigger the execution of a new task if possible. The code below is designed and setup in such a way that one such method (runNextTask) can exist. It is good practice to make the "critical section" in concurrent code as simple and explicit as possible (which, in turn, generally leads to an efficient "critical section").
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;
public class SerialTaskQueue {
public static void main(String[] args) {
ExecutorService executor = Executors.newSingleThreadExecutor();
// all operations on this list must be synchronized on the list itself.
SerialTaskQueue tq = new SerialTaskQueue(executor);
try {
// test running the tasks one by one
tq.add(new SleepSome(10L));
Thread.sleep(5L);
tq.add(new SleepSome(20L));
tq.add(new SleepSome(30L));
Thread.sleep(100L);
System.out.println("Queue size: " + tq.size()); // should be empty
tq.add(new SleepSome(10L));
Thread.sleep(100L);
} catch (Exception e) {
e.printStackTrace();
} finally {
executor.shutdownNow();
}
}
// all lookups and modifications to the list must be synchronized on the list.
private final List<Runnable> tasks = new LinkedList<Runnable>();
// atomic boolean used to ensure only 1 task is executed at any given time
private final AtomicBoolean executeNextTask = new AtomicBoolean(true);
private final Executor executor;
public SerialTaskQueue(Executor executor) {
this.executor = executor;
}
public void add(Runnable task) {
synchronized(tasks) { tasks.add(task); }
runNextTask();
}
private void runNextTask() {
// critical section that ensures one task is executed.
synchronized(tasks) {
if (!tasks.isEmpty()
&& executeNextTask.compareAndSet(true, false)) {
executor.execute(wrapTask(tasks.remove(0)));
}
}
}
private CallbackTask wrapTask(Runnable task) {
return new CallbackTask(task, new Runnable() {
@Override public void run() {
if (!executeNextTask.compareAndSet(false, true)) {
System.out.println("ERROR: programming error, the callback should always run in execute state.");
}
runNextTask();
}
});
}
public int size() {
synchronized(tasks) { return tasks.size(); }
}
public Runnable get(int index) {
synchronized(tasks) { return tasks.get(index); }
}
public Runnable remove(int index) {
synchronized(tasks) { return tasks.remove(index); }
}
// general callback-task, see https://stackoverflow.com/a/826283/3080094
static class CallbackTask implements Runnable {
private final Runnable task, callback;
public CallbackTask(Runnable task, Runnable callback) {
this.task = task;
this.callback = callback;
}
@Override public void run() {
try {
task.run();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
callback.run();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
// task that just sleeps for a while
static class SleepSome implements Runnable {
static long startTime = System.currentTimeMillis();
private final long sleepTimeMs;
public SleepSome(long sleepTimeMs) {
this.sleepTimeMs = sleepTimeMs;
}
@Override public void run() {
try {
System.out.println(tdelta() + "Sleeping for " + sleepTimeMs + " ms.");
Thread.sleep(sleepTimeMs);
System.out.println(tdelta() + "Slept for " + sleepTimeMs + " ms.");
} catch (Exception e) {
e.printStackTrace();
}
}
private String tdelta() { return String.format("% 4d ", (System.currentTimeMillis() - startTime)); }
}
}
Update: if groups of tasks need to be executed serially, have a look at the adapted implementation here.
Usually existing SO topics help me get past a problem, but now I find myself stuck.
I want to implement a producer/consumer using concurrency in Java, without using existing APIs because it is for learning purposes.
My Producers are blocking the Consumers from consuming the messages from the queue (Holder), but I want Producers and Consumers to use the queue simultaneously.
You can run my sample and you will see that, while the Producer is adding, the Consumer waits for the lock. But I want the consumer to do its job right after a message is added, not when the producer tells it to.
I'm surprised that all the examples I found when searching for the P/C pattern work like mine (the producer blocks the consumer, which doesn't make sense to me).
import java.util.LinkedList;
import java.util.Queue;
import java.util.Random;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
class Holder<T> {
private int capacity;
private Queue<T> items = new LinkedList<T>();
public Holder(int capacity) {
this.capacity = capacity;
}
public synchronized void addItem(T item) throws InterruptedException {
Thread.sleep(new Random().nextInt(2000));
while (isFull()) {
System.out.println("Holder FULL. adding operation is waiting... [" + item + "]");
this.wait();
}
System.out.println(items.size() + " -- holder +++ added " + item);
items.add(item);
this.notifyAll();
}
public T getItem() throws InterruptedException {
synchronized (this) {
while (isEmpty()) {
System.out.println("Holder EMPTY. getting operation is waiting...");
this.wait();
}
T next = items.poll();
System.out.println(items.size() + " -- holder --- removed " + next + " - remaining: " + items.size());
this.notifyAll();
return next;
}
}
private synchronized boolean isEmpty() {
return items.isEmpty();
}
private synchronized boolean isFull() {
return items.size() >= capacity;
}
}
class Producer implements Runnable {
public static final int GENERATED_ITEMS_COUNT = 10;
private int id;
private Holder<String> holder;
public Producer(int id, Holder<String> holder) {
this.id = id;
this.holder = holder;
}
@Override
public void run() {
try {
for (int i = 0; i < GENERATED_ITEMS_COUNT; i++) {
String produced = "Message " + i + " from [P" + id + "] " + System.nanoTime();
holder.addItem(produced);
}
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
class Consumer implements Runnable {
private Holder<String> holder;
public Consumer(Holder<String> holder) {
this.holder = holder;
}
@Override
public void run() {
while (true) {
try {
String consumed = holder.getItem();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
}
public class ConsumerProducerApp {
public static void main(String[] args) throws InterruptedException {
Holder<String> coada = new Holder<String>(10);
Thread consumer = new Thread(new Consumer(coada));
consumer.start();
Executor executor = Executors.newCachedThreadPool();
for (int i = 1; i <= 9; i++) {
executor.execute(new Producer(i, coada));
}
}
}
EDIT:
So suppose we exclude the Thread.sleep from this equation. What if I have 100000 Producers, each producing messages? Aren't they blocking my Consumer because of that common lock on the Holder?
Isn't there any way, maybe another pattern, that lets my Consumer do its job independently? From what I understand so far, my implementation is correct and I may be trying to achieve the impossible?
To be thread-safe, the consumer and the producer may not use the queue concurrently. But adding to or removing from the queue should be superfast. In a realistic example, what takes time is producing the item (fetching a web page, for example) and consuming it (parsing it, for example).
Your sleep() call should be outside of the synchronized block:
to avoid blocking the consumer while the producer is not using the queue;
to avoid blocking other producers while the producer is not using the queue.
public void addItem(T item) throws InterruptedException {
// simulating long work, not using the queue
Thread.sleep(new Random().nextInt(2000));
// long work done, now use the queue
synchronized (this) {
while (isFull()) {
System.out.println("Holder FULL. adding operation is waiting... [" + item + "]");
this.wait();
}
System.out.println(items.size() + " -- holder +++ added " + item);
items.add(item);
this.notifyAll();
}
}
In any practical scenario you need a balanced number of producers and consumers. Otherwise, with significantly more producers, the application will collapse sooner or later because the heap fills up with produced items that have not been consumed yet.
One solution to this is to have a bounded queue like ArrayBlockingQueue. Consumers and producers are blocked during queue access for a tiny time fraction, but if the producers are going wild, the queue’s capacity will become exhausted and the producers will go into the wait state, hence consumers can catch up then.
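As a rough illustration of that back-pressure, here is a small sketch with several producers and one consumer sharing an ArrayBlockingQueue. BoundedHandOff, run and the POISON end-of-stream marker are names invented for this example; put() and take() are the blocking methods of BlockingQueue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedHandOff {
    static final String POISON = "__done__"; // hypothetical end-of-stream marker

    static List<String> run(int producers, int perProducer, int capacity) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> consumed = new ArrayList<>(); // written by the consumer thread only
        Thread consumer = new Thread(() -> {
            try {
                int finishedProducers = 0;
                while (finishedProducers < producers) {
                    String msg = queue.take(); // blocks while the queue is empty
                    if (POISON.equals(msg)) {
                        finishedProducers++;
                    } else {
                        consumed.add(msg);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                try {
                    for (int i = 0; i < perProducer; i++) {
                        queue.put("P" + id + "-" + i); // blocks while the queue is full
                    }
                    queue.put(POISON); // tell the consumer this producer is done
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        consumer.join(); // after this, reading "consumed" from the caller is safe
        return consumed;
    }
}
```

With a capacity of 10 the queue never holds more than ten messages, no matter how many producers are started, so the heap cannot be flooded by fast producers.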
If you have a lot of concurrent access to a single queue and think the small blocked times add up to something relevant, you may use a non-blocking queue like ConcurrentLinkedQueue; it's not recommended to try to implement such a data structure yourself. Here, consumers and producers can access the queue concurrently; however, nothing protects you from filling your heap until it collapses if your producers produce faster than the consumers process the items…
I'm trying to implement a mechanism where the runnables are both producer and consumer.
The situation is:
I need to read records from the DB in batches and process them. I'm trying to do this with the producer-consumer pattern: get a batch, process it, get the next batch, process it. A new batch is fetched whenever a thread sees that the queue is empty; one of the threads goes and fetches it. The problem is that I can't mark the records that have been fetched for processing, and that's my limitation. So if we fetch the next batch before the previous one is entirely committed, I might fetch the same records again. Therefore, I need to be able to submit the previous batch entirely before pulling the next one. I'm confused about what I should do here. I've tried keeping a count of the fetched records and holding my get until that count is reached.
What's the best way of handling this situation? Processing records from the DB in chunks: the biggest limitation I have here is that I can't mark the records which have been picked up. So I want batches to go through sequentially, but a batch should use multithreading internally.
public class DealStoreEnricher extends AsyncExecutionSupport {
private static final int BATCH_SIZE = 5000;
private static final Log log = LogFactory.getLog(DealStoreEnricher.class);
private final DealEnricher dealEnricher;
private int concurrency = 10;
private final BlockingQueue<QueryDealRecord> dealsToBeEnrichedQueue;
private final BlockingQueue<QueryDealRecord> dealsEnrichedQueue;
private DealStore dealStore;
private ExtractorProcess extractorProcess;
ExecutorService executor;
public DealStoreEnricher(DealEnricher dealEnricher, DealStore dealStore, ExtractorProcess extractorProcess) {
this.dealEnricher = dealEnricher;
this.dealStore = dealStore;
this.extractorProcess = extractorProcess;
dealsToBeEnrichedQueue = new LinkedBlockingQueue<QueryDealRecord>();
dealsEnrichedQueue = new LinkedBlockingQueue<QueryDealRecord>(BATCH_SIZE * 3);
}
public ExtractorProcess getExtractorProcess() {
return extractorProcess;
}
public DealEnricher getDealEnricher() {
return dealEnricher;
}
public int getConcurrency() {
return concurrency;
}
public void setConcurrency(int concurrency) {
this.concurrency = concurrency;
}
public DealStore getDealStore() {
return dealStore;
}
public DealStoreEnricher withConcurrency(int concurrency) {
setConcurrency(concurrency);
return this;
}
@Override
public void start() {
super.start();
executor = Executors.newFixedThreadPool(getConcurrency());
for (int i = 0; i < getConcurrency(); i++)
executor.submit(new Runnable() {
public void run() {
try {
QueryDealRecord record = null;
while ((record = get()) != null && !isCancelled()) {
try {
update(getDealEnricher().enrich(record));
processed.incrementAndGet();
} catch (Exception e) {
failures.incrementAndGet();
log.error("Failed to process deal: " + record.getTradeId(), e);
}
}
} catch (InterruptedException e) {
setCancelled();
}
}
});
executor.shutdown();
}
protected void update(QueryDealRecord enrichedRecord) {
dealsEnrichedQueue.add(enrichedRecord);
if (batchComplete()) {
List<QueryDealRecord> enrichedRecordsBatch = new ArrayList<QueryDealRecord>();
synchronized (this) {
dealsEnrichedQueue.drainTo(enrichedRecordsBatch);
}
if (!enrichedRecordsBatch.isEmpty())
updateTheDatabase(enrichedRecordsBatch);
}
}
private void updateTheDatabase(List<QueryDealRecord> enrichedRecordsBatch) {
getDealStore().insertEnrichedData(enrichedRecordsBatch, getExtractorProcess());
}
/**
* @return true if processed records have reached the batch size or there's
* nothing to be processed now.
*/
private boolean batchComplete() {
return dealsEnrichedQueue.size() >= BATCH_SIZE || dealsToBeEnrichedQueue.isEmpty();
}
/**
* Gets an item from the queue of things to be enriched
*
* @return {@linkplain QueryDealRecord} to be enriched
* @throws InterruptedException
*/
protected synchronized QueryDealRecord get() throws InterruptedException {
try {
if (!dealsToBeEnrichedQueue.isEmpty()) {
return dealsToBeEnrichedQueue.take();
} else {
List<QueryDealRecord> records = getNextBatchToBeProcessed();
if (!records.isEmpty()) {
dealsToBeEnrichedQueue.addAll(records);
return dealsToBeEnrichedQueue.take();
}
}
} catch (InterruptedException ie) {
throw new UnRecoverableException("Unable to retrieve QueryDealRecord", ie);
}
return null;
}
private List<QueryDealRecord> getNextBatchToBeProcessed() {
List<QueryDealRecord> recordsThatNeedEnriching = getDealStore().getTheRecordsThatNeedEnriching(getExtractorProcess());
return recordsThatNeedEnriching;
}
@Override
public void stop() {
super.stop();
if (executor != null)
executor.shutdownNow();
}
@Override
public boolean await() throws InterruptedException {
return executor.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS) && !isCancelled() && complete();
}
@Override
public boolean await(long timeout, TimeUnit unit) throws InterruptedException {
return executor.awaitTermination(timeout, unit) && !isCancelled() && complete();
}
private boolean complete() {
setCompleted();
return true;
}
}
You're already using a BlockingQueue - it does all that work for you.
However, you're using the wrong method addAll() to add new elements to the queue. That method will throw an exception if the queue is not able to accept elements. Rather you should use put() because that's the blocking method corresponding to take(), which you are using correctly.
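The difference only shows up on a bounded queue, so here is a small sketch using a deliberately tiny ArrayBlockingQueue (the class name AddAllVersusPut and the helper addAllOverflows are made up for the demonstration). On an unbounded LinkedBlockingQueue, addAll() would not run out of capacity in practice.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class AddAllVersusPut {
    // Returns true if addAll() failed with IllegalStateException because the queue filled up.
    static boolean addAllOverflows(BlockingQueue<Integer> queue) {
        try {
            queue.addAll(Arrays.asList(1, 2, 3));
            return false;
        } catch (IllegalStateException queueFull) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        System.out.println("addAll overflowed: " + addAllOverflows(queue));
        System.out.println("elements kept: " + queue.size());
        queue.take(); // make room, as a consumer would
        queue.put(3); // while the queue was full, put() would have waited instead of throwing
        System.out.println("after put: " + queue);
    }
}
```

Note that addAll() is not atomic: the elements that fit are already in the queue when the exception is thrown, so a retry would enqueue them twice.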
Regarding your statement in the post title:
second batch shouldn't come until the previous batch is complete
You need not be concerned about the timing of the incoming versus outgoing batches if you use BlockingQueue correctly.
It looks like a Semaphore will work perfectly for you. Have the producing thread acquire the semaphore while the consuming thread releases the semaphore when it completes the batch.
BlockingQueue blockingQueue = ...;
Semaphore semaphore = new Semaphore(1);
Producing-Thread
Batch batch = db.getBatch();
semaphore.acquire(); // wait until previous batch completes
blockingQueue.add(batch);
Consuming Thread
for(;;){
Batch batch = blockingQueue.take();
doBatchUpdate(batch);
semaphore.release(); // tell next batch to run
}
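A runnable sketch of the same hand-shake, with integers standing in for batches and log entries standing in for the database calls (BatchGate and run are names made up here). One deliberate difference: this sketch takes the permit before the fetch rather than after it, on the assumption that the fetch itself must not overlap the previous commit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

class BatchGate {
    // Returns the event log so the interleaving of fetches and commits can be inspected.
    static List<String> run(int batches) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Semaphore permit = new Semaphore(1);
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        Thread producer = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    permit.acquire();      // wait until the previous batch is committed
                    log.add("fetch " + b); // the database read would go here
                    queue.put(b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int n = 0; n < batches; n++) {
                    int b = queue.take();
                    log.add("commit " + b); // enrich and write back here
                    permit.release();       // allow the next fetch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return log;
    }
}
```

Inside the "commit" step the batch can still be fanned out to a pool of worker threads, as long as release() is called only after all of them have finished.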
I need to send multiple requests to many different web services and receive the results. The problem is that, if I send the requests one by one it takes so long as I need to send and process all individually.
I am wondering how I can send all the requests at once and receive the results.
As the following code shows, I have three major methods and each has its own sub methods.
Each sub-method sends a request to its associated web service and receives the results; therefore, for example, to receive the results of web service 9 I have to wait until all web services from 1 to 8 have completed. It takes a long time to send all the requests one by one and receive their results.
As shown below none of the methods nor sub-methods are related to each other, so I can call them all and receive their results in any order, the only thing which is important is to receive the results of each sub-method and populate their associated lists.
private List<StudentsResults> studentsResults = new ArrayList();
private List<DoctorsResults> doctorsResults = new ArrayList();
private List<PatientsResults> patientsResults = new ArrayList();
main (){
retrieveAllLists();
}
retrieveAllLists(){
retrieveStudents();
retrieveDoctors();
retrievePatients();
}
retrieveStudents(){
this.studentsResults = retrieveStdWS1(); //send request to Web Service 1 to receive its list of students
this.studentsResults = retrieveStdWS2(); //send request to Web Service 2 to receive its list of students
this.studentsResults = retrieveStdWS3(); //send request to Web Service 3 to receive its list of students
}
retrieveDoctors(){
this.doctorsResults = retrieveDocWS4(); //send request to Web Service 4 to receive its list of doctors
this.doctorsResults = retrieveDocWS5(); //send request to Web Service 5 to receive its list of doctors
this.doctorsResults = retrieveDocWS6(); //send request to Web Service 6 to receive its list of doctors
}
retrievePatients(){
this.patientsResults = retrievePtWS7(); //send request to Web Service 7 to receive its list of patients
this.patientsResults = retrievePtWS8(); //send request to Web Service 8 to receive its list of patients
this.patientsResults = retrievePtWS9(); //send request to Web Service 9 to receive its list of patients
}
That is a simple fork-join approach, but for clarity, you can start any number of threads and retrieve the results later as they are available, such as this approach.
ExecutorService pool = Executors.newFixedThreadPool(10);
List<Callable<String>> tasks = new ArrayList<>();
tasks.add(new Callable<String>() {
public String call() throws Exception {
Thread.sleep((new Random().nextInt(5000)) + 500);
return "Hello world";
}
});
List<Future<String>> results = pool.invokeAll(tasks);
for (Future<String> future : results) {
System.out.println(future.get());
}
pool.shutdown();
UPDATE, COMPLETE:
Here's a verbose, but workable solution. I wrote it ad hoc, and have not compiled it.
Given the three lists have different types, and the WS methods are individual, it is not
really modular, but try to use your best programming skills and see if you can modularize it a bit better.
ExecutorService pool = Executors.newFixedThreadPool(10);
List<Callable<List<StudentsResults>>> stasks = new ArrayList<>();
List<Callable<List<DoctorsResults>>> dtasks = new ArrayList<>();
List<Callable<List<PatientsResults>>> ptasks = new ArrayList<>();
stasks.add(new Callable<List<StudentsResults>>() {
public List<StudentsResults> call() throws Exception {
return retrieveStdWS1();
}
});
stasks.add(new Callable<List<StudentsResults>>() {
public List<StudentsResults> call() throws Exception {
return retrieveStdWS2();
}
});
stasks.add(new Callable<List<StudentsResults>>() {
public List<StudentsResults> call() throws Exception {
return retrieveStdWS3();
}
});
dtasks.add(new Callable<List<DoctorsResults>>() {
public List<DoctorsResults> call() throws Exception {
return retrieveDocWS4();
}
});
dtasks.add(new Callable<List<DoctorsResults>>() {
public List<DoctorsResults> call() throws Exception {
return retrieveDocWS5();
}
});
dtasks.add(new Callable<List<DoctorsResults>>() {
public List<DoctorsResults> call() throws Exception {
return retrieveDocWS6();
}
});
ptasks.add(new Callable<List<PatientsResults>>() {
public List<PatientsResults> call() throws Exception {
return retrievePtWS7();
}
});
ptasks.add(new Callable<List<PatientsResults>>() {
public List<PatientsResults> call() throws Exception {
return retrievePtWS8();
}
});
ptasks.add(new Callable<List<PatientsResults>>() {
public List<PatientsResults> call() throws Exception {
return retrievePtWS9();
}
});
List<Future<List<StudentsResults>>> sresults = pool.invokeAll(stasks);
List<Future<List<DoctorsResults>>> dresults = pool.invokeAll(dtasks);
List<Future<List<PatientsResults>>> presults = pool.invokeAll(ptasks);
for (Future<List<StudentsResults>> future : sresults) {
this.studentsResults.addAll(future.get());
}
for (Future<List<DoctorsResults>> future : dresults) {
this.doctorsResults.addAll(future.get());
}
for (Future<List<PatientsResults>> future : presults) {
this.patientsResults.addAll(future.get());
}
pool.shutdown();
Each Callable returns a list of results, and is called in its own separate thread.
When you invoke the Future.get() method you get the result back onto the main thread.
The result is NOT available until the Callable has finished, hence there are no concurrency issues.
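One possible way to modularize the code above is a pair of small generic helpers. This is only a sketch: the names ParallelFetch, submitAll and collect are made up, and the demo uses stand-in tasks returning strings where real code would call retrieveStdWS1() and friends. Note that invokeAll() only returns once all of its tasks have completed, so three consecutive invokeAll() calls run the three groups one after another; submitting everything first and collecting afterwards lets all requests run at the same time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelFetch {
    /** Submits every task immediately and returns the futures in submission order. */
    static <T> List<Future<List<T>>> submitAll(ExecutorService pool,
                                               List<Callable<List<T>>> tasks) {
        List<Future<List<T>>> futures = new ArrayList<>();
        for (Callable<List<T>> task : tasks) {
            futures.add(pool.submit(task));
        }
        return futures;
    }

    /** Waits for each future in turn and concatenates the resulting lists. */
    static <T> List<T> collect(List<Future<List<T>>> futures)
            throws InterruptedException, ExecutionException {
        List<T> merged = new ArrayList<>();
        for (Future<List<T>> future : futures) {
            merged.addAll(future.get());
        }
        return merged;
    }

    /** Demo with stand-in tasks (hypothetical data, not real web service calls). */
    static List<String> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            tasks.add(() -> Arrays.asList("s1", "s2"));
            tasks.add(() -> Arrays.asList("s3"));
            return collect(submitAll(pool, tasks));
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With these helpers you would call submitAll() once per result type (students, doctors, patients) and only then call collect() on each list of futures.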
So just for fun I am providing two working examples. The first one shows the old-school way of doing this before Java 1.5. The second shows a much cleaner way using tools available since Java 1.5:
import java.util.ArrayList;
public class ThreadingExample
{
private ArrayList <MyThread> myThreads;
public static class MyRunnable implements Runnable
{
private String data;
public String getData()
{
return data;
}
public void setData(String data)
{
this.data = data;
}
@Override
public void run()
{
}
}
public static class MyThread extends Thread
{
private MyRunnable myRunnable;
MyThread(MyRunnable runnable)
{
super(runnable);
setMyRunnable(runnable);
}
/**
* @return the myRunnable
*/
public MyRunnable getMyRunnable()
{
return myRunnable;
}
/**
* @param myRunnable the myRunnable to set
*/
public void setMyRunnable(MyRunnable myRunnable)
{
this.myRunnable = myRunnable;
}
}
public ThreadingExample()
{
myThreads = new ArrayList <MyThread> ();
}
public ArrayList <String> retrieveMyData ()
{
ArrayList <String> allmyData = new ArrayList <String> ();
if (isComplete() == false)
{
// Sadly we aren't done
return (null);
}
for (MyThread myThread : myThreads)
{
allmyData.add(myThread.getMyRunnable().getData());
}
return (allmyData);
}
private boolean isComplete()
{
boolean complete = true;
// wait for all of them to finish
for (MyThread x : myThreads)
{
if (x.isAlive())
{
complete = false;
break;
}
}
return (complete);
}
public void kickOffQueries()
{
myThreads.clear();
MyThread a = new MyThread(new MyRunnable()
{
@Override
public void run()
{
// This is where you make the call to external services
// giving the results to setData("");
setData("Data from list A");
}
});
myThreads.add(a);
MyThread b = new MyThread (new MyRunnable()
{
@Override
public void run()
{
// This is where you make the call to external services
// giving the results to setData("");
setData("Data from list B");
}
});
myThreads.add(b);
for (MyThread x : myThreads)
{
x.start();
}
boolean done = false;
while (done == false)
{
if (isComplete())
{
done = true;
}
else
{
// Sleep for 10 milliseconds
try
{
Thread.sleep(10);
}
catch (InterruptedException e)
{
e.printStackTrace();
}
}
}
}
public static void main(String [] args)
{
ThreadingExample example = new ThreadingExample();
example.kickOffQueries();
ArrayList <String> data = example.retrieveMyData();
if (data != null)
{
for (String s : data)
{
System.out.println (s);
}
}
}
}
This is the much simpler working version:
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class ThreadingExample
{
public static void main(String [] args)
{
ExecutorService service = Executors.newCachedThreadPool();
Set <Callable<String>> callables = new HashSet <Callable<String>> ();
callables.add(new Callable<String>()
{
@Override
public String call() throws Exception
{
return "This is where I make the call to web service A, and put its results here";
}
});
callables.add(new Callable<String>()
{
@Override
public String call() throws Exception
{
return "This is where I make the call to web service B, and put its results here";
}
});
callables.add(new Callable<String>()
{
@Override
public String call() throws Exception
{
return "This is where I make the call to web service C, and put its results here";
}
});
try
{
List<Future<String>> futures = service.invokeAll(callables);
for (Future<String> future : futures)
{
System.out.println (future.get());
}
}
catch (InterruptedException e)
{
e.printStackTrace();
}
catch (ExecutionException e)
{
e.printStackTrace();
}
}
}
You can ask your jax-ws implementation to generate asynchronous bindings for the web service.
This has two advantages that I can see:
As discussed in Asynchronous web services calls with JAX-WS: Use wsimport support for asynchrony or roll my own? , jax-ws will generate well-tested (and possibly fancier) code for you, you need not instantiate the ExecutorService yourself. So less work for you! (but also less control over the threading implementation details)
The generated bindings include a method where you specify a callback handler, which may suit your needs better than synchronously calling get() on all response lists on the thread calling retrieveAllLists(). It allows for per-service-call error handling and will process the results in parallel, which is nice if processing is non-trivial.
An example for Metro can be found on the Metro site. Note the contents of the custom bindings file custom-client.xml :
<bindings ...>
<bindings node="wsdl:definitions">
<enableAsyncMapping>true</enableAsyncMapping>
</bindings>
</bindings>
When you specify this bindings file to wsimport, it'll generate a client which returns an object that implements javax.xml.ws.Response<T>. Response extends the Future interface that others also suggest you use when rolling your own implementation.
So, unsurprisingly, if you go without the callbacks, the code will look similar to the other answers:
public void retrieveAllLists() throws InterruptedException, ExecutionException {
// first fire all requests
Response<List<StudentsResults>> students1 = ws1.getStudents();
Response<List<StudentsResults>> students2 = ws2.getStudents();
Response<List<StudentsResults>> students3 = ws3.getStudents();
Response<List<DoctorsResults>> doctors1 = ws4.getDoctors();
Response<List<DoctorsResults>> doctors2 = ws5.getDoctors();
Response<List<DoctorsResults>> doctors3 = ws6.getDoctors();
Response<List<PatientsResults>> patients1 = ws7.getPatients();
Response<List<PatientsResults>> patients2 = ws8.getPatients();
Response<List<PatientsResults>> patients3 = ws9.getPatients();
// then await and collect all the responses
studentsResults.addAll(students1.get());
studentsResults.addAll(students2.get());
studentsResults.addAll(students3.get());
doctorsResults.addAll(doctors1.get());
doctorsResults.addAll(doctors2.get());
doctorsResults.addAll(doctors3.get());
patientsResults.addAll(patients1.get());
patientsResults.addAll(patients2.get());
patientsResults.addAll(patients3.get());
}
If you create callback handlers such as
private class StudentsCallbackHandler
implements AsyncHandler<List<StudentsResults>> {
public void handleResponse(Response<List<StudentsResults>> response) {
try {
studentsResults.addAll(response.get());
} catch (ExecutionException e) {
errors.add(new CustomError("Failed to retrieve Students.", e.getCause()));
} catch (InterruptedException e) {
log.error("Interrupted", e);
}
}
}
you can use them like this:
public void retrieveAllLists() throws InterruptedException, ExecutionException {
List<Future<?>> responses = new ArrayList<Future<?>>();
// fire all requests, specifying callback handlers
responses.add(ws1.getStudents(new StudentsCallbackHandler()));
responses.add(ws2.getStudents(new StudentsCallbackHandler()));
responses.add(ws3.getStudents(new StudentsCallbackHandler()));
...
// await completion
for( Future<?> response: responses ) {
response.get();
}
// or do some other work, and poll response.isDone()
}
Note that the studentsResults collection needs to be thread safe now, since results will get added concurrently!
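A minimal sketch of one way to get that thread safety, assuming a plain synchronized wrapper is good enough for your load. The class ThreadSafeResults and its simulate() method are made up for illustration, and String stands in for StudentsResults; the threads play the role of the callback threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ThreadSafeResults {
    // Collections.synchronizedList makes individual calls such as addAll() atomic.
    private final List<String> studentsResults =
            Collections.synchronizedList(new ArrayList<String>());

    /** Called from callback threads; safe to invoke concurrently. */
    void onResponse(List<String> batch) {
        studentsResults.addAll(batch);
    }

    int size() {
        return studentsResults.size();
    }

    /** Simulates concurrent callbacks: each thread adds perThread single-element batches. */
    static int simulate(int threads, int perThread) {
        ThreadSafeResults results = new ThreadSafeResults();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    results.onResponse(Arrays.asList("student"));
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return results.size();
    }
}
```

With a plain ArrayList in place of the synchronized wrapper, the same simulation can lose elements or throw, which is the problem the note above warns about.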
Looking at the problem, you need to integrate your application with 10+ different web services while making all the calls asynchronous. This can be done easily with Apache Camel. It is a prominent framework for enterprise integration and also supports async processing. You can use its CXF component for calling web services and its routing engine for invocation and processing of results. Look at the following page regarding Camel's async routing capability. They have also provided a complete example invoking web services asynchronously using CXF; it is available at its Maven repo. Also see the following page for more details.
You might consider the following paradigm in which you create work (serially), but the actual work is done in parallel. One way to do this is to: 1) have your "main" create a queue of work items; 2) create a "doWork" object that queries the queue for work to do; 3) have "main" start some number of "doWork" threads (can be the same number as the number of different services, or a smaller number); 4) have the "doWork" objects add their results to an object list (whatever construct works: Vector, list...).
Each "doWork" object would mark their queue item complete, put all results in the passed container and check for new work (if no more on the queue, it would sleep and try again).
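A rough sketch of that paradigm, under a few assumptions: all work items are known before the workers start, so a worker exits when the queue is empty instead of sleeping and retrying; the names WorkQueueDemo and runAll are made up; and the work items are stand-in Callables rather than real web service calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;

class WorkQueueDemo {
    /** Starts workerCount threads that drain the work queue and collect the results. */
    static List<String> runAll(List<Callable<String>> work, int workerCount) {
        Queue<Callable<String>> queue = new ConcurrentLinkedQueue<>(work);
        Queue<String> results = new ConcurrentLinkedQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(() -> {
                Callable<String> item;
                // poll() removes an item atomically, so no item is processed twice
                while ((item = queue.poll()) != null) {
                    try {
                        results.add(item.call());
                    } catch (Exception e) {
                        results.add("failed: " + e);
                    }
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(results);
    }

    /** Runs 20 stand-in work items on 3 workers; returns the number of distinct results. */
    static int demo() {
        List<Callable<String>> work = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            final int id = i;
            work.add(() -> "result-" + id);
        }
        return new HashSet<>(runAll(work, 3)).size();
    }
}
```

The order of the results depends on thread scheduling, so tag each result with its source if the order matters.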
Of course you will want to see how well you can construct your class model. If each of the webservices is quite different for parsing, then you may want to create an Interface that each of your "retrieveinfo" classes promises to implement.
There are various options for implementing this.
JMS : quality of service and management, e.g. redelivery attempt, dead message queue, load management, scalability, clustering, monitoring, etc.
Observer pattern: simply use the Observer pattern for this. For more details see OODesign, and for how to solve the producer-consumer problem see Kodelog.
I have to write this producer-consumer application using multithreading. I wrote the following Java code but haven't been able to figure out where it is going wrong. I would also like to know whether my class design is apt and whether my coding style is appropriate.
Thanks in Advance!!!
EDIT
I have modified the producer-consumer code, but it still has some problems.
import java.util.*;
import java.lang.Thread;
public class pc_example {
public static void main (String [] args) {
Store store = new Store( 10 );
produce p = new produce(store);
consume c = new consume (store);
p.start();
c.start();
}
}
class Store {
public Queue<Integer> Q;
public int max_capacity;
Store( int max_capacity ) {
Q = new LinkedList<Integer>();
this.max_capacity = max_capacity;
}
}
class produce extends Thread {
private Store store;
private int element;
produce ( Store store ) {
this.store = store;
this.element = 0;
}
public void put() {
synchronized (store) {
if (store.Q.size() > store.max_capacity) {
try {
wait();
} catch (InterruptedException e) {}
}
else {
element ++;
System.out.println( "Producer put: " + element );
store.Q.add(element);
notify();
}
}
}
}
class consume extends Thread {
private int cons;
private Store store;
consume (Store store) {
this.store = store;
this.cons = 0;
}
public void get() {
synchronized (store) {
if (store.Q.size() == 0) {
try {
wait();
} catch (InterruptedException e) {}
}
else {
int a = store.Q.remove();
System.out.println( "Consumer put: " + a );
cons++;
if (store.Q.size() < store.max_capacity)
notify();
}
}
}
}
You are creating two instances of Producer_Consumer, each with its own queue, so nothing is shared between them. You should not instantiate the queue inside the class, but pass it in from outside as a constructor argument.
class Producer_Consumer extends Thread {
private final Queue<Integer> queue;
Producer_Consumer(int mode, Queue<Integer> queue)
{
this.queue = queue;
}
public static void main(String[] args)
{
Queue<Integer> queue = new LinkedList<Integer>();
Producer_Consumer produce = new Producer_Consumer(2, queue);
Producer_Consumer consume = new Producer_Consumer(1, queue);
produce.start();
consume.start();
}
}
A further improvement, as suggested, would be to use a blocking queue from the java.util.concurrent package. There's really no need to use Object's wait() and notify() methods for this kind of task.
For a complete example see the producer-consumer example in the java api for BlockingQueue.
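For reference, a compact sketch of the same idea, assuming a single producer and a single consumer. The class name BlockingQueueDemo and the POISON end-of-stream marker are made up for this illustration; the BlockingQueue API itself has no such marker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BlockingQueueDemo {
    private static final int POISON = -1; // hypothetical end-of-stream marker

    /** Produces 0..itemCount-1 on one thread, consumes on another, returns what was consumed. */
    static List<Integer> run(int itemCount, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> consumed = new ArrayList<>(); // touched only by the consumer until join()

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    queue.put(i);          // blocks while the queue is full
                }
                queue.put(POISON);         // tell the consumer to stop
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty
                for (int item = queue.take(); item != POISON; item = queue.take()) {
                    consumed.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

All waiting and signalling is handled inside put() and take(), so neither thread needs synchronized, wait() or notify().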
There are several errors in the code. First, the producer and the consumer are not using the same queue, i.e. there are two queue instances. Second, the notify and wait methods are also operating on different objects.
Getting your example to work needs several things:
Only one queue
Thread safe handling of the queue
Handling notification and waiting on the same object
The producer code could be rewritten to:
public void produce() {
int i = 0;
while (i < 100) {
synchronized(Q) {
if (Q.size() < max_capacity) {
Q.add(i);
System.out.println("Produced Item" + i);
i++;
Q.notify();
} else {
try {
Q.wait();
} catch (InterruptedException e) {
System.out.println("Exception");
}
}
}
}
}
1. Use appropriate types. Your mode is much better off as an enumeration than as an int.
2. Your conduit between the threads, Q, isn't actually shared since it is not declared static.
You would have problems anyway since LinkedList isn't synchronized.
Synchronizing produce() and consume() makes no difference.
This is what a BlockingQueue is for.
Each of your objects is working on a different instance of the
Queue<Integer> Q
so the producer puts stuff into one, but the consumer never looks in that one - it's trying to get items from a Q that never gets anything put into it.
However, once you address that you need to make sure that the Queue<> object is handled in a threadsafe manner. While the produce() and consume() methods are each synchronized, the synchronization at this level won't help since you're dealing with two distinct Producer_Consumer objects. They need to synchronize their access to the shared resource some other way.
I suggest looking at the classes in java.util.concurrent (available since Java 1.5). In particular, instead of a Queue, you might use a BlockingQueue.
It allows you to produce:
try {
while(true) { queue.put(produce()); }
} catch (InterruptedException ex) { ... handle ...}
and consume:
try {
while(true) { consume(queue.take()); }
} catch (InterruptedException ex) { ... handle ...}
Otherwise (if this is an exercise in Java synchronization), you should
improve the visibility of fields (why is only max_capacity private?)
improve the design (I prefer to create two separate classes for producers and consumers)
ensure that producers and consumers wait and notify on the SAME object
make producers and consumers work on the same queue
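If you do want to stay with wait() and notify(), the points above could look roughly like this. It is a sketch, not a drop-in fix for the posted code: the class BoundedStore and its demo() method are made up, the shared queue lives inside one object that both threads use as the monitor, and the conditions are re-checked in a while loop because a thread can wake up without its condition being true.

```java
import java.util.LinkedList;
import java.util.Queue;

class BoundedStore {
    private final Queue<Integer> items = new LinkedList<>();
    private final int maxCapacity;

    BoundedStore(int maxCapacity) {
        this.maxCapacity = maxCapacity;
    }

    /** Blocks while the store is full; waits and notifies on this store object. */
    synchronized void put(int value) throws InterruptedException {
        while (items.size() >= maxCapacity) {
            wait();
        }
        items.add(value);
        notifyAll(); // wake up a consumer that may be waiting for data
    }

    /** Blocks while the store is empty. */
    synchronized int take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        int value = items.remove();
        notifyAll(); // wake up a producer that may be waiting for space
        return value;
    }

    /** One producer puts 1..count, one consumer sums what it takes; returns the sum. */
    static long demo(int count) {
        BoundedStore store = new BoundedStore(10);
        long[] sum = new long[1]; // written only by the consumer thread
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= count; i++) store.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) sum[0] += store.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sum[0];
    }
}
```

Keeping the queue private and exposing only put() and take() means producers and consumers cannot accidentally wait on one object and notify on another.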
The run() methods are missing in your Thread classes, so your threads start and finish without doing anything. Rename the put and get methods to run and use a while loop. Also note that you need to call notify and wait on the store (the monitor).
public void run() {
while(true){
synchronized (store) {
if (store.Q.size() >= store.max_capacity) {
try {
store.wait();
} catch (InterruptedException e) {}
}
else {
element ++;
System.out.println( "Producer put: " + element );
store.Q.add(element);
store.notify();
}
}
}
}