Producer-consumer: how does the consumer stop? (Java)

I have simulated the producer-consumer problem with the code below. My question is: how does the consumer stop if it sits in a constant while(true) loop?
In the code below, I've added
if (queue.peek()==null)
Thread.currentThread().interrupt();
which works nicely in this example. But in my real-world design it doesn't work: sometimes the producer takes longer to 'put' the data, so the consumer interrupts itself while more data is still coming. In general, I know I can put 'poison' data in the queue (for example, an object with the value XYZ) and check for it in the consumer. But this poison makes the code look really bad. I wonder if anyone has a different approach.
public class ConsumerThread implements Runnable
{
private BlockingQueue<Integer> queue;
private String name;
private boolean isFirstTimeConsuming = true;
public ConsumerThread(String name, BlockingQueue<Integer> queue)
{
this.queue=queue;
this.name=name;
}
@Override
public void run()
{
try
{
while (true)
{
if (isFirstTimeConsuming)
{
System.out.println(name+" is initializing...");
Thread.sleep(4000);
isFirstTimeConsuming=false;
}
try{
if (queue.peek()==null)
Thread.currentThread().interrupt();
Integer data = queue.take();
System.out.println(name+" consumed ------->"+data);
Thread.sleep(70);
}catch(InterruptedException ie)
{
System.out.println("InterruptedException!!!!");
break;
}
}
System.out.println("Consumer " + this.name + " finished its job; terminating.");
}catch (InterruptedException e)
{
e.printStackTrace();
}
}
}

A: There is simply no guarantee that just because peek returns null, the producer has stopped producing. What if the producer simply got slowed down? Now, the consumer quits, and the producer keeps producing. So the 'peek' -> 'break' idea basically fails.
B: Setting a 'done/run' flag in the producer and reading it in the consumer also fails, if:
the consumer checks the flag, finds it should keep running, then does a 'take'
in the meantime, the producer sets the flag to 'don't run'
Now the consumer blocks forever, waiting for a ghost packet.
The opposite can also happen, and one packet is left unconsumed.
Then to get around this, you will want to do additional synchronization with mutexes over and above the 'BlockingQueue'.
C: I find Rosetta Code to be a fine source for deciding what is good practice in situations like this:
http://rosettacode.org/wiki/Synchronous_concurrency#Java
The producer and consumer must agree upon an object (or an attribute of an object) that represents end of input. The producer then sets that attribute in the last packet, and the consumer stops consuming when it sees it. This is what you referred to in your question as 'poison'.
In the Rosetta Code example above, this 'object' is simply an empty String called 'EOF':
final String EOF = new String();
// Producer
while ((line = br.readLine()) != null)
queue.put(line);
br.close();
// signal end of input
queue.put(EOF);
// Consumer
while (true)
{
try
{
String line = queue.take();
// Reference equality
if (line == EOF)
break;
System.out.println(line);
linesWrote++;
}
catch (InterruptedException ie)
{
}
}

Do NOT use interrupt on the thread; rather, break out of the loop when it is no longer needed:
if (queue.peek()==null)
break;
Or you can use a variable to mark that closing is pending, finish any further operations, and then break out of the loop:
if (queue.peek()==null)
closing = true;
//Do further operations ...
if(closing)
break;

In the real world, most messaging comes with a header of some sort that defines a message type / sub-type or perhaps different objects.
You can create a command and control object or message type that tells the thread to do something when it gets the message (like shutdown, reload a table, add a new listener, etc.).
This way, you can have, say, a command-and-control (CNC) thread simply send messages into the normal message flow. In a large-scale system, the CNC thread could be talking to an operational terminal, etc.

If your queue can become empty before you'd like your consumer to terminate, then you'll need a flag to tell the thread when to stop. Add a setter method so the producer can tell the consumer to shut down. Then modify your code so that instead of:
if (queue.isEmpty())
break;
have your code check
if (!run)
{
break;
}
else if (queue.isEmpty())
{
Thread.sleep(200);
continue;
}
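A minimal sketch of the flag approach, under some assumptions: the class name `FlagConsumer` and its `shutdown()`, `consumed()` and `demo()` members are hypothetical, and the `isEmpty()`/`sleep()` pair is swapped for `poll` with a timeout so the consumer never blocks forever in `take()`. The flag is read before each poll; because the producer clears it only after its last `put`, an empty poll that follows a cleared flag means nothing more is coming.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical consumer; names are illustrative, not from the question.
class FlagConsumer implements Runnable {
    private final BlockingQueue<Integer> queue;
    private final List<Integer> consumed = new ArrayList<>();
    // volatile so that the producer's write is visible to the consumer thread
    private volatile boolean run = true;

    FlagConsumer(BlockingQueue<Integer> queue) { this.queue = queue; }

    // The producer calls this only after its last put().
    void shutdown() { run = false; }

    List<Integer> consumed() { return consumed; }

    @Override
    public void run() {
        try {
            while (true) {
                // Read the flag BEFORE polling: if a stop was already requested,
                // every item has been queued, so an empty poll is final.
                boolean stopRequested = !run;
                Integer data = queue.poll(200, TimeUnit.MILLISECONDS);
                if (data != null) {
                    consumed.add(data);
                } else if (stopRequested) {
                    break;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    // Demo: produce 0..n-1, request shutdown, wait for the consumer to drain.
    static List<Integer> demo(int n) {
        BlockingQueue<Integer> q = new LinkedBlockingQueue<>();
        FlagConsumer consumer = new FlagConsumer(q);
        Thread t = new Thread(consumer);
        t.start();
        try {
            for (int i = 0; i < n; i++) q.put(i);
            consumer.shutdown();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumer.consumed();
    }
}
```

The cost of this design is latency on shutdown: the consumer may wait up to one poll timeout after the flag is cleared before it exits.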

You can use this typesafe pattern with poison pills:
public sealed interface BaseMessage {
final class ValidMessage<T> implements BaseMessage {
@Nonnull
private final T value;
public ValidMessage(@Nonnull T value) {
this.value = value;
}
@Nonnull
public T getValue() {
return value;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
ValidMessage<?> that = (ValidMessage<?>) o;
return value.equals(that.value);
}
@Override
public int hashCode() {
return Objects.hash(value);
}
@Override
public String toString() {
return "ValidMessage{value=%s}".formatted(value);
}
}
final class PoisonedMessage implements BaseMessage {
public static final PoisonedMessage INSTANCE = new PoisonedMessage();
private PoisonedMessage() {
}
@Override
public String toString() {
return "PoisonedMessage{}";
}
}
}
public class Producer implements Callable<Void> {
@Nonnull
private final BlockingQueue<BaseMessage> messages;
Producer(@Nonnull BlockingQueue<BaseMessage> messages) {
this.messages = messages;
}
@Override
public Void call() throws Exception {
messages.put(new BaseMessage.ValidMessage<>(1));
messages.put(new BaseMessage.ValidMessage<>(2));
messages.put(new BaseMessage.ValidMessage<>(3));
messages.put(BaseMessage.PoisonedMessage.INSTANCE);
return null;
}
}
public class Consumer implements Callable<Void> {
@Nonnull
private final BlockingQueue<BaseMessage> messages;
private final int maxPoisons;
public Consumer(@Nonnull BlockingQueue<BaseMessage> messages, int maxPoisons) {
this.messages = messages;
this.maxPoisons = maxPoisons;
}
@Override
public Void call() throws Exception {
int poisonsReceived = 0;
while (poisonsReceived < maxPoisons && !Thread.currentThread().isInterrupted()) {
BaseMessage message = messages.take();
if (message instanceof BaseMessage.ValidMessage<?> vm) {
Integer value = (Integer) vm.getValue();
System.out.println(value);
} else if (message instanceof BaseMessage.PoisonedMessage) {
++poisonsReceived;
} else {
throw new IllegalArgumentException("Invalid BaseMessage type: " + message);
}
}
return null;
}
}

Related

Multithreaded execution where order of finished Work Items is preserved

I have a flow of units of work, let's call them "Work Items", that are processed sequentially (for now). I'd like to speed up processing by doing the work multithreaded.
Constraint: Those work items come in a specific order, during processing the order is not relevant - but once processing is finished the order must be restored.
Something like this:
|.|
|.|
|4|
|3|
|2| <- incoming queue
|1|
/ | \
2 1 3 <- worker threads
\ | /
|3|
|2| <- outgoing queue
|1|
I would like to solve this problem in Java, preferably without Executor Services, Futures, etc., but with basic concurrency methods like wait(), notify(), etc.
The reason is: my Work Items are very small and fine-grained; they finish processing in about 0.2 milliseconds each. So I fear that using stuff from java.util.concurrent.* might introduce way too much overhead and slow my code down.
The examples I found so far all preserve the order during processing (which is irrelevant in my case) and didn't care about order after processing (which is crucial in my case).
This is how I solved your problem in a previous project (but with java.util.concurrent):
(1) WorkItem class does the actual work/processing:
public class WorkItem implements Callable<WorkItem> {
Object content;
public WorkItem(Object content) {
super();
this.content = content;
}
public WorkItem call() throws Exception {
// getContent() + do your processing
return this;
}
}
(2) This class puts Work Items in a queue and initiates processing:
public class Producer {
...
public Producer() {
super();
workerQueue = new ArrayBlockingQueue<Future<WorkItem>>(THREADS_TO_USE);
completionService = new ExecutorCompletionService<WorkItem>(Executors.newFixedThreadPool(THREADS_TO_USE));
workerThread = new Thread(new Worker(workerQueue));
workerThread.start();
}
public void send(Object o) throws Exception {
WorkItem workItem = new WorkItem(o);
Future<WorkItem> future = completionService.submit(workItem);
workerQueue.put(future);
}
}
(3) Once processing is finished the Work Items are dequeued here:
public class Worker implements Runnable {
private ArrayBlockingQueue<Future<WorkItem>> workerQueue = null;
public Worker(ArrayBlockingQueue<Future<WorkItem>> workerQueue) {
super();
this.workerQueue = workerQueue;
}
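public void run() {
try {
while (true) {
Future<WorkItem> fwi = workerQueue.take(); // dequeue the oldest submitted future
fwi.get(); // wait until it has finished processing
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // stop when interrupted
} catch (ExecutionException e) {
throw new IllegalStateException(e); // processing of a work item failed
}
}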
}
(4) This is how you would use the stuff in your code and submit new work:
public class MainApp {
public static void main(String[] args) throws Exception {
Producer p = new Producer();
for (int i = 0; i < 10000; i++)
p.send(i);
}
}
If you allow BlockingQueue, why would you ignore the rest of the concurrency utilities in Java?
You could use, e.g., a Stream (if you have Java 8) for the above:
List<Type> data = ...;
List<Other> out = data.parallelStream()
.map(t -> doSomeWork(t))
.collect(Collectors.toList());
Because you started from an ordered Collection (List), and collect also to a List, you will have results in the same order as the input.
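A small self-contained sketch of that claim. The class name `OrderedParallelMap` and the squaring in `doSomeWork` are stand-ins (assumptions) for the real types and processing:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderedParallelMap {
    // Stand-in for the real per-item work (hypothetical).
    static int doSomeWork(int t) {
        return t * t;
    }

    // Items may be processed on several threads, but collecting an ordered
    // stream to a List keeps the encounter order of the source list.
    static List<Integer> process(List<Integer> data) {
        return data.parallelStream()
                .map(OrderedParallelMap::doSomeWork)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> in = IntStream.rangeClosed(1, 8).boxed().collect(Collectors.toList());
        System.out.println(process(in)); // prints [1, 4, 9, 16, 25, 36, 49, 64]
    }
}
```

Note that parallel streams run on the common fork/join pool, so whether this beats a sequential loop for 0.2 ms items is something to measure rather than assume.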
Just give each object to be processed an ID, and create a proxy that accepts finished work and hands it back only when the IDs are sequential. Sample code is below. Note how simple it is: it uses an unsynchronized auto-sorting collection and just two simple methods as its API.
public class SequentialPushingProxy {
static class OrderedJob implements Comparable<OrderedJob>{
static AtomicInteger idSource = new AtomicInteger();
int id;
public OrderedJob() {
id = idSource.incrementAndGet();
}
public int getId() {
return id;
}
@Override
public int compareTo(OrderedJob o) {
return Integer.compare(id, o.getId());
}
}
int lastId = OrderedJob.idSource.get();
public Queue<OrderedJob> queue;
public SequentialPushingProxy() {
queue = new PriorityQueue<OrderedJob>();
}
public synchronized void pushResult(OrderedJob job) {
queue.add(job);
}
List<OrderedJob> jobsToReturn = new ArrayList<OrderedJob>();
public synchronized List<OrderedJob> getFinishedJobs() {
while (queue.peek() != null) {
// only one consumer at a time, will be safe
if (queue.peek().getId() == lastId+1) {
jobsToReturn.add(queue.poll());
lastId++;
} else {
break;
}
}
if (jobsToReturn.size() != 0) {
List<OrderedJob> toRet = jobsToReturn;
jobsToReturn = new ArrayList<OrderedJob>();
return toRet;
}
return Collections.emptyList();
}
public static void main(String[] args) {
final SequentialPushingProxy proxy = new SequentialPushingProxy();
int numProducerThreads = 5;
for (int i=0; i<numProducerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
proxy.pushResult(new OrderedJob());
}
}
}).start();
}
int numConsumerThreads = 1;
for (int i=0; i<numConsumerThreads; i++) {
new Thread(new Runnable() {
@Override
public void run() {
while(true) {
List<OrderedJob> ret = proxy.getFinishedJobs();
System.out.println("got "+ret.size()+" finished jobs");
try {
Thread.sleep(200);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}).start();
}
try {
Thread.sleep(5000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
System.exit(0);
}
}
This code could be easily improved to
allow pushing more than one job result at once, to reduce the synchronization costs
introduce a limit to returned collection to get done jobs in smaller chunks
extract an interface for those 2 public methods and switch implementations to perform tests
You could have 3 input and 3 output queues - one of each type for each worker thread.
Now when you want to insert something into the input queue you put it into only one of the 3 input queues. You change the input queues in a round robin fashion. The same applies to the output, when you want to take something from the output you choose the first of the output queues and once you get your element you switch to the next queue.
All the queues need to be blocking.
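A rough sketch of the round-robin idea, assuming one submitting thread and one collecting thread. The class `RoundRobinPipeline` and its method names are hypothetical; for brevity it uses `ArrayBlockingQueue`, although the same layout could be built on hand-written blocking queues using wait()/notify().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

class RoundRobinPipeline<I, O> {
    private final List<BlockingQueue<I>> inputs = new ArrayList<>();
    private final List<BlockingQueue<O>> outputs = new ArrayList<>();
    private int nextIn = 0;  // used only by the single submitting thread
    private int nextOut = 0; // used only by the single collecting thread

    RoundRobinPipeline(int workers, int capacity, Function<I, O> work) {
        for (int w = 0; w < workers; w++) {
            BlockingQueue<I> in = new ArrayBlockingQueue<>(capacity);
            BlockingQueue<O> out = new ArrayBlockingQueue<>(capacity);
            inputs.add(in);
            outputs.add(out);
            // Each worker owns exactly one input and one output queue.
            Thread t = new Thread(() -> {
                try {
                    while (true) out.put(work.apply(in.take()));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true); // do not keep the JVM alive in this sketch
            t.start();
        }
    }

    // Item k goes to input queue k % workers.
    void submit(I item) throws InterruptedException {
        inputs.get(nextIn).put(item);
        nextIn = (nextIn + 1) % inputs.size();
    }

    // Result k is read from output queue k % workers, which restores the order.
    O take() throws InterruptedException {
        O result = outputs.get(nextOut).take();
        nextOut = (nextOut + 1) % outputs.size();
        return result;
    }

    // Demo: double n numbers on 3 workers and collect them in input order.
    static List<Integer> demo(int n) {
        RoundRobinPipeline<Integer, Integer> p =
                new RoundRobinPipeline<>(3, Math.max(1, n), x -> x * 2);
        List<Integer> results = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) p.submit(i);
            for (int i = 0; i < n; i++) results.add(p.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

One trade-off of this layout: a slow item stalls the collector even if the other workers have results ready, because results are always read in strict rotation.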
Pump all your Futures through a BlockingQueue. Here's all the code you need:
public class SequentialProcessor implements Consumer<Task> {
private final ExecutorService executor = Executors.newCachedThreadPool();
private final BlockingDeque<Future<Result>> queue = new LinkedBlockingDeque<>();
public SequentialProcessor(Consumer<Result> listener) {
new Thread(() -> {
while (true) {
try {
listener.accept(queue.take().get());
} catch (InterruptedException | ExecutionException e) {
// handle the exception however you want, perhaps just logging it
}
}
}).start();
}
public void accept(Task task) {
queue.add(executor.submit(callableFromTask(task)));
}
private Callable<Result> callableFromTask(Task task) {
return <how to create a Result from a Task>; // implement this however
}
}
Then to use, create a SequentialProcessor (once):
SequentialProcessor processor = new SequentialProcessor(whatToDoWithResults);
and pump tasks to it:
Stream<Task> tasks; // given this
tasks.forEach(processor); // simply this
I created the callableFromTask() method for illustration, but if getting a Result from a Task is simple, you can dispense with it and use a lambda or method reference instead.
For example, if Task had a getResult() method, do this:
queue.add(executor.submit(task::getResult));
or if you need an expression (lambda):
queue.add(executor.submit(() -> task.getValue() + "foo")); // or whatever
Reactive programming could help. During my brief experience with RxJava I found it more intuitive and easier to work with than core language features like Future. Your mileage may vary. Here is a helpful starting point: https://www.youtube.com/watch?v=_t06LRX0DV0
The example below also shows how this could be done. We have Packets that need to be processed. They are taken through a simple transformation and finally merged into one list. The output appended to this message shows that the Packets are received and transformed at different points in time, but in the end they are output in the order in which they were received.
import static java.time.Instant.now;
import static rx.schedulers.Schedulers.io;
import java.time.Instant;
import java.util.List;
import java.util.Random;
import rx.Observable;
import rx.Subscriber;
public class RxApp {
public static void main(String... args) throws InterruptedException {
List<ProcessedPacket> processedPackets = Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) //
.map(Packet::transform) //
.toSortedList() //
.toBlocking() //
.single();
System.out.println("===== RESULTS =====");
processedPackets.stream().forEach(System.out::println);
}
static Observable<Packet> getPacket(Integer i) {
return Observable.create((Subscriber<? super Packet> s) -> {
// simulate latency
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("packet requested for " + i);
s.onNext(new Packet(i.toString(), now()));
s.onCompleted();
});
}
}
class Packet {
String aString;
Instant createdOn;
public Packet(String aString, Instant time) {
this.aString = aString;
this.createdOn = time;
}
public ProcessedPacket transform() {
System.out.println(" Packet being transformed " + aString);
try {
Thread.sleep(new Random().nextInt(5000));
} catch (Exception e) {
e.printStackTrace();
}
ProcessedPacket newPacket = new ProcessedPacket(this, now());
return newPacket;
}
@Override
public String toString() {
return "Packet [aString=" + aString + ", createdOn=" + createdOn + "]";
}
}
class ProcessedPacket implements Comparable<ProcessedPacket> {
Packet p;
Instant processedOn;
public ProcessedPacket(Packet p, Instant now) {
this.p = p;
this.processedOn = now;
}
@Override
public int compareTo(ProcessedPacket o) {
return p.createdOn.compareTo(o.p.createdOn);
}
@Override
public String toString() {
return "ProcessedPacket [p=" + p + ", processedOn=" + processedOn + "]";
}
}
Deconstruction
Observable.range(0, 10) //
.flatMap(i -> {
return getPacket(i).subscribeOn(io());
}) // source the input as observables on multiple threads
.map(Packet::transform) // processing the input data
.toSortedList() // sorting to sequence the processed inputs;
.toBlocking() //
.single();
On one particular run Packets were received in the order 2,6,0,1,8,7,5,9,4,3 and processed in order 2,6,0,1,3,4,5,7,8,9 on different threads
packet requested for 2
Packet being transformed 2
packet requested for 6
Packet being transformed 6
packet requested for 0
packet requested for 1
Packet being transformed 0
packet requested for 8
packet requested for 7
packet requested for 5
packet requested for 9
Packet being transformed 1
packet requested for 4
packet requested for 3
Packet being transformed 3
Packet being transformed 4
Packet being transformed 5
Packet being transformed 7
Packet being transformed 8
Packet being transformed 9
===== RESULTS =====
ProcessedPacket [p=Packet [aString=2, createdOn=2016-04-14T13:48:52.060Z], processedOn=2016-04-14T13:48:53.247Z]
ProcessedPacket [p=Packet [aString=6, createdOn=2016-04-14T13:48:52.130Z], processedOn=2016-04-14T13:48:54.208Z]
ProcessedPacket [p=Packet [aString=0, createdOn=2016-04-14T13:48:53.989Z], processedOn=2016-04-14T13:48:55.786Z]
ProcessedPacket [p=Packet [aString=1, createdOn=2016-04-14T13:48:54.109Z], processedOn=2016-04-14T13:48:57.877Z]
ProcessedPacket [p=Packet [aString=8, createdOn=2016-04-14T13:48:54.418Z], processedOn=2016-04-14T13:49:14.108Z]
ProcessedPacket [p=Packet [aString=7, createdOn=2016-04-14T13:48:54.600Z], processedOn=2016-04-14T13:49:11.338Z]
ProcessedPacket [p=Packet [aString=5, createdOn=2016-04-14T13:48:54.705Z], processedOn=2016-04-14T13:49:06.711Z]
ProcessedPacket [p=Packet [aString=9, createdOn=2016-04-14T13:48:55.227Z], processedOn=2016-04-14T13:49:16.927Z]
ProcessedPacket [p=Packet [aString=4, createdOn=2016-04-14T13:48:56.381Z], processedOn=2016-04-14T13:49:02.161Z]
ProcessedPacket [p=Packet [aString=3, createdOn=2016-04-14T13:48:56.566Z], processedOn=2016-04-14T13:49:00.557Z]
You could launch a DoTask thread for every WorkItem. This thread processes the work.
When the work is done, you try to post the item, synchronized on the controlling object, in which you check if it's the right ID and wait if not.
The post implementation can be something like:
synchronized(controllingObject) {
try {
while(workItem.id != nextId) controllingObject.wait();
} catch (Exception e) {}
//Post the workItem
nextId++;
controllingObject.notifyAll();
}
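The wait/notifyAll scheme above can be sketched as a small runnable class. `OrderedPoster` and its members are hypothetical names; here the controlling object is the poster itself, and "posting" just appends to a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: worker threads hand in results in any order,
// but results are appended strictly by ascending id.
class OrderedPoster<T> {
    private final List<T> posted = new ArrayList<>();
    private int nextId = 0; // the id that is allowed to post next

    synchronized void post(int id, T item) throws InterruptedException {
        while (id != nextId) {
            wait(); // releases the monitor until another thread posts
        }
        posted.add(item);
        nextId++;
        notifyAll(); // wake the waiting threads so the next id can proceed
    }

    synchronized List<T> snapshot() {
        return new ArrayList<>(posted);
    }

    // Demo: start the workers in reverse id order; the output is still 0..n-1.
    static List<Integer> demo(int n) {
        OrderedPoster<Integer> poster = new OrderedPoster<>();
        List<Thread> threads = new ArrayList<>();
        for (int id = n - 1; id >= 0; id--) {
            final int myId = id;
            Thread t = new Thread(() -> {
                try {
                    poster.post(myId, myId);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return poster.snapshot();
    }
}
```

Unlike the snippet above, this sketch lets InterruptedException propagate out of post() instead of swallowing it, so an interrupted worker cannot post out of turn.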
I think you need an extra queue, an IncomingOrderQueue, to hold the incoming order.
When you consume the objects, put them in some storage, for example a Map. Then another thread, which consumes from the IncomingOrderQueue, picks up the IDs (hashes) of the objects and collects them from this HashMap.
This solution can easily be implemented without execution service.
Preprocess: add an order value to each item; allocate an array if one is not already allocated.
Input: queue (sampled concurrently with order values 1, 2, 3, 4; it doesn't matter which thread gets which sample)
Output: array (written by index, with a sync point at the end to wait for all threads; no collision checks are needed, since every thread writes to different positions)
Postprocess: convert the array to a queue.
Needs an n-element array for n threads, or some multiple of n so that postprocessing is done only once.
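A minimal sketch of the array-based scheme, with assumptions stated: the class `IndexedResults` is hypothetical, the "work" is a multiplication by 10, and an `AtomicInteger` stands in for the input queue by handing out the next index (a synchronized counter would do the same job without java.util.concurrent).

```java
import java.util.concurrent.atomic.AtomicInteger;

class IndexedResults {
    // Processes items on `threads` workers; out[i] always holds the result of items[i].
    static int[] process(int[] items, int threads) {
        int[] out = new int[items.length];
        AtomicInteger next = new AtomicInteger(); // hands out each index exactly once
        Thread[] workers = new Thread[threads];
        for (int w = 0; w < threads; w++) {
            workers[w] = new Thread(() -> {
                int i;
                while ((i = next.getAndIncrement()) < items.length) {
                    out[i] = items[i] * 10; // stand-in work; each slot has one writer
                }
            });
            workers[w].start();
        }
        // Sync point: join() makes every worker's writes visible to this thread.
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return out;
    }
}
```

Because each index is claimed by exactly one thread, no locking is needed around the array writes; the order is restored simply by reading the array from front to back.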

Producer consumer in batches; second batch shouldn't come until the previous batch is complete

I'm trying to implement a mechanism where the runnables are both producer and consumer.
The situation is:
I need to read records from the DB in batches and process them. I'm trying to do this with the producer-consumer pattern: get a batch, process it; get a batch, process it. A new batch is fetched whenever a thread sees that the queue is empty; one of the threads goes and fetches it. The problem is that I can't mark the records that have been fetched for processing, and that's my limitation. So if we fetch the next batch before entirely committing the previous one, I might fetch the same records again. Therefore, I need to be able to submit the previous batch entirely before pulling the next one. I'm confused about what to do here. I've tried keeping a count of the fetched records and holding my get until that count is reached.
What's the best way of handling this situation? I am processing records from the DB in chunks, and the biggest limitation is that I can't mark the records that have been picked up. So I want batches to go through sequentially, but each batch should use multithreading internally.
public class DealStoreEnricher extends AsyncExecutionSupport {
private static final int BATCH_SIZE = 5000;
private static final Log log = LogFactory.getLog(DealStoreEnricher.class);
private final DealEnricher dealEnricher;
private int concurrency = 10;
private final BlockingQueue<QueryDealRecord> dealsToBeEnrichedQueue;
private final BlockingQueue<QueryDealRecord> dealsEnrichedQueue;
private DealStore dealStore;
private ExtractorProcess extractorProcess;
ExecutorService executor;
public DealStoreEnricher(DealEnricher dealEnricher, DealStore dealStore, ExtractorProcess extractorProcess) {
this.dealEnricher = dealEnricher;
this.dealStore = dealStore;
this.extractorProcess = extractorProcess;
dealsToBeEnrichedQueue = new LinkedBlockingQueue<QueryDealRecord>();
dealsEnrichedQueue = new LinkedBlockingQueue<QueryDealRecord>(BATCH_SIZE * 3);
}
public ExtractorProcess getExtractorProcess() {
return extractorProcess;
}
public DealEnricher getDealEnricher() {
return dealEnricher;
}
public int getConcurrency() {
return concurrency;
}
public void setConcurrency(int concurrency) {
this.concurrency = concurrency;
}
public DealStore getDealStore() {
return dealStore;
}
public DealStoreEnricher withConcurrency(int concurrency) {
setConcurrency(concurrency);
return this;
}
@Override
public void start() {
super.start();
executor = Executors.newFixedThreadPool(getConcurrency());
for (int i = 0; i < getConcurrency(); i++)
executor.submit(new Runnable() {
public void run() {
try {
QueryDealRecord record = null;
while ((record = get()) != null && !isCancelled()) {
try {
update(getDealEnricher().enrich(record));
processed.incrementAndGet();
} catch (Exception e) {
failures.incrementAndGet();
log.error("Failed to process deal: " + record.getTradeId(), e);
}
}
} catch (InterruptedException e) {
setCancelled();
}
}
});
executor.shutdown();
}
protected void update(QueryDealRecord enrichedRecord) {
dealsEnrichedQueue.add(enrichedRecord);
if (batchComplete()) {
List<QueryDealRecord> enrichedRecordsBatch = new ArrayList<QueryDealRecord>();
synchronized (this) {
dealsEnrichedQueue.drainTo(enrichedRecordsBatch);
}
if (!enrichedRecordsBatch.isEmpty())
updateTheDatabase(enrichedRecordsBatch);
}
}
private void updateTheDatabase(List<QueryDealRecord> enrichedRecordsBatch) {
getDealStore().insertEnrichedData(enrichedRecordsBatch, getExtractorProcess());
}
/**
* @return true if processed records have reached the batch size or there's
* nothing to be processed now.
*/
private boolean batchComplete() {
return dealsEnrichedQueue.size() >= BATCH_SIZE || dealsToBeEnrichedQueue.isEmpty();
}
/**
* Gets an item from the queue of things to be enriched
*
* @return {@linkplain QueryDealRecord} to be enriched
* @throws InterruptedException
*/
protected synchronized QueryDealRecord get() throws InterruptedException {
try {
if (!dealsToBeEnrichedQueue.isEmpty()) {
return dealsToBeEnrichedQueue.take();
} else {
List<QueryDealRecord> records = getNextBatchToBeProcessed();
if (!records.isEmpty()) {
dealsToBeEnrichedQueue.addAll(records);
return dealsToBeEnrichedQueue.take();
}
}
} catch (InterruptedException ie) {
throw new UnRecoverableException("Unable to retrieve QueryDealRecord", ie);
}
return null;
}
private List<QueryDealRecord> getNextBatchToBeProcessed() {
List<QueryDealRecord> recordsThatNeedEnriching = getDealStore().getTheRecordsThatNeedEnriching(getExtractorProcess());
return recordsThatNeedEnriching;
}
@Override
public void stop() {
super.stop();
if (executor != null)
executor.shutdownNow();
}
@Override
public boolean await() throws InterruptedException {
return executor.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS) && !isCancelled() && complete();
}
@Override
public boolean await(long timeout, TimeUnit unit) throws InterruptedException {
return executor.awaitTermination(timeout, unit) && !isCancelled() && complete();
}
private boolean complete() {
setCompleted();
return true;
}
}
You're already using a BlockingQueue - it does all that work for you.
However, you're using the wrong method addAll() to add new elements to the queue. That method will throw an exception if the queue is not able to accept elements. Rather you should use put() because that's the blocking method corresponding to take(), which you are using correctly.
Regarding your statement in the post title:
second batch shouldn't come until the previous batch is complete
You need not be concerned about the timing of the incoming versus outgoing batches if you use BlockingQueue correctly.
It looks like a Semaphore will work perfectly for you. Have the producing thread acquire the semaphore while the consuming thread releases the semaphore when it completes the batch.
BlockingQueue<Batch> blockingQueue = ...;
Semaphore semaphore = new Semaphore(1);
Producing thread:
semaphore.acquire(); // wait until the previous batch completes
Batch batch = db.getBatch(); // fetch only after the previous batch is committed
blockingQueue.add(batch);
Consuming thread:
for(;;){
Batch batch = blockingQueue.take();
doBatchUpdate(batch);
semaphore.release(); // let the next batch be fetched
}
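A runnable sketch of the semaphore hand-off, using made-up stand-ins: `BatchGate` is a hypothetical class, batches are plain integers, and the database read and write are replaced by log entries. The producer acquires the permit before it fetches, so a fetch can never overlap with an uncommitted batch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

class BatchGate {
    // Runs `batches` fetch/commit cycles and returns the order of events.
    static List<String> demo(int batches) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        Semaphore previousBatchDone = new Semaphore(1); // first fetch may start at once
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        Thread producer = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    previousBatchDone.acquire(); // wait for the previous commit
                    log.add("fetch " + b);       // stand-in for the DB read
                    queue.put(b);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int b = 0; b < batches; b++) {
                    int batch = queue.take();
                    log.add("commit " + batch);  // stand-in for process + DB write
                    previousBatchDone.release(); // allow the next fetch
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(log);
    }
}
```

In the real code the consumer side could fan a batch out to a thread pool and call release() only once every record of that batch has been written back.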

multiple execution for real-time message processing

I've implemented a thread pool executor on messages that are coming in real-time.
Here is some relevant example code:
class MessageProcessor implements SomeListener{
StateInfo stateInfo;
ExecutorService pool;
MessageProcessor(StateInfo stateInfo) {
pool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() + 1);
this.stateInfo = stateInfo;
}
@Override
void processMessage(final String messageComesInRealTime) {
Runnable runner = new Runnable() {
public void run() {
if(!stateInfo.in_state) {
if(stateInfo.state == 1) {
stateInfo.in_state = true;
//do something with message
stateInfo.state = 2;
}
else if(stateInfo.state == 2) {
stateInfo.in_state = true;
//do something with message
stateInfo.state = 3;
}
//etc...
}
}
};
pool.execute(runner);
//etc...
}
}
In the processMessage method, messages arrive in real time at a high rate and multiple messages are handled at the same time. But when stateInfo.in_state becomes true, I don't want other messages to be evaluated the same way. Is it better to remove threading altogether for this scenario? Or is there a way around this behavior while keeping threaded execution? Thanks for any response.
Based on your comments, it sounds like you need to synchronize access and assignment to your in_state variable.
You can do this simply like so:
private final Object lock = new Object();
//...
public void run(){
boolean inState = false;
synchronized(lock){
inState = inState();
if(inState){ setInState(false);}
}
}
boolean inState(){
return this.stateInfo.in_state;
}
void setInState(boolean value){
this.stateInfo.in_state=value;
}
Also be sure to declare the in_state variable in StateInfo to be volatile.
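As an alternative to the synchronized block (a different technique, not the one shown above), the check-and-set on the flag can be done in one atomic step with `AtomicBoolean.compareAndSet`. This is only a sketch: `StateGate`, `tryEnter()`, `release()` and `countWinners()` are hypothetical names, and it assumes the flag can live in an `AtomicBoolean` instead of the plain `in_state` field.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class StateGate {
    private final AtomicBoolean inState = new AtomicBoolean(false);

    // Atomically flips false -> true; only one caller wins until release().
    boolean tryEnter() {
        return inState.compareAndSet(false, true);
    }

    void release() {
        inState.set(false);
    }

    // Demo: many threads race for the gate; the return value is the number of winners.
    static int countWinners(int threads) {
        StateGate gate = new StateGate();
        AtomicInteger winners = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                if (gate.tryEnter()) {
                    winners.incrementAndGet(); // the message would be handled here
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return winners.get();
    }
}
```

Since nothing in the demo calls release(), exactly one thread gets through; in a message handler, release() would go in a finally block once the state transition is complete.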

Request Queue implementation

I am currently working on a POC for an RPC layer. I have written the following method to throttle requests on the client side. Is this a good pattern to follow? I did not choose to queue the additional requests into a thread pool, because I am interested only in synchronous invocations and want the caller thread to block until it is woken up to execute the RPC request, and also because a thread pool seems like additional overhead due to the creation of extra threads.
I thought I can manage with the threads which are already issuing the requests. This works well, but the CPU usage is a bit unfair to other processes because as soon as a call ends, another call goes out. I load tested it with a huge number of requests and memory and CPU usage are stable. Can I somehow use ArrayBlockingQueue with poll to achieve the same? Is poll() too much of a CPU hog?
Note: I recognise a few concurrency issues with requestEnd method where it might not wake up all registered items correctly and I am thinking of a performant way to maintain atomicity there.
public class RequestQueue {
// TODO The capacity should come from the consumer which in turn comes from
// config
private static final int _OUTBOUND_REQUEST_QUEUE_MAXSIZE = 40000;
private static final int _CURRENT_REQUEST_QUEUE_INCREMENT = 1;
private static final int _CURRENT_REQUEST_POOL_MAXSIZE = 40;
private AtomicInteger currentRequestsCount = new AtomicInteger(0);
private ConcurrentLinkedQueue<RequestWaitItem> outboundRequestQueue = null;
public RequestQueue() {
outboundRequestQueue = new ConcurrentLinkedQueue<RequestWaitItem>();
}
public void registerForFuture(RequestWaitItem waitObject) throws Exception {
if (outboundRequestQueue.size() < _OUTBOUND_REQUEST_QUEUE_MAXSIZE) {
outboundRequestQueue.add(waitObject);
} else {
throw new Exception("Queue is full" + outboundRequestQueue.size());
}
}
public void requestStart() {
currentRequestsCount.addAndGet(_CURRENT_REQUEST_QUEUE_INCREMENT);
}
//Verify correctness
public RequestWaitItem requestEnd() {
int currentRequests = currentRequestsCount.decrementAndGet();
if (this.outboundRequestQueue.size() > 0 && currentRequests < _CURRENT_REQUEST_POOL_MAXSIZE) {
try {
RequestWaitItem waitObject = (RequestWaitItem)this.outboundRequestQueue.remove();
waitObject.setRequestReady(true);
synchronized (waitObject) {
waitObject.notify();
}
return waitObject;
} catch (NoSuchElementException ex) {
//Queue is empty so this is not an exception condition
}
}
return null;
}
public boolean isFull() {
return currentRequestsCount.get() > _CURRENT_REQUEST_POOL_MAXSIZE;
}
}
public class RequestWaitItem {
private boolean requestReady;
private RpcDispatcher dispatcher;
public RequestWaitItem() {
this.requestReady = false;
}
public RequestWaitItem(RpcDispatcher dispatcher) {
this();
this.dispatcher = dispatcher;
}
public boolean isRequestReady() {
return requestReady;
}
public void setRequestReady(boolean requestReady) {
this.requestReady = requestReady;
}
public RpcDispatcher getDispatcher() {
return dispatcher;
}
}
if (requestQueue.isFull()) {
try {
RequestWaitItem waitObject = new RequestWaitItem(dispatcher);
requestQueue.registerForFuture(waitObject);
//Sync
// Config and centralize this timeout
synchronized (waitObject) {
waitObject.wait(_REQUEST_QUEUE_TIMEOUT);
}
if (waitObject.isRequestReady() == false) {
throw new Exception("Request Issuing timedout");
}
requestQueue.requestStart();
try {
return waitObject.getDispatcher().dispatchRpcRequest();
}finally {
requestQueue.requestEnd();
}
} catch (Exception ex) {
// TODO define exception type
throw ex;
}
} else {
requestQueue.requestStart();
try {
return dispatcher.dispatchRpcRequest();
}finally {
requestQueue.requestEnd();
}
}
If I understood correctly, you want to throttle requests to a remote service by allowing at most 40 (say) concurrent requests. You can do this easily, without extra threads or services, with a semaphore.
Semaphore s = new Semaphore(40);
...
s.acquire();
try {
dispatcher.dispatchRpcRequest(); // Or whatever your remote call looks like
} finally {
s.release();
}
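If the caller should give up after a timeout, as the question's wait(_REQUEST_QUEUE_TIMEOUT) does, `tryAcquire` with a timeout can replace `acquire`. A sketch under assumptions: `Throttle` is a hypothetical wrapper, the request is modeled as a `Callable`, and the semaphore is created in fair mode so waiting callers are served in arrival order.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class Throttle {
    private final Semaphore permits;

    Throttle(int maxConcurrent) {
        // fair = true: waiting threads get permits in FIFO order
        this.permits = new Semaphore(maxConcurrent, true);
    }

    // Runs the request on the caller's thread once a permit is free,
    // or fails with TimeoutException if none frees up in time.
    <T> T call(Callable<T> request, long timeout, TimeUnit unit) throws Exception {
        if (!permits.tryAcquire(timeout, unit)) {
            throw new TimeoutException("Request issuing timed out");
        }
        try {
            return request.call();
        } finally {
            permits.release(); // always hand the permit back
        }
    }

    int available() {
        return permits.availablePermits();
    }

    // Demo: a single call goes straight through and returns its result.
    static String demo() {
        Throttle throttle = new Throttle(2);
        try {
            return throttle.call(() -> "ok", 1, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Blocked callers are parked by the semaphore rather than spinning, so waiting costs no CPU, and no extra threads are created.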
Use ExecutorService service = Executors.newFixedThreadPool(10); for this.
This creates at most 10 threads, and further requests wait in the queue. I think this should help.
Fixed Thread Pool
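A rough sketch of the fixed-pool approach follows. FixedPoolDemo, dispatchAll and the "response-N" strings are invented for the example; the Callable simply sleeps where the real remote call would go. Requests beyond the pool size sit in the executor's internal queue until a worker thread is free.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FixedPoolDemo {
    // Submits `requests` fake remote calls to a pool of `threads` workers and
    // returns the results in submission order.
    static List<String> dispatchAll(int threads, int requests)
            throws InterruptedException, ExecutionException {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final int id = i;
                Callable<String> call = () -> {
                    Thread.sleep(10);             // stand-in for the remote call (assumption)
                    return "response-" + id;
                };
                futures.add(service.submit(call)); // queued if all workers are busy
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());         // waits for that request to finish
            }
            return results;
        } finally {
            service.shutdown();                    // let the worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dispatchAll(10, 25));
    }
}
```

Compared with the semaphore, this moves the work onto pool threads instead of throttling the calling threads, so the caller gets a Future back and decides when to wait for it.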

Creating a multithreaded Java class to process data

I would like to implement a class in Java that waits for new data from different threads; when it receives data, it processes it and then goes back to waiting for more. I want to implement this using only synchronized, wait and notifyAll. I tried several variants:
1) Using one thread that waits via lockObject.wait(). But when all active threads finish their work, this thread will wait forever. Of course, I can add a stopProcess() method, but that is not safe, because another programmer can forget to call it.
2) Using one daemon thread. This will not work: when all active threads finish their work, my daemon thread dies, even though it may still have data that it must process.
3) When new data arrives, create a new thread to process it. While the thread is alive (processing the data it was given), it also picks up newly arriving data. When no more data is arriving and all old data has been processed, the thread finishes. The drawback of this variant: when data arrives at intervals long enough for the thread to process the old data and die, a new thread is created each time. I think that is bad for performance and/or memory. Am I right?
Is it possible to solve my problem using only one or two threads (maybe a daemon thread and an active thread in combination), without a stopProcess() method?
Here is some code.
My implementation of a blocking queue
public class BlockingQueue<T> {
private Queue<T> queue = new LinkedList<T>();
public void add(T el){
synchronized (queue){
queue.add(el);
}
}
public T getFirst(){
synchronized (queue){
return queue.poll();
}
}
public int getSize(){
synchronized (queue){
return queue.size();
}
}
}
Data class
public class Data {
//some data
public void process(){
//process this data
}
}
First variant of code
public class ProcessData {
private BlockingQueue<Data> queue = new BlockingQueue<Data>();
private boolean run = false;
private Thread processThread;
private Object lock = new Object();
public synchronized void addData(Data data) throws Exception {
if (run){
if (data != null){
queue.add(data);
wakeUpToProcess();
}
}else{
throw new Exception("");
}
}
public synchronized void start() {
if (!run){
run = true;
processThread = new Thread(new Runnable() {
public void run() {
while (run || queue.getSize()!=0){
while(queue.getSize() == 0 && run){
//if stopProcess was not called
//and no active threads
//it will not die
waitForNewData();
}
Data cur;
while(queue.getSize() > 0){
cur = queue.getFirst();
cur.process();
}
}
}
});
processThread.start();
}
}
public synchronized void stopProcess() {
if (run){
run = false;
wakeUpToProcess();
}
}
private void waitForNewData(){
try{
synchronized (lock){
lock.wait();
}
}catch (InterruptedException ex){
ex.printStackTrace();
}
}
private void wakeUpToProcess(){
synchronized (lock){
lock.notifyAll();
}
}
}
In the second variant I make processThread a daemon thread. But when the active threads die, processThread stops working even though there is still data in the queue that I have to process.
Third variant
public class ProcessData {
private BlockingQueue<Data> queue = new BlockingQueue<Data>();
private boolean run = false;
private Thread processThread = null;
public synchronized void addData(Data data) throws Exception {
if (run){
if (data != null){
queue.add(data);
wakeExecutor();
}
}else{
throw new Exception("ProcessData is stopped!");
}
}
public synchronized void start() {
if (!run){
run = true;
}
}
public synchronized void stopProcess() {
if (run){
run = false;
}
}
public boolean isRunning(){
return this.run;
}
protected void wakeExecutor(){
if (processThread ==null || !processThread.isAlive()){
processThread = new Thread(new Runnable() {
@Override
public void run() {
Data cur;
while(queue.getSize() > 0){
cur = queue.getFirst();
cur.process();
}
}
});
processThread.start();
}
}
}
It is important that the data is processed in the order in which it arrives from the threads.
You are seriously reinventing the wheel here. All you want is available in the JDK in the java.util.concurrent package.
Implement a producer-consumer pattern via a BlockingQueue, with your producers calling offer() (or put(), which blocks while a bounded queue is full) and your consumer thread calling take(), which blocks until something's available.
That's it. You don't need, and you shouldn't be writing, all those classes you have written. These concurrent classes do all the locking and synchronization for you, and do it correctly too (which is not to be underestimated).
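As an illustration of that advice, here is a small sketch with several producers and a single consumer on a LinkedBlockingQueue. The names QueueDemo and run, the String payloads, and the shutdown strategy (interrupt the consumer once all producers have finished, then drain with poll()) are assumptions made for the example, not something the answer prescribes. A single consumer on a FIFO queue processes items in the order they were enqueued.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueDemo {
    // Starts `producers` threads that each offer `perProducer` items, plus one
    // consumer that take()s them. Returns the items in the order consumed.
    static List<String> run(int producers, int perProducer) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>();  // touched only by the consumer until join()

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    processed.add(queue.take());     // blocks while the queue is empty
                }
            } catch (InterruptedException stop) {
                // Shutdown requested: drain whatever is left without blocking.
                String rest;
                while ((rest = queue.poll()) != null) {
                    processed.add(rest);
                }
            }
        });
        consumer.start();

        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    queue.offer("p" + id + "-" + i); // queue has no practical capacity limit here
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();                                // all data is now in the queue
        }
        consumer.interrupt();                        // ask the consumer to finish up
        consumer.join();
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3, 4));
    }
}
```

The interrupt is sent only after every producer has been joined, so the drain loop in the catch block sees all remaining items and nothing is lost.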
If you're not allowed to use anything from java.util.concurrent then you'll have to implement your own task queue based on something like a LinkedList. I would encapsulate the blocking behaviour in the queue, e.g. (a sketch, assuming the queue class has a field LinkedList<Data> tasks)
synchronized Data nextTask() throws InterruptedException {
// Re-check in a loop: wait() can return without a matching notify
while (tasks.isEmpty()) {
wait();
}
return tasks.removeFirst();
}
synchronized void addTask(Data d) {
tasks.addLast(d);
notifyAll(); // wake any consumer blocked in nextTask()
}
Then you can just have a consumer thread that continuously does something like this
while (true) {
taskQueue.nextTask().process(); // nextTask() blocks until a task is available
}
and the producer threads call taskQueue.addTask to add each task to the queue. If you need a graceful shutdown at the end then you'll either need some "sentinel value" to tell the consumer thread to finish, or find some way of calling Thread.interrupt() at the right time.
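Putting those pieces together, a minimal runnable sketch using only synchronized, wait and notifyAll might look like this. The generic TaskQueue class, the SentinelDemo driver, the STOP sentinel and the use of String tasks (upper-casing stands in for process()) are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class TaskQueue<T> {
    private final LinkedList<T> tasks = new LinkedList<>();

    // Blocks until a task is available, then removes and returns the oldest one.
    synchronized T nextTask() throws InterruptedException {
        while (tasks.isEmpty()) {   // loop guards against spurious wakeups
            wait();
        }
        return tasks.removeFirst();
    }

    synchronized void addTask(T task) {
        tasks.addLast(task);
        notifyAll();                // wake a consumer blocked in nextTask()
    }
}

class SentinelDemo {
    // Hypothetical sentinel object; compared by identity, never by content.
    static final String STOP = new String("STOP");

    static List<String> run(List<String> input) throws InterruptedException {
        TaskQueue<String> queue = new TaskQueue<>();
        List<String> processed = new ArrayList<>();  // read by the caller only after join()
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String task = queue.nextTask();
                    if (task == STOP) {
                        break;                       // everything before the sentinel is done
                    }
                    processed.add(task.toUpperCase()); // stand-in for task.process()
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        for (String s : input) {
            queue.addTask(s);
        }
        queue.addTask(STOP);                         // enqueue last, after all real tasks
        consumer.join();
        return processed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("a", "b", "c"))); // prints [A, B, C]
    }
}
```

With several producer threads, the sentinel has to be added only after every producer has finished (for example after joining them), otherwise tasks enqueued behind it are never processed.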
