The Spring-managed KafkaTemplate provides
template.send(record).addCallback(...
template.executeInTransaction(...
Now let's say I have a method doWork() which is triggered on an event (say a TCP/IP message).
@Autowired
KafkaTemplate template;

// This method is triggered on an event
void doWork(EventType event) {
    switch (event) {
        case Type1:
            template.send(record);
            break;
        case Type2:
            // Question: how do I achieve a commit of all my previous sends here?
            break;
        default:
            break;
    }
}
Basically, I need to achieve a transaction by adding @Transactional over doWork() or by using a
template.executeInTransaction(...
in code. But I want to batch a couple of template.send() calls and commit only after several calls to doWork(); how do I achieve that?
My producer configuration has transactions enabled and a KafkaTransactionManager wired into the producer factory.
kafkaTemplate.executeInTransaction(t -> {
    boolean stayInTransaction = true;
    while (stayInTransaction) {
        Event event = readTcp();
        doWork(event);
        stayInTransaction = !transactionDone(event); // keep going until the terminating event arrives
    }
    return null;
});
As long as the doWork() method uses the same template, and it runs within the scope of the callback, the work will run in the transaction.
Or
@Transactional
public void doIt() {
    boolean stayInTransaction = true;
    while (stayInTransaction) {
        Event event = readTcp();
        doWork(event);
        stayInTransaction = !transactionDone(event);
    }
}
When using declarative transactions.
If the TCP events arrive asynchronously, you will somehow need to hand them off to the thread running the transaction, for example via a BlockingQueue<?>.
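A minimal sketch of such a hand-off, assuming a LinkedBlockingQueue and reusing the names from the snippets above (onTcpEvent() and runTransactionLoop() are illustrative, not part of the question):

private final BlockingQueue<EventType> events = new LinkedBlockingQueue<>();

// Called on the TCP thread(s): only enqueue the event, do no Kafka work here.
public void onTcpEvent(EventType event) {
    events.add(event);
}

// Runs on the single thread that owns the Kafka transaction.
public void runTransactionLoop() {
    kafkaTemplate.executeInTransaction(t -> {
        boolean stayInTransaction = true;
        while (stayInTransaction) {
            try {
                EventType event = events.take(); // blocks until the next event is handed off
                doWork(event);
                stayInTransaction = !transactionDone(event);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                stayInTransaction = false;
            }
        }
        return null;
    });
}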
Related
I'm working on an application which uses Kafka to consume messages from multiple topics, persisting data as it goes.
To that end I use a @Service class with a couple of methods annotated with @KafkaListener. Consider this:
@Transactional
@KafkaListener(topics = MyFirstMessage.TOPIC, autoStartup = "false", containerFactory = "myFirstKafkaListenerContainerFactory")
public void handleMyFirstMessage(ConsumerRecord<String, MyFirstMessage> record, Acknowledgment acknowledgment) throws Exception {
MyFirstMessage message = consume(record, acknowledgment);
try {
doHandle(record.key(), message);
} catch (Exception e) {
TransactionInterceptor.currentTransactionStatus().setRollbackOnly();
} finally {
acknowledgment.acknowledge();
}
}
@Transactional
@KafkaListener(topics = MySecondMessage.TOPIC, autoStartup = "false", containerFactory = "mySecondKafkaListenerContainerFactory")
public void handleMySecondMessage(ConsumerRecord<String, MySecondMessage> record, Acknowledgment acknowledgment) throws Exception {
MySecondMessage message = consume(record, acknowledgment);
try {
doHandle(record.key(), message);
} catch (Exception e) {
TransactionInterceptor.currentTransactionStatus().setRollbackOnly();
} finally {
acknowledgment.acknowledge();
}
}
Please disregard the stuff about setRollbackOnly, it's not relevant to this question.
What IS relevant is that the doHandle() methods in each listener perform inserts in a table, which occasionally fail because autogenerated keys turn out to be non-unique once the final commit is done.
What happens is that each doHandle() method will increment the key column in its own little transaction, and only one of them will "win" that race. The other will fail during commit, with a non-unique constraint violation.
What is best practice to handle this? How do I "synchronize" transactions to execute like pearls on a string instead of all at once?
I'm thinking of using some kind of semaphore or lock to serialize things, but that smells like a solution with many pitfalls. If there was a general pattern or framework to help with this problem I would be much more comfortable implementing it.
See the documentation.
Using @Transactional for the DB and a KafkaTransactionManager in the listener container is similar to using a ChainedKafkaTransactionManager (configured with both TMs) in the container. The DB tx is committed, followed by Kafka, when the listener exits normally.
When the listener throws an exception, both transactions are rolled back in the same order.
The setRollbackOnly is definitely relevant to this question, since you are not rolling back the Kafka transaction when you do that.
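For reference, a minimal sketch of what the ChainedKafkaTransactionManager wiring could look like (bean names are assumptions; a DataSourceTransactionManager is assumed for the DB, but it could equally be a JpaTransactionManager):

@Bean
public ChainedKafkaTransactionManager<Object, Object> chainedTm(
        KafkaTransactionManager<Object, Object> kafkaTm,
        DataSourceTransactionManager dbTm) {
    return new ChainedKafkaTransactionManager<>(kafkaTm, dbTm);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<Object, Object> myFirstKafkaListenerContainerFactory(
        ConsumerFactory<Object, Object> consumerFactory,
        ChainedKafkaTransactionManager<Object, Object> chainedTm) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // run the listener inside a transaction managed by the chained TM
    factory.getContainerProperties().setTransactionManager(chainedTm);
    return factory;
}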
I'm using Hazelcast IMDG for my app. I have used queues for internal communication. I added an item listener to the queue and it works great: whenever the queue gets a message, the listener wakes up and the needed processing is done.
The problem is that it's single-threaded. Sometimes a message takes 30 seconds to process, and messages in the queue just have to wait until the previous message is done processing. I'm told to use a Java executor service to have a pool of threads and add an item listener to every thread so that multiple messages can be processed at the same time.
Is there any better way to do it? Maybe configure some kind of MDB, or make the processing asynchronous so that my listener can process the messages faster?
@PostConstruct
public void init() {
logger.info(LogFormatter.format(BG_GUID, "Starting up GridMapper Queue reader"));
HazelcastInstance hazelcastInstance = dc.getInstance();
queue = hazelcastInstance.getQueue(FactoryConstants.QUEUE_GRIDMAPPER);
queue.addItemListener(new Listener(), true);
}
class Listener implements ItemListener<QueueMessage> {
@Override
public void itemAdded(ItemEvent<QueueMessage> item) {
try {
QueueMessage message = queue.take();
processor.process(message.getJobId());
} catch (Exception ex) {
logger.error(LogFormatter.format(BG_GUID, ex));
}
}
@Override
public void itemRemoved(ItemEvent<QueueMessage> item) {
logger.info("Item removed: " + item.getItem().getJobId());
}
}
Hazelcast IQueue does not support an asynchronous interface, and asynchronous access would not be faster anyway. An MDB requires JMS, which is pure overhead.
What you really need is a multithreaded executor. You can use the default executor:
private final ExecutorService execService = ForkJoinPool.commonPool();
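For example, the listener from the question could hand the slow work off to that pool so itemAdded() returns immediately (a sketch reusing the queue, processor and logger fields from the question; the choice of pool is an assumption):

class Listener implements ItemListener<QueueMessage> {

    @Override
    public void itemAdded(ItemEvent<QueueMessage> item) {
        // submit the slow processing to the pool instead of doing it on the event thread
        execService.submit(() -> {
            try {
                QueueMessage message = queue.take();
                processor.process(message.getJobId());
            } catch (Exception ex) {
                logger.error(LogFormatter.format(BG_GUID, ex));
            }
        });
    }

    @Override
    public void itemRemoved(ItemEvent<QueueMessage> item) {
        logger.info("Item removed: " + item.getItem().getJobId());
    }
}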
I have a function
@Scheduled(fixedDelayString = "2000")
public void processPendingDownloadRequests() {
List<DownloadRequest> downloadRequests = downloadRequestService.getPendingDownloadRequests();
for (int i = 0; i < downloadRequests.size(); i++) {
DownloadRequest downloadRequest = downloadRequests.get(i);
processDownloadRequest(downloadRequest);
}
}
This will retrieve all download requests from the DB that are in the Pending state, which is just an enum in the downloadrequest table.
@Async
public void processDownloadRequest(DownloadRequest downloadRequest) {
accountProfileProcessor.run(downloadRequest);
}
Inside the accountProfileProcessor is where the state of the downloadRequest changes to InProgress.
The race condition comes when the @Scheduled function runs and picks up downloadRequests that have already been submitted as @Async jobs but whose status hasn't been switched to InProgress yet. How can I avoid this?
I tried to only run the code inside the @Scheduled function when the @Async task executor queue was empty, but could not get it to work.
The following will prevent two concurrent attempts to download the same resource.
Note that if there is a need to make sure that subsequent attempts to execute the same download are not repeated, some form of tracking that completion for a longer time is needed, together with a way to prevent a memory leak (i.e. don't keep all completed ids in memory indefinitely); a sketch of that follows after the code below.
private final Set<String> activeDownloads = new HashSet<>();

@Async
public void processDownloadRequest(DownloadRequest downloadRequest) {
    synchronized (this.activeDownloads) {
        if (this.activeDownloads.contains(downloadRequest.getId())) {
            // Duplicate download attempt - log it?
            return;
        } else {
            this.activeDownloads.add(downloadRequest.getId());
        }
    }
    try {
        accountProfileProcessor.run(downloadRequest);
    } finally {
        synchronized (this.activeDownloads) {
            this.activeDownloads.remove(downloadRequest.getId());
        }
    }
}
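And a hypothetical sketch of the longer-term tracking mentioned above: remember completed ids with a timestamp and evict old entries so the structure cannot grow forever (the retention period and eviction strategy are assumptions, not requirements):

private final ConcurrentMap<String, Long> completedDownloads = new ConcurrentHashMap<>();
private static final long RETENTION_MS = 60 * 60 * 1000L; // keep completed ids for one hour

private boolean alreadyCompleted(String id) {
    evictExpired();
    return completedDownloads.containsKey(id);
}

private void markCompleted(String id) {
    completedDownloads.put(id, System.currentTimeMillis());
}

private void evictExpired() {
    long cutoff = System.currentTimeMillis() - RETENTION_MS;
    completedDownloads.values().removeIf(timestamp -> timestamp < cutoff);
}

In this sketch, alreadyCompleted() would be checked alongside the activeDownloads check, and markCompleted() would be called once the download succeeds.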
I've been developing REST APIs in Java and I want to convert them to asynchronous endpoints. The two options I see are DeferredResult and CompletableFuture.
I can't seem to find the difference between these two, or when to choose one over the other.
Any real-world examples would be appreciated.
DeferredResult is a Spring class and is just a container for the result (as its name implies), so we need to explicitly use some kind of thread pool (a ForkJoinPool, for example) to run our processing asynchronously. CompletableFuture is part of java.util.concurrent and allows running the processing asynchronously itself. It implements Future and basically has the ability to compose, combine and execute asynchronous computation steps.
A simple example of both options:
@GetMapping(value = "/deferredResult")
public DeferredResult<Boolean> useDeferredResult() {
DeferredResult<Boolean> deferredResult = new DeferredResult<>();
deferredResult.onCompletion(() -> logResult((Boolean)deferredResult.getResult()));
ForkJoinPool.commonPool().submit(() -> {
deferredResult.setResult(processRequest());
});
return deferredResult;
}
@GetMapping(value = "/completableFuture")
public CompletableFuture<Boolean> useCompletableFuture() {
return CompletableFuture.supplyAsync(this::processRequest)
.thenApplyAsync(this::logResult);
}
private boolean logResult(Boolean result) {
System.out.println("Result: " + result);
return true;
}
private boolean processRequest() {
try {
Thread.sleep(500);
} catch (InterruptedException e) {
e.printStackTrace();
}
return true;
}
Notes:
By default, Spring will execute the CompletableFuture actions on the ForkJoinPool (this can be configured).
In the case of DeferredResult, logResult will be executed by a servlet container (for example Tomcat) worker thread - not necessarily the one that received the request in the first place.
You can (although I see no reason to) run the processing asynchronously with CompletableFuture and still return a DeferredResult.
With DeferredResult you can register more callbacks besides onCompletion, for example onError, onTimeout, etc. (a small sketch follows below). See here.
CompletableFuture has a lot of options to compose actions. See here.
IMHO, CompletableFuture is more elegant and has more capabilities.
Also, here you have a working example project.
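As noted above, DeferredResult lets you register additional callbacks; a small sketch (the endpoint name, timeout value and error handling are illustrative, and processRequest() is the helper from the examples above):

@GetMapping(value = "/deferredResultWithCallbacks")
public DeferredResult<Boolean> useDeferredResultWithCallbacks() {
    DeferredResult<Boolean> deferredResult = new DeferredResult<>(2000L); // 2 second timeout
    deferredResult.onTimeout(() -> deferredResult.setErrorResult("Request timed out"));
    deferredResult.onError(throwable -> deferredResult.setErrorResult(throwable));
    deferredResult.onCompletion(() -> System.out.println("Processing complete"));
    ForkJoinPool.commonPool().submit(() -> deferredResult.setResult(processRequest()));
    return deferredResult;
}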
I am writing an integration test in JUnit for a Message Driven Pojo (MDP):
@JmsListener(destination = "jms/Queue", containerFactory = "cf")
public void processMessage(TextMessage message) throws JMSException {
repo.save(new Entity("ID"));
}
where repo is a spring-data repository
my unit test:
@Test
public void test() {
//sendMsg
sendJMSMessage();
//verify DB state
Entity e = repo.findOne("ID");
assertThat(e, is(notNullValue()) );
}
Now, the thing is that the processMessage() method is executed on a different thread than the test() method, so I figured I need to somehow wait for processMessage() to complete before verifying the state of the DB. The best solution I could find is based on a CountDownLatch. So now the methods look like this:
@JmsListener(destination = "jms/Queue", containerFactory = "cf")
public void processMessage(TextMessage message) throws JMSException {
repo.save(new Entity("ID"));
latch.countDown();
}
and the test
@Test
public void test() {
//set the countdownlatch
CountDownLatch latch = new CountDownLatch(1);
JMSProcessor.setLatch(latch);
//sendMsg
sendJMSMessage();
try {
latch.await();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
//verify DB state
Entity e = repo.findOne("ID");
assertThat(e, is(notNullValue()) );
}
So I was very proud of myself, and then I ran the test and it failed: repo.findOne("ID") returned null. My first reaction was to set a breakpoint at that line and proceed with debugging. During the debugging session, repo.findOne("ID") actually returned the entity inserted by the @JmsListener method.
After scratching my head for a while, here's my current theory: since the spring-data repository is accessed from two different threads, it gets two different EntityManager instances, and therefore the two threads are in different transactions. Even though there is some synchronization via the CountDownLatch, the transaction bound to the thread executing the @JmsListener annotated method has not committed yet when the JUnit @Test annotated method starts a new transaction and tries to retrieve the entity.
So my questions are:
Is there a way for one thread to wait for the commit of the other?
Can two threads share one transaction in such a synchronized context (i.e., the two threads would never access the EntityManager simultaneously)?
Is my testing approach nonsense, and is there a better way of doing this?