I have the code below to get data from Redis asynchronously. By default, the get() call in the Lettuce library uses the nio event-loop thread pool.
Code 1:
StatefulRedisConnection<String, String> connection = redisClient.connect();
RedisAsyncCommands<String, String> command = connection.async();
CompletionStage<Void> result = command.get(id)
    .thenAccept(code -> {
        // Sample code to print the thread name
        logger.log(Level.INFO, "Thread Id " + Thread.currentThread().getName());
    });
Thread Id printed is lettuce-nioEventLoop-6-2.
Code 2:
CompletionStage<Void> result = command.get(id)
    .thenAcceptAsync(code -> {
        logger.log(Level.INFO, "Thread Id " + Thread.currentThread().getName());
        // my original code
    }, executors);
Thread Id printed is pool-1-thread-1.
My questions:
Is there a way to pass in my own executors?
Is it a recommended approach to use the nio event-loop thread pool to get the data from Redis (using the get() call)?
Lettuce version: 5.2.2.RELEASE
The class io.lettuce.core.RedisClient has a factory method:
public static RedisClient create(ClientResources clientResources, String uri) {
    LettuceAssert.notEmpty(uri, "URI must not be empty");
    return create(clientResources, RedisURI.create(uri));
}
You can build your ClientResources with ClientResources#builder() and pass in whatever you need. According to the JavaDoc, you can customize:
EventLoopGroupProvider to obtain particular EventLoopGroups
EventExecutorGroup to perform internal computation tasks
Timer for scheduling
EventBus for client event dispatching
CommandLatencyCollector to collect latency details. Requires the HdrHistogram library.
DnsResolver to resolve host names.
Reconnect Delay.
Tracing to trace Redis commands.
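On the second question, the difference between Code 1 and Code 2 can be reproduced with the JDK alone. The sketch below does not use Lettuce; the thread names fake-event-loop and app-pool are made up for illustration, with the first standing in for the thread that delivers the Redis response. It only shows that thenAccept runs on the thread that completes the future, while thenAcceptAsync hands the callback to the executor you supply.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CallbackThreadSketch {

    /** Returns the name of the thread that ran the callback. */
    static String callbackThread(boolean handOff) {
        // Hypothetical stand-ins: one thread plays the I/O event loop, one plays the application pool.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(r -> new Thread(r, "fake-event-loop"));
        ExecutorService appPool = Executors.newSingleThreadExecutor(r -> new Thread(r, "app-pool"));
        try {
            CompletableFuture<String> reply = new CompletableFuture<>();
            CompletableFuture<String> seen = new CompletableFuture<>();
            // Register the callback BEFORE the future completes, as with an in-flight command.
            if (handOff) {
                reply.thenAcceptAsync(v -> seen.complete(Thread.currentThread().getName()), appPool);
            } else {
                reply.thenAccept(v -> seen.complete(Thread.currentThread().getName()));
            }
            // The "event loop" completes the future, like an I/O thread delivering a response.
            eventLoop.execute(() -> reply.complete("value"));
            return seen.join();
        } finally {
            eventLoop.shutdown();
            appPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callbackThread(false)); // callback ran on the completing thread
        System.out.println(callbackThread(true));  // callback ran on the supplied executor
    }
}
```

If the callback does blocking or slow work, handing it to your own executor as in Code 2 is usually the safer choice, since an event-loop thread that is busy in a callback cannot process other responses in the meantime.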
I wanted to enable manual commit for my consumer, and for that I have the code and configuration below. I am trying to commit the offset manually in case the signIn client throws an exception. Manually committing the offset works fine, but with this code the message that failed to process is not consumed again. What I want to do is call the seek method and consume the same failed offset again.
The actual problem is where to get the partition and offset details from. If I could somehow get the ConsumerRecord object along with the message, it would work.
Below is the consumer code, using StreamListener:
public void handleCommFeedConsumer(
        @Payload Account consumerRecords,
        @Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer,
        @Header(KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment acknowledgment) {
    consumerRecords.forEach(communityFeed -> {
        try {
            log.debug("Calling Client for Id : " + communityFeed.getId());
            // ... (rest of the processing is not shown in the post)
        } catch (RuntimeException ex) {
            // consumer.seek(new TopicPartition(communityTopic, communityFeed.partition()), communityFeed.offset());
        }
    });
}
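See https://docs.spring.io/spring-kafka/docs/current/reference/html/#consumer-record-metadata
You can add these parameters to the listener method to receive the record's partition and offset (in spring-kafka 3.x the first constant is named RECEIVED_PARTITION):
@Header(KafkaHeaders.RECEIVED_PARTITION_ID) int partition,
@Header(KafkaHeaders.OFFSET) long offset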
Seeking the consumer yourself might not do what you want because the container may already have other records after this one; it's best to throw an exception and the error handler will do the seeks for you.
The code below times out in the client (the Elasticsearch client) when the number of records is large.
CompletableFuture<BulkByScrollResponse> future = new CompletableFuture<>();
client.reindexAsync(request, RequestOptions.DEFAULT, new ActionListener<BulkByScrollResponse>() {
    public void onResponse(BulkByScrollResponse bulkByScrollResponse) {
        future.complete(bulkByScrollResponse);
    }
    public void onFailure(Exception e) {
        future.completeExceptionally(e);
    }
});
BulkByScrollResponse response = future.get(10, TimeUnit.MINUTES); // the client timeout occurred before this timeout
Below is the client config.
connectTimeout: 60000
socketTimeout: 600000
maxRetryTimeoutMillis: 600000
Is there a way to wait indefinitely until the re-indexing completes?
Submit the reindex request as a task:
TaskSubmissionResponse task = esClient.submitReindexTask(reindex, RequestOptions.DEFAULT);
Acquire the task id:
TaskId taskId = new TaskId(task.getTask());
Then check the task status periodically:
GetTaskRequest taskQuery = new GetTaskRequest(taskId.getNodeId(), taskId.getId());
GetTaskResponse taskStatus;
do {
    Thread.sleep(1000); // pause between polls; the one-second interval is an arbitrary choice
    taskStatus = esClient.tasks()
            .get(taskQuery, RequestOptions.DEFAULT)
            .orElseThrow(() -> new IllegalStateException("Reindex task not found. id=" + taskId));
} while (!taskStatus.isCompleted());
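The polling loop above runs until the task completes. A minimal sketch of a bounded variant, using only the JDK: the helper name await and the BooleanSupplier standing in for the "is the task completed?" check are assumptions for illustration, not part of the Elasticsearch client. It sleeps between checks and gives up after an overall deadline instead of waiting forever.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class PollUntilDone {

    /**
     * Polls isDone at a fixed interval until it returns true or the deadline passes.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean await(BooleanSupplier isDone, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isDone.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the task status check: reports "done" from the third check on.
        int[] checks = {0};
        boolean done = await(() -> ++checks[0] >= 3, Duration.ofMillis(10), Duration.ofSeconds(1));
        System.out.println(done + " after " + checks[0] + " checks");
    }
}
```

With the real client, the supplier would wrap the tasks().get(...) call shown above and return isCompleted(); the timeout then becomes an explicit business decision rather than a socket setting.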
The Elasticsearch Java API documentation on task handling just sucks.
I don't think it's a good idea to wait indefinitely for the re-indexing process to complete, or to set a very high timeout value; that is not a proper fix and will cause more harm than good.
Instead, you should examine the response and add more debug logging to find the root cause and address it. Also, please have a look at my tips to improve re-indexing speed, which should fix some of your underlying issues.
We are trying to implement Kafka as our message broker solution. We are deploying our Spring Boot microservices in IBM Bluemix, whose internal message broker implementation is Kafka version 0.10. Since my experience is more on the JMS and ActiveMQ side, I was wondering what the ideal way is to handle system-level errors in the Java consumers.
Here is how we have implemented it currently
Consumer properties
We are using the default properties for
Kafka Consumer
We are spinning up 3 threads per topic, all with the same groupId, i.e. one KafkaConsumer instance per thread. We have only one partition as of now. The consumer code in the constructor of the thread class looks like this:
kafkaConsumer = new KafkaConsumer<String, String>(properties);
final List<String> topicList = new ArrayList<String>();
// topicList is filled with the topic name(s) before subscribing
kafkaConsumer.subscribe(topicList, new ConsumerRebalanceListener() {
    @Override
    public void onPartitionsRevoked(final Collection<TopicPartition> partitions) {
    }

    @Override
    public void onPartitionsAssigned(final Collection<TopicPartition> partitions) {
        try {
            logger.info("Partitions assigned, consumer seeking to end.");
            for (final TopicPartition partition : partitions) {
                final long position = kafkaConsumer.position(partition);
                logger.info("current Position: " + position);
                logger.info("Seeking to end...");
                kafkaConsumer.seekToEnd(Collections.singletonList(partition));
                logger.info("Seek from the current position: " + kafkaConsumer.position(partition));
                kafkaConsumer.seek(partition, position);
            }
            logger.info("Consumer can now begin consuming messages.");
        } catch (final Exception e) {
            logger.error("Consumer failed while seeking after partition assignment.", e);
        }
    }
});
The actual reading happens in the run method of the thread:
try {
    // Poll the Kafka consumer, waiting up to one second for records.
    final ConsumerRecords<String, String> records = kafkaConsumer.poll(1000);
    // Iterate through all the messages received and process their content.
    for (final TopicPartition partition : records.partitions()) {
        final List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
        logger.info("consumer is alive and is processing " + partitionRecords.size() + " records");
        for (final ConsumerRecord<String, String> record : partitionRecords) {
            logger.info("processing topic " + record.topic() + " for key " + record.key() + " on offset " + record.offset());
            final Class<? extends Event> resourceClass = eventProcessors.getResourceClass();
            final Object obj = converter.convertToObject(record.value(), resourceClass);
            if (obj != null) {
                logger.info("Event: " + obj + " acquired by " + Thread.currentThread().getName());
                final CommsEvent event = resourceClass.cast(converter.convertToObject(record.value(), resourceClass));
                final MessageResults results = eventProcessors.processEvent(event);
                if ("Success".equals(results.getStatus())) {
                    // commit the processed message which changes
                    // the offset
                    logger.info("Message processed successfully");
                } else {
                    kafkaConsumer.seek(new TopicPartition(record.topic(), record.partition()), record.offset());
                    logger.error("Error processing message : {} with error : {}, resetting offset to {} ", obj, results.getError().getMessage(), record.offset());
                    // TODO add return
                }
            }
        }
    }
} catch (final Exception e) {
    logger.error("Consumer has failed with exception: " + e, e);
}
You will notice the EventProcessor, a service class that processes each record and in most cases commits the record to the database. If the processor throws an error (a system exception or a ValidationException), we do not commit but programmatically seek to that offset, so that the subsequent poll returns records from that offset for that group id.
My question is whether this is the right approach. If we get an error and reset the offset, then no other message is processed until the error is fixed. This might work for system errors, such as not being able to connect to the DB, but if the problem is only with that event and not others, then until we can process this one record we won't be able to process any other record. We thought of an ErrorTopic: when we get an error, the consumer publishes that event to the ErrorTopic and keeps on processing subsequent events. But it looks like we are trying to bring JMS design concepts (from my previous experience) into Kafka, and there may be a better way to handle errors in Kafka. Also, reprocessing from the error topic may change the sequence of messages, which we don't want in some scenarios.
Please let me know how anyone has handled this scenario in their projects following the Kafka standards.
"if the problem is only with that event and not others to process this one record we won't be able to process any other record"
That's correct, and your suggestion to use an error topic seems like a viable one.
I also noticed that, with your handling of onPartitionsAssigned, you essentially do not use the consumer's committed offset, since you seem to always seek to the end.
If you want to restart from the last successfully committed offset, you should not perform a seek.
Finally, I'd like to point out, though it looks like you already know this, that having 3 consumers in the same group subscribed to a single partition means that 2 out of 3 will be idle.
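To make the error-topic idea concrete, here is a minimal in-memory sketch. It does not use the Kafka client: plain lists stand in for the partition and the error topic, and the names consume, handler and maxAttempts are made up for illustration. A record is retried a bounded number of times, then parked so that later records are not blocked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ErrorTopicSketch {

    /**
     * Processes records in order. A record whose handler keeps failing is retried up to
     * maxAttempts times, then moved to the returned "error topic" list so later records proceed.
     */
    static List<String> consume(List<String> records, Predicate<String> handler,
                                int maxAttempts, List<String> processed) {
        List<String> errorTopic = new ArrayList<>(); // stand-in for a real Kafka error topic
        for (String record : records) {
            boolean ok = false;
            for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
                try {
                    ok = handler.test(record); // true means processed successfully
                } catch (RuntimeException e) {
                    ok = false; // treat an exception like a failed attempt
                }
            }
            if (ok) {
                processed.add(record);
            } else {
                errorTopic.add(record); // park the failing record instead of blocking the partition
            }
        }
        return errorTopic;
    }

    public static void main(String[] args) {
        List<String> processed = new ArrayList<>();
        // Hypothetical handler: the record "bad" always fails, everything else succeeds.
        List<String> errors = consume(Arrays.asList("a", "bad", "b"), r -> !r.equals("bad"), 3, processed);
        System.out.println("processed=" + processed + " errors=" + errors);
    }
}
```

As the question already notes, anything replayed later from the error topic arrives out of order relative to the records that went through, so this only fits event types where strict ordering is not required.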
EDIT: I am seeing the exact same behavior with the Kafka 0.9 Consumer API as well.
I have a simple Kafka 0.8.2.2 producer with the property that enables topic creation set to true. It creates a new topic when an event is sent to a non-existent topic, but the event that triggers the topic creation does not end up in Kafka, and the RecordMetadata returned has no errors.
public void receiveEvent(@RequestBody EventWrapper events) throws InterruptedException, ExecutionException, TimeoutException {
    log.info("Sending " + events.getEvents().size() + " Events ");
    for (Event event : events.getEvents()) {
        log.info("Sending Event - " + event);
        ProducerRecord<String, String> record = new ProducerRecord<>(event.getTopic(), event.getData());
        Future<RecordMetadata> ack = eventProducer.send(record);
        // Block until the broker acknowledges the record
        log.info("ACK - " + ack.get());
    }
}
I have a program that polls for new topics (I wasn't happy with the dynamic/regex topic code in Kafka 0.8). It finds the new topic and subscribes, and it does see subsequent events, but never that first event.
I also tried the kafka-console-consumer script and it shows exactly the same thing: the first event is never seen, and after that events start flowing.
It turns out there is a consumer property you can set: props.put("auto.offset.reset", "earliest");
After setting this, the consumer does receive the first event put on the topic. By default, a consumer with no committed offset starts at the end of the log, so it skips anything written before it subscribed.
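For context, here is a sketch of where that setting sits among the usual consumer properties. It only builds a java.util.Properties object and assumes the property names of the newer consumer API (0.9 and later); the old 0.8 consumer uses the values smallest and largest instead. The broker address and group id are placeholders.

```java
import java.util.Properties;

final class ConsumerConfigSketch {

    /** Builds consumer properties that start from the earliest offset when none is committed. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Applies only when the group has no committed offset for a partition.
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" and "event-readers" are placeholder values.
        Properties props = consumerProps("localhost:9092", "event-readers");
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```

Note that the setting has no effect once the group has committed an offset, so it will not replay old events for an existing group.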
I switched from making sequential HTTP calls to 4 REST services to making 4 simultaneous calls using a CommonJ work manager task executor. I'm using WebLogic 12c. The new code works in my development environment, but in our test environment under load, and occasionally when not under load, the results map is not populated with all of the results. The logging suggests that each work item did receive its results, though. Could this be a problem with the ConcurrentHashMap?
In this example from IBM, they use their own version of Work with a getData() method, although it doesn't look like that method really exists in their class definition. I had followed a different example that just used the Work class but didn't demonstrate how to get the data out of those threads into the main thread. Should I be using execute() instead of schedule()? The API doesn't appear to be well documented. The stuckThreadTimeout is sufficiently high. component.processInbound() actually contains the code for the HTTP call, but I don't think the problem is there, because I can switch back to the synchronous version of the class below and not have any issues.
My code:
public class WorkManagerAsyncLinkedComponentRouter implements
        MessageDispatcher<Object, Object> {

    private List<Component<Object, Object>> components;
    protected ConcurrentHashMap<String, Object> workItemsResultsMap;
    protected ConcurrentHashMap<String, Exception> componentExceptionsInThreads;
    // components is populated at this point with one component for each REST call to be made.

    public Object route(final Object message) throws RouterException {
        try {
            workItemsResultsMap = new ConcurrentHashMap<String, Object>();
            componentExceptionsInThreads = new ConcurrentHashMap<String, Exception>();
            final String parentThreadID = Thread.currentThread().getName();
            List<WorkItem> producerWorkItems = new ArrayList<WorkItem>();
            for (final Component<Object, Object> component : this.components) {
                producerWorkItems.add(workManagerTaskExecutor.schedule(new Work() {
                    public void run() {
                        // ExecuteThread th = (ExecuteThread) Thread.currentThread();
                        LOG.info("Child thread " + Thread.currentThread().getName() + " Parent thread: " + parentThreadID + " Executing work item for: " + component.getName());
                        try {
                            Object returnObj = component.processInbound(message);
                            if (returnObj == null)
                                LOG.info("Object returned to work item is null, not adding to producer components results map, for this producer: "
                                        + component.getName());
                            else {
                                LOG.info("Added producer component thread result for: "
                                        + component.getName());
                                workItemsResultsMap.put(component.getName(), returnObj);
                            }
                            LOG.info("Finished executing work item for: " + component.getName());
                        } catch (Exception e) {
                            componentExceptionsInThreads.put(component.getName(), e);
                        }
                    }
                }));
            } // end loop over producer components
            // Block until all items are done
            workManagerTaskExecutor.waitForAll(producerWorkItems, stuckThreadTimeout);
            LOG.info("Finished waiting for all producer component threads.");
            if (componentExceptionsInThreads != null
                    && componentExceptionsInThreads.size() > 0) {
                // (handling of the collected exceptions is not shown in the post)
            }
            List<Object> resultsList = new ArrayList<Object>(workItemsResultsMap.values());
            if (resultsList.size() == 0)
                throw new RouterException(
                        "The producer thread results are all empty. The threads were likely not created. In testing this was observed when either 1) the system was almost out of memory (perhaps there is not enough memory to create a new thread for each producer, for this REST request), or 2) timeouts were reached for all producers.");
            // ** The problem is identified here: the ConcurrentHashMap does not hold the expected number of results.
            if (workItemsResultsMap.size() != this.components.size()) {
                StringBuilder sb = new StringBuilder();
                for (String str : workItemsResultsMap.keySet()) {
                    sb.append(str + " ");
                }
                throw new RouterException(
                        "Did not receive results from all threads within the thread timeout period. Only retrieved:"
                                + sb.toString());
            }
            LOG.info("Returning " + String.valueOf(resultsList.size()) + " results.");
            LOG.debug("List of returned feeds: " + String.valueOf(resultsList));
            return resultsList;
        } // (the catch block is not shown in the post)
    }
}
I ended up cloning the DOM document used as a parameter. There must be some downstream code that has side effects on the parameter.
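A minimal sketch of that fix, using only the JDK's built-in DOM parser. The class and method names are made up for illustration. Document.cloneNode(true) is implementation dependent according to the DOM specification, so this assumes the JDK's default parser, which does return a deep copy. Each worker then mutates its own tree rather than the shared one.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DomCloneSketch {

    /** Parses XML into a DOM document; wraps checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns an independent deep copy, so each worker can mutate its own tree. */
    static Document deepCopy(Document original) {
        return (Document) original.cloneNode(true);
    }

    public static void main(String[] args) {
        Document shared = parse("<request><id>1</id></request>");
        Document copy = deepCopy(shared);
        // Simulate a downstream component with a side effect on its input.
        copy.getDocumentElement().setAttribute("touched", "true");
        System.out.println(shared.getDocumentElement().hasAttribute("touched"));
        System.out.println(copy.getDocumentElement().hasAttribute("touched"));
    }
}
```

In the router above, that would mean creating one copy per component in the parent thread, before scheduling the work items, and passing the copy to processInbound instead of the shared message.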