java.lang.IllegalStateException: This consumer has already been closed - java

Set up kafka consumer with this configuration
kafkaconfig:
acks: 1
autoCommit: true
bootstrapServers: example.com:9092
topic: item
groupId: EWok-group
keyDeserializer: org.apache.kafka.common.serialization.StringDeserializer
valueDeserializer: org.apache.kafka.common.serialization.StringDeserializer
maxPollRecords: 1
pollMillisTime: 15
retries: 5
heartBeatInterval: 300
sessionTimeout: 100000
maxPollInterval: 30000
code
while (true) {
try {
ConsumerRecords<String, String> consumerRecords = eWokIntegrationConsumer.poll(Duration.of(kafkaCommConfig.getPollMillisTime(), ChronoUnit.SECONDS));
if (!consumerRecords.isEmpty()) {
LOG.info("Consumed Record Count: {}", consumerRecords.count());
consumerRecords.forEach(record -> {
System.out.printf("offset = %d, key = %s, value = %s\n", record.offset(), record.key(), record.value());
eWokMessageProcessor.onMessage(record.value());
eWokIntegrationConsumer.commitSync();
});
} else {
LOG.info("Polling returned without any records.");
}
} catch (Exception exception) {
LOG.error("Consumer was interrupted. But still continue to poll. Exception:", exception);
eWokIntegrationConsumer.close();
}
}
10000 ms is taking for processing the data which we have received from kafka consumer.but getting
exception saying
java.lang.IllegalStateException: This consumer has already been closed.
Exception Logs
java.lang.IllegalStateException: This consumer has already been closed.
at org.apache.kafka.clients.consumer.KafkaConsumer.acquireAndEnsureOpen(KafkaConsumer.java:2202)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1332)
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1298)
Kafka version : kafka-clients-2.0.1
Could you pls suggest any one how should configurations Kafka consume looks like.

I have put System.exit(0) in other place in the source code.That is why consumer has leave the group and mark as closed.
I have remove System.exit(0) from source code.Now it's working fine.

Related

Reactive Kafka consumer is rebalancing continuosly

We are using 60 Partitions and have 70 consumers(10 more than partitions),There is no consumer Lag in any of the partitions, but rebalancing is happening in loop continuosly.
Processing time of Each kafkaEvent is 10ms
The following is the CommitFailed Exception which i get
org.apache.kafka.common.errors.RebalanceInProgressException: Offset commit cannot be completed since the consumer is undergoing a rebalance for auto partition assignment. You can try completing the rebalance by calling poll() and then retry the operation.
I am using reactor-kafka:1.3.12 , reactor-core:3.4.18 , kafka-clients:3.0.1 , Java 17 version and Tomcat Webserver
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,"CommentedbootStrapServersUrl");
props.put(ConsumerConfig.GROUP_ID_CONFIG,"user-groupid-1");
props.put(ConsumerConfig.CLIENT_ID_CONFIG,"client-1");
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,true);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,"earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,StringDeserializer.class.getName());
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG,360000);
props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG,300500);
props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG,30500);
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG,300000);
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG,5);
props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG, "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
ReceiverOptions<String,String> receiverOptions = ReceiverOptions.<String,String>create(props)
.addAssignListener(partitions ->LOGGER.info("onPartitionAssigned {}",partitions))
.addRevokeListener(partitions -> LOGGER.info("onPartitionRevoked {}",partitions))
.commitRetryInterval(Duration.ofMillis(3000))
.commitBatchSize(3)
.commitInterval(Duration.ofMillis(2000))
.maxDelayRebalance(Duration.ofSeconds(60))
.subscription(Collections.singleton("commentedTopicName"));
Flux<ReceiverRecord<String,String>> inboundFlux = KafkaReceiver.create(receiverOptions)
.receive()
.onErrorContinue((throwable,o) -> {
LOGGER.error("Error while consuming in this pod");
})
.retryWhen(Retry.backoff(3, Duration.ofSeconds(3)).transientErrors(true))
.repeat();
try {
Scheduler schedulers = Schedulers.newBoundedElastic(2,2,"UThread");
inboundFlux
.publishOn(schedulers)
.doOnNext(kafkaEvent -> {
callBusinessLogic(kafkaEvent);
kafkaEvent.receiverOffset().acknowledge();
})
.doOnError(e->e.printStackTrace())
.subscribe();
} catch (Exception e){
e.printStackTrace();
}
Needed help with fixing rebalancing issue, consumers are stable at times , but if a rebalance starts , then it never stops.

How to track committed offset with Spark job for kafka batch

I have a use case where i am writing to a Kafka topic in batches using spark job (no streaming).Initially i pump-in suppose 10 records to Kafka topic and run the spark job which does some processing and finally write to another Kafka topic.
Next time when i push another 5 records and run the spark job, my requirement is to start processing these 5 records only not from starting offset. I need to maintain the committed offset so that spark job should run on next offset position and do the processing.
Here is code from kafka side to fetch the offset:
private static List<TopicPartition> getPartitions(KafkaConsumer consumer, String topic) {
List<PartitionInfo> partitionInfoList = consumer.partitionsFor(topic);
return partitionInfoList.stream().map(x -> new TopicPartition(topic, x.partition())).collect(Collectors.toList());
}
public static void getOffSet(KafkaConsumer consumer) {
List<TopicPartition> topicPartitions = getPartitions(consumer, topic);
consumer.assign(topicPartitions);
consumer.seekToBeginning(topicPartitions);
topicPartitions.forEach(x -> {
System.out.println("Partition-> " + x + " startingOffSet-> " + consumer.position(x));
});
consumer.assign(topicPartitions);
consumer.seekToEnd(topicPartitions);
topicPartitions.forEach(x -> {
System.out.println("Partition-> " + x + " endingOffSet-> " + consumer.position(x));
});
topicPartitions.forEach(x -> {
consumer.poll(1000) ;
OffsetAndMetadata offsetAndMetadata = consumer.committed(x);
long position = consumer.position(x);
System.out.printf("Committed: %s, current position %s%n", offsetAndMetadata == null ? null : offsetAndMetadata
.offset(), position);
});
}
Below code is for spark to load the messages from topic which is not working :
Dataset<Row> kafkaDataset = session.read().format("kafka")
.option("kafka.bootstrap.servers", "localhost:9092")
.option("subscribe", topic)
.option("group.id", "test-consumer-group")
.option("startingOffsets","{\"Topic1\":{\"0\":2}}")
.option("endingOffsets", "{\"Topic1\":{\"0\":3}}")
.option("enable.auto.commit","true")
.load();
After above code executes i am again trying to get the offset by calling
getoffset(consumer)
from the topic which always reads from 0 offset and committed offset fetched initially keeps on increasing. I am new to kafka and still figuring out how to handle such scenarion.Please help here.
Initially i had 10 records in my topic, i published another 2 records and here is the o/p:
Output post getoffset method executes :
Partition-> Topic00-0 startingOffSet-> 0 Partition->
Topic00-0 endingOffSet-> 12 Committed: 12, current position
12
Output post spark code executes for loading messages.
Partition-> Topic00-0 startingOffSet-> 0 Partition->
Topic00-0 endingOffSet-> 12 Committed: 12, current position
12
I see no diff and . Please take a look and suggest resolution for this sceanario.

Kafka listener isn't listening to Kafka

I'm using Java 11 and kafka-client 2.0.0.
I'm using the following code to generate a consumer :
public Consumer createConsumer(Properties properties,String regex) {
log.info("Creating consumer and listener..");
Consumer consumer = new KafkaConsumer<>(properties);
ConsumerRebalanceListener listener = new ConsumerRebalanceListener() {
#Override
public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
log.info("The following partitions were revoked from consumer : {}", Arrays.toString(partitions.toArray()));
}
#Override
public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
log.info("The following partitions were assigned to consumer : {}", Arrays.toString(partitions.toArray()));
}
};
consumer.subscribe(Pattern.compile(regex), listener);
log.info("consumer subscribed");
return consumer;
}
}
My poll loop is in a different place in the code :
public <K, V> void startWorking(Consumer<K, V> consumer) {
try {
while (true) {
ConsumerRecords<K, V> records = consumer.poll(600);
if (records.count() > 0) {
log.info("Polled {} records", records.count());
} else {
log.info("polled 0 records.. going to sleep..");
Thread.sleep(200);
}
}
} catch (WakeupException | InterruptedException e) {
log.error("Consumer is shutting down", e);
} finally {
consumer.close();
}
}
When I run the code and use this function, the consumer is created and the log contains the following messages :
Creating consumer and listener..
consumer subscribed
polled 0 records.. going to sleep..
polled 0 records.. going to sleep..
polled 0 records.. going to sleep..
The log doesn't contain any info regarding the partition assignment/revocation.
In addition I'm able to see in the log the properties that the consumer uses (group.id is set) :
2020-07-09 14:31:07.959 DEBUG 7342 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [server1:9092]
check.crcs = true
client.id =
group.id=mygroupid
key.deserializer=..
value.deserializer=..
So I tried to use the kafka-console-consumer with the same configuration in order to consume from one of the topics that the regex(mytopic.*) should catch (in this case I used the topic mytopic-1) :
/usr/bin/kafka-console-consumer.sh --bootstrap-server server1:9092 --topic mytopic-1 --property print.timestamp=true --consumer.config /data/scripts/kafka-consumer.properties --from-begining
I have a poll loop in other part of my code that is timing out every 10m .So the bottom line - the problem is that partitions aren't assigned to the Java consumer. The prints inside the listener never happen and the consumer doesn't have any partitions to listen to.
It seems that I was missing the ssl property in my properties file. Don't forget to specify the security.protocol=ssl if you use ssl. It seems that kafka-client API doesn't throw exception if the Kafka uses ssl and you try to access it without ssl parameter configured.

Can produce to Kafka but cannot consume

I'm using the Kafka JDK client ver 0.10.2.1 . I am able to produce simple messages to Kafka for a "heartbeat" test, but I cannot consume a message from that same topic using the sdk. I am able to consume that message when I go into the Kafka CLI, so I have confirmed the message is there. Here's the function I'm using to consume from my Kafka server, with the props - I pass the message I produced to the topic only after I have indeed confirmed the produce() was succesful, I can post that function later if requested:
private def consumeFromKafka(topic: String, expectedMessage: String): Boolean = {
val props: Properties = initProps("consumer")
val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(List(topic).asJava)
var readExpectedRecord = false
try {
val records = {
val firstPollRecs = consumer.poll(MAX_POLLTIME_MS)
// increase timeout and try again if nothing comes back the first time in case system is busy
if (firstPollRecs.count() == 0) firstPollRecs else {
logger.info("KafkaHeartBeat: First poll had 0 records- trying again - doubling timeout to "
+ (MAX_POLLTIME_MS * 2)/1000 + " sec.")
consumer.poll(MAX_POLLTIME_MS * 2)
}
}
records.forEach(rec => {
if (rec.value() == expectedMessage) readExpectedRecord = true
})
} catch {
case e: Throwable => //log error
} finally {
consumer.close()
}
readExpectedRecord
}
private def initProps(propsType: String): Properties = {
val prop = new Properties()
prop.put("bootstrap.servers", kafkaServer + ":" + kafkaPort)
propsType match {
case "producer" => {
prop.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
prop.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
prop.put("acks", "1")
prop.put("producer.type", "sync")
prop.put("retries", "3")
prop.put("linger.ms", "5")
}
case "consumer" => {
prop.put("group.id", groupId)
prop.put("enable.auto.commit", "false")
prop.put("auto.commit.interval.ms", "1000")
prop.put("session.timeout.ms", "30000")
prop.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
prop.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
prop.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
// poll just once, should only be one record for the heartbeat
prop.put("max.poll.records", "1")
}
}
prop
}
Now when I run the code, here's what it outputs in the console:
13:04:21 - Discovered coordinator serverName:9092 (id: 2147483647
rack: null) for group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e. 13:04:23
INFO o.a.k.c.c.i.ConsumerCoordinator - Revoking previously assigned
partitions [] for group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e 13:04:24
INFO o.a.k.c.c.i.AbstractCoordinator - (Re-)joining group
0b8947e1-eb68-4af3-ac7b-be3f7c02e76e 13:04:25 INFO
o.a.k.c.c.i.AbstractCoordinator - Successfully joined group
0b8947e1-eb68-4af3-ac7b-be3f7c02e76e with generation 1 13:04:26 INFO
o.a.k.c.c.i.ConsumerCoordinator - Setting newly assigned partitions
[HeartBeat_Topic.Service_5.2018-08-03.13_04_10.377-0] for group
0b8947e1-eb68-4af3-ac7b-be3f7c02e76e 13:04:27 INFO
c.p.p.l.util.KafkaHeartBeatUtil - KafkaHeartBeat: First poll had 0
records- trying again - doubling timeout to 60 sec.
And then nothing else, no errors thrown -so no records are polled. Does anyone have any idea what's preventing the 'consume' from happening? The subscriber seems to be successful, as I'm able to successfully call the listTopics and list partions no problem.
Your code has a bug. It seems your line:
if (firstPollRecs.count() == 0)
Should say this instead
if (firstPollRecs.count() > 0)
Otherwise, you're passing in an empty firstPollRecs, and then iterating over that, which obviously returns nothing.

Kafka 0.9-Java : consumer skipping offsets during application restart

I have a java application with below properties
kafkaProperties = new Properties();
kafkaProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaBrokersList);
kafkaProperties.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroupName);
kafkaProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
kafkaProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
kafkaProperties.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, consumerSessionTimeoutMs);
kafkaProperties.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, maxPartitionFetchBytes);
kafkaProperties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
I've created 15 consumer threads and let them process the below runnable .I don't have any other consumer with this consumer group name consuming .
#Override
public void run() {
try {
logger.info("Starting ConsumerWorker, consumerId={}", consumerId);
consumer.subscribe(Arrays.asList(kafkaTopic), offsetLoggingCallback);
while (true) {
boolean isPollFirstRecord = true;
logger.debug("consumerId={}; about to call consumer.poll() ...", consumerId);
ConsumerRecords<String, String> records = consumer.poll(pollIntervalMs);
Map<Integer,Long> partitionOffsetMap = new HashMap<>();
for (ConsumerRecord<String, String> record : records) {
if (isPollFirstRecord) {
isPollFirstRecord = false;
logger.info("Start offset for partition {} in this poll : {}", record.partition(), record.offset());
}
messageProcessor.processMessage(record.value(), record.offset());
partitionOffsetMap.put(record.partition(),record.offset());
}
if (!records.isEmpty()) {
logger.info("Invoking commit for partition/offset : {}", partitionOffsetMap);
consumer.commitAsync(offsetLoggingCallback);
}
}
} catch (WakeupException e) {
logger.warn("ConsumerWorker [consumerId={}] got WakeupException - exiting ... Exception: {}",
consumerId, e.getMessage());
} catch (Exception e) {
logger.error("ConsumerWorker [consumerId={}] got Exception - exiting ... Exception: {}",
consumerId, e.getMessage());
} finally {
logger.warn("ConsumerWorker [consumerId={}] is shutting down ...", consumerId);
consumer.close();
}
}
I also have a OffsetCommitCallbackImpl like below . It basically maintains the partition's and their commited offset as map .It also logs whenever offset is committed .
#Override
public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
if (exception == null) {
offsets.forEach((topicPartition, offsetAndMetadata) -> {
partitionOffsetMap.put(topicPartition, offsetAndMetadata);
logger.info("Offset position during the commit for consumerId : {}, partition : {}, offset : {}", Thread.currentThread().getName(), topicPartition.partition(), offsetAndMetadata.offset());
});
} else {
offsets.forEach((topicPartition, offsetAndMetadata) -> logger.error("Offset commit error, and partition offset info : {}, partition : {}, offset : {}", exception.getMessage(), topicPartition.partition(), offsetAndMetadata.offset()));
}
}
Problem/Issue :
I noticed that i miss events/messages whenever i (restart) bring the application down and bring it back up . So when i closely looked at the logging . by comparing the offsets that are committed(using offsetcommitcallback logging) before shutdown vs offsets that are picked up for processing after restart, i see that for certain partition we did not pickup the offset where we left before shutdown. sometimes the start offset for certain partition's are like 1000 more than the committed offset .
NOTE : This happens to like 8 out of 40 partitions
If you closely look at the logging in run method there is one log statement where i actually print the offset before invoking async commit . For example if that last log before shutdown shows that as 10 for partition 1 . After restart the first offset we are processing for partition 1 would be like 100 . And i validated that we are exactly missing 90 messages .
Can any one think of a reason why this would be happening ?

Categories

Resources