What is 'Simple Consumer Group' in Apache Kafka? - java

I used the code below to find Kafka consumer groups.
ListConsumerGroupsResult listConsumerGroups = admin.listConsumerGroups();
listConsumerGroups.all().get().forEach(v -> {
    logger.info("{}", v);
});
The results are as follows.
[main] INFO KafkaAdminClient - (groupId='test-consumer-group', isSimpleConsumerGroup=false)
[main] INFO KafkaAdminClient - (groupId='wordcount-example', isSimpleConsumerGroup=false)
I want to know what isSimpleConsumerGroup is.
What is a simple consumer group?

It is an implementation of a consumer group in Kafka that does not use Kafka's group management for partition assignment; the main reason for using it is greater control over partition consumption than the regular consumer group gives you.
For a detailed answer you can have a look at this link: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
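As far as I can tell, the flag ends up true for groups that only commit offsets and never use Kafka's group management, e.g. consumers that call assign() instead of subscribe(). A minimal sketch of such a consumer (broker address, topic, and group id are made up for illustration):
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualAssignmentConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "manual-assign-group");      // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // assign() skips the group coordinator's partition assignment entirely;
            // the group id is only used for storing committed offsets
            consumer.assign(Collections.singletonList(new TopicPartition("some-topic", 0)));
            consumer.poll(Duration.ofSeconds(1)).forEach(record ->
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value()));
            consumer.commitSync();
        }
    }
}
Groups created by regular subscribe()-based consumers (such as test-consumer-group and wordcount-example above) report isSimpleConsumerGroup=false.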

Related

Kafka Stream StateStore infinite loop

We have a KStream app that uses in-memory KV StateStore but with changelog disabled.
String stateStoreName = "statestore-v1";
StoreBuilder<KeyValueStore<String, Event>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.inMemoryKeyValueStore(stateStoreName),
                Serdes.String(), new JsonSerde<>(Event.class));
keyValueStoreBuilder.withLoggingDisabled();
streamsBuilder.addStateStore(keyValueStoreBuilder);
We now want to enable the changelog with different configuration and different name.
String stateStoreName = "statestore-v2";
StoreBuilder<KeyValueStore<String, Event>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.inMemoryKeyValueStore(stateStoreName),
                Serdes.String(), new JsonSerde<>(Event.class));

Map<String, String> changelogConfig = new HashMap<>();
changelogConfig.put("retention.ms", "43200000"); // 12 hours
changelogConfig.put("cleanup.policy", "delete");
changelogConfig.put("auto.offset.reset", "latest");

keyValueStoreBuilder.withLoggingEnabled(changelogConfig);
streamsBuilder.addStateStore(keyValueStoreBuilder);
When we run our application, we get into an infinite loop with these messages:
2022-10-11 13:02:32.761 app=myapp INFO 54561 --- [-StreamThread-3]
o.a.k.s.p.i.StoreChangelogReader : stream-thread [myapp-StreamThread-3]
End offset for changelog myapp-statestore-v2-changelog-4 cannot be found;
will retry in the next time.
2022-10-11 13:02:32.761 app=myapp INFO 54561 --- [-StreamThread-3]
o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=myapp-StreamThread-3-restore-consumer, groupId=null]
Unsubscribed all topics or patterns and assigned partitions
It does not appear that the changelog topic is ever created... At least kafka-topics does not show it.
I am using io.confluent packages version 7.2.2-ccs, which I think translates to Apache Kafka version 3.2.x
Any ideas on how to fix the infinite loop and get the changelog topics created?
Thanks!
The infinite loop was caused by the fact that we were doing a Blue/Green deployment. We learned that we cannot do this if we are changing anything about the StateStore (configuration, or disabling/re-enabling changelogs).
We just did a complete shutdown of the old version, then deployed the new version. That worked fine.
Another option would be to use the kafka-streams-application-reset tool as OneKricketeer suggested.

How to solve InvalidTopicException with multiplexed input topics in Spring Cloud Stream Kafka Streams Binder?

I wrote a Spring Cloud Stream Kafka Streams Binder application that has multiple Kafka input topics multiplexed to one stream with:
spring:
  cloud:
    stream:
      bindings:
        process-in-0:
          destination: test.topic-a,test.topic-b
(Source: https://spring.io/blog/2019/12/03/stream-processing-with-spring-cloud-stream-and-apache-kafka-streams-part-2-programming-model-continued)
But whenever I set up more than one topic in the input destination (separated by comma), the following error occurs:
2022-06-17 14:07:07.648 INFO --- [-StreamThread-1] o.a.k.clients.consumer.KafkaConsumer : [Consumer clientId=test-processor-2ba8d1d3-5bbe-45d3-a832-6a24cf2f5549-StreamThread-1-consumer, groupId=test-processor] Subscribed to topic(s): test-processor-KTABLE-AGGREGATE-STATE-STORE-0000000005-repartition, test.topic-a,test.topic-b
2022-06-17 14:07:07.660 WARN --- [-StreamThread-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=test-processor-2ba8d1d3-5bbe-45d3-a832-6a24cf2f5549-StreamThread-1-consumer, groupId=test-processor] Error while fetching metadata with correlation id 2 : {test-processor-KTABLE-AGGREGATE-STATE-STORE-0000000005-repartition=UNKNOWN_TOPIC_OR_PARTITION, test.topic-a,test.topic-b=INVALID_TOPIC_EXCEPTION}
2022-06-17 14:07:07.660 ERROR --- [-StreamThread-1] org.apache.kafka.clients.Metadata : [Consumer clientId=test-processor-2ba8d1d3-5bbe-45d3-a832-6a24cf2f5549-StreamThread-1-consumer, groupId=test-processor] Metadata response reported invalid topics [test.topic-a,test.topic-b]
2022-06-17 14:07:07.660 INFO --- [-StreamThread-1] org.apache.kafka.clients.Metadata : [Consumer clientId=test-processor-2ba8d1d3-5bbe-45d3-a832-6a24cf2f5549-StreamThread-1-consumer, groupId=test-processor] Cluster ID: XYZ
2022-06-17 14:07:07.663 ERROR --- [-StreamThread-1] org.apache.kafka.streams.KafkaStreams : stream-client [test-processor-2ba8d1d3-5bbe-45d3-a832-6a24cf2f5549] Encountered the following exception during processing and Kafka Streams opted to SHUTDOWN_CLIENT. The streams client is going to shut down now.
org.apache.kafka.streams.errors.StreamsException: org.apache.kafka.common.errors.InvalidTopicException: Invalid topics: [test.topic-a,test.topic-b]
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:627) ~[kafka-streams-3.2.0.jar:na]
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:551) ~[kafka-streams-3.2.0.jar:na]
Caused by: org.apache.kafka.common.errors.InvalidTopicException: Invalid topics: [test.topic-a,test.topic-b]
I tried with the following dependencies:
implementation 'org.apache.kafka:kafka-clients:3.2.0'
implementation 'org.apache.kafka:kafka-streams:3.2.0'
implementation "org.springframework.cloud:spring-cloud-stream"
implementation "org.springframework.cloud:spring-cloud-stream-binder-kafka"
implementation "org.springframework.cloud:spring-cloud-stream-binder-kafka-streams"
implementation "org.springframework.kafka:spring-kafka"
When I only set one input topic, everything works fine.
I am not able to determine what causes the InvalidTopicException, because I only use permitted characters in the topic names, and the comma separator also seems correct (otherwise different exceptions occur).
Right after posting the question I found a solution/workaround myself, so here it is for future reference:
Apparently, I am not allowed to multiplex input topics when my processor topology expects a KTable as input type. When I change the processor signature to KStream, it suddenly works:
Not working:
@Bean
public Function<KTable<String, Object>, KStream<String, Object>> process() {
    return stringObjectKTable ->
            stringObjectKTable
                    .mapValues(...
Working:
@Bean
public Function<KStream<String, Object>, KStream<String, Object>> process() {
    return stringObjectKStream ->
            stringObjectKStream
                    .toTable()
                    .mapValues(...
I am not sure if this is expected behaviour or if there is something else wrong, so I appreciate any hints if there is more going on underneath.

Error in kafka consumer with spring boot (disconnected)

I am getting this error in the consumer when using Spring Boot:
2021-10-11 19:42:41.388 WARN 64415 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-something-1, groupId=some_group_id] Bootstrap broker my_address:9093 (id: -1 rack: null) disconnected
My application.properties are:
spring.kafka.consumer.bootstrap-servers = my_address:9093
spring.kafka.consumer.group-id= some_group_id
spring.kafka.consumer.auto-offset-reset = earliest
The Java code will use the following properties from consumer.properties:
bootstrap.servers=my_address:9093
schema.registry.url=https://URL.net
security.protocol=SSL
ssl.truststore.location=some_path.jks
ssl.truststore.password=pinPass
ssl.keystore.location=some_path.jks
ssl.keystore.password=pinPass
ssl.key.password=pinPass
In Java:
properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroDeserializer.class);
properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,io.confluent.kafka.serializers.KafkaAvroDeserializer.class);
Please let me know if any more input is needed.
Note: the same code works fine for localhost. Also, there is no issue with the server, as I have other code without Spring and it works fine.
Your "security.protocol=SSL" means you will use SSL for identity authentication, please check your SSL Keys and Certificates, otherwise using "security.protocol=PLAINTEXT" will use the default, Un-authenticated, non-encrypted channel.
refer to:
https://docs.confluent.io/platform/current/kafka/authentication_ssl.html
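One thing worth verifying is that these SSL settings actually reach the consumer that connects to port 9093; if they stay in consumer.properties without being applied, the client falls back to PLAINTEXT and an SSL listener usually closes the connection, which surfaces as the "disconnected" warning. As a rough sketch (values copied verbatim from the question, not verified), everything can go on the same Properties object used for the deserializers:
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SslConsumerFactory {

    // Sketch: builds the consumer with the SSL settings from consumer.properties applied directly.
    public static KafkaConsumer<String, Object> createConsumer() {
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "my_address:9093");
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "some_group_id");
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                io.confluent.kafka.serializers.KafkaAvroDeserializer.class);
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                io.confluent.kafka.serializers.KafkaAvroDeserializer.class);
        // Without these the client defaults to security.protocol=PLAINTEXT,
        // which an SSL-only listener on 9093 will reject ("Bootstrap broker ... disconnected").
        properties.put("security.protocol", "SSL");
        properties.put("ssl.truststore.location", "some_path.jks");
        properties.put("ssl.truststore.password", "pinPass");
        properties.put("ssl.keystore.location", "some_path.jks");
        properties.put("ssl.keystore.password", "pinPass");
        properties.put("ssl.key.password", "pinPass");
        // Picked up by the Avro deserializers, not by the consumer itself
        properties.put("schema.registry.url", "https://URL.net");
        return new KafkaConsumer<>(properties);
    }
}
In a Spring Boot setup, the equivalent is usually to put these settings under spring.kafka.properties.* (or spring.kafka.ssl.*) so the auto-configured consumer picks them up.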

Error while fetching metadata with correlation id 92 : {myTest=UNKNOWN_TOPIC_OR_PARTITION}

I have created a sample application to check my producer's code. My application runs fine when I'm sending data without a partitioning key. But, on specifying a key for data partitioning I'm getting the error:
[kafka-producer-network-thread | producer-1] WARN org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 37 : {myTest=UNKNOWN_TOPIC_OR_PARTITION}
[kafka-producer-network-thread | producer-1] WARN org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 38 : {myTest=UNKNOWN_TOPIC_OR_PARTITION}
[kafka-producer-network-thread | producer-1] WARN org.apache.kafka.clients.NetworkClient - [Producer clientId=producer-1] Error while fetching metadata with correlation id 39 : {myTest=UNKNOWN_TOPIC_OR_PARTITION}
for both the consumer and the producer. I have searched a lot on the internet; the suggestions were to verify the kafka.acl settings. I'm using Kafka on HDInsight and I have no idea how to verify it and solve this issue.
My cluster has the following configuration:
Head Nodes: 2
Worker Nodes: 4
ZooKeeper: 3
My producer code:
public static void produce(String brokers, String topicName) throws IOException {
    // Set properties used to configure the producer
    Properties properties = new Properties();
    // Set the brokers (bootstrap servers)
    properties.setProperty("bootstrap.servers", brokers);
    properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    // specify the protocol for Domain Joined clusters
    // To create an idempotent producer
    properties.setProperty(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
    properties.setProperty(ProducerConfig.ACKS_CONFIG, "all");
    properties.setProperty(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
    properties.setProperty(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "test-transactional-id");

    KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
    producer.initTransactions();

    // So we can generate random sentences
    Random random = new Random();
    String[] sentences = new String[] {
            "the cow jumped over the moon",
            "an apple a day keeps the doctor away",
            "four score and seven years ago",
            "snow white and the seven dwarfs",
            "i am at two with nature",
    };

    for (String sentence : sentences) {
        // Send the sentence to the test topic
        try {
            String key = sentence.substring(0, 2);
            producer.beginTransaction();
            producer.send(new ProducerRecord<String, String>(topicName, key, sentence)).get();
        } catch (Exception ex) {
            System.out.print(ex.getMessage());
            throw new IOException(ex.toString());
        }
        producer.commitTransaction();
    }
}
Also, my topic consists of 3 partitions with replication factor = 3.
I made the replication factor less than the number of partitions and it worked for me. It sounds odd, but yes, it started working after that.
The error clearly states that the topic (or partition) you are producing to does not exist.
Ultimately, you will need to describe the topic (via CLI kafka-topics --describe --topic <topicName> or other means) to verify if this is true
Kafka on HDInsight and I have no idea how to verify it and solve this issue.
ACLs are only set up if you installed the cluster with them, but I believe you can still list ACLs via zookeeper-shell or by SSHing into one of the Hadoop masters.
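If you prefer to check from Java instead of the CLI, here is a small sketch (the broker address is a placeholder, the topic name is taken from the error above) using the AdminClient to describe the topic; a missing topic surfaces as an UnknownTopicOrPartitionException:
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class DescribeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder brokers

        try (AdminClient admin = AdminClient.create(props)) {
            // A missing topic shows up as an UnknownTopicOrPartitionException
            // (wrapped in an ExecutionException) when the future is resolved.
            TopicDescription description = admin.describeTopics(Collections.singletonList("myTest"))
                    .values().get("myTest").get();
            description.partitions().forEach(p ->
                    System.out.printf("partition=%d leader=%s isr=%s%n",
                            p.partition(), p.leader(), p.isr()));
        }
    }
}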
I too had the same issue when creating a new topic. When I described the topic, I could see that no leaders were assigned to the topic partitions.
Topic: xxxxxxxxx Partition: 0 Leader: none Replicas: 3,2,1 Isr:
Topic: xxxxxxxxx Partition: 1 Leader: none Replicas: 1,3,2 Isr:
After some googling, I figured out that this can happen when there is an issue with the controller broker, so I restarted the controller broker.
And everything worked as expected!
If the topic exists but you're still seeing this error, it could mean that the supplied list of brokers is incorrect. Check the bootstrap.servers value; it should point to the Kafka cluster where the topic resides.
I saw the same issue: I have multiple Kafka clusters and the topic clearly existed, but my list of brokers was incorrect.

Spring Cloud Stream KStream Consumer Concurrency has no effect?

My configuration for the consumer is as documented in the Spring Cloud Stream consumer properties documentation.
spring-cloud-dependencies:Finchley.SR1
springBootVersion = '2.0.5.RELEASE'
I have 4 partitions for kstream_test topic and they are filled with messages from producer as seen below:
root#kafka:/# kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic kstream_test --time -1
kstream_test:2:222
kstream_test:1:203
kstream_test:3:188
kstream_test:0:278
My Spring Cloud Stream Kafka binder based configuration is:
spring.cloud.stream.bindings.input:
  destination: kstream_test
  group: consumer-group-G1_test
  consumer:
    useNativeDecoding: true
    headerMode: raw
    startOffset: latest
    partitioned: true
    concurrency: 3
KStream Listener class
@StreamListener
@SendTo(MessagingStreams.OUTPUT)
public KStream<?, ?> process(@Input(MessagingStreams.INPUT) KStream<?, ?> kstreams) {
    ......
    log.info("Got a message");
    ......
    return kstreams;
}
My producer sends 100 messages in one run, but the logs seem to show only one thread, StreamThread-1, handling the messages, even though I have concurrency set to 3. What might be wrong here? Are 100 messages not enough to see the concurrency at play?
2018-10-18 11:50:01.923 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.923 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.945 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.956 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
2018-10-18 11:50:01.972 INFO 10228 --- [-StreamThread-1] c.c.c.s.KStreamHandler : Got a message
UPDATE:
As per the answer, the below num.stream.threads configuration works at the binder level.
spring.cloud.stream.kafka.streams.binder.configuration:
  num.stream.threads: 3
It seems that the num.stream.threads needs to be set to increase the concurrency...
/** {@code num.stream.threads} */
@SuppressWarnings("WeakerAccess")
public static final String NUM_STREAM_THREADS_CONFIG = "num.stream.threads";
private static final String NUM_STREAM_THREADS_DOC = "The number of threads to execute stream processing.";
...it defaults to 1.
The binder should really set that based on the ...consumer.concurrency property; please open a github issue to that effect against the binder.
In the meantime, you can just set that property directly in ...consumer.configuration.
CORRECTION
I've just been told that the ...consumer.configuration is not currently applied to the streams binder either; you would have to set it at the binder level.
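For comparison, outside the binder the same knob is set directly on the Kafka Streams configuration. A minimal sketch (application id, broker address, and topic names are placeholders):
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class ConcurrencyExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kstream-test-app");  // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        // One StreamThread per num.stream.threads; it defaults to 1,
        // which is why only StreamThread-1 shows up in the logs above.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 3);

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("kstream_test")
                .peek((k, v) -> System.out.println("Got a message"))
                .to("kstream_test_out"); // placeholder output topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
Each additional stream thread shows up in the logs as StreamThread-2, StreamThread-3, and so on, with the input partitions spread across them.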
