Kafka Transactions - Java Spring - concurrent transactions depend on each other

I have two instances of an application, each with a different transactional.id. When both of them initialize a transaction, one transaction seems to depend on the other: one instance produces messages to Kafka, the second one does too, and if one of them commits its transaction, all the messages are committed, including those from the second transaction, when messages from the second transaction sit between messages from the first.
Is this normal behaviour? If yes, how can I avoid it?
EDIT: I tried again and it actually works fine, even when messages from other transactions are interleaved, and it works correctly across multiple instances. But with concurrency (e.g. concurrency = 3 on @KafkaListener; the transaction starts later, the listener container does not start it), one transaction may commit another transaction. What should I do when I have concurrency? Use multiple Kafka producers with different transactional.id values?
Config (spring-kafka 2.9.2, Kafka broker 3.3.1):
Non-transactional Kafka producers are used in a different part of the system.
@Bean
AdminClient adminClient(KafkaProperties kafkaProperties) {
    return KafkaAdminClient.create(kafkaProperties.buildAdminProperties());
}
@Bean
KafkaTransactionManager<String, Object> kafkaTransactionManager(@Qualifier("transactionalProducer") ProducerFactory<String, Object> producerFactory) {
    return new KafkaTransactionManager<>(producerFactory);
}
@Bean
KafkaTemplate<String, Object> transactionalKafkaTemplate(@Qualifier("transactionalProducer") ProducerFactory<String, Object> producerFactory) {
    return new KafkaTemplate<>(producerFactory);
}
@Bean
ProducerFactory<String, Object> transactionalProducer(KafkaProperties kafkaProperties) {
    final var properties = kafkaProperties.buildProducerProperties();
    properties.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, timeoutInMillis);
    DefaultKafkaProducerFactory<String, Object> producerFactory = new DefaultKafkaProducerFactory<>(properties);
    // tx-${random.uuid}
    producerFactory.setTransactionIdPrefix(transactionPrefix);
    return producerFactory;
}
@Bean
ProducerFactory<String, Object> nonTransactionalProducer(KafkaProperties kafkaProperties) {
    return new DefaultKafkaProducerFactory<>(kafkaProperties.buildProducerProperties());
}
@Bean
KafkaTemplate<String, Object> nonTransactionalKafkaTemplate(@Qualifier("nonTransactionalProducer") ProducerFactory<String, Object> producerFactory) {
    return new KafkaTemplate<>(producerFactory);
}
Producer
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
spring.kafka.producer.properties.spring.json.add.type.headers=false
Consumer
kafka:
  consumer:
    auto-offset-reset: earliest
    key-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
    value-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
    properties:
      spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
      spring.deserializer.value.delegate.class: org.apache.kafka.common.serialization.ByteArrayDeserializer
  producer:
    key-serializer: org.apache.kafka.common.serialization.StringSerializer
    value-serializer: org.springframework.kafka.support.serializer.JsonSerializer

Related

Multi-kafka connections

I have a data streaming application. It needs to connect to and listen on several Kafka brokers (different IP addresses, more than 2) and write to a single one.
Please advise how to set up multiple Kafka connections.
Configuration class for a single kafka connection:
@Configuration
public class KafkaProducer {
    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:29092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return props;
    }
    @Bean
    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }
    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
The expectation is that several connections are set up and listened to at the same time.

The bootstrap servers config option accepts a comma-separated list of brokers belonging to one cluster. You only need to list more than one for fault tolerance, because Kafka automatically returns all servers in the same cluster on first connection.
If you need to connect to distinct Kafka clusters, create a separate bean for each one, each with its own bootstrap address.
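As a rough sketch of the second point, each cluster gets its own set of client properties, and each set would back its own ConsumerFactory and listener container factory bean. The class name ClusterConfigs, the helper consumerProps, and the broker addresses and group id below are hypothetical; only the property keys are standard Kafka client settings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: one property map per Kafka cluster.
// In a Spring app each map would be passed to its own
// DefaultKafkaConsumerFactory bean (not shown here).
class ClusterConfigs {

    // Builds consumer properties for a single cluster.
    static Map<String, Object> consumerProps(List<String> brokers, String groupId) {
        Map<String, Object> props = new HashMap<>();
        // Brokers of ONE cluster, comma-separated; extra entries only add fault tolerance.
        props.put("bootstrap.servers", String.join(",", brokers));
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Two distinct source clusters (addresses are made up).
        Map<String, Object> clusterA =
                consumerProps(List.of("10.0.0.1:9092", "10.0.0.2:9092"), "stream-app");
        Map<String, Object> clusterB =
                consumerProps(List.of("10.0.1.1:9092"), "stream-app");
        System.out.println(clusterA.get("bootstrap.servers"));
        System.out.println(clusterB.get("bootstrap.servers"));
    }
}
```

Each @KafkaListener would then name the container factory built from the matching map, while the single producer keeps its own bootstrap address.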

Does the default commit strategy when using Spring with Kafka with default properties lose messages?

We see that some messages are lost when consuming from a Kafka topic, especially when the service restarts, using the default properties:
@Bean
public ConsumerFactory<String, String> consumerFactory()
{
    // Creating a Map of string-object pairs
    Map<String, Object> config = new HashMap<>();
    // Adding the Configuration
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
               "127.0.0.1:9092");
    config.put(ConsumerConfig.GROUP_ID_CONFIG,
               "group_id");
    config.put(
        ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
        StringDeserializer.class);
    config.put(
        ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(config);
}
// Creating a Listener
public ConcurrentKafkaListenerContainerFactory
concurrentKafkaListenerContainerFactory()
{
    ConcurrentKafkaListenerContainerFactory<
        String, String> factory
        = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}
}
According to the documentation, the default value for ackMode is BATCH, which is described as:
Commit the offset when all the records returned by the poll() have been processed
How does Spring know that all the messages have been processed in a simple example like the one here? And does it mean that when we restart the service, offsets may already be committed for messages we haven't processed, which leads to losing those messages?
@KafkaListener(topics = "topicName", groupId = "foo")
public void listenGroupFoo(String message) {
    System.out.println("Received Message in group foo: " + message);
}
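One rough way to picture BATCH mode: the container's consumer thread calls the listener once per record from a single poll(), a record counts as processed when the listener method returns without throwing, and the offset is committed only after the whole batch has returned. The following is a plain-Java model of that loop, not Spring's actual code; BatchAckModel, runBatch and the in-memory offset are invented for illustration, and the failure branch is a simplification because the real behaviour depends on the configured error handler.

```java
import java.util.List;
import java.util.function.Consumer;

// Simplified model of AckMode.BATCH, NOT Spring's real implementation.
class BatchAckModel {

    // Offset of the next record to read after a restart (what a commit stores).
    private long committedOffset = 0;

    long committedOffset() {
        return committedOffset;
    }

    // Processes one "poll" worth of records starting at firstOffset.
    // Returns true if the batch was committed.
    boolean runBatch(long firstOffset, List<String> records, Consumer<String> listener) {
        try {
            for (String record : records) {
                // A record counts as processed when the listener returns normally.
                listener.accept(record);
            }
        } catch (RuntimeException e) {
            // Listener failed: in this model nothing from the batch is committed,
            // so the records would be delivered again. (Assumption; the real
            // outcome depends on the configured error handler.)
            return false;
        }
        // The commit happens only AFTER every listener call has returned.
        committedOffset = firstOffset + records.size();
        return true;
    }

    public static void main(String[] args) {
        BatchAckModel container = new BatchAckModel();
        boolean ok = container.runBatch(0, List.of("a", "b", "c"),
                msg -> System.out.println("Received Message in group foo: " + msg));
        System.out.println("committed=" + ok + ", next offset=" + container.committedOffset());
    }
}
```

Under this model a restart re-reads uncommitted records rather than skipping them, so it may be worth checking other causes of loss, such as a listener that swallows exceptions or hands work to another thread before returning.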

Confluent Cloud Apache Kafka Consumer - Topic(s) [test-1] is/are not present and missingTopicsFatal is true

I'm a newbie trying to make the communication work between two Spring Boot microservices using Confluent Cloud Apache Kafka.
When using Kafka on Confluent Cloud, I get the following error on my consumer (ServiceB) after ServiceA publishes the message to the topic. However, when I log in to Confluent Cloud, I can see that the message has been successfully published to the topic.
org.springframework.context.ApplicationContextException: Failed to start bean
'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is
java.lang.IllegalStateException: Topic(s) [topic-1] is/are not present and
missingTopicsFatal is true
I do not face this issue when I run Kafka on my local server. ServiceA is able to publish the message to the topic on my local Kafka server and ServiceB is successfully able to consume that message.
I have mentioned my local Kafka server configuration in application.properties(as commented out code)
Service A: PRODUCER
application.properties
app.topic=test-1
#Remote
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
request.timeout.ms=20000
bootstrap.servers=pkc-4kgmg.us-west-2.aws.confluent.cloud:9092
retry.backoff.ms=500
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
requiredusername="*******"
password="****"
#Local
#ssl.endpoint.identification.algorithm=https
#security.protocol=SASL_SSL
#sasl.mechanism=PLAIN
#request.timeout.ms=20000
#bootstrap.servers=localhost:9092
#retry.backoff.ms=500
#sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
Sender.java
public class Sender {
    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;
    @Value("${app.topic}")
    private String topic;
    public void send(String data){
        Message<String> message = MessageBuilder
                .withPayload(data)
                .setHeader(KafkaHeaders.TOPIC, topic)
                .build();
        kafkaTemplate.send(message);
    }
}
KafkaProducerConfig.java
@Configuration
@EnableKafka
public class KafkaProducerConfig {
    @Value("${bootstrap.servers}")
    private String bootstrapServers;
    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return props;
    }
    @Bean
    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }
    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
Service B: CONSUMER
application.properties
app.topic=test-1
#Remote
ssl.endpoint.identification.algorithm=https
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
request.timeout.ms=20000
bootstrap.servers=pkc-4kgmg.us-west-2.aws.confluent.cloud:9092
retry.backoff.ms=500
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
requiredusername="*******"
password="****"
#Local
#ssl.endpoint.identification.algorithm=https
#security.protocol=SASL_SSL
#sasl.mechanism=PLAIN
#request.timeout.ms=20000
#bootstrap.servers=localhost:9092
#retry.backoff.ms=500
#sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule
KafkaConsumerConfig.java
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
    @Value("${bootstrap.servers}")
    private String bootstrapServers;
    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "confluent_cli_consumer_040e5c14-0c18-4ae6-a10f-8c3ff69cbc1a"); // confluent cloud consumer group-id
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }
    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(
                consumerConfigs(),
                new StringDeserializer(), new StringDeserializer());
    }
    @Bean(name = "kafkaListenerContainerFactory")
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
}
KafkaConsumer.java
@Service
public class KafkaConsumer {
    private static final Logger LOG = LoggerFactory.getLogger(KafkaConsumer.class);
    @Value("${app.topic}")
    private String kafkaTopic;
    @KafkaListener(topics = "${app.topic}", containerFactory = "kafkaListenerContainerFactory")
    public void receive(@Payload String data) {
        LOG.info("received data='{}'", data);
    }
}

The username and password are part of the JAAS config value, so put all of it on one line:
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="kafkaclient1" password="kafkaclient1-secret";
I would also suggest verifying that your property file is actually loaded into the client.
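To make the one-line shape concrete, here is a small sketch that assembles the value in Java before it is placed in the client properties. The JaasConfig class and its plain method are hypothetical helpers, and the credentials are the placeholder values from the line above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that builds the single-line sasl.jaas.config value.
class JaasConfig {

    // The module name, the "required" flag, both options and the trailing
    // semicolon all belong to ONE property value.
    static String plain(String username, String password) {
        return String.format(
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"%s\" password=\"%s\";",
                username, password);
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<>();
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config", plain("kafkaclient1", "kafkaclient1-secret"));
        System.out.println(props.get("sasl.jaas.config"));
    }
}
```

In the question, producerConfigs() and consumerConfigs() only set the bootstrap servers and (de)serializers, so entries like these three would presumably have to be added to those maps as well for the clients to see them.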

See the Boot documentation.
You can't just put arbitrary kafka properties directly in the application.properties file.
The properties supported by auto configuration are shown in appendix-application-properties.html. Note that, for the most part, these properties (hyphenated or camelCase) map directly to the Apache Kafka dotted properties. Refer to the Apache Kafka documentation for details.
The first few of these properties apply to all components (producers, consumers, admins, and streams) but can be specified at the component level if you wish to use different values. Apache Kafka designates properties with an importance of HIGH, MEDIUM, or LOW. Spring Boot auto-configuration supports all HIGH importance properties, some selected MEDIUM and LOW properties, and any properties that do not have a default value.
Only a subset of the properties supported by Kafka are available directly through the KafkaProperties class. If you wish to configure the producer or consumer with additional properties that are not directly supported, use the following properties:
spring.kafka.properties.prop.one=first
spring.kafka.admin.properties.prop.two=second
spring.kafka.consumer.properties.prop.three=third
spring.kafka.producer.properties.prop.four=fourth
spring.kafka.streams.properties.prop.five=fifth
This sets the common prop.one Kafka property to first (applies to producers, consumers and admins), the prop.two admin property to second, the prop.three consumer property to third, the prop.four producer property to fourth and the prop.five streams property to fifth.
...

@cricket_007's answer is correct. You need to embed the username and password (specifically, the cluster API key and API secret) within the sasl.jaas.config property value.
You can double-check how Java clients should connect to Confluent Cloud via this official example here: https://github.com/confluentinc/examples/blob/5.3.1-post/clients/cloud/java/src/main/java/io/confluent/examples/clients/cloud
Thanks,
-- Ricardo

Kafka Listener could not consume message and persist to hbase table

I have an application that logs transactions to HBase tables. I consume transaction messages from Kafka, but my Kafka listener logs the lines shown below and the data is not persisted to the HBase table. I am using the spring-kafka 2.1.7 release for the consumer. How can I resolve this issue? My Kafka consumer implementation looks like this:
@KafkaListener(topics = "${kafka.consumer.topic}")
public void receive(ConsumerRecord<String, String> consumerRecord) throws IOException {
    if (StringUtils.hasText(consumerRecord.value())) {
        //Some business logic
    }
}
Kafka Listener Config
@Configuration
@EnableKafka
public class KafkaListenerConfig {
    @Autowired
    private KafkaListenerProperties kafkaListenerProperties;
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        // factory.setConcurrency(1);
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
    @Bean
    public DefaultKafkaConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerProps(), stringKeyDeserializer(), workUnitJsonValueDeserializer());
    }
    @Bean
    public Map<String, Object> consumerProps() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaListenerProperties.getBootstrap());
        props.put(ConsumerConfig.GROUP_ID_CONFIG, kafkaListenerProperties.getGroup());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
        // props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
        return props;
    }
    @Bean
    public Deserializer<String> stringKeyDeserializer() {
        return new StringDeserializer();
    }
    @Bean
    public Deserializer<String> workUnitJsonValueDeserializer() {
        return new StringDeserializer();
    }
}
The Kafka listener / HBase log output looks like this:
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] #106, waiting for some tasks to finish. Expected max=0, tasksInProgress=2 hasError=false, tableName=FraudRequest
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Left over 2 task(s) are processed on server(s): [XXXXX.kfs.local,60020,1552673278405]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Regions against which left over task(s) are processed: [FraudRequest,15543323,1554891506303.d3bb6fef4ab349e93729d14d13f730bc.]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] #107, waiting for some tasks to finish. Expected max=0, tasksInProgress=2 hasError=false, tableName=FraudRequest
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Left over 2 task(s) are processed on server(s): [XXXXX.kfs.local,60020,1552673278405]
INFO org.apache.hadoop.hbase.client.AsyncProcess [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] Regions against which left over task(s) are processed: [FraudRequest,15543323,1554891506303.d3bb6fef4ab349e93729d14d13f730bc.]

Spring-Kafka Consumer Group Coordination for ConcurrentKafkaListenerContainerFactory

I have a couple of questions regarding the behaviour of spring-kafka in certain scenarios. Any answers or pointers would be great.
Background: I am building a Kafka consumer that talks to external APIs and sends an acknowledgement back. My config looks like this:
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerServers());
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, this.configuration.getString("kafka-generic.consumer.group.id"));
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "5000000");
    props.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, "6000000");
    return props;
}
@Bean
public RetryTemplate retryTemplate() {
    final ExponentialRandomBackOffPolicy backOffPolicy = new ExponentialRandomBackOffPolicy();
    backOffPolicy.setInitialInterval(this.configuration.getLong("retry-exp-backoff-init-interval"));
    final SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
    retryPolicy.setMaxAttempts(this.configuration.getInt("retry-max-attempts"));
    final RetryTemplate retryTemplate = new RetryTemplate();
    retryTemplate.setBackOffPolicy(backOffPolicy);
    retryTemplate.setRetryPolicy(retryPolicy);
    return retryTemplate;
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Event> retryKafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Event> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(this.configuration.getInt("kafka-concurrency"));
    factory.setRetryTemplate(retryTemplate());
    factory.getContainerProperties().setIdleEventInterval(this.configuration.getLong("kafka-rtm-idle-time"));
    //factory.getContainerProperties().setAckOnError(false);
    factory.getContainerProperties().setErrorHandler(kafkaConsumerErrorHandler);
    factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
    return factory;
}
Let's say the number of partitions is 4. My partition distribution across the KafkaListeners is:
@KafkaListener(topicPartitions = @TopicPartition(topic = "topic", partitions = {"0", "1"}),
        containerFactory = "retryKafkaListenerContainerFactory")
public void receive(Event event, Acknowledgment acknowledgment) throws Exception {
    serviceInvoker.callService(event);
    acknowledgment.acknowledge();
}
@KafkaListener(topicPartitions = @TopicPartition(topic = "topic", partitions = {"2", "3"}),
        containerFactory = "retryKafkaListenerContainerFactory")
public void receive1(Event event, Acknowledgment acknowledgment) throws Exception {
    serviceInvoker.callService(event);
    acknowledgment.acknowledge();
}
Now my questions are:
1. Let's say I have 2 machines where I deployed this code (with the same consumer group id). If I understand correctly, when an event arrives for a partition, the KafkaListener for that partition on one of the machines will receive it, but the other machine's KafkaListener won't. Is that right?
2. My error handler is:
@Named
public class KafkaConsumerErrorHandler implements ErrorHandler {
    @Inject
    private KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;
    @Override
    public void handle(Exception e, ConsumerRecord<?, ?> consumerRecord) {
        System.out.println("Shutting down all the containers");
        kafkaListenerEndpointRegistry.stop();
    }
}
Consider a scenario where a consumer's KafkaListener calls serviceInvoker.callService(event); but the service is down. According to retryKafkaListenerContainerFactory, it retries 3 times and then fails; the error handler is then called, which stops the kafkaListenerEndpointRegistry. Will this shut down all other consumers or machines in the same consumer group, or just this consumer or machine?
3. Regarding the scenario in 2: is there any configuration we need to change to tell Kafka to hold off the acknowledgement for that long?
4. My Kafka producer produces messages every 10 minutes. Do I need to configure those 10 minutes anywhere in my consumer code, or is the consumer agnostic of that?
5. In my KafkaListener annotations I hardcoded the topic name and partitions. Can I change them at run time?
Any help is really appreciated. Thanks in advance. :)

1. Correct; only 1 will get it.
2. It will only stop the local containers. Spring doesn't know anything about your other instances.
3. Since you have ackOnError=false, the offset won't be committed.
4. The consumer does not need to know how often messages are published.
5. You can't change them at runtime, but you can use property placeholders ${...} or SpEL expressions #{...} to set them up during application initialization.
