Spring Boot and Kafka: How to handle broker not available?

While the Spring Boot app is running, if I shut down the broker completely (both Kafka and ZooKeeper), I see this warning in the console indefinitely:
[org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1]
WARN o.apache.kafka.clients.NetworkClient - [Consumer
clientId=consumer-1, groupId=ResponseReceiveConsumerGroup]
Connection to node 2147483647 could not be established. Broker may not
be available.
Is there a way in Spring Boot to handle this gracefully instead of endless logs on the console?

Increase the reconnect.backoff.ms property (see Kafka docs).
The default is only 50ms.
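For instance, with Spring Boot's Kafka auto-configuration, arbitrary Kafka client properties can be passed through under spring.kafka.properties. The property names below are the standard Kafka client ones; the values are only illustrative:

```yaml
spring:
  kafka:
    properties:
      # Wait longer between reconnect attempts (the default is 50 ms)
      reconnect.backoff.ms: 1000
      # Cap for the exponential backoff between attempts
      reconnect.backoff.max.ms: 30000
```

This doesn't stop the warnings entirely, but it reduces their frequency from many per second to one per backoff interval.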

Related

How to integrate Camel Spring boot with Kafka in Confluent Cloud?

I have an application that uses Camel Spring Boot with Debezium to listen to a MySQL database and publish to a Kafka topic.
It was all working fine until I changed Kafka from a local broker to Confluent Cloud. I have some other applications (normal producers and consumers) that connect to Confluent Cloud, and they all work fine.
This is my application.yml. I removed the debezium-mysql part because it works fine, so only the Kafka/Confluent configuration is shown.
routes:
  debezium:
    allow-public-key-retrieval: true
    bootstrap-servers: ${application.kafka.brokers}
    offset-storage:
      topic-cleanup-policy: compact
camel:
  component:
    debezium-mysql:
      # ALL CONFIG with mysql, it is not here because it is working fine
    kafka:
      brokers: ${application.kafka.brokers}
      schema-registry-u-r-l: ${application.schema-registry.base-urls}
      value-serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      additional-properties:
        # CCloud Schema Registry Connection parameter
        schema.registry.basic.auth.credentials.source: USER_INFO
        schema.registry.basic.auth.user.info: ${SCHEMA_REGISTRY_ACCESS_KEY}:${SCHEMA_REGISTRY_SECRET_KEY}
        ssl.endpoint.identification.algorithm: https
        client.dns.lookup: use_all_dns_ips
      sasl-jaas-config: org.apache.kafka.common.security.plain.PlainLoginModule required username="${CONFLUENT_CLOUD_USERNAME}" password="${CONFLUENT_CLOUD_PASSWORD}";
      security-protocol: SASL_SSL
      retry-backoff-ms: 500
      request-timeout-ms: 20000
      sasl-mechanism: PLAIN
With this config, it keeps giving me an error when I try to start the app:
[AdminClient clientId=adminclient-1] Node -1 disconnected.
2023-02-02 16:57:24.853 INFO 9644 --- [| adminclient-1] org.apache.kafka.clients.NetworkClient : [AdminClient clientId=adminclient-1] Cancelled in-flight API_VERSIONS request with correlation id 0 due to node -1 being disconnected (elapsed time since creation: 146ms, elapsed time since send: 146ms, request timeout: 3600000ms)
I could verify that the problem is in this AdminClient config, which doesn't pick up the right properties. For example, security.protocol should be SASL_SSL, but it gets PLAINTEXT. When creating the producer and consumer, however, it gets the right values.
I have been struggling with this for two days and would be grateful for any help. Thank you.

How to use Kafka with SSL via logback appender?

I use this logback appender to send logs to Kafka:
https://github.com/danielwegener/logback-kafka-appender
When Kafka was PLAINTEXT, everything worked correctly, but after Kafka was switched to SSL, messages can no longer be sent. I did not find the necessary information in the readme.md. Has anyone had experience with this setup, or should I use something else?
<topic>TEST_TOPIC_FOR_OS</topic>
<keyingStrategy class="com.github.danielwegener.logback.kafka.keying.NoKeyKeyingStrategy"/>
<deliveryStrategy class="com.github.danielwegener.logback.kafka.delivery.AsynchronousDeliveryStrategy">
</deliveryStrategy>
<producerConfig>metadata.fetch.timeout.ms=99999999999</producerConfig>
<producerConfig>bootstrap.servers=KAFKA BROKER HOST</producerConfig>
<producerConfig>acks=0</producerConfig>
<producerConfig>linger.ms=1000</producerConfig>
<producerConfig>buffer.memory=16777216</producerConfig>
<producerConfig>max.block.ms=100</producerConfig>
<producerConfig>retries=2</producerConfig>
<producerConfig>client.id=${HOSTNAME}-${CONTEXT_NAME}-logback</producerConfig>
<producerConfig>compression.type=none</producerConfig>
<producerConfig>security.protocol=SSL</producerConfig>
<producerConfig>ssl.keystore.location=path_to_jks</producerConfig>
<producerConfig>ssl.keystore.password=PASSWORD</producerConfig>
<producerConfig>ssl.truststore.location=path_to_jks</producerConfig>
<producerConfig>ssl.truststore.password=PASSWORD</producerConfig>
<producerConfig>ssl.endpoint.identification.algorithm=</producerConfig>
<producerConfig>ssl.protocol=TLSv1.1</producerConfig>
For any existing topic, I get an error:
12:05:49.505 [kafka-producer-network-thread | host-default-logback] route: DEBUG o.a.k.clients.producer.KafkaProducer breadcrumbId: - [Producer clientId=host-default-logback] Exception occurred during message send:
org.apache.kafka.common.errors.TimeoutException: Topic TEST_TOPIC_FOR_OS not present in metadata after 100 ms.
The application itself works correctly with this Kafka cluster and topic.
The problem went away after upgrading the appender to 0.2.0.
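Assuming the appender is pulled in via Maven, the upgrade is a one-line dependency bump (coordinates as published by the linked GitHub project):

```xml
<dependency>
    <groupId>com.github.danielwegener</groupId>
    <artifactId>logback-kafka-appender</artifactId>
    <version>0.2.0</version>
</dependency>
```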

Flink 1.9.1 Can no longer connect to Azure Event Hub

Java 8,
Flink 1.9.1,
Azure Event Hub
I can no longer connect to Azure Event Hub with my Flink project as of Jan 5th, 2020. I was having the same issue with several Spring Boot apps, but that issue was resolved when I upgraded to Spring Boot 2.2.2, which also updated the Kafka Clients and Kafka dependencies to 2.3.1. I have attempted to update Flink's Kafka dependencies without success. I've also submitted an issue:
https://issues.apache.org/jira/browse/FLINK-15557
2020-01-10 19:36:30,364 WARN org.apache.kafka.clients.NetworkClient -
[Consumer clientId=consumer-1, groupId=****] Bootstrap broker
*****.servicebus.windows.net:9093 (id: -1 rack: null) disconnected
Connection Properties
"sasl.mechanism"="PLAIN");
"security.protocol"="SASL_SSL");
"sasl.jaas.config"="Endpoint=sb://<FQDN>/;SharedAccessKeyName=<KeyName>;SharedAccessKey=<KeyValue>;EntityPath=<EntityValue>;
You must be using an entity-level connection string, which is why your clients are observing connection failures. The issue should resolve once a namespace-level connection string is used.
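For illustration (placeholder names, not real credentials): an entity-level connection string carries an EntityPath suffix scoping it to a single event hub, while a namespace-level one does not:

```
# Entity-level (scoped to one event hub) - triggers the failures above
Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=MyKey;SharedAccessKey=<KeyValue>;EntityPath=myeventhub

# Namespace-level (scoped to the whole namespace) - use this one
Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=MyKey;SharedAccessKey=<KeyValue>
```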

Consumer does not receive messages after kafka producer/consumer restart

We have one producer, one consumer, and one partition. Both the consumer and the producer are Spring Boot applications. The consumer app runs on my local machine, while the producer runs along with Kafka and ZooKeeper on a remote machine.
During development, I redeployed my producer application with some changes, but after that my consumer stopped receiving messages. I tried restarting the consumer, but no luck. What could the issue be, and how can it be solved?
Consumer Config:
spring:
  cloud:
    stream:
      defaultBinder: kafka
      bindings:
        input:
          destination: sales
          content-type: application/json
      kafka:
        binder:
          brokers: ${SERVICE_REGISTRY_HOST:127.0.0.1}
          zkNodes: ${SERVICE_REGISTRY_HOST:127.0.0.1}
          defaultZkPort: 2181
          defaultBrokerPort: 9092
server:
  port: 0
Producer Config:
cloud:
  stream:
    defaultBinder: kafka
    bindings:
      output:
        destination: sales
        content-type: application/json
    kafka:
      binder:
        brokers: ${SERVICE_REGISTRY_HOST:127.0.0.1}
        zkNodes: ${SERVICE_REGISTRY_HOST:127.0.0.1}
        defaultZkPort: 2181
        defaultBrokerPort: 9092
EDIT2:
After 5 minutes the consumer app dies with the following exception:
2017-09-12 18:14:47,254 ERROR main o.s.c.s.b.k.p.KafkaTopicProvisioner:253 - Cannot initialize Binder
org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata
2017-09-12 18:14:47,255 WARN main o.s.b.c.e.AnnotationConfigEmbeddedWebApplicationContext:550 - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'inputBindingLifecycle'; nested exception is org.springframework.cloud.stream.binder.BinderException: Cannot initialize binder:
2017-09-12 18:14:47,256 INFO main o.s.i.m.IntegrationMBeanExporter:449 - Unregistering JMX-exposed beans on shutdown
2017-09-12 18:14:47,257 INFO main o.s.i.m.IntegrationMBeanExporter:241 - Unregistering JMX-exposed beans
2017-09-12 18:14:47,257 INFO main o.s.i.m.IntegrationMBeanExporter:375 - Summary on shutdown: input
2017-09-12 18:14:47,257 INFO main o.s.i.m.IntegrationMBeanExporter:375 - Summary on shutdown: nullChannel
2017-09-12 18:14:47,258 INFO main o.s.i.m.IntegrationMBeanExporter:375 - Summary on shutdown: errorChannel
See if the suggestion above about DEBUG logging reveals any further information. It looks like you are getting a timeout exception from the KafkaTopicProvisioner, but that occurs when you restart the consumer, I assume. The consumer seems to have trouble communicating with the broker, and you need to find out what's going on there.
Well, it looks like there is already a bug reported against spring-cloud-stream-binder-kafka stating that the resetOffset property has no effect. Hence, the consumer always requested messages with the offset set to latest.
As mentioned in the Git issue, the only workaround is to reset the offset via the Kafka consumer CLI tool.
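As a sketch of that workaround (the group, topic, and broker address are placeholders; kafka-consumer-groups.sh has supported --reset-offsets since Kafka 0.11):

```shell
# Stop the consumer first, then rewind the group to the earliest offset
kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group my-consumer-group --topic sales \
  --reset-offsets --to-earliest --execute
```

Run the same command with --dry-run instead of --execute to preview the new offsets before committing them.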

Kafka leader election causes Kafka Streams crash

I have a Kafka Streams application consuming from and producing to a Kafka cluster with 3 brokers and a replication factor of 3. Other than the consumer offset topics (50 partitions), all other topics have only one partition each.
When the brokers attempt a preferred replica election, the Streams app (which is running on a completely different instance than the brokers) fails with the error:
Caused by: org.apache.kafka.streams.errors.StreamsException: task [0_0] exception caught when producing
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.checkForException(RecordCollectorImpl.java:119)
...
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:197)
Caused by: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.
Is it normal that the Streams app attempts to be the leader for the partition, given that it's running on a server that's not part of the Kafka cluster?
I can reproduce this behaviour on demand by:
Killing one of the brokers (whereupon the other 2 take over as leader for all partitions that had the killed broker as their leader, as expected)
Bringing the killed broker back up
Triggering a preferred replica leader election with bin/kafka-preferred-replica-election.sh --zookeeper localhost
My issue seems to be similar to this reported failure, so I'm wondering if this is a new Kafka Streams bug. My full stack trace is literally exactly the same as the gist linked in the reported failure (here).
Another potentially interesting detail is that during the leader election, I get these messages in the controller.log of the broker:
[2017-04-12 11:07:50,940] WARN [Controller-3-to-broker-3-send-thread], Controller 3's connection to broker BROKER-3-HOSTNAME:9092 (id: 3 rack: null) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to BROKER-3-HOSTNAME:9092 (id: 3 rack: null) failed
at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:84)
at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:94)
at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:232)
at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:185)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:184)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
I initially thought this connection error was to blame, but after the leader election crashes the Streams app, if I restart the Streams app, it works normally until the next election, without me touching the brokers at all.
All servers (3 Kafka brokers and the Streams app) are running on EC2 instances.
This is now fixed in 0.10.2.1. If you can't pick that up, make sure you have these two parameters set as follows in your streams config:
final Properties props = new Properties();
// ...
// retry transient send failures such as NotLeaderForPartitionException
props.put(ProducerConfig.RETRIES_CONFIG, 10);
// effectively disable the max.poll.interval.ms timeout so long rebalances
// don't kick the consumer out of the group
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, Integer.toString(Integer.MAX_VALUE));
