I hope I'm able to explain this!
I'm starting a dockerized Java Spring Boot application that connects to a single dockerized Kafka instance.
To do this I have set up a link in the docker-compose file that allows the application to connect to the Kafka container, named kafka-cluster, on port 9092.
When I start both containers I get an error in the Java application saying that it's unable to connect to KafkaAdmin:
[AdminClient clientId=adminclient-2] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available.
But it's trying to connect to localhost/127.0.0.1.
Further up in the logs, I can see it started a connection to KafkaAdmin twice:
first:
2020-03-25 13:53:15.515 INFO 7 --- [ main] o.a.k.clients.admin.AdminClientConfig : AdminClientConfig values:
bootstrap.servers = [kafka-cluster:9092]
client.dns.lookup = default
client.id =
connections.max.idle.ms = 300000
<more properties>
and then again straight after (but on localhost):
2020-03-25 13:53:15.780 INFO 7 --- [ main] o.a.k.clients.admin.AdminClientConfig : AdminClientConfig values:
bootstrap.servers = [localhost:9092]
client.dns.lookup = default
client.id =
connections.max.idle.ms = 300000
<more properties>
This is the config:
@Bean
public KafkaAdmin kafkaAdmin() {
    Map<String, Object> configs = new HashMap<>();
    configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    return new KafkaAdmin(configs);
}

@Bean
public NewTopic sysCcukCdcAssetsCreate() {
    return new NewTopic(newPanelTopic, 1, (short) 1);
}

@Bean
public NewTopic sysCcukCdcAssetsUpdate() {
    return new NewTopic(updatedPanelTopic, 1, (short) 1);
}
where bootstrapServers = kafka-cluster:9092
I can't see why KafkaAdmin appears to have two sets of configs, but that seems to be what is causing the error.
Any guidance or suggestions massively appreciated :)
So! This was a great rubber duck.
Turns out the spring.kafka.bootstrap.servers property hadn't been set in the properties file, so it defaulted to localhost. Setting it to kafka-cluster:9092 fixed it :)
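To illustrate why the second AdminClient logged localhost:9092, here is a minimal sketch of a property lookup that falls back to a default when the key is missing. It uses plain java.util.Properties and is not Spring Boot's actual binding code. The class and method names are made up for illustration, and the default value simply mirrors what the logs above show.

```java
import java.util.Properties;

// Hypothetical illustration of "unset property falls back to a default".
// This is NOT Spring Boot's implementation; it only mimics the behaviour seen in the logs.
final class BootstrapServersLookup {
    static final String KEY = "spring.kafka.bootstrap.servers"; // property name as written in the post
    static final String DEFAULT = "localhost:9092";             // default observed in the second log block

    // Returns the configured value, or the default when the key is absent.
    static String resolve(Properties props) {
        return props.getProperty(KEY, DEFAULT);
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        System.out.println("unset -> " + resolve(unset));

        Properties set = new Properties();
        set.setProperty(KEY, "kafka-cluster:9092");
        System.out.println("set   -> " + resolve(set));
    }
}
```

In the setup above, the hand-written KafkaAdmin bean got its value from bootstrapServers, while anything configured from the Spring property saw only the fallback.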
I'm using Java 11 and kafka-client 2.0.0.
I'm using the following code to create a consumer:
public Consumer createConsumer(Properties properties, String regex) {
    log.info("Creating consumer and listener..");
    Consumer consumer = new KafkaConsumer<>(properties);
    ConsumerRebalanceListener listener = new ConsumerRebalanceListener() {
        @Override
        public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
            log.info("The following partitions were revoked from consumer : {}", Arrays.toString(partitions.toArray()));
        }
        @Override
        public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
            log.info("The following partitions were assigned to consumer : {}", Arrays.toString(partitions.toArray()));
        }
    };
    consumer.subscribe(Pattern.compile(regex), listener);
    log.info("consumer subscribed");
    return consumer;
}
}
My poll loop is in a different place in the code:
public <K, V> void startWorking(Consumer<K, V> consumer) {
try {
while (true) {
ConsumerRecords<K, V> records = consumer.poll(600);
if (records.count() > 0) {
log.info("Polled {} records", records.count());
} else {
log.info("polled 0 records.. going to sleep..");
Thread.sleep(200);
}
}
} catch (WakeupException | InterruptedException e) {
log.error("Consumer is shutting down", e);
} finally {
consumer.close();
}
}
When I run the code and call this function, the consumer is created and the log contains the following messages:
Creating consumer and listener..
consumer subscribed
polled 0 records.. going to sleep..
polled 0 records.. going to sleep..
polled 0 records.. going to sleep..
The log doesn't contain any info regarding the partition assignment/revocation.
In addition I'm able to see in the log the properties that the consumer uses (group.id is set) :
2020-07-09 14:31:07.959 DEBUG 7342 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values:
auto.commit.interval.ms = 5000
auto.offset.reset = latest
bootstrap.servers = [server1:9092]
check.crcs = true
client.id =
group.id=mygroupid
key.deserializer=..
value.deserializer=..
So I tried the kafka-console-consumer with the same configuration to consume from one of the topics that the regex (mytopic.*) should match (in this case mytopic-1):
/usr/bin/kafka-console-consumer.sh --bootstrap-server server1:9092 --topic mytopic-1 --property print.timestamp=true --consumer.config /data/scripts/kafka-consumer.properties --from-beginning
I have a poll loop in another part of my code that times out every 10 minutes. So the bottom line: the problem is that partitions aren't assigned to the Java consumer. The log statements inside the listener never run and the consumer doesn't have any partitions to read from.
It seems that I was missing the SSL property in my properties file. Don't forget to specify security.protocol=ssl if you use SSL. It seems that the kafka-clients API doesn't throw an exception if the Kafka broker uses SSL and you try to access it without the SSL parameter configured.
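As a rough sketch of what the fixed consumer configuration could look like, the snippet below builds the Properties object with security.protocol set. The truststore path and password are placeholders I made up; the post does not say which SSL settings were actually needed beyond security.protocol, so treat the ssl.truststore.* lines as an assumption about a typical setup.

```java
import java.util.Properties;

// Sketch of consumer properties for an SSL-enabled broker.
// Only security.protocol comes from the post; the truststore values are placeholders.
final class SslConsumerProps {
    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Without this line the client talks plaintext to an SSL listener and,
        // as described above, polls return nothing instead of failing loudly.
        props.setProperty("security.protocol", "SSL");
        // Assumed, illustrative values: replace with your own truststore settings.
        props.setProperty("ssl.truststore.location", "/path/to/truststore.jks");
        props.setProperty("ssl.truststore.password", "changeit");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("server1:9092", "mygroupid");
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The resulting Properties object can be passed to createConsumer(properties, regex) from the question.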
I am using Apache Pulsar v2.4.2. Using pulsar-io-jdbc as a reference, I created a custom sink connector packaged as a NAR.
I run the following commands to upload the schema and create the connector:
1. Upload the AVRO schema.
bin/pulsar-admin schemas upload myjdbc --filename ./connectors/pulsar-io-jdbc-schema.conf
Here is the avro schema:
{
"type": "AVRO",
"schema": "{\"type\":\"record\",\"name\":\"Test\",\"fields\":[{\"name\":\"id\",\"type\":[\"null\",\"int\"]},{\"name\":\"name\",\"type\":[\"null\",\"string\"]}]}",
"properties": {}
}
2. Create the connector.
bin/pulsar-admin sinks localrun --archive ./connectors/pulsar-io-jdbc-2.4.2.nar --inputs myjdbc --name myjdbc --sink-config-file ./connectors/pulsar-io-jdbc.yaml
Then I send messages to the topic "myjdbc" with producer.SendAsync(message),
where the messages are built like this:
for (int i = 0; i < count; i++)
{
JObject obj = new JObject();
obj.Add("id", i);
obj.Add("name", "testing topic" + i);
var str = obj.ToString(Formatting.None);
var messageId = await producer.SendAsync(Encoding.UTF8.GetBytes(str));
Console.WriteLine($"MessageId is: '{messageId}'");
}
The Pulsar localrun connector then fails with the following exception:
19:48:37.437 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x557fb218, L:/127.0.0.1:39488 - R:localhost/127.0.0.1:6650] Connected through proxy to target broker at 192.168.1.97:6650
19:48:37.445 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc][public/default/myjdbc] Subscribing to topic on cnx [id: 0x557fb218, L:/127.0.0.1:39488 - R:localhost/127.0.0.1:6650]
19:48:37.503 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc][public/default/myjdbc] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 0
19:48:37.557 [pulsar-client-io-1-1] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
19:48:37.715 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc] [public/default/myjdbc] Closed consumer
19:48:37.942 [main] INFO org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of
java.lang.NullPointerException: null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache.get(LocalCache.java:3950) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3973) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4957) ~[com.google.guava-guava-25.1-jre.jar:?]
at org.apache.pulsar.client.impl.schema.StructSchema.decode(StructSchema.java:94) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:72) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:36) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.api.Schema.decode(Schema.java:97) ~[org.apache.pulsar-pulsar-client-api-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.MessageImpl.getValue(MessageImpl.java:268) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.source.PulsarRecord.getValue(PulsarRecord.java:74) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.instance.JavaInstanceRunnable.readInput(JavaInstanceRunnable.java:473) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.instance.JavaInstanceRunnable.run(JavaInstanceRunnable.java:246) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
19:49:03.105 [function-timer-thread-5-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/myjdbc-java.lang.NullPointerException Function Container is dead with exception.. restarting
Also, the write() method is never triggered.
I'm getting started with Apache Kafka with a simple producer and consumer app in Java. I'm using kafka-clients version 0.10.0.1 and running it on a Mac.
I created a topic named replicated_topic_partitioned with 3 partitions and a replication factor of 3.
I started the zookeeper at port 2181. I started three brokers with id 1, 2 and 3 on ports 9092, 9093 and 9094 respectively.
Here's the output of the describe command
kafka_2.12-2.3.0/bin/kafka-topics.sh --describe --topic replicated_topic_partitioned --bootstrap-server localhost:9092
Topic:replicated_topic_partitioned PartitionCount:3 ReplicationFactor:3 Configs:segment.bytes=1073741824
Topic: replicated_topic_partitioned Partition: 0 Leader: 3 Replicas: 3,1,2 Isr: 3,1,2
Topic: replicated_topic_partitioned Partition: 1 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: replicated_topic_partitioned Partition: 2 Leader: 2 Replicas: 2,3,1 Isr: 2,3,1
I wrote a simple producer and a consumer code. The producer ran successfully and published the messages. But when I start the consumer, the poll call just waits indefinitely. On debugging, I found that it keeps on looping at the awaitMetadataUpdate method on the ConsumerNetworkClient.
Here is the code for the producer and the consumer:
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> myProducer = new KafkaProducer<>(properties);
DateFormat dtFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss:SSS");
String topic = "replicated_topic_partitioned";
int numberOfRecords = 10;
try {
for (int i = 0; i < numberOfRecords; i++) {
String message = String.format("Message: %s sent at %s", Integer.toString(i), dtFormat.format(new Date()));
System.out.println("Sending " + message);
myProducer.send(new ProducerRecord<String, String>(topic, message));
}
} catch (Exception e) {
e.printStackTrace();
} finally {
myProducer.close();
}
Consumer.java
Properties properties = new Properties();
properties.put("bootstrap.servers", "localhost:9092");
properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.put("group.id", UUID.randomUUID().toString());
properties.put("auto.offset.reset", "earliest");
KafkaConsumer<String, String> myConsumer = new KafkaConsumer<>(properties);
String topic = "replicated_topic_partitioned";
myConsumer.subscribe(Collections.singletonList(topic));
try {
while (true){
ConsumerRecords<String, String> records = myConsumer.poll(1000);
printRecords(records);
}
} finally {
myConsumer.close();
}
Adding some key-fields from server.properties
broker.id=1
host.name=localhost
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs-1
num.partitions=1
num.recovery.threads.per.data.dir=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
The server.properties files for the other two brokers were copies of the above with broker.id, the port and log.dirs changed.
This did not work for me:
Kafka 0.9.0.1 Java Consumer stuck in awaitMetadataUpdate()
But, if I start the consumer from the command line passing a partition, it successfully reads the messages for that partition. But it does not receive any message when just a topic is specified.
Works:
kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092
--from-beginning --partition 1
Does not work:
kafka_2.12-2.3.0/bin/kafka-console-consumer.sh --topic replicated_topic_partitioned --bootstrap-server localhost:9092
--from-beginning
NOTE: The above consumer works perfectly for a topic with replication factor equals 1.
Question:
Why does the Java consumer not read any messages for a topic with a replication factor greater than one, even when it is assigned to a partition (e.g. myConsumer.assign(Collections.singletonList(new TopicPartition(topic, 2))))?
Why does the console consumer read messages only when passed a partition (again, it works for a topic with a replication factor of one)?
So, you're sending 10 records, but all 10 records have the SAME key:
for (int i = 0; i < numberOfRecords; i++) {
String message = String.format("Message: %s sent at %s", Integer.toString(i), dtFormat.format(new Date()));
System.out.println("Sending " + message);
myProducer.send(new ProducerRecord<String, String>(topic, message)); <--- KEY=topic
}
Unless told otherwise (by setting a partition directly on the ProducerRecord), the partition into which a record is delivered is determined by something like:
partition = murmur2(serialize(key)) % numPartitions
So the same key means the same partition.
Have you tried searching for your 10 records on partitions 0 and 2, maybe?
If you want a better "spread" of records among partitions, either use a null key (you'd get round robin) or a variable key.
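The "hash of the key modulo the partition count" rule can be sketched in plain Java. Note the loud assumption: this uses Arrays.hashCode as a stand-in hash, not Kafka's murmur2, so the partition numbers it prints will not match what a real producer picks. It only demonstrates that an identical key always maps to the same partition while varying keys spread out. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration of key-based partition selection.
// Assumption: Arrays.hashCode stands in for murmur2, so results differ from real Kafka.
final class PartitionSketch {
    static int partitionFor(String key, int numPartitions) {
        byte[] serialized = key.getBytes(StandardCharsets.UTF_8); // "serialize(key)"
        int hash = Arrays.hashCode(serialized);                    // stand-in for murmur2
        // Mask the sign bit so the result of the modulo is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        // The same key repeated: always the same partition.
        for (int i = 0; i < 3; i++) {
            System.out.println("fixed key    -> partition "
                    + partitionFor("replicated_topic_partitioned", numPartitions));
        }
        // A variable key: records can land on different partitions.
        for (int i = 0; i < 5; i++) {
            System.out.println("key-" + i + "        -> partition "
                    + partitionFor("key-" + i, numPartitions));
        }
    }
}
```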
Disclaimer: This is not an answer.
The Java consumer is now working as expected. I did not do any change to the code or the configuration. The only thing I did was to restart my Mac. This caused the kafka-logs folder (and the zookeeper folder too I guess) to be deleted.
I re-created the topic (with the same command - 3 partitions, replication factor of 3). Then re-started the brokers with the same configuration - no advertised.host.name or advertised.port config.
So, recreation of the kafka-logs and topics remediated something that was causing an issue earlier.
My only suspicion is a consumer that wasn't terminated properly. Initially I ran the consumer code without the close call on the consumer in the finally block. I also had the same group.id. Maybe all 3 partitions were assigned to consumers that weren't properly terminated or closed. This is just a guess.
But even calling myConsumer.position(new TopicPartition(topic, 2)) did not return a response earlier when I assigned the consumer to a partition. It was looping in the same awaitMetadataUpdate method.
I am trying to create a Kafka topic using the Java API, but I'm getting a LEADER_NOT_AVAILABLE error.
Code:
int partition = 0;
ZkClient zkClient = null;
try {
String zookeeperHosts = "localhost:2181"; // If multiple zookeeper then -> String zookeeperHosts = "192.168.20.1:2181,192.168.20.2:2181";
int sessionTimeOutInMs = 15 * 1000; // 15 secs
int connectionTimeOutInMs = 10 * 1000; // 10 secs
zkClient = new ZkClient(zookeeperHosts, sessionTimeOutInMs, connectionTimeOutInMs, ZKStringSerializer$.MODULE$);
String topicName = "mdmTopic5";
int noOfPartitions = 2;
int noOfReplication = 1;
Properties topicConfiguration = new Properties();
AdminUtils.createTopic(zkClient, topicName, noOfPartitions, noOfReplication, topicConfiguration);
} catch (Exception ex) {
ex.printStackTrace();
} finally {
if (zkClient != null) {
zkClient.close();
}
}
Error:
[2017-10-19 12:14:42,263] WARN Error while fetching metadata with correlation id 1 : {mdmTopic5=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-10-19 12:14:42,370] WARN Error while fetching metadata with correlation id 3 : {mdmTopic5=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-10-19 12:14:42,479] WARN Error while fetching metadata with correlation id 4 : {mdmTopic5=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
Does Kafka 0.11.0.1 support AdminUtils? Please let me know how to create a topic in this version.
Thanks in advance.
Since Kafka 0.11 there is a proper Admin API for creating (and deleting) topics, and I'd recommend using it instead of connecting directly to Zookeeper.
See AdminClient.createTopics(): http://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/admin/AdminClient.html#createTopics(java.util.Collection)
Generally, LEADER_NOT_AVAILABLE points to network issues rather than issues with your code.
Try:
telnet host port to see if you can connect to all required hosts/ports from your machine.
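If telnet isn't installed, the same reachability check can be sketched in Java with a plain TCP connect. The class name, the timeout, and the host/port pairs in main are illustrative assumptions (localhost:2181 is the Zookeeper address from the question, 9092 is only the conventional broker port). A successful connect shows the port is reachable, not that the broker behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal "telnet host port" equivalent: can we open a TCP connection at all?
final class PortCheck {
    // Returns true if a TCP connection could be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed endpoints for illustration; replace with your own hosts and ports.
        String[][] endpoints = { {"localhost", "2181"}, {"localhost", "9092"} };
        for (String[] endpoint : endpoints) {
            int port = Integer.parseInt(endpoint[1]);
            System.out.println(endpoint[0] + ":" + port + " reachable = "
                    + canConnect(endpoint[0], port, 2000));
        }
    }
}
```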
However, the latest approach is to use the BOOTSTRAP_SERVERS while creating topics.
A working version of the topic creation code in Scala would be as follows.
Add the required kafka-clients dependency using sbt:
// https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "2.1.1"
The code for topic creation in Scala:
import java.util.Arrays
import java.util.Properties
import org.apache.kafka.clients.admin.NewTopic
import org.apache.kafka.clients.admin.{AdminClient, AdminClientConfig}
class CreateKafkaTopic {
def create(): Unit = {
val config = new Properties()
config.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "192.30.1.5:9092")
val localKafkaAdmin = AdminClient.create(config)
val partitions = 3
val replication = 1.toShort
val topic = new NewTopic("integration-02", partitions, replication)
val topics = Arrays.asList(topic)
val topicStatus = localKafkaAdmin.createTopics(topics).values()
//topicStatus.values()
println(topicStatus.keySet())
}
}
Hope it helps.
I need to assign different values of memory for each new worker. So I tried changing memory for each bolt and spout. I am currently using a custom scheduler also. Here is my approach to the problem.
MY CODE:
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new EmailSpout(), 1).addConfiguration("node", "zoo1").setMemoryLoad(512.0);
builder.setBolt("increment1", new IncrementBolt(), PARALLELISM).shuffleGrouping("spout").addConfiguration("node", "zoo2").setMemoryLoad(2048.0);
builder.setBolt("increment2", new IncrementBolt(), PARALLELISM).shuffleGrouping("increment1").addConfiguration("node", "zoo3").setMemoryLoad(2048.0);
builder.setBolt("increment3", new IncrementBolt(), PARALLELISM).shuffleGrouping("increment2").addConfiguration("node", "zoo4").setMemoryLoad(2048.0);
builder.setBolt("output", new OutputBolt(), 1).globalGrouping("increment2").addConfiguration("node", "zoo1").setMemoryLoad(512.0);
Config conf = new Config();
conf.setDebug(false);
conf.setNumWorkers(4);
StormSubmitter.submitTopologyWithProgressBar("Microbenchmark", conf, builder.createTopology());
MY STORM.YAML:
storm.zookeeper.servers:
- "zoo1"
storm.zookeeper.port: 2181
nimbus.seeds: ["zoo1"]
storm.local.dir: "/home/ubuntu/eranga/storm-data"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
storm.scheduler: "org.apache.storm.scheduler.NodeBasedCustomScheduler"
supervisor.scheduler.meta:
node: "zoo4"
worker.profiler.enabled: true
worker.profiler.childopts: "-XX:+UnlockCommercialFeatures -XX:+FlightRecorder"
worker.profiler.command: "flight.bash"
worker.heartbeat.frequency.secs: 1
worker.childopts: "-Xmx2048m -Xms2048m -Djava.net.preferIPv4Stack=true -Dorg.xml.sax.driver=com.sun.org.apache.xerces.internal.parsers.SAXParser -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Djavax.xml.parsers.SAXParserFactory=com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"
When I submit the topology I get the following error.
ERROR:
Exception in thread "main" java.lang.IllegalArgumentException: Topology will not be able to be successfully scheduled: Config TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB=768.0 < 2048.0 (Largest memory requirement of a component in the topology). Perhaps set TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB to a larger amount
at org.apache.storm.StormSubmitter.validateTopologyWorkerMaxHeapSizeMBConfigs(StormSubmitter.java:496)
Any suggestions?
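Try using this:
import java.util.List;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.StormTopology;

public class TopologyExecuter {
    // Submits each topology with its own worker JVM heap settings.
    public static void submitAll(List<StormTopology> stormTopologyObjects) throws Exception {
        for (StormTopology stormTopologyObject : stormTopologyObjects) {
            Config topologyConf = new Config();
            topologyConf.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx512m -Xms256m");
            StormSubmitter.submitTopology("topology name", topologyConf, stormTopologyObject);
        }
    }
}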
Did you try following the advice from the error message?
Perhaps set TOPOLOGY_WORKER_MAX_HEAP_SIZE_MB to a larger amount
Try adding this to storm.yaml (note the YAML key: value syntax):
topology.worker.max.heap.size.mb: 2048.0
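To make the error message concrete, here is a small sketch of the kind of check that fails at submit time: the largest memory request of any component must fit inside the worker's maximum heap. This is my own approximation of the rule stated in the exception text, not Storm's actual validateTopologyWorkerMaxHeapSizeMBConfigs code. The class name is hypothetical, and the component names and sizes are taken from the topology in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Approximation of the submit-time rule described by the exception message.
// Not Storm's implementation; names here are illustrative only.
final class HeapCheckSketch {
    static void validate(double workerMaxHeapMb, Map<String, Double> componentMemoryMb) {
        double largest = 0.0;
        for (double mb : componentMemoryMb.values()) {
            largest = Math.max(largest, mb);
        }
        if (workerMaxHeapMb < largest) {
            throw new IllegalArgumentException("worker max heap " + workerMaxHeapMb
                    + " MB < largest component requirement " + largest + " MB");
        }
    }

    public static void main(String[] args) {
        Map<String, Double> memory = new LinkedHashMap<>();
        memory.put("spout", 512.0);
        memory.put("increment1", 2048.0);
        memory.put("increment2", 2048.0);
        memory.put("increment3", 2048.0);
        memory.put("output", 512.0);

        try {
            validate(768.0, memory); // the value reported in the error above
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        validate(2048.0, memory);    // passes once the limit covers the largest component
        System.out.println("accepted with 2048.0 MB");
    }
}
```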