I'm trying to send JSON through Kafka (JHipster/Spring Boot producer and Python consumer). I've been able to receive the message in bytes, but when I try to decode it, I always run into problems before I can extract the message.
My Producer is like this:
@RestController
@RequestMapping("/api")
public class ProducerResource {

    private MessageChannel channel;

    public ProducerResource(ProducerChannel channel) {
        this.channel = channel.messageChannel();
    }

    @GetMapping("/subscribableChannel/{count}")
    @Timed
    @JsonInclude(Include.NON_NULL)
    public void produce(@PathVariable int count) {
        Map<String, String> json = new HashMap<String, String>();
        json.put("message", "HELL YEAH");
        while (count > 0) {
            channel.send(MessageBuilder.withPayload(json.toString().getBytes()).build());
            count--;
        }
    }
}
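Note that Map.toString() produces {message=HELL YEAH}, which is not valid JSON. If the goal is to put real JSON on the channel, a minimal sketch using Jackson's ObjectMapper (an illustration, not part of the original resource) could look like this:

import java.util.Map;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.support.MessageBuilder;

public class JsonPayloadSketch {

    // Hypothetical helper: serializes the map with Jackson so the payload is
    // valid UTF-8 JSON ({"message":"HELL YEAH"}) rather than Map.toString() output.
    static void sendAsJson(MessageChannel channel, Map<String, String> json) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        byte[] payload = mapper.writeValueAsBytes(json);
        channel.send(MessageBuilder.withPayload(payload).build());
    }
}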
And my Python Consumer:
def stop_handler(signal, frame, consumer):
    print('Stopping...')
    consumer.close()
    sys.exit(0)
def read_messages(consumer):
    meta = consumer.partitions_for_topic(TOPIC)
    for msg in consumer:
        print(msg.value)
        lol = json.load(msg.value.decode("utf-8"))
        print(lol)

def main():
    consumer = KafkaConsumer(TOPIC, auto_offset_reset='earliest',
                             group_id='read', bootstrap_servers=['localhost:9092'])
    signal.signal(signal.SIGINT, lambda signal, frame: stop_handler(signal, frame, consumer))
    read_messages(consumer)
    return 0

if __name__ == "__main__":
    exit(main())
And the error that I get from Python is:
python3 scripts/simple_consumer.py
b'\xff\x01\x0bcontentType\x00\x00\x00\x0c"text/plain"{message=HELL YEAH}'
Traceback (most recent call last):
File "scripts/simple_consumer.py", line 36, in <module>
exit(main())
File "scripts/simple_consumer.py", line 31, in main
read_messages(consumer)
File "scripts/simple_consumer.py", line 23, in read_messages
lol = json.load(msg.value.decode("utf-8"))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
In my cloud configuration I only have consumer/producer headerMode = raw with Kafka. Any suggestion would be really appreciated!
UPDATE---
Here is my configuration for cloud:
spring:
  profiles:
    active: dev
    include: swagger
  devtools:
    restart:
      enabled: true
    livereload:
      enabled: false # we use gulp + BrowserSync for livereload
  jackson:
    serialization.indent_output: true
  cloud:
    stream:
      default:
        consumer:
          headerMode: raw
        producer:
          headerMode: raw
      kafka:
        binder:
          brokers: localhost
          zk-nodes: localhost
      bindings:
        messageChannel:
          destination: messageChannel
          content-type: application/json
        subscribableChannel:
          destination: subscribableChannel
Related
My question comes from this discussion: https://stackoverflow.com/a/74306116/3551820
I've added the proper configuration to enable batch consumption, but Spring still doesn't use the batch converter to convert the headers of each message (kafka_batchConvertedHeaders).
@StreamListener(
    target = "command-my-setup-input-channel"
)
public void handleMySetup(Message<List<MySetupDTO>> messages) {
    // here the headers arrive as byte[]
    List<?> batchHeaders = messages.getHeaders().get(KafkaHeaders.BATCH_CONVERTED_HEADERS, List.class);
    log.warn("Messages received: {}", messages.getPayload().size()); // size of 2
}
The bean responsible for the conversion:
@Bean("batchConverter")
BatchMessageConverter batchConverter(KafkaHeaderMapper kafkaHeaderMapperCustom) {
    BatchMessagingMessageConverter batchConv = new BatchMessagingMessageConverter();
    batchConv.setHeaderMapper(kafkaHeaderMapperCustom);
    return batchConv;
}
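The customHeaderMapper bean named in the binder configuration below is not shown in the question; for reference, a minimal sketch of such a bean (an assumption, not the original code) might be:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.support.DefaultKafkaHeaderMapper;
import org.springframework.kafka.support.KafkaHeaderMapper;

@Configuration
public class HeaderMapperConfig {

    // Hypothetical sketch of the "customHeaderMapper" referenced by
    // headerMapperBeanName; DefaultKafkaHeaderMapper maps Kafka record headers
    // to and from Spring message headers.
    @Bean("customHeaderMapper")
    public KafkaHeaderMapper customHeaderMapper() {
        return new DefaultKafkaHeaderMapper();
    }
}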
The configuration is below:
spring.cloud.stream:
  kafka:
    binder:
      autoCreateTopics: true
      autoAddPartitions: true
      healthTimeout: 10
      requiredAcks: 1
      minPartitionCount: 1
      replicationFactor: 1
      headerMapperBeanName: customHeaderMapper
    bindings:
      command-my-setup-input-channel:
        consumer:
          autoCommitOffset: false
          batch-mode: true # enabling batch-mode
          startOffset: earliest
          resetOffsets: true
          converter-bean-name: batchConverter # bean mapping
          ackMode: manual
          configuration:
            heartbeat.interval.ms: 1000
            max.poll.records: 2
            max.poll.interval.ms: 890000
            value.deserializer: com.xpto.MySetupDTODeserializer
  bindings:
    command-my-setup-input-channel:
      destination: command.my.setup
      content-type: application/json
      binder: kafka
      configuration:
        value:
          deserializer: com.xpto.MySetupDTODeserializer
      consumer:
        batch-mode: true
        startOffset: earliest
        resetOffsets: true
Error:
Bean named 'batchConverter' is expected to be of type 'org.springframework.kafka.support.converter.MessagingMessageConverter' but was actually of type 'org.springframework.kafka.support.converter.BatchMessagingMessageConverter'
Version: spring-cloud-stream 3.0.12.RELEASE
I have made two Spring Boot projects: one with a Kafka producer and the other with a listener.
Then I have a docker-compose file like this, where I create containers for Zookeeper and Kafka, plus one producer container and two consumer containers:
version: '3'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    container_name: zookeeper
    restart: always
    ports:
      - 2181:2181
  kafka:
    image: wurstmeister/kafka
    container_name: kafka
    restart: always
    ports:
      - 9092:9092
    depends_on:
      - zookeeper
    links:
      - zookeeper:zookeeper
    environment:
      KAFKA_ADVERTISED_HOST_NAME: kafka
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
  consumer1:
    image: consumer:0.0.1-SNAPSHOT
    container_name: consumer1
    depends_on:
      - kafka
    restart: always
    ports:
      - 8081:8081
    environment:
      SERVER_PORT: 8081
    links:
      - kafka:kafka
  consumer2:
    image: consumer:0.0.1-SNAPSHOT
    container_name: consumer2
    depends_on:
      - kafka
    restart: always
    ports:
      - 8082:8082
    environment:
      SERVER_PORT: 8082
    links:
      - kafka:kafka
  producer:
    image: producer:0.0.1-SNAPSHOT
    container_name: producer
    depends_on:
      - kafka
    restart: always
    ports:
      - 8080:8080
    environment:
      SERVER_PORT: 8080
    links:
      - kafka:kafka
Now to my problem. I want my consumers to consume from the same topic, which I have accomplished.
But it seems that they are not consuming the messages in the order that the producer produces them.
As you can see below, "Number: 4" is consumed before "Number: 3", for example:
producer | i: 0
consumer2 | Number: 0
producer | i: 1
consumer2 | Number: 1
producer | i: 2
consumer1 | Number: 2
producer | i: 3
producer | i: 4
consumer2 | Number: 4
producer | i: 5
consumer1 | Number: 6
producer | i: 6
consumer2 | Number: 3
producer | i: 7
producer | i: 8
producer | i: 9
consumer2 | Number: 5
producer | i: 10
consumer1 | Number: 10
producer | i: 11
My KafkaProducer class:
@Service
public class KafkaProducer {

    @Value("${topic.name.producer}")
    private String topicName;

    @Autowired
    private KafkaTemplate<String, String> kafkaStringTemplate;

    public void sendList(String word) {
        kafkaStringTemplate.send(topicName, word);
    }
}
I have a for loop that feeds it with:
for (int i = 0; i < 100; i++) {
    producer.sendList("Number: " + i);
}
In my producer project I have a TopicConfiguration:
@Configuration
public class TopicConfiguration {

    @Value(value = "${spring.kafka.producer.bootstrap-servers}")
    private String bootstrapAddress;

    @Value(value = "${topic.name.producer}")
    private String topicName;

    @Bean
    public NewTopic generalTopic() {
        return TopicBuilder.name(topicName)
                .partitions(3)
                .replicas(1)
                .build();
    }

    @Bean
    public KafkaAdmin kafkaAdmin() {
        Map<String, Object> configs = new HashMap<>();
        configs.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        return new KafkaAdmin(configs);
    }
}
application.properties file:
server.port=${SERVER_PORT}
# Producer properties
spring.kafka.producer.bootstrap-servers=kafka:9092
#spring.kafka.producer.bootstrap-servers=172.21.0.2:9092
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.group-id=group-1
topic.name.producer=test
# Common Kafka Properties
auto.create.topics.enable=true
My consumer project:
@Service
public class KafkaConsumer {

    @Value("${topic.name.consumer}")
    private String topicName;

    @KafkaListener(topics = "${topic.name.consumer}", groupId = "group-1")
    public void consumeLinks(String word) throws InterruptedException {
        System.out.println(word);
        Thread.sleep(5000);
    }
}
application.properties file:
#server.port=8081
server.port=${SERVER_PORT}
# Consumer properties
spring.kafka.consumer.bootstrap-servers=kafka:9092
#spring.kafka.consumer.bootstrap-servers=172.21.0.2:9092
#spring.kafka.consumer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
#spring.kafka.consumer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.consumer.max-poll-records=1
spring.kafka.consumer.group-id=group-1
topic.name.consumer=test
spring.kafka.consumer.auto-offset-reset=earliest
# Common Kafka Properties
auto.create.topics.enable=true
I have tried to Google this but have not found any solution, or maybe I just haven't understood how to solve it. Can someone tell me what is missing, or point me to a page for dummies on how to solve this?
Your topic has 3 partitions. There is no ordering guarantee unless you use exactly one partition. More specifically, data is only ordered within a partition; each consumer consumes data that is ordered within its assigned partitions.
To show this, try
@KafkaListener(topics = "${topic.name.consumer}", groupId = "group-1")
public void consumeLinks(
        @Payload String word,
        @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int partition) {
    System.out.println(
        "Received Message: " + word
        + " from partition: " + partition);
}
And if you use one partition, that means you can only have one consumer in that consumer group.
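Alternatively, if per-key ordering is enough, records sent with the same key always go to the same partition and are therefore consumed in order for that key. A minimal sketch (the key value "heartbeat" is an assumption for illustration):

import org.springframework.kafka.core.KafkaTemplate;

public class KeyedSendSketch {

    // Sketch only: sending every record with the same key routes them all to a
    // single partition, so one consumer sees them in production order for that key.
    static void sendInOrder(KafkaTemplate<String, String> template, String topic) {
        for (int i = 0; i < 100; i++) {
            template.send(topic, "heartbeat", "Number: " + i); // key = "heartbeat" (hypothetical)
        }
    }
}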
I'm trying to set up Logstash in Docker.
I'm using the logstash:8.0.0 image.
This is my logstash.yml:
http.host: "0.0.0.0"
xpack.monitoring.enabled: false
This is my pipeline.conf
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["http://10.135.95.164:9200"]
    index => "instameister"
    username => "elastic"
    password => ""
  }
  stdout { codec => rubydebug }
}
And this is the error I'm getting:
Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"Java::JavaLang::IllegalStateException", :message=>"Unable to configure plugins: (ConfigurationError) Something is wrong with your configuration.", :backtrace=>["org.logstash.config.ir.CompiledPipeline.<init>(CompiledPipeline.java:120)", "org.logstash.execution.JavaBasePipelineExt.initialize(JavaBasePipelineExt.java:85)", "org.logstash.execution.JavaBasePipelineExt$INVOKER$i$1$0$initialize.call(JavaBasePipelineExt$INVOKER$i$1$0$initialize.gen)", "org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:837)", "org.jruby.ir.runtime.IRRuntimeHelpers.instanceSuper(IRRuntimeHelpers.java:1169)", "org.jruby.ir.runtime.IRRuntimeHelpers.instanceSuperSplatArgs(IRRuntimeHelpers.java:1156)", "org.jruby.ir.targets.InstanceSuperInvokeSite.invoke(InstanceSuperInvokeSite.java:39)", "usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.RUBY$method$initialize$0(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:47)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:80)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:70)", "org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:333)", "org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:87)", "org.jruby.RubyClass.newInstance(RubyClass.java:939)", "org.jruby.RubyClass$INVOKER$i$newInstance.call(RubyClass$INVOKER$i$newInstance.gen)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:207)", "usr.share.logstash.logstash_minus_core.lib.logstash.pipeline_action.create.RUBY$method$execute$0(/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:50)", "usr.share.logstash.logstash_minus_core.lib.logstash.pipeline_action.create.RUBY$method$execute$0$__VARARGS__(/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:49)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:80)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:70)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:207)", "usr.share.logstash.logstash_minus_core.lib.logstash.agent.RUBY$block$converge_state$2(/usr/share/logstash/logstash-core/lib/logstash/agent.rb:376)", "org.jruby.runtime.CompiledIRBlockBody.callDirect(CompiledIRBlockBody.java:138)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:58)", "org.jruby.runtime.IRBlockBody.call(IRBlockBody.java:52)", "org.jruby.runtime.Block.call(Block.java:139)", "org.jruby.RubyProc.call(RubyProc.java:318)", "org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:105)", "java.base/java.lang.Thread.run(Thread.java:829)"]}
All I see is Unable to configure plugins: (ConfigurationError) Something is wrong with your configuration, but I have no idea what's wrong.
Instead of modifying logstash.yml, you can override the variables through the environment. Your pipeline.conf seems to be OK; the rubydebug codec is enabled by default for stdout.
So, assuming that you have a Docker Compose file, the configuration would be something like this:
logstash:
  image: docker.elastic.co/logstash/logstash
  container_name: logstash
  restart: always
  user: root
  volumes:
    - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    - ./logstash/logs/:/logstash/logs/:rw
  environment:
    - xpack.monitoring.enabled=false
    - outputs.elasticsearch=http://elasticuser:elasticuserpassword@elasticsearch:9200
  depends_on:
    - elasticsearch
In the ./logstash/pipeline directory, create a logstash.conf file with:
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => "${outputs.elasticsearch}"
  }
  stdout {
  }
}
Adapt to your needs.
The problem was that the key 'username' should be 'user'.
This is the working config:
input {
  beats {
    port => 5044
  }
}
output {
  elasticsearch {
    hosts => ["http://10.135.95.164:9200"]
    user => "elastic"
    password => ""
    index => "instameister"
    manage_template => false
  }
  stdout { codec => json_lines }
}
I am using Apache Pulsar v2.4.2. Taking pulsar-io-jdbc as a reference, I created a custom sink connector packaged as a NAR.
I run the following commands to upload the schema and create the connector.
1. Upload the AVRO schema:
bin/pulsar-admin schemas upload myjdbc --filename ./connectors/pulsar-io-jdbc-schema.conf
Here is the AVRO schema:
{
"type": "AVRO",
"schema": "{\"type\":\"record\",\"name\":\"Test\",\"fields\":[{\"name\":\"id\",\"type\":[\"null\",\"int\"]},{\"name\":\"name\",\"type\":[\"null\",\"string\"]}]}",
"properties": {}
}
2. Create the connector:
bin/pulsar-admin sinks localrun --archive ./connectors/pulsar-io-jdbc-2.4.2.nar --inputs myjdbc --name myjdbc --sink-config-file ./connectors/pulsar-io-jdbc.yaml
Once that is done, I send messages to the topic "myjdbc" via producer.SendAsync(message),
where the message is built like this:
for (int i = 0; i < count; i++)
{
    JObject obj = new JObject();
    obj.Add("id", i);
    obj.Add("name", "testing topic" + i);
    var str = obj.ToString(Formatting.None);
    var messageId = await producer.SendAsync(Encoding.UTF8.GetBytes(str));
    Console.WriteLine($"MessageId is: '{messageId}'");
}
The Pulsar localrun connector gets the following exception:
19:48:37.437 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x557fb218, L:/127.0.0.1:39488 - R:localhost/127.0.0.1:6650] Connected through proxy to target broker at 192.168.1.97:6650
19:48:37.445 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc][public/default/myjdbc] Subscribing to topic on cnx [id: 0x557fb218, L:/127.0.0.1:39488 - R:localhost/127.0.0.1:6650]
19:48:37.503 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc][public/default/myjdbc] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 0
19:48:37.557 [pulsar-client-io-1-1] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
19:48:37.715 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [myjdbc] [public/default/myjdbc] Closed consumer
19:48:37.942 [main] INFO org.apache.pulsar.functions.LocalRunner - RuntimeSpawner quit because of
java.lang.NullPointerException: null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache.get(LocalCache.java:3950) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3973) ~[com.google.guava-guava-25.1-jre.jar:?]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4957) ~[com.google.guava-guava-25.1-jre.jar:?]
at org.apache.pulsar.client.impl.schema.StructSchema.decode(StructSchema.java:94) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:72) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.schema.AutoConsumeSchema.decode(AutoConsumeSchema.java:36) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.api.Schema.decode(Schema.java:97) ~[org.apache.pulsar-pulsar-client-api-2.4.2.jar:2.4.2]
at org.apache.pulsar.client.impl.MessageImpl.getValue(MessageImpl.java:268) ~[org.apache.pulsar-pulsar-client-original-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.source.PulsarRecord.getValue(PulsarRecord.java:74) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.instance.JavaInstanceRunnable.readInput(JavaInstanceRunnable.java:473) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at org.apache.pulsar.functions.instance.JavaInstanceRunnable.run(JavaInstanceRunnable.java:246) ~[org.apache.pulsar-pulsar-functions-instance-2.4.2.jar:2.4.2]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
19:49:03.105 [function-timer-thread-5-1] ERROR org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/myjdbc-java.lang.NullPointerException Function Container is dead with exception.. restarting
The sink's write() method is also never triggered.
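For comparison, the .NET producer above publishes raw UTF-8 JSON bytes even though an AVRO schema is registered on the topic, and the stack trace shows the sink's AUTO_CONSUME schema failing inside StructSchema.decode. A minimal Java sketch of producing with a matching typed schema (the POJO, service URL, and values are assumptions based on the uploaded schema, not the original setup):

import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;

public class AvroProducerSketch {

    // Assumed POJO mirroring the uploaded schema: record "Test" with nullable id/name.
    public static class Test {
        public Integer id;
        public String name;
    }

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // assumption
                .build();

        // Producing with Schema.AVRO keeps the wire format consistent with the
        // schema registered on the "myjdbc" topic.
        Producer<Test> producer = client.newProducer(Schema.AVRO(Test.class))
                .topic("myjdbc")
                .create();

        Test msg = new Test();
        msg.id = 1;
        msg.name = "testing topic 1";
        producer.send(msg);

        producer.close();
        client.close();
    }
}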
I'm using the Kafka Java client ver 0.10.2.1. I am able to produce simple messages to Kafka for a "heartbeat" test, but I cannot consume a message from that same topic using the SDK. I am able to consume that message when I go into the Kafka CLI, so I have confirmed the message is there. Here's the function I'm using to consume from my Kafka server, with the props; I pass the message I produced to the topic only after I have confirmed the produce() was successful (I can post that function later if requested):
private def consumeFromKafka(topic: String, expectedMessage: String): Boolean = {
  val props: Properties = initProps("consumer")
  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(List(topic).asJava)
  var readExpectedRecord = false
  try {
    val records = {
      val firstPollRecs = consumer.poll(MAX_POLLTIME_MS)
      // increase timeout and try again if nothing comes back the first time in case system is busy
      if (firstPollRecs.count() == 0) firstPollRecs else {
        logger.info("KafkaHeartBeat: First poll had 0 records- trying again - doubling timeout to "
          + (MAX_POLLTIME_MS * 2) / 1000 + " sec.")
        consumer.poll(MAX_POLLTIME_MS * 2)
      }
    }
    records.forEach(rec => {
      if (rec.value() == expectedMessage) readExpectedRecord = true
    })
  } catch {
    case e: Throwable => // log error
  } finally {
    consumer.close()
  }
  readExpectedRecord
}
private def initProps(propsType: String): Properties = {
  val prop = new Properties()
  prop.put("bootstrap.servers", kafkaServer + ":" + kafkaPort)
  propsType match {
    case "producer" => {
      prop.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      prop.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      prop.put("acks", "1")
      prop.put("producer.type", "sync")
      prop.put("retries", "3")
      prop.put("linger.ms", "5")
    }
    case "consumer" => {
      prop.put("group.id", groupId)
      prop.put("enable.auto.commit", "false")
      prop.put("auto.commit.interval.ms", "1000")
      prop.put("session.timeout.ms", "30000")
      prop.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      prop.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      prop.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
      // poll just once, should only be one record for the heartbeat
      prop.put("max.poll.records", "1")
    }
  }
  prop
}
Now when I run the code, here's what it outputs in the console:
13:04:21 - Discovered coordinator serverName:9092 (id: 2147483647 rack: null) for group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e.
13:04:23 INFO o.a.k.c.c.i.ConsumerCoordinator - Revoking previously assigned partitions [] for group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e
13:04:24 INFO o.a.k.c.c.i.AbstractCoordinator - (Re-)joining group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e
13:04:25 INFO o.a.k.c.c.i.AbstractCoordinator - Successfully joined group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e with generation 1
13:04:26 INFO o.a.k.c.c.i.ConsumerCoordinator - Setting newly assigned partitions [HeartBeat_Topic.Service_5.2018-08-03.13_04_10.377-0] for group 0b8947e1-eb68-4af3-ac7b-be3f7c02e76e
13:04:27 INFO c.p.p.l.util.KafkaHeartBeatUtil - KafkaHeartBeat: First poll had 0 records- trying again - doubling timeout to 60 sec.
And then nothing else; no errors are thrown, so no records are polled. Does anyone have any idea what's preventing the consume from happening? The subscription seems to be successful, as I'm able to successfully call listTopics and list partitions with no problem.
Your code has a bug. It seems your line:
if (firstPollRecs.count() == 0)
should say this instead:
if (firstPollRecs.count() > 0)
Otherwise, you return the empty firstPollRecs and then iterate over that, which obviously yields nothing.
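For clarity, the corrected retry logic looks like this in outline (a Java sketch of the same pattern, not the original Scala; the MAX_POLLTIME_MS value is an assumption carried over from the question):

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollRetrySketch {

    static final long MAX_POLLTIME_MS = 30_000L; // assumed constant from the question

    // Keep the first poll's records if it returned anything; only when it comes
    // back empty do we retry with a doubled timeout.
    static ConsumerRecords<String, String> pollWithRetry(KafkaConsumer<String, String> consumer) {
        ConsumerRecords<String, String> firstPollRecs = consumer.poll(MAX_POLLTIME_MS);
        if (firstPollRecs.count() > 0) {
            return firstPollRecs;
        }
        return consumer.poll(MAX_POLLTIME_MS * 2);
    }
}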