Kafka 8.2.2 Dynamic Topic drops first event

EDIT : I am seeing the same exact behavior with the Kafka 9 Consumer API also.
I have a simple Kafka 8.2.2 Producer with the topic auto-creation property enabled. It creates a new topic when an event for a non-existent topic is sent, but the event that triggers the topic creation does not end up in Kafka, and the RecordMetadata returned shows no errors.
public void receiveEvent(@RequestBody EventWrapper events) throws InterruptedException, ExecutionException, TimeoutException {
log.info("Sending " + events.getEvents().size() + " Events ");
for (Event event : events.getEvents()) {
log.info("Sending Event - " + event);
ProducerRecord<String, String> record = new ProducerRecord<>(event.getTopic(), event.getData());
Future<RecordMetadata> ack = eventProducer.send(record);
log.info("ACK - " + ack.get());
}
log.info("SENT!");
}
I have a program that polls for new topics (I wasn't happy with the dynamic/regex topic code in Kafka 8). It finds the new topic and subscribes, and it does see subsequent events, but never that first event.
I also tried the kafka-console-consumer script and it shows exactly the same behavior: the first event is never seen, and after that events start flowing.
Any ideas?

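It turns out there is a consumer property you can set: props.put("auto.offset.reset", "earliest");
Without it, a consumer that has no committed offset starts reading from the end of the topic, so an event written before the consumer subscribed is skipped. After setting this, the Consumer does receive the first event put on the topic.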
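A minimal sketch of the consumer configuration this answer describes. It uses only java.util.Properties so it runs without the Kafka client on the classpath; the class name, broker address, and group id are assumptions for illustration, and in a real application the properties would be passed to the KafkaConsumer constructor. Note that the old 0.8 high-level consumer expects smallest/largest for this setting, while the newer consumer API expects earliest/latest.

```java
import java.util.Properties;

public class EarliestOffsetConfig {

    // Builds consumer properties that make a group with no committed offset
    // start from the beginning of a topic instead of from its end.
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The setting from the answer: read from the earliest available offset
        // when the group has no committed position yet.
        props.put("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" and "event-readers" are placeholder values.
        Properties props = consumerProps("localhost:9092", "event-readers");
        System.out.println("auto.offset.reset=" + props.getProperty("auto.offset.reset"));
    }
}
```

The setting only applies when the group has no committed offset (or the committed offset is out of range), so an existing group that has already committed will keep resuming from its stored position.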

Related

How to get the ConsumerRecord object in StreamListener Consumer code

I want to enable manual commits for my consumer, and for that I have the code and configuration below. Here I am trying to commit the offset manually and to handle the case where the signIn client throws an exception. Manual committing works fine, but with this code the message that failed to process is not consumed again. To fix that, I want to call the seek method so that the same failed offset is consumed again:
consumer.seek(new TopicPartition(atCommunityTopic, communityFeed.partition()), communityFeed.offset());
The actual problem is where to get the partition and offset details from. If I could somehow get the ConsumerRecord object along with the message, this would work.
spring.cloud.stream.kafka.bindings.atcommnity.consumer.autoCommitOffset=false
And Below is the consumer code through StreamListener
@StreamListener(ConsumerConstants.COMMUNITY_IN)
public void handleCommFeedConsumer(
@Payload Account consumerRecords,
@Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer,
@Header(KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment acknowledgment) {
consumerRecords.forEach(communityFeed -> {
try{
AccountClient.signIn(
AccountIn.builder()
.Id(atCommunityEvent.getId())
.build());
log.debug("Calling Client for Id : "
+ communityEvent.getId());
}catch(RuntimeException ex){
log.info("");
//consumer.seek(new TopicPartition(communityTopic,communityFeed.partition()),communityFeed.offset());
return;
}
acknowledgment.acknowledge();
});
}

See https://docs.spring.io/spring-kafka/docs/current/reference/html/#consumer-record-metadata
@Header(KafkaHeaders.PARTITION_ID) int partition
@Header(KafkaHeaders.OFFSET) long offset
IMPORTANT
Seeking the consumer yourself might not do what you want because the container may already have other records after this one; it's best to throw an exception and the error handler will do the seeks for you.

What settings are required to retry a message from GCP Pub/Sub in Java

I am new to GCP pub/sub and trying to resend a message which is not acknowledged (ack/nack). In subscription at GCP console dashboard, I have mentioned:
In my java code, I have created a subscriber
public Subscriber createSubscriber(String subscriptionId, MessageReceiver receiver) throws MessagingException {
Subscriber subscriber = null;
ProjectSubscriptionName subscriptionName = null;
String projectId = getProjectId();
if (Objects.isNull(projectId) || Objects.isNull(subscriptionId)) {
throw new MessagingException(MessagingErrorCodes.MIX90810
+ " Project Id/Subscription Id is null for subscriptionId = " + subscriptionId + " projectId= "
+ projectId, MessagingErrorCodes.MIX90810);
}
try {
subscriptionName = ProjectSubscriptionName.of(projectId, subscriptionId);
subscriber = Subscriber.newBuilder(subscriptionName, receiver).setExecutorProvider(getExecutorProvider()).build();
} catch (Exception e) {
throw new MessagingException(MessagingErrorCodes.MAX34540
+ " Error occurred while creating the subscriber for the subscriptionId = " + subscriptionId
+ "projectId " + projectId + "subscriptionName= " + subscriptionName,
MessagingErrorCodes.MAX34540, e);
}
return subscriber;
}
I receive messages in my receiveMessage(PubsubMessage message, AckReplyConsumer consumer) method the first time, but they are not delivered again if I do not acknowledge them. If I send a nack, however, the message is delivered again.
@Service
public class MyMessageReceiver implements MessageReceiver {
@Override
public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
System.out.println(message.getMessageId());
}
}
Do I need any other configuration to enable retries when the message is simply not acknowledged?

Regarding the retry policy, the documentation says that Pub/Sub tries to redeliver the message, only if the acknowledgement deadline expires or if the subscriber nacks the message. Once the acknowledgement deadline passes, the message becomes a candidate to be redelivered. The redelivery may not be immediate as redelivery is performed on a best effort basis.
As already mentioned in the comments, the DEFAULT_MAX_ACK_EXTENSION_PERIOD is set to 60 minutes in Subscriber.java, which is the cause of this delay. The Ack deadline will continue to be extended (by the client library as a native functionality) until this duration is reached. Meaning, the unacked message is leased by the subscriber for 60 minutes and is not eligible to be redelivered within this period. The setMaxAckExtensionPeriod(Duration maxAckExtensionPeriod) is used to set a custom value to the maximum period a message’s acknowledgement deadline will be extended to.
Please also note that none of these values is a guarantee that the message will not be redelivered within that time frame. It is possible for the message to be redelivered before maxAckExtensionPeriod due to network or server blips.
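The timing described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not the client library's actual logic: it assumes the client keeps extending the lease until the maximum extension period is reached, and that the plain acknowledgement deadline applies when that period is shorter. The class and method names are hypothetical, and as noted above, real redelivery can happen earlier or later than the computed value.

```java
import java.time.Duration;

public class AckLeaseSketch {

    // Toy model: how long an unacked (and un-nacked) message stays leased by
    // the subscriber before it becomes a candidate for redelivery.
    static Duration leaseHeldFor(Duration ackDeadline, Duration maxAckExtensionPeriod) {
        // If the client may extend beyond the initial deadline, the lease lasts
        // until the maximum extension period; otherwise the deadline applies.
        return maxAckExtensionPeriod.compareTo(ackDeadline) > 0
                ? maxAckExtensionPeriod
                : ackDeadline;
    }

    public static void main(String[] args) {
        Duration ackDeadline = Duration.ofSeconds(10); // assumed subscription setting

        // With the 60-minute default the message is held for about an hour.
        System.out.println(leaseHeldFor(ackDeadline, Duration.ofMinutes(60)));

        // With a custom 30-second maximum it becomes eligible much sooner.
        System.out.println(leaseHeldFor(ackDeadline, Duration.ofSeconds(30)));
    }
}
```

In the real client, the shorter period would be configured through setMaxAckExtensionPeriod on the Subscriber builder, as mentioned in the answer.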

Consumer group isn't being created in control center - Kafka

I'm currently trying to set up a consumer to consume messages from a topic. My log says it subscribes to the topic successfully
[Consumer clientId=consumer-1, groupId=consumer-group] Subscribed to topic(s): MY-TOPIC
and it clearly shows it is part of a group, but when I go to the control center I can't find that group, although I can find the topic that I am subscribed to. It isn't consuming records from the topic either, which I attribute to not being part of a valid group. I know it is polling the correct topic, and I know there are records on the topic because I am constantly putting them there.
Here is my start method
@PostConstruct public void start()
{
// check if the config indicates whether to start the daemon or not
if (!parseBoolean(maskBlank(shouldStartConsumer, "true")))
{
System.err.println("CONSUMER DISABLED");
logger.warn("consumer not starting -- see value of " + PROP_EXTRACTOR_START_CONSUMER);
return;
}
System.err.println("STARTING CONSUMER");
Consumer<String, String> consumer = this.createConsumer(kafkaTopicName,
StringDeserializer.class, StringDeserializer.class);
Thread daemon = new Thread(() -> {
while (true)
{
ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(10));
if (records.count() > 0) // IS ALWAYS 0, Poll doesn't return records
{
printRecord(records);
records.iterator().forEachRemaining(r -> {
System.err.println("received record: " + r);
});
}
else { logger.debug("KafkaTopicConsumer::consumeMessage -- No messages found in topic {}", kafkaTopicName); }
}
});
daemon.setName(kafkaTopicName);
daemon.setDaemon(true);
daemon.start();
}
Note: createConsumer method is just adding all of my config settings and where I subscribe to my topic.
I have a feeling it has something to do with the thread... I can post some of my config if that would help as well just leave a comment. Thanks

Apache Kafka System Error Handling

We are trying to implement Kafka as our message broker solution. We are deploying our Spring Boot microservices in IBM Bluemix, whose internal message broker implementation is Kafka version 0.10. Since my experience is more on the JMS and ActiveMQ side, I was wondering what the ideal way is to handle system-level errors in the Java consumers.
Here is how we have implemented it currently
Consumer properties
enable.auto.commit=false
auto.offset.reset=latest
We are using the default properties for
max.partition.fetch.bytes
session.timeout.ms
Kafka Consumer
We are spinning up 3 threads per topic all having the same groupId, i.e one KafkaConsumer instance per thread. We have only one partition as of now. The consumer code looks like this in the constructor of the thread class
kafkaConsumer = new KafkaConsumer<String, String>(properties);
final List<String> topicList = new ArrayList<String>();
topicList.add(properties.getTopic());
kafkaConsumer.subscribe(topicList, new ConsumerRebalanceListener() {
@Override
public void onPartitionsRevoked(final Collection<TopicPartition> partitions) {
}
@Override
public void onPartitionsAssigned(final Collection<TopicPartition> partitions) {
try {
logger.info("Partitions assigned, consumer seeking to end.");
for (final TopicPartition partition : partitions) {
final long position = kafkaConsumer.position(partition);
logger.info("current Position: " + position);
logger.info("Seeking to end...");
kafkaConsumer.seekToEnd(Arrays.asList(partition));
logger.info("Seek from the current position: " + kafkaConsumer.position(partition));
kafkaConsumer.seek(partition, position);
}
logger.info("Consumer can now begin consuming messages.");
} catch (final Exception e) {
logger.error("Consumer can now begin consuming messages.");
}
}
});
The actual reading happens in the run method of the thread
try {
// Poll on the Kafka consumer every second.
final ConsumerRecords<String, String> records = kafkaConsumer.poll(1000);
// Iterate through all the messages received and print their
// content.
for (final TopicPartition partition : records.partitions()) {
final List<ConsumerRecord<String, String>> partitionRecords = records.records(partition);
logger.info("consumer is alive and is processing "+ partitionRecords.size() +" records");
for (final ConsumerRecord<String, String> record : partitionRecords) {
logger.info("processing topic "+ record.topic()+" for key "+record.key()+" on offset "+ record.offset());
final Class<? extends Event> resourceClass = eventProcessors.getResourceClass();
final Object obj = converter.convertToObject(record.value(), resourceClass);
if (obj != null) {
logger.info("Event: " + obj + " acquired by " + Thread.currentThread().getName());
final CommsEvent event = resourceClass.cast(converter.convertToObject(record.value(), resourceClass));
final MessageResults results = eventProcessors.processEvent(event
);
if ("Success".equals(results.getStatus())) {
// commit the processed message which changes
// the offset
kafkaConsumer.commitSync();
logger.info("Message processed sucessfully");
} else {
kafkaConsumer.seek(new TopicPartition(record.topic(), record.partition()), record.offset());
logger.error("Error processing message : {} with error : {},resetting offset to {} ", obj,results.getError().getMessage(),record.offset());
break;
}
}
}
}
// TODO add return
} catch (final Exception e) {
logger.error("Consumer has failed with exception: " + e, e);
shutdown();
}
You will notice the EventProcessor, a service class that processes each record and in most cases commits the record to the database. If the processor throws an error (a system exception or a ValidationException), we do not commit but programmatically seek to that offset, so that the subsequent poll returns records from that offset for that group id.
The question is whether this is the right approach. If we get an error and reset the offset, then no other message is processed until the error is fixed. This might work for system errors such as not being able to connect to the DB, but if the problem is only with that event and not others, then to process this one record we won't be able to process any other record. We thought of the concept of an ErrorTopic: when we get an error, the consumer publishes that event to the ErrorTopic and in the meantime keeps processing subsequent events. But it looks like we are trying to bring the design concepts of JMS (due to my previous experience) into Kafka, and there may be a better way to handle errors in Kafka. Also, reprocessing from the error topic may change the sequence of messages, which we don't want in some scenarios.
Please let me know how anyone has handled this scenario in their projects while following Kafka conventions.
-Tatha

if the problem is only with that event and not others to process this one record we wont be able to process any other record
That's correct, and your suggestion to use an error topic seems a viable one.
I also noticed that with your handling of onPartitionsAssigned you essentially do not use the consumer's committed offset, as it seems you'll always seek to the end.
If you want to restart from the last successfully committed offset, you should not perform a seek.
Finally, I'd like to point out, though it looks like you know this, that having 3 consumers in the same group subscribed to a single partition means that 2 out of 3 will be idle.
HTH
Edo
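The error-topic idea discussed above can be sketched in plain Java without the Kafka client. Everything here is hypothetical scaffolding: records are plain strings, the processor is a Predicate, and the "parked" list stands in for publishing to an error topic. It only illustrates the control flow of parking a failing record and carrying on with the rest of the batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ErrorTopicSketch {

    // Outcome of one batch: records handled normally, and records that
    // failed and were parked (standing in for a publish to an error topic).
    static final class BatchResult {
        final List<String> processed = new ArrayList<>();
        final List<String> parked = new ArrayList<>();
    }

    // Processes records in order. A record that fails, either by returning
    // false or by throwing, is parked and processing continues.
    static BatchResult processBatch(List<String> records, Predicate<String> processor) {
        BatchResult result = new BatchResult();
        for (String record : records) {
            boolean ok;
            try {
                ok = processor.test(record);
            } catch (RuntimeException e) {
                ok = false; // treat an exception as a record-level failure
            }
            if (ok) {
                result.processed.add(record);
            } else {
                result.parked.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("a", "bad", "c");
        BatchResult r = processBatch(batch, rec -> !rec.equals("bad"));
        System.out.println("processed=" + r.processed + " parked=" + r.parked);
    }
}
```

With this shape the consumer can commit past the parked record, so one bad event no longer blocks the partition. The trade-off raised in the question still applies: a record replayed later from the error topic is handled out of its original order, and a system-wide failure such as a database outage would park every record, so that case probably still calls for pausing and retrying instead.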

Retrieve multiple messages from SQS

I have multiple messages in SQS. The following code always returns only one, even if there are dozens visible (not in flight). I thought setMaxNumberOfMessages would allow multiple messages to be consumed at once. Have I misunderstood this?
CreateQueueRequest createQueueRequest = new CreateQueueRequest().withQueueName(queueName);
String queueUrl = sqs.createQueue(createQueueRequest).getQueueUrl();
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(queueUrl);
receiveMessageRequest.setMaxNumberOfMessages(10);
List<Message> messages = sqs.receiveMessage(receiveMessageRequest).getMessages();
for (Message message : messages) {
// i'm a message from SQS
}
I've also tried using withMaxNumberOfMessages without any such luck:
receiveMessageRequest.withMaxNumberOfMessages(10);
How do I know there are messages in the queue? More than 1?
Set<String> attrs = new HashSet<String>();
attrs.add("ApproximateNumberOfMessages");
CreateQueueRequest createQueueRequest = new CreateQueueRequest().withQueueName(queueName);
GetQueueAttributesRequest a = new GetQueueAttributesRequest().withQueueUrl(sqs.createQueue(createQueueRequest).getQueueUrl()).withAttributeNames(attrs);
Map<String,String> result = sqs.getQueueAttributes(a).getAttributes();
int num = Integer.parseInt(result.get("ApproximateNumberOfMessages"));
The above always is run prior and gives me an int that is >1
Thanks for your input

AWS API Reference Guide: Query/QueryReceiveMessage
Due to the distributed nature of the queue, a weighted random set of machines is sampled on a ReceiveMessage call. That means only the messages on the sampled machines are returned. If the number of messages in the queue is small (less than 1000), it is likely you will get fewer messages than you requested per ReceiveMessage call. If the number of messages in the queue is extremely small, you might not receive any messages in a particular ReceiveMessage response; in which case you should repeat the request.
and
MaxNumberOfMessages: Maximum number of messages to return. SQS never returns more messages than this value but might return fewer.
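The advice to repeat the request can be sketched as a small loop. To keep the example runnable without the AWS SDK, the actual ReceiveMessage call is replaced by a Supplier, and the class name, method name, and the maxEmptyPolls cutoff are all assumptions for illustration. In real code the supplier would wrap something like sqs.receiveMessage(receiveMessageRequest).getMessages().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

public class RepeatReceiveSketch {

    // Calls receive repeatedly until at least `wanted` messages have been
    // collected, or until `maxEmptyPolls` consecutive calls return nothing.
    // Whole batches are kept, so the result may hold more than `wanted`.
    static <T> List<T> receiveAtLeast(Supplier<List<T>> receive, int wanted, int maxEmptyPolls) {
        List<T> collected = new ArrayList<>();
        int emptyPolls = 0;
        while (collected.size() < wanted && emptyPolls < maxEmptyPolls) {
            List<T> batch = receive.get();
            if (batch.isEmpty()) {
                emptyPolls++;
            } else {
                emptyPolls = 0;
                collected.addAll(batch);
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        // Simulated short polling: each call sees only part of the queue,
        // and one call in the middle returns nothing at all.
        Deque<List<String>> responses = new ArrayDeque<>(Arrays.asList(
                Arrays.asList("m1"),
                Collections.<String>emptyList(),
                Arrays.asList("m2", "m3")));
        Supplier<List<String>> receive =
                () -> responses.isEmpty() ? Collections.<String>emptyList() : responses.poll();

        System.out.println(receiveAtLeast(receive, 3, 2));
    }
}
```

The cutoff on consecutive empty responses keeps the loop from spinning forever on a queue that really is empty.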

There is a comprehensive explanation for this (arguably rather idiosyncratic) behaviour in the SQS reference documentation.
SQS stores copies of messages on multiple servers and receive message requests are made to these servers with one of two possible strategies,
Short Polling : The default behaviour, only a subset of the servers (based on a weighted random distribution) are queried.
Long Polling : Enabled by setting the WaitTimeSeconds attribute to a non-zero value, all of the servers are queried.
In practice, for my limited tests, I always seem to get one message with short polling just as you did.

I had the same problem. What is your Receive Message Wait Time for your queue set to? When mine was at 0, it only returned 1 message even if there were 8 in the queue. When I increased the Receive Message Wait Time, then I got all of them. Seems kind of buggy to me.

I was just trying the same thing, and with the help of these two settings, setMaxNumberOfMessages and setWaitTimeSeconds, I was able to get 10 messages.
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(myQueueUrl);
receiveMessageRequest.setMaxNumberOfMessages(10);
receiveMessageRequest.setWaitTimeSeconds(20);
Snapshot of the output:
Receiving messages from TestQueue.
Number of messages:10
Message
MessageId: 31a7c669-1f0c-4bf1-b18b-c7fa31f4e82d
...

receiveMessageRequest.withMaxNumberOfMessages(10);
Just to be clear, the more practical use of this would be to add to your constructor like this:
ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest(queueUrl).withMaxNumberOfMessages(10);
Otherwise, you might as well just do:
receiveMessageRequest.setMaxNumberOfMessages(10);
That being said, changing this won't help the original problem.

Thanks Caoilte!
I faced this issue too and finally solved it by using long polling, following the configuration here:
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-long-polling-for-queue.html
In my tests, long polling only worked once I created the queue as a FIFO queue; I had no luck with a standard queue, although the SQS documentation describes long polling for standard queues as well.
When receiving, you also need to set MaxNumberOfMessages. So my code looks like this:
ReceiveMessageRequest receive_request = new ReceiveMessageRequest()
.withQueueUrl(QUEUE_URL)
.withWaitTimeSeconds(20)
.withMaxNumberOfMessages(10);
Although it is solved, this still feels weird. AWS should definitely provide a neater API for this kind of basic receive operation.
From my point of view, AWS has many cool features but not good APIs, as if they are rushing things out all the time.

For a small task list I use a FIFO queue, as in stackoverflow.com/a/55149351/13678017
For example, here is a modified version of the AWS tutorial:
// Create a queue.
System.out.println("Creating a new Amazon SQS FIFO queue called " + "MyFifoQueue.fifo.\n");
final Map<String, String> attributes = new HashMap<>();
// A FIFO queue must have the FifoQueue attribute set to true.
attributes.put("FifoQueue", "true");
/*
* If the user doesn't provide a MessageDeduplicationId, generate a
* MessageDeduplicationId based on the content.
*/
attributes.put("ContentBasedDeduplication", "true");
// The FIFO queue name must end with the .fifo suffix.
final CreateQueueRequest createQueueRequest = new CreateQueueRequest("MyFifoQueue4.fifo")
.withAttributes(attributes);
final String myQueueUrl = sqs.createQueue(createQueueRequest).getQueueUrl();
// List all queues.
System.out.println("Listing all queues in your account.\n");
for (final String queueUrl : sqs.listQueues().getQueueUrls()) {
System.out.println(" QueueUrl: " + queueUrl);
}
System.out.println();
// Send a message.
System.out.println("Sending a message to MyQueue.\n");
for (int i = 0; i < 4; i++) {
var request = new SendMessageRequest()
.withQueueUrl(myQueueUrl)
.withMessageBody("message " + i)
.withMessageGroupId("userId1");
sqs.sendMessage(request);
}
for (int i = 0; i < 6; i++) {
var request = new SendMessageRequest()
.withQueueUrl(myQueueUrl)
.withMessageBody("message " + i)
.withMessageGroupId("userId2");
sqs.sendMessage(request);
}
// Receive messages.
System.out.println("Receiving messages from MyQueue.\n");
var receiveMessageRequest = new ReceiveMessageRequest(myQueueUrl);
receiveMessageRequest.setMaxNumberOfMessages(10);
receiveMessageRequest.setWaitTimeSeconds(20);
// what receive?
receiveMessageRequest.withMessageAttributeNames("userId2");
final List<Message> messages = sqs.receiveMessage(receiveMessageRequest).getMessages();
for (final Message message : messages) {
System.out.println("Message");
System.out.println(" MessageId: "
+ message.getMessageId());
System.out.println(" ReceiptHandle: "
+ message.getReceiptHandle());
System.out.println(" MD5OfBody: "
+ message.getMD5OfBody());
System.out.println(" Body: "
+ message.getBody());
for (final Entry<String, String> entry : message.getAttributes()
.entrySet()) {
System.out.println("Attribute");
System.out.println(" Name: " + entry
.getKey());
System.out.println(" Value: " + entry
.getValue());
}
}

Here's a workaround, you can call receiveMessageFromSQS method asynchronously.
bulkReceiveFromSQS (queueUrl, totalMessages, asyncLimit, batchSize, visibilityTimeout, waitTime, callback) {
batchSize = Math.min(batchSize, 10);
let self = this,
noOfIterations = Math.ceil(totalMessages / batchSize);
async.timesLimit(noOfIterations, asyncLimit, function(n, next) {
self.receiveMessageFromSQS(queueUrl, batchSize, visibilityTimeout, waitTime,
function(err, result) {
if (err) {
return next(err);
}
return next(null, _.get(result, 'Messages'));
});
}, function (err, listOfMessages) {
if (err) {
return callback(err);
}
listOfMessages = _.flatten(listOfMessages).filter(Boolean);
return callback(null, listOfMessages);
});
}
It will return you an array with a given number of messages
