AWS Kinesis - how to resume consuming from last checkpoint - java

I'm converting a Kafka consumer to an AWS Kinesis consumer, using the KCL (v2). In Kafka, offsets are used to help a consumer keep track of its most recently-consumed message. If my Kafka app dies, it will use the offset to consume from where it left off when it restarts.
However, Kinesis doesn't work the same way. I can set kinesisClientLibConfiguration.withInitialPositionInStream(...), but the only arguments for that are TRIM_HORIZON, LATEST, or AT_TIMESTAMP. If my Kinesis app dies, it will not know where to resume consuming from when it restarts.
My KCL consumer is very simple. The main() method looks like:
KinesisClientLibConfiguration config = new KinesisClientLibConfiguration("benTestApp",
        "testStream", new DefaultAWSCredentialsProviderChain(), UUID.randomUUID().toString());
config.withInitialPositionInStream(InitialPositionInStream.TRIM_HORIZON);

Worker worker = new Worker.Builder()
        .recordProcessorFactory(new KCLRecordProcessorFactory())
        .config(config)
        .build();
and the RecordProcessor is a simple implementation:
@Override
public void initialize(InitializationInput initializationInput) {
    LOGGER.info("Initializing record processor for shard: {}", initializationInput.getShardId());
}

@Override
public void processRecords(ProcessRecordsInput processRecordsInput) {
    List<Record> records = processRecordsInput.getRecords();
    LOGGER.info("Retrieved {} records", records.size());
    records.forEach(r -> LOGGER.info("Record: {}", StandardCharsets.UTF_8.decode(r.getData())));
}

@Override
public void shutdown(ShutdownInput shutdownInput) {
    LOGGER.info("Shutting down input");
}
If I check the corresponding DynamoDB table, the value of checkpoint is set as TRIM_HORIZON, and does not get updated with sequenceIds as records are consumed.
What's the solution here to ensure I consume every message?

As identified by @kdgregory, the KCL requires users to set their own checkpoints. Working code:
@Override
public void initialize(InitializationInput initializationInput) {
    LOGGER.info("Initializing record processor for shard: {}", initializationInput.getShardId());
}

@Override
public void processRecords(ProcessRecordsInput processRecordsInput) {
    List<Record> records = processRecordsInput.getRecords();
    LOGGER.info("Retrieved {} records", records.size());
    records.forEach(r -> LOGGER.info("Record with sequenceId {} at date {} : {}", r.getSequenceNumber(),
            r.getApproximateArrivalTimestamp(), StandardCharsets.UTF_8.decode(r.getData())));
    try {
        processRecordsInput.getCheckpointer().checkpoint();
    } catch (InvalidStateException | ShutdownException e) {
        LOGGER.error("Unable to checkpoint");
    }
}

@Override
public void shutdown(ShutdownInput shutdownInput) {
    LOGGER.info("Shutting down input");
    try {
        shutdownInput.getCheckpointer().checkpoint();
    } catch (InvalidStateException | ShutdownException e) {
        LOGGER.error("Unable to checkpoint");
    }
}
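Checkpointing on every processRecords call works, but it performs a DynamoDB write for every batch. A common variant is to checkpoint only every N seconds; here is a minimal sketch of that approach, assuming the same KCL v1-style API as above (the interval fields are illustrative, not part of the original code):

private static final long CHECKPOINT_INTERVAL_MILLIS = 60_000L;
private long nextCheckpointTimeMillis = System.currentTimeMillis() + CHECKPOINT_INTERVAL_MILLIS;

@Override
public void processRecords(ProcessRecordsInput processRecordsInput) {
    processRecordsInput.getRecords()
            .forEach(r -> LOGGER.info("Record: {}", StandardCharsets.UTF_8.decode(r.getData())));
    // Checkpoint at most once per interval instead of once per batch.
    if (System.currentTimeMillis() > nextCheckpointTimeMillis) {
        try {
            processRecordsInput.getCheckpointer().checkpoint();
        } catch (Exception e) { // checkpoint() can also throw ThrottlingException etc.
            LOGGER.error("Unable to checkpoint", e);
        }
        nextCheckpointTimeMillis = System.currentTimeMillis() + CHECKPOINT_INTERVAL_MILLIS;
    }
}

Either way, on restart the worker resumes from the last checkpointed sequence number, so records after that point may be delivered again and processing should be idempotent.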

Related

Retry max 3 times when consuming batches in Spring Cloud Stream Kafka Binder

I am consuming batches in Kafka. Retry is not supported in the Spring Cloud Stream Kafka binder in batch mode; the docs say you can configure a SeekToCurrentBatchErrorHandler (using a ListenerContainerCustomizer) to achieve functionality similar to retry in the binder.
I tried that, but with SeekToCurrentBatchErrorHandler it retries more than the configured limit of 3 times. How can I do that correctly? I would like to retry the whole batch.
How can I send the whole batch to a DLQ topic? For a record listener, I used to check in the listener whether deliveryAttempt (retry) reached 3 and then send the record to the DLQ topic.
I have checked this link, which describes exactly my issue, but an example would be a great help. Can I achieve that with the spring-cloud-stream-kafka-binder library? Please explain with an example; I am new to this.
Currently I have the code below.
@Configuration
public class ConsumerConfig {

    @Bean
    public ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer() {
        return (container, dest, group) -> {
            container.getContainerProperties().setAckOnError(false);
            SeekToCurrentBatchErrorHandler seekToCurrentBatchErrorHandler
                    = new SeekToCurrentBatchErrorHandler();
            seekToCurrentBatchErrorHandler.setBackOff(new FixedBackOff(0L, 2L));
            container.setBatchErrorHandler(seekToCurrentBatchErrorHandler);
            //container.setBatchErrorHandler(new BatchLoggingErrorHandler());
        };
    }
}
Listener:
@StreamListener(ActivityChannel.INPUT_CHANNEL)
public void handleActivity(List<Message<Event>> messages,
        @Header(name = KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment acknowledgment,
        @Header(name = "deliveryAttempt", defaultValue = "1") int deliveryAttempt) {
    try {
        log.info("Received activity message with message length {}", messages.size());
        nodeConfigActivityBatchProcessor.processNodeConfigActivity(messages);
        acknowledgment.acknowledge();
        log.debug("Processed activity message {} successfully!!", messages.size());
    } catch (MessagePublishException e) {
        if (deliveryAttempt == 3) {
            log.error(
                    String.format("Exception occurred, sending the message=%s to DLQ due to: ",
                            "message"),
                    e);
            publisher.publishToDlq(EventType.UPDATE_FAILED, "message", e.getMessage());
        } else {
            throw e;
        }
    }
}
After seeing @Gary's response, I added the ListenerContainerCustomizer @Bean with RetryingBatchErrorHandler, but I am not able to import the class. Attaching screenshots:
(screenshot: not able to import RetryingBatchErrorHandler)
(screenshot: my Spring Cloud dependencies)
Use a RetryingBatchErrorHandler to send the whole batch to the DLT
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retrying-batch-eh
Use a RecoveringBatchErrorHandler where you can throw a BatchListenerFailedException to tell it which record in the batch failed.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#recovering-batch-eh
In both cases provide a DeadLetterPublishingRecoverer to the error handler; disable DLTs in the binder.
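For the RecoveringBatchErrorHandler option (not covered by the example below), here is a minimal sketch, assuming spring-kafka 2.5+ and the same customizer approach as in the question; process() is a hypothetical per-record processing method:

@Bean
ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> recoveringCustomizer(
        KafkaTemplate<byte[], byte[]> template) {
    return (container, dest, group) -> {
        // Retries the batch with back off, then publishes only the failing record to the DLT.
        container.setBatchErrorHandler(new RecoveringBatchErrorHandler(
                new DeadLetterPublishingRecoverer(template), new FixedBackOff(5000L, 2L)));
    };
}

@Bean
public Consumer<List<String>> input() {
    return list -> {
        for (int i = 0; i < list.size(); i++) {
            try {
                process(list.get(i)); // hypothetical per-record processing
            } catch (Exception e) {
                // Tells the error handler which record failed; records before it are committed.
                throw new BatchListenerFailedException("failed record", e, i);
            }
        }
    };
}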
EDIT
Here's an example; it uses the newer functional style rather than the deprecated @StreamListener, but the same concepts apply (you should consider moving to the functional style).
@SpringBootApplication
public class So69175145Application {

    public static void main(String[] args) {
        SpringApplication.run(So69175145Application.class, args);
    }

    @Bean
    ListenerContainerCustomizer<AbstractMessageListenerContainer<?, ?>> customizer(
            KafkaTemplate<byte[], byte[]> template) {
        return (container, dest, group) -> {
            container.setBatchErrorHandler(new RetryingBatchErrorHandler(new FixedBackOff(5000L, 2L),
                    new DeadLetterPublishingRecoverer(template,
                            (rec, ex) -> new TopicPartition("errors." + dest + "." + group, rec.partition()))));
        };
    }

    /*
     * DLT topic won't be auto-provisioned since enableDlq is false
     */
    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("errors.so69175145.grp").partitions(1).replicas(1).build();
    }

    /*
     * Functional equivalent of @StreamListener
     */
    @Bean
    public Consumer<List<String>> input() {
        return list -> {
            System.out.println(list);
            throw new RuntimeException("test");
        };
    }

    /*
     * Not needed here - just to show we sent them to the DLT
     */
    @KafkaListener(id = "so69175145", topics = "errors.so69175145.grp")
    public void listen(String in) {
        System.out.println("From DLT: " + in);
    }
}
spring.cloud.stream.bindings.input-in-0.destination=so69175145
spring.cloud.stream.bindings.input-in-0.group=grp
spring.cloud.stream.bindings.input-in-0.content-type=text/plain
spring.cloud.stream.bindings.input-in-0.consumer.batch-mode=true
# for DLT listener
spring.kafka.consumer.auto-offset-reset=earliest
[foo]
2021-09-14 09:55:32.838ERROR...
...
[foo]
2021-09-14 09:55:37.873ERROR...
...
[foo]
2021-09-14 09:55:42.886ERROR...
...
From DLT: foo

Kinesis 2.2.11 java unable to create consumer

I need to migrate to Kinesis library to version 2.2.11 so I followed the tutorial: https://docs.aws.amazon.com/streams/latest/dev/kcl-migration.html
I need to run multiple instances of my consumer app, so each of them needs a unique application name in order to have a separate lease table in DynamoDB.
When initializing the consumer Kinesis runs DynamoDBLeaseRefresher.createLeaseTableIfNotExists which checks if a new table needs to be created for this application name and creates one if it cannot be found.
So 2 operations are performed:
DescribeTable - it returns the table info or throws a ResourceNotFoundException,
if needed - CreateTable.
The problem for me is with the DescribeTable method. When I look for an existing table, it is returned with no problem. But when I look for a non-existent table, it throws the ResourceNotFoundException -> so far so good. Unfortunately the exception then gets wrapped and becomes:
java.util.concurrent.CompletionException: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: software.amazon.awssdk.awscore.exception.AwsServiceException$Builder.extendedRequestId(Ljava/lang/String;)Lsoftware/amazon/awssdk/awscore/exception/AwsServiceException$Builder;
and the app, expecting ResourceNotFoundException, gets something different instead and crashes.
The wrapped exception message is a bit misleading ("Unable to execute HTTP request"), since the request was performed and returned the proper message: "Resource not found".
The funny thing is that it sometimes works: the exception does not get wrapped, the CreateTable operation is performed, and the consumer starts properly.
For now I have made a workaround where I just create the table before the initialization of the LeaseCoordinator, so it always finds an existing table.
Here is my code:
public KinesisStreamReaderService(String streamName, String applicationName, String regionName) {
    KinesisAsyncClient kinesisClient = KinesisAsyncClient.builder()
            .credentialsProvider(EnvironmentVariableCredentialsProvider.create())
            .region(Region.of(connectionProperties.getRegion()))
            .httpClientBuilder(createHttpClientBuilder())
            .build();

    DynamoDbAsyncClient dynamoClient = DynamoDbAsyncClient.builder().region(Region.of(regionName)).build();
    CloudWatchAsyncClient cloudWatchClient = CloudWatchAsyncClient.builder().region(Region.of(regionName)).build();

    // if(!dynamoDbTableExists(dynamoClient, applicationName)) {
    //     createDynamoDbTable(dynamoClient, applicationName);
    // }

    ConfigsBuilder configsBuilder = new ConfigsBuilder(streamName, applicationName, kinesisClient,
            dynamoClient, cloudWatchClient, workerId(), KinesisReaderProcessor::new);
    configsBuilder.retrievalConfig().initialPositionInStreamExtended(
            InitialPositionInStreamExtended.newInitialPosition(
                    InitialPositionInStream.LATEST));

    scheduler = new Scheduler(
            configsBuilder.checkpointConfig(),
            configsBuilder.coordinatorConfig(),
            configsBuilder.leaseManagementConfig(),
            configsBuilder.lifecycleConfig(),
            configsBuilder.metricsConfig(),
            configsBuilder.processorConfig(),
            configsBuilder.retrievalConfig().retrievalSpecificConfig(new PollingConfig(streamName, kinesisClient))
    );
}

private void createDynamoDbTable(DynamoDbAsyncClient dynamoClient, String applicationName) {
    log.info("Creating new lease table: {}", applicationName);
    CompletableFuture<CreateTableResponse> createTableFuture = dynamoClient
            .createTable(CreateTableRequest.builder()
                    .provisionedThroughput(ProvisionedThroughput.builder().readCapacityUnits(10L).writeCapacityUnits(10L).build())
                    .tableName(applicationName)
                    .keySchema(KeySchemaElement.builder().attributeName("leaseKey").keyType(KeyType.HASH).build())
                    .attributeDefinitions(AttributeDefinition.builder().attributeName("leaseKey").attributeType(
                            ScalarAttributeType.S).build())
                    .build());
    try {
        CreateTableResponse createTableResponse = createTableFuture.get();
        log.debug("Created new lease table: {}", createTableResponse.tableDescription().tableName());
    } catch (InterruptedException | ExecutionException e) {
        throw new DataStreamException(e.getMessage(), e);
    }
}

private boolean dynamoDbTableExists(DynamoDbAsyncClient dynamoClient, String tableName) {
    CompletableFuture<DescribeTableResponse> describeTableResponseCompletableFutureNew = dynamoClient
            .describeTable(DescribeTableRequest.builder()
                    .tableName(tableName).build());
    try {
        DescribeTableResponse describeTableResponseNew = describeTableResponseCompletableFutureNew
                .get();
        return nonNull(describeTableResponseNew);
    } catch (InterruptedException | ExecutionException e) {
        log.info(e.getMessage(), e);
    }
    return false;
}

private static String workerId() {
    String workerId;
    try {
        workerId = format("%s_%s", getLocalHost().getCanonicalHostName(), randomUUID().toString());
    } catch (UnknownHostException e) {
        workerId = randomUUID().toString();
    }
    return workerId;
}
@Override
public void read(Consumer<String> consumer) {
    this.consumer = consumer;
    scheduler.run();
}

private class KinesisReaderProcessor implements ShardRecordProcessor {

    private String shardId;

    @Override
    public void initialize(InitializationInput initializationInput) {
        this.shardId = initializationInput.shardId();
        log.info("Initializing record processor for shard: {}", shardId);
    }

    @Override
    public void processRecords(ProcessRecordsInput processRecordsInput) {
        log.debug("Checking shard {} for new records", shardId);
        List<KinesisClientRecord> records = processRecordsInput.records();
        if (!records.isEmpty()) {
            log.debug("Processing {} records from kinesis stream shard {}", records.size(), shardId);
            records.forEach(record -> {
                String json = UTF_8.decode(record.data()).toString();
                log.info(json);
                consumer.accept(json);
            });
        }
    }

    @Override
    public void leaseLost(LeaseLostInput leaseLostInput) {
        log.info("Record processor has lost lease, terminating");
    }

    @Override
    public void shardEnded(ShardEndedInput shardEndedInput) {
        try {
            shardEndedInput.checkpointer().checkpoint();
        } catch (ShutdownException | InvalidStateException e) {
            log.error(e.getMessage(), e);
        }
    }

    @Override
    public void shutdownRequested(ShutdownRequestedInput shutdownRequestedInput) {
        try {
            shutdownRequestedInput.checkpointer().checkpoint();
        } catch (ShutdownException | InvalidStateException e) {
            log.error(e.getMessage(), e);
        }
    }
}
}
Am I missing some configuration for the scheduler or something? Why is it sometimes working?
Thanks
Edit:
The problem is in this block of code: DynamoDBLeaseRefresher.tableStatus() is invoked to check whether the table exists:
DescribeTableResponse result;
try {
    try {
        result =
                (DescribeTableResponse) FutureUtils.resolveOrCancelFuture(this.dynamoDBClient.describeTable(request), this.dynamoDbRequestTimeout);
    } catch (ExecutionException var5) {
        throw exceptionManager.apply(var5.getCause());
    } catch (InterruptedException var6) {
        throw new DependencyException(var6);
    }
} catch (ResourceNotFoundException var7) {
    log.debug("Got ResourceNotFoundException for table {} in leaseTableExists, returning false.", this.table);
    return null;
}
and in my case it should get a ResourceNotFoundException if the table is not found, but as I said the exception gets wrapped in a CompletionException before it reaches the appropriate catch block, and is instead caught in this code:
catch (ExecutionException var5) {
throw exceptionManager.apply(var5.getCause());
This happens 20 times in a loop while trying to initialize the LeaseCoordinator, and then it just stops trying to initialize the connection. (As mentioned above, it works occasionally, which makes it even stranger to me.)
With my workaround it only needs one try to get initialized.
You don't need to create the lease table manually - DynamoDBLeaseCoordinator will create one on initialization if it does not exist, and wait until it is available:
@Override
public void initialize() throws ProvisionedThroughputException, DependencyException, IllegalStateException {
    final boolean newTableCreated =
            leaseRefresher.createLeaseTableIfNotExists(initialLeaseTableReadCapacity, initialLeaseTableWriteCapacity);
    if (newTableCreated) {
        log.info("Created new lease table for coordinator with initial read capacity of {} and write capacity of {}.",
                initialLeaseTableReadCapacity, initialLeaseTableWriteCapacity);
    }
    // Need to wait for table in active state.
    final long secondsBetweenPolls = 10L;
    final long timeoutSeconds = 600L;
    final boolean isTableActive = leaseRefresher.waitUntilLeaseTableExists(secondsBetweenPolls, timeoutSeconds);
    if (!isTableActive) {
        throw new DependencyException(new IllegalStateException("Creating table timeout"));
    }
}
The issue in your case, I think, is that the table is created eventually, and you should probably poll periodically until it appears - like DynamoDBLeaseCoordinator#initialize() does.
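If you keep the manual pre-creation workaround, a small wait loop along the lines of the KCL's own polling avoids racing the lease coordinator. A minimal sketch, assuming the same DynamoDbAsyncClient as in the question (the method name and timeouts are illustrative):

private void waitUntilTableActive(DynamoDbAsyncClient dynamoClient, String tableName)
        throws InterruptedException {
    for (int attempt = 0; attempt < 60; attempt++) {
        try {
            DescribeTableResponse response = dynamoClient
                    .describeTable(DescribeTableRequest.builder().tableName(tableName).build())
                    .get();
            if (response.table().tableStatus() == TableStatus.ACTIVE) {
                return; // table exists and is ready for the lease coordinator
            }
        } catch (ExecutionException e) {
            // Table not found (or not yet visible) - keep polling.
            log.debug("Lease table {} not ready yet: {}", tableName, e.getMessage());
        }
        Thread.sleep(10_000L); // roughly the same poll interval the KCL uses internally
    }
    throw new IllegalStateException("Timed out waiting for lease table " + tableName);
}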

Seek method behavior in spring kafka consumer 1.2.x

I do not want to commit offsets for messages whose processing fails, and I want them to be redelivered for processing. I am using spring-kafka 1.2.x and implemented ConsumerSeekAware in my listener.
@Component
public class Listener implements ConsumerSeekAware {

    private static Logger logger = LoggerFactory.getLogger(Listener.class);

    private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();

    @KafkaListener(topics = "my-topic", containerFactory = "kafkaManualAckListenerContainerFactory")
    public void listen1(ConsumerRecord<String, String> consumerRecord, Acknowledgment ack) throws MyCustomException {
        logger.info("received: key - " + consumerRecord.key() + " value - " + consumerRecord.value());
        // Below code is just to show the issue. Not acknowledging so I can get the same msg again.
        boolean should_commit = false;
        if (should_commit) {
            ack.acknowledge();
        } else {
            this.seekCallBack.get().seek(consumerRecord.topic(), consumerRecord.partition(), consumerRecord.offset());
        }
    }

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        logger.info("registerSeekCallback called..");
        this.seekCallBack.set(callback);
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        logger.info("onPartitionsAssigned called..");
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        logger.info("onIdleContainer called..");
    }
}
######### Container config (auto.commit is false in consumer)
factory.getContainerProperties().setAckOnError(false);
factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
The problem I am facing: if I have 10 messages in different partitions for a topic, I get all of them one by one, and after receiving all of the messages I keep getting the last message for each partition. I also tried SeekToCurrentErrorHandler, which is implemented in version 2.0.x, and that works perfectly, but I cannot upgrade my Kafka version. If I restart the container I get all the messages again, which is fine, but I don't want to stop the container when processing of a message fails.
So my question is: is it even possible to get the same behavior (exactly the same, without needing to stop the container) as SeekToCurrentErrorHandler in spring-kafka 1.2.x?

Google Pub/Sub reuse existing subscription

I have created a Java Pub/Sub consumer relying on the following Pub/Sub doc.
public static void main(String... args) throws Exception {
    TopicName topic = TopicName.create(pubSubProjectName, pubSubTopic);
    SubscriptionName subscription = SubscriptionName.create(pubSubProjectName, "ssvp-sub");

    SubscriptionAdminClient subscriptionAdminClient = SubscriptionAdminClient.create();
    subscriptionAdminClient.createSubscription(subscription, topic, PushConfig.getDefaultInstance(), 0);

    MessageReceiver receiver =
            new MessageReceiver() {
                @Override
                public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
                    System.out.println("Got message: " + message.getData().toStringUtf8());
                    consumer.ack();
                }
            };

    Subscriber subscriber = null;
    try {
        subscriber = Subscriber.defaultBuilder(subscription, receiver).build();
        subscriber.addListener(
                new Subscriber.Listener() {
                    @Override
                    public void failed(Subscriber.State from, Throwable failure) {
                        // Handle failure. This is called when the Subscriber encountered a fatal error and is shutting down.
                        System.err.println(failure);
                    }
                },
                MoreExecutors.directExecutor());
        subscriber.startAsync().awaitRunning();

        Thread.sleep(60000);
    } finally {
        if (subscriber != null) {
            subscriber.stopAsync();
        }
    }
}
It works well, but on every run it asks for a new subscription name by throwing a StatusRuntimeException:
io.grpc.StatusRuntimeException: ALREADY_EXISTS: Resource already exists in the project (resource=ssvp-sub).
(see SubscriptionName.create(pubSubProjectName, "ssvp-sub") line in my code snippet)
I found out that in the Node.js client we can pass the "reuseExisting: true" option to reuse an existing subscription:
topic.subscribe('maybe-subscription-name', { reuseExisting: true }, function(err, subscription) {
// subscription was "get-or-create"-ed
});
What option should I pass if I use the official Java Pub/Sub client?
<dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-pubsub</artifactId>
    <version>0.13.0-alpha</version>
</dependency>
The Java library does not have a method to allow one to call createSubscription with an existing subscription and not have an exception thrown. You have a couple of options, both of which involve using a try/catch block. The choice depends on whether or not you want to be optimistic about the existence of the subscription.
Pessimistic call:
try {
    subscriptionAdminClient.createSubscription(subscription,
            topic,
            PushConfig.getDefaultInstance(),
            0);
} catch (ApiException e) {
    if (e.getStatusCode() != Status.Code.ALREADY_EXISTS) {
        throw e;
    }
}
// You know the subscription exists and can create a Subscriber.
Optimistic call:
try {
    subscriptionAdminClient.getSubscription(subscription);
} catch (ApiException e) {
    if (e.getStatusCode() == Status.Code.NOT_FOUND) {
        // Create the subscription
    } else {
        throw e;
    }
}
// You know the subscription exists and can create a Subscriber.
In general, it is often the case that one would create the subscription prior to starting up the subscriber itself (via the Cloud Console or gcloud CLI), so you might even want to do the getSubscription() call and throw an exception no matter what. If a subscription got deleted, you might want to draw attention to this case and handle it explicitly as it has implications (like the fact that messages are no longer being stored to be delivered to the subscription).
However, if you are doing something like building a cache server that just needs to get updates transiently while it is up and running, then creating the subscription on startup could make sense.
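For that "fail loudly" variant, a minimal sketch reusing the getSubscription() pattern from the optimistic call above (the exception thrown here is just an example):

try {
    subscriptionAdminClient.getSubscription(subscription);
} catch (ApiException e) {
    if (e.getStatusCode() == Status.Code.NOT_FOUND) {
        // The subscription was deleted or never created; messages published in the
        // meantime were not retained for this subscriber, so surface the problem.
        throw new IllegalStateException("Expected subscription " + subscription + " to exist", e);
    }
    throw e;
}
// Safe to build the Subscriber now.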

Exception handling in Spring XD streams

How can you create a failsafe Spring XD stream that keeps running properly after an exception is triggered for one specific message (i.e. logs the error but continues consuming the next messages in the stream), without having to add try/catch(Throwable) in every stream step?
Is there any easy way of doing this with the Reactor or RxJava model?
Example stream using Reactor:
@Override
public Publisher<Tuple> process(Stream<GenericMessage> inputStream) {
    return inputStream
            .flatMap(SomeClass::someFlatMap)
            .filter(SomeClass::someFilter)
            .when(Throwable.class, t -> log.error("error", t));
}
RxJava can be used by a processor module. On creation, the subscription needs to be set up, and to handle errors the subscriber needs to add an onError handler:
subject = new SerializedSubject(PublishSubject.create());
Observable<?> outputStream = processor.process(subject);
subscription = outputStream.subscribe(new Action1<Object>() {
    @Override
    public void call(Object outputObject) {
        if (ClassUtils.isAssignable(Message.class, outputObject.getClass())) {
            getOutputChannel().send((Message) outputObject);
        } else {
            getOutputChannel().send(MessageBuilder.withPayload(outputObject).build());
        }
    }
}, new Action1<Throwable>() {
    @Override
    public void call(Throwable throwable) {
        logger.error(throwable.getMessage(), throwable);
    }
}, new Action0() {
    @Override
    public void call() {
        logger.error("Subscription close for [" + subscription + "]");
    }
});
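If the goal is for the stream itself to survive a bad message (the subscriber-level onError above terminates the subscription once it fires), one option in RxJava is to confine the failure to a per-message inner Observable inside the processor. A minimal sketch, where SomeClass.someTransform is a hypothetical per-message step that may throw:

public Observable<String> process(Observable<GenericMessage> inputStream) {
    return inputStream.flatMap(msg -> Observable.just(msg)
            .map(m -> SomeClass.someTransform(m))            // hypothetical per-message processing
            .doOnError(t -> logger.error("error", t))        // log the failure for this message only
            .onErrorResumeNext(Observable.<String>empty())); // drop the bad message, keep the stream alive
}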
Look at more examples here: https://github.com/spring-projects/spring-xd/tree/master/spring-xd-rxjava/src
