How to distribute Kafka load across OpenShift pods using Kafka partitions - Java

I have a Spring Boot application and I would like to distribute the load of a Kafka topic across 3 OpenShift pods. In the following example I listen to 3 Kafka partitions on three different threads, and this Spring Boot application runs in a single OpenShift pod. What I want instead is to listen to one Kafka partition per pod, so that when I start 3 pods on OpenShift each pod listens to one partition. This would let me scale the application to N partitions on N pods. I am not sure whether this is possible or whether I need a different approach. Thanks
public class DepAcctInqConsumerController {
private static final Logger LOGGER = LoggerFactory.getLogger(DepAcctInqConsumerController.class);
@Value("${kafka.topic.acct-info.request}")
private String requestTopic;
@KafkaListener(id = "id-0", containerFactory = "requestReplyListenerContainerFactory",
topicPartitions = { @TopicPartition(topic = "${kafka.topic.acct-info.request}", partitions = "0" )})
public Message<?> listenPartition0(InGetAccountInfo accountInfo, @Header(KafkaHeaders.REPLY_TOPIC) byte[] replyTo,
@Header(KafkaHeaders.CORRELATION_ID) byte[] correlation, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int id) {
try {
LOGGER.info("Received request for partition id = " + id);
AccountInquiryDto accountInfoDto = getAccountInquiryDto(accountInfo);
return MessageBuilder.withPayload(accountInfoDto)
.setHeader(KafkaHeaders.TOPIC, replyTo)
.setHeader(KafkaHeaders.RECEIVED_PARTITION_ID, id)
.setHeader(KafkaHeaders.CORRELATION_ID, correlation)
.build();
} catch (Exception e) {
LOGGER.error(e.toString(),e);
}
return null;
}
@KafkaListener(id = "id-1", containerFactory = "requestReplyListenerContainerFactory",
topicPartitions = { @TopicPartition(topic = "${kafka.topic.acct-info.request}", partitions = "#{@finder.partitions('${kafka.topic.acct-info.request}')}" )})
public Message<?> listenPartition1(InGetAccountInfo accountInfo, @Header(KafkaHeaders.REPLY_TOPIC) byte[] replyTo,
@Header(KafkaHeaders.CORRELATION_ID) byte[] correlation, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int id) {
try {
LOGGER.info("Received request for partition id = " + id);
AccountInquiryDto accountInfoDto = getAccountInquiryDto(accountInfo);
return MessageBuilder.withPayload(accountInfoDto)
.setHeader(KafkaHeaders.TOPIC, replyTo)
.setHeader(KafkaHeaders.RECEIVED_PARTITION_ID, id)
.setHeader(KafkaHeaders.CORRELATION_ID, correlation)
.build();
} catch (Exception e) {
LOGGER.error(e.toString(),e);
}
return null;
}
@KafkaListener(id = "id-2", containerFactory = "requestReplyListenerContainerFactory",
topicPartitions = { @TopicPartition(topic = "${kafka.topic.acct-info.request}", partitions = "2" )})
public Message<?> listenPartition2(InGetAccountInfo accountInfo, @Header(KafkaHeaders.REPLY_TOPIC) byte[] replyTo,
@Header(KafkaHeaders.CORRELATION_ID) byte[] correlation, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) int id) {
try {
LOGGER.info("Received request for partition id = " + id);
AccountInquiryDto accountInfoDto = getAccountInquiryDto(accountInfo);
return MessageBuilder.withPayload(accountInfoDto)
.setHeader(KafkaHeaders.TOPIC, replyTo)
.setHeader(KafkaHeaders.RECEIVED_PARTITION_ID, id)
.setHeader(KafkaHeaders.CORRELATION_ID, correlation)
.build();
} catch (Exception e) {
LOGGER.error(e.toString(),e);
}
return null;
}

You don't need a separate Kafka listener for each partition. One listener is enough.
If you run a single pod, that pod consumes the messages from all three partitions.
If you run more than one pod, the partitions are distributed across the pods.
You can usefully run as many pods as there are partitions; any extra pods in the same group stay idle.
All pods must use the same consumer group name.
This is all you need:
@KafkaListener(topics = "${kafka.topic.acct-info.request}")
public void receive(ConsumerRecord<String, String> record)
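To make the partition distribution concrete, here is a minimal plain-Java sketch of the idea. It is only an illustration: in Kafka the real assignment is decided by the consumer group protocol and the configured partition assignor, and the class name PartitionAssignmentSketch, the method assign, and the pod names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified illustration of how a consumer group spreads partitions over its members.
// This is NOT Kafka's assignor; all names here are hypothetical.
class PartitionAssignmentSketch {

    // Hands out partitions 0..partitionCount-1 to the given pods in round-robin order.
    static Map<String, List<Integer>> assign(int partitionCount, List<String> pods) {
        Map<String, List<Integer>> assignment = new TreeMap<>();
        for (String pod : pods) {
            assignment.put(pod, new ArrayList<>());
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = pods.get(partition % pods.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // One pod: it owns every partition.
        System.out.println(assign(3, List.of("pod-a")));
        // Three pods: one partition each.
        System.out.println(assign(3, List.of("pod-a", "pod-b", "pod-c")));
        // Four pods: the extra pod gets an empty list.
        System.out.println(assign(3, List.of("pod-a", "pod-b", "pod-c", "pod-d")));
    }
}
```

In this sketch a fourth pod ends up with an empty list, which mirrors why running more pods than partitions in the same consumer group does not add throughput.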

Related

Is there a way to broadcast Kafka messages from a specific timestamp (currentTimeStamp - 2 min) to a new SSE consumer

Is there a way to broadcast Kafka messages from a specific timestamp (e.g., currentTimestamp - 2 min) to a new Server-Sent-Event (SSE) consumer?
I have one multi-partition Kafka topic and multiple SSE clients. I tried implementing AbstractConsumerSeekAware and calling seekToTimestamp in the SSE controller, but that's not working out: it basically broadcasts the seek to all the SSE clients.
I want it only for the SSE client that has newly joined.
Below is my code:
public class ServerController extends AbstractConsumerSeekAware {
......
@GetMapping(path = "/stream/{orgid}", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
Flux<ServerSentEvent<ServiceEvent>> getFolderWatch(@PathVariable String orgid) {
return Flux.create(sink -> {
this.seekToTimestamp(System.currentTimeMillis() - 2 * 60 * 1000);
MessageHandler handler = message -> sink.next((ServiceEvent) message.getPayload());
sink.onCancel(() -> subscribableChannel.unsubscribe(handler));
subscribableChannel.subscribe(handler);
}, FluxSink.OverflowStrategy.LATEST)
.filter(receiveRecord -> {
ServiceEvent event = (ServiceEvent) receiveRecord;
return event.getOrgId().equals(orgid);
}).map(e -> {
ServiceEvent event = (ServiceEvent) e;
return ServerSentEvent.<ServiceEvent>builder(event).build();
});
}
@KafkaListener(topics = "${topic.name.consumer}", groupId = "my-sse-application")
public void listenGroupFoo(@Payload String content) {
logger.info("Received Message in group foo: " + content);
subscribableChannel.send(new GenericMessage<>(new ServiceEvent(this, content)));
}
}

Why does pulling from Google Pub/Sub return no messages?

I have a subscription VIEW_TOPIC with the pull strategy. Why can't I see any messages even though there are 7 delayed messages? I cannot figure out what I am missing. By the way, I'm running the subscriber on Kubernetes on GCP. I have also set the GOOGLE_APPLICATION_CREDENTIALS environment variable.
Subscriber configuration
private Subscriber buildSubscriber() {
try (SubscriptionAdminClient subscriptionAdminClient = SubscriptionAdminClient.create()) {
TopicName topicName = TopicName.of(projectId, topic);
ProjectSubscriptionName subscriptionName =
ProjectSubscriptionName.of(projectId, subscriptionId);
// Create a pull subscription with default acknowledgement deadline of 10 seconds.
// Messages not successfully acknowledged within 10 seconds will get resent by the server.
Subscription subscription =
subscriptionAdminClient.createSubscription(
subscriptionName, topicName, PushConfig.getDefaultInstance(), 10);
System.out.println("Created pull subscription: " + subscription.getName());
} catch (IOException e) {
LOGGER.error("Cannot create pull subscription");
} catch (AlreadyExistsException existsException) {
LOGGER.warn("Subscription already created");
}
ProjectSubscriptionName subscriptionName = ProjectSubscriptionName.of(projectId, subscriptionId);
LOGGER.info("Subscribe topic: " + topic + " | SubscriptionId: " + subscriptionId);
// default is 4 * num of processor
ExecutorProvider executorProvider = InstantiatingExecutorProvider.newBuilder().build();
Subscriber.Builder subscriberBuilder = Subscriber.newBuilder(subscriptionName, new MessageReceiverImpl())
.setExecutorProvider(executorProvider);
// The subscriber will pause the message stream and stop receiving more messages from the
// server if any one of the conditions is met.
FlowControlSettings flowControlSettings =
FlowControlSettings.newBuilder()
.setMaxOutstandingElementCount(100)
// the maximum size of messages the subscriber
// receives before pausing the message stream.
// 10Mib
.setMaxOutstandingRequestBytes(10L * 1024L * 1024L)
.build();
subscriberBuilder.setFlowControlSettings(flowControlSettings);
Subscriber subscriber = subscriberBuilder.build();
subscriber.addListener(new ApiService.Listener() {
@Override
public void failed(ApiService.State from, Throwable failure) {
LOGGER.error(from, failure);
}
}, MoreExecutors.directExecutor());
return subscriber;
}
Subscriber
public void startSubscribeMessage() {
LOGGER.info("Begin subscribe topic " + topic);
this.subscriber.startAsync().awaitRunning();
LOGGER.info("Subscriber start successfully!!!");
}
public class MessageReceiverImpl implements MessageReceiver {
private static final Logger LOGGER = Logger.getLogger(MessageReceiverImpl.class);
private final LogSave logSave = MatchSave.getInstance();
@Override
public void receiveMessage(PubsubMessage message, AckReplyConsumer consumer) {
ByteString data = message.getData();
// Get the schema encoding type.
String encoding = message.getAttributesMap().get("googclient_schemaencoding");
Req.LogReq logReqMsg = null;
try {
switch (encoding) {
case "BINARY":
logReqMsg = Req.LogReq.parseFrom(data);
break;
case "JSON":
Req.LogReq.Builder msgBuilder = Req.LogReq.newBuilder();
JsonFormat.parser().merge(data.toStringUtf8(), msgBuilder);
logReqMsg = msgBuilder.build();
break;
}
LOGGER.info((JsonFormat.printer().omittingInsignificantWhitespace().print(logReqMsg)));
logSave.addLogMsg(battleLogMsg);
} catch (InvalidProtocolBufferException e) {
e.printStackTrace();
}
consumer.ack();
}
}
Req.LogReq is a proto message. My dependencies:
// google cloud
implementation platform('com.google.cloud:libraries-bom:22.0.0')
implementation 'com.google.cloud:google-cloud-pubsub'
implementation group: 'com.google.protobuf', name: 'protobuf-java-util', version: '3.17.2'
The call logSave.addLogMsg(battleLogMsg); adds the message to a CopyOnWriteArrayList.

I can't send message using google pubsub emulator in spring boot

I'm trying to send a push message using the Pub/Sub emulator with Spring Boot. This is my configuration:
Dependency:
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-gcp-starter-pubsub</artifactId>
</dependency>
My bean:
@Configuration
@AutoConfigureBefore(value= GcpPubSubAutoConfiguration.class)
@EnableConfigurationProperties(value= GcpPubSubProperties.class)
public class EmulatorPubSubConfiguration {
@Value("${spring.gcp.pubsub.projectid}")
private String projectId;
@Value("${spring.gcp.pubsub.subscriptorid}")
private String subscriptorId;
@Value("${spring.gcp.pubsub.topicid}")
private String topicId;
@Bean
public Publisher pubsubEmulator() throws IOException {
String hostport = System.getenv("PUBSUB_EMULATOR_HOST");
ManagedChannel channel = ManagedChannelBuilder.forTarget(hostport).usePlaintext().build();
try {
TransportChannelProvider channelProvider =
FixedTransportChannelProvider.create(GrpcTransportChannel.create(channel));
CredentialsProvider credentialsProvider = NoCredentialsProvider.create();
// Set the channel and credentials provider when creating a `TopicAdminClient`.
// Similarly for SubscriptionAdminClient
TopicAdminClient topicClient =
TopicAdminClient.create(
TopicAdminSettings.newBuilder()
.setTransportChannelProvider(channelProvider)
.setCredentialsProvider(credentialsProvider)
.build());
ProjectTopicName topicName = ProjectTopicName.of(projectId, topicId);
// Set the channel and credentials provider when creating a `Publisher`.
// Similarly for Subscriber
return Publisher.newBuilder(topicName)
.setChannelProvider(channelProvider)
.setCredentialsProvider(credentialsProvider)
.build();
} finally {
channel.shutdown();
}
}
}
Of course, I have set the PUBSUB_EMULATOR_HOST environment variable to localhost:8085, where the emulator is running.
I created a REST controller for testing.
To send a push message:
@Autowired
private Publisher pubsubPublisher;
@PostMapping("/send1")
public String publishMessage(@RequestParam("message") String message) throws InterruptedException, IOException {
Publisher pubsubPublisher = this.getPublisher();
ByteString data = ByteString.copyFromUtf8(message);
PubsubMessage pubsubMessage = PubsubMessage.newBuilder().setData(data).build();
ApiFuture<String> future = pubsubPublisher.publish(pubsubMessage);
//pubsubPublisher.publishAllOutstanding();
try {
// Add an asynchronous callback to handle success / failure
ApiFutures.addCallback(future,
new ApiFutureCallback<String>() {
@Override
public void onFailure(Throwable throwable) {
if (throwable instanceof ApiException) {
ApiException apiException = ((ApiException) throwable);
// details on the API exception
System.out.println(apiException.getStatusCode().getCode());
System.out.println(apiException.isRetryable());
}
System.out.println("Error publishing message : " + message);
System.out.println("Error publishing error : " + throwable.getMessage());
System.out.println("Error publishing cause : " + throwable.getCause());
}
@Override
public void onSuccess(String messageId) {
// Once published, returns server-assigned message ids (unique within the topic)
System.out.println(messageId);
}
},
MoreExecutors.directExecutor());
}
finally {
if (pubsubPublisher != null) {
// When finished with the publisher, shutdown to free up resources.
pubsubPublisher.shutdown();
pubsubPublisher.awaitTermination(1, TimeUnit.MINUTES);
}
}
return "ok";
}
To receive the message:
@PostMapping("/pushtest")
public String pushTest(@RequestBody CloudPubSubPushMessage request) {
System.out.println( "------> message received: " + decode(request.getMessage().getData()) );
return request.toString();
}
I have created my topic and subscription in the emulator, following this tutorial:
https://cloud.google.com/pubsub/docs/emulator
I set the "pushtest" endpoint to receive push messages from the emulator with this command:
python subscriber.py PUBSUB_PROJECT_ID create-push TOPIC_ID SUBSCRIPTION_ID PUSH_ENDPOINT
But when I run the test, the request doesn't reach the "/pushtest" endpoint and I get this error:
Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@265d5d05
[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@a8c8be3
[Wrapped task = com.google.common.util.concurrent.TrustedListenableFutureTask@1a53c57c
[status=PENDING, info=[task=[running=[NOT STARTED YET], com.google.api.gax.rpc.AttemptCallable@3866e1d0]]]]]
rejected from java.util.concurrent.ScheduledThreadPoolExecutor@3f34809a
[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
To make sure the emulator is running correctly, I ran the test in Python with the following command:
python publisher.py PUBSUB_PROJECT_ID publish TOPIC_ID
And I receive the messages correctly at the "pushtest" endpoint.
I don't know why this happens; sorry for the trouble.
Thanks for your help.
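I found the problem.
The finally block in the bean shuts the channel down right after the Publisher is built, so the Publisher that the bean returns is left with a closed channel. Just comment out this line in the bean:
channel.shutdown();
Very simple in the end.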
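The "rejected from ... [Terminated, ...]" text in the error is what the JDK reports when work is submitted to an executor that has already been shut down. The plain-Java sketch below reproduces that general use-after-shutdown failure with a ScheduledExecutorService. It does not use the Pub/Sub client at all, and the class and method names are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: using a resource after it has been shut down.
class UseAfterShutdownSketch {

    // Returns true if scheduling a task on an already shut down executor is rejected.
    static boolean isRejectedAfterShutdown() {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
        executor.shutdown(); // analogous to shutting a resource down before its owner uses it
        Runnable noop = () -> { };
        try {
            executor.schedule(noop, 10, TimeUnit.MILLISECONDS);
            return false;
        } catch (RejectedExecutionException e) {
            // The default rejection policy throws instead of silently dropping the task.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected after shutdown: " + isRejectedAfterShutdown());
    }
}
```

The same reasoning applies to any object handed to another component: whoever creates it should only shut it down once nothing else still needs it.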

Error in shutting down the Google Pub sub publisher

I started a Google Pub/Sub publisher using the code in [1]. After publishing, I shut the publisher down as shown in [2]. But when I run it I get the error in [3], which says that the publisher was not shut down properly.
I'm using Pub/Sub version 1.61.0.
Is there any way to handle this error?
[1]
public class PublisherExample {
// use the default project id
private static final String PROJECT_ID = ServiceOptions.getDefaultProjectId();
/** Publish messages to a topic.
* @param args topic name, number of messages
*/
public static void main(String... args) throws Exception {
// topic id, eg. "my-topic"
String topicId = args[0];
int messageCount = Integer.parseInt(args[1]);
ProjectTopicName topicName = ProjectTopicName.of(PROJECT_ID, topicId);
Publisher publisher = null;
List<ApiFuture<String>> futures = new ArrayList<>();
try {
// Create a publisher instance with default settings bound to the topic
publisher = Publisher.newBuilder(topicName).build();
for (int i = 0; i < messageCount; i++) {
String message = "message-" + i;
// convert message to bytes
ByteString data = ByteString.copyFromUtf8(message);
PubsubMessage pubsubMessage = PubsubMessage.newBuilder()
.setData(data)
.build();
// Schedule a message to be published. Messages are automatically batched.
ApiFuture<String> future = publisher.publish(pubsubMessage);
futures.add(future);
}
} finally {
// Wait on any pending requests
List<String> messageIds = ApiFutures.allAsList(futures).get();
for (String messageId : messageIds) {
System.out.println(messageId);
}
if (publisher != null) {
// When finished with the publisher, shutdown to free up resources.
publisher.shutdown();
publisher.awaitTermination(1, TimeUnit.MINUTES);
}
}
}
}
[2]
if (publisher != null) {
// When finished with the publisher, shutdown to free up resources.
publisher.shutdown();
publisher.awaitTermination(1, TimeUnit.MINUTES);
}
[3]
io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference cleanQueue
SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=9, target=pubsub.googleapis.com:443} was not shutdown properly!!! ~*~*~*
Make sure to call shutdown()/shutdownNow() and wait until awaitTermination() returns true.
java.lang.RuntimeException: ManagedChannel allocation site
at io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.<init>(ManagedChannelOrphanWrapper.java:103)
at io.grpc.internal.ManagedChannelOrphanWrapper.<init>(ManagedChannelOrphanWrapper.java:53)
at io.grpc.internal.ManagedChannelOrphanWrapper.<init>(ManagedChannelOrphanWrapper.java:44)
at io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:419)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel(InstantiatingGrpcChannelProvider.java:254)
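The log message asks for shutdown() or shutdownNow() followed by awaitTermination() returning true. As a rough illustration of that pattern only, here is a sketch using a JDK ExecutorService in place of the Pub/Sub Publisher or the gRPC channel. The class and method names are hypothetical, and this is not a confirmed fix for the error above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing "shutdown, then wait until awaitTermination returns true".
class GracefulShutdownSketch {

    // Returns true only if the executor fully terminated within the allowed time.
    static boolean shutdownAndWait(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // stop accepting new work, let queued tasks finish
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            executor.shutdownNow(); // give up on the remaining tasks
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Runnable work = () -> System.out.println("publishing...");
        executor.submit(work);
        boolean terminated = shutdownAndWait(executor, 1, TimeUnit.MINUTES);
        System.out.println("terminated cleanly: " + terminated);
    }
}
```

The point of the sketch is that the boolean returned by awaitTermination is checked instead of being ignored, which the code in [2] does not do.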

Manual ACK for AggregatingMessageHandler

I'm trying to build an integration scenario like this: Rabbit -> AmqpInboundChannelAdapter(AcknowledgeMode.MANUAL) -> DirectChannel -> AggregatingMessageHandler -> DirectChannel -> AmqpOutboundEndpoint.
I want to aggregate messages in memory and release them once 10 messages have been aggregated or once a timeout of 10 seconds is reached. I suppose this config is OK:
@Bean
@ServiceActivator(inputChannel = "amqpInputChannel")
public MessageHandler aggregator(){
AggregatingMessageHandler aggregatingMessageHandler = new AggregatingMessageHandler(new DefaultAggregatingMessageGroupProcessor(), new SimpleMessageStore(10));
aggregatingMessageHandler.setCorrelationStrategy(new HeaderAttributeCorrelationStrategy(AmqpHeaders.CORRELATION_ID));
//default false
aggregatingMessageHandler.setExpireGroupsUponCompletion(true); //when grp released (using strategy), remove group so new messages in same grp create new group
aggregatingMessageHandler.setSendPartialResultOnExpiry(true); //when expired because timeout and not because of strategy, still send messages grouped so far
aggregatingMessageHandler.setGroupTimeoutExpression(new ValueExpression<>(TimeUnit.SECONDS.toMillis(10))); //timeout after X
//timeout is checked only when new message arrives!!
aggregatingMessageHandler.setReleaseStrategy(new TimeoutCountSequenceSizeReleaseStrategy(10, TimeUnit.SECONDS.toMillis(10)));
aggregatingMessageHandler.setOutputChannel(amqpOutputChannel());
return aggregatingMessageHandler;
}
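The release rule described above (10 messages, or 10 seconds since the group started) can be sketched in plain Java as follows. This is a simplified stand-in for illustration and not the Spring Integration TimeoutCountSequenceSizeReleaseStrategy itself; the class CountOrTimeoutReleaseSketch and its method are hypothetical. Like the comment in the config says, the decision here is only evaluated when it is called, that is, when a message arrives.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified "count or timeout" release decision.
class CountOrTimeoutReleaseSketch {

    private final int threshold;
    private final long timeoutMillis;

    CountOrTimeoutReleaseSketch(int threshold, long timeoutMillis) {
        this.threshold = threshold;
        this.timeoutMillis = timeoutMillis;
    }

    // Release when enough messages were collected, or when the oldest
    // message in the group has been waiting for at least the timeout.
    boolean canRelease(int groupSize, long firstMessageTimestamp, long now) {
        boolean enoughMessages = groupSize >= threshold;
        boolean waitedTooLong = now - firstMessageTimestamp >= timeoutMillis;
        return enoughMessages || waitedTooLong;
    }

    public static void main(String[] args) {
        CountOrTimeoutReleaseSketch strategy =
                new CountOrTimeoutReleaseSketch(10, TimeUnit.SECONDS.toMillis(10));
        long start = 0L;
        System.out.println(strategy.canRelease(10, start, start + 1));      // full group
        System.out.println(strategy.canRelease(3, start, start + 2_000));   // still waiting
        System.out.println(strategy.canRelease(3, start, start + 10_000));  // timed out
    }
}
```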
Now, my question is: is there any easier way to manually ack messages other than creating my own implementation of AggregatingMessageHandler like this:
public class ManualAckAggregatingMessageHandler extends AbstractCorrelatingMessageHandler {
...
private void ackMessage(Channel channel, Long deliveryTag){
try {
Assert.notNull(channel, "Channel must be provided");
Assert.notNull(deliveryTag, "Delivery tag must be provided");
channel.basicAck(deliveryTag, false);
}
catch (IOException e) {
throw new MessagingException("Cannot ACK message", e);
}
}
@Override
protected void afterRelease(MessageGroup messageGroup, Collection<Message<?>> completedMessages) {
Object groupId = messageGroup.getGroupId();
MessageGroupStore messageStore = getMessageStore();
messageStore.completeGroup(groupId);
messageGroup.getMessages().forEach(m -> {
Channel channel = (Channel)m.getHeaders().get(AmqpHeaders.CHANNEL);
Long deliveryTag = (Long)m.getHeaders().get(AmqpHeaders.DELIVERY_TAG);
ackMessage(channel, deliveryTag);
});
if (this.expireGroupsUponCompletion) {
remove(messageGroup);
}
else {
if (messageStore instanceof SimpleMessageStore) {
((SimpleMessageStore) messageStore).clearMessageGroup(groupId);
}
else {
messageStore.removeMessagesFromGroup(groupId, messageGroup.getMessages());
}
}
}
}
UPDATE
I managed to do it with your help. The most important parts: the connection factory must have factory.setPublisherConfirms(true). The AmqpOutboundEndpoint must have these two settings: outboundEndpoint.setConfirmAckChannel(manualAckChannel()) and outboundEndpoint.setConfirmCorrelationExpressionString("#root"). This is the implementation of the rest of the classes:
public class ManualAckPair {
private Channel channel;
private Long deliveryTag;
public ManualAckPair(Channel channel, Long deliveryTag) {
this.channel = channel;
this.deliveryTag = deliveryTag;
}
public void basicAck(){
try {
this.channel.basicAck(this.deliveryTag, false);
}
catch (IOException e) {
e.printStackTrace();
}
}
}
public abstract class AbstractManualAckAggregatingMessageGroupProcessor extends AbstractAggregatingMessageGroupProcessor {
public static final String MANUAL_ACK_PAIRS = PREFIX + "manualAckPairs";
@Override
protected Map<String, Object> aggregateHeaders(MessageGroup group) {
Map<String, Object> aggregatedHeaders = super.aggregateHeaders(group);
List<ManualAckPair> manualAckPairs = new ArrayList<>();
group.getMessages().forEach(m -> {
Channel channel = (Channel)m.getHeaders().get(AmqpHeaders.CHANNEL);
Long deliveryTag = (Long)m.getHeaders().get(AmqpHeaders.DELIVERY_TAG);
manualAckPairs.add(new ManualAckPair(channel, deliveryTag));
});
aggregatedHeaders.put(MANUAL_ACK_PAIRS, manualAckPairs);
return aggregatedHeaders;
}
}
and
@Service
public class ManualAckServiceActivator {
@ServiceActivator(inputChannel = "manualAckChannel")
public void handle(@Header(MANUAL_ACK_PAIRS) List<ManualAckPair> manualAckPairs) {
manualAckPairs.forEach(manualAckPair -> {
manualAckPair.basicAck();
});
}
}
Right, you don't need such complex logic in the aggregator.
You can simply acknowledge the messages after the aggregator releases them, in a service activator between the aggregator and the AmqpOutboundEndpoint.
And you are right that you have to call basicAck() there with the multiple flag set to true:
@param multiple true to acknowledge all messages up to and
For that purpose you do need a custom MessageGroupProcessor that extracts the highest AmqpHeaders.DELIVERY_TAG of the whole batch and sets it as a header on the aggregated output message.
You might just extend DefaultAggregatingMessageGroupProcessor and override its aggregateHeaders():
/**
* This default implementation simply returns all headers that have no conflicts among the group. An absent header
* on one or more Messages within the group is not considered a conflict. Subclasses may override this method with
* more advanced conflict-resolution strategies if necessary.
*
* @param group The message group.
* @return The aggregated headers.
*/
protected Map<String, Object> aggregateHeaders(MessageGroup group) {
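The core of such an override is picking the highest delivery tag in the group. A minimal plain-Java sketch of just that step is shown below. It works on plain header maps instead of Spring's Message and MessageGroup types, the class name is hypothetical, and the header key "amqp_deliveryTag" is assumed to be the value behind AmqpHeaders.DELIVERY_TAG.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper: finds the highest delivery tag among the headers of a group.
class HighestDeliveryTagSketch {

    // Assumption: this is the key that AmqpHeaders.DELIVERY_TAG resolves to.
    static final String DELIVERY_TAG = "amqp_deliveryTag";

    // Returns the largest delivery tag, or empty if no message carries one.
    static OptionalLong highestDeliveryTag(List<Map<String, Object>> headersOfGroup) {
        return headersOfGroup.stream()
                .map(headers -> headers.get(DELIVERY_TAG))
                .filter(Long.class::isInstance)
                .mapToLong(tag -> (Long) tag)
                .max();
    }

    public static void main(String[] args) {
        Map<String, Object> first = Map.of(DELIVERY_TAG, 3L);
        Map<String, Object> second = Map.of(DELIVERY_TAG, 7L);
        Map<String, Object> untagged = Map.of("other", "value");
        List<Map<String, Object>> group = List.of(first, second, untagged);
        System.out.println(highestDeliveryTag(group));
    }
}
```

Acknowledging only the highest tag with multiple set to true relies on all messages of the group having been delivered on the same channel, since delivery tags are scoped to a channel.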
