Producing messages from Kafka to gRPC in real time [closed] - java

I have a bidirectional stream via gRPC:
@Override
public StreamObserver<MessageReceiver.MessageRequest> biDirectionalMessageStream(StreamObserver<MessageReceiver.MessageResponse> responseObserver) {
    return new StreamObserver<>() {
        @Override
        public void onNext(MessageReceiver.MessageRequest messageRequest) {
            responseObserver.onNext(MessageReceiver.MessageResponse.newBuilder()
                    .setMessage("SomeMessage")
                    .build());
        }

        @Override
        public void onError(Throwable throwable) {
            responseObserver.onError(throwable);
        }

        @Override
        public void onCompleted() {
            responseObserver.onCompleted();
        }
    };
}
The message source is Kafka. Messages in the topic are destined for different gRPC connections, and at the time a message arrives from Kafka, the target connection may not yet be open. How can I, when a gRPC connection opens, first deliver all messages that arrived before the connection was opened and then keep delivering new messages from Kafka in real time?
I see several options:
Keep an in-memory buffer that is filled from Kafka and, when a connection is opened, send the buffered messages to the gRPC channel (a sketch follows below)
Use persistent storage (Redis/Mongo/Postgres etc.) instead of a buffer
Not sure which solution is better (and if the latter, which storage to choose). Load profile: 17,000 RPC.
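For reference, a minimal sketch of option 1 (the in-memory buffer), using plain String payloads in place of the protobuf types and a hypothetical connection id as the routing key:
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Per-connection buffer: holds messages until a gRPC stream registers,
// then hands everything off to a live listener.
public class ConnectionBuffer {
    private final ConcurrentHashMap<String, Queue<String>> pending = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Consumer<String>> listeners = new ConcurrentHashMap<>();

    // Called from the Kafka poll loop for every record.
    public synchronized void onKafkaMessage(String connectionId, String payload) {
        Consumer<String> listener = listeners.get(connectionId);
        if (listener != null) {
            listener.accept(payload); // connection already open: deliver live
        } else {
            pending.computeIfAbsent(connectionId, k -> new ConcurrentLinkedQueue<>()).add(payload);
        }
    }

    // Called when a gRPC stream opens; the listener would wrap responseObserver::onNext.
    public synchronized void register(String connectionId, Consumer<String> listener) {
        Queue<String> buffered = pending.remove(connectionId);
        if (buffered != null) {
            buffered.forEach(listener); // replay everything that arrived before the stream opened
        }
        listeners.put(connectionId, listener);
    }

    public synchronized void unregister(String connectionId) {
        listeners.remove(connectionId);
    }
}
Note that the unbounded queues here are exactly what can cause the OOM scenario mentioned in the answer below; in practice each queue would need a bound or an eviction policy.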

As I understand your question, where you put the data doesn't really matter, assuming you don't care about Kafka retention policies and handle consumed offsets appropriately.
For example, say you add every event to an ArrayList and then commit those offsets, but never establish the connection: you'll lose data. Or you do establish gRPC connectivity, but only deliver some of the data (say, your ArrayList eventually caused an OOM): the data is still lost, and may be duplicated on restart.
If you have full control over the gRPC client sending the initial request, why not have it be a Kafka producer on its own, rather than sending a gRPC request to some "proxy" to do that work?
Regarding storage, you can use any of those options, but Mongo or Postgres would integrate better with Debezium, if you want to use that to actually get data into Kafka instead. But for getting data to gRPC, you don't need to create a consumer until you have actually started the gRPC connection.
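A minimal sketch of that last point, assuming the standard kafka-clients API; the topic name, group id and bootstrap address are placeholders, and MessageReceiver.MessageResponse is the generated type from the question. The consumer is created only once the stream opens, resumes from the last committed offset (so earlier messages are replayed), and commits offsets only after records have been handed to the observer:
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import io.grpc.stub.StreamObserver;

// Runs on its own thread, started from biDirectionalMessageStream(...)
// once the gRPC stream is open. "messages-topic" is a placeholder.
void pumpKafkaToGrpc(StreamObserver<MessageReceiver.MessageResponse> responseObserver) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("group.id", "grpc-bridge");      // committed offsets track delivery progress
    props.put("enable.auto.commit", "false");  // commit manually, only after delivery
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
        consumer.subscribe(List.of("messages-topic"));
        while (!Thread.currentThread().isInterrupted()) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                responseObserver.onNext(MessageReceiver.MessageResponse.newBuilder()
                        .setMessage(record.value())
                        .build());
            }
            consumer.commitSync(); // offsets advance only once records reached the stream
        }
    }
}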

Since Kafka is pull-based, it is better to have the gRPC client pull messages from Kafka itself, so you don't need a buffer between them.

Related

Slow message consumption using AmazonSQSClient

So, I used a concurrency of 50-100 in Spring JMS, allowing max connections up to 200. Everything is working as expected, but then I try to retrieve 100k messages from the queue; I mean, there are 100k messages on my SQS queue and I am reading them through the normal Spring JMS approach:
@JmsListener
public void process(String message) {
    count++;
    System.out.println(count);
    // code
}
I am seeing all the logs in my console, but after around 17k messages it starts throwing exceptions, something like: AWS SDK exception: port already in use.
Why do I see this exception and how do I get rid of it? I tried looking on the internet but couldn't find anything.
My settings:
Concurrency: 50-100
Messages per task: 50
Acknowledge mode: client acknowledge
timestamp=10:27:57.183, level=WARN , logger=c.a.s.j.SQSMessageConsumerPrefetch, message={ConsumerPrefetchThread-30} Encountered exception during receive in ConsumerPrefetch thread,
javax.jms.JMSException: AmazonClientException: receiveMessage.
at com.amazon.sqs.javamessaging.AmazonSQSMessagingClientWrapper.handleException(AmazonSQSMessagingClientWrapper.java:422)
at com.amazon.sqs.javamessaging.AmazonSQSMessagingClientWrapper.receiveMessage(AmazonSQSMessagingClientWrapper.java:339)
at com.amazon.sqs.javamessaging.SQSMessageConsumerPrefetch.getMessages(SQSMessageConsumerPrefetch.java:248)
at com.amazon.sqs.javamessaging.SQSMessageConsumerPrefetch.run(SQSMessageConsumerPrefetch.java:207)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Address already in use: connect
Update: I looked into the problem and it seems that new sockets are being created until all sockets are exhausted.
My Spring JMS version is 4.3.10.
To replicate this problem, use the above configuration with max connections set to 200 and concurrency set to 50-100, then push some 40k messages to the SQS queue. You can use https://github.com/adamw/elasticmq as a local stack server that replicates Amazon SQS. Comment out the @JmsListener annotation and use SoapUI load testing to call send-message and fire many messages; because the @JmsListener annotation is commented out, nothing will consume messages from the queue. Once you see that you have sent 40k messages, stop, uncomment @JmsListener, and restart the server.
Update:
DefaultJmsListenerContainerFactory factory = new DefaultJmsListenerContainerFactory();
factory.setConnectionFactory(connectionFactory);
factory.setDestinationResolver(new DynamicDestinationResolver());
factory.setErrorHandler(Throwable::printStackTrace);
factory.setConcurrency("50-100");
factory.setSessionAcknowledgeMode(Session.CLIENT_ACKNOWLEDGE);
return factory;
Update:
SQSConnectionFactory connectionFactory = new SQSConnectionFactory(new ProviderConfiguration(), amazonSQSclient);
Update:
Client configuration details:
Protocol: HTTP
Max connections: 200
Update: I used the CachingConnectionFactory class, and it seems I read on Stack Overflow and in the official documentation not to use CachingConnectionFactory together with DefaultJmsListenerContainerFactory:
https://stackoverflow.com/a/21989895/5871514
It gives the same error that I got before, though.
Update: My goal is to reach 500 TPS, i.e. I should be able to consume that much. With this method I seem to reach 100-200, but not more than that, and it becomes a blocker at high concurrency. If you have a better solution to achieve it, I am all ears.
Update: I am using AmazonSQSClient.
Starvation on the Consumer
One possible optimization that JMS clients tend to implement is a message consumption buffer, or "prefetch". This buffer is sometimes tunable via the number of messages or a buffer size in bytes.
The intention is to prevent the consumer from going to the server every single time it receives a message, by pulling multiple messages in a batch instead.
In an environment where you have many "fast consumers" (which is the opinionated view these libraries may take), this prefetch is set to a somewhat high default in order to minimize these round trips.
However, in an environment with slow message consumers, this prefetch can be a problem: a slow consumer holds up message consumption of its prefetched messages from the faster consumers. In a highly concurrent environment, this can cause starvation quickly.
That being the case, the SQSConnectionFactory has a property for this:
SQSConnectionFactory sqsConnectionFactory = new SQSConnectionFactory( new ProviderConfiguration(), amazonSQSclient);
sqsConnectionFactory.setNumberOfMessagesToPrefetch(0);
Starvation on the Producer (i.e. via JmsTemplate)
It's very common for these JMS implementations to expect to be interfaced with the broker via some intermediary. These intermediaries actually cache and reuse connections or use a pooling mechanism to reuse them. In the Java EE world, this is usually taken care of by a JCA adapter or another facility of a Java EE server.
Because of the way Spring JMS works, it expects an intermediary delegate for the ConnectionFactory to exist to do this caching/pooling. Otherwise, when Spring JMS wants to connect to the broker, it will attempt to open a new connection and session (!) every time you want to do something with the broker.
To solve this, Spring provides a few options, the simplest being the CachingConnectionFactory, which caches a single Connection and allows many Sessions to be opened on that Connection. A simple way to add this to your @Configuration above would be something like:
@Bean
public ConnectionFactory connectionFactory(AmazonSQSClient amazonSQSclient) {
    SQSConnectionFactory sqsConnectionFactory = new SQSConnectionFactory(new ProviderConfiguration(), amazonSQSclient);
    // Doing the following is key!
    CachingConnectionFactory cachingConnectionFactory = new CachingConnectionFactory();
    cachingConnectionFactory.setTargetConnectionFactory(sqsConnectionFactory);
    // Set the cachingConnectionFactory properties to your liking here...
    return cachingConnectionFactory;
}
If you want something more fancy as a JMS pooling solution (which will pool Connections and MessageProducers for you in addition to multiple Sessions), you can use the reasonably new PooledJMS project's JmsPoolConnectionFactory, or the like, from their library.
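For illustration, a minimal sketch of that pooled alternative, assuming the org.messaginghub:pooled-jms artifact is on the classpath; the pool sizes are placeholders, not recommendations:
import javax.jms.ConnectionFactory;
import org.messaginghub.pooled.jms.JmsPoolConnectionFactory;

// Pools Connections, Sessions and MessageProducers on top of the
// underlying SQS connection factory instead of opening new ones each time.
public ConnectionFactory pooledConnectionFactory(ConnectionFactory sqsConnectionFactory) {
    JmsPoolConnectionFactory pool = new JmsPoolConnectionFactory();
    pool.setConnectionFactory(sqsConnectionFactory);
    pool.setMaxConnections(8);              // placeholder pool sizes
    pool.setMaxSessionsPerConnection(100);
    return pool;
}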

Mutex for websocket in spring boot

I have a problem, and I don't know exactly what to search for.
I have a Spring Boot app which broadcasts messages via WebSocket with a STOMP JavaScript client. The question is whether I can put a lock on a message while it is being sent, because I want no one else to send another message at the same time. The system that I want to make is like a traffic light.
If you can, give me an example or tell me what to look for.
You should use the synchronized keyword and wait for the client's response. The synchronized keyword ensures that only one thread can execute the method at a time. And you need the client's response because you can send two messages sequentially, say two seconds apart, but your client may get them at the same time. The response can be some dummy OK message.
public class Traffic {
    synchronized void send() {
        // write message to websocket
        // read response from websocket
    }
}
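Expanding that skeleton into a fuller (still hypothetical) sketch: sendToClient(...) stands in for the actual STOMP/WebSocket write, and the inbound handler feeds an ack queue, so synchronized serializes senders and each send blocks until the client's dummy OK arrives or a timeout expires:
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class Traffic {
    // Filled by the inbound message handler whenever the client replies "OK".
    private final BlockingQueue<String> acks = new LinkedBlockingQueue<>();

    // Only one thread at a time may send; it holds the lock until acked.
    public synchronized boolean send(String message) throws InterruptedException {
        sendToClient(message);                         // hypothetical transport call
        String ack = acks.poll(5, TimeUnit.SECONDS);   // wait for the dummy OK
        return ack != null;
    }

    // Called from the WebSocket/STOMP inbound handler on the client's reply.
    public void onClientReply(String reply) {
        acks.offer(reply);
    }

    private void sendToClient(String message) {
        // write the message to the websocket session here
    }
}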

Emergency storage for java

I am writing a transport adapter for messages (I receive a message from Java native methods and send it to a RabbitMQ queue) and I must not lose any messages (for example, when the connection to the RabbitMQ server is unavailable). I need persistent storage for my messages for when processing or sending fails. Currently I use the mapDB library and its queue implementation:
public void send(byte[] message) {
    queue.add(message);
    db.commit();
    // do process and send the message
    queue.poll(); // message sent successfully, we can remove it from storage
    db.commit();
}
But this implementation is very slow. Please advise the best implementation for this case.
The messages must stay in the right order.
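One way to see where the time goes: the snippet above pays two synchronous commits (fsyncs) per message. A minimal sketch, reusing the same queue and db objects from the snippet, that keeps the per-message commit guaranteeing durability before the send but batches the removal commits; the trade-off is possible duplicate sends after a crash (at-least-once delivery), while ordering is preserved by the queue:
private int unacked = 0;
private static final int ACK_BATCH = 100; // placeholder batch size

public void send(byte[] message) {
    queue.add(message);
    db.commit();          // message is safe on disk before we try to send it
    // do process and send the message
    queue.poll();         // sent successfully, remove from storage
    if (++unacked >= ACK_BATCH) {
        db.commit();      // persist removals in one fsync per batch
        unacked = 0;
    }
}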

how to minimize resource drain of port listener when data transfer is only secondary function [closed]

I have a program whose primary function is to perform computations. I then want a client program to be able to ping the server program and receive a response associated with the state of the main program and the progress of the computation.
I need the server to be listening to a port yet not have the listener consuming too many resources that would be better spent on the computation.
The two ways I've thought about doing it so far are:
1. having a single thread constantly listening;
2. (since I know the IP addresses of all the client machines in advance) having the computational computer act as the client and the recipients of the data act as the servers. I realize this is counter-intuitive, but the client (the computer doing the work) could simply send out a packet containing the data to any servers currently listening on the port on a sporadic basis, and would therefore exhaust resources less frequently.
Neither of these options feels like the best one available, but lacking experience using sockets, I don't know what my best option would be.
So, to avoid this sounding like I'm asking for opinions, I would simply like you to state whether you've encountered a similar scenario and how you accommodated it. No opinions about which is better than which, just quantifiable facts about what options are out there.
Also, I've looked at questions discussing the difference between server sockets and RMI, and have concluded that for my scenario RMI would not be the best option, since the computer receiving the data won't be doing any computations of its own. But please correct this train of thought if it is incorrect.
You can run a secondary thread and put it to sleep from time to time, but it is not the best way to spend resources, since this thread will always be in the runnable pool, consuming resources:
public void run() {
    while (true) {
        // do everything you need
        try {
            // sleep for 1000 milliseconds
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            System.out.println(e);
        }
    }
}
You could also implement some web service API, if it is possible to make GET or POST requests, so this could be a better approach.
Here is a good tutorial: http://www.vogella.com/tutorials/REST/article.html
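A minimal sketch of such a status endpoint using the JDK's built-in com.sun.net.httpserver, so no framework is needed; the /status path, port, and progress field are illustrative:
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicInteger;

public class StatusServer {
    // Updated by the computation thread; read by the handler.
    static final AtomicInteger progress = new AtomicInteger();

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/status", exchange -> {
            byte[] body = ("progress=" + progress.get() + "%").getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start(); // default executor; negligible cost while idle
        // ... run the computation on the main thread, updating `progress` ...
    }
}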
Using sockets is a good idea too, since it is simple, but remember to keep the proxy and the ports you choose always open. Here is a nice example:
Server:
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Scanner;

public class Servidor {
    public static void main(String[] args) throws IOException {
        ServerSocket servidor = new ServerSocket(12345);
        System.out.println("Port 12345 open!");
        Socket cliente = servidor.accept();
        System.out.println("new client connection " + cliente.getInetAddress().getHostAddress());
        Scanner s = new Scanner(cliente.getInputStream());
        while (s.hasNextLine()) {
            System.out.println(s.nextLine());
        }
        s.close();
        servidor.close();
        cliente.close();
    }
}
Client:
import java.io.IOException;
import java.io.PrintStream;
import java.net.Socket;
import java.net.UnknownHostException;
import java.util.Scanner;

public class Cliente {
    public static void main(String[] args) throws UnknownHostException, IOException {
        Socket cliente = new Socket("127.0.0.1", 12345);
        System.out.println("Client connected to server!");
        Scanner teclado = new Scanner(System.in);
        PrintStream saida = new PrintStream(cliente.getOutputStream());
        while (teclado.hasNextLine()) {
            saida.println(teclado.nextLine());
        }
        saida.close();
        teclado.close();
        cliente.close();
    }
}
Just use a thread and blocking I/O. Keep it simple. A blocked thread doesn't consume any resources at all except the memory for its stack.

Preventing RabbitMQ from blocking upstream services

I have a Spring application that consumes messages on a specific port (say 9001), restructures them, and then forwards them to a RabbitMQ server. The code segment is:
private void send(String routingKey, String message) throws Exception {
    String exchange = applicationConfiguration.getAMQPExchange();
    String exchangeType = applicationConfiguration.getAMQPExchangeType();
    Connection connection = myConnection.getConnection();
    Channel channel = connection.createChannel();
    channel.exchangeDeclare(exchange, exchangeType);
    channel.basicPublish(exchange, routingKey, null, message.getBytes());
    log.debug(" [CORE: AMQP] Sent message with key {} : {}", routingKey, message);
}
If the RabbitMQ server fails (crashes, runs out of RAM, is turned off, etc.), the code above blocks, preventing the upstream service from receiving messages (a bad thing). I am looking for a way to prevent this behaviour while not losing messages, so that at some time in the future they can be resent.
I am not sure how best to address this. One option may be to queue the messages to a disk file and then use a separate thread to read them and forward them to the RabbitMQ server?
If I understand correctly, the issue you are describing is a known JDK socket behaviour when the connection is lost mid-write. See this mailing list thread: http://markmail.org/thread/3vw6qshxsmu7fv6n.
Note that if RabbitMQ is shut down, the TCP connection should be closed in a way that's quickly observable by the client. However, it is true that stale TCP connections can take a while to be detected, which is why RabbitMQ's core protocol has heartbeats. Set the heartbeat interval to a low value (say, 6-8 seconds) and the client itself will notice an unresponsive peer within that amount of time.
You need to use publisher confirms [1], but also account for the fact that the app itself can go down right before sending a message. As you rightly point out, having a disk-based WAL (write-ahead log) is a common solution for this problem. Note that it is both quite tricky to get right and still leaves some time window where your app process shutting down can result in an unpublished and unlogged message.
No promises on the time frame, but the idea of adding a WAL to the Java client has been discussed.
[1] http://www.rabbitmq.com/confirms.html
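A minimal sketch of both suggestions using the standard RabbitMQ Java client (com.rabbitmq:amqp-client); the exchange name, heartbeat value, and confirm timeout are illustrative, not recommendations:
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class ConfirmedPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        // Low heartbeat so an unresponsive broker is noticed within seconds
        factory.setRequestedHeartbeat(6);
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.confirmSelect(); // enable publisher confirms on this channel
            channel.exchangeDeclare("my-exchange", "topic");
            channel.basicPublish("my-exchange", "my.key", null, "payload".getBytes());
            // Block until the broker confirms the publish (or fail after 5s);
            // on failure the message is still in hand and can go to the WAL.
            channel.waitForConfirmsOrDie(5000);
        }
    }
}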
