Prevent Flux Publisher from going out of memory

I'm trying to use a Flux to stream events to subscribers over RSocket. There can be a huge backlog of events (in the database), and they must be sent out in order, without any gaps, and without flooding either the publisher (out of memory) or the consumer. None of the OverflowStrategy values seems suitable (a usage sketch follows the list):
IGNORE: ignores downstream demand entirely, so it just postpones the failure (the downstream can still end up with an IllegalStateException)
ERROR: I'd like to block (or get a callback when there's more demand), not get an error
DROP: bad, because events cannot be skipped (no gaps)
LATEST: bad, because events cannot be skipped (no gaps)
BUFFER: leads to out of memory on publisher
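For reference, these strategies are the optional second argument to Flux.create; a minimal sketch (my own, not from my actual code) of where they plug in:

import reactor.core.publisher.Flux;
import reactor.core.publisher.FluxSink;

Flux<String> events = Flux.create(
        sink -> { /* push events with sink.next(...) */ },
        FluxSink.OverflowStrategy.BUFFER); // or IGNORE, ERROR, DROP, LATEST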
I have everything working, but if I don't limit the rate in the subscriber, the publisher side goes out of memory -- that's bad, as a single misbehaving subscriber could kill my service. So I'm evidently misunderstanding how back pressure works. Everywhere I look there is talk of limitRate. It works, but only on the subscriber side; using limitRate on the publisher side has no effect at all.
I've used Flux.generate and Flux.create to produce the events on the publisher side, but they don't seem to respond to back pressure at all. So I must be missing something, as the whole back pressure mechanism in Reactor is described as very transparent and easy to use...
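Standalone (plain Reactor, no RSocket in between), Flux.generate does appear to honor demand when the subscriber requests explicitly; a minimal sketch of what I'd expect (assuming stock Reactor, nothing else):

import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;
import reactor.core.publisher.Flux;

Flux<Long> numbers = Flux.generate(
        () -> 0L,                  // initial state
        (offset, sink) -> {
            sink.next(offset);     // the generator runs once per requested element
            return offset + 1;
        });

numbers.subscribe(new BaseSubscriber<Long>() {
    @Override
    protected void hookOnSubscribe(Subscription subscription) {
        request(10); // only 10 elements are generated until more are requested
    }

    @Override
    protected void hookOnNext(Long value) {
        System.out.println(value);
    }
});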
Here's my publisher:
#MessageMapping("events")
public Flux<String> events(String data) {
Flux<String> flux = Flux.generate(new Consumer<SynchronousSink<String>>() {
long offset = 0;
#Override
public void accept(SynchronousSink<String> emitter) {
emitter.next("" + offset++);
}
});
return flux.limitRate(100); // limitRate doesn't do anything
}
And my consumer:
@Autowired RSocketRequester requester;

@EventListener(ApplicationReadyEvent.class)
public void run() throws InterruptedException {
    requester.route("events")
            .data("Just Go")
            .retrieveFlux(String.class)
            //.limitRate(1000) // commenting this line makes publisher go OOM
            .bufferTimeout(20000, Duration.ofMillis(10))
            .subscribe(new Consumer<List<String>>() {
                long totalReceived = 0;
                long totalBytes = 0;

                @Override
                public void accept(List<String> s) {
                    totalReceived += s.size();
                    totalBytes += s.stream().mapToInt(String::length).sum();
                    System.out.printf("So we received: %4d messages @ %8.1f msg/sec (%d kB/sec)\n",
                            s.size(),
                            ((double) totalReceived / (System.currentTimeMillis() - time)) * 1000,
                            totalBytes / (System.currentTimeMillis() - time)); // 'time' is a start timestamp field (not shown)
                    try {
                        Thread.sleep(200); // delay the consumer so the publisher has to slow down
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            });
    Thread.sleep(100000); // leave Spring running for a bit (dirty)
}
What I don't understand is why this doesn't work. generate uses a callback, but it keeps getting called as fast as possible, leading to huge memory allocation in the JVM until it goes OOM. Why does it keep calling generate?
What am I missing?

Related

How should I implement Vertx resiliency with retrying until some condition is met?

I would like to implement resiliency in my Vertx application. The case is that I have 2 application instances (primary and secondary), and I would like to keep trying to send messages from the primary to the secondary as long as some condition is not met. In this way I want to implement resiliency: when the primary's network has no internet connection, keep trying to send the same message over and over. Let's say the condition is some boolean flag. I would like to achieve something like this (in pseudocode):
while (condition not met) {
    boolean messageSendingSuccess = tryToSendMessage(secondaryAddress);
    if (messageSendingSuccess) return;
}
In reality I would like to build a whole queue of messages and implement the solution for that whole queue (until the first message is sent, wait; when the first message is sent, start trying to send the next message, and so on), but the above is the scenario for one message.
How can I achieve that? I know that the CircuitBreaker (https://vertx.io/docs/vertx-circuit-breaker/java/) will not help me here, because it retries a given number of times. I do not want to stop trying after a given number of attempts or after some time; I would like to stop trying only when some condition is met.
The solution from this webpage looks interesting: https://blog.axway.com/product-insights/amplify-platform/application-integration/vert-x-how-to-handle-retry-with-the-eventbus, but I think this solution only covers communication inside the application (not across the network). I do not see where I could specify the network address to which the message should be sent in this code:
public class RetryWithHandlerVerticle extends AbstractVerticle {
    @Override
    public void start() throws Exception {
        vertx.eventBus()                                          // (1)
             .send("hello.handler.failure.retry",                 // (2)
                   "Hello",                                       // (3)
                   new Handler<AsyncResult<Message<String>>>() {  // (4)
                       private int count = 1;

                       @Override
                       public void handle(final AsyncResult<Message<String>> aResult) {
                           if (aResult.succeeded()) {             // (5)
                               System.out.printf("received: \"%s\"\n", aResult.result().body());
                           } else if (count < 3) {                // (6)
                               System.out.printf("retry count %d, received error \"%s\"\n",
                                       count, aResult.cause().getMessage());
                               vertx.eventBus().send("hello.handler.failure.retry", "Hello", this); // (7)
                               count = count + 1;
                           } else {
                               aResult.cause().printStackTrace(); // (8)
                           }
                       }
                   });
    }
}
The Vert.x event bus does not store messages and thus does not offer any delivery guarantee. In other words, point-to-point messaging is like making an HTTP request: if you don't get an acknowledgement from the receiver, you don't know whether the request succeeded.
Consequently, you have to deal with it as with an HTTP interaction: design your system with commands that are idempotent. Then, if you receive an error (failure or timeout), you can retry until the command is processed and acknowledged.
The Vert.x circuit breaker has built-in retry policies, but it is your job to design the commands and implement the receiver side for idempotency.
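A minimal sketch of that retry-until-acknowledged loop (my own illustration, not from the Vert.x docs; conditionMet() and RETRY_DELAY_MS are placeholders for the application's abort check and backoff policy), using the event bus request/reply API:

import io.vertx.core.Vertx;

public class RetryUntilConditionMet {
    private static final long RETRY_DELAY_MS = 1000; // assumed backoff, tune as needed

    // Retries until the receiver acknowledges or the abort condition is met.
    void sendWithRetry(Vertx vertx, String address, String body) {
        vertx.eventBus().request(address, body, ar -> {
            if (ar.succeeded()) {
                System.out.println("acknowledged: " + ar.result().body());
            } else if (!conditionMet()) {
                // no attempt cap: schedule another try after a delay
                vertx.setTimer(RETRY_DELAY_MS, id -> sendWithRetry(vertx, address, body));
            } else {
                ar.cause().printStackTrace(); // condition met: stop retrying
            }
        });
    }

    private boolean conditionMet() {
        return false; // hypothetical: replace with the real abort condition
    }
}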

Aborting a library operation?

I'm using a 3rd party library, which has a method:
secureSend(int channel, byte[] data);
This method sends my binary data to the library, and if the data is larger than 64K, the method splits it into 64K chunks and sends them in order.
This method is marked as blocking, so it won't return immediately. Therefore it is also advised to spawn a thread for each use of this function:
new Thread(new Runnable() {
    public void run() {
        library.secureSend(channel, mydata);
    }
}).start();
If I'm sending larger data (>1 MB), it takes about 30 seconds. This is fine.
However, sometimes I need to interrupt the sending because there is higher-priority data to send.
Currently, if I spawn a new thread calling secureSend, it has to wait, as the library operates in a FIFO manner, i.e. it first finishes the previous sends.
I decompiled the library's class files; secureSend has roughly the following pseudo algorithm:
public synchronized void secureSend(int c, byte[] data) {
    try {
        local_data = data;
        HAS_MORE_DATA_TO_SEND = (local_data.length > 0);
        while (HAS_MORE_DATA_TO_SEND) {
            // calculates offset and length, and returns whether more remains; operates on local_data!
            HAS_MORE_DATA_TO_SEND = sendChunk(...);
        }
    } catch (IOException ex) {}
}
I've tried to interrupt the thread (I had stored a reference to it), but it didn't help.
The library spends a lot of time in that while loop; note that it also catches (and swallows) IOException.
My question: can I somehow interrupt/kill/abort this function call? Maybe by somehow throwing an IOException into the thread? Is this at all possible?
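For illustration only (an assumption on my part, not a verified fix for this library): the usual pattern is to run the blocking call through an ExecutorService and cancel its Future. This only helps if sendChunk(...) blocks on interruptible I/O or checks the thread's interrupt flag; a loop that does neither will ignore the interrupt.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

ExecutorService sender = Executors.newSingleThreadExecutor();
Future<?> sendTask = sender.submit(() -> library.secureSend(channel, mydata));

// later, when higher-priority data arrives:
sendTask.cancel(true); // delivers an interrupt to the sending thread

If the underlying stream or socket is accessible, closing it from another thread is often the more reliable way to force that IOException and break the loop.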

High velocity stream - Reactive Java or Kafka Streams

I have a high velocity endpoint with a certain payload.
I have 800 listener threads active on a queue.
I then need to generate events from that payload; there can be X events per payload.
I then need to aggregate these events into buckets of Y that will be sent to another service via HTTP.
The HTTP calls take time, so I need them to be async and in parallel (an additional Z threads to make these calls).
The flow needs to be highly efficient.
What I have right now is this:
The HTTP listeners call a service which generates the events, puts them into a ConcurrentLinkedQueue, and then calls ResettableCountDownLatch.countDown() (the latch is set to wait for Y).
I have a thread pool where each executor awaits the ResettableCountDownLatch, then checks whether the queue holds more than Y events; if so, it polls Y events from the queue and makes the call.
I would love to hear if there is a more efficient way to do this, or about any open source projects that fit here (I've heard of Reactive Java (on Spring Boot 2) and Kafka Streams):
doneSignal = new ResettableCountDownLatch(y);
executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(senderThreads);
executor.submit(() -> {
    while (true) {
        try {
            doneSignal.await();
            doneSignal.reset();
            if (concurrentQueue.size() > bulkSize) {
                log.debug("concurrentQueue size is: {}, going to start new thread", concurrentQueue.size());
                executor.submit(() -> {
                    long start = System.currentTimeMillis();
                    while (concurrentQueue.size() > bulkSize) {
                        List<ObjectNode> events = populateList(concurrentQueue, bulkSize);
                        log.debug("got: {} from concurrentQueue, it took: {}", events.size(), (System.currentTimeMillis() - start));
                        String eventWrapper = wrapEventsInBulk(events);
                        sendEventToThirdPary(eventWrapper);
                    }
                });
            }
        } catch (Exception e) {
        }
    }
});
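For comparison, a minimal Reactor sketch of the same pipeline (my own illustration, assuming Reactor 3.4+; wrapEventsInBulk, sendEventToThirdPary, bulkSize and senderThreads are the names from the question):

import java.time.Duration;
import reactor.core.publisher.Mono;
import reactor.core.publisher.Sinks;
import reactor.core.scheduler.Schedulers;

Sinks.Many<ObjectNode> sink = Sinks.many().multicast().onBackpressureBuffer();

// listener threads publish events here instead of into the ConcurrentLinkedQueue;
// concurrent emitters must handle tryEmitNext() failures (Sinks expect serialized
// emission), e.g. by retrying or synchronizing:
// sink.tryEmitNext(event);

sink.asFlux()
    .bufferTimeout(bulkSize, Duration.ofMillis(100))   // buckets of Y, flushed on a timeout
    .flatMap(events -> Mono.fromRunnable(() ->
                sendEventToThirdPary(wrapEventsInBulk(events)))
            .subscribeOn(Schedulers.boundedElastic()), // blocking HTTP off the event path
        senderThreads)                                 // at most Z calls in flight
    .subscribe();

bufferTimeout gives the same "bucket of Y, but don't wait forever" behavior as the latch, and the flatMap concurrency argument replaces the hand-rolled thread pool.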

Limiting rate of requests with Reactor

I'm using Project Reactor to load data from a web service via REST. This is done in parallel with multiple threads. I'm starting to hit rate limits on the web service, so I would like to send at most 10 requests per second to avoid these errors. How would I do that using Reactor?
Using zipWith(Mono.delayMillis(100))? Or is there some better way?
Thank you
You can use delayElements instead of the whole zipWith.
One could use Flux.delayElements to process a batch of 10 requests every second; be aware though that if the processing takes longer than 1s, the next batch will still be started in parallel and hence processed together with the previous one (and potentially many other previous ones)!
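For that first idea, a minimal per-element sketch (my own; fetch(i) is a hypothetical Mono-returning HTTP call):

// spacing elements 100 ms apart caps the rate at ~10 requests per second
Flux.range(1, 100)
    .delayElements(Duration.ofMillis(100))
    .flatMap(i -> fetch(i))
    .subscribe();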
That's why I propose another solution, where a batch of 10 requests is still processed every second but, if its processing takes longer than 1s, the next batch will fail (see the overflow IllegalStateException); one could handle that failure so that the overall processing continues, but I won't show that here because I want to keep the example simple; see onErrorResume, which is useful for handling the overflow IllegalStateException.
The code below will do a GET on https://www.google.com/ at a rate of 10 requests per second. You'll have to make additional changes to support the situation where your server is not able to process all 10 requests within 1s; you could, for example, simply skip sending new requests while those from the previous second are still being processed by your server.
@Test
void parallelHttpRequests() {
    // this is just for limiting the test running period, otherwise you don't need it
    int COUNT = 2;

    // use whatever (blocking) http client you desire;
    // when using e.g. WebClient (Spring, non-blocking client)
    // the example changes slightly: you'd no longer use
    // subscribeOn(Schedulers.elastic())
    RestTemplate client = new RestTemplate();

    // exit, lock, condition are provided to allow one to run
    // all this code in a @Test, otherwise they won't be needed
    var exit = new AtomicBoolean(false);
    var lock = new ReentrantLock();
    var condition = lock.newCondition();

    MessageFormat message = new MessageFormat("#batch: {0}, #req: {1}, resultLength: {2}");
    Flux.interval(Duration.ofSeconds(1L))
        .take(COUNT) // this is just for limiting the test running period, otherwise you don't need it
        .doOnNext(batch -> debug("#batch", batch)) // just for debugging
        .flatMap(batch -> Flux.range(1, 10) // 10 requests per 1 second
            .flatMap(i -> Mono.fromSupplier(() ->
                    client.getForEntity("https://www.google.com/", String.class).getBody()) // your request goes here (1 of 10)
                .map(s -> message.format(new Object[]{batch, i, s.length()})) // the request's result becomes the output of message.format(...)
                .doOnSubscribe(s -> debug("doOnSubscribe: #batch = " + batch + ", i = " + i)) // just for debugging
                .subscribeOn(Schedulers.elastic()) // one I/O thread per request
            )
        )
        // consider using onErrorResume to handle overflow IllegalStateException
        .subscribe(
            s -> debug("received", s), // do something with the above request's result
            e -> {
                // pay special attention to the overflow IllegalStateException
                debug("error", e.getMessage());
                signalAll(exit, condition, lock);
            },
            () -> {
                debug("done");
                signalAll(exit, condition, lock);
            }
        );
    await(exit, condition, lock);
}
// you won't need the "await" and "signalAll" methods below; I created
// them only to make it easier to run this in a test class
private void await(AtomicBoolean exit, Condition condition, Lock lock) {
    lock.lock();
    while (!exit.get()) {
        try {
            condition.await();
        } catch (InterruptedException e) {
            // maybe a spurious wakeup
            e.printStackTrace();
        }
    }
    lock.unlock();
    debug("exit");
}

private void signalAll(AtomicBoolean exit, Condition condition, Lock lock) {
    exit.set(true);
    try {
        lock.lock();
        condition.signalAll();
    } finally {
        lock.unlock();
    }
}

Jetty - Possible memory leak when using websockets and ByteBuffer

I'm using Jetty 9.3.5.v20151012 to deliver a large number of events to clients over websockets. Each event consists of 3 parts: a number, an event type, and a timestamp; each event is serialized as a byte[] and sent using a ByteBuffer.
After a certain number of hours/days, depending on the number of clients, I notice an increase in heap memory that the GC cannot recover.
When the heap (set to 512 MB) is almost full, the memory used by the JVM is about 700-800 MB and the CPU is at 100% (it seems the GC is running very often, trying to clean up). Right after starting Jetty, memory is at about 30 MB after a GC run, but over time this number increases more and more. Eventually the process is killed.
I'm using jvisualvm as a profiler for memory leak debugging and attached some screenshots of the heap dump.
Here is the main code that handles the message sending using ByteBuffer:
I basically have a method that creates a byte[] (fullbytes) for all events that need to be sent in one message:
byte[] numberBytes = ByteBuffer.allocate(4).putFloat(number).array();
byte[] eventBytes = ByteBuffer.allocate(2).putShort(event).array();
byte[] timestampBytes = ByteBuffer.allocate(8).putDouble(timestamp).array();

for (int i = 0; i < eventBytes.length; i++) {
    fullbytes[i + scount * eventLength] = eventBytes[i];
}
for (int i = 0; i < numberBytes.length; i++) {
    fullbytes[eventBytes.length + i + scount * eventLength] = numberBytes[i];
}
for (int i = 0; i < timestampBytes.length; i++) {
    fullbytes[numberBytes.length + eventBytes.length + i + scount * eventLength] = timestampBytes[i];
}
And then another method (called in a separate thread) that sends the bytes over the websocket:
ByteBuffer bb = ByteBuffer.wrap(fullbytes);
wsSession.getRemote().sendBytesByFuture(bb);
bb.clear();
As I've read in a few places (in the documentation, and here and here), this issue should not appear since I'm not using direct ByteBuffers. Could this be a bug related to Jetty / websockets?
Please advise!
EDIT:
I've run some more tests and noticed that the problem appears when sending messages to a client that is no longer connected but for which Jetty has not received the onClose event (e.g. a user puts his laptop in standby). Because the close event is not triggered, the server code doesn't unregister the client and keeps trying to send messages to it. I don't know why, but the close event is only received after 1 or 2 hours. Also, sometimes (I don't know the context yet), although the event is received and the client (socket) is unregistered, a reference to a WebSocketSession object for that client still hangs around. I haven't found out why this happens yet.
Until then, I have 2 possible workarounds, but I have no idea how to achieve them (they would have other good uses as well):
Always detect when a connection is not open (or closed temporarily, e.g. a user puts a laptop in standby). I tried using sendPing() and implementing onFrame(), but I couldn't find a solution. Is there a way to do this?
Periodically "flush" the buffer. How can I discard the messages that were not sent to the client, so they don't keep queuing up?
EDIT 2
This may be taking the topic in another direction, so I made another post here.
EDIT 3
I've done some more tests regarding the large number of messages/bytes sent, and I found out why it only seemed that the memory leak appeared sometimes: when bytes are sent asynchronously on a different thread than the one servlet.configure() is called on, then after a large build-up the memory is not released once the client disconnects. Also, I could not reproduce the memory leak with the blocking sendBytes(ByteBuffer), only with sendBytesByFuture(ByteBuffer) and sendBytes(ByteBuffer, WriteCallback).
This seems very strange, but I don't believe I'm doing something "wrong" in the tests.
Code:
@Override
public void configure(WebSocketServletFactory factory) {
    factory.getPolicy().setIdleTimeout(1000 * 0);
    factory.setCreator(new WebSocketCreator() {
        @Override
        public Object createWebSocket(ServletUpgradeRequest req,
                                      ServletUpgradeResponse resp) {
            return new WSTestMemHandler();
        }
    });
}

@WebSocket
public class WSTestMemHandler {
    private static int connections = 0; // shared connection counter (declaration not shown in the original)
    private boolean connected = false;
    private int n = 0;

    public WSTestMemHandler() {
    }

    @OnWebSocketClose
    public void onClose(int statusCode, String reason) {
        connected = false;
        connections--;
        //print debug
    }

    @OnWebSocketError
    public void onError(Throwable t) {
        //print debug
    }

    @OnWebSocketConnect
    public void onConnect(final Session session) throws InterruptedException {
        connected = true;
        connections++;
        //print debug

        // the code running in another thread will trigger the memory leak
        // when the client endpoint is down and messages are still sent,
        // because the GC will not clean up after onClose is received and
        // the client disconnects.
        // if the "while" loop is run in the same thread, the memory
        // can be released when onClose is received, but that would
        // mean holding up the onConnect() method and not returning.
        // I think this would be bad practice.
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (connected) {
                    testBytesSend(session);
                    try {
                        Thread.sleep(4);
                    } catch (InterruptedException e) {
                    }
                }
                //print debug
            }
        }).start();
    }

    private void testBytesSend(Session session) {
        try {
            int noEntries = 200;
            ByteBuffer bb = ByteBuffer.allocate(noEntries * 14);
            for (int i = 0; i < noEntries; i++) {
                n += 1.0f;
                bb.putFloat(n);
                bb.putShort((short) 1);
                bb.putDouble(123456789123.0);
            }
            bb.flip();
            session.getRemote().sendBytes(bb, new WriteCallback() {
                @Override
                public void writeSuccess() {
                }

                @Override
                public void writeFailed(Throwable arg0) {
                }
            });
            //print debug
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Your ByteBuffer use is incredibly inefficient.
Don't create all of those minor/tiny ByteBuffers just to get a byte array, and then toss them out. Ick.
Note: you don't even use the .array() call correctly, as not all ByteBuffer allocations have a backing array you can access like that.
The byte arrays numberBytes, eventBytes, timestampBytes, and fullbytes should not exist.
Create a single ByteBuffer representing the entire message you intend to send, and allocate it to be either the size you need, or larger.
Then put the individual bytes you want into it, flip it, and give the Jetty implementation that single ByteBuffer.
Jetty will use the standard ByteBuffer information (such as position and limit) to determine what part of that ByteBuffer should actually be sent.
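A minimal sketch of what that looks like (essentially what the question's later test code already does; Event, events and session are placeholders):

import java.nio.ByteBuffer;

// one buffer per message: 4 bytes (float) + 2 (short) + 8 (double) per event
ByteBuffer bb = ByteBuffer.allocate(events.size() * (4 + 2 + 8));
for (Event e : events) {           // hypothetical event holder
    bb.putFloat(e.number);
    bb.putShort(e.type);
    bb.putDouble(e.timestamp);
}
bb.flip();                         // position/limit now delimit exactly the payload
session.getRemote().sendBytes(bb); // Jetty sends only [position, limit)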
