Log4j2 async logger blocking functionality - Java

Background:
In Log4j2, when using an AsyncAppender you can set the "blocking" parameter to false so that any logs which overflow the buffer are discarded rather than slowing down the main thread (see the AsyncAppender section here: https://logging.apache.org/log4j/2.x/manual/appenders.html).
I am upgrading our application to the glorious AsyncLogger structure found here: https://logging.apache.org/log4j/2.x/manual/async.html
While I see that I can set the ring buffer size, I do not see anything stating that I can stop it from blocking the application's main thread.
Question:
Just to be certain, since I don't see anything in the docs: if more logs are coming in than going out (say we are stashing them in a DB and the insert is taking a while), so the ring buffer size is exceeded when using async loggers, what happens to the extra logs? And will the main thread be slowed in any way?
Thanks!

With async loggers, if your appender cannot keep up with the logging rate of the application, the ring buffer will eventually fill up. When the ring buffer is full, loggers block while trying to add a log event to it, so yes, application threads will be affected.
One of the reasons the default ring buffer size is so large is so that it can handle bursts of log events without impacting the application. However, it is important to select an appender that can handle the sustained logging rate of your application. (Testing with 2 or 3 times your target load is a good idea.)
FYI, the fastest appender that comes with Log4j 2 is the RandomAccessFileAppender (and its Rolling variant).
The next release (2.5.1) will have a feature that allows users to drop events (e.g. DEBUG and TRACE events) when the ring buffer is 80% full or more.
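For reference, a minimal sketch of that tuning is below. The property names (Log4jContextSelector, log4j2.asyncLoggerRingBufferSize, log4j2.asyncQueueFullPolicy, log4j2.discardThreshold) are taken from later 2.x releases and have changed across versions, so treat them as assumptions to verify against your version's documentation; they must be set before Log4j initializes (typically as -D JVM arguments).

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;

    // Hedged sketch: property names are from later 2.x releases and vary by version.
    // They must be set before Log4j initializes (or passed as -D JVM arguments).
    public final class AsyncLoggingSetup {
        public static void main(String[] args) {
            // Route all loggers through the LMAX Disruptor ring buffer (all-async mode).
            System.setProperty("Log4jContextSelector",
                    "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector");
            // Ring buffer size must be a power of two; the default is large to absorb bursts.
            System.setProperty("log4j2.asyncLoggerRingBufferSize", "262144");
            // In releases that support it, discard low-priority events instead of
            // blocking application threads once the ring buffer is (nearly) full.
            System.setProperty("log4j2.asyncQueueFullPolicy", "Discard");
            System.setProperty("log4j2.discardThreshold", "DEBUG"); // drop DEBUG and TRACE

            Logger logger = LogManager.getLogger(AsyncLoggingSetup.class);
            logger.info("Async logging configured");
        }
    }

With the default policy, a full ring buffer still blocks the calling thread, which is the behavior described above.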

If blocking is set to false on the AsyncAppender and the queue is full, it appears the event will be passed to the appender configured in the errorRef attribute, if one is configured.
However, you also asked what happens when using asynchronous loggers, which is a separate mechanism. With those, everything is asynchronous, and I am fairly sure it is handled differently than with the AsyncAppender.
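For the AsyncAppender side of the question, the sketch below shows roughly how a non-blocking setup with an error appender can be wired programmatically (the same attributes are normally declared in log4j2.xml). The appender names, file path, and patterns are made up for illustration, and the exact builder API may differ slightly between 2.x versions.

    import org.apache.logging.log4j.Level;
    import org.apache.logging.log4j.core.config.Configurator;
    import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
    import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

    // Rough sketch, not a drop-in config: an AsyncAppender wrapping a (slow) file appender,
    // with blocking=false so overflowing events go to the appender named in errorRef
    // instead of stalling application threads. Names and paths are illustrative.
    public final class NonBlockingAsyncAppenderConfig {
        public static void main(String[] args) {
            ConfigurationBuilder<BuiltConfiguration> builder =
                    ConfigurationBuilderFactory.newConfigurationBuilder();

            AppenderComponentBuilder file = builder.newAppender("SlowFile", "File")
                    .addAttribute("fileName", "logs/app.log")
                    .add(builder.newLayout("PatternLayout")
                            .addAttribute("pattern", "%d %-5p %c - %m%n"));

            AppenderComponentBuilder fallback = builder.newAppender("Fallback", "Console")
                    .add(builder.newLayout("PatternLayout")
                            .addAttribute("pattern", "%d %-5p %c - %m%n"));

            AppenderComponentBuilder async = builder.newAppender("Async", "Async")
                    .addAttribute("blocking", false)       // do not stall the caller when the queue is full
                    .addAttribute("errorRef", "Fallback")  // overflow events go here instead
                    .addComponent(builder.newAppenderRef("SlowFile"));

            builder.add(file).add(fallback).add(async);
            builder.add(builder.newRootLogger(Level.INFO)
                    .add(builder.newAppenderRef("Async")));

            Configurator.initialize(builder.build());
        }
    }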

Related

Log4j Console Appender Limitations

I am curious about the numbers around the limitations of the Log4j console appender. I have a service that processes messages using a thread pool and logs each event after processing. Before the thread pool approach, the service would just use the main thread for processing all of the messages. This took too long, so I implemented a thread pool so each thread can process a subset of the messages, as they are independent of each other.
However, I started running into an issue where apparently the async queue is full and the threads discard logs until queue capacity becomes available. I tracked down where this log is coming from, and it's here, due to the discarding policy: https://logging.apache.org/log4j/2.x/log4j-core/apidocs/src-html/org/apache/logging/log4j/core/async/DiscardingAsyncQueueFullPolicy.html#line.49
This is a problem, as I need the logs and I need to use a console appender. I added a config to instead use the default policy so we don't discard logs: https://logging.apache.org/log4j/2.x/log4j-core/apidocs/src-html/org/apache/logging/log4j/core/async/DefaultAsyncQueueFullPolicy.html#line.29
But now the issue is that processing the messages is taking too long, and that makes sense, because now when the queue is full, the thread takes time to send logs to the console instead of returning and processing another batch of messages.
My questions:
Is there anything I can do to address this issue if I need to use a console appender? Would more CPU/memory help the threads in this case?
Why exactly does the queue fill up so quickly? When the main thread processes ALL of the messages (so no batches) we don't run into this issue, but when the threads batch-process the messages we do. Also, can we check the log4j queue size programmatically?
Can we configure the size of the log4j queue if we're using a console appender?
Is there a logs/second figure for the maximum to expect when using a console appender? Then we could compare and see if we're somehow logging much more than that.
We want to log the events to the console, so we haven't tried a different appender such as a file appender. Would that be our only solution if we are trying to log too many logs/second?
The reason your logging queue is filling up is that you increased the performance of your application too much :-). Now you have a bottleneck in the logging appender.
There are some benchmarks comparing a FileAppender to a ConsoleAppender in the Log4j2 documentation. Your figures might vary, but due to the synchronization mechanisms in System.out, the console appender is around 20 times slower than the file appender. The bottleneck is not really the filesystem, since redirecting stdout to /dev/null gives similar results.
To improve performance, you can set the direct property on the console appender to true (cf. the documentation). This basically replaces System.out with new FileOutputStream(FileDescriptor.out) and provides performance on par with the file appender.
You can also fine-tune the LMAX Disruptor using Log4j2 properties (cf. the documentation). E.g. log4j2.asyncLoggerRingBufferSize allows you to increase the size of the async queue (ring buffer in LMAX terminology).
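A rough sketch combining both suggestions is below; the direct attribute and the ring buffer property name are taken from newer 2.x releases, so verify them against your version, and the appender name and pattern are illustrative.

    import org.apache.logging.log4j.Level;
    import org.apache.logging.log4j.core.config.Configurator;
    import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
    import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
    import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

    // Sketch only: the "direct" attribute and the ring buffer property exist in newer
    // 2.x releases and may differ in yours; names and pattern are made-up examples.
    public final class FastConsoleLogging {
        public static void main(String[] args) {
            // Enlarge the async ring buffer (power of two); only relevant when async
            // loggers are enabled, and must be set before Log4j initializes.
            System.setProperty("log4j2.asyncLoggerRingBufferSize", "1048576");

            ConfigurationBuilder<BuiltConfiguration> builder =
                    ConfigurationBuilderFactory.newConfigurationBuilder();

            AppenderComponentBuilder console = builder.newAppender("Stdout", "Console")
                    .addAttribute("direct", true) // bypass System.out, write to FileDescriptor.out
                    .add(builder.newLayout("PatternLayout")
                            .addAttribute("pattern", "%d %-5p %m%n"));

            builder.add(console);
            builder.add(builder.newRootLogger(Level.INFO)
                    .add(builder.newAppenderRef("Stdout")));
            Configurator.initialize(builder.build());
        }
    }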

Log4J2: Get the current log events count available in ring buffer of asynchronous logger

We have recently moved to Log4J 2.13 in our Java application and are using an all-asynchronous logger configuration for high performance.
Under high load and heavy logging, using asynchronous loggers has helped us a lot, as the calling code executes very fast, delegating the logging request to a separate thread. However, the logging continues to happen asynchronously in the background even after the calling code has finished - this is as expected.
In the above scenario, at any point in time, we want to know how many log statements are available in the ring buffer of the asynchronous logger and are still pending to be logged. Is there a way to get this count?
Please note that we don't want the default/configured ring buffer size of the asynchronous logger in the application. Instead, we want to know its current state, i.e. how many log statements it holds (pending to be logged) at any instant in time.
Log4J provides a RingBufferAdmin MBean that exposes two operations: getBufferSize() and getRemainingCapacity(). You should be able to access it through the MBean server in your application.
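Since getRemainingCapacity() reports free slots, the pending count is getBufferSize() - getRemainingCapacity(). Below is a sketch of polling those values through the platform MBean server; the ObjectName pattern is an assumption based on the Log4j 2 JMX documentation for all-async loggers (AsyncLoggerConfig ring buffers register under a different name), and JMX support must not be disabled in your setup.

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    // Sketch that polls the RingBufferAdmin MBean. The ObjectName pattern below is an
    // assumption drawn from the Log4j 2 JMX documentation for all-async loggers and may
    // need adjusting for your version and configuration.
    public final class RingBufferMonitor {
        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName pattern = new ObjectName(
                    "org.apache.logging.log4j2:type=*,component=AsyncLoggerRingBuffer");
            for (ObjectName name : server.queryNames(pattern, null)) {
                long bufferSize = (Long) server.getAttribute(name, "BufferSize");
                long remaining = (Long) server.getAttribute(name, "RemainingCapacity");
                System.out.printf("%s -> %d events pending (capacity %d)%n",
                        name, bufferSize - remaining, bufferSize);
            }
        }
    }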

Logback's AsyncAppender is losing log events, even when our rate of logging is low

We are using logback with an AsyncAppender. We've been using it this way for years, but recently we noticed that log messages are going missing. We understand that AsyncAppender doesn't guarantee that this won't happen, so fair enough, in a sense. But in our case, the rate of messages being logged is very low, so the fact that log messages are going missing is very surprising.
We verified that AsyncAppender is to blame by temporarily using a synchronous FileAppender. When we did this, log messages did not go missing. Also, we tried setting discardThreshold to 0, and this also fixed the problem.
But these fixes are not so great for us. These logs are being written in Future-like things that are supposed to be non-blocking, and so we don't want them to ever block. If they were spewing out logging events at a furious rate, we'd be willing to live with losing log entries from time to time, but that's not what's occurring. The rate of log events is slow.
Any theories on what is going on here? And how we can fix it and keep our Futures free from blocking activities? (Leaving discardThreshold at 0 is not so good for us, because that allows logging events to block if necessary to make sure that all log events are written, and we don't want our Futures to ever block.)
One thought we had about this is that we do actually have a lot of DEBUG log events, but the logging level in our logback.xml is configured at the moment to INFO, so the DEBUG events are not logged.
Is it possible that the DEBUG events are somehow clogging up the AsyncAppender, even though they are configured not to be logged? Until now, it has always been our theory that DEBUG events would be filtered out long before hitting the AsyncAppender queue. Is this not the case?
Or is it possible that AsyncAppender is not thread-safe?
We are using logback version 1.2.3.

When not to use AsyncAppender in logback by default

Logback supports using an async appender with the class
ch.qos.logback.classic.AsyncAppender, and according to the documentation, this will reduce the logging overhead on the application. So why not just make it the default out of the box? What use cases are better served by using a sync appender? One problem I can see with the async appender is that the log messages will not be chronological. Are there any other such limitations?
The AsyncAppender acts as a dispatcher to another appender. It buffers log events and dispatches them to, say, a FileAppender or a ConsoleAppender etc.
Why use the AsyncAppender?
The AsyncAppender buffers log events, allowing your application code to move on rather than wait for the logging subsystem to complete a write. This can improve your application's responsiveness in cases where the underlying appender is slow to respond e.g. a database or a file system that may be prone to contention.
Why not make it the default behavior?
The AsyncAppender cannot write to a file or console or a database or a socket etc. Instead, it just delegates log events to an appender which can do that. Without the underlying appender, the AsyncAppender is, effectively, a no-op.
The buffer of log events sits on your application's heap; this is a potential resource leak. If the buffer fills more quickly than it can be drained, then it will consume resources that your application might want to use.
The AsyncAppender's need for configuration to balance the competing demands of no-loss and resource leakage and to handle on-shutdown draining of its buffer means that it is more complicated to manage and to reason about than simply using synchronous writes. So, on the basis of preferring simplicity over complexity, it makes sense for Logback's default write strategy to be synchronous.
The AsyncAppender exposes configuration levers that you can use to address the potential resource leakage. For example:
You can increase the buffer capacity
You can instruct Logback to drop events once the buffer reaches maximum capacity
You can control what types of events are discarded; drop TRACE events before ERROR events etc
The AsyncAppender also exposes configuration levers which you can use to limit (though not eliminate) the loss of events during application shutdown.
However, it remains true that the simplest safest way of ensuring that log events are successfully written is to write them synchronously. The AsyncAppender should only be considered when you have a proven issue where writing to an appender materially affects your application responsiveness/throughput.
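To make the levers above concrete, here is a rough programmatic equivalent of what is normally declared in logback.xml; the queue size, discard threshold, flush time, file name, and appender names are arbitrary examples, not recommendations.

    import ch.qos.logback.classic.AsyncAppender;
    import ch.qos.logback.classic.LoggerContext;
    import ch.qos.logback.classic.encoder.PatternLayoutEncoder;
    import ch.qos.logback.classic.spi.ILoggingEvent;
    import ch.qos.logback.core.FileAppender;
    import org.slf4j.LoggerFactory;

    // Rough programmatic equivalent of the levers above (values are arbitrary examples;
    // the same settings are normally declared in logback.xml).
    public final class AsyncLogbackSetup {
        public static void configure() {
            LoggerContext ctx = (LoggerContext) LoggerFactory.getILoggerFactory();

            PatternLayoutEncoder encoder = new PatternLayoutEncoder();
            encoder.setContext(ctx);
            encoder.setPattern("%d %-5level %logger - %msg%n");
            encoder.start();

            FileAppender<ILoggingEvent> file = new FileAppender<>();
            file.setContext(ctx);
            file.setName("FILE");
            file.setFile("app.log");
            file.setEncoder(encoder);
            file.start();

            AsyncAppender async = new AsyncAppender();
            async.setContext(ctx);
            async.setName("ASYNC");
            async.setQueueSize(8192);      // lever: buffer capacity
            async.setNeverBlock(true);     // lever: drop instead of blocking once the queue is full
            async.setDiscardThreshold(0);  // lever: 0 keeps all levels; the default drops TRACE/DEBUG/INFO when ~80% full
            async.setMaxFlushTime(5000);   // lever: how long stop() waits (ms) to drain the queue on shutdown
            async.addAppender(file);
            async.start();

            ctx.getLogger(org.slf4j.Logger.ROOT_LOGGER_NAME).addAppender(async);
        }
    }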

How to handle disk full errors while logging in logback?

I am using slf4j+logback for logging in our application. Earlier we were using jcl+log4j and moved recently.
Due to the high amount of logging in our application, there is a chance of the disk filling up in the production environment. In such cases we need to stop logging while the application keeps working fine. What I found on the web is that we need to poll logback's StatusManager for such errors. But this adds a dependency on logback in the application.
For log4j, I found that we can create an Appender which stops logging in such scenarios. That again will cause an application dependency on log4j.
Is there a way to configure this with only slf4j, or is there any other mechanism to handle this?
You do not have to do or configure anything. Logback is designed to handle this situation quite nicely. Once the target disk is full, logback's FileAppender will stop writing to it for a short amount of time. Once that delay elapses, it will attempt to recover. If the recovery attempt fails, the waiting period is increased gradually, up to a maximum of 1 hour. If the recovery attempt succeeds, FileAppender will start logging again.
The process is entirely automatic and extends seamlessly to RollingFileAppender. See also graceful recovery.
On a more personal note, graceful recovery is one of my favorite logback features.
You may try wrapping the org.slf4j.Logger interface, specifically the info, debug, trace and other methods, and manually query the available space (via File.getUsableSpace()) before every call.
That way you will not need any application dependency on a particular logging backend.
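One way that wrapper idea might look is sketched below. The log directory, threshold, and class name are made-up examples, only a few Logger methods are shown, and note that calling getUsableSpace() on every log call has its own cost.

    import java.io.File;
    import org.slf4j.Logger;

    // Illustrative only: a delegating wrapper that silently drops messages when the log
    // partition is nearly full. The threshold below is a made-up example, and only a
    // few of the org.slf4j.Logger methods are shown.
    public final class DiskAwareLogger {
        private static final long MIN_FREE_BYTES = 100L * 1024 * 1024; // assumed 100 MB floor

        private final Logger delegate;
        private final File logDir;

        public DiskAwareLogger(Logger delegate, File logDir) {
            this.delegate = delegate;
            this.logDir = logDir;
        }

        private boolean diskHasRoom() {
            return logDir.getUsableSpace() > MIN_FREE_BYTES;
        }

        public void info(String msg) {
            if (diskHasRoom()) {
                delegate.info(msg);
            }
        }

        public void debug(String msg) {
            if (diskHasRoom()) {
                delegate.debug(msg);
            }
        }

        public void error(String msg, Throwable t) {
            if (diskHasRoom()) {
                delegate.error(msg, t);
            }
        }
    }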
Two real options:
Add a cron task on Linux (or a scheduled task on Windows) to clean up your mess (including gzipping some logs, if need be).
Buy a larger hard disk and perform the maintenance manually.
And, in either case, reduce logging.
A full disk is like an OOM: you can't know what will fail first when you catch it. The way to deal with running out of memory (or disk) is to prevent it. There could be a lot of cases where extra disk space is needed, and those tasks will fail as well.
