I am using slf4j+logback for logging in our application. We were using jcl+log4j earlier and migrated recently.
Due to the high amount of logging in our application, there is a chance of the disk filling up in the production environment. In such cases we need to stop logging, and the application should keep working fine. What I found on the web is that we need to poll logback's StatusManager for such errors, but that would add a logback dependency to the application.
For log4j, I found that we can create an Appender which stops logging in such scenarios. That again would create an application dependency on log4j.
Is there a way to configure this with only slf4j, or is there any other mechanism to handle this?
You do not have to do or configure anything. Logback is designed to handle this situation quite nicely. Once the target disk is full, logback's FileAppender will stop writing to it for a short amount of time. Once that delay elapses, it will attempt to recover. If the recovery attempt fails, the waiting period is increased gradually, up to a maximum of 1 hour. If the recovery attempt succeeds, FileAppender will start logging again.
The process is entirely automatic and extends seamlessly to RollingFileAppender. See also graceful recovery.
On a more personal note, graceful recovery is one of my favorite logback features.
You may try wrapping the slf4j Logger (it is an interface, so you would delegate rather than extend), specifically the info, debug, trace and other methods, and manually query the available space (via File.getUsableSpace()) before every call.
That way you will not need any framework dependency beyond slf4j.
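A minimal sketch of that idea, assuming a wrapper class, a log directory and a free-space threshold that are purely illustrative (only two of the Logger methods are shown):

import java.io.File;
import org.slf4j.Logger;

// Rough sketch, not a full slf4j Logger implementation: the class name, the
// log directory and the 50 MB threshold are assumptions for illustration.
public class DiskAwareLogger {

    private static final long MIN_FREE_BYTES = 50L * 1024 * 1024; // assumed threshold
    private final Logger delegate;
    private final File logDir;

    public DiskAwareLogger(Logger delegate, File logDir) {
        this.delegate = delegate;
        this.logDir = logDir;
    }

    // getUsableSpace() returns 0 when the partition is full or the path is invalid
    private boolean diskHasRoom() {
        return logDir.getUsableSpace() > MIN_FREE_BYTES;
    }

    public void info(String msg) {
        if (diskHasRoom()) {
            delegate.info(msg);
        }
    }

    public void debug(String msg) {
        if (diskHasRoom()) {
            delegate.debug(msg);
        }
    }

    // ... repeat the same pattern for trace, warn, error and the formatted variants
}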
Two real options:
Add a cron task on Linux (or a scheduled task on Windows) to clean up your mess (including gzipping some of the logs, if need be).
Buy a larger hard disk and perform the maintenance manually.
And, in either case, reduce the logging.
A full disk is like an OOM: you can't know what will fail first when you catch it. You deal with running out of memory (or disk) by preventing it; there could be many cases where extra disk space is needed and the task would fail anyway.
Background:
In log4j2, when using an AsyncAppender you have the ability to set the "blocking" parameter to false, so any logs which overflow the buffer are discarded rather than slowing down the main thread (see the AsyncAppender section here: https://logging.apache.org/log4j/2.x/manual/appenders.html).
I am upgrading our application to the glorious async logger setup found here: https://logging.apache.org/log4j/2.x/manual/async.html
While I see I can set the ring buffer size, I do not see anything stating that I can stop it from blocking the application's main thread.
Question:
Just to be certain, I am asking here as I don't see anything in the docs: if more logs are coming in than going out (say we are stashing them in a DB and the insert is taking a while), so the ring buffer size is exceeded when using async loggers, what happens to the extra logs? And will the main thread be slowed in any way?
thanks!
With async loggers, if your appender cannot keep up with the logging rate of the application, the ringbuffer will eventually fill up. When the ringbuffer is full, loggers will block trying to add a log event to the ringbuffer, so yes, application threads will be affected.
One of the reasons the default ringbuffer size is so large is that it can handle "bursts" of log events without impacting the application. However, it is important to select an appender that is appropriate for (can handle) the sustained logging rate of your application. (Testing with 2 or 3 times your target load is a good idea.)
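For context, this is roughly how async loggers are enabled and the ring buffer sized (a sketch only; the properties must be set before Log4j 2 initializes, e.g. on the command line, and the property names should be double-checked against the manual for your version):

public class AsyncLoggingBootstrap {
    public static void main(String[] args) {
        // Route all loggers through the async (Disruptor-based) context selector
        System.setProperty("Log4jContextSelector",
                "org.apache.logging.log4j.core.async.AsyncLoggerContextSelector");
        // Ring buffer size must be a power of two; 256k slots is the documented default
        System.setProperty("AsyncLogger.RingBufferSize", "262144");
        // ... obtain loggers and start the application only after this point
    }
}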
FYI, the fastest appender that comes with log4j 2 is RandomAccessFileAppender (and its Rolling variant).
The next release (2.5.1) will have a feature to allow users to drop events (e.g. DEBUG and TRACE events) when the ringbuffer is 80% full or more.
If blocking is set to false on the AsyncAppender and the queue is full, it appears the event will be passed to the Appender configured in the errorRef attribute, if one is configured.
However, you also asked about what happens when using asynchronous loggers, which is a separate mechanism. With that, everything is asynchronous, and I am pretty sure it is handled differently than the AsyncAppender.
I've been looking for an answer to this for a while. In the company where I work we have a highly concurrent system, but we recently found that the logging configuration of the web server (JBoss) includes the console appender. The application loggers also go to the console. We started to get deadlocks on the logging actions, most of them in the console appender (I know that Log4j has a really nasty synchronization bug, but I'm almost sure that we don't have any synchronization in the related code). Another thing we found is that the IT guys regularly access the console with a PuTTY session, pause the output to check the logs, and then just close the PuTTY window.
Is it possible that the console appender, and the use of the console for logging and monitoring in a production environment, is causing deadlocks and race conditions in the system? My understanding is that the console should only be used during development with an IDE, because in a highly concurrent system it becomes yet another resource to contend for (slow because of unbuffered I/O) and subject to race conditions.
Thanks.
From the Best practices for Performance tuning JBoss Enterprise Application Platform 5 guide, page 9:
Turn off console logging in production
Turn down logging verbosity
Use asynchronous logging.
Wrap debug log statements with If(debugEnabled())
I heavily recommend the first and last items in production, because the log message is built before Log4j decides whether to log it: if MyClass#toString() is a heavy operation, the String is computed first (yes, the heavy operation runs) and only then does Log4j check whether this String must be logged at all (pretty bad, indeed =).
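To make that concrete, here is a small illustrative snippet (log4j 1.x API; the class and the heavy object are hypothetical) showing the unguarded and guarded forms:

import org.apache.log4j.Logger;

public class GuardedDebugExample {
    private static final Logger log = Logger.getLogger(GuardedDebugExample.class);

    void report(Object someHeavyObject) {
        // Unguarded: the concatenation (and someHeavyObject.toString()) runs
        // even when DEBUG is disabled
        log.debug("State: " + someHeavyObject);

        // Guarded: the costly String is only built when DEBUG is actually enabled
        if (log.isDebugEnabled()) {
            log.debug("State: " + someHeavyObject);
        }
    }
}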
Also, tell the IT guys to use the less command when checking the log files, instead of reading the console directly, so they don't block the files. That command works on Linux; if your server is on a Unix environment, the command would be tail (based on @Toni's comment).
IMO, an official JBoss performance guide is the best argument for turning off console logging in production (even if this still doesn't prove that it is the cause of your deadlock problem).
We have a high-speed, high-volume application which is using log4j. Typically we have been using the SyslogAppender, thinking that it's the lightest-weight, fastest appender. But we are seeing high CPU utilization from syslog under high volume (because of the filter rules in the syslog conf).
We probably want to switch to using a FileAppender. The question is whether we want to use this in conjunction with the log4j AsyncAppender to remove any pauses due to flushing (forcing) to disk.
(The application is very latency sensitive, so we want to minimize any latency the appender might add.) Also, I'm not really sure the SyslogAppender is really faster than the FileAppender anyway (but that's the way things have been since I started).
Any thoughts on this would be appreciated.
I would definitely use the AsyncAppender.
I've seen a low-latency application virtually grind to a halt when using a standard file appender. Admittedly they were using (OS) VMs on shared hardware and disks, so one VM could monopolise the disk I/O and bring the others to a halt while they tried to log.
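For illustration, a programmatic sketch of wrapping a FileAppender in an AsyncAppender with the log4j 1.x API (the file path and buffer size are placeholders; in practice this is usually configured in log4j.xml):

import org.apache.log4j.AsyncAppender;
import org.apache.log4j.FileAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class AsyncFileLoggingSetup {
    public static void main(String[] args) throws Exception {
        // Plain FileAppender writing to an assumed path
        FileAppender file = new FileAppender(
                new PatternLayout("%d %p %c - %m%n"), "/var/log/myapp/app.log");

        // AsyncAppender buffers events and writes them on a background thread,
        // so callers are not stalled by disk flushes (until the buffer fills up)
        AsyncAppender async = new AsyncAppender();
        async.setBufferSize(1024); // assumed size; the log4j 1.x default is 128
        async.addAppender(file);

        Logger.getRootLogger().addAppender(async);
        Logger.getRootLogger().info("async file logging configured");
    }
}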
You might also look into logging to JMS and other asynchronous strategies.
I am considering logging business events in a J2EE web application by using Java logging and FileHandler.
I am wondering whether that could cause a performance bottleneck, since many log records will be written to one file.
What are your experiences and opinions?
Is logging a busy web application to one file with Java logging and FileHandler likely to become a performance bottleneck?
It all depends on how many log statements you add. If you add logging after every line of code then performance will most certainly degrade.
Use logging for the important cases, set the correct logging level for your current purposes (testing or actual deployment) and use constructions like
if (logger.isDebugEnabled()) {
    logger.debug("Value is " + costlyOperation());
}
to avoid calling code that is costly to run.
You might also want to check this article
In order to avoid generalities like "it depends" or "a little" etc. you should measure the performance of your application with and without the logging overhead. Apache JMeter can help you generate the load for the test.
The information you can collect through logging is usually so essential to the integrity of the application that you cannot afford to operate blindly without it. There is also a slight overhead if you use Google Analytics, for example, but the benefits prevail.
In order to keep your log files within reasonable sizes, you can always use rotating log files.
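With java.util.logging, rotation is built into FileHandler; here is a small sketch (the file pattern, size limit and file count are just example values):

import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class RotatingFileHandlerExample {
    public static void main(String[] args) throws Exception {
        // Rotate across 5 files of at most ~10 MB each; %g is the generation number
        FileHandler handler = new FileHandler("app-%g.log", 10 * 1024 * 1024, 5, true);
        handler.setFormatter(new SimpleFormatter());

        Logger logger = Logger.getLogger("app");
        logger.addHandler(handler);
        logger.info("rotating file handler configured");
    }
}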
I think that JavaRevisited blog has a pretty good post on a problem with performance: Top 10 Tips on Logging in Java
In a recent project, I log audit events to a database table and I was concerned about performance, so I added the ability to log in 'asynchronous' mode. In this mode the logger runs in a low-priority background thread and the act of logging from the main thread just puts the log events onto a queue which are lazily retrieved and written by the background logging thread.
This approach will only work, however, if there are natural 'breaks' in the processing; if your system is constantly busy then the queue will never be emptied. One way to solve this is to make the background thread more active depending on the number of log messages in the queue (an enhancement I've yet to implement).
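A minimal sketch of that queue-and-background-thread scheme (the class name and the writeToDatabase() placeholder are purely illustrative, not the code from my project):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncAuditLogger {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public AsyncAuditLogger() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take();   // blocks while the queue is empty
                    writeToDatabase(event);        // stands in for the slow insert
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }, "audit-logger");
        writer.setPriority(Thread.MIN_PRIORITY);    // low-priority background thread
        writer.setDaemon(true);
        writer.start();
    }

    // Called from the main thread: cheap, never blocks on the database
    public void log(String event) {
        queue.offer(event);
    }

    private void writeToDatabase(String event) {
        System.out.println("AUDIT: " + event);      // placeholder for the real INSERT
    }
}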
You should:
Define an appropriate metric of performance (e.g., responsiveness, throughput, etc.). Then you should measure this metric with all logging turned off and then on. The difference would be the cost of logging.
Then you should experiment with different logging libraries and the modes they provide and document the observed differences.
In my personal experience, for all the three projects I worked on, I found that asynchronous logging helped improve the application throughput a lot. But the same may not hold for you, so make sure you make your decision after careful measurements.
The following does not directly relate to your question.
I noticed that you specifically mentioned business logging. In this case, you may also want to keep logging relevant and clean, in case you find your log files are growing huge and difficult to understand. There is a generally accepted design pattern in this area: log as per function. This would mean that business logging (e.g., customer requested a refund) goes to a different destination, interface logging would go to another destination (e.g., user clicked the upvote button != user upvoted an answer), and a cross system call would go to another destination (e.g., Requesting clearance through payment gateway). Some people keep a master log file with all events as well just to see a timeline of the process while some design log miners/scrappers to construct timelines when required.
Hope this helps,
We have several Java application servers running here, with several apps. They all log with Log4j to the same file system, which we created solely for that purpose.
From time to time the file system runs out of space and the app gets:
log4j:ERROR Failed to flush writer,
java.io.IOException
Unfortunately Log4j does not recover from this error, so even after space is freed in the file system, no more logs are written by that app. Are there any options, in code or in configuration, to get Log4j going again besides restarting the app?
I didn't test this, but the website of logback states:
Graceful recovery from I/O failures
Logback's FileAppender and all its sub-classes, including
RollingFileAppender, can gracefully recover from I/O failures. Thus,
if a file server fails temporarily, you no longer need to restart your
application just to get logging working again. As soon as the file
server comes back up, the relevant logback appender will transparently
and quickly recover from the previous error condition.
I assume the same would be true for the above situation.
What do you see as an acceptable outcome here? I'd consider writing a new Appender that wraps whichever appender is accessing the disk and tries to do something sensible when it detects IOExceptions. Maybe get it to wrap the underlying Appender's write methods in a try-catch block, and send you or a sysadmin an email.
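A rough sketch of that wrapping idea with the log4j 1.x API (the class name and the notifyAdmin() placeholder are hypothetical; a real version would also need to throttle the alerts):

import org.apache.log4j.Appender;
import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;

public class FailsafeAppender extends AppenderSkeleton {

    private final Appender delegate;

    public FailsafeAppender(Appender delegate) {
        this.delegate = delegate;
    }

    @Override
    protected void append(LoggingEvent event) {
        try {
            delegate.doAppend(event);   // forward to the real disk appender
        } catch (RuntimeException e) {
            notifyAdmin(e);             // placeholder: email a sysadmin, bump a counter, ...
        }
    }

    @Override
    public void close() {
        delegate.close();
    }

    @Override
    public boolean requiresLayout() {
        return false;
    }

    private void notifyAdmin(Exception e) {
        System.err.println("Logging failed: " + e); // stand-in for a real alert
    }
}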
Limit the size of your logs and try using a custom appender to archive logs to a backup machine with lots of disk space.