I am considering logging business events in a J2EE web application by using Java logging and FileHandler.
I am wondering whether that could cause a performance bottleneck, since many log records will be written to one file.
What are your experiences and opinions?
Is logging a busy web application to one file with Java logging and FileHandler likely to become a performance bottleneck?
It all depends on how many log statements you add. If you add logging after every line of code, then performance will most certainly degrade.
Use logging for the important cases, set the correct logging level for your current purposes (testing or actual deployment) and use constructions like
if (logger.isDebugEnabled()) {
    logger.debug("Value is " + costlyOperation());
}
to avoid calling code that is costly to run.
You might also want to check this article
To avoid generalities like "it depends" or "a little", you should measure the performance of your application with and without the logging overhead. Apache JMeter can help you generate the load for the test.
The information you can collect through logging is usually so essential to the integrity of the application that you cannot operate blindly without it. There is also a slight overhead if you use Google Analytics, but the benefits prevail.
In order to keep your log files within reasonable sizes, you can always use rotating log files.
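For example, java.util.logging's FileHandler supports size-based rotation out of the box; a minimal sketch (the file pattern, size limit, and file count are illustrative values):

import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class RotatingLogSetup {
    public static void main(String[] args) throws Exception {
        // Rotate across 10 files of ~5 MB each: app.0.log, app.1.log, ...
        // (%g is replaced by the generation number of the rotated file).
        FileHandler handler = new FileHandler("app.%g.log", 5_000_000, 10, true);
        handler.setFormatter(new SimpleFormatter());

        Logger logger = Logger.getLogger("com.example.business");
        logger.addHandler(handler);
        logger.info("Business event: order created");
    }
}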
I think the JavaRevisited blog has a pretty good post on logging performance: Top 10 Tips on Logging in Java
In a recent project, I logged audit events to a database table, and I was concerned about performance, so I added the ability to log in 'asynchronous' mode. In this mode the logger runs in a low-priority background thread, and the act of logging from the main thread just puts the log events onto a queue, from which they are lazily retrieved and written by the background logging thread.
This approach will only work, however, if there are natural 'breaks' in the processing; if your system is constantly busy, then the queue will never be emptied. One way to solve this is to make the background thread more active depending on the number of log messages in the queue (an enhancement I've yet to implement).
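The idea, roughly (AuditEvent and writeToDatabase are made-up placeholders, not the project's actual code; a real version would also need shutdown handling and error recovery):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncAuditLogger {
    // Made-up event type; replace with your own audit record.
    public record AuditEvent(String user, String action) {}

    private final BlockingQueue<AuditEvent> queue = new LinkedBlockingQueue<>();

    public AsyncAuditLogger() {
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    AuditEvent event = queue.take(); // blocks until an event arrives
                    writeToDatabase(event);          // slow I/O happens off the main thread
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "audit-writer");
        writer.setPriority(Thread.MIN_PRIORITY); // low-priority background thread
        writer.setDaemon(true);
        writer.start();
    }

    // Cheap, non-blocking call from the main thread.
    public void log(AuditEvent event) {
        queue.offer(event);
    }

    private void writeToDatabase(AuditEvent event) {
        // INSERT into the audit table would go here.
    }
}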
You should:
Define an appropriate metric of performance (e.g., responsiveness, throughput, etc.). Then you should measure this metric with all logging turned off and then on. The difference would be the cost of logging.
Then you should experiment with different logging libraries and the modes they provide and document the observed differences.
In my personal experience, in all three projects I worked on, I found that asynchronous logging improved the application throughput a lot. But the same may not hold for you, so make sure you make your decision after careful measurements.
The following does not directly relate to your question.
I noticed that you specifically mentioned business logging. In that case, you may also want to keep logging relevant and clean, in case you find your log files growing huge and difficult to understand. There is a generally accepted design pattern in this area: log per function. This means that business logging (e.g., a customer requested a refund) goes to one destination, interface logging goes to another (e.g., the user clicked the upvote button != the user upvoted an answer), and cross-system calls go to yet another (e.g., requesting clearance through a payment gateway). Some people also keep a master log file with all events just to see a timeline of the process, while others design log miners/scrapers to construct timelines when required.
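A minimal sketch of that separation with java.util.logging, routing two named loggers to separate files (the logger names and file names are illustrative):

import java.util.logging.FileHandler;
import java.util.logging.Logger;

public class PerFunctionLogging {
    public static void main(String[] args) throws Exception {
        Logger business = Logger.getLogger("app.business");
        business.addHandler(new FileHandler("business.log", true));
        business.setUseParentHandlers(false); // keep these events out of the root log

        Logger integration = Logger.getLogger("app.integration");
        integration.addHandler(new FileHandler("integration.log", true));
        integration.setUseParentHandlers(false);

        business.info("Customer requested a refund");
        integration.info("Requesting clearance through payment gateway");
    }
}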
Hope this helps,
Related
I'm using a Spring Boot back-end to provide some RESTful APIs and need to log all of my request-response logs to ElasticSearch.
Which of the following two methods has better performance?
Using Spring Boot ResponseBodyAdvice to log every request and response that is sent to the client directly to ElasticSearch.
Logging every request and response into a log file and using filebeat and/or logstash to send them to ElasticSearch.
First off, I assume that you have a distributed application; otherwise, just write your stuff to a log file and that's it.
I also assume that you have quite a lot of logs to manage; otherwise, if you're planning to log only a couple of messages an hour, it doesn't really matter which way you go - both will do the job.
Technically both ways can be implemented, although for the first option I would suggest a different approach; at least, I did something similar about five years ago in one of my projects:
Create a custom log appender that throws everything into a queue (for async processing) and, from there, use the Apache Flume project, which can write to the DB of your choice in a transactional manner with batch support, "all-or-nothing" semantics, etc.
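As an illustration only, here is a minimal sketch of such a queue-backed appender written against Logback's AppenderBase (the choice of Logback, the queue bound, and the downstream Flume/ES consumer are all assumptions, not part of the original project):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.AppenderBase;

// Wired up via logback.xml like any other appender.
public class QueueAppender extends AppenderBase<ILoggingEvent> {
    // Bounded, so a slow consumer cannot exhaust the JVM's memory.
    private static final BlockingQueue<ILoggingEvent> QUEUE = new LinkedBlockingQueue<>(10_000);

    @Override
    protected void append(ILoggingEvent event) {
        QUEUE.offer(event); // drop on overflow rather than block the business thread
    }

    // A background consumer (e.g. a Flume/ES batch writer) drains this queue.
    public static BlockingQueue<ILoggingEvent> queue() {
        return QUEUE;
    }
}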
This approach solves issues that might appear in the "first" option that you've presented, while some other issues will be left unsolved.
If I compare the first and the second option that you've presented, I think you're better off with filebeat/logstash (or even both) writing to ES. Here is why:
When you log in the advice, you "eat" the resources of your JVM: memory, plus CPU to maintain an ES connection pool and a thread pool for the actual logging (otherwise the business flow might slow down because of logging the requests to ES).
In addition, you won't be able to write "in batch" to ElasticSearch without custom code; instead you'll have to create an "insert" per log message, which is wasteful.
One more "technicality" - what happens if the application gets restarted for some reason, will you be able to write all the logs prior to the restart if everything gets logged in the advice?
Yet another issue - what happens if you want to "rotate" the indexes in the ES, namely create an index with TTL and produce a new index every day.
filebeat/logstash potentially can solve all these issues, however they might require a more complicated setup.
Besides, obviously you'll have more services to deploy and maintain:
logstash is way heavier than filebeat from a resource-consumption standpoint, and you usually have to parse the log message (typically with a grok filter) in logstash.
filebeat is much more "humble" when it comes to resource consumption. If you have many instances to log (the really distributed logging that I've assumed you have anyway), consider putting a filebeat service (a DaemonSet if you have k8s) on each node from which you'll gather the logs, so that a single filebeat process can handle different instances; then deploy a cluster of logstash instances on a separate machine so that they do the heavy log-crunching all the time and stream the data to ES.
How does logstash/filebeat help?
Off the top of my head:
It runs at its own pace, so even if your process goes down, the messages produced by that process will still be written to ES eventually.
It can even survive short outages of ES itself, I think (you should verify that).
It can handle processes written in different technologies. What if tomorrow you want to gather logs from, say, the database server, which doesn't run Spring and isn't written in Java at all?
It can handle index rotation and batch writing internally, so you end up with effective ES management that you would otherwise have to write yourself.
What are the drawbacks of the logstash/filebeat approach?
Again, off the top of my head, not a full list:
Well, much more data will go through the network all in all.
If your application could emit a structured "LogEvent" directly, you wouldn't need to serialize it to a string and parse it back, so that conversion is redundant work.
As for performance implications: it basically depends on what you measure, how exactly your application looks, and what hardware you have, so I'm afraid I won't be able to give you a clear answer on that - you should measure your concrete case and come up with the way that works best for you.
Not sure if you can expect a clear answer to that. It really depends on your infrastructure and the hardware in use.
And by performance, do you mean the performance of your Spring Boot backend application, or performance in terms of how long it takes for your logs to arrive at ElasticSearch?
I just assume the first one.
When sending the logs directly to ElasticSearch, your bottleneck will be the network, whereas when logging requests and responses to a log file first, your bottleneck will probably be the hard disk and its maximum I/O operations.
Normally I would say that sending the logs directly to ElasticSearch via the network should be the faster option when you are operating inside your own company/network, because writing to disk is always quite slow in comparison. But if you are using fast SSDs, the effect should be negligible. And if you need to send your network packets to a different location/country, this can also change quickly.
So in summary:
If you have a fast network connection to your ElasticSearch and HDDs/slower SSDs locally, performance might be better over the network.
If your ElasticSearch is not at your location and you can use fast SSDs, writing the logs into a file first might be the faster option.
But in the end, you may have to try out both approaches, add some timers, and check for yourself.
We are using both solutions. The first approach has less complexity.
We choose the second approach when we don't want to touch the code and have too many instances of the app.
About performance: by writing directly to ElasticSearch you get better performance, because you are not occupying disk I/O. But suppose the connection between your app and the ElasticSearch server is dropped: you would lose logs after some retry attempts.
Using rsyslog and logstash is more reliable for big clusters.
Async loggers in log4j2 can improve logging performance a lot, but are they robust enough? When a program is killed unexpectedly, will the log messages from before that point be flushed to disk? And does anyone know how many big projects (such as Apache projects) use async loggers, with some examples? Any help will be appreciated.
When any process dies, you are liable to lose log events that are being buffered. Most people who use file appenders turn buffering on because performance without it is considerably worse. Events in the OS buffer would be lost in that case. Likewise with most network protocols, unless you are using something like Apache Flume that immediately acknowledges receipt - but even then a few messages could be lost simply because the process died before the data was written. Remko's answer covers the subject of losing messages better than I could.
As for who uses it I can only answer that we know that Async Loggers are being used since we do get questions from time to time but there is no way to formally track who is using any open source project, much less how.
My company uses Async Loggers for a mission-critical equity trading system, without issues.
I have a big appengine-java application that uses java.util.logging.
For debugging purposes, I put an INFO message on basically every put, delete, get, or query. The application-wide logging settings filter out all log messages with a level lower than WARNING.
My question is: do all these INFO messages, even though filtered, slow down my app or not?
Every additional operation you perform will add to the overhead you have. I have had some REST calls time out because I had forgotten a logger in the wrong place :)
So yes, they do slow things down, but to what effect is very, very highly dependent on how much you are logging. In a normal situation, logging should not have any noticeable performance penalty. This should be easy to measure, just set your logging level higher to not log so much, and see if the application performs faster!
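One subtlety worth adding: even when a message is filtered out, an eagerly concatenated message string is still built before the filter ever runs. Since Java 8, java.util.logging accepts a Supplier, so the string is only constructed when the level is actually enabled. A minimal sketch (the class and method names are made up):

import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLoggingDemo {
    private static final Logger LOG = Logger.getLogger(LazyLoggingDemo.class.getName());

    public static void main(String[] args) {
        LOG.setLevel(Level.WARNING); // INFO is filtered out

        // Eager: the concatenation runs even though the message is discarded.
        LOG.info("state=" + expensiveDescription());

        // Lazy: the Supplier is invoked only if INFO is actually enabled.
        LOG.info(() -> "state=" + expensiveDescription());
    }

    private static String expensiveDescription() {
        return "...pretend this is costly to compute...";
    }
}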
When I start a JVM in debug mode, things naturally slow down.
Is there a way to state that I am interested in debugging only a single application, instead of the 15 (making up a number here) applications that run on this JVM?
An approach that facilitates this might make things faster, particularly when we already know from the logs and other trace facilities that the likely issue is with a single application.
Appreciate thoughts and comments
Thanks
Manglu
I am going to make a lot of assumptions here, especially as your question is missing a lot of contextual information.
Is there a way to state that i am interested in only debugging a single application instead of the 15 (making up a number here) applications that run on this JVM.
Firstly, I will assume that you are attempting to do this in production. If so, step back and think about what could go wrong. You might be setting a single breakpoint, but that will queue up all the requests arriving at that breakpoint, and by doing so you've thrown any SLA requirements out of the window. And if your application handles any sensitive data, you may end up seeing data that you are not supposed to see.
Secondly, even if you were doing this on a shared development or testing environment, it is a bad idea, especially if you are unsure of what you are looking for. If you are hunting a synchronization bug, then this is possibly the wrong way to do so; other threads will obviously be sharing the data that you are reading, making it less likely that you find the culprit.
The best alternative to this is to switch on trace logging in your application. This will, of course, be useless unless you have embedded the appropriate logger calls in your application (especially to trace method arguments and return values). With trace logs at your disposal, you should be able to create an integration or unit test that reproduces the exact conditions of failure on your local developer installation; that is where you ought to be doing your debugging. Sometimes even a functional test will suffice.
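For instance, java.util.logging ships entering/exiting calls that trace method arguments and return values at FINER level; a small sketch (the class, method, and values are placeholders):

import java.util.logging.Logger;

public class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    public double computeTotal(String orderId, int quantity) {
        // Logged at FINER; invisible unless trace-level logging is switched on.
        LOG.entering("OrderService", "computeTotal", new Object[] {orderId, quantity});
        double total = quantity * 9.99; // placeholder business logic
        LOG.exiting("OrderService", "computeTotal", total);
        return total;
    }
}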
There is no faster approach in general, as it is simply not applicable to all situations. It is possible for you to establish a selected number of breakpoints in any of the other environments, but it simply isn't worth the trouble, unless you know that only your requests are being intercepted by the debuggee process.
My application, while running, writes logs. I need to check whether indexing has completed by checking whether a particular status message has been written to the logs (note that logging happens continuously while the process is running). My application does not send a signal when it has completed indexing; it just logs the fact and moves on to other work. I could poll the logs continuously to check whether the status has been written, but that would be an anti-pattern or bad design. I can't have a busy-waiting or do-nothing loop either - another bad design. What is the best way to check for the expected entry in the logs without querying the logs repeatedly and while consuming fewer CPU cycles?
Polling is the usual solution. Other solutions require the collaboration of the generating process in some way; if this is possible, it's obviously a preferable solution, but if the generating process is to remain unaware of the listener (in the sense of not knowing about its existence), then polling is about the only valid solution. (Depending on the logging facilities, you might be able to arrange for the log to go into a named pipe, and read that.)
Note that polling isn't necessarily that expensive, if you aren't doing it too often.
If you control both of the programs (i.e. reading and writing the log), then the easiest solution is to have the writer notify all listeners when it is done using some form of inter-process communication (e.g. signals).
Only if IPC is not possible should you look at smarter methods of waiting or polling for changes. Most operating systems let you register a callback for when a file or directory is modified. Take a look at this question for some suggestions.
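On the JVM, that callback mechanism is java.nio.file.WatchService. A minimal sketch that watches the log directory for modifications (the directory path is illustrative):

import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class LogDirWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/var/log/myapp"); // illustrative directory
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);

        while (true) {
            WatchKey key = watcher.take(); // blocks until the OS reports a change
            for (WatchEvent<?> event : key.pollEvents()) {
                System.out.println("Modified: " + event.context());
                // Re-read the tail of the log file here and look for the status line.
            }
            if (!key.reset()) break; // directory is no longer accessible
        }
    }
}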
Assuming that log parsing is the only alternative you have, the idiom you are looking for has the following high-level representation (UNIX CLI style):
# tail -f logfile.txt | grep STATUS_PATTERN
There (1) "tail -f" prints out any new lines that are appended to logfile.txt and passes them to (2) "grep" which performs the actual pattern matching.
Both (1) and (2) are trivial to implement in Java/C++ as a separate thread/process, and they impose a lighter load than periodic polling.
You will also need a little bit of extra functionality to detect log rotation conditions.
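A rough Java sketch of the same tail-and-grep idea, reusing the file name and pattern from the example above (rotation handling deliberately omitted):

import java.io.BufferedReader;
import java.io.FileReader;

public class LogTailer {
    public static void main(String[] args) throws Exception {
        try (BufferedReader reader = new BufferedReader(new FileReader("logfile.txt"))) {
            while (true) {
                String line = reader.readLine();
                if (line == null) {
                    Thread.sleep(200); // nothing new yet; wait briefly (the "tail -f" part)
                } else if (line.contains("STATUS_PATTERN")) { // the "grep" part
                    System.out.println("Status found: " + line);
                    break;
                }
            }
        }
    }
}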