Overhead reduction of conditional trace/logging calls


For tracing and debugging my Java code, I am using a simple Util class rather than a full-blown logging framework:
public class Util {
public static int debugLevel;
public static void info(String msg) {
// code for logfile output
// handling of unprintable characters
// etc. omitted
System.out.println(msg);
}
public static void info4(String msg) {
if (debugLevel >= 4) {
info(msg);
}
}
}
This allows compact single-line statements like the following:
info4(String.format("%16s: ", host) + Util.toHex(sidNetto));
Using a debugLevel variable, I can control the verbosity of my program. Usually, the debug level is set globally at execution start, but it can also be adjusted locally at the routine level.
Basically, I save the repeated if (debugLevel >= DEBUG_ALL) {...} brackets around my trace calls. However, the parameters of the call have to be prepared and passed at runtime regardless of the debug level.
My question:
How could I nudge the compile-time optimizer or the JVM to remove superfluous trace calls? I am thinking along the lines of C/C++ function inlining.
A related question regarding C# was discussed here. But I am not sure how to port the suggested answers to Java.
Another related post from 2010 discussed approaches similar to mine. I'm wondering whether third-party tools like ProGuard are actually required to solve such a common task.

Most logging APIs that I know suggest checking whether the log level is enabled before calling the log method when the message has to be prepared first, e.g.:
if (logger.isTraceEnabled()) {
String msg = String.format("Name changed from %s to %s", oldName, newName);
logger.trace(msg);
}
Some logging APIs like SLF4J also provide more complex log methods that accept a format string and multiple arguments, so that the log message is only built in case the log level is enabled:
logger.trace("Name changed from {} to {}", oldName, newName);
This is sufficient in most of the cases, but sometimes your message is more complex to build, or the arguments have to be converted to strings first. In this case, checking the log level is still a good approach.
Since Java 8, you could also take advantage of lambda expressions to solve this issue. Your log method could be implemented like that:
public void log(Supplier<String> messageSupplier) {
if (isLogEnabled()) {
String msg = messageSupplier.get();
// TODO: log msg
}
}
As you can see, the message is retrieved from the messageSupplier only if logging is enabled. Thanks to lambda expressions, implementing a Supplier<String> is very easy:
logger.log(() -> String.format("Name changed from %s to %s", oldName, newName));
Update (thanks to Joshua Taylor)
Since Java 8, the java.util.logging API already supports message suppliers, e.g. see Logger#info, so you could easily replace your logging implementation with the 'on-board' solution from the JRE.
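Applied to the Util class from the question, the supplier idea might look roughly like the sketch below. The class name LazyUtil, the formatCalls counter and the expensiveMessage helper are made up for illustration; only debugLevel, info and info4 come from the question.

```java
import java.util.function.Supplier;

class LazyUtil {
    static int debugLevel = 0;
    static int formatCalls = 0;   // illustration only: counts how often a message was built

    static void info(String msg) {
        System.out.println(msg);
    }

    // Supplier-based variant of info4: the message is built only if level 4 is active.
    static void info4(Supplier<String> msg) {
        if (debugLevel >= 4) {
            info(msg.get());
        }
    }

    // Hypothetical stand-in for an expensive message such as String.format(...) + Util.toHex(...).
    static String expensiveMessage(String host) {
        formatCalls++;
        return String.format("%16s: ", host) + "0a1b";
    }

    public static void main(String[] args) {
        debugLevel = 0;
        info4(() -> expensiveMessage("alpha"));   // disabled: the supplier is never invoked
        debugLevel = 4;
        info4(() -> expensiveMessage("beta"));    // enabled: the supplier is invoked once
        System.out.println("messages built: " + formatCalls);
    }
}
```

The call site stays a single line, and the formatting work moves inside the lambda, so it is skipped whenever the level check fails. The lambda itself is still created on every call, which is usually cheap but not free.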

This is how most logging frameworks do it. For lightweight arguments (including those handled by the built-in formatters, which is good practice) do not check the level; otherwise, check the level before serialising complicated string arguments.
You could use Java 8's java.util.function.Supplier<String> for lazy evaluation, but I think there might be no performance to gain over the explicit level test.
The logger would look like:
void debug(String ptrn, Supplier<String>... args)
And you can use it like:
debug("Hello {0}", this::getName);
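A minimal runnable sketch of such a logger, assuming java.text.MessageFormat for the {0} placeholders. The class name SupplierArgsLogger, the debugEnabled flag and the returned string are illustration-only choices, not part of any real logging API.

```java
import java.text.MessageFormat;
import java.util.function.Supplier;

class SupplierArgsLogger {
    static boolean debugEnabled = false;   // hypothetical switch standing in for a level check

    private final String name;

    SupplierArgsLogger(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Evaluates the suppliers and formats the pattern only when debugging is enabled.
    // Returns the formatted line (or null when disabled) purely to make the demo observable.
    @SafeVarargs
    static String debug(String pattern, Supplier<String>... args) {
        if (!debugEnabled) {
            return null;
        }
        Object[] values = new Object[args.length];
        for (int i = 0; i < args.length; i++) {
            values[i] = args[i].get();
        }
        String line = MessageFormat.format(pattern, values);
        System.out.println(line);
        return line;
    }

    String greet() {
        return debug("Hello {0}", this::getName);   // method reference, no parentheses
    }

    public static void main(String[] args) {
        debugEnabled = true;
        new SupplierArgsLogger("World").greet();
    }
}
```

Each call still allocates a varargs array and the method-reference objects, which is one reason an explicit level test may be just as fast.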

It seems odd to avoid established logging frameworks because of their complexity, yet worry about minor optimizations like method inlining while ignoring the greater problem of formatting a log string irrespective of log level. But if you insist on reinventing the wheel:
The JVM (at least the Oracle hotspot JVM) automatically inlines short methods, and performs dead code elimination of unreachable branches. To be detected as unreachable, the log level of the message and the level threshold would have to be constant (compile-time constant, or static final). Otherwise, the JVM will compare logging levels on each call, though it is still likely to perform speculative inlining (inline the branch usually taken, guarded by a conditional branch instruction) which ensures that a branch instruction is only executed in the unusual case.
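To illustrate the constant case, here is a small sketch with a static final flag. The names TraceConstants, TRACE, describe and buildCount are invented for the example. With a compile-time constant false condition, javac typically leaves the guarded block out of the bytecode altogether, so neither the branch nor the argument evaluation remains.

```java
class TraceConstants {
    // Compile-time constant: changing it requires recompiling every class that uses it.
    static final boolean TRACE = false;

    static int buildCount = 0;   // illustration only: counts how often a message was built

    static String describe(int value) {
        buildCount++;
        return "value=" + value;
    }

    static int compute(int x) {
        if (TRACE) {
            // Never executed while TRACE is false, so describe(x) is never called.
            System.out.println(describe(x));
        }
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(compute(21));
        System.out.println("messages built: " + buildCount);
    }
}
```

The price is flexibility: a constant cannot be adjusted at start-up or per routine, which the question explicitly wants, so this only fits traces that are switched at build time.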
A much greater concern however is the cost of building the log message, which should only be incurred if the message must actually be logged. The old log4j approach of requiring calling code to check whether logging is enabled before preparing the message is rather verbose and easily forgotten. Instead, SLF4J defers string concatenation to the logging system by having the log methods take a format string and a variable number of objects to be inserted into placeholders. The SLF4J FAQ writes:
The following two lines will yield the exact same output. However, the second form will outperform the first form by a factor of at least 30, in case of a disabled logging statement.
logger.debug("The new entry is "+entry+".");
logger.debug("The new entry is {}.", entry);
It is worth noting that the arguments (here: entry) are of type Object, so their conversion to String only happens if the message actually has to be logged.
To be clear, there is no reliable way to skip evaluation of method arguments by redefining a method, because such elimination may only occur if the just-in-time compiler can prove the evaluation to be free of side effects. The HotSpot JVM only detects this if it has inlined the entire evaluation, which it will only do for very simple evaluations. Therefore, the API solution of moving formatting into the logging system is probably the best you can hope for.

Related

Method resolution during logging: Why is ()->String.format(...) not recognized as Supplier<String>?

I have made a call to the log4j-v2-API like this:
Logger logger = LogManager.getLogger("MyLogger");
/**
* syntactic sugar as part of a facade
*/
private void logAtLevel(String level, Supplier<String> messageSupplier, Throwable thrown){
Level priority = Level.toLevel(level, Level.ALL);
if (null == thrown) {
logger.log(priority, messageSupplier);
} else {
logger.log(priority, messageSupplier, thrown);
}
}
/**
* method calling logger
*/
private void callLogging(Object value){
logAtLevel("Debug", ()->String.format("something or other: %s", value), null);
}
My expectations would have been that the above call creates a log entry "something or other: <object.toString()>" however I got "lambda#1223454" instead.
This suggests to me that the method that was executed is log(level, object) rather than the expected log(level, supplier)
Why are the methods resolved the way they are and how can I prevent this (preferably without casting)?

Your Supplier is not Log4j2's Supplier!
java.util.function.Supplier was introduced in Java 8. At around the same time (version 2.4, cf. LOG4J2-599) Log4j2 introduced org.apache.logging.log4j.util.Supplier so that the library can be used on Java 7.
That is why your code does not call log(Level, Supplier<?>, Throwable) but log(Level, Object, Throwable) and you end up with Supplier#toString() being logged instead of Supplier#get().
This is probably something that should change and I filed a wish list bug about it (cf. #1262).
Remark: Wrapping well established logging APIs like Log4j2 API and SLF4J into custom wrappers is not a good idea (cf. this question). While it is not rocket science to write a good wrapper, there are many details you should consider. For example your wrapper breaks location information.
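The overload mix-up can be reproduced without Log4j2 at all. The sketch below uses a made-up LibrarySupplier interface as a stand-in for a library-specific supplier type, plus two stand-in log overloads; none of these names belong to Log4j2.

```java
import java.util.function.Supplier;

class OverloadDemo {
    // Hypothetical stand-in for a library's own supplier type (not the Log4j2 class).
    @FunctionalInterface
    interface LibrarySupplier<T> {
        T get();
    }

    // Stand-ins for the two overloads: one takes any Object, one takes the library supplier.
    static String log(Object message) {
        return "object";
    }

    static String log(LibrarySupplier<?> supplier) {
        return "supplier:" + supplier.get();
    }

    // A java.util.function.Supplier is not a LibrarySupplier, so only log(Object) applies.
    static String viaJdkSupplier(Supplier<String> s) {
        return log(s);
    }

    // A method reference has no type of its own; it adapts to LibrarySupplier.
    static String viaMethodRef(Supplier<String> s) {
        return log(s::get);
    }

    public static void main(String[] args) {
        Supplier<String> s = () -> "something or other";
        System.out.println(viaJdkSupplier(s));
        System.out.println(viaMethodRef(s));
    }
}
```

By the same reasoning, passing messageSupplier::get instead of messageSupplier in the question's facade should select the supplier overload without a cast, though I have not checked that against every Log4j2 version.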

Doing a level lookup for every call, when your intent is to avoid every nanosecond for levels that don't matter, is probably something you need to reconsider.
At any rate, there is nothing immediately obvious about your snippet that would explain what you observe. Thus, let's get into debugging it, and likely causes.
The code you are running isn't the code you wrote. For example, because you haven't (re)compiled.
logger.log is overloaded, and older versions of slf4j exist that contain only the 'object' variant, which just calls toString(), which would lead to precisely the behaviour you are now witnessing.
You can do some debugging on both of these problems by applying the following trick:
Add System.out.println(TheClassThisCodeIsIn.class.getResource("TheClassThisCodeIsIn.class")); - this will tell you exactly where the class file is at.
Then use javap -v -c path/to/that/file.class and check the actual call to logger.log - is the 'signature' that it links to the Supplier variant or the Object variant?
The different effects and what it means:
The sysout call isn't showing up. Then you aren't running the code you are looking at, and you need to investigate your toolstack. How are you compiling it, because it's not working.
The javap output shows that the 'Object' variant is invoked. In which case the compilation step is using an old version of log4j2 in its classpath. It's not about the slf4j version that exists at runtime, it's what the compiler is using, as that determines which overload is chosen.
Everything looks fine and yet you are still seeing lambda#... - then I would start suspecting that this is the actual output of your lambda code somehow, or something really, really wonky is going on. The only things I can think of that get you here are ridiculously exotic, not worth mentioning.

toString in logback

I am using logback in our project. I have gone through the logback documentation on curly-brace placeholders.
logger.debug(" My class output value - {}, object toString() {}", object.value(), object.toString());
Debug is not enabled in my project. We saw that toString() is being called, which impacts our performance. Will toString() be called during code execution with debug disabled?
Can I use toString() with this approach? As per the curly-brace documentation, string concatenation will not happen. Does that apply only to string concatenation, or to method calls too?

From the LogBack documentation:
Better alternative
There exists a convenient alternative based on message formats. Assuming entry is an object, you can write:
Object entry = new SomeObject();
logger.debug("The entry is {}.", entry);
Only after evaluating whether to log or not, and only if the decision is positive, will the logger implementation format the message and replace the '{}' pair with the string value of entry. In other words, this form does not incur the cost of parameter construction when the log statement is disabled.
So, by passing just the object, without invoking toString(), you will save the toString() overhead.

Logback doesn't change the rules of Java: When you call a method, the arguments to the method need to be evaluated in order to be passed to the method. All you save by using the curly-brace notation when the log level isn't enabled is the cost of the String concatenation, that is to say, the cost of constructing the complete String to be logged from the individual components.
If the cost of evaluating the arguments is measurably decreasing your performance to the point where it no longer meets your customers' needs, then you probably want to avoid the cost of running it if the log level isn't enabled. From the Logback Manual, Chapter 2: Architecture, "Parameterized logging":
One possible way to avoid the cost of parameter construction is by surrounding the log statement with a test. Here is an example.
if(logger.isDebugEnabled()) {
logger.debug("Entry number: " + i + " is " + String.valueOf(entry[i]));
}
This way you will not incur the cost of parameter construction if debugging is disabled for logger. On the other hand, if the logger is enabled for the DEBUG level, you will incur the cost of evaluating whether the logger is enabled or not, twice: once in debugEnabled and once in debug. In practice, this overhead is insignificant because evaluating a logger takes less than 1% of the time it takes to actually log a request.
Using the curly-brace syntax (presented shortly after in the manual) often is a good compromise, and I really prefer it just because it helps distinguish between the statement being logged and the data that goes into it. But it isn't quite the same as being skipped entirely if your log level isn't enabled, because the parameters are still evaluated and passed to the logging system before it can figure out whether it needs to log them or not.
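The difference between passing object.toString() and passing the object itself can be shown with a tiny stand-in logger. This is plain Java rather than Logback; the names ToStringCostDemo, Entry, debug and toStringCalls are invented for the sketch.

```java
class ToStringCostDemo {
    static boolean debugEnabled = false;   // stand-in for the logger's level check
    static int toStringCalls = 0;

    static class Entry {
        @Override
        public String toString() {
            toStringCalls++;               // count every conversion to String
            return "Entry";
        }
    }

    // Stand-in for logger.debug(String, Object): converts the argument only when enabled.
    static void debug(String format, Object arg) {
        if (debugEnabled) {
            System.out.println(format.replace("{}", String.valueOf(arg)));
        }
    }

    static int callsWithExplicitToString() {
        toStringCalls = 0;
        Entry entry = new Entry();
        debug("value {}", entry.toString());   // toString() runs before debug() is entered
        return toStringCalls;
    }

    static int callsWithPlainObject() {
        toStringCalls = 0;
        debug("value {}", new Entry());        // conversion is left to the logger
        return toStringCalls;
    }

    public static void main(String[] args) {
        System.out.println("explicit toString(): " + callsWithExplicitToString());
        System.out.println("plain object:        " + callsWithPlainObject());
    }
}
```

With debug disabled, the first form still pays for one toString() call and the second pays for none. An argument like object.value() is evaluated in both forms, because it is an ordinary method call made before the logger is entered.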

Why bother using lambda expressions in logging APIs if the compiler can possibly inline the logging call

Many logging frameworks (e.g., log4j) allow you to pass lambda expressions instead of Strings to the logging API. The argument is that if the string is particularly expensive to construct, the string construction can be executed lazily via the lambda expression. That way, the string is only constructed if the system's log level matches that of the call.
But, given that modern compilers do much method inlining automatically, is there really a point to using lambda expressions in this way? I'll supply a simplified example below to demonstrate this concern.
Suppose our traditional logging method looks like this:
void log(int level, String message) {
if (level >= System.logLevel)
System.out.println(message);
}
// ....
System.logLevel = Level.CRITICAL;
log(Level.FINE, "Very expensive string to construct ..." + etc);
Let's suppose that FINE is less than CRITICAL, so, although an expensive string is constructed, it's all for naught since the message is not output.
Lambda logging APIs help this situation so that the string is only evaluated (constructed) when necessary:
void log(int level, Supplier<String> message) {
if (level >= System.logLevel)
System.out.println(message.get());
}
// ....
System.logLevel = Level.CRITICAL;
log(Level.FINE, () -> "Very expensive string to construct ..." + etc);
But, it's feasible that the compiler can just inline the logging method so that the net effect is as follows:
System.logLevel = Level.CRITICAL;
if (Level.FINE >= System.logLevel)
System.out.println("Very expensive string to construct..." + etc);
In this case, we don't have to evaluate the string prior to the logging API call (because there is none), and presumably, we would gain performance from just the inlining.
In summary, my question is, how do lambda expressions help us in this situation given that the compiler can possibly inline logging API calls? The only thing I can think of is that, somehow, in the lambda case, the string is not evaluated if the logging level is not a match.

Your optimization hasn't just introduced inlining - it's changed ordering. That's not generally valid.
In particular, it wouldn't be valid to change whether methods are called, unless the JIT can prove that those methods have no other effect. I'd be very surprised if a JIT compiler would inline and reorder to that extent - the cost of checking that all the operations involved in constructing the argument to the method have no side effects is probably not worth the benefit in most cases. (The JIT compiler has no way of treating logging methods differently to other methods.)
So while it's possible for a really, really smart JIT compiler to do this, I'd be very surprised to see any that actually did this. If you find yourself working with one, and write tests to prove that this approach is no more expensive than using lambda expressions, and continue to prove that over time, that's great - but it sounds like you're keener on assuming that's the case, which I definitely wouldn't.

Raffi, let's look at an example of how the compiler inlining you are talking about would change the program logic, and why the compiler would need to be very smart to figure that out:
public String process(){
//do some important business logic
return "Done processing";
}
1) Without inlining, process() will be called regardless of the logging level:
log( Level.FINE, "Very expensive string to construct ..." + process() );
2) With inlining, process() will be called only at certain logging levels, and our important business logic won't run otherwise:
if (Level.FINE >= System.logLevel)
System.out.println("Very expensive string to construct..." + process() );
The compiler in this case has to figure out how the message string is created and not inline the method if it calls any other method during its creation.
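A small runnable sketch makes the behavioural difference visible. The integer levels, the processCalls counter and the class name SideEffectDemo are assumptions for the demo; the two log overloads mirror the ones in the question.

```java
import java.util.function.Supplier;

class SideEffectDemo {
    static final int FINE = 1;
    static final int CRITICAL = 5;
    static int logLevel = CRITICAL;
    static int processCalls = 0;   // counts how often the "business logic" ran

    static String process() {
        processCalls++;
        return "Done processing";
    }

    // Eager variant: the caller has already built the message.
    static void log(int level, String message) {
        if (level >= logLevel) {
            System.out.println(message);
        }
    }

    // Lazy variant: the message is built only if it will be printed.
    static void log(int level, Supplier<String> message) {
        if (level >= logLevel) {
            System.out.println(message.get());
        }
    }

    static int eagerCalls() {
        processCalls = 0;
        log(FINE, "Very expensive string to construct ..." + process());
        return processCalls;
    }

    static int lazyCalls() {
        processCalls = 0;
        log(FINE, () -> "Very expensive string to construct ..." + process());
        return processCalls;
    }

    public static void main(String[] args) {
        System.out.println("eager: process() ran " + eagerCalls() + " time(s)");
        System.out.println("lazy:  process() ran " + lazyCalls() + " time(s)");
    }
}
```

In the eager form process() runs even though nothing is printed; in the lazy form it does not. The two programs are therefore observably different, which is why an optimizer may not silently turn one into the other.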

This kind of optimisation-inlining would work for only really simple examples like you have provided (when it is just String concatenation).
In fact, this API can be used in more sophisticated way:
public void log(Level level, Supplier<String> msgSupplier)
Let's say I have a dedicated supplier, which performs a quite expensive log-message producing:
Supplier<String> supplier = () -> {
// really complex stuff
};
and then I use it in several places:
LOGGER.log(Level.SEVERE, supplier);
...
LOGGER.log(Level.SEVERE, supplier);
Then, what would you inline? Unwrapping-inlining it into
System.logLevel = Level.CRITICAL;
if (Level.FINE >= System.logLevel)
System.out.println(supplier.get());
doesn't make any sense.
As stated in the java.util.logging.Logger class Javadoc:
Log a message, which is only to be constructed if the logging level is
such that the message will actually be logged.
So this is the purpose: if the message will not be logged, you avoid constructing it, rather than performing these calculations and passing the result as a parameter.
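Here is a short sketch of a shared supplier used with java.util.logging, relying on Logger.log(Level, Supplier<String>) from Java 8. The logger name, the builds counter and the class name SharedSupplierDemo are placeholders for the example.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

class SharedSupplierDemo {
    static int builds = 0;   // counts how often the expensive message was produced

    static int run() {
        Logger logger = Logger.getLogger("shared.supplier.demo");   // hypothetical logger name
        logger.setUseParentHandlers(false);   // keep the demo's console output quiet
        logger.setLevel(Level.SEVERE);
        builds = 0;

        // One supplier, reused at several call sites.
        Supplier<String> supplier = () -> {
            builds++;
            return "expensive report";
        };

        logger.log(Level.FINE, supplier);     // below the threshold: supplier not invoked
        logger.log(Level.SEVERE, supplier);   // loggable: supplier invoked
        logger.log(Level.SEVERE, supplier);   // loggable: supplier invoked again
        return builds;
    }

    public static void main(String[] args) {
        System.out.println("message built " + run() + " time(s)");
    }
}
```

The supplier is an ordinary object that can be stored and passed around, so there is no single call site where its body could be "unwrapped" by inlining.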

Logging syntax details

I have a simple question on logging
why it is common to use this syntax for logging:
LOG.debug("invalidate {}",_clusterId);
not this:
LOG.debug("invalidate" + _clusterId);

In your example, say you have the logging level set to INFO. You'd like to ignore debug-level messages entirely. But the log method can't check the log level until the method is entered, after it gets the parameters. So if you don't know if you're going to need a parameter it's better to avoid having to evaluate it.
With your second example, even though logging is set to info, _clusterId gets toString called on it, then that resulting string is concatenated with the preceding string. Then once the method is entered the logger figures out the debug level doesn't need logging and it throws away the newly-created string and exits the method.
With the first example if debug-level logging is not enabled then _clusterId doesn't get toString called on it and the log message doesn't get built. Calling toString may be slow or create garbage, it's better to avoid it for cases where nothing is going to be logged anyway.
Here's the source code for the debug method on log4j's org.apache.log4j.Category (which is the superclass of Logger):
public void debug(Object message, Throwable t) {
if(repository.isDisabled(Level.DEBUG_INT))
return;
if(Level.DEBUG.isGreaterOrEqual(this.getEffectiveLevel()))
forcedLog(FQCN, Level.DEBUG, message, t);
}

When you have a statement with several parameters, writing the pattern as a string followed by the parameters makes the code more readable. It may also be more efficient, avoiding the needless creation of many temporary string objects, but that depends on how the logging framework implements interpolation internally.
To see the first point, compare this with the equivalent line that uses string concatenation.
LOG.debug("{}: Error {} while processing {} at stage {}", currentFile,
exception.getMessage(), operation.getName(), operation.getStage());
When there's only one parameter it doesn't really matter which one you use, apart from being consistent with the general case.

Seeking for a lightweight, but flexible solution to log method entry and exit (possibly by an exception) with log4j

I need a lightweight solution, which would enable me to log method entry and/or exit and/or exception. I would like to be able to configure:
The entry/exit/exception log level. For instance, I might have Debug level for exit, Info level for entry and Error level for exception.
The entry/exit/exception log message. The log message should allow me to reference the method name, the method parameters and the method result.
Let us take this method, for example:
public ObjectB MyFunc(int x, String s, ObjectA y)
{
// implementation
}
Then, assuming @Log is such a magic annotation, one could have any of these @Log annotations on the MyFunc method:
@Log // should be a reasonable default
@Log(EntryLevel = LogLevel.Info, ExitLevel = LogLevel.Debug) // makes sense to have ExceptionLevel = LogLevel.Error by default
@Log(EntryText = "{method}({methodArguments}) = ?", ExitText = "{method} = {methodResult}")
@Log(EntryText = "{method}({#x}, {#s}, ...)")
Where:
{method} would be replaced by "MyFunc"
{methodResult} would be replaced by the toString of the method result
{#x} would be replaced by the toString of the x method argument
{#s} would be replaced by the toString of the s method argument
{#y} would be replaced by the toString of the y method argument
{methodArguments} would be replaced by something like "x: {#x}, s: {#s}, y: {#y}", or "{#x}, {#s}, {#y}", which could be governed by a boolean flag, like includeArgumentName.
For the record, in .NET we use a modified (by us) version of the Log4PostSharp library, which does all of the above by injecting the right bytecode during the compilation phase. Hence, I have a pretty good idea what I am looking for, though I have absolutely no idea how to do it in Java and whether something like this has already been done.
Thanks.
EDIT
I would like to address the performance issue. The code may be abundant with the logging statements. Executing these statements must not incur any visible performance penalty when no actual logging is performed (due to log level constraints, for instance).

Have you considered AspectJ? Some additional details are at http://www.christianschenk.org/blog/logging-with-aspectj/ and http://marxsoftware.blogspot.com/2008/10/logging-method-entryexit-with.html .

This sounds like a classic AOP problem.
Spring has a flexible solution that is relatively easy to use, see here.
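For a feel of what such interception does, here is a rough sketch using a plain JDK dynamic proxy (java.lang.reflect.Proxy). This is neither AspectJ nor Spring AOP, it only works for calls made through an interface, and it records events in a list instead of calling log4j. The names EntryExitProxyDemo, Calculator, traced and events are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EntryExitProxyDemo {
    interface Calculator {
        int divide(int a, int b);
    }

    // Stand-in for a real log: collects entry, exit and exception events.
    static final List<String> events = new ArrayList<>();

    // Wraps target in a proxy that records entry, exit and exceptions of every interface call.
    static <T> T traced(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            events.add("enter " + method.getName() + Arrays.toString(args));
            try {
                Object result = method.invoke(target, args);
                events.add("exit " + method.getName() + " = " + result);
                return result;
            } catch (InvocationTargetException e) {
                // Unwrap so callers see the original exception type.
                events.add("exception " + method.getName() + ": " + e.getCause());
                throw e.getCause();
            }
        };
        Object proxy = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        Calculator real = (a, b) -> a / b;
        Calculator calc = traced(real, Calculator.class);
        calc.divide(6, 3);
        try {
            calc.divide(1, 0);
        } catch (ArithmeticException expected) {
            // the proxy has already recorded the exception event
        }
        events.forEach(System.out::println);
    }
}
```

Note that this sketch builds the event strings unconditionally. To meet the performance requirement from the question, the handler would need a level check before formatting, and reflection adds a cost of its own, which is where compile-time weaving as in AspectJ has the advantage.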
