Java Log Coverage tool

Are there any tools or strategies for generating a "Log coverage" report on (Java, log4j)? Like code coverage, but ensuring that there aren't large methods, classes, or packages which don't log anything.
When coding web services, my team doesn't write many log statements. When debugging a real-time problem in running production code, we always wish we had. Inevitably we try to reproduce the bug in our test environment with either a debugger attached or additional log statements added, which can be very difficult depending on the structures and inter-operation involved.
Does anyone use this as a code-quality metric?

Code coverage takes special instrumentation because you're trying to find out whether a piece of production code is exercised by any test. What you're asking is a little more vague and could be either much easier ("is any logging done for this large class?") or much harder to the point of impossible ("did we log the method that's going to break in production?").
For the first question, you could whip up a script pretty quickly to do the job. Here's a skeleton in Perl, for example. I assume here that we're using SLF4J and that seeing the import of "LoggerFactory" is enough evidence to assume there's a logger.
# For each file named on the command line, count its lines and note
# whether it imports SLF4J's LoggerFactory.
while ($filename = shift) {
    open my $in, '<', $filename or next;
    my $loc = 0;
    my $log = "NO LOGGER";
    while (<$in>) {
        $loc++;
        if (m/import org\.slf4j\.LoggerFactory/) {
            $log = "has logger";
        }
    }
    print "$filename : $loc LOC $log\n";
    $total{$log} += $loc;   # accumulate LOC per category
}
print "\n\nTOTAL LOGGED: $total{'has logger'}\nTOTAL UNLOGGED: $total{'NO LOGGER'}\n";
and I can run it from my shell over all the Java files in a small project with
$ find . -name \*.java -exec perl haslog.pm {} \+
This only works for small projects, and it's fairly brittle, but it wouldn't be a ton of work to make a more robust version of it.
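If shell scripting isn't your thing, the same idea ports straight to Java. This is only a sketch under the same assumption (an SLF4J LoggerFactory import counts as "has a logger"); the class name and the use of the Java 11+ file APIs are my own choices, not anything standard:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LogCoverage {
    public static void main(String[] args) throws IOException {
        Path root = Path.of(args.length > 0 ? args[0] : ".");

        // Collect all .java files under the given root.
        List<Path> javaFiles;
        try (Stream<Path> paths = Files.walk(root)) {
            javaFiles = paths.filter(p -> p.toString().endsWith(".java"))
                             .collect(Collectors.toList());
        }

        long logged = 0, unlogged = 0;
        for (Path p : javaFiles) {
            List<String> lines = Files.readAllLines(p);
            boolean hasLogger = lines.stream()
                    .anyMatch(l -> l.contains("import org.slf4j.LoggerFactory"));
            System.out.printf("%s : %d LOC %s%n", p, lines.size(),
                    hasLogger ? "has logger" : "NO LOGGER");
            if (hasLogger) logged += lines.size(); else unlogged += lines.size();
        }
        System.out.printf("%nTOTAL LOGGED: %d%nTOTAL UNLOGGED: %d%n", logged, unlogged);
    }
}

Run it with the project root as the only argument; like the Perl version, it only checks for the import line, so it says nothing about how much each class actually logs.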

Lots of logs can be noise, and in my experience tracing through logs is always painful. Having said that, if the logs are managed well you can get good diagnostics and reporting. One reason code ends up poorly tested is that it has lots of logs in it: developers tend to add a log statement while developing to check that the code works, which encourages them not to write a test with the right assertion. What you need is lots of little, well-tested classes composed together, where the assertion tells you exactly why a test is failing.
Let's say your code path is expected to do something that is its main responsibility (e.g. creating a DB entry to register a user, or logging someone in); by main responsibility I don't mean a side effect that happens along the way. If there is an error condition in that main code path, the exception should be thrown all the way up the stack to where you can log it and convert it to a user-friendly message. RuntimeExceptions are good here because you don't want to catch them until you're all the way up at the view layer. Side effects can be logged as well, as info or warnings. A sketch of what that top-level handling might look like follows.
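For illustration only (RegistrationController, UserService and the messages are made-up names; SLF4J is assumed), the "log once, at the top" idea might look like this:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RegistrationController {
    private static final Logger log = LoggerFactory.getLogger(RegistrationController.class);
    private final UserService userService;           // hypothetical collaborator

    public RegistrationController(UserService userService) {
        this.userService = userService;
    }

    public String register(String email) {
        try {
            userService.register(email);              // may throw from deep in the call stack
            return "Welcome!";
        } catch (RuntimeException e) {
            // The one place the failure is logged and turned into a user-friendly message.
            log.error("Registration failed for {}", email, e);
            return "Sorry, we could not register you right now.";
        }
    }
}

interface UserService {
    void register(String email);
}

Everything below the controller just lets the RuntimeException propagate; only the top layer logs it, with the full stack trace attached.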

Related

Is it possible to have Log4J 2 keep everything in memory and output on demand?

OK, so the problem I'd like to solve is that tests in CI are overly verbose. I could turn down the logging in general, but that would hurt when you actually need to debug one. What I think I'd like to do is accumulate all the logs in memory, decide on the actual level to emit once the suite has errored or succeeded, and then write out only the logs that need to be written. Bonus points if I could somehow do this by inspecting the exception stack when it fails. Is this possible?
Possible duplicate. Really it's the same question, but in six years we've gotten Log4j 2 and JUnit 5, so I guess I'm wondering whether this has changed.
Nope, that watcher is still it (write your own, in other words), because your request is still bizarre. Time isn't going to change this.
Logging frameworks love to write ASAP, not later. After all, maybe the VM is about to hard-crash and that is the absolute worst time of all to cache some writes - the very log statements that provide insights into the hard crash are now lost to the ether.
Normally, the solution is to use tools that read the log files and process them in some fashion - usually by chucking most of the lines out as you copy and zip them up for the ages, keeping the full, uncompressed dump only for the most recent few logs.
Key principle there being: A separate process (may not even be written in java) that does this later.
Can you configure your CI system to do this? That sounds like the right approach: Log at a pretty verbose level, and have the CI system strip the logs down to a much more strict level after the tests ran if the tests were successful.
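That said, if you do end up writing your own watcher, here is a rough sketch of the shape it could take with JUnit 5. BufferingAppender is a hypothetical custom Log4j 2 appender that collects formatted log lines in memory instead of writing them immediately; it is not something either library ships.

import org.junit.jupiter.api.extension.ExtensionContext;
import org.junit.jupiter.api.extension.TestWatcher;

// Register on the test class with @ExtendWith(DumpLogsOnFailure.class).
public class DumpLogsOnFailure implements TestWatcher {

    @Override
    public void testFailed(ExtensionContext context, Throwable cause) {
        // Only when a test fails do we replay the buffered output.
        // drain() is assumed to return the buffered lines and empty the buffer.
        BufferingAppender.drain().forEach(System.err::println);
    }

    @Override
    public void testSuccessful(ExtensionContext context) {
        // Successful tests discard their buffered output, keeping CI logs quiet.
        BufferingAppender.clear();
    }
}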
For JUnit 4 the library System Rules allows you to suppress output that is written to the console for successful tests. You have to add the rules SystemErrRule and SystemOutRule to your test.
@Rule
public final SystemErrRule muteSystemErr
        = new SystemErrRule().muteForSuccessfulTests();

@Rule
public final SystemOutRule muteSystemOut
        = new SystemOutRule().muteForSuccessfulTests();
Disclaimer: I'm the author of System Rules.

Understanding a large Java program

I am working on a Java project and I have to extend it (add more functionality), but I don't know how I should learn the existing code before building on it.
Is there any specific path I should follow?
Can I run it in a way so that I can see, statement by statement, the execution of the program?
I am kind of stuck in understanding it, thanks.
Here is another approach that is hacky, but one I've found useful in the past when unable to attach a debugger. If there is a piece of code you are looking at but you're having a hard time figuring out who is calling it, you can throw a new runtime exception, catch it, and print the stack trace.
try {
    // Throw and immediately catch just to capture the current call stack.
    throw new RuntimeException("who is calling me");
} catch (RuntimeException e) {
    e.printStackTrace();
}
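A lighter variant of the same trick, if you don't want the try/catch, is to dump the current stack directly:

// Prints the current thread's stack trace to System.err without throwing anything.
Thread.dumpStack();

// Or, equivalently, with a label you can grep for:
new Exception("who is calling me").printStackTrace();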
You can always fire it up in a debugger/your IDE of choice and step through it all you want, though it's probably best to find someone who is more familiar with the source to provide you an overview, or to look for documentation on where to start.
Pick one piece of functionality for which you understand the requirements. Find the entry point for that feature and follow the code for that one feature. It should give you a good understanding of how the architecture works.
Integrating with code that is already written can be very difficult. In my experience, some of the best clues I've gotten about already-written code come from the method signatures (the mapping of the function's input to its output). The method's signature can give you a lot of hints about a program, namely where and especially how that particular method fits in the context of the larger program. Usually, a method signature coupled with a descriptive method name can give you enough information to be dangerous, especially in a typed language like Java.
I wouldn't generally suggest running the code line by line and watching what changes (because this usually amounts to tons of work), but for really ugly yet important code it is sometimes necessary (I've definitely done it before, using DDD for C programs). In that case, a quick Google search reveals http://www.debugtools.com/ , a graphical Java debugger, which may do the trick; there also seems to be a version of DDD that works with Java.
This is a recurrent question on Stack Overflow. There are already very good answers all around:
https://stackoverflow.com/questions/3147059/taking-over-a-project
Cleaning up a large, legacy Java project
https://stackoverflow.com/questions/690158/how-do-you-learn-other-peoples-code
Also, this book might help: Working Effectively with Legacy Code
"Patience and fortitude conquer all things." - Ralph Waldo Emerson
I would recommend starting with the debugger as well, so you can go through the program step by step.
Documentation:
If you have documentation, it will be helpful. But it can also be a pitfall: much documentation is out of date and can mislead you.
Bugfix:
You could start with a bugfix or a small new feature. Start with a small scope so the work stays easy; during the bugfix you will come to understand the code more and more.
Baseline the code; I generally use git.
Do a build of the application
Run it.
If the baseline fails to build or the process is too complicated, create a branch and fix it.
Create a branch and modify a string or something that would show some visible change if you modify the code.
If Javadocs are not created via ant or build files, create a new branch to do this.
If there are no JUnit test cases (or if there are but they don't work), create a branch and fix that.
Create a new branch to do the merge.
The following is if you're using Eclipse or similar product
If you're the only developer, create a new branch and set up project settings for code formatting and cleanup. Then execute the code formatting and cleanup. This would allow you to have a more stable baseline for future work. If not, try to coordinate with others.
Install FindBugs, Checkclipse, PMD to do some simple checks on the code base. Looking at WTFs sometimes will give you a better idea on how things are working (or not)
Install EclEmma and see how much of the code is actually tested.

Debugging: to System.out.println() or to not System.out.println()

That is my question. More specifically, I'm trying to get used to Eclipse's debugger and I'd like to know whether printing to the console is still done in some cases, or whether it's considered a bad practice that should be entirely avoided. Also, what can be considered a good approach (or approaches) to debugging overall?
Use System.err.println() instead.
Why?
System.out.println() is often redirected to a file or another output, while System.err is pretty much always printed to the console. That makes it easier to spot while debugging, and it's also the more appropriate stream for diagnostics.
Edit (warning: subjective):
Since you asked about whether System.out.println should be entirely avoided: I don't believe in anything that you must always avoid, be it using goto's, crashing your computer with a BSOD, or whatever. Sometimes you just need a quick-and-dirty way to get small things done fast, and it just plain isn't worth the 1 hour you'll spend on it to try to do things the "right" way, instead of a 5-minute fix, no matter how good the "good" way is. Use your judgment when deciding if something should be used or not, but never set rules for yourself like "I'll never use goto!". :)
Edit 2 (example):
Let's say you're debugging a crashing driver and you suspect that an if statement that shouldn't be executed is being executed. Instead of spending three hours finding out how to use ZwRaiseHardError to display a message box, just call KeBugCheck inside the if and crash the darned system. Sure, you'll reboot, but unless your reboot takes hours, you just saved yourself that much time.
The best choice would be a logging library (of course, this adds an extra dependency to your project). Check out commons-logging, for instance.
The main advantage is that you can write your debug messages in the DEBUG level and when you deploy your code, you'll just configure the logger to skip those messages (instead of searching for all occurrences of System.out.println in your code).
One other great advantage is that loggers usually can be configured to write anywhere (even send email messages or SMS) also without touching your code.
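As a rough illustration (the class and method names are invented; SLF4J over Log4j or Logback is assumed), that separation of levels looks something like this, with the DEBUG lines simply switched off by configuration in production:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void placeOrder(String orderId) {
        log.debug("placeOrder called with orderId={}", orderId); // diagnostic detail, off in production
        try {
            // ... business logic ...
            log.info("Order {} placed", orderId);                 // normal, noteworthy event
        } catch (RuntimeException e) {
            log.error("Failed to place order {}", orderId, e);    // failures always logged, with stack trace
            throw e;
        }
    }
}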
Minor point: if your program actually outputs something useful to the console via System.out, you may want to instead print the debugging info to System.err
You should generally strive to have as much debug logging as possible (ideally using a standard logger like log4j). This both eases debugging while you're actually developing the program and allows for much easier debugging of already-released code in production. The benefit is that your code remains unchanged and you don't need to add debug prints after the fact, yet by default the logging config can turn the logging off (or at least turn down the level) until it's actually needed.
As far as general simple "throw printlns at the wall" debugging, it can sometimes be one of the fastest ways to debug, though it should by no means be the only/main one.
Why can it be useful? Among other reasons, because running a Java program in a debugger may be much slower than outside of it; or because your bug manifests in an environment/situation that can't be easily replicated in your Eclipse debugger.
If the debugging print lines are not going to stay in the code after you've fixed your bug, then do whatever is easiest for you. Lambert's advice of using System.err.println() is a good idea since you can differentiate it from other output that your program may produce. If the debugging print lines are going to stay in your code, then I would advise on using a logging framework like log4j. This way you can dial up or down the level of output based on whether you're trying to debug something or just running in production. Be sure to output things at the right level when using log4j. Don't just log everything at INFO.
I use System.out.println for my debugging when I have a problem, or to tell me that methods have started so I can make sure everything has worked properly, but when I publish the program I always remove it because it slows the program down.

General debugging log practices

Due to recent events, I am trying to figure out how much debug logging I should use in code in general.
What I have been doing is using debug logs pretty sparingly, and just in cases where I wanted some extra information or what have you. This made sense to me, as it seems like you shouldn't log every little thing your code does; that could flood you with so much information that it would be easier to miss something that was actually important (or go crazy from digging through and verifying logs).
On the other hand, I present an example: I just started using logback/slf4j for my Java project, and to test that I had the .xml file set up correctly I added a debug log statement to the end of a method that initializes GUI components. Normally I would never have put a log statement there, because it is pretty obvious if your GUI components don't initialize correctly when you run the program. However, this time I ran the program and, lo and behold, the logs showed that the GUI components were being initialized twice, even though only one set of them was being displayed. A decent-sized bug, but something I likely wouldn't have caught without those debug statements.
So my question: Are there any "best practices" out there when it comes to debug logging? I have seen many best-practice questions about info logs, exceptions, errors, etc., but haven't found much out there in regards to debug logs.
Thanks :)
Some thoughts:
Don't just log what's happening, but take care to log the available parameters/method arguments etc. It's easy to overlook this.
It's easy to disable debug logging via configuration, so it costs little to leave the statements in rather than having to add them back after the fact.
Don't worry about logging overhead until it really becomes an issue.
You can automate some logging (entry/exit of methods) by using an AOP framework (Spring / AspectJ etc.); see the sketch below.
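A minimal sketch of that last point, assuming Spring AOP with AspectJ annotations (the pointcut package com.example.service is purely illustrative):

import java.util.Arrays;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class TraceAspect {
    private static final Logger log = LoggerFactory.getLogger(TraceAspect.class);

    // Wraps every method in the service package with entry/exit debug logging.
    @Around("execution(* com.example.service..*(..))")
    public Object traceMethod(ProceedingJoinPoint pjp) throws Throwable {
        if (log.isDebugEnabled()) {
            log.debug("ENTER {} args={}", pjp.getSignature(), Arrays.toString(pjp.getArgs()));
        }
        try {
            Object result = pjp.proceed();          // run the intercepted method
            log.debug("EXIT  {} -> {}", pjp.getSignature(), result);
            return result;
        } catch (Throwable t) {
            log.debug("THROW {} -> {}", pjp.getSignature(), t.toString());
            throw t;
        }
    }
}

The advantage is that the entry/exit noise lives in one place and is governed entirely by the DEBUG level, so none of the business code needs hand-written trace statements.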
I don't think there are any "best practices" in deciding what / how much to log. It is one of those catch-22 situations. If you need to look at the logs, there is "never" enough information there, but if you don't then "all" logging is just code clutter and an unnecessary runtime overhead. You need to make a separate judgement about where to draw the line for each application.
One point to bear in mind though. If you and your customers are able to hack the code to add temporary debugging statements, then you don't need as much permanent logging code in place. But if you don't have the option of hacking on the (almost) production code to debug things, then you need a certain level of logging code in place, just in case ...

How to identify which lines of code participated in a specific execution of a Java program?

Suppose that I have a Java program within an IDE (Eclipse in this case).
Suppose now that I execute the program and at some point terminate it or it ends naturally.
Is there a convenient way to determine which lines executed at least once and which ones did not (e.g., exception handling or conditions that weren't reached?)
A manual way to collect this information would be to constantly step with the debugging and maintain a set of lines where we have passed at least once. However, is there some tool or profiler that already does that?
Edit: Just for clarification: I need to be able to access this information programmatically and not necessarily from a JUnit test.
EclEmma would be a good start: a code coverage tool would allow a coverage session to record the information you are looking for.
[screenshot of EclEmma line-coverage highlighting in Eclipse omitted; source: eclemma.org]
What you're asking about is called "coverage". There are several tools that measure that, some of which integrate into Eclipse. I've used jcoverage and it works (I believe it has a free trial period, after which you'd have to buy it). I've not used it, but you might also try Coverlipse.
If I understand the question correctly you want more than the standard stacktrace data but you don't want to manually instrument your code with, say, log4j debug statements.
The only thing I can think of is to add some sort of bytecode tracing. Refer to Instrumenting Java bytecode. The article references Cobertura which I haven't used but sounds like what you need...
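Since you mention needing the information programmatically: one way I believe this can be done (sketched here from memory, so treat the exact API calls as an assumption to verify against the JaCoCo docs) is to run the program with the JaCoCo agent and then read the resulting jacoco.exec file with JaCoCo's analysis API.

// Run the program first with something like:
//   java -javaagent:jacocoagent.jar=destfile=jacoco.exec -cp ... MyMainClass
// then analyze the dump (requires org.jacoco:org.jacoco.core on the classpath).
import java.io.File;
import java.io.IOException;
import org.jacoco.core.analysis.Analyzer;
import org.jacoco.core.analysis.CoverageBuilder;
import org.jacoco.core.analysis.IClassCoverage;
import org.jacoco.core.analysis.ICounter;
import org.jacoco.core.tools.ExecFileLoader;

public class WhichLinesRan {
    public static void main(String[] args) throws IOException {
        ExecFileLoader loader = new ExecFileLoader();
        loader.load(new File("jacoco.exec"));                  // execution data written by the agent

        CoverageBuilder coverage = new CoverageBuilder();
        Analyzer analyzer = new Analyzer(loader.getExecutionDataStore(), coverage);
        analyzer.analyzeAll(new File("target/classes"));       // the compiled classes that were run

        for (IClassCoverage cc : coverage.getClasses()) {
            if (cc.getFirstLine() == -1) continue;             // no line debug info for this class
            for (int line = cc.getFirstLine(); line <= cc.getLastLine(); line++) {
                if (cc.getLine(line).getStatus() == ICounter.NOT_COVERED) {
                    System.out.printf("%s:%d was never executed%n", cc.getName(), line);
                }
            }
        }
    }
}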
