From a terminology point of view and in general, what is the difference between tracing and logging?
Thanks!
Logging is not Tracing!
Logging
When you design a big application, you need good and flexible error reporting - perhaps across machines - to collect log data in a centralized way. That is a perfect use case for the Logging Application Block, where you configure a remote trace listener and send the log data to a central log server, which stores the messages in a database, a log file, or whatever. If you use out-of-process communication, you are already limited by network performance, which in the best case is several thousand logs per second.
Tracing
Besides error reporting, you also need to trace your program flow to find out where the performance bottlenecks are and, even more importantly, to have a chance, when an error occurs, of finding out how you got there. In an ideal world, every function would have some tracing enabled, recording its duration, the parameters passed in, and how far you got into the function.
If the context is developing an Observability capability across a distributed architecture, it's common for people to talk about metrics, logs, and tracing. In this context, tracing refers to distributed tracing.
Distributed tracing is a specialised type of telemetry (similar to logging, but different), and is usually produced in a highly automated fashion through instrumentation frameworks. The telemetry is sent from the individual services in the system and aggregated by a central service (a distributed tracer), which is able to piece together telemetry from many individual services into a single trace for each request that entered the system. It can then provide a timeline and graph of how a request moved through the services in the system. The main purposes of distributed traces are to investigate performance degradations, error propagation, and dependency interactions throughout distributed systems.
Whereas tracing in a more traditional monolithic context typically means tracing individual function calls within an application, distributed tracing is typically only concerned with the interactions between services. Telemetry at function-call-level detail is possible but rarely implemented.
For more info about distributed tracing, a good intro can be found at: https://opentelemetry.lightstep.com/tracing/
Trace is the least filtered logging level. Each logging statement has a filtering level:
trace
debug
warning
error
severe
For example, if the logging library is configured to log at level warning, then all warning, error and severe logging statements will print messages to the logging output.
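A minimal SLF4J sketch of that filtering (exact level names vary by library: SLF4J uses trace/debug/info/warn/error, while java.util.logging has WARNING and SEVERE):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LevelDemo {

    private static final Logger log = LoggerFactory.getLogger(LevelDemo.class);

    public static void main(String[] args) {
        // With the backend configured at level warning, only the last two
        // statements reach the output; the others are filtered out.
        log.trace("finest detail, usually disabled in production");
        log.debug("debugging detail");
        log.info("normal progress message");
        log.warn("something looks wrong");
        log.error("something failed");
    }
}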
Logging is used for performance monitoring too; it does not have to be true that only tracing can find out where the performance bottlenecks are. Both can work in distributed mode.
Related
I see many distributed tracing solutions for microservices, for instance Spring Cloud Sleuth, Zipkin's Brave, etc. However, I have a monolithic service with many clearly separated modules, so those solutions do not work for me.
IMHO I need a tracing system to tell me which module (analogous to a microservice in a non-monolithic system) spends how much time. However, I could not find any. So I wonder: is my need not a real need but a pseudo-need? Or, if it is a real need, how can I find a solution?
Thanks!
It is probably uncommon to use Sleuth for this kind of use case (Sleuth is mainly used to achieve distributed tracing and to monitor critical latencies inside a system).
But...Sleuth integrates with logging frameworks like Logback and SLF4J to add unique identifiers that help track and diagnose issues using logs.
So when a request enters your system, Sleuth will assign it a traceId, and all the various steps in that request, even across application and thread boundaries, will carry the same traceId.
If you want to monitor different complex actions taken inside some module, you can wrap these actions with a dedicated Span.
Example :
// Create and start a new span for the module-level work
Span newSpan = tracer.nextSpan().name("module1Span").start();
try (SpanInScope ws = tracer.withSpanInScope(newSpan)) { // the span is already started above
// Some logic
} finally {
newSpan.finish(); // always finish the span so it gets reported
}
I want to build a more advanced logging mechanism for my Java web applications, similar to App Engine logs.
My needs are:
Stream logs to a database (for ex. sql, bigquery or something else)
Automatically log important data (like app context, request url, request id, browser user agent, user id, etc.)
For point 1, I can use a "buffering" implementation, where logs are put into different lists and periodically a cron (thread) gathers all the logs in memory and writes them to the database (which can also be on another server).
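To illustrate, a minimal sketch of that buffering idea (BufferedDbLogger and writeBatch are placeholder names, not a library API):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class BufferedDbLogger {

    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // periodically flush the in-memory buffer to the database
        scheduler.scheduleAtFixedRate(this::flush, 5, 5, TimeUnit.SECONDS);
    }

    public void log(String line) {
        buffer.offer(line); // cheap, non-blocking call on the request thread
    }

    private void flush() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        if (!batch.isEmpty()) {
            writeBatch(batch); // e.g. one JDBC batch insert, possibly to a remote server
        }
    }

    private void writeBatch(List<String> batch) {
        // placeholder for the actual database write
    }
}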
For point 2, the only way I found of doing this is to inject the needed objects into my classes (subsystems), like ServletContext, HttpServletRequest, the current user, etc., all modeled into a custom class (let's say AppLogContext), which can then be used by the logging mechanism.
The problem here is that I don't know if this is good practice. For example, it means that many classes will have to hold this object, which has access to the servlet context and HTTP request objects, and I'm thinking this may create architectural problems (when building modules, layers, etc.) or even security issues.
App Engine automatically logs this kind of information (and much more, like latencies, CPU usage etc., but that is more complicated), and it can be found in the project's Console logs (it can also duplicate logs to BigQuery tables). I need something similar for Jetty or other Java web app servers.
So, is there another way of doing this, other patterns, different approaches? (couldn't find 3rd party libraries for any of these points)
Thank you.
You don't really need to reinvent the wheel.
There is a common practice that you can follow:
Just log to a file using a standard logger
(If you need to see logs in request context) Logback, Log4j and SLF4J support Mapped Diagnostic Context (MDC); that's what you can use to put the current request into every log line. Just initialize the context in a filter - put a request id there, for example, or generate a random UUID - and you can aggregate log entries by this id later (see the sketch after this list)
Then use ELK:
Logstash to gather and ship the logs into
Elasticsearch, which stores the logs
Kibana to analyze them
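For step 2, a minimal sketch of such a filter (the "requestId" key and the filter name are just illustrative; with Logback you would then reference it as %X{requestId} in the log pattern):

import java.io.IOException;
import java.util.UUID;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.slf4j.MDC;

public class RequestIdFilter implements Filter {

    public void init(FilterConfig config) throws ServletException { }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        // every log line written while handling this request can now carry the id
        MDC.put("requestId", UUID.randomUUID().toString());
        try {
            chain.doFilter(req, res);
        } finally {
            MDC.remove("requestId"); // don't leak the id to the next pooled request
        }
    }

    public void destroy() { }
}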
Requirement: Log events like page views and form submits. Each page has a ~1 second SLA. The application can have hundreds of concurrent users at a time.
Log events are stored into the Database.
Solution: My initial thought was to use an async logging approach where control returns to the application and the logging happens in a different thread (via Spring's ThreadPoolTaskExecutor).
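Roughly, a sketch of what I have in mind (pool sizes and names are illustrative):

import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

public class AsyncEventLogger {

    private final ThreadPoolTaskExecutor executor;

    public AsyncEventLogger() {
        executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(4);
        executor.setQueueCapacity(10000); // bound the backlog of pending writes
        executor.initialize();
    }

    public void logEvent(String event) {
        // hand off to a worker thread so the request thread returns within the SLA
        executor.execute(() -> saveToDatabase(event));
    }

    private void saveToDatabase(String event) {
        // placeholder for the actual database insert
    }
}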
However, someone suggested that using JMS would be a more robust approach. Is the added work required by this approach (setting up queues, writing to them, reading from them) worthwhile?
What are some of the best practices / things to look out for (in a production environment) when implementing something like this?
Both approaches are valid, but one is vulnerable if your app unexpectedly stops: in your first scenario, events not yet written to the database will be lost. Using a persistent JMS queue means those events will be read from the queue and persisted to the database upon restart.
Of course, if your DB writes are so much slower than placing a message of similar size on to a JMS queue, you may be solving the wrong problem?
Using JMS for logging is a complete mismatch. JMS is a Java abstraction for a middleware tool like MQ Series. That is complete overkill, and it will put you through setup and configuration hell. JMS also places messages in a transactional context, so you quickly get the idea that JMS might not be much better than the database writes #rjsang suggested.
This is not to say that JMS is not a nice technology. It is a good technology where it is applied properly.
For asynchronous logging, you are better off depending on a logging API that directly supports it, like Log4j2. In your case, you might look at configuring an AsyncAppender with a JDBCAppender. Log4j2 has many more appenders as additional options, including one for JMS. However, by at least using a logging abstraction, you make it all configurable and make it possible to change your mind later.
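For example, a minimal log4j2.xml sketch along those lines (the table, the columns and the JNDI DataSource name are assumptions you would adapt):

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <JDBC name="database" tableName="EVENT_LOG">
      <DataSource jndiName="java:comp/env/jdbc/LoggingDataSource"/>
      <Column name="EVENT_DATE" isEventTimestamp="true"/>
      <Column name="LEVEL" pattern="%level"/>
      <Column name="LOGGER" pattern="%logger"/>
      <Column name="MESSAGE" pattern="%message"/>
    </JDBC>
    <!-- wrap the JDBC appender so callers never block on the database -->
    <Async name="asyncDatabase">
      <AppenderRef ref="database"/>
    </Async>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="asyncDatabase"/>
    </Root>
  </Loggers>
</Configuration>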
In the future we might have something similar to Asynchronous CDI Events, which should work similar to JMS, but would be much more lightweight. Maybe you can get something similar to work by combining CDI Events with EJB Asynchronous Methods. As long as you don't use EJB's with a remote interface, it should also be pretty lightweight.
You could give fully async and external tooling a try if you want to. If you have to stick to your SLA at any price and resilience is important to you, you could try either logstash or processing your logs offline. By doing so, you decouple your application from the database and no longer depend on database performance. If the database is slow and you're using async loggers, the queues might run full.
With logstash using GELF, the whole log processing is handled within a different (or even remote) JVM. Offline processing (e.g. writing CSV logs) allows you to load the log data into the database afterwards.
I'm currently trying to track down a performance issue, and I've found that a large amount of time is being spent in the code where you request a bean from an ApplicationContext:
ApplicationContext.getBean(String beanName);
Do you know if it's possible to turn on any debug or log information within Spring so that I can see all the objects that are instantiated from this call as well as the times to create them?
I've been trying to profile with YourKit, but the operation takes only 1.4 seconds in total and only on the first call, so YourKit seems to struggle with short-lived, one-off calls like this.
That's why I was heading down the logging route.
Spring uses log4j, so you can set your log level to debug and see what you get. It'll be a lot of output, so I'd advise not logging it to the console - write it to a file.
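For example, a minimal log4j.properties sketch along those lines (file name and pattern are illustrative):

# keep your own code at INFO, but let Spring's internals log at DEBUG
log4j.rootLogger=INFO, file
log4j.logger.org.springframework=DEBUG

# write to a file instead of the console - there will be a lot of output
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=spring-debug.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d %-5p %c - %m%n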
How did you draw that conclusion? Are you profiling your app with VisualVM, with all the plugins installed?
I'd bet that the app context is not the issue. If the time spent in the app context is large, it should just be on startup when Spring is reading and parsing the configuration, instantiating and wiring beans. Once it's done the cost is amortized over the life of your app. For most web apps this is a startup cost, not a reflection on the experience users have when using your site.
It's far more likely to be in your code. Be sure that you aren't being misled by not taking a comprehensive view.
I'd like to trace a Java application at runtime to log, and later analyze, its entire behaviour.
Is there a way to hook into a Java application to get runtime information like method calls (with parameters and return values) and the state of an object (i.e. its attributes and their values)?
My goal is to get a complete understanding of the application's behaviour and how it deals with the data.
If you need highly customized logging and runtime processing, one alternative to profilers is to use aspects and load-time weaving.
We use AspectJ in this way to capture and log the authentication information for users who call a number of low-level methods for debugging purposes and to undo mistaken changes.
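As an illustration, a minimal sketch of such an aspect (the com.example.app package is a placeholder; load-time weaving additionally needs the aspectjweaver agent and a META-INF/aop.xml that registers the aspect):

import java.util.Arrays;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;

@Aspect
public class TraceAspect {

    // log every call into the application's packages, with its arguments
    @Before("execution(* com.example.app..*(..))")
    public void logEntry(JoinPoint jp) {
        System.out.printf("ENTER %s args=%s%n",
                jp.getSignature().toShortString(), Arrays.toString(jp.getArgs()));
    }

    // log normal returns together with the returned value
    @AfterReturning(pointcut = "execution(* com.example.app..*(..))", returning = "result")
    public void logExit(JoinPoint jp, Object result) {
        System.out.printf("EXIT  %s returned=%s%n",
                jp.getSignature().toShortString(), result);
    }
}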
Use a profiler, for example JProfiler or one from this overview of open-source Java profilers. Whenever I had to find deadlocks, for example, these tools were priceless...
NetBeans has a built-in profiler that works well; to use it, see http://profiler.netbeans.org/
Maybe have a look at Glassbox, a troubleshooting agent for Java applications that automatically diagnoses common problems. From Glassbox - Automated monitoring and troubleshooting using AOP:
Glassbox deploys as a war file to your appserver and then uses AspectJ load time weaving to monitor application components and other artifacts, in order to identify problems like excess or failed remote calls, slow queries, too many database queries, thread contention, even what request parameters caused failures. All this without changing the code or the build process.
(...)
Glassbox monitors applications non-invasively by using aspects to track component interactions. We also monitor built-in JMX data, notably on a Java 5 VM we sample thread data (every 100 ms by default). As a request is processed, we summarize noteworthy events such as where time was spent and what parameters were involved in making things slow or fail. We also detect higher-level operations (such as Struts actions or Spring controllers) that we use to report on. Our AJAX Web client then provides summaries of status by operation on the machines being monitored and we generate a more detailed analysis on request. Glassbox allows monitoring clusters of servers: the Web app uses JMX Remote or direct RMI to access data from remote servers. We also provide JMX remote access to the lower-level summary statistics.
It's a nice application, give it a try.