Is there any possibility to store logs from my different applications, possibly written in different languages, in a single file, ordered by timestamp?
You could retrieve and aggregate every application's logs with something like Logstash.
Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash.”
If you can ensure that every one of your applications outputs logs with the same pattern, I guess Logstash (plus Elasticsearch or something of its kind behind it) would answer your needs exactly.
Related
Background
We have a web server written in Java that communicates with thousands of mobile apps via HTTPS REST APIs.
For investigation purposes we have to log all API calls - currently this is implemented as a programming @Aspect, and for each API call we save an api_call_log object into a MySQL table with the following attributes:
tenant_id
username
device_uuid
api_method
api_version
api_start_time
api_processing_duration
request_parameters
full_request (JSON)
full_response (JSON)
response_code
Problem
As you can imagine, after reaching a certain throughput this solution doesn't scale well, and querying this table is very slow even with the right MySQL indexes.
Approach
That's why we want to use the Elastic Stack to re-implement this solution, however I am a bit stuck at the moment.
Question
I couldn't find any Logstash plugin yet that would suit my needs - should I output this api_call_log object into a log file instead and use Logstash to parse, filter and transform that file?
That is exactly what I would do in this case. Write your log to a file using a framework like Logback and rotate it. If you want easy parsing, use JSON as the logging format (also available in Logback). Then use Filebeat to ingest the logfile as it gets written. If you need to transform/parse the messages, do it in Elasticsearch ingest nodes using pipelines.
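For illustration, a minimal sketch of the JSON-per-line idea in plain Java; the ApiCallLog fields and the "api-calls" logger name are assumptions, while Jackson and SLF4J/Logback do the actual work:

```java
import com.fasterxml.jackson.databind.ObjectMapper;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: serialize each api_call_log entry with Jackson and write it as a
// single JSON line through SLF4J/Logback, so Filebeat can ship it line by line
// and an Elasticsearch ingest pipeline can parse it with the json processor.
public class ApiCallLogger {

    private static final Logger API_LOG = LoggerFactory.getLogger("api-calls"); // assumed logger name
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Minimal stand-in for the real api_call_log object described in the question.
    public static class ApiCallLog {
        public String tenantId;
        public String username;
        public String apiMethod;
        public long apiStartTime;
        public long apiProcessingDuration;
        public int responseCode;
    }

    public void log(ApiCallLog entry) {
        try {
            API_LOG.info(MAPPER.writeValueAsString(entry)); // one JSON document per line
        } catch (Exception e) {
            API_LOG.warn("Could not serialize api_call_log entry", e);
        }
    }
}
```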
Consider tagging/enriching the logfiles read by Filebeat with machine- or environment-specific information so you can filter on it in your visualisations, reports, etc.
The filebeat-to-elastic approach is the simplest one. Try this first. If you can't get your parsing done in elasticsearch pipelines, put a logstash in between.
Using Filebeat you'll get a lot of things for free, like backpressure handling and daily indices, which come in very handy in the logging scenario we are discussing here.
When you need a visualization or search UI, have a look at Kibana or Grafana.
And if you have more questions, raise a new question here.
Have Fun!
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html
I want to build a more advanced logging mechanism for my java web applications, similar to App engine logs.
My needs are:
Stream logs to a database (for ex. sql, bigquery or something else)
Automatically log important data (like app context, request url, request id, browser user agent, user id, etc.)
For point 1, I can use a "buffering" implementation, where logs are put into different lists, and periodically a cron job (thread) gathers all the logs in memory and writes them to the database (which can also be on another server)
For point 2, the only way I found of doing this is to inject the needed objects into my classes (subsystems), like ServletContext, HttpServletRequest, the current user, etc., all modeled into a custom class (let's say AppLogContext), which can then be used by the logging mechanism.
The problem here is that I don't know if this is good practice. For example, it means that many classes will have to hold this object, which has access to the servlet context and HTTP request objects, and I'm thinking this may create architectural problems (when building modules, layers, etc.) or even security issues.
App Engine automatically logs this kind of information (and much more, like latencies, CPU usage, etc., but that is more complicated), and it can be found in the project's Console logs (it can also duplicate logs to BigQuery tables). I need something similar for Jetty or other Java web app servers.
So, is there another way of doing this, other patterns, different approaches? (couldn't find 3rd party libraries for any of these points)
Thank you.
You don't really need to reinvent the wheel.
There is a common practice that you can follow:
Just log using standard logger to a file
(If you need to see logs in request context) Logback, Log4j and SLF4J support the Mapped Diagnostic Context (MDC); that's what you can use to put the current request into every log line (just initialize the context in a filter and put a request id in it, for example, or generate a random UUID). You can aggregate log entries by this id later - see the sketch after this list.
Then use ELK:
Logstash to gather the logs
Elasticsearch to store the logs
Kibana to analyze them
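A minimal sketch of the MDC idea; the filter class name and the requestId key are arbitrary, while MDC itself is the standard SLF4J API (use %X{requestId} in the log pattern to print the value):

```java
import java.io.IOException;
import java.util.UUID;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.slf4j.MDC;

// Sketch: put a per-request id into the MDC so every log line written while
// handling this request carries it and can be aggregated later.
public class RequestIdFilter implements Filter {

    @Override
    public void init(FilterConfig filterConfig) throws ServletException {
        // nothing to initialize
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        MDC.put("requestId", UUID.randomUUID().toString()); // arbitrary key name
        try {
            chain.doFilter(request, response);
        } finally {
            MDC.remove("requestId"); // don't leak the id to the next request on this thread
        }
    }

    @Override
    public void destroy() {
        // nothing to clean up
    }
}
```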
I'm processing info in Google Cloud Dataflow. We tried to use JPA to insert or update the data into our MySQL database, but these queries shut down our server. So we've decided to change our approach...
I want to generate a MySQL .sql file where we can write the new info processed through Dataflow. I want to know if there is an existing way to do so, or do I have to do this myself?
Let me explain a little more: we have an XML input, we process the info into Java classes, and we have a JSON dump of the DB so we can see what we have online without making so many calls. With this in mind, we compare the new info with the info we already have and decide whether it's new or just an update.
How can I do this via Java/Maven? I need code to generate this file...
Yes, Cloud Dataflow processes data in parallel on many machines. As such, it is not very surprising that other services may not be able to keep up or that some quotas are hit.
Depending on your specific use case, you may be able to slow/throttle Dataflow down without changing your approach. One might limit the number of workers, limit parallelism, use IntraBundleParallelization API, etc. This might be a better path, overall. We are also working on more explicit ways to throttle Dataflow.
Now, it is not really feasible for any system to automatically generate a .sql file for your database. However, it should be pretty straightforward to use primitives like ParDo and TextIO.Write to generate such a file via a Dataflow pipeline.
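To illustrate, a rough sketch with the Apache Beam SDK (the successor of the Dataflow SDK); the Product class, the sample input and the bucket path are placeholders for your own XML-parsing/diffing logic and output location:

```java
import java.io.Serializable;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

// Sketch: turn each new/updated record into an SQL statement with ParDo and
// write the statements out as a .sql file with TextIO.
public class SqlFileGenerator {

    // Stand-in for whatever you build from the XML input.
    public static class Product implements Serializable {
        public final long id;
        public final String name;
        public Product(long id, String name) { this.id = id; this.name = name; }
    }

    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // Placeholder input: in the real pipeline this would be the records that
        // came out of comparing the XML input against the JSON dump of the DB.
        PCollection<Product> changed = p.apply(
                Create.of(new Product(1L, "widget"), new Product(2L, "gadget"))
                      .withCoder(SerializableCoder.of(Product.class)));

        changed
            .apply("ToSqlStatement", ParDo.of(new DoFn<Product, String>() {
                @ProcessElement
                public void processElement(@Element Product product, OutputReceiver<String> out) {
                    // Escape/parameterize values properly in real code; String.format is for illustration only.
                    out.output(String.format(
                        "INSERT INTO products (id, name) VALUES (%d, '%s') ON DUPLICATE KEY UPDATE name = VALUES(name);",
                        product.id, product.name));
                }
            }))
            .apply("WriteSqlFile", TextIO.write().to("gs://my-bucket/output/update").withSuffix(".sql"));

        p.run().waitUntilFinish();
    }
}
```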
What is the best way to keep a log of user changes in my web application (Java/Tomcat/Struts/MySQL)? I give out accounts, and each account has multiple users. I want the account administrators to be able to see who did what at any given time, and I'd like to be able to access ALL of it. First, I need a way to know which fields have been changed; then I need to log the changes for each account in a place where the administrators can see them. Obviously, I don't want to slow the app down. I read an answer on this site suggesting keeping a DB log - querying the database for changes after each query is sent - but I wasn't sure how to do that.
This depends on the nature of your web application. Let's assume your web application is an e-commerce system that allows the user to add a new product or update an existing one. When a user performs a specific action like adding a new product, the basic goal is to capture his user name, the action and the timestamp. The same goes for updating a product: you might want to keep track of which values he updated, what the old values were and when he changed them.
To achieve this, you first need to:
Create an audit table
Obviously you want to keep track of who last modified a record, the timestamp, who created it, and so on.
Create a logging mechanism that fires whenever changes/actions are performed.
There are a few ways to do this: you can either do it in the application or leave everything to database triggers. I would suggest using triggers to detect any create/update/delete event in the database and have the trigger capture the details and write them to the audit table. I think this is the cleanest and lowest-maintenance approach. However, if you want to log from the application, you have to make code changes and create new methods in your action classes that capture the details and write them to the audit table (see the sketch below).
More information on MYSQL Trigger here
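If you do go the application route instead, a minimal sketch of writing an audit row from an action class with plain JDBC; the table and column names here are assumptions:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

import javax.sql.DataSource;

// Sketch: call record(...) from your action classes after a create/update/delete succeeds.
public class AuditLogger {

    private final DataSource dataSource;

    public AuditLogger(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void record(String username, String action, String entity,
                       String oldValue, String newValue) {
        // Assumed schema: audit_log(username, action, entity, old_value, new_value, changed_at)
        String sql = "INSERT INTO audit_log (username, action, entity, old_value, new_value, changed_at) "
                   + "VALUES (?, ?, ?, ?, ?, ?)";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, username);
            ps.setString(2, action);   // e.g. "UPDATE_PRODUCT"
            ps.setString(3, entity);   // e.g. "product:42"
            ps.setString(4, oldValue);
            ps.setString(5, newValue);
            ps.setTimestamp(6, new Timestamp(System.currentTimeMillis()));
            ps.executeUpdate();
        } catch (SQLException e) {
            // Don't let a failed audit write break the business operation; log it instead.
            org.slf4j.LoggerFactory.getLogger(AuditLogger.class).error("Audit write failed", e);
        }
    }
}
```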
I was looking for a similar method to log the transactions and other things in my web app. While browsing Google, I found this link:
https://www.owasp.org/index.php/Logging_Cheat_Sheet which describes two possible ways to log: either in a database or in log files on the filesystem...
When using the file system, it is preferable to use a separate partition than those used by the operating system, other application files and user generated content
For file-based logs, apply strict permissions concerning which users can access the directories, and the permissions of files within the directories
In web applications, the logs should not be exposed in web-accessible locations, and if done so, should have restricted access and be configured with a plain text MIME type (not HTML)
When using a database, it is preferable to utilize a separate database account that is only used for writing log data and which has very restrictive database, table, function and command permissions
Use standard formats over secure protocols to record and send event data, or log files, to other systems e.g. Common Log File System (CLFS), Common Event Format (CEF) over syslog, possibly Common Event Expression (CEE) in future; standard formats facilitate integration with centralised logging services
They've beautifully explained the possible ways we can log, what should be logged, and what should be avoided too.
Hope it's useful to you.
I am planning to migrate a previously created Java web application to Azure. The application previously used log4j for application level logs that where saved in a locally created file. The problem is that with the Azure Role having multiple instances I must collect and aggregate these logs and also make sure that they are stored in a persistent storage instead of the virtual machines hard drive.
Logging is a critical component of the application but it must not slow down the actual work. I have considered multiple options and I am curious about the best practice, the best solution considering security, log consistency and performance in both storage-time and by later processing. Here is a list of the options:
Using log4j with a custom Appender to store information in Azure SQL.
Using log4j with a custom Appender to store information in Azure Tables storage.
Writing an additional tool that transfers data from local hard drive to either of the above persistent storages.
Is there any other method or are there any complete solutions for this problem for Java?
Which of the above would be best considering the above mentioned criteria?
There's no out-of-the-box solution right now, but... a custom appender for Table Storage makes sense, as you can then query your logs in a similar fashion to diagnostics (perf counters, etc.).
The only consideration is if you're writing log statements in massive quantities (like hundreds of times per second). At that rate, you'll start to notice transaction costs showing up on the monthly bill. At a penny per 10,000 transactions and 100 per second (roughly 260 million transactions in a 30-day month), you're looking at about $250 per instance per month. If you have multiple instances, the cost goes up from there. With SQL Azure you'd have no transaction cost, but you'd have a higher storage cost.
If you want to go with a storage transfer approach, you can set up Windows Azure diagnostics to watch a directory and upload files periodically to blob storage. The only snag is that Java doesn't have direct support for configuring diagnostics. If you're building your project from Eclipse, you only have a script file that launches everything, so you'd need to write a small .net app, or use something like AzureRunMe. If you're building a Visual Studio project to launch your Java app, then you have the ability to set up diagnostics without a separate app.
There's a blog post from Persistent Systems that just got published, regarding Java and diagnostics setup. I'll update this answer with a link once it's live. Also, have a look at Cloud Ninja for Java, which implements Tomcat logging (and related parsing) by using an external .net exe that sets up diagnostics, as described in the upcoming post.
Please visit my blog and download the document. In this document, look for the chapter "Tomcat Solution Diagnostics" for the error logging solution. This document was written a long time ago, but you can certainly use this method to capture any kind of Java-based logging (log4j included) in Tomcat and view it directly.
Chapter 6: Tomcat Solution Diagnostics
Error Logging
Viewing Log Files
http://blogs.msdn.com/b/avkashchauhan/archive/2010/10/29/windows-azure-tomcat-solution-accelerator-full-solution-document.aspx
In any scenario where there is a custom application, i.e. java.exe, php.exe, python, etc., I suggest creating the log file directly in the "Local Storage" folder and then initializing Azure Diagnostics in the Worker Role (WorkerRole.cs) to export these custom log files directly from the Azure VM to your Azure Blob storage.
How to create custom logs on local storage is described here.
Using Azure Diagnostics and sending logs to Azure Blob storage would be cheaper and more robust than any other method you have described.
Finally I decided to write a Log4j Appender. I didn't need to gather diagnostics information; my main goal was only to gather the log files in an easily exchangeable way. My first fear was that it would slow down the application, but by writing only to memory and only periodically writing out the log data to Azure Tables, it works perfectly without making too many API calls.
Here are the main steps for my implementation:
First I created an entity class to be stored in Azure Tables, called LogEntity that extends com.microsoft.windowsazure.services.table.client.TableServiceEntity.
Next I wrote the appender that extends org.apache.log4j.AppenderSkeleton containing a java.util.List<LogEntity>.
In the overridden method protected void append(LoggingEvent event) I only add to this collection, and a separate thread periodically empties the list and writes the data to Azure Tables.
Finally I added the newly created Appender to my log4j configuration file.
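For reference, a sketch of the buffering appender described above; the flush interval is arbitrary, and the actual Azure Tables write is left as a comment so no specific Azure SDK calls are assumed here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;

// Sketch: buffer events in memory and flush them periodically to keep API calls low.
public class AzureTableAppender extends AppenderSkeleton {

    private final List<LoggingEvent> buffer = new ArrayList<LoggingEvent>();
    private final ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();

    public AzureTableAppender() {
        // Flush the in-memory buffer every 30 seconds (interval chosen arbitrarily).
        flusher.scheduleAtFixedRate(new Runnable() {
            public void run() { flush(); }
        }, 30, 30, TimeUnit.SECONDS);
    }

    @Override
    protected void append(LoggingEvent event) {
        synchronized (buffer) {
            buffer.add(event); // only touch memory on the logging thread
        }
    }

    private void flush() {
        List<LoggingEvent> batch;
        synchronized (buffer) {
            if (buffer.isEmpty()) return;
            batch = new ArrayList<LoggingEvent>(buffer);
            buffer.clear();
        }
        // Convert each event in `batch` to a LogEntity and insert it into Azure Table
        // storage here (omitted to keep the sketch SDK-neutral).
    }

    @Override
    public void close() {
        flush();            // push out whatever is left
        flusher.shutdown();
    }

    @Override
    public boolean requiresLayout() {
        return false;
    }
}
```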
Another alternative:
Can we not continue using log4j the standard way (e.g. a DailyRollingFileAppender), only with the file created on a UNC path on a VM (IaaS)?
This VM will only need a bit of disk space and need not have any great processing power, so one could share an available VM or create a VM with the most minimal configuration, preferably in the same region and cloud service.
The accumulated log files can be accessed via RDP/ FTP etc.
That way one will not incur the transaction cost, nor the cost of developing a special Log4j appender ... it could turn out to be a cheaper alternative.
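For completeness, a sketch of that setup in plain log4j 1.x, shown programmatically only to keep the example in Java (the usual log4j.properties configuration achieves the same); the UNC path and host name are placeholders:

```java
import java.io.IOException;

import org.apache.log4j.DailyRollingFileAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

// Sketch: a standard DailyRollingFileAppender writing to a shared folder on the log VM.
public class UncLoggingSetup {

    public static void configure() throws IOException {
        DailyRollingFileAppender appender = new DailyRollingFileAppender(
                new PatternLayout("%d{ISO8601} [%t] %-5p %c - %m%n"),
                "\\\\log-vm\\logs\\myapp.log",   // placeholder UNC path to the shared folder
                "'.'yyyy-MM-dd");                // roll the file daily
        Logger.getRootLogger().addAppender(appender);
    }
}
```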
thanks
Jeevan
PS: I am referring more to one's application logging, not to the app-server logs (catalina/manager .log or .out files, or WebLogic's).