Custom metrics using the CloudWatch agent with the StatsD protocol - java

I have a web application running on an EC2 instance. It exposes several API endpoints, and I want to count the number of times each endpoint is called. The web application is written in Java.
Can anyone suggest some articles with a proper Java implementation for integrating StatsD with CloudWatch?

Refer to the docs page https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-custom-metrics-statsd.html, which covers publishing the metrics. For the client side, see https://github.com/etsy/statsd/wiki#client-implementations.
I usually follow a simpler approach without StatsD: log the events to a file and sync the file to CloudWatch. In CloudWatch you can configure metric filters and, based on those filters, increment custom metrics.

Install the CloudWatch agent on your EC2 instance.
Locate and open the CloudWatch agent config file.
Add a statsd section to the config file (JSON format):
{
    ...,
    "statsd": {
        "metrics_aggregation_interval": 60,
        "metrics_collection_interval": 10,
        "service_address": ":8125"
    }
}
The AWS CloudWatch agent is smart enough to understand custom tags, which helps you correctly split statistics gathered from different API methods ("correctly" here means splitting API method stats by dimension, not by metric name). So you need a Java client library that supports tags, for example the DataDog client.
Configure the client instance as explained in the package documentation and that's it. Now you can do things like this at the beginning of each of your REST API operations:
statsd.count("InvocationCount", 1, "host:YOUR-EC2-INSTANCE-NAME", "operation:YOUR-REST-OPERATION-NAME");
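For example, a minimal sketch using DataDog's java-dogstatsd-client (the "myapp" prefix and the Metrics helper class are illustrative; port 8125 matches the service_address configured above):

    import com.timgroup.statsd.NonBlockingStatsDClientBuilder;
    import com.timgroup.statsd.StatsDClient;

    public class Metrics {
        // One shared client per JVM; it sends UDP packets to the local
        // CloudWatch agent listening on :8125.
        private static final StatsDClient STATSD = new NonBlockingStatsDClientBuilder()
                .prefix("myapp")
                .hostname("localhost")
                .port(8125)
                .build();

        public static void countInvocation(String operation) {
            // count(aspect, delta, tags...) sends a counter with DataDog-style
            // tags, which the CloudWatch agent maps to metric dimensions.
            STATSD.count("InvocationCount", 1, "host:YOUR-EC2-INSTANCE-NAME", "operation:" + operation);
        }
    }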
CloudWatch will handle everything else automatically. You will be able to see your metric data flowing into the AWS CloudWatch console under the "CWAgent" namespace. Please be aware that the average delay between a statsd client call and the data becoming visible in the CloudWatch console is about 10-15 minutes.
Manually writing statsd calls in each REST API operation may not be a good idea. A decorator (in a Java web app, a servlet filter or interceptor) will let you instrument them automatically with just a few lines of code, as sketched below.
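For instance, a servlet filter could count every request once, tagged by its path. A sketch that reuses the hypothetical Metrics helper above (on servlet API versions before 4.0 you would also need empty init and destroy methods):

    import javax.servlet.*;
    import javax.servlet.http.HttpServletRequest;
    import java.io.IOException;

    public class InvocationCountFilter implements Filter {
        @Override
        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
                throws IOException, ServletException {
            // Tag the counter with the request path, then continue the chain.
            String path = ((HttpServletRequest) request).getRequestURI();
            Metrics.countInvocation(path);
            chain.doFilter(request, response);
        }
    }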

Related

Publishing Spring Batch metrics using Micrometer

I have an app that contains two dozen Spring Batch cron jobs. There is no REST controller, as it is an analytics app: it runs daily, reads data from a DB, processes it, and stores aggregated data in another DB. I want Spring's built-in metrics on the jobs using Micrometer, and to push them to Prometheus. As my app is not a webserver app, will Micrometer still publish results on HOST:8080? Will Actuator automatically start a new server on HOST:8080, or do we need an application server running on 8080?
My understanding is that the Actuator and the application server can run on different ports, as these are different processes? Whether or not an application server is there, the Actuator should be able to either use the same port as the application server, or a different one?
So if my application is not a webserver-based app, can I still access metrics at localhost:8080/actuator/ and publish them to Prometheus?
Prometheus is a pull-based system, meaning you give it a URL from your running application and it will go pull metrics from it. If your application is an ephemeral batch application, it does not make sense to make it a webapp for the sole sake of exposing a URL for a short period of time. That's exactly why the Prometheus folks created the Pushgateway, see When to use the Pushgateway.
Now with this in mind, in order for your batch applications to send metrics to Prometheus, you need:
A Prometheus server
A Pushgateway server
An optional metrics dashboard (Grafana or similar; Prometheus also provides a built-in UI)
Make your batch applications push metrics to the gateway
A complete example with this setup can be found in Batch metrics with Micrometer. That example is actually similar to your use case: it shows two jobs scheduled to run every few seconds which store metrics in Micrometer's main registry, and a background task that regularly pushes metrics from Micrometer's registry to Prometheus's gateway.
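The push itself can be quite small. A minimal sketch using Micrometer's Prometheus registry together with the Prometheus Java client's PushGateway (the gateway address, job name, and counter are placeholders):

    import io.micrometer.prometheus.PrometheusConfig;
    import io.micrometer.prometheus.PrometheusMeterRegistry;
    import io.prometheus.client.exporter.PushGateway;

    public class BatchMetricsPusher {
        public static void main(String[] args) throws Exception {
            PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

            // ... run the batch job, recording metrics into the registry ...
            registry.counter("job.items.processed").increment(42);

            // Push everything collected so far to the Pushgateway; Prometheus
            // then scrapes the gateway instead of the short-lived batch app.
            PushGateway gateway = new PushGateway("localhost:9091");
            gateway.pushAdd(registry.getPrometheusRegistry(), "analytics-batch-job");
        }
    }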
Another option is to use the RSocket protocol, which is provided for free if you use Spring Cloud Dataflow.
For Spring Boot, there are no Actuator endpoints for Spring Batch; please refer to Actuator endpoint for Spring Batch for more details about the reasons behind this decision.
@Mahmoud I think there are valid use cases for optionally exposing the health endpoints. The first question to consider is, when we say a batch operation runs for a short time, how short is that time? A few minutes? Then I agree there's no need; but what about jobs that run for a few hours? For some jobs it's important that we get metrics, especially when they are bound by a business SLA and the operator needs to know whether the job is processing at the required operations per second, has the right connection pool size, etc.
There are also a variety of implementation details of the running platform: we can use Spring Batch without SCDF, not be in control of the Prometheus gateway and therefore unable to use push, run in a cloud where Istio will pull the metrics automatically, etc.
As for the OP's question: in general one can run a Spring Batch job in a web instance, though as far as I have used Spring Batch with a web instance, the application does shut down after job completion.

API hit count per client across multiple servers

Using the Spring Boot Actuator API, I need to count the number of API hits per clientID. How can I achieve this? Another challenge is that my application is deployed on both AWS and Azure, and at any time I want to know the total API hit count across all environments.
There are multiple ways to do it. You can use a tool like New Relic to capture that.
It uses a Java agent that hooks into each API call.
Another option is to use your logging system: push the logs, then accumulate and display them using Splunk or Kibana, where you can create a dashboard based on the logs to check API hits.
You can also implement your own approach, such as an API interceptor/ControllerAdvice that records each request hit in a separate async thread. But then you have to implement real-time aggregation of these hits yourself.
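A minimal sketch of the interceptor approach with Spring MVC, counting hits per client in memory (the "X-Client-Id" header is an assumption; aggregating across the AWS and Azure deployments still needs a shared store such as a database or a metrics backend):

    import org.springframework.web.servlet.HandlerInterceptor;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    public class HitCountInterceptor implements HandlerInterceptor {
        private final Map<String, LongAdder> hits = new ConcurrentHashMap<>();

        @Override
        public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
            // Identify the caller; LongAdder keeps concurrent increments cheap.
            String clientId = request.getHeader("X-Client-Id");
            if (clientId != null) {
                hits.computeIfAbsent(clientId, id -> new LongAdder()).increment();
            }
            return true; // never block the request
        }
    }

The interceptor would be registered via a WebMvcConfigurer, and the hits map periodically flushed to whatever shared store aggregates the counts across environments.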

Create custom CRUD on Elasticsearch

So, I have an application that extracts keywords from an Elasticsearch document. I need to somehow run this application when my Elasticsearch cluster receives a new document to index, so that the generated keywords get registered and stored with the document. Is there any way to create a plugin that extracts the keywords as soon as the document arrives?
"I need to somehow run this application when my Elasticsearch cluster receives a new document to index"
What you are describing requires a combination of an ingest pipeline and an ingest plugin with a listener for indexing events.
https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html
The ingest pipeline lets you manipulate the incoming JSON, and the listener provides a hook for you to trigger any code you want.
Elastic.co doesn't usually provide good documentation for these kinds of things, but check out the examples they have and their GitHub:
https://github.com/elastic/elasticsearch/tree/master/plugins
I cannot think of an Elasticsearch plugin for this (I'm not aware of such a thing), so I'd rather recommend using Logstash: configure the Elasticsearch input plugin to listen for updates in your cluster, and use other plugins to react accordingly. To enable Logstash to communicate with your application, one idea is to add a REST endpoint to your app and have Logstash send requests to that endpoint using the Http output plugin.
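On the application side, the endpoint Logstash posts to could be as small as this sketch (the path and the event's JSON shape are assumptions to adapt to your pipeline config):

    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RestController;
    import java.util.Map;

    @RestController
    public class IndexedDocumentController {

        // Logstash's http output plugin POSTs each event here as JSON.
        @PostMapping("/api/documents/indexed")
        public void onDocumentIndexed(@RequestBody Map<String, Object> event) {
            // Extract keywords from the document content and store them,
            // e.g. by updating the Elasticsearch document (application-specific).
            Object content = event.get("message");
        }
    }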

How do you use cron jobs using Elastic Beanstalk and Java?

I want to run cron jobs and use the same code base. I found a few solutions, but they don't appear ideal. For example, with Heroku, you can add a Scheduler element and fill in the commands to run on a web page.
http://blog.rotaready.com/scheduled-tasks-elastic-beanstalk-cron/
It seems overly complicated for load-balanced instances.
It makes use of require('async') in Node, but what would be a Java Spring Boot equivalent?
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features-managing-env-tiers.html
There doesn't appear to be any security: anyone on the net could access the /path to POST and execute the job, causing a denial-of-service attack.
It mentions cron.yaml, which doesn't make sense as the app is deployed via a WAR/ZIP file to a Tomcat instance (Spring Boot).
It mentions Amazon DynamoDB, which we don't use. We use MySQL.
It doesn't specify whether the load balancer connection draining timeout is in effect for these jobs (10s).
It mentions "Worker Configuration card on the Configuration page in the environment management console" but there is no Worker Configuration card under Configuration page.
Running a cron job in Elastic Beanstalk
For Python/Django - uses cron.yaml.
I thought of just having a dedicated EC2 instance, but how can I deploy the latest code changes there?
This may also belong on SoftwareEngineering.StackExchange.
There is an easy way to do this using other AWS systems.
You can use CloudWatch to set scheduled events (https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/WhatIsCloudWatchEvents.html). You can create a rule that fires on a fixed schedule.
You then have at least two options:
Set the event to publish an SNS message and use that SNS topic to call a webhook on your server. There are many examples of how to do this, but you will have to check the signature to ensure the web API is being called by the signed SNS message. This uses a public API, though, and may not be something you are comfortable with.
Set the event to publish an SQS message. Then set up an Elastic Beanstalk worker to process the SQS message, or just run a background script on your main server that is basically an infinite loop polling SQS for work to do (see the sketch below).
I'm not sure how familiar you are with these systems, so it may not all be clear, but there is no way to give a detailed solution here; hopefully this is enough to give you ideas.
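For the second option, a minimal polling loop with the AWS SDK for Java v2 might look like this (the queue URL is a placeholder, and runScheduledJob stands in for your actual job code):

    import software.amazon.awssdk.services.sqs.SqsClient;
    import software.amazon.awssdk.services.sqs.model.Message;
    import software.amazon.awssdk.services.sqs.model.ReceiveMessageRequest;

    public class CronWorker {
        public static void main(String[] args) {
            String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/cron-jobs";
            try (SqsClient sqs = SqsClient.create()) {
                while (true) {
                    // Long polling: wait up to 20s for a scheduled-event message.
                    ReceiveMessageRequest request = ReceiveMessageRequest.builder()
                            .queueUrl(queueUrl)
                            .waitTimeSeconds(20)
                            .build();
                    for (Message message : sqs.receiveMessage(request).messages()) {
                        runScheduledJob(message.body());
                        // Delete only after the job succeeds, so failures are retried.
                        sqs.deleteMessage(b -> b.queueUrl(queueUrl).receiptHandle(message.receiptHandle()));
                    }
                }
            }
        }

        private static void runScheduledJob(String payload) {
            // application-specific work goes here
        }
    }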

How to get Tomcat upload/download speed with JMX

Is there any way to get Tomcat upload and download traffic using Java and JMX?
Tomcat version = ?
If you are asking about the count of bytes transferred, then yes. The detailed status page in the Manager web application shows that information, and it obtains it via JMX.
You can look into the org.apache.catalina.manager package, classes StatusManagerServlet and StatusTransformer, for the actual source code.
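For example, reading the byte counters in-process could look like this sketch (the MBean names below match common Tomcat builds, but verify them with jconsole; sampling the counters twice and dividing by the interval gives an approximate rate):

    import javax.management.MBeanServer;
    import javax.management.ObjectName;
    import java.lang.management.ManagementFactory;
    import java.util.Set;

    public class TomcatTraffic {
        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // One GlobalRequestProcessor MBean exists per connector.
            Set<ObjectName> processors = server.queryNames(
                    new ObjectName("Catalina:type=GlobalRequestProcessor,name=*"), null);
            for (ObjectName name : processors) {
                long received = (Long) server.getAttribute(name, "bytesReceived");
                long sent = (Long) server.getAttribute(name, "bytesSent");
                System.out.printf("%s received=%d sent=%d%n",
                        name.getKeyProperty("name"), received, sent);
            }
        }
    }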
If you are asking about the transfer rate, if I remember correctly there is no such information exposed. It can also be defined in different ways, as it differs across clients.
You can write your own Filter, Valve, or AccessLogValve to perform such calculations and expose them via JMX.
You can also analyze an access log file.
