get notified when spring boot health check status changes - java

Is there a way to get notified when the status of a registered health check indicator changes? For example, when the health check indicator for the database goes down, I would like to take some action.
My final goal is to export the health check status as Prometheus metrics, so whenever the status changes I want to update the health metrics.

I assume your question refers to Micrometer issue 416 and Micrometer-Docs issue 39.
As per the documentation, you can register a custom HealthMetricsConfiguration. The value of the gauge is determined by the status the CompositeHealthIndicator returns, so it changes depending on the state of the individual HealthIndicators.
I am using the aforementioned HealthMetricsConfiguration (just with different status value mappings, as discussed in issue 416).
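For reference, here is a minimal sketch of such a configuration (my own, assuming Spring Boot 2.x Actuator with Micrometer on the classpath; the status-to-value mapping of 1 = UP, -1 = OUT_OF_SERVICE, -2 = DOWN is simply the one I use, not mandated by the docs):
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.actuate.health.HealthEndpoint;
import org.springframework.boot.actuate.health.Status;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HealthMetricsConfiguration {

    public HealthMetricsConfiguration(HealthEndpoint healthEndpoint, MeterRegistry registry) {
        // Register a gauge named "health" whose value is derived from the
        // aggregated status reported by the health endpoint.
        Gauge.builder("health", healthEndpoint, HealthMetricsConfiguration::statusToValue)
             .description("Aggregated application health status")
             .register(registry);
    }

    private static double statusToValue(HealthEndpoint healthEndpoint) {
        Status status = healthEndpoint.health().getStatus();
        if (Status.UP.equals(status)) {
            return 1;
        }
        if (Status.OUT_OF_SERVICE.equals(status)) {
            return -1;
        }
        if (Status.DOWN.equals(status)) {
            return -2;
        }
        return 0; // UNKNOWN or custom statuses
    }
}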
I went ahead and implemented a custom alternating health indicator:
import java.time.LocalDateTime;

import org.springframework.boot.actuate.health.AbstractHealthIndicator;
import org.springframework.boot.actuate.health.Health.Builder;
import org.springframework.boot.actuate.health.Status;
import org.springframework.stereotype.Component;

@Component
public class AlternatingHealthIndicator extends AbstractHealthIndicator {

    @Override
    protected void doHealthCheck(Builder builder) throws Exception {
        int minute = LocalDateTime.now().getMinute();
        boolean minuteIsEven = minute % 2 == 0;
        builder.status(minuteIsEven ? Status.UP : Status.DOWN);
        builder.withDetail("description", "UP when current minute is even; DOWN when current minute is odd");
        builder.withDetail("currentMinute", minute);
        builder.withDetail("minuteIsEven", minuteIsEven);
    }
}
The health gauge exported on the Prometheus endpoint now changes every minute between 1 (UP) and -2 (DOWN).
Regarding alerting, you can use Grafana alerting or look into Prometheus' Alertmanager.

Related

How to get an alarm when there are no logs for a time period in AWS CloudWatch?

I have a Java application that runs in AWS Elastic Container Service. The application polls a queue periodically. Sometimes there is no response from the queue and the application hangs forever.
I have enclosed the methods in try-catch blocks that log exceptions, yet there are no logs in CloudWatch after that point. No exceptions or errors.
Is there a way I can identify this situation (no logs in CloudWatch), for example by filtering for an error log pattern?
Then I could restart the service. Any trick or solution would be appreciated.
public void handleProcess() throws Exception {
    try {
        while (true) {
            Response response = QueueUitils.pollQueue(); // poll the queue
            QueueUitils.processMessage(response);
            TimeUnit.SECONDS.sleep(WAIT_TIME); // WAIT_TIME = 20
        }
    } catch (Exception e) {
        LOGGER.error("Data Queue operation failed: " + e.getMessage(), e);
        throw e;
    }
}
You can do this with CloudWatch Alarms. I've set up a test Lambda function for this which runs every minute and logs to CloudWatch.
Go to CloudWatch and click Alarms in the left-hand menu
Click the orange Create Alarm button
Click Select Metric
Then choose Logs, then Log Group Metrics and choose the IncomingLogEvents metric for the relevant log group (the log group to which your application is logging). In my case it's /aws/lambda/test-log-silence
Click Select Metric
Now you can specify how you want to measure the metric. I've chosen the average number of log entries over 5 minutes, so if there are no log entries for 5 minutes, that value will be zero.
Scroll down and set the condition to "Lower Than or Equal To" zero. This will trigger the alarm when there are no log entries for 5 minutes (or whatever period you set).
Now click Next and specify an SNS topic to push the notification to. An SNS topic can notify you via email, SMS, AWS Lambda, and more.
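If you prefer to create the same alarm from code rather than the console, here is a minimal sketch using the AWS SDK for Java v2 (not part of the original answer); the alarm name, log group, and SNS topic ARN are placeholders you would replace with your own:
import software.amazon.awssdk.services.cloudwatch.CloudWatchClient;
import software.amazon.awssdk.services.cloudwatch.model.ComparisonOperator;
import software.amazon.awssdk.services.cloudwatch.model.Dimension;
import software.amazon.awssdk.services.cloudwatch.model.PutMetricAlarmRequest;
import software.amazon.awssdk.services.cloudwatch.model.Statistic;

public class NoLogsAlarm {

    public static void main(String[] args) {
        try (CloudWatchClient cloudWatch = CloudWatchClient.create()) {
            cloudWatch.putMetricAlarm(PutMetricAlarmRequest.builder()
                    .alarmName("no-incoming-logs")
                    .namespace("AWS/Logs")
                    .metricName("IncomingLogEvents")
                    .dimensions(Dimension.builder()
                            .name("LogGroupName")
                            .value("/aws/lambda/test-log-silence") // your log group
                            .build())
                    .statistic(Statistic.AVERAGE)
                    .period(300)                    // 5-minute window
                    .evaluationPeriods(1)
                    .threshold(0.0)
                    .comparisonOperator(ComparisonOperator.LESS_THAN_OR_EQUAL_TO_THRESHOLD)
                    .treatMissingData("breaching")  // no data at all also alarms
                    .alarmActions("arn:aws:sns:eu-west-1:123456789012:my-alerts") // placeholder ARN
                    .build());
        }
    }
}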
With reference to brads3290's answer, if you are using AWS CDK:
import * as cdk from '@aws-cdk/core';
import * as cloudwatch from '@aws-cdk/aws-cloudwatch';
// ...
const metric = new cloudwatch.Metric({
  namespace: 'AWS/Logs',
  metricName: 'IncomingLogEvents',
  dimensions: { LogGroupName: '/aws/lambda/test-log-silence' },
  statistic: "Average",
  period: cdk.Duration.minutes(5),
});
const alarm = new cloudwatch.Alarm(this, 'Alarm', {
  metric,
  threshold: 0,
  comparisonOperator: cloudwatch.ComparisonOperator.LESS_THAN_OR_EQUAL_TO_THRESHOLD,
  evaluationPeriods: 1,
  datapointsToAlarm: 1,
  treatMissingData: cloudwatch.TreatMissingData.BREACHING,
});
This should also solve the problem of ignoring missing data.
In my case, I needed to use dimensionsMap: {} instead of dimensions: {} (dimensions was deprecated in favor of dimensionsMap in newer CDK versions):
const metric = new cloudwatch.Metric({
  namespace: 'AWS/Logs',
  metricName: 'IncomingLogEvents',
  dimensionsMap: {
    "LogGroupName": "logGroupNamehere.."
  },
  statistic: "Sum",
  period: cdk.Duration.days(1),
});
And the Alarm looks like:
new cloudwatch.Alarm(this, 'no-incoming-logs-alarm', {
  metric,
  alarmName: `incoming-logs-alarm-${props?.stage}`,
  threshold: 1,
  comparisonOperator: cloudwatch.ComparisonOperator.LESS_THAN_THRESHOLD,
  evaluationPeriods: 1,
  datapointsToAlarm: 1,
  treatMissingData: cloudwatch.TreatMissingData.MISSING,
  alarmDescription: 'Some meaningful description',
});

How to emit custom metrics from a Spring Boot application and use them in PCF Autoscaler

I am following this example for emitting the custom metrics and followed these details for registering the metrics in PCF.
Here is the code:
@RestController
public class CustomMetricsController {

    @Autowired
    private MeterRegistry registry;

    @GetMapping("/high_latency")
    public ResponseEntity<Integer> highLatency() throws InterruptedException {
        int queueLength = 0;
        Random random = new Random();
        int number = random.nextInt(50);
        System.out.println("Generated number is: " + number);
        if (number % 2 == 0) {
            queueLength = 99;
        } else {
            queueLength = 200;
        }
        return new ResponseEntity<>(queueLength, null, HttpStatus.OK);
    }
}
application.yml:
management:
endpoints:
web:
exposure:
include: "metrics,prometheus"
endpoint:
metrics:
enabled: true
prometheus:
enabled: true
SecurityConfiguration class:
build.gradle dependencies part:
Steps I followed to register the custom metrics after app deployment to PCF:
Installed metric-registrar on my local machine
Registered the endpoint that emits an Integer (which I am using for the Autoscaler rule):
cf register-metrics-endpoint api high_latency
After this step I can see a custom user-provided service bound to my app in PCF.
Installed the log-cache plugin on my local machine to verify the metrics endpoint; here are the logs:
Finally, I added the rule in Autoscaler for the custom metric.
This is the error I am getting in Autoscaler event history.
EVENT HISTORY Most Recent: Wed July 17, 2019 at 11:20 AM Autoscaler
did not receive any metrics for high_latency during the scaling
window. Scaling down will be deferred until these metrics are
available.
I looked into this a bit. A few things here...
I believe your registration command is wrong, at least for the sample app you referenced.
You're using cf register-metrics-endpoint api high_latency, which means you have an app named api and an endpoint on that app at high_latency which exports metrics using the Prometheus format. For this sample app, the path should be /actuator/prometheus, according to the README & my brief test.
When I used the command cf register-metrics-endpoint app-name /actuator/prometheus, I was able to see the custom metrics in the output from cf tail & I do not see them in your included output.
Ex: (not showing up in your screenshot)
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] GAUGE tomcat_global_request_max_seconds:2.006000
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] GAUGE system_cpu_count:4.000000
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] COUNTER tomcat_sessions_created_sessions_total:12
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] COUNTER custom_metric_total:15
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] GAUGE jvm_gc_pause_seconds_sum:0.138000
2019-07-17T22:54:37.83-0400 [custom-metrics-demo/0] COUNTER jvm_gc_pause_seconds_count:25
In the referenced sample app, there is no metric called high_latency, so that won't work as the name of your metric, as it will never get generated by the app. You can see all the metrics if you access the /actuator/prometheus endpoint in your browser. The Prometheus format is text based and pretty easy to read.
The last one is tricky and not documented but I can see it in the code for Autoscaler (at the time of writing this). When Autoscaler polls for metrics from LogCache, it only pulls GAUGE & TIMER events, not COUNTER events. If you were to try and use custom_metric from the demo app that would not work because it is a COUNTER metric. You'll just see the message about Autoscaler not seeing any metric events during the scaling window. Make sure that you are picking a GAUGE metric to use for scaling.
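To make that concrete, here is a minimal sketch of registering such a GAUGE with the MeterRegistry that is already injected in your controller; the metric name queue_length and the wiring around it are illustrative assumptions, not code from the referenced sample:
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CustomMetricsController {

    private final AtomicInteger queueLength;

    public CustomMetricsController(MeterRegistry registry) {
        // Registers a GAUGE named "queue_length" backed by this AtomicInteger;
        // it will appear on /actuator/prometheus and can be polled for scaling.
        this.queueLength = registry.gauge("queue_length", new AtomicInteger(0));
    }

    @GetMapping("/high_latency")
    public ResponseEntity<Integer> highLatency() {
        int value = new Random().nextInt(50) % 2 == 0 ? 99 : 200;
        queueLength.set(value);
        return ResponseEntity.ok(value);
    }
}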
It also does not appear that Autoscaler supports using metrics with associated tags. If for example, you wanted to use tomcat_servlet_request_seconds_sum{name="dispatcherServlet",}, I don't think there's a way to tell Autoscaler that the name tag is required to be a certain value. That info is in LogCache, but I don't think Autoscaler uses it at this time. My quick glance through the code, at the time I write this, seems to indicate it's only looking at the metric name. If you're creating a custom metric, this won't matter. Just don't use any tags for the metric.
Hope that helps!

Braintree, Java: Check if subscription is valid and paid for

I am working on integrating Braintree payments into our Spring MVC based app. We have subscriptions in our application which are charged on a monthly basis unless canceled by the user. However, there might be a situation where the payment doesn't go through on the next billing cycle, in which case we would like to cancel the services offered.
The approach I have taken is to get a list of all subscriptions which are currently active and track their status. Based on that, we either let the service continue or run the cancellation code.
Is the approach below sufficient?
Code:
@Override
@Scheduled(cron = "0 4 5 * * ?")
public void checkIfSubscriptionIsActive() {
    List<Payment> paymentList = this.paymentDAO.getAllPayments();
    for (Payment payment : paymentList) {
        Subscription subscription = gateway.subscription().find(payment.getPaypalId());
        if (subscription != null) {
            Subscription.Status status = subscription.getStatus();
            if (status != null) {
                if (status.toString().equals("Canceled") || status.toString().equals("Expired") ||
                        status.toString().equals("Past Due") || status.toString().equals("Pending") ||
                        status.toString().equals("Unrecognized")) {
                    // Cancel service code
                }
            }
        }
    }
}
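As a side note, the status check can compare enum constants instead of their string representations; this is only a sketch and assumes the Braintree Java SDK's Subscription.Status constants CANCELED, EXPIRED, PAST_DUE, PENDING and UNRECOGNIZED:
import java.util.EnumSet;
import java.util.Set;

import com.braintreegateway.Subscription;

public final class SubscriptionStatusCheck {

    // Statuses that should trigger cancellation of the service.
    private static final Set<Subscription.Status> CANCEL_STATUSES = EnumSet.of(
            Subscription.Status.CANCELED,
            Subscription.Status.EXPIRED,
            Subscription.Status.PAST_DUE,
            Subscription.Status.PENDING,
            Subscription.Status.UNRECOGNIZED);

    private SubscriptionStatusCheck() {
    }

    public static boolean shouldCancel(Subscription subscription) {
        return subscription != null && CANCEL_STATUSES.contains(subscription.getStatus());
    }
}
The loop body above would then reduce to if (SubscriptionStatusCheck.shouldCancel(subscription)) { /* cancel service code */ }.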

TwinCat ADS event-driven reading stops working after a while (Java)

We developed a Java application which uses the TwinCat ADS library (DLL) to read, write and handle events from the Beckhoff PLC (CX5120).
We successfully run this on several machines but unfortunately we’re currently having a problem case where the event handling suddenly stops.
This is the exact scenario we went through:
Read, write and events are handled correctly.
Suddenly we don't get any events at all anymore, though reading and writing still work correctly.
Replaced the PLC with another one, and everything started working again. We assumed it was a licensing problem.
After a week of unattended running, the same problem started again: the PLC/ADS library no longer seems to trigger events, and we can't get it working again in any way. Reading/writing still works as it should.
Tested with another PC running the Java application: same problem. So something in the PLC seems to freeze up / stop working.
Here's how we have set up the event handling:
// Implementation of the CallbackListenerAdsState interface
public class ADSEventController implements CallbackListenerAdsState {
    ......

    // Register itself as listener for the ADS events (in constructor)
    callObject = new AdsCallbackObject();
    callObject.addListenerCallbackAdsState(this);
    ....

    // Event handling
    public void onEvent(AmsAddr addr, AdsNotificationHeader notification, long user) {
        log.info("Got ADS event for handle[{}] and with raw data[{}]", user, notification.getData());
        ......

// Registering notification handles for PLC variables
// If we already assigned a notification, delete it first (while reconnecting)
JNILong notification = new JNILong();
if (var.getNotification() != null) {
    notification = var.getNotification();
    AdsCallDllFunction.adsSyncDelDeviceNotificationReq(addr, notification);
}

// Specify attributes of the notificationRequest
AdsNotificationAttrib attr = new AdsNotificationAttrib();
attr.setCbLength(var.getSize());
attr.setNTransMode(AdsConstants.ADSTRANS_SERVERONCHA);
attr.setDwChangeFilter(1000); // 0.01 sec
attr.setNMaxDelay(2000); // 0.02 sec

// Create notificationHandle
long err = AdsCallDllFunction.adsSyncAddDeviceNotificationReq(
        addr,
        AdsCallDllFunction.ADSIGRP_SYM_VALBYHND, // IndexGroup
        var.getHandle(),                         // IndexOffset
        attr,                                    // The defined AdsNotificationAttrib object
        var.getHandle(),                         // Choose arbitrary number
        notification);
var.setNotification(notification);

if (err != 0) {
    log.error("Error: Add notification: 0x{} for var[{}]", Long.toHexString(err), var.getId());
}
We managed to find the cause.
When we register a variable, we get a handle (long) from the PLC, which in our case unexpectedly started to come back as negative values after a while.
We also used this long value as the user reference for notifications; however, the user reference is an unsigned long in the ADS library.
So if we set a negative value, e.g. -1258290964, as the 'arbitrary number' in the adsSyncAddDeviceNotificationReq call, the 'user' parameter (long) of the CallbackListenerAdsState onEvent method received the unsigned representation of our signed user reference, which is 3036676332.
In our Java application we used this user reference to match an event to a specific PLC variable by its handle. Since, in our example, we expected -1258290964 but got 3036676332, we never handled any events.
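For illustration, a minimal sketch of the fix this implies (the lookup map and the PlcVariable type are hypothetical, not our actual code): normalize both the stored handle and the user value delivered to onEvent to their unsigned 32-bit representation before matching.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Masking with 0xFFFFFFFFL turns e.g. -1258290964 into 3036676332,
// matching the unsigned value the ADS library reports back.
private static long toUnsigned32(long value) {
    return value & 0xFFFFFFFFL;
}

// Hypothetical lookup map, keyed with toUnsigned32(handle) at registration time.
private final Map<Long, PlcVariable> variablesByUserRef = new ConcurrentHashMap<>();

@Override
public void onEvent(AmsAddr addr, AdsNotificationHeader notification, long user) {
    PlcVariable var = variablesByUserRef.get(toUnsigned32(user));
    if (var != null) {
        log.info("Got ADS event for variable [{}]", var.getId());
    }
}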

Spring: Send automated email to member and admin after timer expires

I am working on a Spring MVC application with service desk functionality. As part of the service desk, users can create issues and assign a support-team member. They can also specify how much time the issue needs to be resolved in. I am storing that time as a java.sql.Timestamp.
Now, when the time expires, I would like to send an email to the support-team admin, the person who created the issue, and the support-team member responsible for resolving the issue.
If it were a normal scheduled or cron job, I could just write a @Scheduled method and be done with it, but here I would like to check, for example after 6 hours, whether the issue was resolved or not. How do I accomplish that? I have no idea, to be honest.
Here is the service-layer part for SupportRequest:
@Service
@Transactional
public class SupportRequestServiceImpl implements SupportRequestService {

    private final SupportRequestDAO supportRequestDAO;

    @Autowired
    public SupportRequestServiceImpl(SupportRequestDAO supportRequestDAO) {
        this.supportRequestDAO = supportRequestDAO;
    }

    @Autowired
    private SupportTeamService supportTeamService;

    @Override
    public int addSupportRequest(SupportRequest supportRequest, int assignedTeamId, Long groupId) {
        SupportTeam supportTeam = this.supportTeamService.getSupportTeamMemberById(assignedTeamId);
        if (supportTeam != null) {
            supportRequest.setCreationTime(new Timestamp(System.currentTimeMillis()));
            supportRequest.setAssignedTeamMemberId(supportTeam.getTeamId());
            return this.supportRequestDAO.addSupportRequest(supportRequest, groupId);
        }
        return 0;
    }
}
I don't know what else to show. Thanks a lot.
Update
Will something like this work?
long delay = 1000 * 60 * 60 * 12; // after 12 hrs
Timer timer = new Timer();
Calendar cal = Calendar.getInstance();
timer.schedule(new TimerTask() {
    public void run() {
        // Task here ...
        System.out.println("inside the main");
        Integer id = new Integer(10);
        Assert.assertNotNull(id);
    }
}, delay);
For this kind of scenario, there should be a background process running. That process will check for issues that have not been fixed within the given time, send a message to whoever you want, and then continue running in the background.
There are different ways of doing this.
1. Batch Process
You can create a batch process. It will run on your server, check for expired issues, and then send mail to the support-team admin (a minimal sketch of such a scheduled check is shown after this list).
2. Techniques for Real-time Updates
You can also use real-time update techniques in Spring. With this technique you fire a request after every given period that checks for expired issues; if an unresolved issue is found, you can send mail. Please read the related document: Spring MVC 3.2 Preview: Techniques for Real-time Updates.
3. Web Socket
WebSockets can also be useful for this kind of task. A good source is here:
SPRING FRAMEWORK 4.0 M2: WEBSOCKET MESSAGING ARCHITECTURES
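Here is a minimal sketch of option 1 as a Spring @Scheduled job; the DAO method findUnresolvedPastDeadline, the getters on SupportRequest, and the configured JavaMailSender are assumptions for illustration, not code from the question:
import java.sql.Timestamp;

import org.springframework.mail.SimpleMailMessage;
import org.springframework.mail.javamail.JavaMailSender;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class SupportRequestDeadlineChecker {

    private final SupportRequestDAO supportRequestDAO; // DAO from the question
    private final JavaMailSender mailSender;           // assumed to be configured

    public SupportRequestDeadlineChecker(SupportRequestDAO supportRequestDAO,
                                         JavaMailSender mailSender) {
        this.supportRequestDAO = supportRequestDAO;
        this.mailSender = mailSender;
    }

    // Runs every 15 minutes and picks up requests whose deadline has passed.
    @Scheduled(fixedDelay = 15 * 60 * 1000)
    public void notifyExpiredRequests() {
        Timestamp now = new Timestamp(System.currentTimeMillis());
        // findUnresolvedPastDeadline is a hypothetical DAO method.
        for (SupportRequest request : supportRequestDAO.findUnresolvedPastDeadline(now)) {
            SimpleMailMessage message = new SimpleMailMessage();
            // getAssignedMemberEmail() and getId() are hypothetical getters.
            message.setTo(request.getAssignedMemberEmail());
            message.setSubject("Support request " + request.getId() + " is overdue");
            message.setText("The resolution deadline for this request has passed.");
            mailSender.send(message);
        }
    }
}
Scheduling also needs to be enabled with @EnableScheduling on a configuration class; the fixed delay only controls how often the check runs, not the deadline itself.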
