Disable Direct Runner Logs in Apache Beam for Kafka Consumer - java

I've seen a similar question asked, but it was about Dataflow logging, not Direct Runner logging.
Basically, I want to turn off the flood of KafkaIO read (consumer) logs. I have tried setting the logging levels via SdkHarnessOptions as follows:
var kafkasLogs =
    SdkHarnessOptions.SdkHarnessLogLevelOverrides.from(
        new HashMap<>(
            Map.of(
                "org.apache.kafka.clients.consumer.internals.SubscriptionState",
                SdkHarnessOptions.LogLevel.ERROR.name())));
options.setSdkHarnessLogLevelOverrides(kafkasLogs); // options extends SdkHarnessOptions
I have also tried variations of the above, but none of them silence the consumer logs.
How can I turn these logs off without affecting the rest of my pipeline's logging?

Related

AWS Latency Metrics Logging Issue Spring Boot

I am trying to log AWS latency metrics on the application server. I have tried implementing the Latency Metrics Logging section at the end of https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-logging.html
As mentioned in the instructions there:
I am setting the following while initializing ApplicationContext:
AwsSdkMetrics.enableDefaultMetrics();
AwsSdkMetrics.setMetricNameSpace("SNSMetricsLog");
AwsSdkMetrics.setCredentialProvider(credentialsProvider);
I am using the following in log.properties:
log.folder=log
log.app.fileName=application.log
log.metric.fileName=metric.json
log.level=DEBUG
log.app.batch.fileName=batch.log
log.app.skippedMsg.fileName=skipped.log
log.logger.com.amazonaws.latency=DEBUG
Even after making these changes, the AWS latency metrics do not appear, although I can see other DEBUG logs.

Is it possible to create custom fields in a Kibana dashboard?

I am using a Java micro-service architecture in my application and generating separate log files for each micro-service.
I am using the ELK stack to visualize the logs in Kibana, but the problem is that the fields I'm getting from Elasticsearch are only the default server log fields, for example #timestamp, #version, #path, #version.keyword, #host.
I want to customize these fields by adding some of my own, like customerId, txn-Id and mobile no, so that we can analyze the data more easily.
I'm using Log4j 2 (org.apache.logging.log4j) to write the logs. Can I add the above fields (customerId, txn-Id, mobile) to the log files, so that Elasticsearch stores them alongside the default fields and these custom fields become available in a Kibana dashboard? Is this possible?
It's definitely possible to do that. I've not done it with the log4j2 stack (I have with slf4j/logback), but the basic approach is:
set those fields in the Mapped Diagnostic Context (I'm fairly sure log4j2 supports that); see the sketch after this list
use a log appender which logs to logstash-structured JSON
configure filebeat to ship the JSON logs
if filebeat is shipping to logstash, you'll need to configure logstash to pass those preformatted JSON logs directly to elasticsearch
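For example, here is a minimal Log4j2 sketch of the MDC step (the class, method and field names are illustrative; you would pair this with a JSON layout such as JsonTemplateLayout so the ThreadContext entries become separate fields):
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.ThreadContext;

public class OrderService {
    private static final Logger LOG = LogManager.getLogger(OrderService.class);

    public void processTransaction(String customerId, String txnId, String mobileNo) {
        // ThreadContext is Log4j2's MDC; entries put here can be emitted as
        // individual JSON fields by a structured layout.
        ThreadContext.put("customerId", customerId);
        ThreadContext.put("txnId", txnId);
        ThreadContext.put("mobileNo", mobileNo);
        try {
            LOG.info("processing transaction");
            // ... business logic ...
        } finally {
            // Clear the context so values don't leak to the next request on this thread.
            ThreadContext.clearMap();
        }
    }
}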
It is definitely possible. I am doing that now with my applications. However, the output looks a bit different from yours. The basic guide for doing this can be found at Logging in the Cloud on the Log4j2 web site.
The "normal" log view looks very similar to what you would see when logging to a file.
However, if you select a message you can see the individual fields.
The Log4j2 configuration uses a TCP Socket appender that is configured to write to a cluster of Logstash servers that use a single DNS entry and to use the Gelf layout.
You can also use MapMessages to capture individual data elements and log them. While this currently works it is slightly cumbersome so I have recently committed improvements that will be available in Log4j 2.15.0.
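As a rough sketch of the MapMessage approach (the logger name and field values are illustrative):
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.logging.log4j.message.StringMapMessage;

Logger auditLog = LogManager.getLogger("audit");
// Each entry can be emitted as its own field when a structured layout (Gelf, JSON Template) is used.
auditLog.info(new StringMapMessage()
        .with("customerId", "C123")
        .with("txnId", "T456")
        .with("message", "order placed"));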
It is important to note that the Logging in the Cloud page briefly mentions storing your logging configuration in Spring Cloud Config. If you want to have a common base configuration while allowing apps to do some customization, this works very, very well. However, the Gelf layout, the JSON Template Layout and the TCP appender are all independent of that and can be used without Spring Boot.

How to prevent Vertx from writing logs automatically?

When starting my TCP server using Vertx, I have the following output:
[2018-06-04 12:15:45] [FINEST ] Net server listening on 0.0.0.0:/0:0:0:0:0:0:0:0:8600
[2018-06-04 12:15:45] [INFO ] Server is now listening on port : 8600
I was expecting the second line, since I am telling Vertx to write it:
server.listen(res -> {
    if (res.succeeded()) {
        logger.info("Server is now listening on port : {0, number, #}", server.actualPort());
    } else {
        logger.error("Server failed to bind");
    }
});
The first line, though, is written by Vertx itself. I am a bit surprised, since I could not see anywhere in the Vertx documentation that this would happen, nor how to prevent it from doing so.
How can I make Vertx stop logging automatically?
Thanks in advance.
Well, the manual states that vert.x by default uses java.util.logging, often referred to by its nickname JUL. It's configurable, so depending on your use case you should be able to tune the log output. Alternatively, vert.x can be instructed to use an external logging framework; they each have their own advantages and disadvantages.
The documentation for JUL isn't really the most helpful prose ever written, fortunately there are plenty of third party sites covering that topic, like http://tutorials.jenkov.com/java-logging/index.html but a quick Google will point you to many others too.
In summary:
you will need to write a logging.properties file that reflects the output you want to obtain, and where (in logfiles and/or on the console)
you will have to pass that file to your vert.x application via the system property java.util.logging.config.file
Limiting the info produced by certain application parts can be done by using the filtering capabilities present in JUL. So, for example, in your logging.properties you could put
java.util.logging.FileHandler.level=INFO
which will restrict logging that goes to the logfile to INFO or higher. That line, for example, would already do away with the vert.x log you see in your example. You can also restrict logging per package, group of packages or even individual classes. A nice writeup of these possibilities can be found here: java.util.logging: how to set level by logger package (or prefix)?. I think vert.x uses the prefix io.vertx.
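Putting that together, a minimal logging.properties sketch might look like this (assuming the Vert.x loggers do live under the io.vertx prefix; adjust handlers and levels to your setup):
# passed to the JVM via -Djava.util.logging.config.file=logging.properties
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=INFO
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter

# default level for everything else
.level=INFO

# silence the chatty vert.x internals
io.vertx.level=WARNING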

How to automate Kafka Testing

We have developed a system that uses Kafka to queue data and later consumes that data to place orders for users.
We have tested certain things manually, but now our aim is to automate the process.
Is there any client available to test it? I found ways to unit test it using the Kafka client itself, but my aim is to test the system as a whole.
EDIT: Our purpose is just API testing, i.e., just the back-end, not the UI.
You can start Kafka programmatically in your integration test. Kafka uses ZooKeeper, so first look at ZooKeeper's TestingServer: an instance of this class creates and starts a ZK server on the given port.
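A minimal sketch of that step, assuming the org.apache.curator:curator-test dependency is on the test classpath:
import org.apache.curator.test.TestingServer;

// Creates and starts an in-memory ZooKeeper server on the given port.
TestingServer zkServer = new TestingServer(2181);
String zkConnect = zkServer.getConnectString(); // point the broker config below at this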
Next, look at KafkaServerStartable.scala: you have to provide a configuration that points to your in-memory ZK server and invoke its startup() method. Here is some code:
import kafka.server.KafkaConfig;
import kafka.server.KafkaServerStartable;
import java.util.Properties;

public KafkaTest() {
    // createProperties() should at least set zookeeper.connect to the in-memory
    // ZK server and log.dirs to a temporary directory.
    Properties properties = createProperties();
    KafkaConfig kafkaConfig = new KafkaConfig(properties);
    KafkaServerStartable kafka = new KafkaServerStartable(kafkaConfig);
    kafka.startup();
}
Hope these help:)
You can go for integration-testing or end-to-end testing by bringing up Kafka in a docker container. If you use Apache kafka-clients:2.1.0, then you don't need to deal with ZooKeeper at the API level while producing or consuming the records.
Dockerizing Kafka and testing against it helps to cover scenarios on a single-node as well as a multi-node Kafka cluster. This way you don't have to test against a mock/in-memory Kafka first and the real Kafka later. This can be done using TestContainers.
If you have too many test scenarios to cover, you can go for Kafka Declarative Testing like docker-compose style, by which you can eliminate the Kafka client API coding.
Check out some handy examples here for validating produce and consume.
TestContainers project also supports docker-compose.
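A minimal Testcontainers sketch (assuming the org.testcontainers:kafka test dependency and a local Docker daemon; the image tag is illustrative):
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.utility.DockerImageName;

public void runAgainstRealKafka() {
    // Starts a single-node Kafka broker in Docker for the duration of the test.
    try (KafkaContainer kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.4.0"))) {
        kafka.start();
        // Point the producers/consumers under test at the container.
        String bootstrapServers = kafka.getBootstrapServers();
        // ... run the end-to-end scenario against bootstrapServers ...
    }
}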
As I understand it, you want to implement end-to-end tests starting from messages. Some people and I recently researched libraries, tools and frameworks for testing event-driven systems that use Kafka.
We found Zerocode, an automated API testing framework that uses a declarative language such as JSON or YAML. It supports REST, SOAP and, what interested us, messaging. It sends messages to and consumes messages from topics and makes assertions at the end; it is easy to learn and use. Here is the link for more details: Zerocode. It seems like a good option, although we are just starting to use it.
You will need Kafka brokers and their dependencies running for this solution to work, but nothing that a docker-compose file and/or a few scripts can't handle to bring up a test environment.
Another way is to implement your own project with Kafka libraries and use the libraries to send and receive messages in the tests.
Unfortunately, we couldn't find more options available out there. Kafka has a proposal to create a test kit, but it's not in progress yet.
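A minimal sketch of that last approach, driving the system under test with the plain kafka-clients producer API (the topic name, payload and bootstrap address are illustrative):
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public void sendTestOrder() throws Exception {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092");
    props.put("key.serializer", StringSerializer.class.getName());
    props.put("value.serializer", StringSerializer.class.getName());

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
        // Publish a test message, then assert on whatever the system produces downstream
        // (an output topic, a database row, an order placed for the user, ...).
        producer.send(new ProducerRecord<>("orders", "user-1", "{\"orderId\":42}")).get();
    }
}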
Unfortunately, the approach described by Pavel does not work for Kafka 2.8+ anymore. However, I could make our end-to-end tests with Kafka 3.2 work using the approach taken by KarelDB:
Properties props = TestUtils.createBrokerConfig(
        brokerId,
        zkConnect,
        false,
        false,
        TestUtils.RandomPort(),
        noInterBrokerSecurityProtocol,
        noFile,
        EMPTY_SASL_PROPERTIES,
        true,
        false,
        TestUtils.RandomPort(),
        false,
        TestUtils.RandomPort(),
        false,
        TestUtils.RandomPort(),
        Option.<String>empty(),
        1,
        false,
        1,
        (short) 1
);
KafkaConfig config = KafkaConfig.fromProps(props);
KafkaServer server = TestUtils.createServer(config, Time.SYSTEM);
// `createServer` will also start your Kafka server.
// To shut down:
server.shutdown();

Apache Logging - Send log output directly to queue

I am using standard Apache logging (org.apache.log4j).
Currently, I am taking the data to be logged manually and publishing it to Apache ActiveMQ.
Is it possible to configure the logging output to publish directly into ActiveMQ?
This might sound stupid, but since both are from Apache, I wonder whether there is some implicit support for this that I could not find.
log4j provides a JMSAppender out of the box. It allows publishing logging events to a JMS Topic.
For configuration specific to ActiveMQ please check the documentation - How do I use log4j JMS appender with ActiveMQ
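As a rough sketch, the log4j.properties configuration from that ActiveMQ page looks roughly like this (the broker URL and topic binding name are illustrative, and you also need the matching JNDI setup described on the linked page):
log4j.rootLogger=INFO, jms

log4j.appender.jms=org.apache.log4j.net.JMSAppender
log4j.appender.jms.InitialContextFactoryName=org.apache.activemq.jndi.ActiveMQInitialContextFactory
log4j.appender.jms.ProviderURL=tcp://localhost:61616
log4j.appender.jms.TopicBindingName=logTopic
log4j.appender.jms.TopicConnectionFactoryBindingName=ConnectionFactory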
Not sure if you were looking for log4j-1.x or log4j-2.0, but here are the links for log4j-2.0:
http://logging.apache.org/log4j/2.x/manual/appenders.html#JMSQueueAppender
http://logging.apache.org/log4j/2.x/manual/appenders.html#JMSTopicAppender
