I have started ZooKeeper, the Kafka server, a Kafka producer, and a Kafka consumer. I downloaded the Confluent JDBC connector JAR, placed it on the plugin path, and set plugin.path in connect-standalone.properties. I then ran connect-standalone.bat ....\config\connect-standalone.properties ....\config\sink-quickstart-mysql.properties without any error, but it prints many warnings, the connector does not seem to start, and my data is not reflected in the tables. What have I missed? Can you please help me out? I get the warnings below:
org.reflections.ReflectionsException: could not get type for name io.netty.internal.tcnative.SSLPrivateKeyMethod
        at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:312)
        at org.reflections.Reflections.expandSuperTypes(Reflections.java:382)
        at org.reflections.Reflections.<init>(Reflections.java:140)
        at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.<init>(DelegatingClassLoader.java:433)
        at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:325)
        at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:261)
        at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:209)
        at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:202)
        at org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:60)
        at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:79)
Caused by: java.lang.ClassNotFoundException: io.netty.internal.tcnative.SSLPrivateKeyMethod
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:310)
        ... 9 more
No need to write a source connector yourself unless you need to connect Kafka to some exotic data source. Popular tools like MySQL are already well covered: there is a JDBC connector by Confluent that does what you want.
https://docs.confluent.io/current/connect/kafka-connect-jdbc/index.html
You'll need a working Kafka Connect installation, and then you can "connect" your MySQL tables to Kafka with an HTTP POST to the Kafka Connect API. Just specify a comma-separated list of the tables you'd like to use as sources in the table.whitelist attribute. For example, something like this:
curl -X POST $KAFKA_CONNECT_API/connectors -H "Content-Type: application/json" -d '{
  "name": "jdbc_source_mysql_01",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://mysql:3306/test",
    "connection.user": "connect_user",
    "connection.password": "connect_password",
    "topic.prefix": "mysql-01-",
    "poll.interval.ms": 3600000,
    "table.whitelist": "test.accounts",
    "mode": "bulk"
  }
}'
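Once the connector is created, you can check that it and its task are actually running with a GET against the same REST API, for example:
curl -s $KAFKA_CONNECT_API/connectors/jdbc_source_mysql_01/status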
Hello, I have a Spring Boot app and I am exposing some metrics to Prometheus. My next goal is to get these metrics into Grafana in order to build some dashboards. I am using Docker on WSL Ubuntu and ran the following commands for Prometheus and Grafana:
docker run -d --name=prometheus -p 9090:9090 -v /mnt/d/Projects/Msc-Thesis-Project/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus --config.file=/etc/prometheus/prometheus.yml
docker run -d --name=grafana -p 3000:3000 grafana/grafana
The Prometheus dashboard in my browser shows that everything is up and running. My problem is with the Grafana configuration, where I have to add Prometheus as a data source.
In the URL field I am providing http://localhost:9090, but I am getting the following error:
Error reading Prometheus: Post "http://localhost:9090/api/v1/query": dial tcp 127.0.0.1:9090: connect: connection refused
I've searched everywhere and found some workarounds that don't apply to me. To be specific, I tried the following: http://host.docker.internal:9090, http://server-ip:9090 and of course my system's IP address from the ipconfig command, http://<ip_address>:9090. Nothing works!
I am not using docker-compose, just a prometheus.yml file, which is as follows:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'Spring Boot Application input'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 2s
    scheme: http
    static_configs:
      - targets: ['192.168.1.233:8080']
        labels:
          application: "MSc Project Thesis"
Can you advise me on this?
You can use the docker inspect command to find the IP address of the Prometheus container and then replace localhost with it.
I'd suggest you use docker-compose, which handles DNS resolution between containers, so your localhost issue will go away.
This works, as in https://stackoverflow.com/a/74061034/4841138
Also, if you deploy the stack with docker compose and all containers are in the same network, you can do this:
URL: http://prometheus:9090
Here, prometheus is the hostname of the Prometheus container, which can be resolved by every container in the same network.
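For reference, a minimal docker-compose.yml along those lines might look like the sketch below (the images and the prometheus.yml path are taken from the docker run commands above; adjust them to your setup):
version: "3"
services:
  prometheus:
    image: prom/prometheus
    command: --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - /mnt/d/Projects/Msc-Thesis-Project/prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
# Both services join the default compose network, so Grafana can use http://prometheus:9090 as the data source URL.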
A connector's task is going into the FAILED state with this error:
org.apache.kafka.connect.errors.ConnectException: java.lang.NoClassDefFoundError
I am running a Kafka Connect cluster in distributed mode, using the Kafka (0.10.0.2.5) Connect from an Ambari deployment.
I put the Debezium MySQL connector on the classpath using export CLASSPATH=/path to connector/.
I loaded the connector configuration into Kafka Connect using the following command:
curl -i -X POST -H "Accept:application/json" \
  -H "Content-Type:application/json" http://localhost:8083/connectors/ \
  -d '{
    "name": "MYSQL_CONNECTOR",
    "config": {
      "connector.class": "io.debezium.connector.mysql.MySqlConnector",
      "database.hostname": "10.224.21.36",
      "database.port": "3306",
      "database.user": "root",
      "database.password": "shobhna",
      "database.server.id": "1",
      "database.server.name": "demo",
      "database.history.kafka.bootstrap.servers": "slnxhadoop04.noid.in:6669",
      "database.history.kafka.topic": "dbhistory.demo",
      "include.schema.changes": "true"
    }
  }'
Now, after checking the connector status, I am getting this error:
- {"name":"MYSQL_CONNECTOR","connector":{"state":"RUNNING","worker_id":"172.26.177.115:8083"},
"tasks":[{"state":"FAILED","trace":"org.apache.kafka.connect.errors.ConnectException:
java.lang.NoClassDefFoundError:
org/apache/kafka/clients/admin/AdminClient\n\tat
io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:218)\n\tat
io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:45)\n\tat
org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:137)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)\n\tat
org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)\n\tat
java.util.concurrent.Executors$RunnableAdapter.cal(Executors.java:511)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat
java.lang.Thread.run(Thread.java:745)\nCaused by:
java.lang.NoClassDefFoundError:
org/apache/kafka/clients/admin/AdminClient\n\tat
io.debezium.relational.history.KafkaDatabaseHistory.initializeStorage(KafkaDatabaseHistory.java:336)\n\tat
io.debezium.connector.mysql.MySqlSchema.intializeHistoryStorage(MySqlSchema.java:260)\n\tat
io.debezium.connector.mysql.MySqlTaskContext.initializeHistoryStorage(MySqlTaskContext.java:194)\n\tat
io.debezium.connector.mysql.MySqlConnectorTask.start(MySqlConnectorTask.java:126)\n\t...
9 more\nCaused by: java.lang.ClassNotFoundException:
org.apache.kafka.clients.admin.AdminClient \n\tat
java.net.URLClassLoader.findClass(URLClassLoader.java:381)\n\tat
java.lang.ClassLoader.loadClass(ClassLoader.java:424)\n\tat
sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)\n\tat
java.lang.ClassLoader.loadClass(ClassLoader.java:357)\n\t
It can't find a built-in Kafka class, not your connector:
NoClassDefFoundError:
org/apache/kafka/clients/admin/AdminClient
...
i am using kafka(0.10.0.2.5)
Make sure you are 1) running a Connect server version that matches your Kafka broker, and 2) using a connector built against that version of Connect.
For example, AdminClient only exists in Kafka 0.11+.
In recent HDP releases you get Kafka 1.1 (which is different from 0.11), and this is the version that the latest Debezium is built and tested against: https://debezium.io/docs/releases/
Debezium needs the AdminClient to create and register topic information, so I'm not sure it will work on an old version such as 0.10.
It's stated in the Kafka wiki that newer versions of the Connect server can communicate with older brokers, but whether the connector classes themselves support that broker version is another matter.
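One way to confirm which client version your Connect worker actually has on its classpath is to look for the kafka-clients JAR; on an Ambari/HDP install the directory is typically something like the following (the exact path is an assumption about your layout):
ls /usr/hdp/current/kafka-broker/libs/ | grep kafka-clients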
I am trying a very basic HTTP request with JMeter, but it always fails with the error below. A simple GET against Google works fine, but the internal servers do not:
java.net.NoRouteToHostException: No route to host (Host unreachable)
I can curl the same URL successfully and get a 200 response, so I am not sure whether it's JMeter or Java. The only unusual thing is that our internal servers resolve to IPv6 addresses, but I would not think that would be the problem.
Try adding the following line to the system.properties file (it lives in the "bin" folder of your JMeter installation):
java.net.preferIPv6Addresses=true
Or pass the property via the -D command-line argument, like:
jmeter -Djava.net.preferIPv6Addresses=true -n -t test.jmx -l result.jtl
References:
Java: Networking Properties
Configuring JMeter
Apache JMeter Properties Customization Guide
Overriding Properties Via The Command Line
I have installed Jolokia on a CentOS 7 machine and am trying to pull Kafka metrics using the Jolokia agent and integrate them with the Icinga monitoring tool via the Nagios plugin check_jmx4perl. Below are the configuration steps I have followed.
Step 1: Downloaded jolokia-jvm-1.3.4-agent.jar
Step 2: Copied to /home/usr/
Step 3: Provided permissions by issuing the command chmod a+x /home/usr/jolokia-jvm-1.3.4.jar
Step 4: Attached the agent by issuing the command export KAFKA_OPTS="$KAFKA_OPTS -javaagent:/home/usr/jolokia-jvm-1.3.4-agent.jar=host=*"
Step 5: Started ZooKeeper and Kafka in standalone mode and tried to fetch the list of topics, which works fine and prints the message:
INFO: No access restrictor found, access to all MBean is allowed
Jolokia: Agent started with URL http://0:0:0:0:0:0:0:0:8778/jolokia/
Step 6: Tested the Jolokia agent by issuing the command j4psh http://localhost:8778, which fails with:
Connection refused
I have also tried providing the IP address, but the issue remains the same. Do I need to add an entry for the host in the /etc/hosts file?
Not sure if you are the same OP as this question, but:
Perhaps you need to fully qualify the path of the jar. Mine looks like this and works:
export JOLOKIA_HOME=/libs/java/jolokia/1.3.7
export JOLOKIA_JAR=$JOLOKIA_HOME/jolokia-jvm-1.3.7-agent.jar
export KAFKA_OPTS="-javaagent:$JOLOKIA_JAR=port=7778,host=* $KAFKA_OPTS"
When I start Kafka in non-daemon mode, it prints this:
I> No access restrictor found, access to any MBean is allowed
Jolokia: Agent started with URL http://10.8.36.121:7778/jolokia/
Then I point my browser to http://localhost:7778/jolokia/search/: and I get:
{
  "request": {
    "mbean": "*:*",
    "type": "search"
  },
  "value": [
    "kafka.network:name=ResponseQueueTimeMs,request=ListGroups,type=RequestMetrics",
    "kafka.server:delayedOperation=topic,name=PurgatorySize,type=DelayedOperationPurgatory",
    "kafka.server:delayedOperation=Fetch,name=NumDelayedOperations,type=DelayedOperationPurgatory",
    "kafka.network:name=RemoteTimeMs,request=Heartbeat,type=RequestMetrics",
    <-- SNIP -->
    "kafka.network:name=LocalTimeMs,request=Offsets,type=RequestMetrics"
  ],
  "timestamp": 1504188793,
  "status": 200
}
j4psh also connects with:
j4psh http://localhost:7778/jolokia
Add to KAFKA_OPTS:
-javaagent:/usr/share/java/kafka/jolokia-jvm-1.6.0-agent.jar -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=localhost -Dcom.sun.management.jmxremote.rmi.port=9999 -Djava.security.auth.login.config=/var/private/sasl_acl/kafka.server.jaas.config
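Once the agent is loaded, a quick sanity check before trying j4psh is to hit the agent's version endpoint with curl (assuming the default agent port 8778, or whatever port you configured):
curl http://localhost:8778/jolokia/version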
So first, I want to say the only thing I have seen that addresses this issue is here: Spark 1.6.1 SASL. However, even after adding the configuration for Spark and YARN authentication, it is still not working. Below is my Spark configuration, submitted with spark-submit to a YARN cluster on Amazon EMR:
SparkConf sparkConf = new SparkConf().setAppName("secure-test");
sparkConf.set("spark.authenticate.enableSaslEncryption", "true");
sparkConf.set("spark.network.sasl.serverAlwaysEncrypt", "true");
sparkConf.set("spark.authenticate", "true");
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
sparkConf.set("spark.kryo.registrator", "org.nd4j.Nd4jRegistrator");
try {
    sparkConf.registerKryoClasses(new Class<?>[]{
        Class.forName("org.apache.hadoop.io.LongWritable"),
        Class.forName("org.apache.hadoop.io.Text")
    });
} catch (Exception e) {}
sparkContext = new JavaSparkContext(sparkConf);
sparkContext.hadoopConfiguration().set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
sparkContext.hadoopConfiguration().set("fs.s3a.enableServerSideEncryption", "true");
sparkContext.hadoopConfiguration().set("spark.authenticate", "true");
Note: I added spark.authenticate to the sparkContext's Hadoop configuration in code instead of in core-site.xml (which I am assuming I can do, since other settings set this way work as well).
Looking here: https://github.com/apache/spark/blob/master/common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java it seems like both spark.authenticate settings are necessary. When I run this application, I get the following stack trace:
17/01/03 22:10:23 INFO storage.BlockManager: Registering executor with local external shuffle service.
17/01/03 22:10:23 ERROR client.TransportClientFactory: Exception while bootstrapping client after 178 ms
java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown message type: -22
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:67)
at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:71)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
at java.lang.Thread.run(Thread.java:745)
In Spark's docs, it says
For Spark on YARN deployments, configuring spark.authenticate to true will automatically handle generating and distributing the shared secret. Each application will use a unique shared secret.
which seems wrong based on the comments in the YARN file above. Even with troubleshooting, I am still lost on what I should do to get SASL working. Am I missing something obvious that is documented somewhere?
So I finally figured it out. The previous Stack Overflow thread was technically correct: I needed to add spark.authenticate to the YARN configuration. It may be possible to do this in code, but I couldn't figure out how, and at a high level it makes sense that this belongs in the cluster configuration. I will post my configuration below in case anyone else runs into this issue in the future.
First, I used an AWS EMR configurations file (for example, when using the AWS CLI: aws emr create-cluster --configurations file://yourpathhere.json).
Then I added the following JSON to the file:
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.authenticate": "true",
      "spark.authenticate.enableSaslEncryption": "true",
      "spark.network.sasl.serverAlwaysEncrypt": "true"
    }
  },
  {
    "Classification": "core-site",
    "Properties": {
      "spark.authenticate": "true"
    }
  }
]
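As a rough sketch, the file can then be passed when creating the cluster along these lines (the cluster name, release label, and instance settings are placeholders, not values from the original setup):
aws emr create-cluster --name secure-test \
  --release-label <EMR-RELEASE> \
  --applications Name=Spark \
  --instance-type <INSTANCE-TYPE> --instance-count 3 \
  --use-default-roles \
  --configurations file://yourpathhere.json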
I got the same error message on Spark on Dataproc (Google Cloud Platform) after I added configuration options for Spark network encryption.
I initially created the Dataproc cluster with the following command.
gcloud dataproc clusters create test-encryption --no-address \
--service-account=<SERVICE-ACCOUNT> \
--zone=europe-west3-c --region=europe-west3 \
--subnet=<SUBNET> \
--properties 'spark:spark.authenticate=true,spark:spark.network.crypto.enabled=true'
The solution was to additionally set the configuration 'yarn:spark.authenticate=true'. A working Dataproc cluster with Spark RPC encryption can therefore be created as follows:
gcloud dataproc clusters create test-encryption --no-address \
--service-account=<SERVICE-ACCOUNT> \
--zone=europe-west3-c --region=europe-west3 \
--subnet=<SUBNET> \
--properties 'spark:spark.authenticate=true,spark:spark.network.crypto.enabled=true,yarn:spark.authenticate=true'
I verified the encryption with ngrep, which I installed on the master node as follows:
sudo apt-get update
sudo apt-get install ngrep
I then ran ngrep on an arbitrary port, 20001:
sudo ngrep port 20001
If you then run a Spark job with the following configuration properties, you can see the encrypted communication between the driver and worker nodes (see the example submit command after the properties).
spark.driver.port=20001
spark.blockManager.port=20002
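For instance, a minimal sketch of submitting a job with those ports pinned (the class and JAR names are placeholders):
spark-submit \
  --conf spark.driver.port=20001 \
  --conf spark.blockManager.port=20002 \
  --class <MAIN-CLASS> <APPLICATION-JAR>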
Note: I would always advise also enabling Kerberos on Dataproc to secure authentication for Hadoop, YARN, etc. This can be achieved with the --enable-kerberos flag in the cluster creation command.