How can I produce messages with the Kafka 0.8.2 API in Java?

I'm trying to work with the Kafka API in Java. I'm using the following Maven dependency:
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.8.2.0</version>
</dependency>
I'm having trouble connecting to a remote Kafka server.
I changed the port attribute in Kafka's 'server.properties' file to 8080.
I can start both ZooKeeper and the Kafka server without problems.
I can also use the console producer and consumer applications that came with the Kafka download (Scala 2.10 version).
I'm using the following client code to create a remote KafkaProducer:
Properties propsProducer = new Properties();
propsProducer.put("bootstrap.servers", "172.xx.xx.xxx:8080");
propsProducer.put("key.serializer", org.apache.kafka.common.serialization.ByteArraySerializer.class);
propsProducer.put("value.serializer", org.apache.kafka.common.serialization.ByteArraySerializer.class);
propsProducer.put("topic.metadata.refresh.interval.ms", "0");
KafkaProducer<byte[], byte[]> m_kafkaProducer = new KafkaProducer<byte[], byte[]>(propsProducer);
Once I've created the producer, I can run the following line and get valid topic info back, provided strTopic is an existing topic name:
List<PartitionInfo> partitionInfo = m_kafkaProducer.partitionsFor(strTopic);
When I try to send a message, I do the following:
ProducerRecord<byte[], byte[]> prMessage = new ProducerRecord<byte[],byte[]>(strTopic, strMessage.getBytes());
RecordMetadata futureData = m_kafkaProducer.send(prMessage).get();
The call to send() blocks indefinitely, and when I manually terminate the process, I see an "ERROR Closing socket because of error" message on the Kafka server (IOException, Connection Reset by Peer).
Also, it's worth noting that the host.name, advertised.host.name, and advertised.port properties are all still commented out in the 'server.properties' file. Oh, and if I change the line:
propsProducer.put("bootstrap.servers", "172.xx.xx.xxx:8080");
to
propsProducer.put("bootstrap.servers", "127.0.0.1:8080");
and run it on the same machine where the Kafka server is installed, it works, but I need it to work remotely.
Appreciate any help and if I can clarify at all let me know.
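A side note on the hang itself: send(...).get() waits on a java.util.concurrent.Future with no deadline of its own, so a broker that never acknowledges looks like a frozen program. The sketch below shows one way to bound that wait using only the JDK. BoundedFutureWait and getOrFail are hypothetical names, and a never-completed CompletableFuture stands in for the future returned by the producer. It only limits the wait on the future; it does not limit any blocking that may happen inside send() itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedFutureWait {

    /** Waits at most timeoutMs for the future and turns a timeout into an unchecked exception. */
    static <T> T getOrFail(Future<T> future, long timeoutMs) {
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // stop waiting for a result that is not coming
            throw new IllegalStateException("No result within " + timeoutMs + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            throw new IllegalStateException("Interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Operation failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Stand-in (assumption) for the Future returned by the producer's send() call.
        Future<String> neverAcknowledged = new CompletableFuture<>();
        try {
            getOrFail(neverAcknowledged, 200);
        } catch (IllegalStateException e) {
            System.out.println("Gave up: " + e.getMessage());
        }
    }
}
```

With the real producer, the same helper would wrap the future returned by send(), so an unreachable broker surfaces as an exception instead of an endless wait.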

After lots of digging, I decided to implement the example found here: Kafka Producer Example. I shortened the code and didn't implement a partitioner class. I updated my pom with the dependency listed and I was still having the same issue. Ultimately, I made some configuration changes and everything worked.
The final piece of the puzzle was defining the Kafka server in /etc/hosts of both the server and the client machines. I added the following to both files.
172.xx.xx.xxx serverHost1
Again, the x's are just masks. Then, I set the advertised.host.name in the server.properties file to serverHost1. NOTE: I got that IP after running an ifconfig on the server machine.
I changed the line
propsProducer.put("metadata.broker.list", "172.xx.xx.xxx:8080");
to
propsProducer.put("metadata.broker.list", "serverHost1:8080");
The Kafka API didn't like the fact that I was defining an IP as a string. Instead, it was looking up the IP in the /etc/hosts file, although the documentation for advertised.host.name says:
"Hostname the broker will advertise to producers and consumers. If not set, it uses the value for "host.name" if configured. Otherwise, it will use the value returned from java.net.InetAddress.getCanonicalHostName()."
If the name is not defined in /etc/hosts on the client machine, that call just returns the IP in string form, which is what I was using previously; otherwise it returns the name paired with the IP (serverHost1 in my case). Also, I never set the value of host.name.
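The fix above appears to boil down to the client being able to resolve whatever host name the broker advertises. A minimal sketch of a client-side pre-check, using only the JDK: it builds the producer settings with the host name instead of the raw IP and tests whether that name resolves on the client machine. The class and method names are hypothetical, the property keys are the bootstrap.servers-style keys from the question (the older producer uses metadata.broker.list instead), and passing the serializer class names as strings is an assumption about what the client accepts.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Properties;

final class ProducerHostCheck {

    /** Builds producer settings that point at the broker by host name rather than raw IP. */
    static Properties producerProps(String brokerHost, int brokerPort) {
        Properties props = new Properties();
        props.put("bootstrap.servers", brokerHost + ":" + brokerPort);
        // Assumption: class names given as strings are accepted for the serializer settings.
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        return props;
    }

    /** Returns true if this machine can resolve the host name, e.g. via /etc/hosts or DNS. */
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "serverHost1"; // the name added to /etc/hosts in the answer above
        if (!canResolve(host)) {
            System.err.println("Cannot resolve " + host + "; add it to /etc/hosts on this machine");
            return;
        }
        Properties props = producerProps(host, 8080);
        System.out.println("bootstrap.servers=" + props.getProperty("bootstrap.servers"));
    }
}
```

Running the check on the client machine before creating the producer separates name-resolution problems from Kafka configuration problems.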

Related

Problems using activemq-client jar vs activemq-all jar

I need to use activemq-client rather than the roll-up activemq-all JAR because the roll-up JAR contains different versions of other libraries we use.
I'm using Maven to manage dependencies; the client jar pulls in:
activemq-client (5.15.8)
slf4j-api 1.7.25
geronimo-jms_1.1_spec (1.1.1)
hawtbuf (1.11)
geronimo-j2ee-management_1.1_spec (1.0.1)
The all jar is just activemq-all (5.15.8)
Using this code, with the activemq-all jar, I can connect and start receiving messages. At the createConnection() call, I get a log message "Successfully connected to ..."
Using the activemq-client jar, it hangs at the createSession() call (and outputs a "failed after 10 attempts, will continue trying" message). I do not get the "Successfully connected to ..." message.
ConnectionFactory factory = new ActiveMQConnectionFactory(user, pass, url);
Connection AMQconn = factory.createConnection();
Session AMQsess = AMQconn.createSession(false, Session.AUTO_ACKNOWLEDGE);
Queue queue = AMQsess.createQueue(queueName);
MessageConsumer AMQconsumer = AMQsess.createConsumer(queue);
I assume I'm missing a dependency somewhere, but I'm not getting any NoClassDefFoundError exceptions or the like.
(I also tried ActiveMQ version 5.15.9, but our server is 5.15.8, so I'm sticking with that.)
The bigger picture (why the client jar vs. the roll-up jar): I need to connect to a HornetQ and an AMQ broker in the same process, and breaking out the individual jars is my attempt at fixing conflicting versions of things in the roll-up jars.
The question omits the URI, but the comments seem to indicate that the user is trying to connect via a URI of the form auto://localhost:61616. That would be the problem: the Auto transport makes no sense on the client end, because its purpose is to detect on the broker side which protocol a connecting client is using and switch to that protocol automatically. The Auto transport lets the broker support multiple protocols on a single open port that clients connect to.
The ActiveMQ JMS client always uses the OpenWire protocol (that's what it was built to do), so the client URI should be of the form tcp://, ssl://, failover://, etc.
If you include the ActiveMQ broker jar, some special convenience classes kick in that map URIs whose schemes don't make sense on the client, such as nio, nio+ssl, or auto. They are not included in the client-only jar because they aren't intended for use on the client side.
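To catch this kind of mistake before the client hangs, one option is to check the URI scheme up front. The sketch below is plain JDK code and not part of ActiveMQ; BrokerUriCheck and its methods are hypothetical names, and the set of broker-only schemes contains just the three named in the answer above (there may be others).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class BrokerUriCheck {

    // Schemes the answer above describes as broker-side only (possibly incomplete).
    private static final Set<String> BROKER_ONLY_SCHEMES =
            new HashSet<>(Arrays.asList("auto", "nio", "nio+ssl"));

    /** Extracts the scheme, i.e. everything before the first ':' of the URL. */
    static String schemeOf(String brokerUrl) {
        int colon = brokerUrl.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("No scheme in: " + brokerUrl);
        }
        return brokerUrl.substring(0, colon).toLowerCase(Locale.ROOT);
    }

    /** Returns true if the URL uses a scheme that is meant for the broker side only. */
    static boolean isBrokerOnlyScheme(String brokerUrl) {
        return BROKER_ONLY_SCHEMES.contains(schemeOf(brokerUrl));
    }

    public static void main(String[] args) {
        String[] urls = {
            "auto://localhost:61616",
            "tcp://localhost:61616",
            "failover:(tcp://localhost:61616)"
        };
        for (String url : urls) {
            String verdict = isBrokerOnlyScheme(url) ? "broker-side only" : "usable from a client";
            System.out.println(url + " -> " + verdict);
        }
    }
}
```

A check like this could run just before the ActiveMQConnectionFactory is created, so a misconfigured URL fails fast with a clear message.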

Kafka Connect implementation errors

I was running through the tutorial here: http://kafka.apache.org/documentation.html#introduction
When I get to "Step 7: Use Kafka Connect to import/export data" and attempt to start two connectors I am getting the following errors:
ERROR Failed to flush WorkerSourceTask{id=local-file-source-0}, timed out while waiting for producer to flush outstanding messages, 1 left
ERROR Failed to commit offsets for WorkerSourceTask
Here is the portion of the tutorial:
Next, we'll start two connectors running in standalone mode, which means they run in a single, local, dedicated process. We provide three configuration files as parameters. The first is always the configuration for the Kafka Connect process, containing common configuration such as the Kafka brokers to connect to and the serialization format for data. The remaining configuration files each specify a connector to create. These files include a unique connector name, the connector class to instantiate, and any other configuration required by the connector.
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
I have spent some time looking for a solution, but was unable to find anything useful. Any help is appreciated.
Thanks!
The reason I was getting this error was that the first server I created using config/server.properties was not running. I assume that because it was the leader for the topic, the messages could not be flushed and the offsets could not be committed.
Once I started the Kafka server using config/server.properties, the issue was resolved.
You need to start the Kafka server and ZooKeeper before running Kafka Connect.
Run the commands from "Step 2: Start the server" first:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
From here: https://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3CCAK0BMEpgWmL93wgm2jVCKbUT5rAZiawzOroTFc_A6Q=GaXQgfQ#mail.gmail.com%3E
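Since the errors come from running Connect before its dependencies are up, a quick pre-flight check can help. The sketch below uses only the JDK and simply tests whether something accepts TCP connections on a given host and port. PortCheck is a hypothetical name; port 9092 is the broker port used elsewhere in this thread, and 2181 is assumed to be the ZooKeeper client port from the quickstart configuration. An open port does not prove the service is healthy, only that a process is listening.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or unreachable
        }
    }

    public static void main(String[] args) {
        // Assumed quickstart defaults: ZooKeeper on 2181, Kafka broker on 9092.
        System.out.println("ZooKeeper listening: " + isListening("localhost", 2181, 1000));
        System.out.println("Kafka listening:     " + isListening("localhost", 9092, 1000));
    }
}
```

If either line prints false, start that service before launching connect-standalone.sh.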
You need to start ZooKeeper and the Kafka server before running that line.
Start ZooKeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties
Start multiple Kafka servers:
bin/kafka-server-start.sh config/server.properties
bin/kafka-server-start.sh config/server-1.properties
bin/kafka-server-start.sh config/server-2.properties
Start the connectors:
bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Then you will see some lines written to test.sink.txt:
foo
bar
You can then start the consumer to check it:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"foo"}
{"schema":{"type":"string","optional":false},"payload":"bar"}
If you configure your Kafka broker with a hostname such as my.sandbox.com, make sure that you modify config/connect-standalone.properties accordingly:
bootstrap.servers=my.sandbox.com:9092
On Hortonworks HDP the default port is 6667, hence the setting is
bootstrap.servers=my.sandbox.com:6667
If Kerberos is enabled you will need the following settings as well (without SSL):
security.protocol=PLAINTEXTSASL
producer.security.protocol=PLAINTEXTSASL
producer.sasl.kerberos.service.name=kafka
consumer.security.protocol=PLAINTEXTSASL
consumer.sasl.kerberos.service.name=kafka

Get Accumulo instance name

I want to use GeoMesa (a GIS extension of Accumulo) and virtualized it using Docker, just like this repo. Now I want to connect to the Accumulo instance from Java using:
Instance i = new ZooKeeperInstance("docker_instance", "zkIP:port");
Connector conn = i.getConnector(user, new PasswordToken(password));
The connection is never established and the call hangs (just like in this question). I can connect to the ZooKeeper instance using
./zkCli.sh -server ip:port
So I guess the instance_name is wrong. I used the one noted in the repo linked above. However, I don't know where or how to check which instance_name is needed.
To make my problem reproducible, I set up a DigitalOcean server with all necessary dependencies and Accumulo. I tested that the connection to ZooKeeper is possible using zkCli and checked the credentials using the Accumulo shell on the server.
Instance i = new ZooKeeperInstance("DIGITAL_OCEAN","46.101.199.216:2181");
// WARN org.apache.accumulo.core.client.ClientConfiguration - Found no client.conf in default paths. Using default client configuration values.
System.out.println("This is reached");
Connector conn = i.getConnector("root", new PasswordToken("mypassw"));
System.out.println("This is not reached");
As a troubleshooting step, you may be able to extract the instance name by using HdfsZooInstance.getInstance().getInstanceName() or by connecting directly to ZooKeeper and listing the instance names with ls /accumulo/instances/
There are multiple easy ways to get the instance_name: either look at the top of the Accumulo status page, as elserj noted in the comments, or use zkCli to connect to ZooKeeper and run ls /accumulo/instances/ as Christopher answered.
However, I could not manage to connect to Accumulo using the ordinary Java Connector. I did manage to connect to Accumulo using the proxy settings, which is a valid solution for me, even though I still would have liked to find the problem.

kafka + zookeeper remote = error

I am trying to install a Kafka and ZooKeeper instance on a remote server. I only need one node of each, because I only want to provide a remote Kafka for test purposes.
Kafka and ZooKeeper are running from the Apache Kafka tarball you can find there (v0.0.9), inside a Docker image.
I am trying to consume and produce using the provided scripts, and to produce using my own Java application. Everything works fine if Kafka and ZK are installed on the local server.
Here is the error I get while trying to produce :
BrokerPartitionInfo:83 - Error while fetching metadata [{TopicMetadata for topic RSS ->
No partition metadata for topic RSS due to kafka.common.LeaderNotAvailableException}] for topic [RSS]: class kafka.common.LeaderNotAvailableException
Kafka properties tested
First :
borker.id=0
port=9092
host.name=<external-ip>
zookeeper.connect=localhost:<PORT>
Second:
borker.id=0
port=9092
host.name=<external-ip>
zookeeper.connect=<external-ip>:<PORT>
Third:
borker.id=0
port=9092
host.name=<external-ip>
zookeeper.connect=<external-ip>:<PORT>
advertised.host.name=<external-ip>
advertised.host.port=<external-ip>
Last:
borker.id=0
port=9092
host.name=</etc/host name>
zookeeper.connect=<external-ip>:<PORT>
advertised.host.name=<external-ip>
advertised.host.port=<external-ip>
Here is my "/etc/hosts"
127.0.0.1 kafka kafka
127.0.0.1 localhost
I followed the Getting Started guide, which, if I understood correctly, describes a localhost / single-server configuration. I cannot understand what I have to do to get this to work with remote calls...
Thanks for your help !
EDIT 1
host.name=localhost
advertised.host.name=politik.cm-cloud.fr
This seems to allow a local consumer and producer (on the server). But if we try the same from a remote server, we get
[2015-12-09 12:44:10,826] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.NoRouteToHostException: No route to host
The error does not look like a connectivity problem with ZooKeeper / Kafka.
Just follow the instructions in the "quickstart" from http://kafka.apache.org/
BrokerPartitionInfo:83 - Error while fetching metadata [{TopicMetadata for topic RSS ->
Additionally, the error indicates there is no partition info, i.e. the topic has not been created yet. Try creating the topic first and then produce/consume. When producing to a nonexistent topic, Kafka creates the topic depending on auto.create.topics.enable in server.properties, but when working remotely it is better to create topics explicitly rather than rely on auto-creation.

Storm BasicDRPC client execute

I am a beginner Storm user. I am trying out the DRPC server in remote mode. I got the DRPC server started and configured the DRPC server location in the yaml file. But I don't understand what the DRPC client code should look like:
https://github.com/nathanmarz/storm-starter/blob/master/src/jvm/storm/starter/BasicDRPCTopology.java
Here is what I did:
Launched the DRPC server(s) (storm drpc command)
Configured the locations of the DRPC servers (edited the yaml file and added the local host name)
Submitted the DRPC topologies to the Storm cluster. It looks like the topology is up and running.
But how do I get a client to call/execute on this topology? Do I need something like this? https://github.com/mykidong/storm-finagle-drpc-client/blob/master/src/main/java/storm/finagle/drpc/StormDrpcClient.java I tried, but I keep getting this error:
storm/starter/DRPCClient.java:[68,18] error: execute(String,String) in DRPCClient cannot implement execute(String,String) in Iface
[ERROR] overridden method does not throw TException
What am I missing here? Thanks.
Here is the Storm DRPC documentation.
It may be useful for understanding a DRPC call :)
The client looks like the following code:
DRPCClient client = new DRPCClient("drpc-host", 3772);
String result = client.execute("reach", "http://twitter.com");
This creates a client connection to the DRPC server host drpc-host on port 3772.
The DRPCClient calls the "reach" function with the argument "http://twitter.com"
and returns a string, which is stored in result.
