Update TTL for a particular topic in Kafka using Java

I need to update the TTL for a topic so records stay in it for 10 days. I have to do this for one particular topic only, leaving all other topics' TTLs at their current configuration. I have to do it in Java because I am already pushing records to Kafka through Java. These are the properties I set for producing to Kafka:
Properties props = new Properties();
props.put("bootstrap.servers", KAFKA_SERVERS);
props.put("acks", ACKS);
props.put("retries", RETRIES);
props.put("linger.ms", new Integer(LINGER_MS));
props.put("buffer.memory", new Integer(BUFFER_MEMORY));
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

You can do that using the AdminClient. The following snippet gets the current topic configuration (just for testing) and then updates the retention.ms config on the topic named "test":
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
AdminClient adminClient = AdminClient.create(props);
ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, "test");
// get the current topic configuration
DescribeConfigsResult describeConfigsResult =
adminClient.describeConfigs(Collections.singleton(resource));
Map<ConfigResource, Config> config = describeConfigsResult.all().get();
System.out.println(config);
// create a new entry for updating the retention.ms value on the same topic
ConfigEntry retentionEntry = new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, "50000");
Map<ConfigResource, Config> updateConfig = new HashMap<ConfigResource, Config>();
updateConfig.put(resource, new Config(Collections.singleton(retentionEntry)));
AlterConfigsResult alterConfigsResult = adminClient.alterConfigs(updateConfig);
alterConfigsResult.all().get(); // wait for the update to be applied before re-reading the config
describeConfigsResult = adminClient.describeConfigs(Collections.singleton(resource));
config = describeConfigsResult.all().get();
System.out.println(config);
adminClient.close();
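For the 10-day retention asked about above, the value for retention.ms would be 864000000 (10 * 24 * 60 * 60 * 1000 ms). On newer clients (Kafka 2.3+), alterConfigs is deprecated in favour of incrementalAlterConfigs, which only touches the entries you pass and leaves everything else as it is. A rough sketch of the same update with that API (topic name "test" assumed, as above):
// Sketch (assumes Kafka clients 2.3+): set only retention.ms on topic "test"
// to 10 days, leaving every other topic and config value untouched.
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
try (AdminClient adminClient = AdminClient.create(props)) {
    ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, "test");
    long tenDaysMs = 10L * 24 * 60 * 60 * 1000; // 864000000
    AlterConfigOp setRetention = new AlterConfigOp(
            new ConfigEntry(TopicConfig.RETENTION_MS_CONFIG, Long.toString(tenDaysMs)),
            AlterConfigOp.OpType.SET);
    Map<ConfigResource, Collection<AlterConfigOp>> updates = new HashMap<>();
    updates.put(resource, Collections.singleton(setRetention));
    adminClient.incrementalAlterConfigs(updates).all().get(); // wait until the broker applies it
}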

Related

Huge delay in kafka stream processing

I have a stream topology as follows:
Topology topology = new Topology();
//WS connection processor
topology.addSource(WS_CONNECTION_SOURCE, new StringDeserializer(), new WebSocketConnectionEventDeserializer(),
        utilService.getTopicByType(TopicType.CONNECTION_EVENTS_TOPIC))
    .addProcessor(SESSION_PROCESSOR, WSUserSessionProcessor::new, WS_CONNECTION_SOURCE)
    .addStateStore(sessionStoreBuilder, SESSION_PROCESSOR)
    .addSink(WS_STATUS_SINK, utilService.getTopicByType(TopicType.ONLINE_STATUS_TOPIC),
        stringSerializer, stringSerializer, SESSION_PROCESSOR)
    // WS session routing
    .addSource(WS_ROUTING_BY_SESSION_SOURCE, new StringDeserializer(), new StringDeserializer(),
        utilService.getTopicByType(TopicType.NOTIFICATION_TOPIC))
    .addProcessor(WS_ROUTING_BY_SESSION_PROCESSOR, WSSessionRoutingProcessor::new, WS_ROUTING_BY_SESSION_SOURCE)
    .addStateStore(userConnectedNodesStoreBuilder, WS_ROUTING_BY_SESSION_PROCESSOR, SESSION_PROCESSOR);
topology.addGlobalStore(serviceDiscoveryGlobalStoreBuilder, "WS_SERVICE_DISCOVERY", new StringDeserializer(),
        new ServiceDiscoveryEventDeserializer(), "ws-service-discovery", "WS_SERVICE_DISCOVERY_PROCESSOR",
        ServiceDiscoveryProcessor::new);
While analyzing the logs, I see a huge delay before messages reach the WS_ROUTING_BY_SESSION_PROCESSOR process method, on the order of 10+ seconds.
This is the configuration used in streams:
Properties properties = new Properties();
properties.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaServers);
properties.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");
properties.setProperty(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "60000");
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
properties.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, "60000");
properties.put(ConsumerConfig.DEFAULT_API_TIMEOUT_MS_CONFIG, "60000");
NOTIFICATION_TOPIC has 10 partitions and the number of stream threads is 10.
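For reference, the stream-thread count mentioned above does not appear in the snippet; assuming it is set through the standard Streams properties, it would look roughly like this (the application id is a placeholder, and num.stream.threads = 10 matches the partition count stated above):
// Sketch only: how the thread count would typically be configured.
Properties properties = new Properties();
properties.setProperty(StreamsConfig.APPLICATION_ID_CONFIG, "ws-routing-app"); // hypothetical id, not from the question
properties.setProperty(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaServers);
properties.setProperty(StreamsConfig.NUM_STREAM_THREADS_CONFIG, "10"); // one thread per partition of NOTIFICATION_TOPIC
KafkaStreams streams = new KafkaStreams(topology, properties);
streams.start();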

Can't get KafkaProducer/KafkaConsumer to work in Scala

I'm trying to create a simple KafkaProducer and KafkaConsumer so I can send data to a topic on a broker, and then verify that the data was received. Below are the two methods I use to define my consumer and producer, and how I'm sending the message. The send method takes at least 20 seconds to complete, and as far as I can tell the consumer.poll method never actually finishes, though the longest I've left it is 10 minutes.
Does anyone have a suggestion as to what I'm doing wrong? Is there some property for the producer/consumer that I'm not setting up correctly? Those properties are copied directly from the docs, so I don't understand why they won't work.
KafkaProducer docs
KafkaConsumer docs
"verify we can send to producer" in {
val consumer = createKafkaConsumer("address:9002")
val producer = createKafkaProducer("address:9002")
val message = "I am a message"
val record = new ProducerRecord[String, String]("myTopic", message)
producer.send(record)
TimeUnit.SECONDS.sleep(5)
val records = consumer.poll(5000)
println("records: "+records)
consumer.close()
}
def createKafkaProducer(kafka: String): KafkaProducer[String,String] = {
val props = new Properties()
props.put("bootstrap.servers", kafka)
props.put("acks", "all")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
new KafkaProducer[String,String](props)
}
def createKafkaConsumer(kafka: String): KafkaConsumer[String, String] = {
val props = new Properties()
props.put("bootstrap.servers", kafka)
props.put("group.id", "test")
props.put("enable.auto.commit", "true")
props.put("auto.commit.interval.ms", "1000")
props.put("session.timeout.ms", "30000")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(Collections.singletonList("myTopic"))
consumer
}
Edit: I've updated my code so that I now get the response from the send method, and it turns out it times out with org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
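For anyone hitting the same symptom: send() returns a Future, and passing a callback (or blocking on the Future) surfaces producer errors like this TimeoutException right away instead of letting them fail silently. A minimal sketch, shown in Java (the Scala call is the same), using the producer and record defined above:
// Sketch: attach a callback so producer errors surface immediately.
producer.send(record, (metadata, exception) -> {
    if (exception != null) {
        exception.printStackTrace(); // e.g. TimeoutException: Failed to update metadata
    } else {
        System.out.println("written to " + metadata.topic() + "-"
                + metadata.partition() + "@" + metadata.offset());
    }
});
producer.flush(); // force the send to complete before the test moves on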
Turns out I had a DNS issue that meant I wasn't actually connecting to the broker. Fixing this allowed the messages to go through; there was nothing wrong with the config.

Apache Kafka LEADER_NOT_AVAILABLE

I'm running into an issue with Apache Kafka that I don't understand. I subscribe to a topic in my broker called "topic-received". This is the code:
protected String readResponse(final String idMessage) {
if (props != null) {
kafkaClient = new KafkaConsumer<>(props);
logger.debug("Subscribed to topic-received");
kafkaClient.subscribe(Arrays.asList("topic-received"));
logger.debug("Waiting for reading : topic-received");
ConsumerRecords<String, String> records =
kafkaClient.poll(kafkaConfig.getRead_timeout());
if (records != null) {
for (ConsumerRecord<String, String> record : records) {
logger.debug("Resultado devuelto : "+record.value());
return record.value();
}
}
}
return null;
}
While this is happening, I send a message to "topic-received" from another point. The code is the following:
private void sendMessageToKafkaBroker(String idTopic, String value) {
Producer<String, String> producer = null;
try {
producer = new KafkaProducer<String, String>(mapProperties());
ProducerRecord<String, String> producerRecord = new
ProducerRecord<String, String>("topic-received", value);
producer.send(producerRecord);
logger.info("Sended value "+value+" to topic-received");
} catch (ExceptionInInitializerError eix) {
eix.printStackTrace();
} catch (KafkaException ke) {
ke.printStackTrace();
} finally {
if (producer != null) {
producer.close();
}
}
}
The first time I try with topic "topic-received", I get a warning like this:
WARN 13164 --- [nio-8085-exec-3] org.apache.kafka.clients.NetworkClient : Error while fetching metadata with correlation id 1 : {topic-received=LEADER_NOT_AVAILABLE}
But if I try again with the same topic "topic-received", it works fine and no warning is shown. Anyway, that doesn't help me, because I have to listen to and send to a new topic each time (referenced by a String identifier, e.g. 12Erw45-2345Saf-234DASDFasd).
Searching for LEADER_NOT_AVAILABLE on Google, some people suggest adding the following lines to server.properties:
host.name=127.0.0.1
advertised.port=9092
advertised.host.name=127.0.0.1
But it's not working for me (I don't know why).
I have also tried creating the topic before all this with the following code:
private void createTopic(String idTopic) {
String zookeeperConnect = "localhost:2181";
ZkClient zkClient = new ZkClient(zookeeperConnect,10000,10000,
ZKStringSerializer$.MODULE$);
ZkUtils zkUtils = new ZkUtils(zkClient, new
ZkConnection(zookeeperConnect),false);
if(!AdminUtils.topicExists(zkUtils,idTopic)) {
AdminUtils.createTopic(zkUtils, idTopic, 2, 1, new Properties(),
null);
logger.debug("Created topic "+idTopic+" by super user");
}
else{
logger.debug("topic "+idTopic+" already exists");
}
}
No error, but still, it stays listening till the timeout.
I have reviewed the broker properties to see if anything there helps, but I haven't found anything clear enough. The props I use for reading are:
props = new Properties();
props.put("bootstrap.servers", kafkaConfig.getBootstrap_servers());
props.put("key.deserializer", kafkaConfig.getKey_deserializer());
props.put("value.deserializer", kafkaConfig.getValue_deserializer());
props.put("key.serializer", kafkaConfig.getKey_serializer());
props.put("value.serializer", kafkaConfig.getValue_serializer());
props.put("group.id",kafkaConfig.getGroupId());
and , for sending ...
Properties props = new Properties();
props.put("bootstrap.servers", kafkaConfig.getHost() + ":" +
kafkaConfig.getPort());
props.put("group.id", kafkaConfig.getGroup_id());
props.put("enable.auto.commit", kafkaConfig.getEnable_auto_commit());
props.put("auto.commit.interval.ms",
kafkaConfig.getAuto_commit_interval_ms());
props.put("session.timeout.ms", kafkaConfig.getSession_timeout_ms());
props.put("key.deserializer", kafkaConfig.getKey_deserializer());
props.put("value.deserializer", kafkaConfig.getValue_deserializer());
props.put("key.serializer", kafkaConfig.getKey_serializer());
props.put("value.serializer", kafkaConfig.getValue_serializer());
Any clue? Why is the only way I can consume messages from the topic to repeat the request after an error?
Thanks in advance
This happens when trying to produce messages to a topic that doesn't exist.
PLEASE NOTE: in some Kafka installations the broker automatically creates the topic when it doesn't exist (auto.create.topics.enable=true), which explains why you see the issue only once, at the very beginning.
This error appears when your topic name doesn't exist.
To list all topics, execute the following command:
kafka-topics --list --zookeeper localhost:2181
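The same check can also be done from Java with the AdminClient instead of the ZkUtils approach shown in the question; a rough sketch (broker address assumed):
// Sketch: verify the topic exists before producing/consuming, and create it if missing.
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
try (AdminClient admin = AdminClient.create(props)) {
    Set<String> topics = admin.listTopics().names().get();
    if (!topics.contains("topic-received")) {
        // 2 partitions, replication factor 1, matching the AdminUtils call above
        admin.createTopics(Collections.singleton(new NewTopic("topic-received", 2, (short) 1)))
             .all()
             .get(); // wait until the topic is registered before producing to it
    }
}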

Example to read from KafkaSpout and write same back to Kafka Bolt

I am trying to write Storm-based code that reads messages from one topic and writes them back to another topic. The input topic has data in ProtoBuf format and the output should be in JSON format. I am not able to get it working.
This is the code that builds the topology:
Config conf = new Config();
//set producer properties.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9093");
props.put("request.required.acks", "1");
props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
conf.put("kafka.broker.config", props);
conf.put(KafkaBolt.TOPIC, "out-storm");
KafkaBolt bolt = new KafkaBolt()
.withProducerProperties(props)
.withTopicSelector(new DefaultTopicSelector("out-storm")).withTupleToKafkaMapper(new FieldNameBasedTupleToKafkaMapper<String, String>());
BrokerHosts hosts = new ZkHosts("localhost:2181");
SpoutConfig spoutConfig = new SpoutConfig(hosts, "incoming-server", "/" + "incoming-server",
UUID.randomUUID().toString());
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());
KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("kafka-spout", kafkaSpout);
builder.setBolt("lookup-bolt", new ReportBolt(),4).shuffleGrouping("kafka-spout");
builder.setBolt("kafka-producer-spout", bolt).shuffleGrouping("lookup-bolt");
LocalCluster cluster = new LocalCluster();
Config config = new Config();
config.setDebug(true);
config.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 1);
config.put("kafka.broker.config", props);
config.put(KafkaBolt.TOPIC, "out-storm");
cluster.submitTopology("KafkaStormSample", config, builder.createTopology());
Thread.sleep(1000000);
In the report bolt I have done this:
System.out.println("HELLO " + input);
JSONObject jo= new JSONObject();
for (String f:input.getFields()){
jo.put(f, input.getValueByField(f));
}
collector.ack(input);
List<Object> list = new ArrayList<Object>();
list.add(jo);
collector.emit(list);
When I start the topology, I get this error:
5207 [main] WARN o.a.s.d.nimbus - Topology submission exception. (topology name='KafkaStormSample') #error {
:cause nil
:via
[{:type org.apache.storm.generated.InvalidTopologyException
:message nil
:at [org.apache.storm.daemon.common$validate_structure_BANG_ invoke common.clj 181]}]
:trace
[[org.apache.storm.daemon.common$validate_structure_BANG_ invoke common.clj 181]
[org.apache.storm.daemon.common$system_topology_BANG_ invoke common.clj 360]
[org.apache.storm.daemon.nimbus$fn__7064$exec_fn__2461__auto__$reify__7093 submitTopologyWithOpts nimbus.clj 1512]
[org.apache.storm.daemon.nimbus$fn__7064$exec_fn__2461__auto__$reify__7093 submitTopology nimbus.clj 1544]
[sun.reflect.NativeMethodAccessorImpl invoke0 NativeMethodAccessorImpl.java -2]
[sun.reflect.NativeMethodAccessorImpl invoke NativeMethodAccessorImpl.java 62]
[sun.reflect.DelegatingMethodAccessorImpl invoke DelegatingMethodAccessorImpl.java 43]
[java.lang.reflect.Method invoke Method.java 497]
[clojure.lang.Reflector invokeMatchingMethod Reflector.java 93]
[clojure.lang.Reflector invokeInstanceMethod Reflector.java 28]
[org.apache.storm.testing$submit_local_topology invoke testing.clj 301]
[org.apache.storm.LocalCluster$_submitTopology invoke LocalCluster.clj 49]
[org.apache.storm.LocalCluster submitTopology nil -1]
[com.mediaiq.StartStorm main StartStorm.java 81]]}
I think the problem is that you are referencing the wrong port in your bootstrap.servers config. Try changing it to 9092.
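As background, and not necessarily the cause here: the FieldNameBasedTupleToKafkaMapper used in the topology reads tuple fields named "key" and "message" by default, so a bolt that feeds the KafkaBolt normally declares and emits exactly those fields. A rough sketch of such a bolt (field contents are illustrative, not taken from the question):
// Sketch of a bolt emitting tuples in the shape FieldNameBasedTupleToKafkaMapper
// expects by default: a "key" field and a "message" field.
public class JsonReportBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        JSONObject jo = new JSONObject();
        for (String f : input.getFields()) {
            jo.put(f, input.getValueByField(f));
        }
        // key is illustrative (first field's value); message carries the JSON payload
        collector.emit(new Values(String.valueOf(input.getValue(0)), jo.toString()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "message"));
    }
}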

Kafka 0.8.2 consumer

I am implementing a simple Kafka consumer in Java. Here is the code:
public class TestConsumer {
public static void main(String []a) throws Exception{
Properties props = new Properties();
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("partition.assignment.strategy", "round-robin");
props.put("group.id", "test");
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,"localhost:9092");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
try{
consumer.subscribe("ay_sparktopic");
Map<String, ConsumerRecords<String, String>> msg = consumer.poll(100);
System.out.println(msg);
}catch(Exception e){
System.out.println("Exception");
}
}
}
Above consumer gives following error message:
16/03/30 18:01:07 WARN ConsumerConfig: The configuration group.id = test was supplied but isn't a known config.
16/03/30 18:01:07 WARN ConsumerConfig: The configuration partition.assignment.strategy = round-robin was supplied but isn't a known config.
Any documentation I check online gives either range or roundrobin as the possible assignment strategies, and group.id is a custom name as far as I know. I'm not sure what the right config values are here.
It looks like you're trying to use the new consumer API that's only available in Kafka 0.9+. To use the older API you have to import classes from the kafka.javaapi.consumer.* package instead of the new org.apache.kafka.clients.consumer package.
consumer.subscribe and consumer.poll relate to the new API, so if you really want to use the old API you need to change your code accordingly. If you instead want to use the new consumer API, you need to run Kafka 0.9 or later.
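For completeness, the new-API usage that the question's code is aiming for looks roughly like this on 0.9+ clients (a sketch; topic and group id taken from the question, broker address assumed):
// Sketch of the 0.9+ consumer API: subscribe takes a list of topics and
// poll returns ConsumerRecords rather than a Map keyed by topic.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // assumed address
props.put("group.id", "test");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("ay_sparktopic"));
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        System.out.println(record.offset() + ": " + record.value());
    }
}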
Using the dependency below resolves the issue, even when a previous version, e.g. Kafka 0.8.2.1, is running on the broker:
libraryDependencies += "org.apache.kafka" % "kafka_2.11" % "0.9.0.0"
