So I am new to working with Apache Kafka and I am trying to create a simple app so I can try to understand the API better. I know this question has been asked a lot here, but how can I clear out the messages/records that are stored on a topic?
Most of the answers I have seen say to change the message retention time or to delete & recreate the topic. Neither of these is an option for me, as I do not have access to the server.properties file: I am not running Kafka locally, it is hosted on a server. Is there a way to do it from Java code?
If you are looking for a way to delete messages selectively, the newer AdminClient API (usable from Java code) provides a deleteRecords method:
https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/admin/AdminClient.html
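As a hedged sketch (the broker address, topic name, and offsets below are placeholders, and you would normally look the end offsets up with KafkaConsumer#endOffsets rather than hard-code them), purging a topic with deleteRecords might look like this:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DeleteRecordsResult;
import org.apache.kafka.clients.admin.RecordsToDelete;
import org.apache.kafka.common.TopicPartition;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class TopicPurger {

    // Build the deleteRecords argument: truncate each partition before the given offset.
    static Map<TopicPartition, RecordsToDelete> purgeRequest(String topic, Map<Integer, Long> endOffsets) {
        Map<TopicPartition, RecordsToDelete> request = new HashMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            request.put(new TopicPartition(topic, e.getKey()),
                        RecordsToDelete.beforeOffset(e.getValue()));
        }
        return request;
    }

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "your-broker:9092"); // placeholder
        try (AdminClient admin = AdminClient.create(props)) {
            // Placeholder offsets: deleting everything before the end offset empties the partition.
            Map<Integer, Long> endOffsets = Map.of(0, 100L, 1, 250L);
            DeleteRecordsResult result = admin.deleteRecords(purgeRequest("my-topic", endOffsets));
            result.all().get(); // block until the brokers confirm the truncation
        }
    }
}
```

Note that deleteRecords only advances the partitions' log start offset; it requires broker version 0.11+ and the appropriate ACLs, so it may still fail on a hosted cluster you don't control.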
I'm new to the streaming community.
I'm trying to create a continuous query using Kafka topics and Flink, but I haven't found any examples that would give me an idea of how to get started.
Can you help me with some examples?
Thank you.
For your use case, I'm guessing you want to use Kafka as a source of continuous data. In that case you can use the Kafka source connector (linked below), and if you want to slice the stream by time you can use Flink's window processing functions. These will group the Kafka messages streamed in a particular timeframe into a collection such as a list or map.
Flink Kafka source connector
Flink Window Processing Function
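To make that concrete, here is a rough sketch assuming a recent Flink with the flink-connector-kafka dependency on the classpath (the broker address, topic name, group id, and window size are all placeholders). It reads a Kafka topic and collapses the messages arriving in each 10-second tumbling window:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class KafkaWindowJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder broker and topic; adjust to your cluster.
        KafkaSource<String> source = KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")
                .setTopics("input-topic")
                .setGroupId("flink-demo")
                .setStartingOffsets(OffsetsInitializer.earliest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        DataStream<String> messages =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source");

        // Collect everything that arrived in each 10-second window into one string.
        messages
                .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .reduce((a, b) -> a + "," + b)
                .print();

        env.execute("kafka-window-demo");
    }
}
```

In a real job you would typically keyBy some field before windowing, and replace the reduce with a window/process function that builds the list or map you need.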
I'm developing a high-load enterprise application.
There are 2 services which should be scaled in a specific way. They use Azure Event Hubs for messaging. When load increases, we need to create one more instance of a service and one more topic (Event Hub) for communication with the other services.
Is there a way to create an Event Hub dynamically from Java code? For example, with Kafka I can just pass the name of a topic that doesn't exist and it will be created automatically. When I try to do that with Azure Event Hubs, I get this error:
The messaging entity 'sb://eventhubdev.servicebus.windows.net/newTopic' could not be found.
So... is it possible to create and delete Event Hubs programmatically?
Googling hasn't given me a clear answer.
There might be a solution to scale via Java, but I would challenge that approach.
Scaling should be handled by your infrastructure (e.g. Kubernetes), not by your code.
Furthermore, I don't know whether Event Hubs is dynamic enough to be scaled this way in the first place.
Provisioning the Event Hub could be done via Terraform.
See this link for further details:
https://www.terraform.io/docs/providers/azurerm/r/eventhub.html
After a long investigation we decided to create new topics by calling the management REST API directly, as described in this doc: https://learn.microsoft.com/en-us/rest/api/eventhub/eventhubs/createorupdate
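For anyone else taking this route, here is a rough sketch using only the JDK's HttpClient. The subscription ID, resource group, namespace, hub name, partition count, and retention are placeholders, and acquiring the Azure AD bearer token (e.g. via the azure-identity library) is out of scope here:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EventHubCreator {

    // Build the ARM CreateOrUpdate URL for an Event Hub.
    static String createUrl(String subscriptionId, String resourceGroup,
                            String namespace, String eventHubName) {
        return String.format(
            "https://management.azure.com/subscriptions/%s/resourceGroups/%s"
            + "/providers/Microsoft.EventHub/namespaces/%s/eventhubs/%s?api-version=2017-04-01",
            subscriptionId, resourceGroup, namespace, eventHubName);
    }

    // PUT with a minimal properties body creates (or updates) the hub.
    static HttpRequest createRequest(String url, String bearerToken,
                                     int partitionCount, int retentionDays) {
        String body = String.format(
            "{\"properties\":{\"partitionCount\":%d,\"messageRetentionInDays\":%d}}",
            partitionCount, retentionDays);
        return HttpRequest.newBuilder(URI.create(url))
            .header("Authorization", "Bearer " + bearerToken)
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder token handling; obtain a management-scope token however suits you.
        String token = System.getenv("AZURE_MGMT_TOKEN");
        if (token == null) {
            System.out.println("Set AZURE_MGMT_TOKEN first");
            return;
        }
        String url = createUrl("my-sub-id", "my-rg", "eventhubdev", "newTopic");
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(createRequest(url, token, 4, 1), HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}
```

The same URL with an HTTP DELETE removes the hub, which covers the scale-down side.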
I'm planning to write my own Kafka Connect CSV connector which will read data from a CSV file and write it to a topic in JSON form.
I also came across Confluent's kafka-connect-spooldir plugin, but I don't want to use it; I want to write my own.
Can anyone advise me on how to go about creating such a connector?
The official Kafka documentation has a section on Connector development so that is probably the best first stop.
Kafka also ships with File Connectors (both Source and Sink). Have a look at the code: https://github.com/apache/kafka/tree/trunk/connect/file/src/main/java/org/apache/kafka/connect/file
It should not be too hard to modify these for your use case.
Finally, as you mentioned, there are already open-source connectors that can read CSV files. So if you're stuck on something, you can check how they did it.
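Whichever route you take, the core of your task is converting each CSV line into the JSON string you put into the SourceRecord value inside your SourceTask's poll(). A naive, illustrative helper for that conversion could look like this (it ignores quoting and escaping, which a real connector must handle, and the class and method names are my own):

```java
import java.util.StringJoiner;

public class CsvToJson {

    // Naive CSV split; a real connector should handle quoted fields and escapes.
    static String toJson(String[] header, String line) {
        String[] values = line.split(",", -1);
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < header.length; i++) {
            String v = i < values.length ? values[i] : "";
            json.add("\"" + header[i] + "\":\"" + v + "\"");
        }
        return json.toString();
    }

    public static void main(String[] args) {
        String[] header = {"id", "name"};
        System.out.println(toJson(header, "1,alice")); // {"id":"1","name":"alice"}
    }
}
```

In the connector itself, poll() would read the next batch of lines, run each through a conversion like this, and wrap the result in SourceRecords carrying the file name and line offset as the source partition/offset.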
Here is my requirement:
My Java service is a continuously running process. When the service starts, it loads its configuration from a table in a MySQL database.
When there is any change (insert/update/delete, possibly made from outside the Java service), the service should reload the configuration immediately.
I saw a post on Stack Overflow covering this for an Oracle database, but I am looking for MySQL.
I am able to achieve this with a separate thread that polls the database table at some interval, but polling for changes is not a good approach. Instead, I am looking for a watcher/listener that will notify my service whenever there is a change in the MySQL table.
Can anybody help me to achieve this?
Note: If this question is already answered somewhere, please post the link in the comment, I will remove this question.
What you want is a Change Data Capture (CDC) system. Perhaps you could use one based on Kafka.
You can use Kafka Connect with the Debezium MySQL connector. This connector first takes a snapshot of your table, then reads the MySQL binlog, so you get a Kafka topic that consistently reflects every insert/update/delete on the table.
If you don't want to use it, perhaps you can fork the code and reuse the same mechanism, just modifying the output to notify your application listener directly.
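For reference, registering such a connector against the Kafka Connect REST API might look roughly like this. Every hostname, credential, server id, and table name below is a placeholder, and the property names follow the Debezium releases current at the time (newer versions renamed some of them):

```json
{
  "name": "mysql-config-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql-host",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "change-me",
    "database.server.id": "184054",
    "database.server.name": "appconfig",
    "table.whitelist": "mydb.configuration",
    "database.history.kafka.bootstrap.servers": "broker:9092",
    "database.history.kafka.topic": "schema-changes.appconfig"
  }
}
```

Your Java service would then just run a plain Kafka consumer on the resulting appconfig.mydb.configuration topic and reload its in-memory configuration whenever a record arrives.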
I am creating a Storm-based project where messages will be filtered by Storm. My aim is to allow a user to adapt the filtering performed at runtime by sending configuration information to a ZooKeeper znode.
I believe this is possible by setting up a ZooKeeper watcher within Storm, but I am struggling to achieve it. I would be grateful for some guidance or a simple example of how to do this.
I have looked at the Javadocs, and I'm afraid the way to do this does not seem obvious.
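A minimal sketch of the watcher pattern being asked about, assuming the filter configuration is stored as comma-separated keywords in the znode (the format, path handling, and class names are my own invention). You would construct this in your bolt's prepare() and consult it from execute(); the key ZooKeeper detail is that watches are one-shot, so the data read must re-register the watcher each time:

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

public class FilterConfigWatcher implements Watcher {
    private final ZooKeeper zk;
    private final String path;
    private final AtomicReference<Set<String>> filters =
            new AtomicReference<>(Collections.emptySet());

    public FilterConfigWatcher(ZooKeeper zk, String path) throws Exception {
        this.zk = zk;
        this.path = path;
        reload();
    }

    // Hypothetical config format: comma-separated keywords stored in the znode.
    static Set<String> parseFilters(String raw) {
        Set<String> result = new HashSet<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) result.add(trimmed);
        }
        return result;
    }

    private void reload() throws Exception {
        // Passing `this` re-registers the one-shot watch on every read.
        byte[] data = zk.getData(path, this, new Stat());
        filters.set(parseFilters(new String(data, StandardCharsets.UTF_8)));
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                reload();
            } catch (Exception e) {
                // log and re-attempt on the next event in a real topology
            }
        }
    }

    // Called from the bolt's execute() for each tuple.
    public boolean accepts(String message) {
        Set<String> current = filters.get();
        return current.isEmpty() || current.stream().anyMatch(message::contains);
    }
}
```

A production version would also handle session expiry and reconnects; Curator's NodeCache/CuratorCache recipes wrap exactly this pattern if you'd rather not manage raw watches.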