Performance Issue: Latency Spike happens sometimes in Kafka Streams - java

I am doing Performance Testing in Kafka Streaming. I created a simple Streams API with Transformer.
// Stream data from input topic
builder.stream(Serdes.String(), Serdes.String(), inTopic)
// convert csv data to avro
.transformValues(new TransformSupplier())
// post converted data to output topic
.to(Serdes.String(), Serdes.ByteArray(), outTopic);
I am using inTopic with 10 partitions and outTopic with 1 partition. I am seeing latency is good and around ~4-6 ms . But, I am facing sudden spike in the latency sometimes and it reaches even upto ~60 - 1000 ms. Then after few seconds, it gradually dropped latency down back to ~4-6 ms. This results in the average latency for my whole experiment to ~67 ms.
What could be the reason for the sudden spike? Suggest me some performance tuning parameters if any.
Note: I have provided default StreamsConfig only.

After some amount of msgs producing, the data should be flushed to the disk.
This may cause the phenomenon you observed.
Please refer to the "log.flush.interval.messages" of kafka configuration: Link
After in prod, I do not recommend you to change this property to improve. You should change your system conf:
/proc/sys/vm/dirty_background_ratio
/proc/sys/vm/dirty_ratio
To improve the efficiency of your own msgs flush

Related

Kafka poll duration with slow network

I have a very silly question, but I cannot find in the Kafka documentation a clear answer.
Imagine this situation.
I have a network latency that go very slow 1Mb/s, and then I have a Kafka topic with 10 messages with a size of 500kbs.
If I set a consumer.poll(500 Milliseconds) taking into consideration the network latency I would get only 2 records in that iteration?
Regards.

storm - finding source(s) of latency

I have a three part topology that's having some serious latency issues but I'm having trouble figuring out where.
kafka -> db lookup -> write to cassandra
The numbers from the storm UI look like this:
(I see that the bolts are running at > 1.0 capacity)
If the process latency for the two bolts is ~65ms why is the 'complete latency' > 400 sec? The 'failed' tuples are coming from timeouts I suspect as the latency value is steadily increasing.
The tuples are connected via shuffleGrouping.
Cassandra lives on AWS so there are likely network limitations en route.
The storm cluster has 3 machines. There are 3 workers in the topology.
Your topology has several problems:
look at the capacity of the decode_bytes_1 and save_to_cassandra spouts. Both are over 1 (the sum of all spouts capacity should be under 1), which means you are using more resources than what do you have available. This is, the topology can't handle the load.
The TOPOLOGY_MAX_SPOUT_PENDING will solve your problem if the throughput of tuples varies during the day. This is, if you have peek hours, and you will be catch up during the off-peek hours.
You need to increase the number of worker machines or optimize the code in the bottle neck spouts (or maybe both). Otherwise you will not be able to process all the tuples.
You probably can improve the cassandra persister by inserting in batches instread of insert tuples one by one...
I seriously recommend you to always set the TOPOLOGY_MAX_SPOUT_PENDING for a conservative value. The max spout pending, means the maximum number of un-acked tuples inside the topology, remember this value is multiplied by the number of spots and the tuples will timeout (fail) if they are not acknowledged 30 seconds after being emitted.
And yes, your problem is having tuples timing out, this is exactly what is happening.
(EDIT) if you are running the dev environment (or just after deploy the topology) you might experience a spike in the traffic generated by messages that were not yet consumed by the spout; it's important you prevent this case to negatively affect your topology -- you never know when you need to restart the production topology, or perform some maintenance --, if this is the case you can handle it as a temporary spike in the traffic --the spout needs to consume all the messages produced while the topology was off-line -- and after a some (or many minutes) the frequency of incoming tuples stabilizes; you can handle this with max pout pending parameter (read item 2 again).
Considering you have 3 nodes in your cluster, and cpu usage of 0,1 you can add more executers to the bolts.
FWIW - it appears that the default value for TOPOLOGY_MAX_SPOUT_PENDING is unlimited. I added a call to stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 500); and it appears (so far) that the problem has been alleviated. Possible 'thundering herd' issue?
After setting the TOPOLOGY_MAX_SPOUT_PENDING to 500:

Benchmarking Kafka - mediocre performance

I'm benchmarking Kafka 0.8.1.1 by streaming 1k size messages on EC2 servers.
I installed zookeeper on two m3.xlarge servers and have the following configuration:
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.server1=zoo1:2888:3888
server.server2=zoo2:2888:3888
Second I installed Single Kafka Server on i2.2xlarge machine with 32Gb RAM and additional 6 SSD Drives where each disk partitioned as /mnt/a , mnt/b, etc..... On server I have one broker, single topic on port 9092 and 8 partitions with replication factor 1:
broker.id=1
port=9092
num.network.threads=4
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/mnt/a/dfs-data/kafka-logs,/mnt/b/dfs-data/kafka-logs,/mnt/c/dfs-data/kafka-logs,/mnt/d/dfs-data/kafka-logs,/mnt/e/dfs-data/kafka-logs,/mnt/f/dfs-data/kafka-logs
num.partitions=8
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
zookeeper.connect=172.31.26.252:2181,172.31.26.253:2181
zookeeper.connection.timeout.ms=1000000
kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false
replica.lag.max.messages=10000000
All my tests are done from another instance and latency between instances is less than 1 ms.
I wrote producer/consumer java client using one thread producer and 8 threads consumer, when partition key is a random number from 0 till 7.
I serialized each message using Json by providing custom encoder.
My consumer producer properties are the following:
metadata.broker.list = 172.31.47.136:9092
topic = mytopic
group.id = mytestgroup
zookeeper.connect = 172.31.26.252:2181,172.31.26.253:2181
serializer.class = com.vanilla.kafka.JsonEncoder
key.serializer.class = kafka.serializer.StringEncoder
producer.type=async
queue.enqueue.timeout.ms = -1
batch.num.messages=200
compression.codec=0
zookeeper.session.timeout.ms=400
zookeeper.sync.time.ms=200
auto.commit.interval.ms=1000
number.messages = 100000
Now when I'm sending 100k messages, I'm getting 10k messages per second capacity and about 1 ms latency.
that means that I have 10 Megabyte per second which equals to 80Mb/s, this is not bad, but I would expect better performance from those instances located in the same zone.
Am I missing something in configuration?
I suggest you break down the problem. How fast is it without JSon encoding. How fast is one node, without replication vs with replication. Build a picture of how fast each component should be.
I also suggest you test bare metal machines to see how they compare as they can be significantly faster (unless CPU bound in which case they can be much the same)
According to this benchmark you should be able to get 50 MB/s from one node http://kafka.apache.org/07/performance.html
I would expect you should be able to get close to saturating your 1 Gb links (I assume thats what you have)
Disclaimer: I work on Chronicle Queue which is quite a bit faster, http://java.dzone.com/articles/kafra-benchmark-chronicle
If it makes sense for your application, you could get better performance by streaming byte arrays instead of JSON objects, and convert the byte arrays to JSON objects on the last step of your pipeline.
You might also get better performance if each consumer thread consistently reads from the same topic partition. I think Kafka only allows one consumer to read from a partition at a time, so depending on how you're randomly selecting partitions, its possible that a consumer would be briefly blocked if it's trying to read from the same partition as another consumer thread.
It's also possible you might be able to get better performance using fewer consumer threads or different kafka batch sizes. I use parameterized JUnit tests to help find the best settings for things like number of threads per consumer and batch sizes. Here are some examples I wrote which illustrate that concept:
http://www.bigendiandata.com/2016-10-02-Junit-Examples-for-Kafka/
https://github.com/iandow/kafka_junit_tests
I hope that helps.

Rabbit MQ: Improve queue flushing speed

I have a durable queue which holds persistent messages. The messages arrive into the queue at a rate of about 10 messages per second.
The client is unable to fetch those messages at that rate. As a result the queue on the server keeps growing.
Each message is less than 1 KB and I have a healthy 2 Mbps line between the server and my machine. Using a network monitoring utility, I found that it is hardly using any of that bandwidth.
The client is doing nothing with the messages as of now, just printing them to console so processing time on client is almost 0.
Some other details:
I am using a java client.
I have set the client to prefetch 10000 messages. (also tried with default values)
The round trip time is about 350 ms.
Messages are individually acknowledged.
The available resources are being underutilized and 10 messages per second is hardly any load in my opinion. How do I speed up things so that messages held up in queue are transferred faster to the client. Possibly using some sort of batching.
If you are indidivudally acknowledging messages every 350 ms, I would expect the consumer might achieve about 1/0.35 or about 2.9 messages per second. However, the protocol might not be that efficient and it may need two round trips to the server to acknowledge the message and get the next one. i.e. 1.4 message per second may be more realistic.
A round trip of 350 ms is very high, you can go around the world and back again in that time, so a simple solution may not work best for you. e.g. London -> New York -> Tokyo -> London.
I would try having a broker local to your client instead. This way the round trip is between your client and your local broker.

Benchmarking Apache Mina Total Bandwidth

I am developing a relatively fast paced game (Flash/Apache Mina Server back end) and I am having some difficulty getting an accurate benchmark of the type of bandwidth my current setup would use.
My question is: How do I get an accurate benchmark of the bandwidth required for my tests? What I am doing now wouldn't take into account any overhead?
On the message sent/received methods I am doing
[out/in]Bandwidth+= message.toString().getBytes().length;
I then print out the current values every 250 milliseconds (since that is how frequently "world" updates are done currently) .
With 10 "monsters" all randomly moving around and 1 player randomly moving around I am getting this output.. (1 second window here)
In bandwidth: 1647, Outgoing: 35378
In bandwidth: 1658, Outgoing: 35585
In bandwidth: 1669, Outgoing: 35792
In bandwidth: 1680, Outgoing: 35999
So acting strictly on the size of the messages (outgoing) being passed that works out to about 621 bytes/second or (621/10) 62.1 bytes per second per constantly moving item on screen per person. This seems a little low, a good high speed connection could handle 1000+ object updates per second at this "rate" no problem.
Something definitely smells fishy here. According to the performance testing provided by them: here mina is capable of 20K+ 405 byte requests per second on ~10 connections - way more than what you're seeing.
My guess is that there is some kind of theading\timing issue going on here that is causing the delay. I would enlist the help of a packet tracing application such as wireshark and see if your observations in code mesh with the raw network data. I would also try "flooding" the server side with more data if possible - this might provide some insight to where the issue lies.
I hope this helps, good luck.

Categories

Resources