I have a durable queue which holds persistent messages. The messages arrive into the queue at a rate of about 10 messages per second.
The client is unable to fetch those messages at that rate. As a result the queue on the server keeps growing.
Each message is less than 1 KB and I have a healthy 2 Mbps line between the server and my machine. Using a network monitoring utility, I found that it is hardly using any of that bandwidth.
The client is doing nothing with the messages as of now, just printing them to console so processing time on client is almost 0.
Some other details:
I am using a java client.
I have set the client to prefetch 10000 messages. (also tried with default values)
The round trip time is about 350 ms.
Messages are individually acknowledged.
The available resources are being underutilized and 10 messages per second is hardly any load in my opinion. How do I speed up things so that messages held up in queue are transferred faster to the client. Possibly using some sort of batching.
If you are indidivudally acknowledging messages every 350 ms, I would expect the consumer might achieve about 1/0.35 or about 2.9 messages per second. However, the protocol might not be that efficient and it may need two round trips to the server to acknowledge the message and get the next one. i.e. 1.4 message per second may be more realistic.
A round trip of 350 ms is very high, you can go around the world and back again in that time, so a simple solution may not work best for you. e.g. London -> New York -> Tokyo -> London.
I would try having a broker local to your client instead. This way the round trip is between your client and your local broker.
Related
Write a program that simulates a computer network using discrete time.
The first packet on each router queue makes one hop per time interval.
Each router has only a finite number of buffers. If a packet arrives
and there is no room for it, it is discarded and not retransmitted.
Instead, there is an end-to-end protocol, complete with timeouts and
acknowledgement packets, that eventually regenerate the packet from
the source router.
Plot the throughput of the network as a function of the end-to-end
timeout interval, parameterized by error rate.
I've had a few assignments kind of like this in the past. My best guess is that "hop" signifies transfer from one node in the network to another, and that "time interval" is arbitrary, and essentially represents an update cycle for the entire network.
I have a three part topology that's having some serious latency issues but I'm having trouble figuring out where.
kafka -> db lookup -> write to cassandra
The numbers from the storm UI look like this:
(I see that the bolts are running at > 1.0 capacity)
If the process latency for the two bolts is ~65ms why is the 'complete latency' > 400 sec? The 'failed' tuples are coming from timeouts I suspect as the latency value is steadily increasing.
The tuples are connected via shuffleGrouping.
Cassandra lives on AWS so there are likely network limitations en route.
The storm cluster has 3 machines. There are 3 workers in the topology.
Your topology has several problems:
look at the capacity of the decode_bytes_1 and save_to_cassandra spouts. Both are over 1 (the sum of all spouts capacity should be under 1), which means you are using more resources than what do you have available. This is, the topology can't handle the load.
The TOPOLOGY_MAX_SPOUT_PENDING will solve your problem if the throughput of tuples varies during the day. This is, if you have peek hours, and you will be catch up during the off-peek hours.
You need to increase the number of worker machines or optimize the code in the bottle neck spouts (or maybe both). Otherwise you will not be able to process all the tuples.
You probably can improve the cassandra persister by inserting in batches instread of insert tuples one by one...
I seriously recommend you to always set the TOPOLOGY_MAX_SPOUT_PENDING for a conservative value. The max spout pending, means the maximum number of un-acked tuples inside the topology, remember this value is multiplied by the number of spots and the tuples will timeout (fail) if they are not acknowledged 30 seconds after being emitted.
And yes, your problem is having tuples timing out, this is exactly what is happening.
(EDIT) if you are running the dev environment (or just after deploy the topology) you might experience a spike in the traffic generated by messages that were not yet consumed by the spout; it's important you prevent this case to negatively affect your topology -- you never know when you need to restart the production topology, or perform some maintenance --, if this is the case you can handle it as a temporary spike in the traffic --the spout needs to consume all the messages produced while the topology was off-line -- and after a some (or many minutes) the frequency of incoming tuples stabilizes; you can handle this with max pout pending parameter (read item 2 again).
Considering you have 3 nodes in your cluster, and cpu usage of 0,1 you can add more executers to the bolts.
FWIW - it appears that the default value for TOPOLOGY_MAX_SPOUT_PENDING is unlimited. I added a call to stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 500); and it appears (so far) that the problem has been alleviated. Possible 'thundering herd' issue?
After setting the TOPOLOGY_MAX_SPOUT_PENDING to 500:
I have a web service with a load balancer that maps requests to multiple machines. Each of these requests end up sending a http call to an external API, and for that reason I would like to rate limit the number of requests I send to the external API.
My current design:
Service has a queue in memory that stores all received requests
I rate limit how often we can grab a request from the queue and process it.
This doesn't work when I'm using multiple machines, because each machine has its own queue and rate limiter. For example: when I set my rate limiter to 10,000 requests/day, and I use 10 machines, I will end up processing 100,000 requests/day at full load because each machine processes 10,000 requests/day. I would like to rate limit so that only 10,000 requests get processed/day, while still load balancing those 10,000 requests.
I'm using Java and MYSQL.
use memcached or redis keep api request counter per client. check every request if out rate limit.
if you think checking at every request is too expensive,you can try storm to process request log, and async calculate request counter.
The two things you stated were:
1)"I would like to rate limit so that only 10,000 requests get processed/day"
2)"while still load balancing those 10,000 requests."
First off, it seems like you are using a divide and conquer approach where each request from your end user gets mapped to one of the n machines. So, for ensuring that only the 10,000 requests get processed within the given time span, there are two options:
1) Implement a combiner which will route the results from all n machines to
another endpoint which the external API is then able to access. This endpoint is able
to keep a count of the amount of jobs being processed, and if it's over your threshold,
then reject the job.
2) Another approach is to store the amount of jobs you've processed for the day as a variable
inside of your database. Then, it's common practice to check if your threshold value
has been reached by the value in your database upon the initial request of the job
(before you even pass it off to one of your machines). If the threshold value has been
reached, then reject the job at the beginning. This, coupled with an appropriate message, has an advantage as having a better experience for the end user.
In order to ensure that all these 10,000 requests are still being load balanced so that no 1 CPU is processing more jobs than any other cpu, you should use a simple round robin approach to distribute your jobs over the m CPU's. With round robin, as apposed to a bin/categorization approach, you'll ensure that the job request is being distributed as uniformly as possible over your n CPU's. A downside to round robin, is that depending on the type of job you're processing you might be replicating a lot data as you start to scale up. If this is a concern for you, you should think about implementing a form of locality-sensitive hash (LSH) function. While a good hash function distributes the data as uniformly as possible, LSH exposes you to having a CPU process more jobs than other CPU's if a skew in the attribute you choose to hash against has a high probability of occurring. As always, there's tradeoffs associated with both, so you'll know best for your use cases.
Why not implement a simple counter in your database and make the API client implement the throttling?
User Agent -> LB -> Your Service -> Q -> Your Q consumers(s) -> API Client -> External API
API client checks the number (for today) and you can implement whatever rate limiting algorithm you like. eg if the number is > 10k the client could simply blow up, have the exception put the message back on the queue and continue processing until today is now tomorrow and all the queued up requests can get processed.
Alternatively you could implement a tiered throttling system, eg flat out til 8k, then 1 message every 5 seconds per node up til you hit the limit at which point you can send 503 errors back to the User Agent.
Otherwise you could go the complex route and implement a distributed queue (eg AMQP server) however this may not solve the issue entirely since your only control mechanism would be throttling such that you never process any faster than the something less than the max limit per day. eg your max limit is 10k so you never go any faster than 1 message every 8 seconds.
If you're not adverse to using a library/service https://github.com/jdwyah/ratelimit-java is an easy way to get distributed rate limits.
If performance is of utmost concern you can acquire more than 1 token in a request so that you don't need to make an API request to the limiter 100k times. See https://www.ratelim.it/documentation/batches for details of that
I am developing agents to collect data from different sources, the data should be posted to a channel at high frequency (say every 15 seconds). REST is definitely not a solution. The requirement is clearly fire and forget as status reply is not concerned.
Throughput is more important, message drops are acceptable upto 5%.
Possible solutions I come across are
Message Bus
Multicast
UDP
Any alternatives, please suggest.
IMHO High frequency is too fast too see and 15 seconds you can see. It takes about 0.5 seconds to send a message round the world and back again. You can just about see 15 milli-seconds. And if you are talking about 15 micro-seconds, that is definitely high frequency. I have a persisted messaging solution with a latency of around 0.1 micro-seconds which is 0.0000001 seconds, but I don't suggest you need that.
If all you need is a message every 15 seconds I would use the simplest solution which comes to mind. I would try ActiveMQ which I found to be one of the simplest to get working. You should be able to achieve message rates of up to 20,000 per seconds and decent latencies of about 0.01 seconds and you shouldn't lose any messages.
I'm creating a service on appengine that feeds back measurements to the user. The measurements are collected by polling another server every fifteen minutes (the user needs four measurements over the last hour). The other sever replies with the data immediately so this isn't a "long poll request". I don't expect there to be a high load on the server because there aren't a lot of users (maybe 20 requests a day or so) so there won't be many requests coming in for the data, but because the user needs data over the last hour I am forced to poll continuously. This makes me concerned about billing because the new billing system charges per instance hour at a 15 min granularity and this would mean I'd have an instance actively running 24/7 (as far as I can tell).
Question
So, I expect a low request rate and am not too concerned about latency etc. How can I optomise this setup for the lowest possible billing?
What I had planned
What I was planning to do was try and get away with the free quota for now by setting max idle instance to 1 and only using the frontend to do both polling and serving (I'm guessing site responsivness will suffer a fair amount) because the frontend has far more free instance hours (28) than the backend (9). Can the frontend even be set up to poll every 15 mins?
There's nothing you can really tweak here for this. You'll want to use cron or the task queue for the polling anyway; these use frontend instances, not backend instances. As long as you have multithreading enabled, frontend latency will not be affected, and you'll likely remain within your free quota as long as you don't do enough polling or get enough traffic to require more than one concurrent instance.