I need a way to limit the message rate of my IRC bot to avoid a global ban from Twitch for chat flooding (they allow 100 messages per 30 seconds).
There are two ways I considered doing this, both involving a message queue.
1. Each message starts a thread which acquires a counting semaphore. The thread then blocks for 30 seconds and releases the semaphore after that time. This would be a very clean solution, as the queue would be managed entirely by the OS, which means less work for me; however, it may result in creating hundreds of threads. These threads would be sleeping for most of their lifetime, but I'm not sure whether it is considered okay to launch hundreds of threads that effectively do nothing. They won't take up scheduler time slices while asleep, but they would consume a lot of memory, and there is a lot of overhead in creating them.
2. Store a queue of timestamps; every time a message needs to be sent, remove any timestamp that is more than 30 seconds old. Have a thread that checks the oldest timestamp every 10-50 ms, removes it if it is more than 30 seconds old, and then sends the next unsent message from the message queue, if one exists. When a message comes in to be sent, it is sent immediately if the queue holds fewer messages than the limit.
Option 1 has the downside of creating many threads that do nothing.
Option 2 has the downside of needing one thread to constantly poll the message list.
Option 2 could be improved by calculating how long to wait until the oldest timestamp is 30 seconds old and sending the message then, but I feel as if I am overcomplicating the problem at that stage.
Any thoughts on which would be the better approach?
Create a sentMessage list with a date for each entry.
Check the list before posting a new message.
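Here's a minimal sketch of that idea in Java (the class name and sendRaw() are just placeholders for however the bot actually writes to the IRC socket): keep the timestamps of sent messages, prune anything older than 30 seconds, and only send when fewer than 100 remain.

// Minimal sketch: a sentMessage timestamp list guarding the Twitch limit.
import java.util.ArrayDeque;
import java.util.Deque;

public class RateLimitedSender {
    private static final int LIMIT = 100;          // Twitch: 100 messages...
    private static final long WINDOW_MS = 30_000;  // ...per 30 seconds
    private final Deque<Long> sentTimestamps = new ArrayDeque<>();

    public synchronized boolean trySend(String message) {
        long now = System.currentTimeMillis();
        // Drop timestamps that have fallen outside the 30-second window.
        while (!sentTimestamps.isEmpty() && now - sentTimestamps.peekFirst() > WINDOW_MS) {
            sentTimestamps.pollFirst();
        }
        if (sentTimestamps.size() >= LIMIT) {
            return false; // over the limit: caller queues the message and retries later
        }
        sentTimestamps.addLast(now);
        sendRaw(message);
        return true;
    }

    private void sendRaw(String message) {
        // write to the IRC connection here (placeholder)
    }
}

If trySend() returns false, the caller can hold the message in its own outgoing queue and retry after a short sleep, or once the oldest timestamp expires.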
Sometimes, due to external problems, I need to requeue a message with basic.reject and requeue = true.
But I don't need to consume it immediately, because it will probably fail again within a short time. If I keep requeueing it, this may result in an infinite requeue loop.
So I need to consume it later, say one minute later.
And I need to know how many times the message has been requeued, so that I can stop requeueing it and instead reject it to mark it as failed to consume.
PS: I am using Java client.
There are multiple solutions to point 1.
First one is the one chosen by Celery (a Python producer/consumer library that can use RabbitMQ as broker). Inside your message, add a timestamp at which the task should be executed. When your consumer gets the message, do not ack it and check its timestamp. As soon as the timestamp is reached, the worker can execute the task. (Note that the worker can continue working on other tasks instead of waiting)
This technique has some drawbacks. You have to increase the QoS per channel to an arbitrary value. And if your worker is already working on a long-running task, the delayed task won't be executed until the first task has finished.
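A rough sketch of that first technique with the RabbitMQ Java client (the "execute-at" header name and handleTask() are my own illustrative assumptions, not an established convention): read the timestamp from a header, and if it hasn't been reached yet, schedule the work and only ack once it has actually run.

// Sketch of the Celery-style delay: the producer is assumed to have put an
// "execute-at" epoch-millis header on the message (illustrative name).
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.DefaultConsumer;
import com.rabbitmq.client.Envelope;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeferredConsumer extends DefaultConsumer {
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);

    public DeferredConsumer(Channel channel) { super(channel); }

    @Override
    public void handleDelivery(String tag, Envelope env, AMQP.BasicProperties props, byte[] body) {
        long executeAt = ((Number) props.getHeaders().get("execute-at")).longValue();
        long delay = Math.max(0, executeAt - System.currentTimeMillis());
        // Defer the work; the message stays unacked until the task has actually run.
        scheduler.schedule(() -> {
            handleTask(body);
            try {
                getChannel().basicAck(env.getDeliveryTag(), false);
            } catch (Exception e) {
                // log and decide whether to requeue
            }
        }, delay, TimeUnit.MILLISECONDS);
    }

    private void handleTask(byte[] body) {
        // the actual work goes here
    }
}

Because the delivery stays unacked while it waits, the channel QoS has to be at least as large as the number of messages you expect to be deferred at once, which is exactly the drawback mentioned above.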
A second technique is RabbitMQ-only and is much more elegant. It takes advantage of dead-letter exchanges and message TTLs. You create a new queue that nobody consumes from. This queue has a dead-letter exchange that forwards messages to the consumer queue. When you want to defer a message, ack it (or reject it without requeue) on the consumer queue and copy it into the dead-lettered queue with a TTL equal to the delay you want (say, one minute). At (roughly) the end of the TTL, the deferred message will magically land in the consumer queue again, ready to be consumed. The RabbitMQ team has also made the Delayed Message Plugin (the plugin is marked as experimental yet fairly stable and potentially suitable for production use as long as the user is aware of its limitations; it has serious limitations in terms of scalability and reliability in case of failover, so decide whether you really want to use it in production, or whether you prefer to stick to the manual way, which is limited to one TTL per queue).
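Here's a minimal sketch of that setup with the Java client, assuming queue names "work" and "work.delay" (both illustrative): anything published to "work.delay" sits there for a minute and is then dead-lettered back to "work".

// DLX + TTL delay sketch: messages published to "work.delay" expire after
// 60 seconds and are routed back to the consumer queue "work".
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import java.util.HashMap;
import java.util.Map;

public class DelayQueueSetup {
    static void declare(Channel channel) throws Exception {
        channel.queueDeclare("work", true, false, false, null);

        Map<String, Object> args = new HashMap<>();
        args.put("x-dead-letter-exchange", "");        // dead-letter to the default exchange
        args.put("x-dead-letter-routing-key", "work"); // expired messages land back on "work"
        args.put("x-message-ttl", 60_000);             // the one-minute delay
        channel.queueDeclare("work.delay", true, false, false, args);
    }

    static void defer(Channel channel, byte[] body) throws Exception {
        // Ack (or reject without requeue) the original delivery, then publish
        // the copy here; it reappears on "work" roughly one minute later.
        channel.basicPublish("", "work.delay", new AMQP.BasicProperties.Builder().build(), body);
    }
}

Since x-message-ttl is a queue-level argument, each delay queue gives you exactly one delay, which is the "one TTL per queue" limitation of the manual approach.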
Point 2 just requires putting a counter in your message and handling it inside your app. You can choose to put this counter in a header or directly in the body.
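For example, a sketch of the header variant (the "x-retry-count" header name and the limit of 3 are arbitrary choices of mine): instead of basic.reject with requeue=true, ack the delivery and republish the message with the counter incremented, and give up once the limit is reached.

// Retry-counter sketch: bump a header on every republish and reject for good
// once MAX_RETRIES is reached.
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import java.util.HashMap;
import java.util.Map;

public class RetryCounter {
    static final int MAX_RETRIES = 3;

    static void retryOrGiveUp(Channel channel, long deliveryTag, AMQP.BasicProperties props,
                              byte[] body, String queue) throws Exception {
        Map<String, Object> headers =
                props.getHeaders() == null ? new HashMap<>() : new HashMap<>(props.getHeaders());
        int retries = headers.containsKey("x-retry-count")
                ? ((Number) headers.get("x-retry-count")).intValue() : 0;

        if (retries >= MAX_RETRIES) {
            // Give up: reject without requeue so the message is dropped or dead-lettered.
            channel.basicReject(deliveryTag, false);
            return;
        }
        headers.put("x-retry-count", retries + 1);
        AMQP.BasicProperties newProps = props.builder().headers(headers).build();
        channel.basicPublish("", queue, newProps, body); // republish with the bumped counter
        channel.basicAck(deliveryTag, false);            // and drop the original delivery
    }
}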
I'm consistently seeing very long delays (60+ seconds) between two actors, from the time at which the first actor sends a message to the second, to when the second actor's onReceive method is actually called with the message. What kinds of things can I look for to debug this problem?
Details
Each instance of ActorA sends one message to ActorB with ActorRef.tell(Object, ActorRef). I collect a millisecond timestamp (with System.currentTimeMillis()) right after calling the tell method in ActorA, and another one at the start of ActorB's onReceive(Object). The interval between these timestamps is consistently 60 seconds or more. Specifically, when plotted over time, this interval follows a rough sawtooth pattern that ranges from just over 60 seconds to almost 120 seconds, as shown in the graph below.
These actors are early in the data flow of the system; several other actors follow ActorB. This large gap only occurs between these two specific actors; the gaps between other pairs of adjacent actors are typically less than a millisecond, occasionally a few tens of milliseconds. Additionally, the actual time spent inside any given actor is never more than a second.
Generally, each actor in the system only passes a single message to another actor. One of the actors (subsequent to ActorB) sends a single message to each of a few different actors, and a small percentage (less than 0.1%) of the time, certain actors will send multiple messages to the same subsequent actor (i.e., multiple instances of the subsequent actor will be demanded). When this occurs, the number of multiple messages is typically on the order of a dozen or less.
Can this be explained (explicitly) by the normal reactive nature of Akka? Does it indicate a problem with the way work is distributed or the way the actors are configured? Is there something that could explicitly block a particular actor from spinning up? What other information should I collect or look at to understand the source of this, or to determine whether it is actually a problem?
You have a limited thread pool. If your Actors block, they still take up space in the thread pool. New threads will not be created if your thread pool is saturated.
You may want to configure
core-pool-size-factor,
core-pool-size-min, and
core-pool-size-max.
If you expect certain actions to block, you can instead wrap them in Future { blocking { ... } } and register a callback. But it's better to use asynchronous, non-blocking calls.
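Since the question is using the Java API, here's a roughly equivalent sketch in Java (the actor, the message classes, doBlockingCall() and the pool size are all illustrative): run the blocking call on a dedicated executor and send the result back to the actor as a message, so the dispatcher thread is freed immediately.

// Sketch: offload blocking work from an actor onto a dedicated executor so it
// never occupies a thread from the actor dispatcher's pool.
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

public class WorkerActor extends AbstractActor {
    // Separate pool reserved for blocking I/O, sized independently of the dispatcher.
    private static final Executor blockingPool = Executors.newFixedThreadPool(16);

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(Request.class, req -> {
                ActorRef self = getSelf();
                // Run the blocking work off the dispatcher thread and deliver the
                // result back to this actor as an ordinary message.
                CompletableFuture
                    .supplyAsync(() -> doBlockingCall(req), blockingPool)
                    .thenAccept(result -> self.tell(new ResultMsg(result), ActorRef.noSender()));
            })
            .match(ResultMsg.class, msg -> {
                // continue processing with msg.value, back on the dispatcher
            })
            .build();
    }

    private String doBlockingCall(Request req) {
        return "..."; // placeholder for the blocking operation
    }

    static class Request {}
    static class ResultMsg {
        final String value;
        ResultMsg(String value) { this.value = value; }
    }
}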
Background: I need to send many small messages to WebSocket clients asynchronously. Messages are usually sent in bursts, so after some pause I need to send ~5000 messages fast. So the problem is:
I don't want to start 5000 asyncs in a single thread
I don't want to loop "start async"-"wait for complete" 5000 times serially
I don't want to use 5000 threads, with a single "start async"-"wait for complete" per thread
The best way would be to group ~20 asyncs per thread, so I need a very specific queue:
"a lot of" means concurrent push/poll on the queue
"small-sized, asynchronous" means I want to poll in bundles, like 1 to 20 messages per queue take() (so I can start 1...20 async I/Os and wait for their completion in a single thread)
"immediately" means I don't want to wait until 20 messages have been polled; the bundle poll should only be used if the queue has a lot of messages. A single message should be polled and sent immediately.
So basically: I need a queue-like structure with a blocking take(1 to X) that returns up to X waiting elements in a single blocking call. Pseudocode:
[each of ~50 processing threads]:
messages = queue.blockingTake( max 10 or at least 1 if less than 10 available );
for each message: message.startAsync()
for each message: message.waitToComplete()
repeat
I wouldn't implement a Queue from scratch if it's not really necessary. A few ideas if you're interested:
A Queue<Collection<Message>> if you have only one thread doing the offers. If you have more, the collection has to be synchronized. E.g., one offerer peek()-s into the queue, sees that the last collection has too many elements, so it creates a new one and offers it.
or
A number of running threads where the runnables take elements one by one from the queue.
or
1 queue per sending thread, if you keep the queue references you can then add elements to each of them in a round robin fashion.
or
subclass a BlockingQueue of your choice and create a "Collection take(int i)" method with a rewritten version of the normal take().
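A lighter variant of that last idea, without subclassing (the names here are placeholders): block with take() for the first element, then drainTo() up to the batch limit, so a single blocking call yields between 1 and N messages.

// Batch-take sketch: wait for the first message, then opportunistically drain
// up to (maxBatch - 1) more without blocking.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchTaker<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    public void offer(T message) {
        queue.offer(message);
    }

    public List<T> takeBatch(int maxBatch) throws InterruptedException {
        List<T> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // wait until at least one element is available
        queue.drainTo(batch, maxBatch - 1); // grab whatever else is already queued
        return batch;
    }
}

Each of the ~50 sender threads then calls takeBatch(10) in a loop, starts the asyncs for the returned messages and waits for them to complete, which matches the pseudocode in the question: a lone message is sent immediately, and bundles only form when the queue is actually backed up.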
I have a requirement where I have to send alerts when a record in the DB is not updated/changed within a specified interval. For example, if a received purchase order isn't processed within one hour, a reminder should be sent to the delivery manager.
The reminder/alert should be sent exactly at the interval (including seconds). If the last modified time is 13:55:45, the alert should be triggered at 14:55:45. There could be millions of rows that need to be tracked.
A simple approach would be to implement a custom scheduler and register all the records with it. But it would have to poll the database for changes every second, which would lead to performance problems.
UPDATE:
Another basic approach would be to create a thread for each record and put it to sleep for one hour, or to use some queuing mechanism with a timeout. But that still has performance problems.
Any thoughts on a better approach to implement this?
Probably using an internal JMS queue would be a better solution; for example, you may want to use the scheduled message feature with HornetQ: http://docs.jboss.org/hornetq/2.2.2.Final/user-manual/en/html/examples.html#examples.scheduled-message
You can ask the broker to publish the alert message after exactly 1 hour. On the other hand, while processing the trading activity you can manually delete this message, meaning that the trading activity has been processed without errors.
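A minimal sketch of the scheduled-message idea with the HornetQ JMS client (queue name and connection setup are illustrative); HornetQ defers delivery based on the _HQ_SCHED_DELIVERY property shown in the linked example.

// Sketch: schedule an "order not processed" reminder for one hour from now.
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class ReminderScheduler {
    static void scheduleReminder(Connection connection, Queue reminderQueue, String orderId) throws Exception {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(reminderQueue);

        TextMessage message = session.createTextMessage("Reminder for order " + orderId);
        long deliveryTime = System.currentTimeMillis() + 60 * 60 * 1000; // one hour from now
        message.setLongProperty("_HQ_SCHED_DELIVERY", deliveryTime);     // HornetQ scheduled delivery

        producer.send(message);
        session.close();
        // If the order is processed before the hour is up, remove or ignore the
        // pending reminder on the consuming side.
    }
}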
Use a Timer for each reminder, i.e., if the last modified time is 17:49:45, the alert should be triggered at 18:49:45. Simply create a dynamic timer schedule for each task so that it fires exactly one hour later.
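A small sketch of that suggestion with java.util.Timer (sendAlert() is a placeholder, and a single shared Timer is assumed; with millions of rows you would want one shared timer thread rather than one Timer per record):

// Sketch: one TimerTask per record, firing exactly one hour after the
// record's last-modified time.
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;

public class ReminderTimer {
    private final Timer timer = new Timer(true); // shared daemon timer thread

    public void scheduleAlert(final String recordId, Date lastModified) {
        long fireAt = lastModified.getTime() + 60 * 60 * 1000; // one hour later, to the second
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                sendAlert(recordId);
            }
        }, new Date(fireAt));
    }

    private void sendAlert(String recordId) {
        // notify the delivery manager that recordId was not processed in time
    }
}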
It is not possible in Java if you really insist on the "real-timeness". In Java you may encounter the garbage collector's stop-the-world phases, and you can never guarantee the exact time.
If an approximate time is also permissible, then use some kind of scheduled queue as proposed in the other answers; if not, then use real-time Java or some native call.
If we can assume that the orders are entered in order of increasing time, then (see the sketch after this list):
You can use a Queue with elements that have the properties time-of-order and order-id.
Each new entry that is added to the DB is also enqueued to this Queue.
You can check the element at the start of the Queue each minute.
When checking the element at the start of the Queue, if an hour has passed from the time-of-order, then search for the entry with order-id in the DB.
If it is found and was not updated, then send a notification; otherwise dequeue it from the Queue.
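A rough sketch of that idea (PendingOrder, wasUpdated() and sendNotification() are illustrative placeholders; the actual DB lookup and notification wiring are left out): enqueue each new order alongside the DB insert, and let one scheduled task inspect the head of the queue every minute.

// Queue-based reminder check: one task looks at the oldest pending order each
// minute and alerts if it is over an hour old and still unprocessed.
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OrderReminderChecker {
    static class PendingOrder {
        final String orderId;
        final long timeOfOrder; // epoch millis when the order was entered
        PendingOrder(String orderId, long timeOfOrder) {
            this.orderId = orderId;
            this.timeOfOrder = timeOfOrder;
        }
    }

    private final Queue<PendingOrder> pending = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService checker = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        checker.scheduleAtFixedRate(this::checkHead, 1, 1, TimeUnit.MINUTES);
    }

    public void orderEntered(String orderId, long timeOfOrder) {
        pending.add(new PendingOrder(orderId, timeOfOrder)); // enqueue alongside the DB insert
    }

    private void checkHead() {
        long oneHour = 60 * 60 * 1000L;
        PendingOrder head;
        while ((head = pending.peek()) != null
                && System.currentTimeMillis() - head.timeOfOrder >= oneHour) {
            pending.poll();
            // Re-read the row; only alert if it still has not been processed.
            if (!wasUpdated(head.orderId)) {
                sendNotification(head.orderId);
            }
        }
    }

    private boolean wasUpdated(String orderId) { return false; } // placeholder DB lookup
    private void sendNotification(String orderId) { /* alert the delivery manager */ }
}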
I'm really new to programming and having performance problems with my software. Basically, I get some data and run a 100-iteration loop on it (i=0; i<100; i++), and during that loop my program makes one of three decisions: keep the data it's working on, discard it, or send a version of it back to the queue to process. The individual work each thread does is very small, but there's a lot of it (which is why I'm using a queue server to scale horizontally).
My problem is that it never comes close to using my entire CPU; my program runs at around 40% per core. After profiling, it seems the majority of the time is spent sending/receiving data from the queue (approx. 64% in a part called com.rabbitmq.client.impl.Frame.readFrom(DataInputStream) and com.rabbitmq.client.impl.SocketFrameHandler.readFrame(), approx. 17% is getting it into the format for the queue (which I brought down from 40%), and the rest is spent on my program's logic). Obviously, I want my work to be done faster and don't want it to spend so much time in the queue, so I'm wondering if there's a better design I can use.
My code is actually quite large, but here's an overview of what it does:
I create a connection to the queue server (RabbitMQ, Java)
I fork as many threads as I have CPU cores (using the same connection)
The data flow for each thread is:
Each thread creates its own channel to the queue server using the shared connection.
There's a while loop that polls the server and gets X messages without acknowledging them
Once I get a message, I use a thread executor to send the acknowledgment while my job is running
I parse the message and run my loop
If data is sent back to the queue, I hand it to a thread executor that sends it back, so my program can proceed with the next data set.
One weird thing I did: although I use thread executors for acknowledgments and for sending back to the queue, my main worker threads are just forked threads (using public void run()). Because my program is dedicated to this single process, I did that to make sure there were always X threads ready to work (and none of them were ever shut down or respawned). The rest is in executor threads because I figured that work could wait or be queued while my main program runs.
I'm not sure how to design it better so it spends less time gathering/sending data. Are there any designs, RabbitMQ features, or Java techniques I can use to help?
If it's not IO wait, then I suspect that it's down to some locking going on inside those methods.
It looks to me like your threads are spending a significant amount of time waiting for them to return. Somewhat counter-intuitively, you might well be able to increase your performance by cutting down on the number of threads, since they'll spend less time tripping over each other and more time actively doing something.
Give it a try and see what effect it has on the profile.