I can see how hashing the field contents and task id together is used to keep all tuples with the same value in a field going to the same bolt task. However, how can this be guaranteed if there is more than one worker? Surely bolt tasks are not shared between workers?
The number of tasks is fixed when a topology is created and never changes. Rebalance commands can change which executors host a task, but tasks are never destroyed, so tuples with the same field values will always go to the same task regardless of which worker process or executor it is on.
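For illustration only (this is a simplification, not Storm's actual code), a fields grouping can be pictured as hashing the grouping field values and taking the result modulo the number of tasks, so equal values always map to the same task index no matter which worker hosts it:

import java.util.Arrays;
import java.util.List;

public class FieldsGroupingSketch {
    // same field values -> same task index, independent of worker/executor placement
    static int chooseTask(List<Object> groupingFieldValues, int numTasks) {
        int hash = Arrays.deepHashCode(groupingFieldValues.toArray());
        return Math.floorMod(hash, numTasks);
    }

    public static void main(String[] args) {
        // "user-42" always lands on the same task index out of 4 tasks
        System.out.println(chooseTask(List.of("user-42"), 4));
        System.out.println(chooseTask(List.of("user-42"), 4));
    }
}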
You may have already seen this but this is a good explanation of Storm's parallelism and provides a bit more detail.
I am currently working on a functionality where I need to make a callback after, let's say, 'n' tasks are completed (or failed). Amongst the 'n' sub-tasks, n-1 tasks will be submitted to one particular thread pool, and I will have access to all of these n-1 futures.
I have no issue in making the callback upon completion of all futures. I intend to pass an atomic monitor to each of these n-1 tasks and initiate the callback based on a count.
Now I'm stuck dealing with the other single thread. Basically this thread is a different execution workflow: after initiating these (n-1) tasks/threads, I will make a separate call to a different application (to perform some long-running task) and will exit. Once the other application is done with its computation, it will push its results through a different endpoint.
So, while the separate service is pushing its result, I have to maintain some sort of context to club all these sub-tasks together. I could solve this using a singleton hash map or a local cache (Guava caches) keyed by a unique ID and using the atomic monitor (shared amongst all the sub-tasks).
Before that, I wanted to know the pros and cons of this approach. Also, I would really appreciate it if you could propose some sort of design pattern or framework for implementing this workflow in an elegant manner.
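A minimal sketch of the shared-counter idea described above; all names (FanInSketch, requestId, onAllDone) are made up for illustration, and a real version would add error handling and expiry for abandoned contexts:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class FanInSketch {
    // one context per request, keyed by a unique ID, so the external endpoint can join in later
    static final ConcurrentHashMap<String, AtomicInteger> CONTEXTS = new ConcurrentHashMap<>();

    static void complete(String requestId, Runnable onAllDone) {
        // the last sub-task (or the external result) to finish triggers the final callback
        if (CONTEXTS.get(requestId).decrementAndGet() == 0) {
            onAllDone.run();
            CONTEXTS.remove(requestId);
        }
    }

    public static void main(String[] args) {
        int n = 4; // n-1 pool tasks plus 1 external result
        String requestId = "req-123";
        CONTEXTS.put(requestId, new AtomicInteger(n));
        Runnable onAllDone = () -> System.out.println("all sub-tasks finished");

        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < n - 1; i++) {
            pool.submit(() -> complete(requestId, onAllDone));
        }
        // the endpoint receiving the external application's result would call:
        complete(requestId, onAllDone);
        pool.shutdown();
    }
}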
I am going through the different concurrency models in a multi-threading environment (http://tutorials.jenkov.com/java-concurrency/concurrency-models.html).
The article highlights three concurrency models.
Parallel Workers
The first concurrency model is what I call the parallel worker model. Incoming jobs are assigned to different workers.
Assembly Line
The workers are organized like workers at an assembly line in a factory. Each worker only performs a part of the full job. When that part is finished the worker forwards the job to the next worker.
Each worker is running in its own thread, and shares no state with other workers. This is also sometimes referred to as a shared nothing concurrency model.
Functional Parallelism
The basic idea of functional parallelism is that you implement your program using function calls. Functions can be seen as "agents" or "actors" that send messages to each other, just like in the assembly line concurrency model (AKA reactive or event driven systems). When one function calls another, that is similar to sending a message.
Now I want to map Java API support to these three concepts:
Parallel Workers: Is it the ExecutorService, ThreadPoolExecutor, CountDownLatch API?
Assembly Line: Sending an event to a messaging system like JMS and using messaging concepts of Queues and Topics?
Functional Parallelism: ForkJoinPool to some extent, and Java 8 streams. ForkJoinPool is easier to understand compared to streams.
Am I correct in mapping these concurrency models? If not please correct me.
Each of those models describes how the work is done/split from a general point of view, but when it comes to implementation, it really depends on your exact problem. Generally I see it like this:
Parallel Workers: a producer creates new jobs somewhere (e.g. in a BlockingQueue) and many threads (via an ExecutorService) process those jobs in parallel, as in the sketch below. Of course, you could also use a CountDownLatch, but that means you want to trigger an action after exactly N subproblems have been processed (e.g. you know your big problem can be split into N smaller problems; check the second example here).
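A minimal sketch of that parallel-worker setup; the job type, pool size, and poison-pill shutdown are illustrative choices:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class ParallelWorkersSketch {
    private static final String POISON = "STOP";

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> jobs = new LinkedBlockingQueue<>();
        int workers = 3;
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // each worker takes jobs from the shared queue until it sees the poison pill
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    String job;
                    while (!(job = jobs.take()).equals(POISON)) {
                        System.out.println(Thread.currentThread().getName() + " processed " + job);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        for (int i = 0; i < 10; i++) jobs.put("job-" + i);   // producer
        for (int i = 0; i < workers; i++) jobs.put(POISON);  // one pill per worker
        pool.shutdown();
    }
}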
Assembly Line: for every intermediate step, you have a BlockingQueue and one Thread or an ExecutorService. At each step the jobs are taken from one BlockingQueue and put into the next one, to be processed further; a sketch follows below. Regarding your idea with JMS: JMS is meant to connect distributed components, is part of Java EE, and was not designed for a highly concurrent context (messages are usually kept on the hard disk before being processed).
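A minimal assembly-line sketch along those lines, with each stage owning a thread and handing work to the next stage through a BlockingQueue; the stage names and poison pill are illustrative:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AssemblyLineSketch {
    private static final String POISON = "STOP";

    public static void main(String[] args) {
        BlockingQueue<String> parsed = new LinkedBlockingQueue<>();
        BlockingQueue<String> enriched = new LinkedBlockingQueue<>();

        new Thread(() -> {                                   // stage 1: produce/parse
            try {
                for (int i = 0; i < 5; i++) parsed.put("order-" + i);
                parsed.put(POISON);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }).start();

        new Thread(() -> {                                   // stage 2: enrich and forward
            try {
                String job;
                while (!(job = parsed.take()).equals(POISON)) enriched.put(job + ":enriched");
                enriched.put(POISON);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }).start();

        new Thread(() -> {                                   // stage 3: final step
            try {
                String job;
                while (!(job = enriched.take()).equals(POISON)) System.out.println("shipped " + job);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        }).start();
    }
}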
Functional Parallelism: ForkJoinPool is a good example of how you could implement this.
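For example, a small RecursiveTask that splits an array sum until the chunks are small enough and joins the partial results; the threshold is an arbitrary illustrative value:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinSumSketch extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] values;
    private final int from, to;

    ForkJoinSumSketch(long[] values, int from, int to) {
        this.values = values; this.from = from; this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {                  // small enough: compute directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += values[i];
            return sum;
        }
        int mid = (from + to) / 2;
        ForkJoinSumSketch left = new ForkJoinSumSketch(values, from, mid);
        ForkJoinSumSketch right = new ForkJoinSumSketch(values, mid, to);
        left.fork();                                   // run the left half asynchronously
        return right.compute() + left.join();          // compute the right half, then join
    }

    public static void main(String[] args) {
        long[] values = new long[100_000];
        for (int i = 0; i < values.length; i++) values[i] = i;
        long sum = ForkJoinPool.commonPool().invoke(new ForkJoinSumSketch(values, 0, values.length));
        System.out.println(sum);
    }
}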
An excellent question to which the answer might not be quite as satisfying. The concurrency models listed show some of the ways you might want to go about implementing a concurrent system. The API provides tools for implementing any of these models.
Let's start with the ExecutorService. It allows you to submit tasks to be executed in a non-blocking way. The ThreadPoolExecutor implementation then limits the maximum number of threads available. The ExecutorService does not require the task to perform the complete process, as you might expect of a parallel worker. The task may be limited to a specific part of the process and send a message upon completion that starts the next step in an assembly line.
The CountDownLatch and the ExecutorService provide a means to block until all workers have completed, which may come in handy if a certain process has been divided into different concurrent sub-tasks.
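A small sketch of that CountDownLatch plus ExecutorService combination; the main thread blocks until all N sub-tasks have counted down (names and counts are illustrative):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LatchSketch {
    public static void main(String[] args) throws InterruptedException {
        int n = 5;
        CountDownLatch done = new CountDownLatch(n);
        ExecutorService pool = Executors.newFixedThreadPool(3);
        for (int i = 0; i < n; i++) {
            int id = i;
            pool.submit(() -> {
                System.out.println("sub-task " + id + " finished");
                done.countDown();              // signal completion
            });
        }
        done.await();                          // block until all n sub-tasks have counted down
        System.out.println("all sub-tasks completed");
        pool.shutdown();
    }
}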
The point of JMS is to provide a means for messaging between components. It does not enforce a specific model for concurrency. Queues and topics denote how a message is sent from a publisher to a subscriber. When you use queues the message is sent to exactly one subscriber. Topics on the other hand broadcast the message to all subscribers of the topic.
Similar behavior could be achieved within a single component by for example using the observer pattern.
ForkJoinPool is actually one implementation of ExecutorService (which might highlight the difficulty of matching a model to an implementation detail). It just happens to be optimized for working with a large number of small tasks.
Summary: There are multiple ways to implement a certain concurrency model in the Java environment. The interfaces, classes and frameworks used in implementing a program may vary regardless of the concurrency model chosen.
The actor model is another example of an assembly line, e.g. Akka.
Suppose I need to execute N tasks in the same thread. The tasks may sometimes need some values from an external storage. I have no idea in advance which task may need such a value and when. It is much faster to fetch M values in one go rather than the same M values in M queries to the external storage.
Note that I cannot expect cooperation from the tasks themselves; they can be considered as nothing more than java.lang.Runnable objects.
Now, the ideal procedure, as I see it, would look like:
1. Execute all tasks in a loop. If a task requests an external value, remember this, suspend the task, and switch to the next one.
2. Fetch the values requested at the previous step, all at once.
3. Remove all completed tasks (suspended ones don't count as completed).
4. If there are still tasks left, go to step 1, but instead of executing a task, continue its execution from the suspended state.
As far as I see, the only way to "suspend" and "resume" something would be to remove its related frames from JVM stack, store them somewhere, and later push them back onto the stack and let JVM continue.
Is there any standard (not involving hacking at lower level than JVM bytecode) way to do this?
Or can you maybe suggest another possible way to achieve this (other than starting N threads or making tasks cooperate in some way)?
It's possible using something like Quasar, which does stack-slicing via an agent. Some degree of cooperation from the tasks is helpful, but it is possible to use AOP to insert suspension points from outside.
(IMO it's better to be explicit about what's going on (using e.g. Future and ForkJoinPool). If some plain code runs on one thread for a while and is then "magically" suspended and jumps to another thread, this can be very confusing to debug or reason about. With modern languages and libraries the overhead of being explicit about the asynchronicity boundaries should not be overwhelming. If your tasks are written in terms of generic types then it's fairly easy to pass-through something like scalaz Future. But that wouldn't meet your requirements as given).
As mentioned, Quasar does exactly that (it usually schedules N fibers on M threads, but you can set M to 1), using bytecode transformations. It even gives each task (AKA "fiber") its own stack trace, so you can dump it and get a complete stack trace without any interference from any other task sharing the thread.
Well, you could try this. You need:
A mechanism to save the current state of the task, because when the task returns, its frame will be popped from the call stack. Based on the return value (or something similar) you can determine whether it completed or not; since you will need to re-execute it from the point where it left off, you need to preserve that state information.
A request data structure for each task. Whenever a task wants to request something, it logs the request there; the data structure should support all the possible requests a task can make.
Store these data structures in a Map. At the end of the loop you can query them to determine the kind of resource required by each task.
Get the resources and put them in the data structures, then restart each task from the state in which it returned. The task queries its data structure and gets the resource; it should use this data structure whenever it wants an external resource.
You would need to design the method through which resources are requested with special care, since when you re-execute the task you will need to call this method yourself so that the task can continue from where it left off.
Hope it helps.
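A rough sketch of that idea; every name (ResumableTask, the request map, the stubbed batch fetch) is hypothetical, and a real version would need the re-entry mechanism described above rather than simply re-running the task from the start:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedResumeSketch {
    interface ResumableTask {
        // returns true when finished; otherwise it has logged what it needs in 'request'
        boolean run(Map<String, String> request, Map<String, String> resources);
    }

    public static void main(String[] args) {
        List<ResumableTask> tasks = new ArrayList<>();
        tasks.add((request, resources) -> {
            if (!resources.containsKey("config")) {
                request.put("need", "config");   // "suspend": ask the driver to fetch "config"
                return false;
            }
            System.out.println("task finished with " + resources.get("config"));
            return true;
        });

        Map<ResumableTask, Map<String, String>> requests = new HashMap<>();
        Map<String, String> resources = new HashMap<>();

        while (!tasks.isEmpty()) {
            List<ResumableTask> remaining = new ArrayList<>();
            for (ResumableTask task : tasks) {
                Map<String, String> request = requests.computeIfAbsent(task, t -> new HashMap<>());
                if (!task.run(request, resources)) {
                    remaining.add(task);          // not done yet; re-run after the batch fetch
                }
            }
            // batch-fetch everything requested in this round (stubbed as a single lookup)
            for (Map<String, String> request : requests.values()) {
                String key = request.get("need");
                if (key != null) {
                    resources.put(key, "value-of-" + key);
                }
            }
            tasks = remaining;
        }
    }
}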
I am a newbie to Storm and have created a program to read incremented numbers for a certain time. I have used a counter in the Spout, and in the nextTuple() method the counter is emitted and incremented:
_collector.emit(new Values(new Integer(currentNumber++)));
/* how this method is being continuously called...*/
and the execute() method of the Bolt has
public void execute(Tuple input) {
    int number = input.getInteger(0);
    logger.info("This number is (" + number + ")");
    _outputCollector.ack(input);
}
/* this part is clear to me, as the Bolt receives the input from the Spout */
In my Main class execution I have the following code
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("NumberSpout", new NumberSpout());
builder.setBolt("NumberBolt", new PrimeNumberBolt())
.shuffleGrouping("NumberSpout");
Config config = new Config();
LocalCluster localCluster = new LocalCluster();
localCluster.submitTopology("NumberTest", config, builder.createTopology());
Utils.sleep(10000);
localCluster.killTopology("NumberTest");
localCluster.shutdown();
The program works perfectly fine. What I am currently looking at is how the Storm framework internally calls the nextTuple() method continuously. I am sure my understanding is missing something here, and due to this gap I am unable to connect to the internal logic of this framework.
Can anyone help me understand this portion clearly? It would be a great help, as I will have to implement this concept in my project, and if I am conceptually clear here then I can make significant progress. I would appreciate it if anyone could assist me quickly. Awaiting responses...
How does the Storm framework internally call the nextTuple() method continuously?
I believe this actually involves a fairly detailed discussion about the entire life cycle of a Storm topology, as well as a clear concept of different entities like workers, executors, tasks, etc. The actual processing of a topology is carried out by the StormSubmitter class with its submitTopology method.
The very first thing it does is start uploading the jar using Nimbus's Thrift interface; it then calls submitTopology, which eventually submits the topology to Nimbus. Nimbus then starts by normalizing the topology (from the doc: the main purpose of normalization is to ensure that every single task will have the same serialization registrations, which is critical for getting serialization working correctly), followed by serialization, ZooKeeper handshaking, supervisor and worker process startup, and so on. It's too broad to discuss here, but if you really want to dig deeper you can go through the lifecycle of a Storm topology, which explains nicely the step-by-step actions performed during the entire time. (A quick note from the documentation:)
First a couple of important notes about topologies:
The actual topology that runs is different than the topology the user specifies. The actual topology has implicit streams and an implicit "acker" bolt added to manage the acking framework (used to guarantee data processing).
The implicit topology is created via the system-topology! function. system-topology! is used in two places:
- when Nimbus is creating tasks for the topology
- in the worker, so it knows where it needs to route messages to
Now here are a few clues I can try to share...
Spouts and Bolts are actually the components which do the real processing (the logic). In Storm terminology they are executed as many tasks across the cluster.
From the doc page: Each task corresponds to one thread of execution.
Now, among many others, one typical responsibility of a worker process (read here) in Storm is to monitor whether a topology is active or not and to store that particular state in a variable named storm-active-atom. This variable is used by the tasks to determine whether or not to call the nextTuple method. So as long as your topology is live (you haven't posted your spout code, but I'm assuming) and your timer is active (as you said, for a certain time), it will keep calling the nextTuple method. You can dig even further into Storm's acking framework implementation to understand how it recognizes and acknowledges a tuple once it is successfully processed; see also Guarantee-message-processing.
I am sure that my understanding is missing something here and due to this gap I am unable to connect to the internal logic of this framework
Having said this, I think it's more important to get a clear understanding of how to work with Storm rather than how Storm works internally at this early stage. For example, instead of learning the internal mechanism of Storm, it is important to realize that if we set up a spout to read a file line by line, then it keeps emitting each line using the _collector.emit method until it reaches EOF, and the bolt connected to it receives those lines in its execute(Tuple input) method.
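For example, a spout along those lines might look roughly like this (a sketch assuming the Storm 2.x org.apache.storm API; the file path and output field name are illustrative):

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Map;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class FileLineSpout extends BaseRichSpout {
    private SpoutOutputCollector _collector;
    private BufferedReader reader;

    @Override
    public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
        try {
            reader = new BufferedReader(new FileReader("/tmp/input.txt")); // illustrative path
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void nextTuple() {
        // Storm's executor calls this in a tight loop; emit one line per call
        try {
            String line = reader.readLine();
            if (line != null) {
                _collector.emit(new Values(line));
            } else {
                Utils.sleep(1); // EOF: sleep briefly instead of spinning the CPU
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("line"));
    }
}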
Hope this helps; share more with us in the future.
Ordinary Spouts
There is a loop in Storm's executor daemon that repeatedly calls nextTuple (as well as ack and fail when appropriate) on the corresponding spout instance.
There is no waiting for tuples to be processed. The spout simply receives fail for tuples that did not manage to be processed within the given timeout.
This can be easily simulated with a topology of a fast spout and a slow processing bolt: the spout will receive a lot of fail calls.
See also the ISpout javadoc:
nextTuple, ack, and fail are all called in a tight loop in a single thread in the spout task. When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.
Trident Spouts
The situation is completely different for Trident-spouts:
By default, Trident processes a single batch at a time, waiting for the batch to succeed or fail before trying another batch. You can get significantly higher throughput – and lower latency of processing of each batch – by pipelining the batches. You configure the maximum amount of batches to be processed simultaneously with the topology.max.spout.pending property.
Even while processing multiple batches simultaneously, Trident will order any state updates taking place in the topology among batches.
Is there a way to assure FIFO (first in, first out) behavior with Task Queues on GAE?
GAE Documentation says that FIFO is one of the factors that affect task execution order, but the same documentation says that “the system's scheduling may 'jump' new tasks to the head of the queue” and I have confirmed this behavior with a test. The effect: my events are being processed out of order.
The docs (https://developers.google.com/appengine/docs/java/taskqueue/overview-push) say:
The order in which tasks are executed depends on several factors:
- The position of the task in the queue. App Engine attempts to process tasks based on FIFO (first in, first out) order. In general, tasks are inserted into the end of a queue, and executed from the head of the queue.
- The backlog of tasks in the queue. The system attempts to deliver the lowest latency possible for any given task via specially optimized notifications to the scheduler. Thus, in the case that a queue has a large backlog of tasks, the system's scheduling may "jump" new tasks to the head of the queue.
- The value of the task's etaMillis property. This property specifies the earliest time that a task can execute. App Engine always waits until after the specified ETA to process push tasks.
- The value of the task's countdownMillis property. This property specifies the minimum number of seconds to wait before executing a task. Countdown and eta are mutually exclusive; if you specify one, do not specify the other.
What do I need to do? In my use case, I'll process 1-2 million events/day coming from vehicles. These events can be sent at any interval (1 second, 1 minute or 1 hour). The order of the event processing has to be assured: I need to process in timestamp order, and the timestamp is generated by an embedded device inside the vehicle.
What do I have now?
A REST servlet that is called by the consumer and creates a Task (the Event data is in the payload).
After this, a worker servlet gets this Task and:
Deserialize Event data;
Put Event on Datastore;
Update Vehicle On Datastore.
So, again, is there any way to assure strictly FIFO behavior? Or how can I improve this solution to achieve it?
You need to approach this with three separate steps:
1. Implement a sharded counter to generate a monotonically increasing ID. As much as I like to use the timestamp from Google's server to indicate task ordering, it appears that timestamps between GAE servers might vary more than your requirement allows.
2. Add your tasks to a Pull Queue instead of a Push Queue. When constructing your TaskOptions, add the ID obtained from step #1 as a tag. After adding the task, store the ID somewhere in your datastore.
3. Have your worker servlet lease tasks by a certain tag from the Pull Queue. Query the datastore to get the earliest ID that you need to fetch, and use that ID as the lease tag. In this way, you can simulate FIFO behavior for your task queue.
After you have finished your processing, delete the ID from your datastore, and don't forget to delete the task from your Pull Queue too. Also, I would recommend you run your task consumption on a Backend.
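A rough sketch of steps 2 and 3 with the App Engine pull-queue API, to the best of my recollection of that API; the queue name, tag format, and payload handling are illustrative, and the datastore query for the earliest ID is omitted:

import java.util.List;
import java.util.concurrent.TimeUnit;

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskHandle;
import com.google.appengine.api.taskqueue.TaskOptions;

public class FifoPullQueueSketch {

    // Step 2: add the event to a pull queue, tagged with the ordering ID
    static void enqueue(long orderingId, byte[] eventPayload) {
        Queue queue = QueueFactory.getQueue("vehicle-events"); // illustrative queue name
        queue.add(TaskOptions.Builder
                .withMethod(TaskOptions.Method.PULL)
                .tag(Long.toString(orderingId))
                .payload(eventPayload));
    }

    // Step 3: the worker looks up the earliest outstanding ID (datastore query not shown)
    // and leases only tasks carrying that tag
    static void processNext(long earliestId) {
        Queue queue = QueueFactory.getQueue("vehicle-events");
        List<TaskHandle> tasks = queue.leaseTasksByTag(30, TimeUnit.SECONDS, 1, Long.toString(earliestId));
        for (TaskHandle task : tasks) {
            // ... deserialize task.getPayload(), store the Event, update the Vehicle ...
            queue.deleteTask(task); // and remove the ID from the datastore afterwards
        }
    }
}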
UPDATE:
As noted by Nick Johnson and mjaggard, sharding in step #1 doesn't seem to be viable for generating monotonically increasing IDs, and another source of IDs would then be needed. I seem to recall you were using timestamps generated by your vehicles; would it be possible to use these in lieu of a monotonically increasing ID?
Regardless of the way the IDs are generated, the basic idea is to use the datastore's query mechanism to produce a FIFO ordering of tasks, and use the task tag to pull specific tasks from the TaskQueue.
There is a caveat, though. Due to the eventual consistency read policy on high-replication datastores, if you choose HRD as your datastore (and you should, the M/S is deprecated as of April 4th, 2012), there might be some stale data returned by the query on step #2.
I think the simple answer is "no", however partly in order to help improve the situation, I am using a pull queue - pulling 1000 tasks at a time and then sorting them. If timing isn't important, you could sort them and put them into the datastore and then complete a batch at a time. You've still got to work out what to do with the tasks at the beginning and ends of the batch - because they might be out of order with interleaving tasks in other batches.
Ok. This is how I've done it.
1) Rest servlet that is called from the consumer:
If the Event sequence doesn't match the Vehicle sequence (from the datastore)
    create a task on a "wait" queue to call me again
else
    validate state
    create a task on the "regular" queue (Event data is in the payload)
2) A worker servlet gets the task from the "regular" queue, and so on... (same pseudo code)
This way I can pause the "regular" queue in order to do a data maintenance without losing events.
Thank you for your answers. My solution is a mix of them.
You can put the work to be done in rows in the datastore with a create timestamp and then fetch work tasks by that timestamp, but if your tasks are created too quickly you will run into latency issues.
I don't know the answer myself, but it may be possible that tasks enqueued using a deferred function execute in the order submitted. Likely you will need an engineer from G. to get an answer. Pull queues as suggested seem a good alternative, plus this would allow you to consider batching your put()s.
One note about sharded counters: they increase the probability of monotonically increasing ids, but do not guarantee them.
The best way to handle this, the distributed way or "App Engine way", is probably to modify your algorithm and data collection to work with just a timestamp, allowing arbitrary ordering of tasks.
Assuming this is not possible or too difficult, you could modify your algorithm as follows:
when creating the task, don't put the data in the payload but in the datastore, in a Kind with an ordering on timestamps, stored as a child entity of whatever entity you're trying to update (Vehicle?). The timestamps should come from the client, not the server, to guarantee the same ordering.
run a generic task that fetches the data for the first timestamp, processes it, and then deletes it, inside a transaction.
Following this thread, I am unclear as to whether the strict FIFO requirement is for all transactions received or on a per-vehicle basis. The latter has more options than the former.