I have an issue that's very weird to me.
I'm writing a multithreaded client-server framework, and so far, it works pretty well, except one thing.
Consider the following image:
The client can request tasks, which are added to a queue. If any elements are in this queue, they are polled and added to an "executing" queue. The task in question is then executed on a separate thread using an ExecutorService.
The "executing" queue is checked for tasks that has completed whatever it is they're doing, and moved to a "completed" queue. This queue is checked, and replies are dispatched to the appropriate clients.
This all works, except when more than one task is running in the system.
Each task is held in a TaskRequest object, and each task can refer back to its "host" TaskRequest. However, it appears as though the reference held by the task is different from the reference to the... well... actual TaskRequest.
On the image, I've highlighted the TaskRequest and the resultBag to show they have different addresses and IDs.
This, as I mentioned, is only the case when more than one task is in the system, and it is beyond flabbergasting to me.
The complete field is not updating, despite my having checked this by printing the value of the variable after it is set.
Why is the "host" object not updating?
Below is the code for the classes in question, Pastebinned to reduce space.
TaskRequest code
TaskBase code (extended by other tasks)
TaskQueue code (keeps track of all the tasks: requested, executing, and completed, respectively)
TaskExecutor code (runs over the TaskQueue instance, executing tasks, etc.)
I am sorry for posting a bit of a wall of text.
Related
I have a task worker written in Java and using a MongoDB 3.4 replica set that runs many threads each doing essentially this.
Run task
Signal that task is complete by updating a document for that task in MongoDB
Run a query to see if all the tasks in this set of tasks are done
If so, continue to next stage of processing
Otherwise, do nothing
As you may be able to see, there is a race condition here; multiple tasks can all finish at about the same time and think that they are the last task to complete. I want to use MongoDB to make sure only one of those tasks is allowed to start the next stage of processing.
I have the following code that is meant to ensure that only one of those tasks can continue (I'm using Jongo to interface with MongoDB).
Chipset modified = chipsets
.findAndModify("{_id: #, status: {$ne: #}}", new Object[] { chipset.getId(), Chipset.Status.Queued })
.with("{$set: {status: #}}", new Object[] { Chipset.Status.Queued })
.returnNew().as(Chipset.class);
if (modified != null)
    runNextProcessingStep();
Pretty simple here; I'm just using findAndModify to change the status of the Chipset (set of tasks) to Queued. The one that successfully makes the change gets to execute runNextProcessingStep().
Or that's how I think it should work. In reality, several tasks, even ones that finish 2 seconds apart, are somehow getting back a non-null modified. As I understand it, MongoDB should lock the document when running findAndModify, so that a non-null document can be returned no more than once.
I've read Linearizable Reads via findAndModify and have implemented everything said in there. I've set the connection write concern to Majority and the read concern to Linearizable. I've created a unique composite index on _id and status. Still nothing. Perhaps I have misunderstood how findAndModify actually behaves? What am I doing wrong?
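In case it helps, this is the kind of minimal two-thread check I would expect to demonstrate the guarantee (a hypothetical harness, reusing the same Jongo call as above):

Runnable attempt = () -> {
    // Same findAndModify as above; if it is atomic, exactly one of the
    // two threads should get a non-null result for the same chipset.
    Chipset modified = chipsets
            .findAndModify("{_id: #, status: {$ne: #}}",
                    new Object[] { chipset.getId(), Chipset.Status.Queued })
            .with("{$set: {status: #}}", new Object[] { Chipset.Status.Queued })
            .returnNew().as(Chipset.class);
    System.out.println(Thread.currentThread().getName()
            + ": " + (modified != null ? "won" : "lost"));
};
new Thread(attempt, "task-1").start();
new Thread(attempt, "task-2").start();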
Well, this is embarrassing but in the interest of being a good internet citizen I'll update this with what happened. There was another thread that was changing statuses out from under me. I had convinced myself this couldn't be the case but, well, concurrency can be a real pain sometimes. findAndModify works exactly how I thought it should.
I can see how hashing the field contents and task ID together is used to keep all tuples with the same value in a field going to the same bolt task. However, how can this be guaranteed if there is more than one worker? Surely bolt tasks are not shared between workers?
The number of tasks is fixed when a topology is created and never changes. Rebalance commands can change which executors host a task, but tasks are never destroyed, so tuples with the same field values will always go to the same task regardless of which worker process or executor it is on.
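Conceptually, fields grouping picks the task index from the grouped field values alone, along the lines of this sketch (not Storm's actual implementation):

import java.util.Arrays;
import java.util.List;

// Conceptual sketch: the target task depends only on the grouped field
// values and the fixed total task count, so it is stable no matter which
// worker or executor currently hosts that task.
int chooseTask(List<Object> groupedFieldValues, int numTasks) {
    return Math.floorMod(Arrays.deepHashCode(groupedFieldValues.toArray()), numTasks);
}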
You may have already seen this, but this is a good explanation of Storm's parallelism and provides a bit more detail.
Suppose I need to execute N tasks in the same thread. The tasks may sometimes need some values from an external storage. I have no idea in advance which task may need such a value and when. It is much faster to fetch M values in one go rather than the same M values in M queries to the external storage.
Note that I cannot expect cooperation from the tasks themselves; they can be considered nothing more than java.lang.Runnable objects.
Now, the ideal procedure, as I see it, would look like this (sketched in code after the list):
Execute all tasks in a loop. If a task requests an external value, remember this, suspend the task and switch to the next one.
Fetch the values requested at the previous step, all at once.
Remove all completed tasks (suspended ones don't count as completed).
If there are still tasks left, go to step 1, but instead of executing a task, continue its execution from the suspended state.
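In rough Java, the loop I have in mind would be something like this (Task, runUntilBlockedOrDone(), storage.fetchAll() and the rest are made up; implementing the suspend/resume part is exactly the problem):

import java.util.*;

List<Task> tasks = new ArrayList<>(initialTasks);
while (!tasks.isEmpty()) {
    Set<String> requestedKeys = new HashSet<>();
    for (Task t : tasks) {
        t.runUntilBlockedOrDone();                  // steps 1 and 4
        if (t.isSuspended()) {
            requestedKeys.add(t.requestedKey());
        }
    }
    Map<String, Object> values = storage.fetchAll(requestedKeys); // step 2: one batch
    tasks.removeIf(Task::isDone);                   // step 3
    for (Task t : tasks) {
        t.supply(values.get(t.requestedKey()));     // feed value before resuming
    }
}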
As far as I can see, the only way to "suspend" and "resume" something would be to remove its frames from the JVM stack, store them somewhere, and later push them back onto the stack and let the JVM continue.
Is there any standard (not involving hacking at lower level than JVM bytecode) way to do this?
Or can you maybe suggest another possible way to achieve this (other than starting N threads or making tasks cooperate in some way)?
It's possible using something like Quasar, which does stack-slicing via an agent. Some degree of cooperation from the tasks is helpful, but it is possible to use AOP to insert suspension points from outside.
(IMO it's better to be explicit about what's going on (using e.g. Future and ForkJoinPool). If some plain code runs on one thread for a while and is then "magically" suspended and jumps to another thread, this can be very confusing to debug or reason about. With modern languages and libraries the overhead of being explicit about the asynchronicity boundaries should not be overwhelming. If your tasks are written in terms of generic types then it's fairly easy to pass-through something like scalaz Future. But that wouldn't meet your requirements as given).
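For example, with plain java.util.concurrent, the explicit version could look something like this (a sketch; fetchFromStorage is a hypothetical method returning a CompletableFuture):

import java.util.concurrent.*;

// The asynchronicity boundary is visible in the types: a task that needs
// an external value returns a future instead of being silently suspended.
ExecutorService single = Executors.newSingleThreadExecutor();

CompletableFuture<String> result =
        CompletableFuture.supplyAsync(() -> "key-to-fetch", single)
                .thenCompose(key -> fetchFromStorage(key))    // explicit suspension point
                .thenApplyAsync(value -> "processed: " + value, single);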
As mentioned, Quasar does exactly that (it usually schedules N fibers on M threads, but you can set M to 1), using bytecode transformations. It even gives each task (AKA "fiber") its own stack trace, so you can dump it and get a complete stack trace without any interference from any other task sharing the thread.
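From memory, the Quasar version looks roughly like this (check the exact signatures against the Quasar release you use):

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.fibers.SuspendExecution;
import co.paralleluniverse.strands.SuspendableRunnable;

// A fiber runs like a thread but is suspended/resumed by stack-slicing,
// so many fibers can share a single OS thread.
new Fiber<Void>(new SuspendableRunnable() {
    @Override
    public void run() throws SuspendExecution, InterruptedException {
        // task body; blocking fiber operations suspend only this fiber
    }
}).start();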
Well, you could try this.
You need:
A mechanism to save the current state of the task, because when the task returns, its frame will be popped from the call stack. Based on the return value (or something similar) you can determine whether it completed or not. Since you would need to re-execute it from the point where it left off, you need to preserve that state information.
Create a request data structure for each task. Whenever a task wants to request something, it logs it there. The data structure should support every possible request a task can make.
Store these DSs in a Map. At the end of the loop you can query each DS to determine the kind of resource required by each task.
Get the resource and put it in the DS. Start the task from the state in which it returned.
The task queries the DS and gets the resource.
The task should use this DS whenever it wants to use an external resource.
You would need to design the method by which a resource is requested with special consideration, since when you re-execute the task you will need to call this method yourself so that the task can continue from where it left off. A sketch of the idea follows below.
*DS -> Data Structure
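Something like this is the shape I mean (all names hypothetical):

import java.util.HashMap;
import java.util.Map;

// Hypothetical "DS": each task records what it asked for and where it
// stopped; the scheduler fills in the resource and re-runs the task.
class RequestDS {
    String requestedResource; // what the task asked for, if anything
    Object resource;          // filled in by the scheduler before the re-run
    int resumePoint;          // state the task should continue from
    boolean completed;        // set by the task when it finishes for real
}

Map<Runnable, RequestDS> requests = new HashMap<>();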
Hope it helps.
From the Android Handler class documentation,
http://developer.android.com/reference/android/os/Handler.html#postAtTime(java.lang.Runnable, long),
the description of the method postAtTime is: "Causes the Runnable r to be added to the message queue, to be run at a specific time given by uptimeMillis. The time-base is uptimeMillis()." The parameter uptimeMillis is "The absolute time at which the callback should run, using the uptimeMillis() time-base". My question: if there are still runnables/messages that need to be run when postAtTime triggers, will those runnables/messages just be discarded (removed from the queue)? This comes from my experience with queues: you only have access to the front of the queue, so I am assuming that specific message/runnable gets moved to the front of the queue. What happens to all the ones it skips over? The API docs don't address this.
Think of the time parameter as "no earlier than", not as an exact time.
The runnable is put into the queue and becomes eligible to run at the specified time. It is actually run only after any messages ahead of it in the queue have finished processing.
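For example (standard Handler API; times are in the uptimeMillis() time-base):

import android.os.Handler;
import android.os.Looper;
import android.os.SystemClock;
import android.util.Log;

// Eligible to run no earlier than 5 seconds from now. Runnables already
// in the queue with earlier times still run first; nothing is discarded.
Handler handler = new Handler(Looper.getMainLooper());
handler.postAtTime(
        () -> Log.d("Demo", "ran at " + SystemClock.uptimeMillis()),
        SystemClock.uptimeMillis() + 5000);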
For further details, you can read the source.
Even after reading http://krondo.com/?p=1209 or Does an asynchronous call always create/call a new thread? I am still confused about how to provide asynchronous calls on an inherently single-threaded system. I will explain my understanding so far and point out my doubts.
One of the examples I read described a TCP server providing asynchronous processing of requests: a user would call a method, e.g. get(Callback c), and the callback would be invoked some time later. Now, my first issue here: we already have two systems, one server and one client. This is not what I mean, because in fact we have at least two threads, one on the server and one on the client side.
The other example I read was JavaScript, as Node.js is the most prominent example of a single-threaded async system. What I cannot get through my head, maybe because I'm thinking in Java terms, is this: if I execute the code below (apologies for incorrect, probably atrocious syntax):
function foo() {
    read_file(file_location, callback); // asynchronous call, does not block
    // do many things more here, potentially for hours
}
the call to read_file executes (something) and returns, allowing the rest of my function to execute. Since there is only one thread, i.e. the one that is executing my function, how on earth will that same thread ever get to read the bytes in from disk?
Basically, it seems to me I am missing some underlying mechanism that acts like a round-robin scheduler of some sort, which is inherently single-threaded and might split the tasks into smaller ones, or call into multithreaded components that would spawn a thread and read the file in.
Thanks in advance for all comments and pointing out my mistakes on the way.
Update: Thanks for all responses. Further good sources that helped me out with this are here:
http://www.html5rocks.com/en/tutorials/async/deferred/
http://lostechies.com/johnteague/2012/11/30/node-js-must-know-concepts-asynchrounous/
http://www.interact-sw.co.uk/iangblog/2004/09/23/threadless (.NET)
http://ejohn.org/blog/how-javascript-timers-work/ (intrinsics of timers)
http://www.mobl-lang.org/283/reducing-the-pain-synchronous-asynchronous-programming/
The real answer is that it depends on what you mean by "single thread".
There are two approaches to multitasking: cooperative and interrupt-driven. Cooperative, which is what the other StackOverflow item you cited describes, requires that routines explicitly relinquish ownership of the processor so it can do other things. Event-driven systems are often designed this way. The advantage is that it's a lot easier to administer and avoids most of the risks of conflicting access to data, since only one chunk of your code is ever executing at any one time. The disadvantage is that, because only one thing is being done at a time, everything has to either be designed to execute fairly quickly or be broken up into chunks that do so (via explicit pauses like a yield() call), or the system will appear to freeze until that event has been fully processed.
The other approach -- threads or processes -- actively takes the processor away from running chunks of code, pausing them while something else is done. This is much more complicated to implement, and requires more care in coding since you now have the risk of simultaneous access to shared data structures, but is much more powerful and -- done right -- much more robust and responsive.
Yes, there is indeed a scheduler involved in either case. In the former version the scheduler is just spinning until an event arrives (delivered from the operating system and/or runtime environment, which is implicitly another thread or process) and dispatches that event before handling the next to arrive.
The way I think of it in JavaScript is that there is a Queue which holds events. In the old Java producer/consumer parlance, there is a single consumer thread pulling stuff off this queue and executing every function registered to receive the current event. Events such as asynchronous calls (AJAX requests completing), timeouts, or mouse events get pushed onto the Queue as soon as they happen. The single "consumer" thread pulls them off the queue, locates any interested functions, and then executes them; it cannot get to the next event until it has finished invoking all the functions registered on the current one. Thus if you have a handler that never completes, the Queue just fills up; it is said to be "blocked".
The system has more than one thread (it has at least one producer and one consumer), since something generates the events that go on the queue. But as the author of the event handlers, you need to be aware that events are processed on a single thread: if you go into a tight loop, you will lock up the only consumer thread and make the system unresponsive.
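In that old producer/consumer parlance, a toy version of the loop might look like this (a sketch, not how any real JavaScript engine is implemented):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One consumer thread runs handlers to completion, one at a time;
// producers (I/O, timers, user input) add events from other threads.
BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();

Thread consumer = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        try {
            Runnable handler = eventQueue.take(); // blocks until an event arrives
            handler.run();  // nothing else runs until this handler returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
});
consumer.start();

// A producer thread delivering a completed-I/O event:
eventQueue.add(() -> System.out.println("fileContents ready"));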
So in your example:
function foo() {
    read_file(location, function(fileContents) {
        // called with the fileContents when the file has been read
    });
    // do many things more here, potentially for hours
}
If you do as your comment says and execute for potentially hours, the callback which handles fileContents will not fire for hours, even though the file has been read. As soon as you hit the last } of foo(), the consumer thread is done with this event and can process the next one, where it will execute the registered callback with the file contents.
HTH