The application is supposed to process (VAD, loudness, clipping) a lot of sound files (e.g. 100k). At the moment, I create as many worker tasks (Callables) as fit into memory, run them all with threadPool.invokeAll(), write the results to the file system, unload the processed files and continue at step 1. Because it's an app with a GUI, I don't want the user to feel like the app "is not responding" while it processes all the sound files (which it does at the moment, because invokeAll is blocking). I'm not sure what a "good" way to fix this is. The user should not be able to do other things while processing runs, but I'd like to show a progress bar like "10 of 100000 sound files are done". So how do I get there? Do I have to create a "watcher thread" so that every worker holds a callback to it? I'm quite new to multithreading and don't understand how such a mechanism would work.
If you need to know: I'm using SWT/JFace.
You could use an ExecutorCompletionService for this purpose; if you submit each of the Callable tasks in a loop, you can then call the take method of the completion service - receiving tasks one at a time as they finish. Every time you take a task, you can update your GUI.
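A minimal sketch of that idea, assuming the work per file can be wrapped in a Callable. The class name CompletionProgressSketch, the runAll helper and the onProgress callback are made up for illustration; only ExecutorCompletionService and its submit/take methods come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntConsumer;

class CompletionProgressSketch {

    /**
     * Runs all tasks on a fixed pool and calls onProgress(doneCount) each time
     * one of them finishes. Results are returned in completion order.
     */
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads, IntConsumer onProgress) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            CompletionService<T> completion = new ExecutorCompletionService<>(pool);
            for (Callable<T> task : tasks) {
                completion.submit(task);
            }
            List<T> results = new ArrayList<>();
            for (int done = 1; done <= tasks.size(); done++) {
                // take() blocks until the next task has finished.
                results.add(completion.take().get());
                onProgress.accept(done);
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int id = i;
            tasks.add(() -> "file-" + id);   // stand-in for analysing one sound file
        }
        runAll(tasks, 2, done -> System.out.println(done + " of " + tasks.size() + " done"));
    }
}
```

Note that runAll itself still blocks, so in a GUI it would have to run on a background thread, with the callback forwarding to the UI thread (in SWT, for example, via Display.asyncExec).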
As another option, you could implement your own ExecutorService that is also an Observable, allowing the publication of updates to subscribing Observers whenever a task is completed.
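One possible way to get such notifications without writing an ExecutorService from scratch is to subclass ThreadPoolExecutor and use its afterExecute hook. The sketch below is only an illustration: NotifyingExecutor and addObserver are invented names, and plain IntConsumer callbacks stand in for Observer objects.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

/** Hypothetical pool that tells subscribers how many tasks have finished so far. */
class NotifyingExecutor extends ThreadPoolExecutor {
    private final List<IntConsumer> observers = new CopyOnWriteArrayList<>();
    private final AtomicInteger finished = new AtomicInteger();

    NotifyingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    void addObserver(IntConsumer observer) {
        observers.add(observer);
    }

    /** Invoked by ThreadPoolExecutor on the worker thread after each task ends. */
    @Override
    protected void afterExecute(Runnable task, Throwable failure) {
        super.afterExecute(task, failure);
        int count = finished.incrementAndGet();
        for (IntConsumer observer : observers) {
            observer.accept(count);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        NotifyingExecutor pool = new NotifyingExecutor(2);
        pool.addObserver(count -> System.out.println(count + " tasks finished"));
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> { /* process one file here */ });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

The observers are called on worker threads, so a GUI observer would still have to hand the update over to the UI thread itself.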
You should have a look at SwingWorker. It's a good class for doing lengthy operations whilst reporting back progress to the gui and maintaining a responsive gui.
Using a Swing Worker Thread provides some good information.
I am building a fitness app which continually logs activity on the device. I need to log quite often, but I also don't want to unnecessarily drain the battery of my users which is why I am thinking about batching network calls together and transmitting them all at once as soon as the radio is active, the device is connected to a WiFi or it is charging.
I am using a filesystem based approach to implement that. I persist the data first to a File - eventually I might use Tape from Square to do that - but here is where I encounter the first issues.
I am continually writing new log data to the File, but I also need to periodically send all the logged data to my backend. When that happens I delete the contents of the File. The problem now is how can I prevent both of those operations from happening at the same time? Of course it will cause problems if I try to write log data to the File at the same time as some other process is reading from the File and trying to delete its contents.
I am thinking about using an IntentService to essentially act as a queue for all those operations. And since, as far as I have read, an IntentService handles Intents sequentially in a single worker Thread, it shouldn't be possible for two of those operations to happen at the same time, right?
Currently I want to schedule a PeriodicTask with the GcmNetworkManager which would take care of sending the data to the server. Is there any better way to do all this?
1) You are overthinking this whole thing!
Your approach is way more complicated than it has to be! And for some reason none of the other answers point this out, but GcmNetworkManager already does everything you are trying to implement! You don't need to implement anything yourself.
2) Optimal way to implement what you are trying to do.
You don't seem to be aware that GcmNetworkManager already batches calls in the most battery-efficient way, with automatic retries etc. It also persists the tasks across device boots and can ensure their execution as soon as it is battery efficient and required by your app.
Whenever you have data to save, just schedule a OneoffTask like this:
final OneoffTask task = new OneoffTask.Builder()
        // The Service which executes the task.
        .setService(MyTaskService.class)
        // A tag which identifies the task.
        .setTag(TASK_TAG)
        // Sets a time frame for the execution of this task in seconds.
        // This specifically means that the task can either be
        // executed right now, or must have executed at the latest in one hour.
        .setExecutionWindow(0L, 3600L)
        // Task is persisted on the disk, even across boots.
        .setPersisted(true)
        // Unmetered connection required for task.
        .setRequiredNetwork(Task.NETWORK_STATE_UNMETERED)
        // Attach data to the task in the form of a Bundle.
        .setExtras(dataBundle)
        // If you set this to true and this task already exists
        // (just depends on the tag set above) then the old task
        // will be overwritten with this one.
        .setUpdateCurrent(true)
        // Sets if this task should only be executed when the device is charging.
        .setRequiresCharging(false)
        .build();
mGcmNetworkManager.schedule(task);
This will do everything you want:
The Task will be persisted on the disk
The Task will be executed in a batched and battery efficient way, preferably over Wifi
You will have configurable automatic retries with a battery efficient backoff pattern
The Task will be executed within a time window you can specify.
I suggest for starters you read this to learn more about the GcmNetworkManager.
So to summarize:
All you really need to do is implement your network calls in a Service extending GcmTaskService. Later, whenever you need to perform such a network call, you schedule a OneoffTask and everything else will be taken care of for you!
Of course you don't need to call each and every setter of the OneoffTask.Builder like I do above; I just did that to show you all the options you have. In most cases scheduling a task would just look like this:
mGcmNetworkManager.schedule(new OneoffTask.Builder()
        .setService(MyTaskService.class)
        .setTag(TASK_TAG)
        .setExecutionWindow(0L, 300L)
        .setPersisted(true)
        .setExtras(bundle)
        .build());
And if you put that in a helper method or, even better, create factory methods for all the different tasks you need, then everything you were trying to do should boil down to a few lines of code!
And by the way: Yes, an IntentService handles every Intent one after another, sequentially, in a single worker Thread. You can look at the relevant implementation here. It's actually very simple and quite straightforward.
All UI and Service methods are by default invoked on the same main thread. Unless you explicitly create threads or use AsyncTask there is no concurrency in an Android application per se.
This means that all intents, alarms and broadcasts are by default handled on the main thread.
Also note that doing I/O and/or network requests may be forbidden on the main thread (depending on Android version, see e.g. How to fix android.os.NetworkOnMainThreadException?).
Using AsyncTask or creating your own threads will introduce concurrency problems, but they are the same as in any multi-threaded program; there is nothing special about Android there.
One more point to consider when doing concurrency is that background threads need to hold a WakeLock or the CPU may go to sleep.
Just some idea.
You could use a serial executor for your file operations, so that only one of them can execute at a time.
http://developer.android.com/reference/android/os/AsyncTask.html#SERIAL_EXECUTOR
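To illustrate the serial-executor idea without any Android dependency, here is a rough sketch in plain Java that uses a single-thread executor from java.util.concurrent in place of AsyncTask.SERIAL_EXECUTOR. SerialLogFile, append and drain are hypothetical names; the point is only that "append a line" and "read everything, then clear" are queued on the same single thread and therefore can never overlap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical log buffer: every file operation runs on one worker thread, in submission order. */
class SerialLogFile {
    private final ExecutorService serial = Executors.newSingleThreadExecutor();
    private final Path file;

    SerialLogFile(Path file) {
        this.file = file;
    }

    /** Queues an append and returns immediately. */
    Future<?> append(String line) {
        return serial.submit(() -> {
            try {
                Files.write(file, Collections.singletonList(line), StandardCharsets.UTF_8,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Queues "read everything, then empty the file" as a single step. */
    Future<List<String>> drain() {
        return serial.submit(() -> {
            if (!Files.exists(file)) {
                return Collections.<String>emptyList();
            }
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            Files.write(file, new byte[0]);   // truncate after reading
            return lines;
        });
    }

    void close() {
        serial.shutdown();
    }

    public static void main(String[] args) throws Exception {
        SerialLogFile log = new SerialLogFile(Files.createTempFile("activity", ".log"));
        log.append("steps=120");
        log.append("steps=95");
        System.out.println(log.drain().get());   // both lines, in submission order
        log.append("steps=40");
        System.out.println(log.drain().get());   // only the line appended after the first drain
        log.close();
    }
}
```

The lines returned by drain() are what would be sent to the backend; if the upload can fail, they would need to be written back or kept elsewhere, which this sketch does not cover.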
I have a question about the use of threads in a GUI application. Say (as a simplistic example) I have a Swing application with a series of images. I have two threads I want to run, each fetching an image of one parent. (So for a given number of students, get a mother image and a father image from each server endpoint.) The returned images of the father and the mother are then appended to the images on screen, so I end up with a series of images (mother, father, mother, father) for multiple students.
How can I schedule this in a multithreaded environment? Each call to get a mother or father image has to run in parallel and must not block the displaying of the images on screen. Does the image displayed on the screen refresh after each thread returns an image? How would this be structured?
Start with Concurrency in Swing.
The absolute simplest approach might be to use a SwingWorker that has a list of items it needs to look up and allow it to process the list.
The problem with this is that it will only run the requests one after the other, making it a little slower than other options. The benefit is that it provides easy functionality to re-sync with the Event Dispatching Thread so that you can notify the UI or make changes to it safely.
Another option might be to use Executors, in particular a Thread Pool implementation.
This allows you to submit a number of tasks that should be executed at some time in the future, but allows you to control the number of threads that the process can use at any one time.
The drawback is that you become responsible for syncing the changes back to the UI yourself when you want to update it, using SwingUtilities.invokeLater.
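A rough sketch of the thread pool variant, under a few assumptions: it chains the two steps with CompletableFuture rather than calling invokeLater by hand, and the UI thread is passed in as a plain Executor so the example runs without a window. In a Swing application that argument would be SwingUtilities::invokeLater. PoolToUiSketch, fetchAll and the key strings are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;

class PoolToUiSketch {

    /**
     * Runs fetch(key) for every key on the pool and hands each result to
     * display through the ui executor as soon as that result is available.
     */
    static <K, V> CompletableFuture<Void> fetchAll(List<K> keys, Function<K, V> fetch,
            Consumer<V> display, Executor pool, Executor ui) {
        List<CompletableFuture<Void>> pending = keys.stream()
                .map(key -> CompletableFuture
                        .supplyAsync(() -> fetch.apply(key), pool)   // runs on a pool thread
                        .thenAcceptAsync(display, ui))               // runs via the UI executor
                .collect(Collectors.toList());
        return CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Executor ui = Runnable::run;   // stand-in; a Swing app would pass SwingUtilities::invokeLater

        Function<String, String> fetch = key -> "image for " + key;   // stand-in for the server call
        Consumer<String> display = image -> System.out.println("showing " + image);

        fetchAll(Arrays.asList("mother-1", "father-1", "mother-2", "father-2"),
                fetch, display, pool, ui).join();
        pool.shutdown();
    }
}
```

Each image is displayed as soon as its own request returns, so the order on screen is not guaranteed; the pool size limits how many requests are in flight at once.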
Now. You "could" use both.
Basically you would need to set up some kind of "request" class that would allow you to pass the relevant information to, for example, the "mother" and "father" servers, the original image and possibly some kind of callback interface that would tell you when the final image had been rendered.
The requester would build some kind of Runnable or Callable which would wrap a SwingWorker.
When executed, this "request task" would start the SwingWorker, allowing it to fetch the images, merge them and publish the results, which would notify the callback interface. The "request task" would then simply wait until SwingWorker#get returns before exiting.
As an idea...
I'm developing a big project on Java Swing. It has a database connection, external devices managing and sd-cards processing.
I currently have a lot of heavy processes that run on the EDT, and creating separate threads for all of them is a long, long task that I'm trying to avoid... Besides, it would probably introduce a lot of concurrency problems that I am not willing to handle.
The thing is that I want to introduce a loading JLabel with a loading gif while the long busy tasks are being processed. It is also important to highlight that I want my whole swing interface to be blocked while the long tasks are being done, just like it happens now, EXCEPT for the loading label.
Is there a way to update that label from another thread?
If you care about creating a good user experience, there really is no escaping SwingWorker or something similar to offload work from the event dispatch thread (EDT). If you really need to "block" the UI, you should use a JDialog with a progress bar or similar.
The short answer to your question is no. The JLabel must be instantiated and added from the EDT.
However, you should be able to add the JLabel; you just have to make sure you do it before the long-running blocking task starts, and then remove it once the task is done.
Anyway, this is a hack and a lazy workaround for doing the right thing, and it is not recommended. You may well spend more time working around the issue and pulling your hair out than you would doing it properly with SwingWorkers.
I'm working on a project that does some intense math calculations (arrays of matrices, vectors, etc.), so naturally I'm splitting the work into jobs, and submitting them to a CompletionService to perform the work in parallel.
Each of the job objects can fire events to notify applications when the job starts, ends, progresses, and/or fails.
Currently, each of the jobs receives a handle to the entire list of event listeners and simply iterates through it, passing an event object to each one (in the same thread). This doesn't sit well with me, so I'd like to get other people's experience with doing this sort of thing with custom events/listeners.
Should I send my events to the GUI thread? Some of the listeners may or may not be GUI-related, and I'd like to not force users of my code to have to manually send their events onto the GUI thread, something like the following:
import javax.swing.SwingUtilities;

public class MyFooEventListener implements FooEventListener {
    @Override
    public void notifyJobStarted(FooEvent evt) {
        // I want to avoid having users of my library write the following, right?
        SwingUtilities.invokeLater(new Runnable() {
            @Override
            public void run() {
                // update GUI here.
            }
        });
    }
}
I wouldn't mind writing my own EventQueue, as this is for a research project in school, and I suppose it would be a good exercise in concurrency. Just trying to figure out what the "proper" way of implementing an event-driven system is, how to properly fire events, etc. Links to articles/tutorials and howtos are also greatly appreciated.
Thanks!
EDIT:
My event model has multiple event types, such as JobStartedEvent, JobEndedEvent, JobProgressEvent, etc. Is this a bad approach? Should I have a single event type, and if so, how do I pass information to the listeners that is not common to all events? Example: I want to pass a double in the range [0-1] for the progress event, but that is not applicable for an event like JobFailureEvent. What's the best approach to handling this?
I could put the extra information in the "source" object itself, but my source objects are the Job objects themselves, and it doesn't sit well with me to "leak" references to the job object, especially while it is running:
FooJob jobObject = (FooJob) event.getSource();
// getCurrentProgress() is assumed to return a double in the range [0, 1].
int progressPercent = (int) Math.round(jobObject.getCurrentProgress() * 100);
progressLabel.setText(progressPercent + "%");
No. Emit your events on whatever thread needs to raise them and leave it up to the users of your subsystem to decide how they wish to handle them. If they wish to message the results to a GUI, fine; if not, they can do whatever they want, e.g. queue them to another thread. Just document 'Events are raised on an internal thread and event handlers must not block'.
Anything else puts constraints on users that they may well not want, as you say.
There are many ways to distribute events, each with its own pros and cons. If the consumer is not necessarily the GUI, then you definitely should not tie yourself to the AWT EDT. Unless you know for sure how the event consumers are going to work, I would start simple and go from there. Simple being: synchronously notify each consumer. If that ends up delaying the main task, then you should think about asynchronous notification. If the consumer is ultimately the GUI, then the consumer's notification method should be responsible for calling SwingUtilities.invokeLater.
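As a sketch of the "start simple" variant, the following notifies each listener synchronously on whatever thread fires the event, and gives every event type its own payload so that no reference to the running job has to be handed out. JobProgressEvent and JobFailureEvent are the names from the question; the fields, JobListener, Support and the wrapper class JobEventsSketch are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class JobEventsSketch {

    /** Base type: carries only what every event has in common. */
    abstract static class JobEvent {
        final String jobName;
        JobEvent(String jobName) { this.jobName = jobName; }
    }

    /** Progress events add a fraction between 0.0 and 1.0. */
    static final class JobProgressEvent extends JobEvent {
        final double fraction;
        JobProgressEvent(String jobName, double fraction) {
            super(jobName);
            this.fraction = fraction;
        }
    }

    /** Failure events add the cause instead. */
    static final class JobFailureEvent extends JobEvent {
        final Throwable cause;
        JobFailureEvent(String jobName, Throwable cause) {
            super(jobName);
            this.cause = cause;
        }
    }

    interface JobListener {
        void onEvent(JobEvent event);
    }

    /** Notifies listeners synchronously, on whatever thread calls fire(). */
    static final class Support {
        private final List<JobListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(JobListener listener) { listeners.add(listener); }

        void fire(JobEvent event) {
            for (JobListener listener : listeners) {
                listener.onEvent(event);   // listeners must not block
            }
        }
    }

    public static void main(String[] args) {
        Support support = new Support();
        support.addListener(event -> {
            if (event instanceof JobProgressEvent) {
                long percent = Math.round(((JobProgressEvent) event).fraction * 100);
                System.out.println(event.jobName + ": " + percent + "%");
            }
        });
        support.fire(new JobProgressEvent("matrix-1", 0.25));   // prints "matrix-1: 25%"
        support.fire(new JobFailureEvent("matrix-2", new IllegalStateException("boom")));
    }
}
```

A GUI listener would wrap the body of its onEvent in SwingUtilities.invokeLater; non-GUI listeners are free to handle the event directly or queue it elsewhere.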
Only code that directly touches the GUI should run on the EDT. If you have other threads that need to be synchronized, just use the synchronized keyword (either on the method or on an object).
Spring has event handling and you can define custom events http://static.springsource.org/spring/docs/3.1.x/spring-framework-reference/html/beans.html#context-functionality-events.
EDIT: This is basically a "how to properly implement a data flow engine in Java" question, and I feel this cannot be adequately answered in a single answer (it's like asking, "how to properly implement an ORM layer" and getting someone to write out the details of Hibernate or something), so consider this question "closed".
Is there an elegant way to model a dynamic dataflow in Java? By dataflow, I mean there are various types of tasks, and these tasks can be "connected" arbitrarily, such that when a task finishes, successor tasks are executed in parallel using the finished task's output as input, or when multiple tasks finish, their output is aggregated in a successor task (see flow-based programming). By dynamic, I mean that the type and number of successor tasks when a task finishes depends on the output of that finished task, so for example, task A may spawn task B if it has a certain output, but may spawn task C if it has a different output. Another way of putting it is that each task (or set of tasks) is responsible for determining what the next tasks are.
Sample dataflow for rendering a webpage: I have as task types: file downloader, HTML/CSS renderer, HTML parser/DOM builder, image renderer, JavaScript parser, JavaScript interpreter.
File downloader task for HTML file
  HTML parser/DOM builder task
    File downloader task for each embedded file/link
      If image, image renderer
      If external JavaScript, JavaScript parser
        JavaScript interpreter
      Otherwise, just store in some var/field in HTML parser task
    JavaScript parser for each embedded script
      JavaScript interpreter
    Wait for above tasks to finish, then HTML/CSS renderer (obviously not optimal or perfectly correct, but this is simple)
I'm not saying the solution needs to be some comprehensive framework (in fact, the closer to the JDK API, the better), and I absolutely don't want something as heavyweight as, say, Spring Web Flow or some declarative markup or other DSL.
To be more specific, I'm trying to think of a good way to model this in Java with Callables, Executors, ExecutorCompletionServices, and perhaps various synchronizer classes (like Semaphore or CountDownLatch). There are a couple use cases and requirements:
Don't make any assumptions on what executor(s) the tasks will run on. In fact, to simplify, just assume there's only one executor. It can be a fixed thread pool executor, so a naive implementation can result in deadlocks (e.g. imagine a task that submits another task and then blocks until that subtask is finished, and now imagine several of these tasks using up all the threads).
To simplify, assume that the data is not streamed between tasks (task output->succeeding task input) - the finishing task and succeeding task don't have to exist together, so the input data to the succeeding task will not be changed by the preceding task (since it's already done).
There are only a couple operations that the dataflow "engine" should be able to handle:
A mechanism where a task can queue more tasks
A mechanism whereby a successor task is not queued until all the required input tasks are finished
A mechanism whereby the main thread (or other threads not managed by the executor) blocks until the flow is finished
A mechanism whereby the main thread (or other threads not managed by the executor) blocks until certain tasks have finished
Since the dataflow is dynamic (depends on input/state of the task), the activation of these mechanisms should occur within the task code, e.g. the code in a Callable is itself responsible for queueing more Callables.
The dataflow "internals" should not be exposed to the tasks (Callables) themselves - only the operations listed above should be available to the task.
Note that the type of the data is not necessarily the same for all tasks, e.g. a file download task may accept a File as input but will output a String.
If a task throws an uncaught exception (indicating some fatal error requiring all dataflow processing to stop), it must propagate up to the thread that initiated the dataflow as quickly as possible and cancel all tasks (or something fancier like a fatal error handler).
Tasks should be launched as soon as possible. This along with the previous requirement should preclude simple Future polling + Thread.sleep().
As a bonus, I would like the dataflow engine itself to perform some action (like logging) every time a task finishes, or when no task has finished within X time of the last one finishing. Something like: ExecutorCompletionService<T> ecs; while (hasTasks()) { Future<T> future = ecs.poll(1, TimeUnit.MINUTES); some_action_like_logging(); if (future != null) { future.get() ... } ... }
Are there straightforward ways to do all this with the Java concurrency API? Or, if it's going to be complex no matter what with what's available in the JDK, is there a lightweight library that satisfies the requirements? I already have a partial solution that fits my particular use case (it cheats in a way, since I'm using two executors, and just so you know, it's not related at all to the web browser example I gave above), but I'd like to see a more general purpose and elegant solution.
How about defining interface such as:
interface Task<T> extends Callable<T> {
    boolean isReady();
}
Your "dataflow engine" would then just need to manage a collection of Task objects, i.e. allow new Task objects to be queued for execution and allow queries as to the status of a given task (so maybe the interface above needs extending to include an id and/or type). When a task completes (and when the engine starts, of course), the engine just has to query any unstarted tasks to see if they are now ready, and if so pass them to the executor to be run. As you mention, any logging etc. could also be done then.
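To make that a bit more concrete, here is a minimal sketch of such an engine, with plenty of assumptions: ReadyTaskEngine, queue, awaitIdle and the task helper are invented names, a failed task simply stops further scheduling rather than cancelling what is already running, and tasks that never become ready are left waiting.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

class ReadyTaskEngine {

    /** The Task interface from the answer, with a type parameter. */
    interface Task<T> extends Callable<T> {
        boolean isReady();
    }

    private final ExecutorService executor;
    private final List<Task<?>> waiting = new ArrayList<>();
    private int running = 0;
    private Throwable failure;

    ReadyTaskEngine(ExecutorService executor) { this.executor = executor; }

    /** Queues a task; a task may call this from inside call() to add successors. */
    synchronized void queue(Task<?> task) {
        waiting.add(task);
        startReadyTasks();
    }

    /** Starts every waiting task whose isReady() is now true. Caller holds the lock. */
    private void startReadyTasks() {
        if (failure != null) return;   // stop scheduling after a fatal error
        for (Iterator<Task<?>> it = waiting.iterator(); it.hasNext();) {
            Task<?> task = it.next();
            if (task.isReady()) {
                it.remove();
                running++;
                executor.execute(() -> runTask(task));
            }
        }
    }

    private void runTask(Task<?> task) {
        Throwable error = null;
        try {
            task.call();
        } catch (Throwable t) {
            error = t;
        }
        synchronized (this) {
            running--;
            if (error != null && failure == null) failure = error;
            startReadyTasks();   // a finished task may have unblocked others
            notifyAll();
        }
    }

    /** Blocks until nothing is running any more, or rethrows the first task failure. */
    synchronized void awaitIdle() throws InterruptedException {
        while (running > 0 && failure == null) wait();
        if (failure != null) throw new IllegalStateException("task failed", failure);
    }

    /** Small helper to build a task from a readiness check and a body. */
    static Task<Void> task(BooleanSupplier ready, Runnable body) {
        return new Task<Void>() {
            @Override public boolean isReady() { return ready.getAsBoolean(); }
            @Override public Void call() { body.run(); return null; }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ReadyTaskEngine engine = new ReadyTaskEngine(pool);
        AtomicBoolean downloaded = new AtomicBoolean();
        // "parse" is queued first but only becomes ready once "download" has run.
        engine.queue(task(downloaded::get, () -> System.out.println("parse")));
        engine.queue(task(() -> true, () -> { System.out.println("download"); downloaded.set(true); }));
        engine.awaitIdle();
        pool.shutdown();
    }
}
```

Because no task ever blocks waiting for another one, this avoids the fixed-thread-pool deadlock mentioned in the question. The cost is that isReady() is polled under the engine's lock each time a task finishes, which is fine for a handful of tasks but would need a smarter dependency structure for large flows.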
One other thing that may help is to use Guice (http://code.google.com/p/google-guice/) or a similar lightweight DI framework to help wire up all the objects correctly (e.g. to ensure that the correct executor type is created, and to make sure that Tasks that need access to the dataflow engine, either for their isReady method or for queuing other tasks, can be provided with an instance without introducing complex circular relationships).
HTH, but please do comment if I've missed any key aspects...
Paul.
Look at https://github.com/rfqu/df4j — a simple but powerful dataflow library. If it lacks some desired features, they can be added easily.