I have highly concurrent code that needs to increment/decrement a few counters concurrently and read their values. I don't require exact values at every read, so the counter might as well be eventually consistent. My main objective is that the write operations are non-blocking and require no wait time: a few threads want to increment the same counter, call some increment function, and not wait for the update to be processed.
However, I'm having a hard time figuring out a way to build such a counter.
I was thinking about using a ConcurrentLinkedQueue<Boolean>: add an element to the queue, and have another thread pop elements and count the increments/decrements. However, it's not a BlockingQueue, so I'd have to make a thread that constantly tries to poll the queue, and it feels like a huge waste to dedicate a thread entirely to this task. Just asking for size() is not an option either, because in ConcurrentLinkedQueue that operation isn't constant time; every call has to traverse the entire queue, which would be an insane waste of time.
I also looked at AtomicInteger, but there is only a lazy set operation, no lazy increment. If I understand correctly, incrementAndGet would result in locking-style increment-and-read behavior, which is definitely not what I need.
Is using a ConcurrentLinkedQueue<Boolean> and a dedicated polling thread my only option for an eventually consistent counter? Especially considering that I do not know how many threads will be trying to write and read this counter at any given moment; they are spawned dynamically.
Sounds like java.util.concurrent.atomic.LongAdder might be what you're looking for:
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
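A minimal sketch of what such a counter could look like with LongAdder (the class and method names here are mine, purely for illustration):

import java.util.concurrent.atomic.LongAdder;

public class EventualCounter {

    private final LongAdder adder = new LongAdder();

    // Writers never contend on a single shared cell; under contention
    // LongAdder spreads the updates across internal cells.
    public void increment() { adder.increment(); }
    public void decrement() { adder.decrement(); }

    // sum() folds the cells together; it may miss updates that are
    // still in flight, which matches an eventually consistent read.
    public long get() { return adder.sum(); }
}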
My main objective is that the write operations are non-blocking and require no wait time
That makes me think you need to introduce some kind of asynchronous write.
I'd have to make a thread that constantly tries to poll the queue
You may try to create a kind of increment task using CompletableFuture, which runs on the common ForkJoinPool (or on a dedicated single-thread pool), so that one background thread eventually applies all write operations to some AtomicLong and the worker threads are not blocked:
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

public class AsyncCounter {

    private final AtomicLong counter = new AtomicLong();

    // non-blocking for the caller: the increment runs asynchronously
    public CompletableFuture<Long> inc() {
        return CompletableFuture.supplyAsync(counter::incrementAndGet);
    }

    public long get() {
        return counter.get();
    }
}
But in any case, it should be tested with JMH, because our assumptions about performance often turn out to be wrong.
I'm going to use ConcurrentLinkedQueue and Java as an example for a more general question. Let me first explain the question with regard to ConcurrentLinkedQueue. Consider:
ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();

while (true) {
    Integer item = queue.poll();
    if (item != null) {
        // do some stuff
    }
}
ConcurrentLinkedQueue::poll does not block, so if I were to run this code (and only this code) in its own thread, it would constantly perform a redundant operation. Compare this to something like LinkedBlockingQueue::take, which blocks until an element is available. How much of a difference does it make?
I realize that the question is very vague and specific to the language and the data structure's implementation, but it generalizes to something like this:
How resource consuming is it to run a forever loop that does some small, repetitive operation (like queue.poll())?
Because the operation is small but repetitive, each iteration finishes faster but the loop also runs at a higher frequency, which makes me think that it's worse.
Because the while (true) loop schedules an endless stream of poll operations, the polling implementation keeps one CPU core constantly at 100% utilisation, which is not a good idea.
On the other hand, take blocks the thread until an item is available. The thread can be parked in the background while other threads run on the CPU. A blocked thread does not consume CPU time; it is woken up when an item becomes available and only then scheduled to run on the CPU again.
The context switch needed to wake a background thread might give you a slightly slower reaction time (this depends on how the operating system implements thread scheduling and interrupt handling), but overall it is far better than keeping a CPU core constantly at 100% utilisation.
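For comparison, a blocking consumer might look like this (the class and method names here are just for illustration):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingConsumer implements Runnable {

    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

    // Producers never block on an unbounded LinkedBlockingQueue.
    public void submit(int item) {
        queue.offer(item);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Integer item = queue.take(); // parks the thread until an item arrives
                // do some stuff with item
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // allows the loop to be shut down
        }
    }
}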
So, I have a loop where I create thousands of threads which process my data.
I checked, and storing the Threads slows down my app.
This is from my loop:
Record r = new Record(id, data, outPath, debug);
//r.start();
threads.add(r);
//id is 4 digits
//data is something like 500 chars long
It stops my for loop for a while (it takes a second or more for one run, which is too much!).
Only init > duration: 0:00:06.369
With adding thread to ArrayList > duration: 0:00:07.348
Questions:
* What is the best way of storing Threads?
* How can I make the Threads faster?
* Should I create the Threads and run them with a special executor, for example 10 at a time, then the next 10, and so on? (If yes, then how?)
Consider that having a very high number of threads is not very useful: at most you can execute concurrently as many threads as your CPU has cores.
The best approach is to reuse existing threads. To do that you can use the Executor framework.
For example, to create an Executor that internally handles at most 10 threads, you can do the following:
List<Record> records = ...;

ExecutorService executor = Executors.newFixedThreadPool(10);
for (Record r : records) {
    executor.submit(r);
}

// At the end, stop the executor
executor.shutdown();
With code similar to this you can submit many thousands of commands (Runnable implementations), but no more than 10 threads will be created.
I'm guessing that it is not the .add method that is really slowing you down; my guess is that the hundreds of Threads running in parallel are the real problem. Of course, a simple command like add will be queued in the pipeline and can take long to get executed, even if the execution itself is fast. It is also possible that your data structure has an add method that is O(n).
Possible solutions for this:
* Find a truly wait-free solution for this, e.g. by prioritising threads.
* Add them all to your data structure before executing them.
While it is possible to work like this, creating more than a handful of Threads for tasks like this is strongly discouraged. You should use a thread Executor, as David Lorenzo already pointed out.
I have a loop where I create thousands of threads...
That's a bad sign right there. Creating threads is expensive.
Presumably your program creates thousands of threads because it has thousands of tasks to perform. The trick is to decouple the threads from the tasks: create just a few threads and reuse them.
That's what a thread pool does for you.
Learn about the java.util.concurrent.ThreadPoolExecutor class and related classes (e.g., Future). It implements a thread pool, and chances are very likely that it provides all of the features that you need.
If your needs are simple enough, you can use one of the static methods in java.util.concurrent.Executors to create and configure a thread pool. (e.g., Executors.newFixedThreadPool(N) will create a new thread pool with exactly N threads.)
If your tasks are all compute bound, then there's no reason to have any more threads than the number of CPUs in the machine. If your tasks spend time waiting for something (e.g., waiting for commands from a network client), then the decision of how many threads to create becomes more complicated: It depends on how much of what resources those threads use. You may need to experiment to find the right number.
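For compute-bound tasks, a common starting point is to size the pool to the number of available cores (a sketch, not a universal rule):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One thread per core for CPU-bound work.
int cores = Runtime.getRuntime().availableProcessors();
ExecutorService pool = Executors.newFixedThreadPool(cores);
// submit your Runnable/Callable tasks to pool, then shut it down when done:
pool.shutdown();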
Theoretical question. If I have two SwingWorkers and an outputObject with method
public synchronized void outputToPane(String output)
If each SwingWorker has a loop in it as shown:
// SwingWorker1
while (true) {
    outputObject.outputToPane("garbage");
}

// SwingWorker2
Integer i = 0;
while (true) {
    outputObject.outputToPane(i.toString());
    i++;
}
How would those interact? Does the outputToPane method receive an argument from one thread and block the other one until it finishes with the first, or does it build a queue of tasks that will execute in the order received, or some other option?
The reason I ask:
I have two threads that will be doing some heavy number crunching, one on a non-pausable data stream and the other on a file. I would like them both to output to a central messaging area when they hit certain milestones; however, I CANNOT risk the data stream getting blocked while it waits for the other thread to finish with the output, because I would then risk losing data.
synchronized only guarantees mutual exclusion. It is not fair, which in practice means that your workers might alternate quite nicely, or the first one might get precedence and block the second one almost completely until it finishes, or anything in between.
See the ReentrantLock docs for more about fairness; you could consider using it instead of synchronized. Probably an even better alternative would be to use a Queue.
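A rough sketch of the queue idea (the class name and the Swing Timer are my choice, not something from your code): each worker offers its message to a non-blocking queue, and a timer on the EDT drains it, so neither worker ever waits for the other.

import java.util.concurrent.ConcurrentLinkedQueue;
import javax.swing.JTextArea;
import javax.swing.Timer;

public class MessageArea {

    private final ConcurrentLinkedQueue<String> messages = new ConcurrentLinkedQueue<>();
    private final JTextArea pane = new JTextArea();

    public MessageArea() {
        // Drain pending messages on the EDT a few times per second.
        new Timer(100, e -> {
            String msg;
            while ((msg = messages.poll()) != null) {
                pane.append(msg + "\n");
            }
        }).start();
    }

    // Called from any worker thread; offer() on ConcurrentLinkedQueue never blocks.
    public void outputToPane(String output) {
        messages.offer(output);
    }
}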
I would advise you to have two output objects in your messaging area, because if one thread starts to modify the output then the other one will have to wait for it to finish. Even if you can optimize it to be fast enough, the actual display of the info would make your threads slow each other down over time.
Even if you try to synchronize them, the result might not always be 100% safe.
We are developing a Java application with several worker threads. These threads will have to deliver a lot of computation results to our UI thread. The order in which the results are delivered does not matter.
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Is there a data structure that supports simultaneous insertions with each insertion completing in constant time?
Thanks,
Martin
ConcurrentLinkedQueue is designed for high contention. Producers enqueue stuff on one end and consumers collect elements at the other end, so everything will be processed in the order it's added.
ArrayBlockingQueue is a better fit for lower contention, with lower space overhead.
Edit: although that's not quite what you asked for. Simultaneous inserts? You may want to give every thread its own output queue (say, an ArrayBlockingQueue) and then have the UI thread poll the separate queues. However, I think you'll find one of the two Queue implementations above sufficient.
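If you do go the one-queue-per-thread route, a rough sketch might look like this (class and method names are mine, and the result type is just a String placeholder):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PerThreadQueues {

    private final List<BlockingQueue<String>> queues = new ArrayList<>();

    // Each worker calls this once, then offers its results to the returned queue.
    public synchronized BlockingQueue<String> registerWorker() {
        BlockingQueue<String> q = new ArrayBlockingQueue<>(1024);
        queues.add(q);
        return q;
    }

    // Called periodically from the UI thread.
    public synchronized void drainAll() {
        for (BlockingQueue<String> q : queues) {
            String result;
            while ((result = q.poll()) != null) { // poll() returns null instead of blocking
                // update the UI with result
            }
        }
    }
}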
Right now, all threads simply push their results onto a synchronized Stack - but this means that every thread must wait for the other threads before results can be delivered.
Do you have any evidence indicating that this is actually a problem? If the computation performed by those threads is even the least little bit complex (and you don't have literally millions of threads), then lock contention on the result stack is simply a non-issue because when any given thread delivers its results, all others are most likely busy doing their computations.
Take a step back and evaluate whether performance is the key design consideration here. Don't think, know: does profiling back it up?
If not, I'd say a bigger concern is clarity and readability of design, and not introducing new code to maintain. It just so happens that, if you're using Swing, there is a library for doing exactly what you're trying to do, called SwingWorker.
Take a look at java.util.concurrent.ConcurrentLinkedQueue, java.util.concurrent.ConcurrentHashMap or java.util.concurrent.ConcurrentSkipListSet. They might do what you need. ConcurrentSkipListSet, for instance, claims to have "expected average log(n) time cost for the contains, add and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads."
Two other patterns you might want to look at are:
* Each thread has its own collection; when polled, it returns the collection and creates a new one, so the collection only holds the pending items between polls. The thread needs to protect operations on its collection, but there is no contention between threads. This is blocking (a thread cannot add to its collection while the UI thread pulls updates from it), but it reduces contention (there is none between worker threads).
* Each thread has its own collection and appends the results to a common queue that is protected with Lock.tryLock(). The thread continues processing if it fails to acquire the lock, which makes it less likely that a thread will block waiting for the shared queue; see the sketch below.
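A rough sketch of that second pattern (all names are illustrative; the UI thread drains the shared queue under the same lock):

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ResultPublisher {

    private final Queue<String> shared = new ArrayDeque<>(); // guarded by lock
    private final Lock lock = new ReentrantLock();

    // Each worker keeps its own unshared buffer.
    private final ThreadLocal<Queue<String>> local = ThreadLocal.withInitial(ArrayDeque::new);

    // Called by worker threads; never waits for the lock.
    public void publish(String result) {
        Queue<String> mine = local.get();
        mine.add(result);
        if (lock.tryLock()) {           // flush only if the lock happens to be free
            try {
                shared.addAll(mine);
                mine.clear();
            } finally {
                lock.unlock();
            }
        }                               // otherwise keep buffering and retry next time
    }

    // Called by the UI thread to collect whatever has been flushed so far.
    public Queue<String> drain() {
        lock.lock();
        try {
            Queue<String> out = new ArrayDeque<>(shared);
            shared.clear();
            return out;
        } finally {
            lock.unlock();
        }
    }
}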
I have a producer app that generates an index (stored in some in-memory tree data structure), and a consumer app that will use the index to search for partial matches.
I don't want the consumer UI to have to block (e.g. via some progress bar) while the producer is indexing the data. Basically, if the user wishes to use the partial index, they can just do so. In this case, the producer will potentially have to stop indexing for a while until the user goes away to another screen.
Roughly, I know I will need the wait/notify protocol to achieve this. My question: is it possible to interrupt the producer thread using wait/notify while it is doing its business? What java.util.concurrent primitives do I need to achieve this?
The way you've described this, there's no reason that you need wait/notify. Simply synchronize access to your data structure, to ensure that it is in a consistent state when accessed.
Edit: by "synchronize access", I do not mean synchronize the entire data structure (which would end up blocking either producer or consumer). Instead, synchronize only those bits that are being updated, and only at the time that you update them. You'll find that most of the producer's work can take place in an unsynchronized manner: for example, if you're building a tree, you can identify the node where the insert needs to happen, synchronize on that node, do the insert, then continue on.
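To make that concrete, here is a very rough illustration using a plain, unbalanced binary search tree (not your actual index structure, and without any rebalancing): the descent is unsynchronized, and only the node being modified is locked.

// Sketch only: visibility relies on the volatile child links.
class SimpleTree {

    static final class Node {
        final int key;
        volatile Node left, right;  // volatile so readers see newly linked children
        Node(int key) { this.key = key; }
    }

    private final Node root = new Node(Integer.MIN_VALUE); // sentinel root

    void insert(int key) {
        Node current = root;
        while (true) {
            // Unsynchronized descent: find the node the new key should hang off.
            Node next = (key < current.key) ? current.left : current.right;
            if (next != null) {
                current = next;
                continue;
            }
            // Synchronize only on the node being updated, not the whole tree.
            synchronized (current) {
                if (key < current.key && current.left == null) {
                    current.left = new Node(key);
                    return;
                }
                if (key >= current.key && current.right == null) {
                    current.right = new Node(key);
                    return;
                }
                // Another thread filled the slot first; keep descending.
            }
        }
    }
}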
In your producer thread, you are likely to have some kind of main loop, and this is probably the best place to interrupt your producer. Instead of using wait() and notify(), I suggest you use the Java synchronization objects introduced in Java 5.
You could potentially do something like this:
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Indexer {

    private final Lock lock = new ReentrantLock();

    public void index() {
        while (somecondition) {
            this.lock.lock();
            try {
                // perform one indexing step
            } finally {
                lock.unlock();
            }
        }
    }

    public Item lookup() {
        this.lock.lock();
        try {
            // perform your lookup and return the result
            return null; // placeholder
        } finally {
            lock.unlock();
        }
    }
}
You need to make sure that each time the indexer releases the lock, your index is in a consistent, legal state. In this scenario, when the indexer releases the lock, it gives a new or waiting lookup() operation a chance to take the lock, complete, and release the lock, at which point your indexer can proceed to its next step. If no lookup() is currently waiting, then your indexer just reacquires the lock itself and goes on with its next operation.
If you think you might have more than one thread trying to do the lookup at the same time, you might want to have a look at the ReadWriteLock interface and the ReentrantReadWriteLock implementation.
Of course this solution is the simple way to do it. It will block whichever thread doesn't hold the lock. You may want to check whether you can just synchronize on your data structure directly, but that might prove tricky, since building indexes tends to use some sort of balanced tree or B-tree or whatnot, where node insertion is far from trivial.
I suggest you first try that simple approach and see if the way it behaves suits you. If it doesn't, you may either try breaking up the indexing into smaller steps, or try synchronizing on only parts of your data structure.
Don't worry too much about the performance of locking: in Java, uncontended locking (when only one thread is trying to take the lock) is cheap. As long as most of your locking is uncontended, locking performance is nothing to be concerned about.
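If you expect several threads to do lookups at the same time, the read-write variant mentioned above could look roughly like this (reusing the placeholders from the sketch above):

import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteIndexer {

    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public void index() {
        while (somecondition) {
            lock.writeLock().lock();    // exclusive: no lookups while updating
            try {
                // perform one indexing step
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    public Item lookup() {
        lock.readLock().lock();         // shared: many lookups can run concurrently
        try {
            // perform your lookup and return the result
            return null; // placeholder
        } finally {
            lock.readLock().unlock();
        }
    }
}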
The producer application can keep two indices: published and in-work. The producer works only with the in-work index, and the consumer works only with the published one. Once the producer is done indexing, it can replace the published index with the in-work one (usually by swapping one pointer). The producer may also publish a copy of the partial index if that brings value. This way you avoid long-term locks, which is especially useful when the index is accessed by lots of consumers.
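A minimal sketch of that pointer swap, using an AtomicReference (the names are illustrative):

import java.util.concurrent.atomic.AtomicReference;

public class IndexHolder<T> {

    private final AtomicReference<T> published = new AtomicReference<>();

    // Consumers read whatever snapshot is currently published; this never blocks.
    public T current() {
        return published.get();
    }

    // The producer builds its in-work index privately, then publishes it
    // (or a copy of the partial index) in one atomic step.
    public void publish(T newIndex) {
        published.set(newIndex);
    }
}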
No, that's not possible.
The only way of notifying a thread without any explicit code in the thread itself is to use Thread.interrupt(), which will cause an exception in the thread. interrupt() is usually not very reliable though, because throwing an exception at some random point in the code is a nightmare to get right in all code paths. Besides that, a single try{}catch(Throwable){} somewhere in the thread (including any libraries that you use) could be enough to swallow the signal.
In most cases, the only correct solution is to use a shared flag or a queue that the consumer can use to pass messages to the producer. If you are worried about the producer being unresponsive or freezing, run it in a separate thread and require it to send heartbeat messages every n seconds. If it does not send a heartbeat, kill it. (Note that determining whether a producer is actually frozen, and not just waiting for an external event, is often very hard to get right as well.)
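The shared-flag approach can be as simple as a volatile boolean that the producer checks in its main loop; a rough sketch (names are mine):

public class PausableProducer implements Runnable {

    private volatile boolean paused; // written by the consumer/UI, read by the producer

    public void pause()  { paused = true; }
    public void resume() { paused = false; }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            if (paused) {
                // The index is left in a consistent state; back off briefly.
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                continue;
            }
            // perform one indexing step
        }
    }
}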