I was trying to intentionally create visibility issues with threads and I got unexpected results:
public class DownloadStatus {
private int totalBytes;
private boolean isDone;
public void increment() {
totalBytes++;
}
public int getTotalBytes() {
return totalBytes;
}
public boolean isDone() {
return isDone;
}
public void done() {
isDone = true;
}
}
public class DownloadFileTask implements Runnable {
DownloadStatus status;
public DownloadFileTask(DownloadStatus status) {
this.status = status;
}
@Override
public void run() {
System.out.println("start download");
for (int i = 0; i < 10_000; i++) { //"download" a 10,000 bytes file each time you run
status.increment(); //each byte downloaded - update the status
}
System.out.println("download ended with: " + status.getTotalBytes()); //**NOTE THIS LINE**
status.done();
}
}
//creating threads, one to download, another to wait for the download to be done.
public static void main(String[] args) {
DownloadStatus status = new DownloadStatus();
Thread t1 = new Thread(new DownloadFileTask(status));
Thread t2 = new Thread(() -> {
while (!status.isDone()) {}
System.out.println("DONE!!");
});
t1.start();
t2.start();
}
So, running this should create a visibility problem: the second thread wouldn't see the updated value, since it cached it before it was written back by the first thread. This causes an endless while loop, with the second thread constantly checking its cached isDone value (at least that's how I think it works).
The thing I don't get is why this visibility problem stops happening when I comment out the line from the second code block that calls status.getTotalBytes().
From my understanding, both threads start by caching the status object as-is, so the second thread should constantly check its cached value (and essentially never see the new value updated by the first thread).
Why is this line calling a method on the status object causing the visibility issue? (And, more interestingly, why does not calling it fix it?)
What you call a "visibility problem" is actually a data race.
A single thread sees the effects of its operations in the order they are written: if you update a variable and then read it, you'll always see the updated value within that thread.
The effects of a thread's execution may look different when viewed from another thread. This is mainly related to the language and the underlying hardware architecture: the compiler may reorder instructions, memory writes may be delayed while values are kept in registers, or values may sit in a cache before being written to main memory. Without an explicit memory barrier, the value in main memory is not necessarily updated. That's what you call the "visibility problem".
It is likely that there is a memory barrier in System.out.println. So when you execute that line, all updates up to that point are committed to main memory, and the other threads can see them. Note that without explicit synchronization there is still no guarantee that the other threads will see them, because those threads may re-use the value they previously read for that variable. Nothing in the program tells the compiler/runtime that the values may be changed by other threads.
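As a minimal sketch of the conventional fix (my addition, under the assumption that volatile suffices for this test): declaring isDone volatile tells the runtime exactly that, and gives each write a happens-before edge that the reader thread observes, so the loop terminates without relying on an accidental barrier:
public class DownloadStatus {
    private int totalBytes;
    private volatile boolean isDone; // volatile: writes become visible to the reader thread

    public void increment() { totalBytes++; } // still not atomic, see below
    public int getTotalBytes() { return totalBytes; }
    public boolean isDone() { return isDone; }
    public void done() { isDone = true; }
}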
This is a race condition between the two threads; it has nothing to do with the status.getTotalBytes() statement in your code. The scheduler decides which thread runs, and it is by chance that you are not getting stuck in the infinite loop after commenting out the println statement. The main problem in your code is that incrementing and setting the status should be one atomic operation, so replace the definition of the run method as below. Secondly, increment is also not an atomic operation; you can get unpredictable results if there is no proper synchronization.
@Override
public void run() {
System.out.println("start download");
incrementAndSetStatus();
}
public synchronized void incrementAndSetStatus(){
for (int i = 0; i < 10_000; i++) { //"download" a 10,000-byte file each time you run
status.increment(); //each byte downloaded - update the status
}
System.out.println("download ended with: " + status.getTotalBytes()); //**NOTE THIS LINE**
status.done();
}
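Since increment() is not atomic either, here is a sketch of a fully thread-safe status class (my suggestion; SafeDownloadStatus is a hypothetical name for illustration), using AtomicInteger for the counter and volatile for the flag:
import java.util.concurrent.atomic.AtomicInteger;

public class SafeDownloadStatus { // hypothetical name, for illustration only
    private final AtomicInteger totalBytes = new AtomicInteger();
    private volatile boolean isDone;

    public void increment() { totalBytes.incrementAndGet(); } // atomic read-modify-write
    public int getTotalBytes() { return totalBytes.get(); }
    public boolean isDone() { return isDone; }
    public void done() { isDone = true; }
}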
I have written a simple Java class to perform part of a search on my data store, but when I ran it against my consecutive (sequential) version, the execution times were slower:
Consecutive Search took 11 milliseconds
Concurrent Search took 23 milliseconds
I have never written a threaded application before; I was hoping this would be a pretty simple operation. Can someone point me in the right direction with this code snippet? Be brutal, as I have no idea!
public class ExecuteManager {
private Store store;
private ArrayList<ArrayList<UUID>> entityKeys;
public ExecuteManager(Store store){
this.store = store;
this.entityKeys = new ArrayList<ArrayList<UUID>>();
}
// Returns a list of uuids that should be collected
public ArrayList<ArrayList<UUID>> run(UUID entityTypeKey, ArrayList<WhereClauseAbstract> whereClauseList) throws InterruptedException{
ArrayList<SearchThread> queryParts = new ArrayList<SearchThread>();
for (WhereClauseAbstract wc: whereClauseList){
SearchThread st = new SearchThread(entityTypeKey, wc);
st.start();
st.join();
queryParts.add(st);
}
return entityKeys;
}
public class SearchThread extends Thread {
private UUID entityTypeKey;
private WhereClauseAbstract whereClause;
public SearchThread(UUID entityTypeKey, WhereClauseAbstract whereClause){
this.entityTypeKey = entityTypeKey;
this.whereClause = whereClause;
}
public void run(){
// Run search and add to entity key list
entityKeys.add(
store.betterSearchUuid2(entityTypeKey, whereClause.getField(), whereClause.getOperator())
);
}
}
}
You have to be careful that the overhead of creating threads and passing work to them doesn't exceed the work you are doing, but the most serious flaw in your code is this:
st.start();
st.join();
This means you always wait for each background thread to finish immediately after starting it, so only one thread is ever running at a time.
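A minimal sketch of the fix: start every thread first, then join them all afterwards, so they actually run in parallel. (Note that entityKeys would then need a thread-safe list, e.g. one from Collections.synchronizedList.)
ArrayList<SearchThread> queryParts = new ArrayList<>();
for (WhereClauseAbstract wc : whereClauseList) {
    SearchThread st = new SearchThread(entityTypeKey, wc);
    st.start();           // launch; do not wait here
    queryParts.add(st);
}
for (SearchThread st : queryParts) {
    st.join();            // wait for all of them only after all have started
}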
For benchmarking purposes I would make sure the code is warmed up and ignore the first 2 - 10 seconds depending on the complexity of what you are doing.
It is worth noting that pulling data into your CPU cache from a long array is likely to be more expensive than matching your where clause, i.e. you are likely to get the best speed-up by applying all the filters to each partition of the data. This is how parallelStream() works.
List<UUID> result = store.parallelStream()
                         .filter(whereClause)
                         .map(e -> e.getUUID())
                         .collect(Collectors.toList());
This collects the UUIDs of all the entries that match your predicate, using all the CPUs on your machine.
Here is an example of a generic object pool in Java, used to instantiate TouchEvents for Android:
import java.util.ArrayList;
import java.util.List;
public class Pool<T> {
public interface PoolObjectFactory<T> {
public T createObject();
}
private final List<T> freeObjects;
private final PoolObjectFactory<T> factory;
private final int maxSize;
public Pool(PoolObjectFactory<T> factory, int maxSize) {
this.factory = factory;
this.maxSize = maxSize;
this.freeObjects = new ArrayList<T>(maxSize);
}
public T newObject() {
T object = null;
if (freeObjects.isEmpty()) {
object = factory.createObject();
} else {
object = freeObjects.remove(freeObjects.size() - 1);
}
return object;
}
public void free(T object) {
if (freeObjects.size() < maxSize) {
freeObjects.add(object);
}
}
}
However, I don't really understand how this code works:
if (freeObjects.isEmpty()) {
object = factory.createObject();
} else {
object = freeObjects.remove(freeObjects.size() - 1);
}
Lets say we have:
touchEventPool = new Pool<TouchEvent>(factory, 100);
Does this mean it is going to store an Array of 100 events (and when #101 comes inside, will dispose #1, like first-in-first-out)?
I assume it is supposed to hold some maximum number of objects and then dispose of the extras. I read the book's description like 10 times and couldn't get it. Maybe someone can explain how this works?
Sort of. The class keeps a cache of previously created objects in a List called freeObjects. When you ask for an object (via the newObject method), it first checks the pool to see if one is available for reuse. If the pool is empty, it just creates a new object and returns it to you. If an object is available, it removes the last element from the pool and returns that to you.
Annotated:
if (freeObjects.isEmpty()) {
// The pool is empty, create a new object.
object = factory.createObject();
} else {
// The pool is non-empty, retrieve an object from the pool and return it.
object = freeObjects.remove(freeObjects.size() - 1);
}
And when you return an object to the cache (via the free() method), it will only be placed back into the pool if the maximum size of the pool has not been met.
Annotated:
public void free(T object) {
// If the pool is not already at its maximum size.
if (freeObjects.size() < maxSize) {
// Then put the object into the pool.
freeObjects.add(object);
}
// Otherwise, just ignore this call and let the object go out of scope.
}
If the pool's max size has already been reached, the object you are freeing is not stored and is (presumably) subject to garbage collection.
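For illustration, a hypothetical usage (TouchEvent is a stand-in; it is not defined in the snippet above):
Pool.PoolObjectFactory<TouchEvent> factory = new Pool.PoolObjectFactory<TouchEvent>() {
    @Override
    public TouchEvent createObject() {
        return new TouchEvent();
    }
};
Pool<TouchEvent> touchEventPool = new Pool<>(factory, 100);

TouchEvent event = touchEventPool.newObject(); // reuses a freed event when one exists
// ... fill in the event and hand it to the input pipeline ...
touchEventPool.free(event); // kept for reuse, or dropped if the pool already holds 100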
The idea of any pool is to create a controlled environment in which there is (usually) no need to create new (event) instances, because unused, freed instances can be re-used from the pool.
When you create
touchEventPool = new Pool<TouchEvent>(factory, 100);
you hope that 100 instances will be enough at any particular moment of the program's life.
So by the time you want to get the 101st event, the program has probably already freed 5, 20, or even 99 events, and the pool will be able to reuse any of them.
If there are no free instances, then, depending on the pool's policy, either a new one is created or the requesting thread waits for other threads to release one and return it to the pool. In this particular implementation, a new one is created.
I think the main concept of an object pool is to reduce the frequency of object instantiation.
Does this mean it is going to store an Array of 100 events (and when #101 comes inside, will dispose #1, like first-in-first-out)?
I don't think so. The maximum number 100 limits the number of freeObjects, not the number of objects currently in use. When an object is no longer used, you should free it. The freed object is not discarded but stored as a freeObject (the max is the number of these spare objects). The next time you need an object, you don't have to instantiate a new one; you just reuse one of the spare freeObjects.
Thus you avoid costly object instantiations, which can improve performance.
My main class generates multiple threads based on some rules (20-40 threads that live for a long time).
Each of those threads creates several short-lived threads; I am using an executor for this.
I need to work on multi-dimensional arrays in the short-lived threads. I wrote it as in the code below, but I think this is not efficient, since I pass the array so many times to so many threads/tasks. I tried to access it directly from the threads (by declaring it as public), with no success. I would be happy to get comments/advice on how to improve this.
As a next step, I would also like to return a one-dimensional array as a result (though it might be better just to update it in the AssetFactory class), and I am not sure how.
Please see the code below.
Thanks
Paz
import java.util.concurrent.*;
import java.util.logging.Level;
import java.util.logging.Logger;
public class AssetFactory implements Runnable {
    private static final Logger logger = Logger.getLogger(AssetFactory.class.getName());
    private volatile boolean stop = false;
    private volatile String feed;
    private double[][][] PeriodRates = new double[10][500][4];
    private String TimeStr, Bid, periodicalRateIndicator;
    private final BlockingQueue<String> workQueue;
    ExecutorService IndicatorPool = Executors.newCachedThreadPool();

    public AssetFactory(BlockingQueue<String> workQueue) {
        this.workQueue = workQueue;
    }

    @Override
    public void run() {
        while (!stop) {
            try {
                feed = workQueue.take();
                periodicalRateIndicator = CheckPeriod(TimeStr, Bid);
                if (periodicalRateIndicator.length() > 0) {
                    IndicatorPool.submit(new CalcMvg(periodicalRateIndicator, PeriodRates));
                }
                if ("Stop".equals(feed)) {
                    stop = true;
                }
            } catch (InterruptedException ex) {
                logger.log(Level.SEVERE, null, ex);
                stop = true;
            }
        } // while
    } // run
} // AssetFactory
Here is the CalcMVG class
public class CalcMvg implements Runnable {
    private static final Logger logger = Logger.getLogger(CalcMvg.class.getName());
    private double[][][] PeriodRates = new double[10][500][4];

    public CalcMvg(String Periods, double[][][] PeriodRates) {
        System.out.println(Periods);
        this.PeriodRates = PeriodRates;
    }

    @Override
    public void run() {
        try {
            // do some work with the data of the PeriodRates array, e.g. print it (no changes to the array)
            System.out.println(PeriodRates[1][1][1]);
        } catch (Exception ex) {
            System.out.println(Thread.currentThread().getName() + ex.getMessage());
            logger.log(Level.SEVERE, null, ex);
        }
    } // run
} // CalcMvg
There are several things going on here which seem to be wrong, but it is hard to give a good answer with the limited amount of code presented.
First the actual coding issues:
There is no need to define a variable as volatile if only one thread ever accesses it (stop, feed)
You should declare variables that are only used in a local context (run method) locally in that function and not globally for the whole instance (almost all variables). This allows the JIT to do various optimizations.
The InterruptedException should terminate the thread. Because it is thrown as a request to terminate the thread's work.
In your code example the workQueue doesn't seem to do anything but to put the threads to sleep or stop them. Why doesn't it just immediately feed the actual worker-threads with the required workload?
And then the code structure issues:
You use threads to feed threads with work. This is inefficient, as you only have a limited number of cores that can actually do the work. Since the execution order of threads is undefined, it is likely that the IndicatorPool is either mostly idle or overflowing with tasks that have not yet been done.
If you have a finite set of work to be done, the ExecutorCompletionService might be helpful for your task.
I think you will gain the best speed increase by redesigning the code structure. Imagine the following (assuming that I understood your question correctly):
There is a blocking queue of tasks that is fed by some data source (e.g. file-stream, network).
A set of worker-threads equal to the amount of cores is waiting on that data source for input, which is then processed and put into a completion queue.
A specific data set is the "terminator" for your work (e.g. "null"). If a thread encounters this terminator, it finishes its loop and shuts down.
Now the following holds true for this construct:
Case 1: The data source is the bottleneck. It cannot be sped up by using multiple threads, as your hard disk/network won't work faster if you ask more often.
Case 2: The processing power of your machine is the bottleneck, as you cannot process more data than the worker threads/cores on your machine can handle.
In both cases the conclusion is that the worker threads need to be the ones that ask for new data as soon as they are ready to process it: that way they are either put on hold (case 1) or they throttle the incoming data (case 2). This ensures maximum throughput.
If all worker threads have terminated, the work is done. This can be tracked, e.g., through the use of a CyclicBarrier or Phaser class.
Pseudo-code for the worker threads:
public void run() {
    DataType e;
    try {
        while ((e = dataSource.next()) != null) {
            process(e);          // the actual work happens here
        }
        barrier.await();         // signal: this worker has finished
    } catch (InterruptedException ex) {
        // interruption is treated as a shutdown request
    }
}
I hope this is helpful in your case.
Passing the array as an argument to the constructor is a reasonable approach, although unless you intend to copy the array it isn't necessary to initialize PeriodRates with a large array. It seems wasteful to allocate a large block of memory and then reassign its only reference straight away in the constructor. I would initialize it like this:
private final double [][][] PeriodRates;
public CalcMvg(String Periods, double[][][] PeriodRates) {
System.out.println(Periods);
this.PeriodRates = PeriodRates;
}
The other option is to define CalcMvg as an inner class of AssetFactory and declare PeriodRates as final. This would allow instances of CalcMvg to access PeriodRates in the outer instance of AssetFactory.
Returning the result is more difficult since it involves publishing the result across threads. One way to do this is to use synchronized methods:
private double[] result = null;
private synchronized void setResult(double[] result) {
this.result = result;
}
public synchronized double[] getResult() {
if (result == null) {
throw new RuntimeException("Result has not been initialized for this instance: " + this);
}
return result;
}
There are more advanced multi-threading concepts available in the Java libraries, e.g. Future, that might be appropriate in this case.
Regarding your concerns about the number of threads, allowing a library class to manage the allocation of work to a thread pool might solve this concern. Something like an Executor might help with this.
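For a minimal sketch of that direction (the pool size of 4 and the result length of 500 are assumptions for illustration): submit the calculation as a Callable and collect the one-dimensional result through a Future:
ExecutorService pool = Executors.newFixedThreadPool(4);
Future<double[]> future = pool.submit(() -> {
    double[] result = new double[500];
    // ... compute the moving averages into result ...
    return result;
});
double[] movingAverages = future.get(); // blocks until the task has completed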
I have a single writer thread and single reader thread to update and process a pool of arrays(references stored in map). The ratio of writes to read is almost 5:1(latency of writes is a concern).
The writer thread needs to update few elements of an array in the pool based on some events. The entire write operation(all elements) needs to be atomic.
I want to ensure that reader thread reads the previous updated array if writer thread is updating it(something like volatile but on entire array rather than individual fields). Basically, I can afford to read stale values but not block.
Also, since the writes are so frequent, it would be really expensive to create new objects or lock the entire array while read/write.
Is there a more efficient data structure that could be used, or cheaper locks?
How about this idea: The writer thread does not mutate the array. It simply queues the updates.
The reader thread, whenever it enters a read session that requires a stable snapshot of the array, applies the queued updates to the array, then reads the array.
class Update
{
    final int position;
    final Object value;

    Update(int position, Object value)
    {
        this.position = position;
        this.value = value;
    }
}

// unbounded linked queue; ArrayBlockingQueue(Integer.MAX_VALUE) would try to
// allocate its whole backing array up front and fail
BlockingQueue<Update> updates = new LinkedBlockingQueue<>();

void write()
{
    updates.put(new Update(...));
}

Object[] read()
{
    Update update;
    while ((update = updates.poll()) != null)
        array[update.position] = update.value;
    return array;
}
Is there a more efficient data structure?
Yes, absolutely! They're called persistent data structures. They are able to represent a new version of a vector/map/etc. merely by storing the differences with respect to a previous version. All versions are immutable, which makes them appropriate for concurrency (writers don't interfere with/block readers, and vice versa).
In order to express change, one stores references to a persistent data structure in a reference type such as AtomicReference, and changes what those references point to - not the structures themselves.
Clojure provides a top-notch implementation of persistent data structures. They're written in pure, efficient Java.
The following program exposes how one would approach your described problem using persistent data structures.
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.atomic.AtomicReference;

import clojure.lang.IPersistentVector;
import clojure.lang.PersistentVector;

public class AtomicArrayUpdates {

    public static Map<Integer, AtomicReference<IPersistentVector>> pool
            = new HashMap<>();
    public static Random rnd = new Random();
    public static final int SIZE = 60000;
    // For simulating the reads/writes ratio
    public static final int SLEEP_TIME = 5;

    static {
        for (int i = 0; i < SIZE; i++) {
            pool.put(i, new AtomicReference<>(PersistentVector.EMPTY));
        }
    }
public static class Writer implements Runnable {
@Override public void run() {
while (true) {
try {
Thread.sleep(SLEEP_TIME);
} catch (InterruptedException e) {}
int index = rnd.nextInt(SIZE);
IPersistentVector vec = pool.get(index).get();
// note how we repeatedly assign vec to a new value
// cons() means "append a value".
vec = vec.cons(rnd.nextInt(SIZE + 1));
// assocN(): "update" at index 0
vec = vec.assocN(0, 42);
// appended values are nonsense, just an example!
vec = vec.cons(rnd.nextInt(SIZE + 1));
pool.get(index).set(vec);
}
}
}
public static class Reader implements Runnable {
@Override public void run() {
while (true) {
try {
Thread.sleep(SLEEP_TIME * 5);
} catch (InterruptedException e) {}
IPersistentVector vec = pool.get(rnd.nextInt(SIZE)).get();
// Now you can do whatever you want with vec.
// nothing can mutate it, and reading it doesn't block writers!
}
}
}
public static void main(String[] args) {
new Thread(new Writer()).start();
new Thread(new Reader()).start();
}
}
Another idea, given that the array contains only 20 doubles.
Have two arrays, one for write, one for read.
Reader locks the read array during read.
read()
lock();
read stuff
unlock();
The writer first modifies the write array, then tries tryLock() on the read array. If locking fails, fine: write() just returns; if locking succeeds, it copies the write array to the read array and then releases the lock.
write()
update write array
if tryLock()
copy write array to read array
unlock()
Reader can be blocked, but only for the time it takes to copy the 20 doubles, which is short.
The reader should use a spin lock, like do {} while (!tryLock());, to avoid being suspended.
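A minimal sketch of this scheme, assuming the 20-double array and java.util.concurrent.locks.ReentrantLock (the class and method names here are mine):
import java.util.concurrent.locks.ReentrantLock;

class DoubleBuffer {
    private final double[] writeArray = new double[20];
    private final double[] readArray = new double[20];
    private final ReentrantLock lock = new ReentrantLock();

    void write(int index, double value) {
        writeArray[index] = value;           // only the writer touches writeArray
        if (lock.tryLock()) {                // publish only if the reader isn't reading
            try {
                System.arraycopy(writeArray, 0, readArray, 0, writeArray.length);
            } finally {
                lock.unlock();
            }
        }                                    // if the lock is busy, a later write publishes
    }

    double[] read() {
        while (!lock.tryLock()) { }          // spin instead of suspending
        try {
            return readArray.clone();        // lock held only for a 20-double copy
        } finally {
            lock.unlock();
        }
    }
}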
I would do as follows:
synchronize the whole thing and see if the performance is good enough. Considering you only have one writer thread and one reader thread, contention will be low and this could work well enough:
private final Map<Key, double[]> map = new HashMap<> ();
public synchronized void write(Key key, double value, int index) {
double[] array = map.get(key);
array[index] = value;
}
public synchronized double[] read(Key key) {
return map.get(key);
}
if it is too slow, I would have the writer make a copy of the array, change some values, and put the new array back in the map. Note that array copies are very fast: typically, copying a 20-item array would take less than 100 nanoseconds.
//If all the keys and arrays are constructed before the writer/reader threads
//start, no need for a ConcurrentMap - otherwise use a ConcurrentMap
private final Map<Key, AtomicReference<double[]>> map = new HashMap<> ();
public void write(Key key, double value, int index) {
AtomicReference<double[]> ref = map.get(key);
double[] oldArray = ref.get();
double[] newArray = oldArray.clone();
newArray[index] = value;
//you might want to check the return value to see if it worked
//or you might just skip the update if another writes was performed
//in the meantime
ref.compareAndSet(oldArray, newArray);
}
public double[] read(Key key) {
return map.get(key).get(); //check for null
}
since the writes are so frequent, it would be really expensive to create new objects or lock the entire array while read/write.
How frequent? Unless there are hundreds of them every millisecond you should be fine.
Also note that:
object creation is fairly cheap in Java (think around 10 CPU cycles = a few nanoseconds)
garbage collection of short-lived objects is generally free (as long as the object stays in the young generation: if it is unreachable, it is not visited by the GC)
whereas long-lived objects have a GC performance impact, because they need to be copied across to the old generation
The following variation is inspired by both my previous answer and one of zhong.j.yu's.
Writers don't interfere/block readers and vice versa, and there are no thread safety/visibility issues, or delicate reasoning going on.
public class V2 {

    static Map<Integer, AtomicReference<Double[]>> committed = new HashMap<>();
    static Random rnd = new Random();

    static class Writer {
        private Map<Integer, Double[]> writeable = new HashMap<>();

        void write() {
            int i = rnd.nextInt(writeable.size());
            // manipulate writeable.get(i)...
            committed.get(i).set(writeable.get(i).clone());
        }
    }

    static class Reader {
        void read() {
            Double[] arr = committed.get(rnd.nextInt(committed.size())).get();
            // do something useful with arr...
        }
    }
}
You need two static references, readArray and writeArray, and a simple mutex to track whether the write array has been changed.
Have a locked function called changeWriteArray make changes to a deep copy of writeArray:
synchronized String[] changeWriteArray(String[] writeArrayCopy, other params go here){
// here make changes to deepCopy of writeArray
//then return deepCopy
return writeArrayCopy;
}
Notice that changeWriteArray is functional in style, with effectively no side effects, since it returns a copy that is neither readArray nor writeArray.
Whoever calls changeWriteArray must call it as writeArray = changeWriteArray(writeArray.deepCopy()).
The mutex is changed by both changeWriteArray and updateReadArray, but is only checked by updateReadArray. If the mutex is set, updateReadArray simply points the readArray reference at the actual block of writeArray.
EDIT:
@vemv, concerning the answer you mentioned: while the ideas are the same, the difference is significant. The two references are static, so no time is spent actually copying the changes into readArray; rather, the readArray reference is moved to point at writeArray. Effectively, we swap by means of a tmp array that changeWriteArray generates as necessary. Also, the locking here is minimal: reading requires no lock, in the sense that you can have more than one reader at any given time.
In fact, with this approach you can keep a count of concurrent readers and check that the counter is zero to decide when to update readArray from writeArray; again, reading requires no lock at all.
Improving on @zhong.j.yu's answer: it is really a good idea to queue the writes instead of trying to perform them as they occur. However, we must tackle the problem of updates coming in so fast that the reader would choke on them. My idea: what if the reader only performs the writes that were queued before the read started, ignoring subsequent writes (those are handled by the next read)?
You will need to write your own synchronized queue. It will be based on a linked list and contain only two methods:
public synchronized void enqueue(Write write);
This method atomically enqueues a write. There is a possible backlog if writes come in faster than they can actually be enqueued, but I think there would have to be hundreds of thousands of writes every second to achieve that.
public synchronized Element cut();
This atomically empties the queue and returns its head (or tail) as an Element object. The Element contains a chain of other Elements (Element.next, etc., the usual linked-list machinery), all of them representing the chain of writes since the last read. The queue is then empty, ready to accept new writes. The reader can then trace the Element chain (which by then is standalone, untouched by subsequent writes), perform the writes, and finally perform the read. While the reader processes the read, new writes are enqueued in the queue, but those are the next read's problem.
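A rough sketch of that queue (Element and Write are the types from the description above; the list plumbing is my assumption):
class WriteQueue {
    static class Element {
        final Write write;
        Element next;
        Element(Write write) { this.write = write; }
    }

    private Element head, tail;

    public synchronized void enqueue(Write write) {
        Element e = new Element(write);
        if (tail == null) { head = e; } else { tail.next = e; }
        tail = e;
    }

    // Atomically detach the whole chain; the caller walks it without holding the lock.
    public synchronized Element cut() {
        Element h = head;
        head = null;
        tail = null;
        return h;
    }
}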
I wrote this once, albeit in C++, to represent a sound data buffer. There were more writes (the driver sends more data) than reads (some mathematical processing of the data), and the writes had to finish as soon as possible. (The data came in real time, so I needed to save a batch before the next one was ready in the driver.)
I've got a funny solution using three arrays and a volatile boolean toggle. Basically, each thread has its own array. Additionally, there's a shared array controlled via the toggle.
When the writer finishes and the toggle allows it, it copies the newly written array into the shared array and flips the toggle.
Similarly, before the reader starts, when the toggle allows it, it copies the shared array into its own array and flips the toggle.
import java.util.function.Consumer;

public class MolecularArray {
private final double[] writeArray;
private final double[] sharedArray;
private final double[] readArray;
private volatile boolean writerOwnsShared;
MolecularArray(int length) {
writeArray = new double[length];
sharedArray = new double[length];
readArray = new double[length];
}
void read(Consumer<double[]> reader) {
if (!writerOwnsShared) {
copyFromTo(sharedArray, readArray);
writerOwnsShared = true;
}
reader.accept(readArray);
}
void write(Consumer<double[]> writer) {
writer.accept(writeArray);
if (writerOwnsShared) {
copyFromTo(writeArray, sharedArray);
writerOwnsShared = false;
}
}
private void copyFromTo(double[] from, double[] to) {
System.arraycopy(from, 0, to, 0, from.length);
}
}
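For illustration, a hypothetical usage with one writer thread and one reader thread:
MolecularArray shared = new MolecularArray(20);
// writer thread:
shared.write(arr -> arr[0] = 42.0);
// reader thread:
shared.read(arr -> System.out.println(arr[0]));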
It depends on the "single writer thread and single reader" assumption.
It never blocks.
It uses a constant (albeit huge) amount of memory.
Repeated calls to read without any intervening write do no copying and vice versa.
The reader does not necessarily see the most recent data, but it sees the data from the first write started after the previous read, if any.
I guess this could be improved by using two shared arrays.