Not reevaluating expensive data in different threads

I have a method like this:
public Object doSomethingExpensive(String x);
Once I have run this method, I can save the result in a HashMap, for example, where the key is the String x and the value is the result Object.
If the data is already present in this map, I don't have to compute it again.
But now I get two requests at nearly the same time.
In that case I want the second request to wait until the first one is done and then pick up the result of the first one once it is calculated, so that I don't have to calculate it twice, whether in sequence or in parallel.
The point is, I can't use
public synchronized Object doSomethingExpensive(String x);
because the resulting Object is different whenever the String x is different.
So I need some kind of synchronization on that String x.
But synchronized(x) isn't an option because of the way Java handles string literals....
Also, if x were an Object rather than a String, the second request might carry a similar object whose content is equal to that of request 1, but the two would still be distinct objects.
So my question is: how can I prevent the result for a String x from being calculated twice in parallel? How can I synchronize this while caching the results in a HashMap, for example?

I'm not sure I understand your problem, but if the goal is to avoid repeated calculation, the book Java Concurrency in Practice gives an example of a solution:
private final Map<String, Future<Object>> cache
= new ConcurrentHashMap<String, Future<Object>>();
public Object doSomethingExpensive(String x) throws InterruptedException {
while (true) {
Future<Object> future = cache.get(x);
if (future == null) {
Callable<Object> callable = new Callable<Object>() {
@Override
public Object call() throws Exception {
// doSomethingExpensive todo
return new Object();
}
};
FutureTask<Object> futureTask = new FutureTask<>(callable);
future = cache.putIfAbsent(x, futureTask);
if (future == null) {
future = futureTask;
futureTask.run();
}
}
try {
return future.get();
} catch (CancellationException e) {
cache.remove(x, future); // remove only if the cancelled future is still the mapped value
} catch (ExecutionException e) {
throw new RuntimeException(e.getCause());
}
}
}
EDIT:
From the comments: using Java 8's ConcurrentHashMap#computeIfAbsent is really convenient:
ConcurrentHashMap<String, Object> concurrentHashMap = new ConcurrentHashMap<>();
public Object doSthEx(String key) {
return concurrentHashMap.computeIfAbsent(key, new Function<String, Object>() {
@Override
public Object apply(String s) {
// todo
return new Object();
}
});
}
Or use a library to get more comprehensive features, as mentioned in a comment: https://github.com/ben-manes/caffeine.
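To see the computeIfAbsent approach behave the way the question asks, here is a minimal sketch. The class name ComputeOnceDemo, the counter, and the 100 ms sleep are assumptions made for illustration; only computeIfAbsent itself comes from the answer above. Several threads ask for the same key at once, and the counter shows that the expensive work ran only once.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative, not from the question.
class ComputeOnceDemo {
    static final ConcurrentHashMap<String, Object> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    // Stand-in for the expensive method from the question.
    static Object doSomethingExpensive(String x) {
        return CACHE.computeIfAbsent(x, key -> {
            COMPUTATIONS.incrementAndGet();
            try {
                Thread.sleep(100); // simulate slow work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "result for " + key;
        });
    }

    // Starts the given number of callers for one key and returns how often the work ran.
    static int runConcurrently(String key, int threads) {
        int before = COMPUTATIONS.get();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // line all callers up so they arrive together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // new String(...) gives each caller a distinct but equal key object
                doSomethingExpensive(new String(key));
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return COMPUTATIONS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently("a", 8)); // prints 1
    }
}
```

Because the map compares keys with equals(), equal strings share one entry even when they are different objects, which addresses the concern about synchronizing on x directly. One caveat: the mapping function runs while the map holds a lock for that key's bin, so other updates that hash to the same bin may wait for it too.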

Related

How to continuously scan input into console?

I have taken a task upon myself to learn Java. My idea was to create a simple game with only the text console. The "AI" (timer) will periodically send a string and the player has to write a correct string in response, otherwise s/he loses a life.
My first question therefore is: Is there a simple way to combine timer and scanner? I need it to constantly "watch" the console line for strings.
After some time searching, and several attempts in which I mostly struggled to scan text while generating strings (or to generate strings while scanning), I found the following code, but it has an issue at:
if ((name =in.nextLine(2000)) ==null)
If I rewrite the condition to, for example, compare to !="a" instead of null, the code just ignores the condition and always writes "Too slow!" no matter what. If it is =="a" it always says Hello, a. I completely don't understand why, it seems to ignore the logic.
So the second question would have been, why does it ignore the logic when it is different? And how do I fix it?
public class TimedScanner
{
public TimedScanner(InputStream input)
{
in = new Scanner(input);
}
private Scanner in;
private ExecutorService ex = Executors.newSingleThreadExecutor(new ThreadFactory()
{
@Override
public Thread newThread(Runnable r)
{
Thread t = new Thread(r);
t.setDaemon(true);
return t;
}
});
public static void main(String[] args) {
TimedScanner in = new TimedScanner(System.in);
int playerHealth = 5;
System.out.print("Enter your name: ");
try {
while (playerHealth > 0) {
String name = null;
if ((name = in.nextLine(3000)) ==null) {
System.out.println(name);
System.out.println("Too slow!");
playerHealth--;
} else {
System.out.println(name);
System.out.println("Hello, " + name);
}
}
} catch (InterruptedException | ExecutionException e) {
//TODO Auto-generated catch block
e.printStackTrace();
}
}
public String nextLine(int timeout) throws InterruptedException, ExecutionException
{
Future<String> result = ex.submit(new Worker());
try
{
return result.get(timeout, TimeUnit.MILLISECONDS);
}
catch (TimeoutException e)
{
return null;
}
}
private class Worker implements Callable<String>
{
@Override
public String call() throws Exception
{
return in.nextLine();
}
}
}
This is a very bare-bones idea of what it should do. In the while loop I plan to put a randomly picked string that will be compared with the console input: wrong input means playerHealth--; correct input means something else.

2) why does it ignore the logic when it is different? And how do I fix it?
You've stated:
If I rewrite the condition to, for example, compare to !="a" instead
of null, the code just ignores the condition and always writes "Too
slow!" no matter what.
In Java, NEVER (or almost never) compare two strings using == or !=. A String is an Object, so comparing with == compares references (addresses), not values. So
if ((name = in.nextLine(3000)) != "a")
will always (or almost always) evaluate to true, because any string returned from in.nextLine(), be it "a" or something different, is allocated on the heap at a different address than your hardcoded "a" literal. I say "almost" because Java uses a String Pool: when creating a new reference to a literal, it checks whether that string is already present in the pool in order to reuse it. But you should never rely on ==. Instead, use String.equals().
More discussion about the Java String Pool here.
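The difference between == and equals() described above can be checked with a small sketch. The class and method names are made up for illustration, and new String("a") stands in for a line read from the Scanner, which is likewise a fresh object rather than the pooled literal.

```java
// Hypothetical demo class; names are illustrative.
class StringEqualityDemo {
    // Compares references: true only if both variables point to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares contents; calling equals on the literal also tolerates a null input.
    static boolean isA(String input) {
        return "a".equals(input);
    }

    public static void main(String[] args) {
        String literal = "a";
        String fromInput = new String("a"); // distinct object with equal content

        System.out.println(sameReference(literal, fromInput)); // false
        System.out.println(isA(fromInput));                    // true
        System.out.println(isA(null));                         // false, no NullPointerException
    }
}
```

Writing the comparison as "a".equals(name) is convenient in the timed-scanner code because nextLine(timeout) returns null on a timeout.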
1) Is there a simple way to combine timer and scanner?
Well, a console UI isn't really friendly with multi-threading when it comes to reading user input, but it can be done...
Your code has an issue: whenever the player loses a life, they have to press Enter twice; when they lose 2 lives consecutively, they have to press Enter 3 times in order to receive positive feedback from the "AI". This is because you're not killing the preceding thread / cancelling the preceding task. So I suggest the following code:
private static Scanner in;
public String nextLine(int timeout) throws InterruptedException, ExecutionException
{
//keep a reference to the current worker
Worker worker = new Worker();
Future<String> result = ex.submit(worker);
try
{
return result.get(timeout, TimeUnit.MILLISECONDS);
}
catch (TimeoutException e)
{
//ask the worker thread to stop
worker.interrupt();
return null;
}
}
private class Worker implements Callable<String>
{
//you want the most up-to-date value of the flag, so 'volatile', though it's not really necessary
private volatile boolean interrupt;
@Override
public String call() throws Exception
{
//check whether there's something in the buffer;
while (System.in.available() == 0){
Thread.sleep(20);
//check for the interrupt flag
if(interrupt){
throw new InterruptedException();
}
}
//once this method is called there's no friendly way back - that's why we checked for nr of available bytes previously
return in.nextLine();
}
public void interrupt(){
this.interrupt = true;
}
}

How to block write access to the array from Thread while reading

I have two threads running in parallel, and to get information about their internal results I have created an int array of length 8. Based on its id, each thread can update its own area of the statu array; the threads are not allowed to write to each other's areas. To correctly get and display the statu array, I am trying to write a getStatu method. While getting the result, I want to block the other threads from writing to the statu array; unfortunately, I don't see how to block writes to the statu array while I am reading and displaying the result in getStatu. How can I do this?
Note: If any part is unclear, tell me and I will fix it.
class A{
Semaphore semaphore;
int [] statu; // statu is of length 8
void update(int idOfThread, int []statu_){
try {
semaphore.acquire();
int idx = idOfThread * 4;
statu[idx] = statu_[0];
statu[idx+1] = statu_[1];
statu[idx+2] = statu_[2];
statu[idx+3] = statu_[3];
} catch (...) {
} finally {
semaphore.release();
}
}
int[] getStatu(){
// Block write access of threads
// display statu array
// return statu array as return value
// release block, so threads can write to the array
}
}

Apart from using a lock/sync mechanism other than Semaphore, here is just a proposal to improve this a little.
Putting both status[4] arrays into a single array[8] is not the best solution. Consider task A writing its quadruplet: it must lock out task B reading the same, but there's no point in locking out task B writing B's quadruplet, and vice versa.
Generally speaking, the granularity of what is being locked is one important factor: locking the entire database is nonsense (except for overall processing like backup), however locking individual fields of a record would produce excessive overhead.

There are possibly better ways to get to where you want to, but only you know what you are trying to do. Going with your own scheme, there are things you are doing wrong. First thing, currently you are not achieving the granular locking you are planning to. For that you must have an array of semaphores. So the acquisition will look something like
semaphore[idOfThread].acquire();
Secondly, one thing you've not realised is that controlled access to data among threads is a co-operative activity. You cannot lock on one thread and not care to deal with locking on another and somehow impose the access control.
So unless the caller of your getStatu() will use the same set of semaphores when inspecting the array, your best bet is for getStatu() to make a new int[] array, copying segments of each thread after locking with the respective semaphore. So the array returned by getStatu() will be a snapshot at the point of call.
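The snapshot idea described above could look roughly like the sketch below, assuming one semaphore per writer thread and four slots per thread. The class name SegmentedStatus and the use of acquireUninterruptibly() (chosen to keep the example free of checked exceptions) are assumptions for illustration, not part of the question's code.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: one semaphore guards each thread's four-slot segment.
class SegmentedStatus {
    private static final int SEGMENT = 4;
    private final Semaphore[] locks;
    private final int[] statu;

    SegmentedStatus(int threads) {
        locks = new Semaphore[threads];
        for (int i = 0; i < threads; i++) {
            locks[i] = new Semaphore(1); // binary semaphore per segment
        }
        statu = new int[threads * SEGMENT];
    }

    // Called by worker idOfThread; blocks only readers of the same segment.
    void update(int idOfThread, int[] values) {
        locks[idOfThread].acquireUninterruptibly();
        try {
            System.arraycopy(values, 0, statu, idOfThread * SEGMENT, SEGMENT);
        } finally {
            locks[idOfThread].release();
        }
    }

    // Copies each segment under its own semaphore and returns the copy.
    int[] getStatu() {
        int[] snapshot = new int[statu.length];
        for (int id = 0; id < locks.length; id++) {
            locks[id].acquireUninterruptibly();
            try {
                System.arraycopy(statu, id * SEGMENT, snapshot, id * SEGMENT, SEGMENT);
            } finally {
                locks[id].release();
            }
        }
        return snapshot;
    }

    public static void main(String[] args) {
        SegmentedStatus s = new SegmentedStatus(2);
        s.update(1, new int[] {1, 2, 3, 4});
        System.out.println(java.util.Arrays.toString(s.getStatu()));
    }
}
```

Each segment in the returned array is internally consistent, but the segments are copied one after another, so the snapshot is not a single instant across both threads. If that matters, getStatu() would need to hold all semaphores at once.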

Please try the code below; it should work for you. Call afterStatu() on it to read the array.
class A {
Semaphore semaphore;
int[] statu; // statu is of length 8
private volatile boolean stuck; // volatile so that spinning writers see the flag change
public A() {
}
void update(int idOfThread, int[] statu_) {
// if true, control will not go further
while (stuck);
try {
semaphore.acquire();
int idx = idOfThread * 4;
statu[idx] = statu_[0];
statu[idx + 1] = statu_[1];
statu[idx + 2] = statu_[2];
statu[idx + 3] = statu_[3];
} catch (Exception e) {
} finally {
semaphore.release();
}
}
int[] getStatu() {
// Block write access of threads
stuck = true;
// display statu array
for (int eachStatu : statu) {
System.out.println(eachStatu);
}
// return statu array as return value
return statu;
}
public void afterStatu() {
getStatu();
// release block, so threads can write to the array
stuck = false;
}
}

ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
int[] statu;
void update() {
lock.writeLock().lock();
try {
// update statu
} finally {
lock.writeLock().unlock();
}
}
int[] getStatu() {
lock.readLock().lock();
try {
return statu.clone(); // return a copy made while holding the read lock
} finally {
lock.readLock().unlock();
}
}

Like ac3 said, only you know what you are trying to do.
Here's a solution that might be useful in the case where every thread that calls update() does so frequently, and calls to getStatu() are infrequent. It's complex, but it allows most of the update() calls to happen without any locking at all.
static final int NUMBER_OF_WORKER_THREADS = ...;
final AtomicReference<CountDownLatch> pauseRequested = new AtomicReference<CountDownLatch>(null);
final Object lock = new Object();
int[] statu = ...
//called in "worker" thread.
void update() {
if (pauseRequested.get() != null) {
pause();
}
... update my slots in statu[] array ...
}
private void pause() {
notifyMasterThatIAmPaused();
waitForMasterToLiftPauseRequest();
}
private void notifyMasterThatIAmPaused() {
pauseRequested.get().countDown();
}
private void waitForMasterToLiftPauseRequest() {
synchronized(lock) {
while (pauseRequested.get() != null) {
lock.wait();
}
}
}
//called in "master" thread
int[] getStatu( ) {
int[] result;
CountDownLatch cdl = requestWorkersToPause();
waitForWorkersToPause(cdl);
result = Arrays.copyOf(statu, statu.length);
liftPauseRequest();
return result;
}
private CountDownLatch requestWorkersToPause() {
CountDownLatch cdl = new CountDownLatch(NUMBER_OF_WORKER_THREADS);
pauseRequested.set(cdl);
return cdl;
}
private void waitForWorkersToPause(CountDownLatch cdl) {
cdl.await();
}
private void liftPauseRequest() {
synchronized(lock) {
pauseRequested.set(null);
lock.notifyAll();
}
}

What will happen when two threads execute cache.putIfAbsent at the same time?

I am learning Java Concurrency in Practice, but some code confused me:
private final ConcurrentHashMap<A, Future<V>> cache = new ConcurrentHashMap<A, Future<V>>();
private final Computable<A, V> c;
public Memoizer(Computable<A, V> c) {
this.c = c;
}
/* (non-Javadoc)
* @see com.demo.buildingblocks.Computable#compute(java.lang.Object)
*/
@Override
public V compute(final A arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
//
Callable<V> eval = new Callable<V>() {
@Override
public V call() throws Exception {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
// what will happen when two threads arrive here at the same time?
f = cache.putIfAbsent(arg, ft);
if (f == null) {
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
launderThrowable(e);
}
}
}
I just can't understand this: putIfAbsent only guarantees that the put operation is atomic, so if both calls return null, can't the two threads both enter the run method?

putIfAbsent guarantees thread safety not only in the sense that it won't corrupt your data, but also in the sense that it always works on an up-to-date copy of the data.
Also, note that it returns the previous value from the map, if such a value existed. So the first call to putIfAbsent would succeed and return null, since there is no previous value. The second call will block until the first one succeeds, and then return the first value that was put in the map, causing the second run() to never be called.
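The behaviour described above can be observed directly. In this minimal sketch (the class name PutIfAbsentRace and the winner counter are assumptions for illustration), several threads call putIfAbsent for the same key at once, and only one of them sees null.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative.
class PutIfAbsentRace {
    // Returns how many of the racing threads got null back, i.e. won the insertion.
    static int countWinners(int threads) {
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // null means "no previous value": this thread's entry was stored
                if (cache.putIfAbsent("arg", id) == null) {
                    winners.incrementAndGet();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return winners.get();
    }

    public static void main(String[] args) {
        System.out.println(countWinners(8)); // prints 1
    }
}
```

In the Memoizer, the one thread that gets null is the one that calls ft.run(); every other thread receives the winner's Future and waits on its get().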

The basis of all atomic implementations is the compareAndSet(expected, newValue) method. So if 2 threads arrive and each wants to change an element that is currently, let's say, 1 to 3, the following happens:
Thread A: value.compareAndSet(1, 3) - success: value is now 3, return true.
Thread B: value.compareAndSet(1, 3) - failure: value is not 1 as expected, return false.
Which of those threads arrives first is undefined, but since compareAndSet is an atomic operation, the threads are guaranteed not to interfere with each other while executing it.
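The two-step scenario above can be replayed on a single thread with AtomicInteger. This is only an illustration of the compareAndSet contract; the class and method names are made up, and it says nothing about how ConcurrentHashMap is implemented internally.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative.
class CasDemo {
    // Tries to move the value from 1 to 3 and reports whether this attempt won.
    static boolean tryOneToThree(AtomicInteger value) {
        return value.compareAndSet(1, 3);
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(1);
        System.out.println(tryOneToThree(value)); // true: value was 1 and is now 3
        System.out.println(tryOneToThree(value)); // false: value is 3, not the expected 1
        System.out.println(value.get());          // 3
    }
}
```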

One item (int) caching in JAVA with a special use case

I'm trying to cache an int value (counting something from a DB).
This count can potentially take a lot of time, so I'd like to first try to do it with a timeout of 200 milliseconds.
But if that fails, I have 2 scenarios:
My cache is populated, return the current value and re-populate it asynchronously.
The cache is not populated, block by populating it and return the value.
I'm saying "cache" but it really is just an int value, I'm not sure a full blown cache is needed here.
I've tried using Supplier from Guava, but I don't find a way to integrate my specific use case with it.
Bear in mind that many threads can enter this entire procedure, I only want the first one to wait in case the cache is not populated.
The rest should not wait and immediately get the cached value, an updated one if some other thread finished re-populating the cache.
Here is a sample code of what I have now:
public class CountRetriever {
private Supplier<Integer> cache = Suppliers.memoize(countSupplier());
private Supplier<Integer> countSupplier() {
return new Supplier<Integer>() {
@Override
public Integer get() {
// Do heavy count from the DB
}
};
}
public int getCount() {
try {
return submitAsyncFetch();
} catch (Exception e) {
// It takes too long, let's use the cache
return cache.get();
}
}
private Integer submitAsyncFetch() {
return executor.submit(new Callable<Integer>() {
@Override
public Integer call() throws Exception {
// Do heavy count from the DB
}
}).get(200, TimeUnit.MILLISECONDS);
}
}

Have you tried something like this?
private final Map<String,Integer> cache = new HashMap<String,Integer>();
public Integer getValue(String key){
synchronized(cache){
Integer value = cache.get(key);
if(value == null) {
value = computeValue(key); // hypothetical helper that does the expensive work
cache.put(key,value);
}
return value;
}
}
Of course, if a second thread comes in while the cache is being populated, it will have to wait. Also, in this solution only one thread can read the value at a time, so you might want to replace the synchronized block with a ReadWriteLock, although I doubt it will have a heavy effect on performance, given how simple the critical block is.

AtomicReference to a mutable object and visibility

Say I have an AtomicReference to a list of objects:
AtomicReference<List<?>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>());
Thread A adds elements to this list: batch.get().add(o);
Later, thread B takes the list and, for example, stores it in a DB: insertBatch(batch.get());
Do I have to do additional synchronization when writing (Thread A) and reading (Thread B) to ensure thread B sees the list the way A left it, or is this taken care of by the AtomicReference?
In other words: if I have an AtomicReference to a mutable object, and one thread changes that object, do other threads see this change immediately?
Edit:
Maybe some example code is in order:
public void process(Reader in) throws IOException {
List<Future<AtomicReference<List<Object>>>> tasks = new ArrayList<Future<AtomicReference<List<Object>>>>();
ExecutorService exec = Executors.newFixedThreadPool(4);
for (int i = 0; i < 4; ++i) {
tasks.add(exec.submit(new Callable<AtomicReference<List<Object>>>() {
@Override public AtomicReference<List<Object>> call() throws IOException {
final AtomicReference<List<Object>> batch = new AtomicReference<List<Object>>(new ArrayList<Object>(batchSize));
Processor.this.parser.parse(in, new Parser.Handler() {
@Override public void onNewObject(Object event) {
batch.get().add(event);
if (batch.get().size() >= batchSize) {
dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));
}
}
});
return batch;
}
}));
}
List<Object> remainingBatches = new ArrayList<Object>();
for (Future<AtomicReference<List<Object>>> task : tasks) {
try {
AtomicReference<List<Object>> remainingBatch = task.get();
remainingBatches.addAll(remainingBatch.get());
} catch (ExecutionException e) {
Throwable cause = e.getCause();
if (cause instanceof IOException) {
throw (IOException)cause;
}
throw (RuntimeException)cause;
}
}
// these haven't been flushed yet by the worker threads
if (!remainingBatches.isEmpty()) {
dao.insertBatch(remainingBatches);
}
}
What happens here is that I create four worker threads to parse some text (this is the Reader in parameter to the process() method). Each worker saves the lines it has parsed in a batch, and flushes the batch when it is full (dao.insertBatch(batch.getAndSet(new ArrayList<Object>(batchSize)));).
Since the number of lines in the text isn't a multiple of the batch size, the last objects end up in a batch that isn't flushed, since it's not full. These remaining batches are therefore inserted by the main thread.
I use AtomicReference.getAndSet() to replace the full batch with an empty one. Is this program correct with regard to threading?

Um... it doesn't really work like this. AtomicReference guarantees that the reference itself is visible across threads i.e. if you assign it a different reference than the original one the update will be visible. It makes no guarantees about the actual contents of the object that reference is pointing to.
Therefore, read/write operations on the list contents require separate synchronization.
Edit: So, judging from your updated code and the comment you posted, setting the local reference to volatile is sufficient to ensure visibility.

I think that, forgetting all the code here, your exact question is this:
Do I have to do additional synchronization when writing (Thread A) and
reading (Thread B) to ensure thread B sees the list the way A left it,
or is this taken care of by the AtomicReference?
So, the exact response to that is: YES, atomics take care of visibility. And that is not my opinion but what the JDK documentation says:
The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in The Java Language Specification, Third Edition (17.4 Memory Model).
I hope this helps.

Adding to Tudor's answer: You will have to make the ArrayList itself threadsafe or - depending on your requirements - even larger code blocks.
If you can get away with a threadsafe ArrayList you can "decorate" it like this:
batch = java.util.Collections.synchronizedList(new ArrayList<Object>());
But keep in mind: even with this, "simple" constructs like the following are not threadsafe:
Object o = batch.get(batch.size()-1);
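The expression above is a compound action: another thread can remove an element between size() and get(). A minimal sketch of one way to make it safe follows, assuming the list was created with Collections.synchronizedList, whose documentation asks callers to synchronize on the returned list for multi-step operations such as iteration. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; names are illustrative.
class LastElementDemo {
    // Returns the last element, or null if the list is empty.
    // Holding the list's own monitor makes size() and get() one atomic step.
    static Object last(List<Object> batch) {
        synchronized (batch) {
            return batch.isEmpty() ? null : batch.get(batch.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<Object> batch = Collections.synchronizedList(new ArrayList<Object>());
        batch.add("first");
        batch.add("second");
        System.out.println(last(batch)); // second
    }
}
```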

The AtomicReference will only help you with the reference to the list, it will not do anything to the list itself. More particularly, in your scenario, you will almost certainly run into problems when the system is under load where the consumer has taken the list while the producer is adding an item to it.
This sounds to me like you should be using a BlockingQueue. You can then limit the memory footprint if your producer is faster than your consumer and let the queue handle all contention.
Something like:
ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (50);
// ... Producer
queue.put(o);
// ... Consumer
List<Object> queueContents = new ArrayList<Object> ();
// Grab everything waiting in the queue in one chunk. Should never be more than 50 items.
queue.drainTo(queueContents);
Added
Thanks to @Tudor for pointing out the architecture you are using. ... I have to admit it is rather strange. You don't really need AtomicReference at all as far as I can see. Each thread owns its own ArrayList until it is passed on to dao, at which point it is replaced, so there is no contention at all anywhere.
I am a little concerned about you creating four parsers on a single Reader. I hope you have some way of ensuring each parser does not affect the others.
I personally would use some form of producer-consumer pattern as I have described in the code above. Something like this perhaps.
static final int PROCESSES = 4;
static final int batchSize = 10;
public void process(Reader in) throws IOException, InterruptedException {
final List<Future<Void>> tasks = new ArrayList<Future<Void>>();
ExecutorService exec = Executors.newFixedThreadPool(PROCESSES);
// Queue of objects.
final ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<Object> (batchSize * 2);
// The final object to post.
final Object FINISHED = new Object();
// Start the producers.
for (int i = 0; i < PROCESSES; i++) {
tasks.add(exec.submit(new Callable<Void>() {
@Override
public Void call() throws IOException {
Processor.this.parser.parse(in, new Parser.Handler() {
@Override
public void onNewObject(Object event) {
queue.add(event);
}
});
// Post a finished down the queue.
queue.add(FINISHED);
return null;
}
}));
}
// Start the consumer.
tasks.add(exec.submit(new Callable<Void>() {
@Override
public Void call() throws IOException, InterruptedException { // queue.take() can throw InterruptedException
List<Object> batch = new ArrayList<Object>(batchSize);
int finishedCount = 0;
// Until all threads finished.
while ( finishedCount < PROCESSES ) {
Object o = queue.take();
if ( o != FINISHED ) {
// Batch them up.
batch.add(o);
if ( batch.size() >= batchSize ) {
dao.insertBatch(batch);
// If insertBatch takes a copy we could merely clear it.
batch = new ArrayList<Object>(batchSize);
}
} else {
// Count the finishes.
finishedCount += 1;
}
}
// Finished! Post any incomplete batch.
if ( batch.size() > 0 ) {
dao.insertBatch(batch);
}
return null;
}
}));
// Wait for everything to finish.
exec.shutdown();
// Wait until all is done.
boolean finished = false;
do {
try {
// Wait up to 1 second for termination.
finished = exec.awaitTermination(1, TimeUnit.SECONDS);
} catch (InterruptedException ex) {
}
} while (!finished);
}
