threading: search for a value and stop all threads - java

I need to implement a search method that searches through haystack and returns the first index at which needle is found.
static int search(T needle, T[] haystack, int numThreads)
My question: how can I stop all the threads as soon as one of them finds a result?
For example: I am searching for 5 in the ten-element array [2, 4, 5, 6, 1, 4, 5, 8, 9, 3] with 2 threads. The first thread searches the first part, [0 - 5), and the second thread searches the other part, [5 - 10). If thread 2 starts first and finds a result before thread 1 does, the method should return 6 and both threads should terminate.

The classic way of doing this is to simply have shared data between the threads so that they can communicate with each other. In other words, initialise some flag value to "not found" before starting the threads.
Then, when the threads are running, they process elements in the array until either their elements are exhausted, or the flag value has been set to "found".
In pseudo-code, that would be something like:
main():
    global array = createArray(size = 10000, values = random)
    global foundIndex = -1
    global mutex = createMutex()
    startThread(id = 1, func = threadFn, param = (0, 4999))
    startThread(id = 2, func = threadFn, param = (5000, 9999))
    waitFor(id = 1)
    waitFor(id = 2)
    print("Result is ", foundIndex)
threadFn(first, last):
    for index in first through last inclusive:
        if userSpecifiedCheckFound(array[index]):
            mutex.lock()
            if foundIndex == -1:
                foundIndex = index
            mutex.unlock()
            return
        mutex.lock()
        localIndex = foundIndex
        mutex.unlock()
        if localIndex != -1:
            return
You can see from that that each instance of the function will set the shared data and return if it finds a value that matches whatever criteria you're looking for. It will also return (without setting the shared data) if another thread has already set the shared data, meaning it can exit early if another thread has already found something.
Just keep in mind that the shared data, foundIndex in this case, needs to be protected from simultaneous changes lest it become corrupted. In the pseudo-code, I've shown how to do that with low-level mutual exclusion semaphores.
In Java, that means using synchronized to achieve the same effect. By way of example, the following code sets up some suitable test data so that the sixteenth cell of the twenty-cell array will satisfy the search criteria.
It then runs two threads, one on each half of the data, until it finds that cell.
public class TestProg extends Thread {
    // Shared data.
    static int[] sm_array = new int[20];
    static int sm_foundIndex = -1;
    // One lock shared by every thread. synchronized(this) would lock a
    // different object in each thread and so protect nothing.
    static final Object sm_lock = new Object();
    // Each thread responsible for its own stuff.
    private int m_id, m_curr, m_last;
    public TestProg(int id, int first, int last) {
        m_id = id;
        m_curr = first;
        m_last = last;
    }
    // Runnable: continue until someone finds it.
    public void run() {
        // Try all cells allotted to thread.
        while (m_curr <= m_last) {
            System.out.println(m_id + ": processing " + m_curr);
            // If I find it first, save and exit.
            if (sm_array[m_curr] != 0) {
                synchronized (sm_lock) {
                    if (sm_foundIndex == -1) {
                        sm_foundIndex = m_curr;
                        System.out.println(m_id + ": early exit, I found it");
                        return;
                    }
                }
            }
            // If someone else finds it, just exit.
            synchronized (sm_lock) {
                if (sm_foundIndex != -1) {
                    System.out.println(m_id + ": early exit, sibling found it");
                    return;
                }
            }
            // Kludge to ensure threads run side-by-side.
            try { Thread.sleep(100); } catch (Exception e) {}
            m_curr++;
        }
    }
    public static void main(String[] args) {
        // Create test data.
        for (int i = 0; i < 20; i++) {
            sm_array[i] = 0;
        }
        sm_array[15] = 1;
        // Create and start threads.
        TestProg thread1 = new TestProg(1, 0, 9);
        TestProg thread2 = new TestProg(2, 10, 19);
        thread1.start();
        thread2.start();
        // Wait for both to finish, then print result.
        try {
            thread1.join();
            thread2.join();
            System.out.println("=> Result was " + sm_foundIndex);
        } catch (Exception e) {
            System.out.println("Interrupted: " + e);
        }
    }
}
The output of that code (although threading makes it a little non-deterministic) is:
1: processing 0
2: processing 10
1: processing 1
2: processing 11
1: processing 2
2: processing 12
1: processing 3
2: processing 13
1: processing 4
2: processing 14
1: processing 5
2: processing 15
2: early exit, I found it
1: processing 6
1: early exit, sibling found it
=> Result was 15
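For what it's worth, the same shared-flag idea can be fitted to the signature from the question. The following is only a sketch under my own assumptions: the class name FlagSearch is made up, an AtomicInteger stands in for the mutex-protected flag, and the index returned is whichever match gets published first, which is not necessarily the lowest one.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class; the name is not from the question.
class FlagSearch {
    // Returns the index of a match, or -1 if needle is absent.
    static <T> int search(T needle, T[] haystack, int numThreads) {
        AtomicInteger foundIndex = new AtomicInteger(-1); // shared "not found" flag
        Thread[] workers = new Thread[numThreads];
        int chunk = (haystack.length + numThreads - 1) / numThreads;
        for (int t = 0; t < numThreads; t++) {
            final int first = t * chunk;
            final int last = Math.min(haystack.length, first + chunk);
            workers[t] = new Thread(() -> {
                // Keep going only while nobody has published a result.
                for (int i = first; i < last && foundIndex.get() == -1; i++) {
                    if (Objects.equals(needle, haystack[i])) {
                        foundIndex.compareAndSet(-1, i); // only the first writer wins
                        return;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return foundIndex.get();
    }
}
```

Nothing is forcibly killed here: each thread simply notices the flag on its next iteration and returns, which is usually what "stop all threads" should mean in Java.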

You could look at ExecutorCompletionService: once the first result is available, cancel all the other tasks.
A CompletionService uses a supplied Executor to execute tasks and
places the resulting futures on a queue, from which you can take the results in the order in which they complete.
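A rough sketch of how that could look for the search method from the question. The class name CompletionSearch and the way the array is split are my own assumptions; the calls used (submit, take, Future.cancel, shutdownNow) are the standard java.util.concurrent ones. As with any "first to finish" scheme, the index returned is the first hit reported, not necessarily the lowest index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class; the name is not from the question.
class CompletionSearch {
    // Returns the index of a match, or -1 if needle is absent.
    static <T> int search(T needle, T[] haystack, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        ExecutorCompletionService<Integer> done = new ExecutorCompletionService<>(pool);
        List<Future<Integer>> futures = new ArrayList<>();
        int chunk = (haystack.length + numThreads - 1) / numThreads;
        try {
            for (int t = 0; t < numThreads; t++) {
                final int first = t * chunk;
                final int last = Math.min(haystack.length, first + chunk);
                futures.add(done.submit(() -> {
                    for (int i = first; i < last; i++) {
                        // cancel(true) interrupts the worker; give up when that happens.
                        if (Thread.currentThread().isInterrupted()) return -1;
                        if (Objects.equals(needle, haystack[i])) return i;
                    }
                    return -1;
                }));
            }
            // Take results in completion order and stop at the first real hit.
            for (int n = 0; n < futures.size(); n++) {
                int index = done.take().get();
                if (index != -1) return index;
            }
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            for (Future<Integer> f : futures) f.cancel(true); // no-op for finished tasks
            pool.shutdownNow();
        }
    }
}
```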

Related

REST parallel calls to a service - multithreading in Java

I call a REST API that returns at most 1000 results per call, starting with page=1:
{
    "status": "OK",
    "payload": {
        "EMPList": [],
        "count": 5665
    }
}
To get the remaining results I have to set the start page to 2 and call the service again, which again returns only 1000 results.
After the first call I want to make the remaining calls in parallel, collect and combine the results, and send them back to the calling service in Java. Please suggest an approach; I am new to Java. I tried using Callable but it's not working.
It seems to me that ideally you should be able to configure your max count to one appropriate for your use case. I'm assuming you aren't able to do that. Here is a simple, lock-less, multi threading scheme that acts as a simple reduction operation for your two network calls:
// online runnable: https://ideone.com/47KsoS
int resultSize = 5;
int[] result = new int[resultSize*2];
Thread pg1 = new Thread(){
public void run(){
System.out.println("Thread 1 Running...");
// write numbers 1-5 to indexes 0-4
for(int i = 0 ; i < resultSize; i ++) {
result[i] = i + 1;
}
System.out.println("Thread 1 Exiting...");
}
};
Thread pg2 = new Thread(){
public void run(){
System.out.println("Thread 2 Running");
// write numbers 6-10 to indexes 5-9
for(int i = 0 ; i < resultSize; i ++) {
result[i + resultSize] = i + 1 + resultSize;
}
System.out.println("Thread 2 Exiting...");
}
};
pg1.start();
pg2.start();
// ensure that pg1 execution finishes
pg1.join();
// ensure that pg2 execution finishes
pg2.join();
// print result of reduction operation
System.out.println(Arrays.toString(result));
There is a very important caveat with this implementation, however. Notice that the two threads do NOT overlap in their memory writes: each one writes only to its own half of the array. This matters because if you simply changed int[] result to ArrayList<Integer> and had both threads call add on it, the reduction could fail badly because of a race condition (the standard ArrayList in Java is not thread-safe). Since we know in advance how large the result will be, I would strongly suggest sticking with an array for this multi-threaded implementation; ArrayList hides a lot of implementation logic, such as resizing its backing array, that is easy to get wrong under concurrency.
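Since the question mentions Callable, here is a hedged sketch of the same reduction using an ExecutorService. Everything specific in it is an assumption: PagedFetch, fetchAll and the fetchPage function are hypothetical stand-ins for whatever performs the real REST call, and the pool size of 8 is arbitrary. Only the calling thread touches the combined list, so the ArrayList caveat above does not apply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical helper; fetchPage stands in for the real REST call.
class PagedFetch {
    static final int PAGE_SIZE = 1000;

    // firstPage and totalCount come from the initial (sequential) call.
    static <E> List<E> fetchAll(List<E> firstPage, int totalCount,
                                IntFunction<List<E>> fetchPage) {
        int pages = (totalCount + PAGE_SIZE - 1) / PAGE_SIZE; // e.g. 5665 -> 6
        List<E> all = new ArrayList<>(firstPage);
        if (pages <= 1) return all;
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(pages - 1, 8));
        try {
            List<Callable<List<E>>> calls = new ArrayList<>();
            for (int p = 2; p <= pages; p++) {
                final int page = p;
                calls.add(() -> fetchPage.apply(page));
            }
            // invokeAll blocks until every task is done and returns the
            // futures in the same order as the tasks, so pages stay in order.
            for (Future<List<E>> f : pool.invokeAll(calls)) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching pages", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```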

Program not always terminating? [duplicate]

I wanted to test out multithreading for a project of mine, trying to also develop a solution in case something goes wrong.
So I made this small test:
main
public class main
{
static int addToCounter;
static int addToErrorCounter;
public static void main(String[] args) throws InterruptedException
{
int threads = 10;
Executor exec = new Executor();
for (int i = 0; i < threads; i++)
{
double error = Math.random();
testClass aldo = new testClass();
Thread thread = aldo.getThread(300, error);
exec.execute(thread);
}
while (threads != (addToCounter + addToErrorCounter))
{
System.out.println("Not all threads finished, number of finished threads is: " + (addToCounter + addToErrorCounter));
Thread.sleep(50);
}
System.out.println("Number of Threads that finished correctly: " + addToCounter);
}
}
testClass
import test1.main;
public class testClass
{
public Thread getThread(long time, double error)
{
Thread thread = new Thread()
{
public void run()
{
try
{
Thread.sleep(time);
}
catch (InterruptedException e)
{
// TODO Auto-generated catch block
e.printStackTrace();
}
if (error > 0.5)
{
main.addToErrorCounter++;
throw new java.lang.Error("HELLO");
}
System.out.println("I DID THIS!");
main.addToCounter++;
}
};
return thread;
}
}
(you'll have to fix the imports, also I use a custom class Executor, although that's only a wrapper for ExecutorService)
The weird behaviour is that sometimes it works properly, and sometimes it doesn't (total terminated thread count is 9, although I can see clearly it printed "I DID THIS!" and the error exactly 10 times).
Any fix?
The problem might be a race condition.
The "++" operator is not atomic.
Imagine the following scenario: two threads run at the same time, and both want to increment a number and then finish.
The initial value of the number is 0.
Thread 0 reads the number and now knows it is 0.
Thread 1 reads the number and now knows it is 0.
Thread 0 (which read 0) now writes 1 to memory.
Thread 1 does not know that the number has changed and still believes it is 0, so it also writes 1 to memory.
You need a synchronization mechanism, such as a lock or a semaphore.
Have a look at this for more information: http://winterbe.com/posts/2015/04/30/java8-concurrency-tutorial-synchronized-locks-examples/
For your example you could use the "synchronized" approach from that link.
Add methods like these to your main class to update addToCounter and addToErrorCounter, so that the error counter is protected as well:
static synchronized void addToError(int e) {
    addToErrorCounter += e;
}
static synchronized void incCounter() {
    addToCounter++;
}
Call those methods from the threads in testClass instead of incrementing the counters unsynchronized.
My guess is that the postfix operator (main.addToCounter++) is not atomic. This line of code is probably equivalent to something like:
int temp = main.addToCounter;
main.addToCounter = temp + 1;
return temp;
With multiple threads doing this at the same time, two threads could obtain the same value for temp (because both perform the first line of the above pseudo-code before either performs the second), and hence the counter total will be too small once all threads are complete. See Why is i++ not atomic? for more information.
A quick fix in this situation is to make addToCounter an AtomicInteger, then use addToCounter.incrementAndGet() in place of addToCounter++.
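A minimal sketch of that fix, assuming both counters become AtomicInteger. The class name FinishedCounters and the "every second worker fails" rule are my own stand-ins for the question's classes and its random error; a side benefit is that AtomicInteger.get() also guarantees the waiting thread sees the latest value.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the question's main class.
class FinishedCounters {
    static final AtomicInteger addToCounter = new AtomicInteger();
    static final AtomicInteger addToErrorCounter = new AtomicInteger();

    // Starts `threads` workers and returns how many of them reported in.
    static int runAll(int threads) {
        addToCounter.set(0);
        addToErrorCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            // Deterministic replacement for the random error in the question.
            final boolean fails = (i % 2 == 1);
            workers[i] = new Thread(() -> {
                if (fails) {
                    addToErrorCounter.incrementAndGet(); // atomic read-modify-write
                } else {
                    addToCounter.incrementAndGet();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return addToCounter.get() + addToErrorCounter.get();
    }
}
```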

Thread synchronisation using semaphores

This was an interview question; any help would be appreciated.
How do you synchronize two threads, one of which increments a value while the other displays it? (P.S. The thread that displays the value must only display it when it's a new value.)
Ex : int x = 5;
T1 : increments it to 6
T2 : must display 6 ( only once ) and must display it again when it becomes 7
I answered that I would use a semaphore something like:
int c=0; // variable that I used to synchronize
// In T1
if( c == 0 )
{
c++;
x++; // value that is incremented
}
// in T2
if( c == 1 )
{
cout<<x;
c--;
}
He then asked what I would do if there's a context switch from thread T1 to T2 after setting c to 1 but before incrementing x (in that case T2 would enter its if-block and print x before it had been incremented).
I couldn't answer this part. Any help would be appreciated.
This is a classic use case for a condition variable with the slight hitch that the value can easily update more than once in thread 1 before thread 2 runs to handle it:
#include <condition_variable>
#include <mutex>
// In some scope common to both threads
int c_ = 0; // variable
bool done = false; // assumed termination flag; set it under mutex_, then notify
std::mutex mutex_;
std::condition_variable cond_;
// Thread 1
{
    std::lock_guard<std::mutex> lock(mutex_);
    ++c_;
}
cond_.notify_one();
// Thread 2
{
    std::unique_lock<std::mutex> lock(mutex_); // wait() requires a unique_lock
    int cLocal = c_;
    while (!done) {
        cond_.wait(lock, [&] { return done || c_ != cLocal; });
        while (cLocal < c_) {
            ++cLocal;
            ... // Display new *local* value
        }
    }
}
Nice exercise.
You haven't specified the c++ tag in the question, but the question itself contains cout<<x, so you were probably interviewing for a C++ position. Despite that, I'm going to answer in Java since this is an interview question and language shouldn't matter much as long as I avoid using anything too specific to Java.
As your interviewer pointed out, the synchronization has to happen in both directions:
The printing thread must wait for the incrementing thread to finish its job
The incrementing thread must wait for the printing thread to finish its job
So we need something to let us know that the printer is done (so the incrementer can run), and another to let us know that the incrementer is done. I used two semaphores for that:
Working version on Ideone
import java.util.concurrent.Semaphore;
class IncrementDemo {
    static int x = 0;
    public static void main(String[] args) {
        Semaphore incrementLock = new Semaphore(0);
        Semaphore printLock = new Semaphore(0);
        Thread incrementer = new Thread(() -> {
            try {
                for (;;) {
                    incrementLock.acquire(); // Wait to be allowed to increment
                    x++;
                    printLock.release(); // Allow the printer to print
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Stop when interrupted
            }
        });
        Thread printer = new Thread(() -> {
            try {
                for (;;) {
                    incrementLock.release(); // Let the incrementer do its job
                    printLock.acquire(); // Wait to be allowed to print
                    System.out.println(x);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // Stop when interrupted
            }
        });
        incrementer.setDaemon(false); // Keep the program alive after main() exits
        printer.setDaemon(false);
        incrementer.start(); // Start both threads
        printer.start();
    }
}
Output:
1
2
3
4
5
6
7
...
Problems:
There are 2 main problems with parallel code in general.
1. Atomicity
The smallest unit of execution is not a single source-level operation like i++ but the underlying assembly instructions. Therefore any operation that involves a write must not be performed on the same variable from multiple threads without synchronization. (How this plays out depends heavily on the target architecture; x86, in contrast to arm64, is very restrictive.)
Luckily, C++ provides std::atomic, which gives you a nice platform-independent way to modify variables from multiple threads.
2. Consistency
Both the compiler and the processor are allowed to reorder any instructions as long as the consistency of the local thread is preserved. So what does this mean?
Take a look at your first thread:
if( c == 0 )
{
c++;
x++; // value that is incremented
}
You have 3 operations: c == 0, c++ and x++. The two increments do not depend on each other, hence the compiler is allowed to swap them. At runtime the core may reorder them too, leaving you in a very unpredictable situation. In a sequential world this is perfectly fine and improves overall performance (unless it leads to security holes like Meltdown). Unfortunately, neither the compiler nor the CPU recognizes parallel code, so any optimization may break your parallel program.
But once again, C++ provides a built-in solution for this problem called std::memory_order, which enforces a specific consistency model.
Solutions:
Simple mutex:
A mutex is a simple but powerful tool. It solves the problems with atomicity and consistency by providing so-called critical sections, which prevent parallel execution. This means that, in the given example, the if-clauses in the two threads are executed one after the other and never in parallel.
The implementation works, but has a flaw: if one of the threads is very slow, the other one will waste a lot of CPU time by continuously checking the newValue flag.
#include <iostream>
#include <mutex>
std::mutex mutex;
int value = 0;
bool newValue = false;
void producer_thread() {
    while (true) {
        std::lock_guard<std::mutex> lg(mutex);
        if (newValue == false) {
            value++;
            newValue = true;
        }
    }
}
void consumer_thread() {
    while (true) {
        std::lock_guard<std::mutex> lg(mutex);
        if (newValue == true) {
            std::cout << value << '\n';
            newValue = false;
        }
    }
}
Condition Variable:
A condition variable is basically just a "wait-for-notify" construct. You can block the current thread by calling wait until another thread calls notify. This implementation would be the go-to solution.
#include <condition_variable>
#include <iostream>
#include <mutex>
std::mutex mutex;
std::condition_variable cond;
int value = 0;
bool newValue = false;
void producer() {
    while (true) {
        std::unique_lock<std::mutex> ul(mutex);
        while (newValue == true) {
            cond.wait(ul);
        }
        value++;
        newValue = true;
        cond.notify_all();
    }
}
void consumer() {
    while (true) {
        std::unique_lock<std::mutex> ul(mutex);
        while (newValue == false) {
            cond.wait(ul);
        }
        std::cout << value << '\n';
        newValue = false;
        cond.notify_all();
    }
}
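Since one of the other answers is in Java, here is roughly the same wait-for-notify handshake sketched with ReentrantLock and Condition. The class and method names (Handshake, produce, consume, run) are made up for illustration, and the loops are bounded by a rounds parameter so that the sketch terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical Java counterpart of the condition-variable version.
class Handshake {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private int value = 0;
    private boolean newValue = false;

    void produce() throws InterruptedException {
        lock.lock();
        try {
            while (newValue) changed.await(); // wait until the last value was shown
            value++;
            newValue = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    int consume() throws InterruptedException {
        lock.lock();
        try {
            while (!newValue) changed.await(); // wait for a fresh value
            newValue = false;
            changed.signalAll();
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Runs one producer and one consumer for `rounds` values each and
    // returns what the consumer saw, in order.
    static List<Integer> run(int rounds) {
        Handshake h = new Handshake();
        List<Integer> seen = new ArrayList<>(); // written only by the consumer
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) h.produce();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < rounds; i++) seen.add(h.consume());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

Because each side waits for the other's flag change, every value is displayed exactly once and none is skipped, which is the property the interviewer was after.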

Data Races in an AtomicIntegerArray

In the code below:
I am incrementing num[1] (initially 0) of an AtomicIntegerArray num 1000 times in each of 2 threads.
After both threads have finished, shouldn't the value of num[1] in the main thread be 2000, since there shouldn't be data races in an AtomicIntegerArray?
However, I get random values < 2000. Could someone tell me why?
Code:
import java.util.concurrent.atomic.AtomicIntegerArray;
public class AtomicIntegerArr {
private static AtomicIntegerArray num= new AtomicIntegerArray(2);
public static void main(String[] args) throws InterruptedException {
Thread t1 = new Thread(new MyRun1());
Thread t2 = new Thread(new MyRun2());
num.set(0, 10);
num.set(1, 0);
System.out.println("In Main num before:"+num.get(1));
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println("In Main num after:"+num.get(1));
}
static class MyRun1 implements Runnable {
public void run() {
for (int i = 0; i < 1000; i++) {
num.set(1,num.get(1)+1);
}
}
}
static class MyRun2 implements Runnable {
public void run() {
for (int i = 0; i < 1000; i++) {
num.set(1,num.get(1)+1);
}
}
}
}
Edit: Using num.compareAndSet(1, num.get(1), num.get(1)+1); instead of num.set(1,num.get(1)+1); doesn't work either.
I get random values < 2000. Could someone tell me why?
This is called the lost-update problem.
Because, in the following code:
num.set(1, num.get(1) + 1);
Although each individual operation involved is atomic, the combined operation is not. The single operations from the two threads can interleave, causing an update from one thread to be overwritten with a stale value by the other thread.
You can use compareAndSet to solve this problem, but you have to check whether the operation is successful, and do it again when it fails.
int v;
do {
v = num.get(1);
} while (!num.compareAndSet(1, v, v+1));
There's also a method for exactly this purpose:
num.accumulateAndGet(1, 1, (x, d)->x+d);
accumulateAndGet(int i, int x, IntBinaryOperator accumulatorFunction)
Atomically updates the element at index i with the results of applying the given function to the current and given values, returning the updated value. The function should be side-effect-free, since it may be re-applied when attempted updates fail due to contention among threads. The function is applied with the current value at index i as its first argument, and the given update as the second argument.
This is a classic race condition. Any time you have a fetch, an operation, and a put, your code is racy.
Consider two threads, both executing num.set(1,num.get(1)+1) at roughly the "same time." First, let's break down what the expression itself is doing:
it fetches num.get(1); let's call this x
it adds 1 to that; let's call this y
it puts that sum in with num.set(1, y)
Even though the intermediate values in your expression are just values on the stack, and not explicit variables, the operation is the same: get, add, put.
Okay, so back to our two threads. What if the operations are ordered like this?
initial state: n[1] = 5
Thread A | Thread B
========================
x = n[1] = 5 |
| x = n[1] = 5
| y = 5 + 1 = 6
y = 5 + 1 = 6 |
n[1] = 6 |
| n[1] = 6
Since both threads fetched the value before either thread put its added value, they both do the same thing. You have 5 + 1 twice, and the result is 6, not 7!
What you want is getAndIncrement(int idx), or one of the similar methods that does the get, adding, and putting atomically.
These methods can actually all be built on top of the compareAndSet method you identified. But to do that, you need to do the increment within a loop, trying until compareAndSet returns true. Also, for that to work, you have to store the initial num.get(1) value in a local variable, rather than fetching it a second time. In effect, this loop says "keep trying the get-add-put logic until it works without anyone else having raced between the operations." In my example above, Thread B would have noticed that compareAndSet(1, 5, 6) fails (since the actual value at that time is 6, not 5 as expected), and thus retried. This is in fact what all of those atomic methods, like getAndIncrement, do.
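To make that concrete, here is a small sketch that runs the question's two-thread loop both ways. The class name AtomicArrayDemo and the useCasLoop switch are just for illustration; getAndIncrement and compareAndSet are the real AtomicIntegerArray methods.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical demo class; not part of the question's code.
class AtomicArrayDemo {
    // Two threads each add 1 to num[1] a thousand times; returns the final value.
    static int run(boolean useCasLoop) {
        AtomicIntegerArray num = new AtomicIntegerArray(2);
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                if (useCasLoop) {
                    int v;
                    do {
                        v = num.get(1); // read once, into a local
                    } while (!num.compareAndSet(1, v, v + 1)); // retry if someone raced us
                } else {
                    num.getAndIncrement(1); // get, add and put as one atomic step
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return num.get(1);
    }
}
```

Either variant should end at 2000, because no increment can be lost between the read and the write.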

Java: Thread joins over 10,000 iterations inconsistent

Alright folks.. I'm back again (seems to be my home lately).
I'm going through the whole cave of programming YouTube vids on multi-threading. This particular one uses 2 threads that go through a for loop which adds 1 to a variable 10,000 times each. So you join them so the result is 20,000 when it's done.
public class main {
private int count = 0;
public static void main(String[] args) {
main main = new main();
main.doWork();
}
public void doWork(){
Thread t1 = new Thread(new Runnable(){
public void run(){
for (int i = 0; i < 10000; i++){
count++;
}
}
});
Thread t2 = new Thread(new Runnable(){
public void run(){
for (int i = 0; i < 10000; i++){
count++;
}
}
});
t1.start();
t2.start();
try {
t1.join();
t2.join();
} catch (InterruptedException ex) {
Logger.getLogger(main.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("Count is: " + count);
}
}
Thing is.. when i change the iterations:
i < 10 = 20 (correct)
i < 100 = 200 (correct)
i < 1000 = 2000 (correct)
i < 10000 = 13034 (first run)
= 14516 (second run)
= ... etc..
Why won't it properly handle iterations in the tens of thousands?
You have demonstrated the classic race condition, which occurs when 2 or more threads are reading and writing to the same variable in conflicting ways. This arises because the ++ operator isn't an atomic operation -- multiple operations are occurring, and a thread could be interrupted in between operations, e.g.:
Thread t1 reads count (0), and calculates the incremented value (1), but it hasn't stored the value back to count yet.
Thread t2 reads count (still 0), and calculates the incremented value (1), but it hasn't stored the value back to count yet.
Thread t1 stores its 1 value back to count.
Thread t2 stores its 1 value back to count.
Two updates have occurred, but the net result is only an increase of 1. The set of operations which must not be interrupted is a critical section.
This appears to have happened 20,000 - 13,034 times, or 6,966 times in your first execution. You may have gotten lucky with lower bounds, but regardless of the magnitude of the bounds, the race condition can happen.
In Java, there are several solutions:
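Place synchronized blocks around the critical sections (both count++ lines), locking on an object both threads share, such as the enclosing main instance (main.this). Note that a plain this inside each anonymous Runnable refers to that Runnable, so the two threads would hold different locks.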
Change count to an AtomicInteger, which encapsulates such operations atomically on its own. The getAndIncrement method would replace the ++ operator here.
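A sketch of the first option, assuming a dedicated lock object rather than this. The class name SyncCount and the lock field are my own; the loop bounds are the ones from the question.

```java
// Hypothetical rewrite of the question's class with a shared lock.
class SyncCount {
    private int count = 0;
    private final Object lock = new Object(); // one lock shared by both threads

    int doWork() {
        Runnable work = () -> {
            for (int i = 0; i < 10000; i++) {
                synchronized (lock) { // critical section: read, add, write back
                    count++;
                }
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return count; // safe to read: join() orders it after both threads
    }
}
```

With the lock in place the two threads can no longer interleave inside count++, so the total should come out at 20000 on every run. For a plain counter like this, the AtomicInteger option is usually the lighter choice.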
