Threads not executing in method - java

I am trying to learn threads by building a web crawler. In the following code, searchHelper(website, keyword) finds all the links on a web page and keeps those that have the keyword in the URL. The idea is that searchHelper is called once, and then a thread is started for each link found, each acting as a web crawler. For example, if the first website contains 5 links, 5 threads are started so that five crawlers work together. Currently the threads are not working: I only get the results from the first page. I have been able to get it to work without threads; if I remove the entire for loop and keep only the while loop, the crawler works as expected. Any help would be appreciated. Here is the method that starts the threads:
private void search(String website, String keyword)
{
searchHelper(website, keyword);
int limit = queue.size();
Thread[] threads = new Thread[limit];
for(int i = 0; i < limit; i++)
{
threads[i] = new Thread(new Runnable()
{
@Override
public void run()
{
while(!queue.isEmpty() && queue.size() <= 10000)
searchHelper(queue.poll(), keyword);
}
});
threads[i].start();
}
if(results.isEmpty())
text.append("No results, sorry :(");
else
{
text.append("\nList of results:\n\n");
for(String x: results)
text.append(x + "\n");
}
}

You don't wait for the threads to finish. The last part of your code will probably be reached before even a single thread finishes its work, so no results can be displayed.

Add this right after your for-loop:
for(int i = 0; i < limit; i++)
{
threads[i].join();
}
This way your main thread will wait for all of the threads to finish execution before accessing the results.
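As a rough, self-contained illustration of the start-then-join pattern, here is a minimal sketch. The names JoinDemo and processAll are hypothetical, and the "crawling" is replaced by a trivial string operation. It also assumes a thread-safe queue and result list (ConcurrentLinkedQueue and CopyOnWriteArrayList), since the types of queue and results are not shown in the question and several threads would be touching them at once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;

class JoinDemo {
    // Hypothetical stand-in for the crawler: each worker drains the shared
    // queue and records what it processed.
    static List<String> processAll(List<String> links, int workers) {
        Queue<String> queue = new ConcurrentLinkedQueue<>(links);
        List<String> results = new CopyOnWriteArrayList<>();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                String link;
                // poll() returns null once the queue is empty
                while ((link = queue.poll()) != null) {
                    results.add("visited " + link);
                }
            });
            threads[i].start();
        }
        // Wait for every worker before reading the results
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> results = processAll(Arrays.asList("a", "b", "c"), 2);
        System.out.println(results.size() + " links processed");
    }
}
```

Without the join loop, the size printed in main could be anything from 0 to 3, which matches the symptom in the question.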

Related

Rest parellel calls to service -Multithreading in java

I have a REST API call where the maximum number of results returned per call is 1000. With start page=1, the response is:
{
"status": "OK",
"payload": {
"EMPList":[],
"count": 5665
}
}
So to get the remaining results I have to change the start page to 2 and hit the service again, which again returns only 1000 results.
After the first call, I want to make the remaining calls in parallel, collect and combine the results, and send them back to the calling service in Java. Please suggest an approach; I am new to Java. I tried using Callable but it's not working.
It seems to me that ideally you should be able to configure your max count to one appropriate for your use case. I'm assuming you aren't able to do that. Here is a simple, lock-less, multi threading scheme that acts as a simple reduction operation for your two network calls:
// online runnable: https://ideone.com/47KsoS
int resultSize = 5;
int[] result = new int[resultSize*2];
Thread pg1 = new Thread(){
public void run(){
System.out.println("Thread 1 Running...");
// write numbers 1-5 to indexes 0-4
for(int i = 0 ; i < resultSize; i ++) {
result[i] = i + 1;
}
System.out.println("Thread 1 Exiting...");
}
};
Thread pg2 = new Thread(){
public void run(){
System.out.println("Thread 2 Running");
// write numbers 6-10 to indexes 5-9
for(int i = 0 ; i < resultSize; i ++) {
result[i + resultSize] = i + 1 + resultSize;
}
System.out.println("Thread 2 Exiting...");
}
};
pg1.start();
pg2.start();
// ensure that pg1 execution finishes
pg1.join();
// ensure that pg2 execution finishes
pg2.join();
// print result of reduction operation
System.out.println(Arrays.toString(result));
There is a very important caveat with this implementation, however. Notice that the two threads DO NOT overlap in their memory writes: each writes only to its own half of the array. This matters because if you simply changed int[] result to ArrayList<Integer> and had both threads call add(), the reduction could fail through a race condition, since the standard ArrayList is not synchronized and concurrent structural modification is unsafe. Since we know in advance how large the result will be, I would highly suggest sticking with a preallocated array for this multi-threaded implementation; ArrayList hides a lot of implementation logic (such as resizing its backing array) that makes concurrent writes hard to reason about.
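Since the question mentions Callable, here is a hedged sketch of the same idea using an ExecutorService, where each page is fetched by its own Callable and the main thread combines the returned lists. Everything named here is an assumption for illustration: PagedFetch, fetchPage and the pool size of 4 are hypothetical, and fetchPage just fabricates record ids instead of calling the real service. The total count (5665 in the question) is assumed to be known from the first response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PagedFetch {
    static final int PAGE_SIZE = 1000;

    // Hypothetical stand-in for the REST call: returns the record ids on one page.
    static List<Integer> fetchPage(int page, int totalCount) {
        List<Integer> records = new ArrayList<>();
        int first = (page - 1) * PAGE_SIZE;
        int last = Math.min(first + PAGE_SIZE, totalCount);
        for (int id = first; id < last; id++) {
            records.add(id);
        }
        return records;
    }

    static List<Integer> fetchAll(int totalCount) {
        int pages = (totalCount + PAGE_SIZE - 1) / PAGE_SIZE; // ceiling division
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<List<Integer>>> calls = new ArrayList<>();
            for (int p = 1; p <= pages; p++) {
                final int page = p;
                calls.add(() -> fetchPage(page, totalCount));
            }
            List<Integer> combined = new ArrayList<>();
            // invokeAll blocks until every call is done and returns the
            // futures in the same order as the submitted calls
            for (Future<List<Integer>> f : pool.invokeAll(calls)) {
                combined.addAll(f.get());
            }
            return combined;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the main thread writes to the combined list, so no shared collection is modified concurrently and the page order is preserved.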

(Thread pools in Java) Increasing number of threads creates slow down for simple for loop. Why?

I've got a little bit of work that is easily parallelizable, and I want to use Java threads to split up the work across my four core machine. It's a genetic algorithm applied to the traveling salesman problem. It doesn't sound easily parallelizable, but the first loop is very easily so. The second part where I talk about the actual evolution may or may not be, but I want to know if I'm getting slow down because of the way I'm implementing threading, or if its the algorithm itself.
Also, if anyone has better ideas on how I should be implementing what I'm trying to do, that would be very much appreciated.
In main(), I have this:
final ArrayBlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(numThreads*numIter);
ThreadPoolExecutor tpool = new ThreadPoolExecutor(numThreads, numThreads, 10, TimeUnit.SECONDS, queue);
barrier = new CyclicBarrier(numThreads);
k.init(tpool);
I have a loop that is done inside of init() and looks like this:
for (int i = 0; i < numCities; i++) {
x[i] = rand.nextInt(width);
y[i] = rand.nextInt(height);
}
That I changed to this:
int errorCities = 0, stepCities = 0;
stepCities = numCities/numThreads;
errorCities = numCities - stepCities*numThreads;
// Split up work, assign to threads
for (int i = 1; i <= numThreads; i++) {
int startCities = (i-1)*stepCities;
int endCities = startCities + stepCities;
// This is a bit messy...
if(i <= numThreads) endCities += errorCities;
tpool.execute(new citySetupThread(startCities, endCities));
}
And here is citySetupThread() class:
public class citySetupThread implements Runnable {
int start, end;
public citySetupThread(int s, int e) {
start = s;
end = e;
}
public void run() {
for (int j = start; j < end; j++) {
x[j] = ThreadLocalRandom.current().nextInt(0, width);
y[j] = ThreadLocalRandom.current().nextInt(0, height);
}
try {
barrier.await();
} catch (InterruptedException ie) {
return;
} catch (BrokenBarrierException bbe) {
return;
}
}
}
The above code is run once in the program, so it was sort of a test case for my threading constructs (this is my first experience with Java threads). I implemented the same sort of thing in a real critical section, specifically the evolution part of the genetic algorithm, whose class is as follows:
public class evolveThread implements Runnable {
int start, end;
public evolveThread(int s, int e) {
start = s;
end = e;
}
public void run() {
// Get midpoint
int n = population.length/2, m;
for (m = start; m > end; m--) {
int i, j;
i = ThreadLocalRandom.current().nextInt(0, n);
do {
j = ThreadLocalRandom.current().nextInt(0, n);
} while(i == j);
population[m].crossover(population[i], population[j]);
population[m].mutate(numCities);
}
try {
barrier.await();
} catch (InterruptedException ie) {
return;
} catch (BrokenBarrierException bbe) {
return;
}
}
}
Which exists in a function evolve() that is called in init() like so:
for (int p = 0; p < numIter; p++) evolve(p, tpool);
Yes I know that's not terribly good design, but for other reasons I'm stuck with it. Inside of evolve is the relevant parts, shown here:
// Threaded inner loop
int startEvolve = popSize - 1,
endEvolve = (popSize - 1) - (popSize - 1)/numThreads;
// Split up work, assign to threads
for (int i = 0; i < numThreads; i++) {
endEvolve = (popSize - 1) - (popSize - 1)*(i + 1)/numThreads + 1;
tpool.execute(new evolveThread(startEvolve, endEvolve));
startEvolve = endEvolve;
}
// Wait for our comrades
try {
barrier.await();
} catch (InterruptedException ie) {
return;
} catch (BrokenBarrierException bbe) {
return;
}
population[1].crossover(population[0], population[1]);
population[1].mutate(numCities);
population[0].mutate(numCities);
// Pick out the strongest
Arrays.sort(population, population[0]);
current = population[0];
generation++;
What I really want to know is this:
What role does the "queue" have? Am I right to create a queue for as many jobs as I think will be executed for all threads in the pool? If the size isn't sufficiently large, I get RejectedExecutionException's. I just decided to do numThreads*numIterations because that's how many jobs there would be (for the actual evolution method that I mentioned earlier). It's weird though.. I shouldn't have to do this if the barrier.await()'s were working, which leads me to...
Am I using the barrier.await() correctly? Currently I have it in two places: inside the run() method for the Runnable object, and after the for loop that executes all the jobs. I would've thought only one would be required, but I get errors if I remove one or the other.
I'm suspicious of contention for the threads, as that is the only thing I can glean from the absurd slowdown (which does scale with the input parameters). I want to know if it is anything to do with how I'm implementing the thread pool and barriers. If not, then I'll have to look inside the crossover() and mutate() methods, I suppose.
First, I think you may have a bug in how you use the CyclicBarrier. Currently you are initializing it with the number of executor threads as the number of parties. You have an additional party, however: the main thread. So I think you need to do:
barrier = new CyclicBarrier(numThreads + 1);
I think this should work, but personally I find it an odd use of the barrier.
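To make the party count concrete, here is a small sketch under assumptions: BarrierDemo, runGeneration and awaitQuietly are hypothetical names, and the "work" is just incrementing a counter. The pool has exactly numThreads threads so that every worker can block on the barrier at the same time, and the main thread is the extra party.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierDemo {
    static int runGeneration(int numThreads) {
        // numThreads workers plus the main thread
        CyclicBarrier barrier = new CyclicBarrier(numThreads + 1);
        AtomicInteger done = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            for (int i = 0; i < numThreads; i++) {
                pool.execute(() -> {
                    done.incrementAndGet(); // placeholder for the real work
                    awaitQuietly(barrier);  // worker arrives at the barrier
                });
            }
            awaitQuietly(barrier); // main thread arrives as the last party
            return done.get();
        } finally {
            pool.shutdown();
        }
    }

    static void awaitQuietly(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With only numThreads parties, the barrier could trip before every worker has arrived, because the main thread's await() would count as one of them.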
When using a worker-queue thread-pool model I find it easier to use a Semaphore or Java's Future model.
For a semaphore:
class MyRunnable implements Runnable {
private final Semaphore sem;
public MyRunnable(Semaphore sem) {
this.sem = sem;
}
public void run() {
// do work
// signal complete
sem.release();
}
}
Then in your main thread:
Semaphore sem = new Semaphore(0);
for (int i = 0; i < numJobs; ++i) {
threadPool.execute(new MyRunnable(sem));
}
sem.acquire(numJobs);
It's really doing the same thing as the barrier, but I find it easier to think about the worker tasks "signaling" that they are done instead of "sync'ing up" with the main thread again.
For example, if you look at the example code in the CyclicBarrier JavaDoc, the call to barrier.await() is inside the loop inside the worker. So it is really syncing up the multiple long-running worker threads, and the main thread is not participating in the barrier. Calling barrier.await() at the end of the worker, outside the loop, is more like signaling completion.
As you increase the number of tasks, you increase the total overhead, because each task adds its own scheduling cost. This means you want to keep the number of tasks small, ideally the same as the number of CPUs you have. When the workload is uneven, using double the number of CPUs can work better.
BTW: You don't need a barrier in each task, you can wait for the future of each task to complete by calling get() on each one.
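A minimal sketch of the Future approach, assuming a hypothetical FutureWait class that sums a range in chunks (one task per chunk) in place of the genetic algorithm's work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureWait {
    // Sums 0..n-1 by splitting the range into one chunk per task.
    static long parallelSum(int n, int numTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(numTasks);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < numTasks; t++) {
                final int start = (int) ((long) n * t / numTasks);
                final int end = (int) ((long) n * (t + 1) / numTasks);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = start; i < end; i++) {
                        sum += i;
                    }
                    return sum;
                };
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling get() on every future is the only synchronization needed here; no barrier appears in the tasks or in the main thread.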

Why does an IllegalThreadStateException occur when Thread.start is called again

public class SieveGenerator{
static int N = 50;
public static void main(String args[]){
int cores = Runtime.getRuntime().availableProcessors();
int f[] = new int[N];
//fill array with 0,1,2...f.length
for(int j=0;j<f.length;j++){
f[j]=j;
}
f[0]=0;f[1]=0;//eliminate these cases
int p=2;
removeNonPrime []t = new removeNonPrime[cores];
for(int i = 0; i < cores; i++){
t[i] = new removeNonPrime(f,p);
}
while(p <= (int)(Math.sqrt(N))){
t[p%cores].start();//problem here because you cannot start a thread which has already started(IllegalThreadStateException)
try{
t[p%cores].join();
}catch(Exception e){}
//get the next prime
p++;
while(p<=(int)(Math.sqrt(N))&&f[p]==0)p++;
}
//count primes
int total = 0;
System.out.println();
for(int j=0; j<f.length;j++){
if(f[j]!=0){
total++;
}
}
System.out.printf("Number of primes up to %d = %d",f.length,total);
}
}
class removeNonPrime extends Thread{
int k;
int arr[];
public removeNonPrime(int arr[], int k){
this.arr = arr;
this.k = k;
}
public void run(){
int j = k*k;
while(j<arr.length){
if(arr[j]%k == 0)arr[j]=0;
j=j+arr[k];
}
}
}
Hi I'm getting an IllegalThreadStateException when I run my code and I've figured it's because I am trying to start a thread that has already been started. So how could I kill
or stop the thread each time, to get around this problem?
how could I kill or stop the thread each time, to get around this problem?
The answer is, you can't. Once started, a Thread may not be restarted. This is clearly documented in the javadoc for Thread. Instead, what you really want to do is new an instance of RemoveNonPrime each time you come around in your loop.
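The behavior can be reproduced in isolation. In this sketch (RestartDemo and its method names are hypothetical), the second start() on the same Thread object throws on a standard JVM, while a fresh Thread wrapped around the same Runnable runs fine:

```java
class RestartDemo {
    static String tryRestart() {
        Runnable work = () -> { /* placeholder for real work */ };
        Thread first = new Thread(work);
        first.start();
        joinQuietly(first);
        try {
            first.start(); // second start on the same Thread object
            return "restarted";
        } catch (IllegalThreadStateException e) {
            // Create a new Thread around the same Runnable instead
            Thread second = new Thread(work);
            second.start();
            joinQuietly(second);
            return "new thread needed";
        }
    }

    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRestart());
    }
}
```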
You have a few other problems in your code.
First, you need to increment p before using it again:
for(int i = 0; i < cores; i++){
t[i] = new removeNonPrime(f,p); //<--- BUG, always using p=2 means only multiples of 2 are cleared
}
Second, you might be multithreaded, but you aren't concurrent. The code you have basically only allows one thread to run at a time:
while(p <= (int)(Math.sqrt(N))){
t[p%cores].start();//
try{
t[p%cores].join(); //<--- BUG, only the thread which was just started can be running now
}catch(Exception e){}
//get the next prime
p++;
while(p<=(int)(Math.sqrt(N))&&f[p]==0)p++;
}
Just my $0.02, but what you are trying to do might work, but the logic for selecting the next smallest prime will not always pick a prime, for example if one of the other threads hasn't processed that part of the array yet.
Here is an approach using an ExecutorService, there are some blanks (...) that you will have to fill in:
/* A queue to trick the executor into blocking until a Thread is available when offer is called */
public class SpecialSyncQueue<E> extends SynchronousQueue<E> {
@Override
public boolean offer(E e) {
try {
put(e);
return true;
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
return false;
}
}
}
ExecutorService executor = new ThreadPoolExecutor(cores, cores, new SpecialSyncQueue(), ...);
void pruneNonPrimes() {
//...
while(p <= (int)(Math.sqrt(N))) {
executor.execute(new RemoveNonPrime(f, p));
//get the next prime
p++;
while(p<=(int)(Math.sqrt(N))&&f[p]==0)p++;
}
//count primes
int total = 0;
System.out.println();
for(int j=0; j<f.length;j++){
if(f[j]!=0){
total++;
}
}
System.out.printf("Number of primes up to %d = %d",f.length,total);
}
class RemoveNonPrime implements Runnable {
int k;
int arr[];
public RemoveNonPrime(int arr[], int k){
this.arr = arr;
this.k = k;
}
public void run(){
int j = k*k;
while(j<arr.length){
if(arr[j]%k == 0)arr[j]=0;
j+=k;
}
}
}
You could implement Runnable instead and use new Thread( $Runnable here ).start() or use a ExecutorService to reuse threads.
* It is never legal to start a thread more than once.
* In particular, a thread may not be restarted once it has completed
* execution.
*
* @exception IllegalThreadStateException if the thread was already started
*/
public synchronized void start() {
On Android, the documentation still says that an IllegalThreadStateException is thrown if the thread was already started.
However, on some devices the exception is not thrown (tested on Kyocera 7.0). On popular devices such as Samsung and HTC, the exception is thrown as expected.
I am answering here because the Android question was marked as a duplicate of this question.
Why does an IllegalThreadStateException occur when Thread.start is
called again
Because the JDK/JVM implementers coded the Thread.start() method that way. It's a reasonable functional expectation to be able to restart a thread after it has completed its execution, and that is what is being suggested in chrisbunney's answer (I have put a comment on that answer), but if you look at the Thread.start() implementation, the very first line is:
if (threadStatus != 0)
throw new IllegalThreadStateException();
where threadStatus == 0 means the NEW state, so my guess is that the implementation doesn't reset this state to zero after execution has completed, and the thread is left in the TERMINATED state (a non-zero state). So when you create a new Thread instance on the same Runnable, you basically start again from state zero.
Also, I noticed the words "may" and "never" used in the same paragraph of the Javadoc, which is relevant given the different behavior Phan Van Linh points out on some OSes:
It is never legal to start a thread more than once. In particular, a
thread may not be restarted once it has completed execution.
I guess what they are trying to say in the above Javadoc is that even if you don't get an IllegalThreadStateException on a certain OS, it's not legal as far as Java and the Thread class are concerned, and you might get unexpected behavior.
The famous thread state diagrams depict the same scenario - no going back from dead state to new.
ThreadPools can be used for delivering tasks to set number of threads. When initiating you set the number of threads. Then you add tasks for the pool. And after you can block until all tasks have finished processing. Here is some sample code.
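A minimal sketch of that sequence, assuming a hypothetical PoolDemo class whose tasks only increment a counter: create the pool with a fixed thread count, hand it tasks, then shut it down and block until everything has finished.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDemo {
    static int runTasks(int numThreads, int numTasks) {
        AtomicInteger completed = new AtomicInteger();
        // Set the number of threads when creating the pool
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        // Add tasks; the pool's threads are reused across tasks
        for (int i = 0; i < numTasks; i++) {
            pool.execute(() -> completed.incrementAndGet());
        }
        // Stop accepting new tasks, then block until the queued ones finish
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(4, 20) + " tasks completed");
    }
}
```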
I am not at all sure I understand the question. All the methods for stopping threads that are executed from other threads are deprecated; the way to stop a thread is to have it check a variable that it and another thread can access (perhaps a volatile variable), and have the running thread check it occasionally to see if it should exit on its own.
I cannot tell why/whether you want to eliminate the running thread and use another one, and I cannot see how the different threads are going to help execute your overall goal any faster. But it's possible I'm just not understanding the math.
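The cooperative-stop idea can be sketched like this. StopFlagDemo and Worker are hypothetical names, and the loop body is empty where real work would go. The flag is volatile so that the write from the stopping thread is guaranteed to become visible to the running thread.

```java
class StopFlagDemo {
    static class Worker implements Runnable {
        // volatile: a write by one thread is visible to the other
        private volatile boolean running = true;

        @Override
        public void run() {
            while (running) {
                // one unit of work per iteration would go here
            }
        }

        void requestStop() {
            running = false;
        }
    }

    static boolean stopsCleanly() {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop(); // ask the thread to exit on its own
        try {
            thread.join(5000); // wait up to 5 seconds for it to notice
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }
}
```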
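Thread.isAlive() only tells you whether a thread is currently running: it returns false both before start() is called and after the thread has terminated, so it cannot protect against a second start(). Thread.getState() can. Start the thread only while it is still in the NEW state:
if(t[p%cores].getState() == Thread.State.NEW){
t[p%cores].start();
}
Note that this only avoids the exception; a thread that has already run is skipped, so you still need a new instance to do more work.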

Multi-threading with Java, How to stop?

I am writing code for my homework, and I am not very familiar with writing multi-threaded applications. I learned how to create a thread and start it. I'd better show the code.
for (int i = 0; i < a.length; i++) {
download(host, port, a[i]);
scan.next();
}
My code above connects to a server and opens multiple parallel requests, a.length times in total. In other words, on each iteration download opens a[i] connections to get the same content. However, I want the download call for i = 0 to complete, that is, all the threads it has opened to finish, before the next iteration (i = 1) starts. I used scan.next() to pause it by hand, but obviously that is not a nice solution. How can I do that?
Edit:
public static long download(String host, int port) {
new java.io.File("Folder_" + N).mkdir();
N--;
int totalLength = length(host, port);
long result = 0;
ArrayList<HTTPThread> list = new ArrayList<HTTPThread>();
for (int i = 0; i < totalLength; i = i + N + 1) {
HTTPThread t;
if (i + N > totalLength) {
t = (new HTTPThread(host, port, i, totalLength - 1));
} else {
t = new HTTPThread(host, port, i, i + N);
}
list.add(t);
}
for (HTTPThread t : list) {
t.start();
}
return result;
}
And In my HTTPThread;
public void run() {
init(host, port);
downloadData(low, high);
close();
}
Note: Our test web server is a modified web server, it gets Range: i-j and in the response, there is contents of the i-j files.
You will need to call the join() method of the thread that is doing the downloading. This will cause the current thread to wait until the download thread is finished. This is a good post on how to use join.
If you'd like to post your download method you will probably get a more complete solution
EDIT:
Ok, so after you start your threads you will need to join them like so:
for (HTTPThread t : list) {
t.start();
}
for (HTTPThread t : list) {
t.join();
}
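This will stop the method from returning until all HTTPThreads have completed. Note that join() throws the checked InterruptedException, so download must either catch it or declare it.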
It's probably not a great idea to create an unbounded number of threads to do an unbounded number of parallel HTTP requests. (Both network sockets and threads are operating system resources that require some bookkeeping overhead, and are therefore subject to quotas in many operating systems. In addition, the web server you are reading from might not like thousands of concurrent connections, because its network sockets are finite, too!)
You can easily control the number of concurrent connections using an ExecutorService:
List<DownloadTask> tasks = new ArrayList<DownloadTask>();
for (int i = 0; i < length; i++) {
tasks.add(new DownloadTask(i));
}
ExecutorService executor = Executors.newFixedThreadPool(N);
executor.invokeAll(tasks);
executor.shutdown();
This is both shorter and better than your homegrown concurrency limit, because your limit will delay starting the next batch until all threads from the current batch have completed. With an ExecutorService, a new task is begun whenever an old task has completed (and there are still tasks left). That is, your solution will have 1 to N concurrent requests until all tasks have been started, whereas the ExecutorService will always have N concurrent requests.
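To see the limit in action, here is a hedged sketch in which the download is replaced by a short sleep. BoundedDownloads and peakConcurrency are hypothetical names; the tasks record how many of them are active at once, and the method returns the highest value observed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedDownloads {
    // Runs taskCount fake downloads on a pool of maxConcurrent threads and
    // returns the largest number of downloads that were active at once.
    static int peakConcurrency(int taskCount, int maxConcurrent) {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            final int index = i;
            tasks.add(() -> {
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                Thread.sleep(10); // stand-in for the actual download
                active.decrementAndGet();
                return index;
            });
        }
        ExecutorService executor = Executors.newFixedThreadPool(maxConcurrent);
        try {
            executor.invokeAll(tasks); // returns once every task has completed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return peak.get();
    }
}
```

Because the pool owns exactly maxConcurrent threads, the returned peak can never exceed that number, however many tasks are queued.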

Java Thread Yielding/ Starvation Problem

I'm writing a code that will run a multithreaded bank. I first create an array of threads with one program, then pass them into another thread that runs a loop to start them. For part of the application, I have a CPU intensive method that basically runs a series of loops within one another. Only problem is, for some reason it is not yielding the way that I think it should. Here is the code that is running the threads:
public void run(){
this.setPriority(MAX_PRIORITY);
int count = 0;
while(count<transactions.length){
int copy = count;
if(transactions[copy] instanceof Jumbler){
System.out.println(copy + " is a jumbler.");
}
else{
System.out.println(copy + " is not a jumbler");
}
transactions[copy].run();
count++;
}
}
Then here is the Jumbler run method:
public void run(){
System.out.println("running jumbler");
Thread.yield();
Thread.currentThread().yield();
try{
Thread.currentThread().sleep(5000);
}catch(InterruptedException e){}
//this.setPriority(MIN_PRIORITY);
System.out.println("still running.");
Thread.yield();
nums = new int[1000];
int i = 0;
do{
Thread.yield();
for(int x=0;x<1000;x++){
Thread.yield();
//System.out.println("in the loop");
nums[x]=(int)(Math.random()*10000)+1;
for(int y = 0;y<1000;y++){
Thread.yield();
//System.out.println("in the the loop");
for(int z = 0;z<100;z++){
Thread.yield();
}
}
}
Thread.yield();
i++;
System.out.println(whichJumble + ": " + i);
}while(i<1000);
}
So, the problem is that I want it to yield, allowing the main method to continue running more threads, but it blocks and waits for the Jumbler to complete (which takes a long time). Any idea why that would happen or how to fix it?
I suppose the issue comes with transactions[copy].run(); in your main loop. This one calls the run method directly but not in another system thread. Instead start the thread with transactions[copy].start();.
It seems that you're not spawning the threads correctly (in fact, you're not spawning them at all).
If you want a Thread to start running (concurrently to the current thread) you need to call the start() method of that Thread object, which you don't.
If I understand your code correctly, you want the first snippet to spawn the other threads. Therefore you should change transactions[copy].run() to transactions[copy].start().
(This is an educated guess. It would be nice if you showed the definition of the transactions array.)
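The difference between run() and start() is easy to observe by recording which thread executes the body. In this sketch (RunVsStart and threadNames are hypothetical names), the first slot ends up holding the caller's thread name and the second holds the name given to the new thread:

```java
class RunVsStart {
    // names[0]: thread that executed the body when run() was called directly
    // names[1]: thread that executed the body when start() was called
    static String[] threadNames() {
        String[] names = new String[2];

        Thread direct = new Thread(
                () -> names[0] = Thread.currentThread().getName(), "worker-direct");
        direct.run(); // plain method call: executes on the calling thread

        Thread started = new Thread(
                () -> names[1] = Thread.currentThread().getName(), "worker-started");
        started.start(); // creates a new thread of execution
        try {
            started.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String[] names = threadNames();
        System.out.println("run():   " + names[0]);
        System.out.println("start(): " + names[1]);
    }
}
```

This is why the loop in the question blocks: each transactions[copy].run() executes the whole Jumbler body on the launching thread before the loop can move on.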
Here's the typical scheme of launching several Threads:
class MyThread extends Thread {
public void run() {
// Do something here ...
}
}
// Prepare the array
MyThread[] arr = new MyThread[10];
for(int i = 0; i < arr.length; ++i)
arr[i] = new MyThread();
...
// Launch the threads
for(int i = 0; i < arr.length; ++i)
arr[i].start();
Once the thread is running, I don't think you can be guaranteed that the priority changes when you call setPriority.
These two statements do the same thing:
Thread.yield();
Thread.currentThread().yield();
But you probably shouldn't be calling yield at all; let the OS do the scheduling.
