Why is this program creating more threads than possible? - java

This is for a custom UDTF in a hive query, CreateLogTable is the UDTF class which I am using as a temp for testing. I am creating one thread per file to be downloaded from Amazon S3 and waiting until another thread becomes available before allocating another file to the thread.
Main Test logic:
CreateLogTable CLT = new CreateLogTable();
int numThreads = 2;
int index = 0;
DownloadFileThread[] dlThreads = new DownloadFileThread[numThreads];
for (S3ObjectSummary oSummary : bucketKeys.getObjectSummaries()) {
while (dlThreads[index] != null && dlThreads[index].isAlive()) {
index += 1;
index = index % numThreads;
}
dlThreads[index] = new DownloadFileThread(CLT , getBucket(oSummary.getBucketName() + "/"
+ oSummary.getKey()), getFile(oSummary.getKey()), index);
dlThreads[index].start();
index += 1;
index = index % numThreads;
}
Thread class (run() method):
try {
System.out.println("Creating thread " + this.threadnum);
this.fileObj = this.S3CLIENT.getObject(new GetObjectRequest(this.filePath, this.fileName));
this.fileIn = new Scanner(new GZIPInputStream(this.fileObj.getObjectContent()));
while (this.fileIn.hasNext()) {
this.parent.forwardToTable(fileIn.nextLine());
}
System.out.println("Finished " + this.threadnum);
} catch (Throwable e) {
System.out.println("Downloading of " + this.fileName + " failed.");
}
The while loop before the thread creation should keep cycling through the array until it finds a slot holding either null or a dead thread. At that point it exits, and a new thread is created and started in that slot. Since I included logging to console, I am able to observe this process, but the output is unexpected:
Creating thread 0
Creating thread 1
Creating thread 0
Creating thread 1
Creating thread 0
Creating thread 1
Creating thread 0
...
Creating thread 1
Creating thread 0
Creating thread 1
Finished 0
Finished 1
Finished 1
Finished 0
Finished 1
Finished 1
...
Finished 0
Finished 1
Finished 0
Finished 1
The above is only the first few lines of output. The issue is that more than two threads are created before any threads complete their tasks.
Why is this happening and how can I fix this?

I reduced your code to this test case:
public class ThreadTest {
private static class SleepThread extends Thread {
private final int index;
SleepThread(int ii) { index = ii; }
@Override
public void run() {
System.out.println("Creating thread " + this.index);
try {
Thread.sleep(5_000);
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println("Finished " + this.index);
}
}
public static void main(String[] args) {
int numThreads = 2;
int index = 0;
SleepThread[] dlThreads = new SleepThread[numThreads];
for (int ii = 0; ii < 10; ++ii) {
while (dlThreads[index] != null && dlThreads[index].isAlive()) {
index += 1;
index = index % numThreads;
}
dlThreads[index] = new SleepThread(index);
dlThreads[index].start();
index += 1;
index = index % numThreads;
}
}
}
Using Sun JDK 1.7.0_75, running this produces the result that you'd expect--two threads start, they exit after five seconds, two more threads start, and so on.
The next thing I'd suspect is that your JVM's implementation of Thread.isAlive() isn't returning true for threads immediately after they are started, although that seems contrary to the documentation for the Thread class.
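One way to check that suspicion on a given JVM is a small probe like the sketch below. It is not part of the original test case: the class name IsAliveDemo and the method probe() are made up for illustration, and it uses a lambda, so it assumes Java 8 or later. The worker blocks on a CountDownLatch, so it is guaranteed to still be inside run() when isAlive() is first called.

```java
import java.util.concurrent.CountDownLatch;

class IsAliveDemo {
    /** Returns { isAlive() while the worker is blocked, isAlive() after join() }. */
    static boolean[] probe() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await(); // stay inside run() until the caller releases us
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        boolean whileRunning = worker.isAlive(); // started, run() has not returned yet
        release.countDown();                     // let the worker finish
        worker.join();                           // wait until it has terminated
        boolean afterJoin = worker.isAlive();
        return new boolean[] { whileRunning, afterJoin };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = probe();
        System.out.println("alive while running: " + result[0]);
        System.out.println("alive after join:    " + result[1]);
    }
}
```

If the first value ever comes back false, the problem lies in the environment rather than in the slot-scanning loop.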

Take a look at this example:
public static void main(String[] args) {
ExecutorService executor = Executors.newFixedThreadPool(5);
for (int i = 0; i < 10; i++) {
Runnable worker = new WorkerThread("" + i);
executor.execute(worker);
}
executor.shutdown();
while (!executor.isTerminated()) {
}
System.out.println("Finished all threads");
}
It's a thread pool created with the Executors factory class from java.util.concurrent. It is a very simple and straightforward way to limit the number of threads.
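The WorkerThread class in the example above is not shown, and the empty while loop spins a CPU core while waiting. A variant that avoids both might look like the following sketch. PoolDemo and runAll are hypothetical names, the task body is a placeholder for the real download, and the one-minute timeout is an arbitrary choice.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolDemo {
    /** Runs taskCount placeholder tasks on at most poolSize threads; returns how many completed. */
    static int runAll(int poolSize, int taskCount) throws InterruptedException {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            final int id = i;
            executor.execute(() -> {
                // Placeholder for the real work (for example, downloading one file).
                System.out.println("Task " + id + " on " + Thread.currentThread().getName());
                completed.incrementAndGet();
            });
        }
        executor.shutdown(); // accept no new tasks; already queued tasks still run
        // Block without spinning until all tasks finish or the timeout expires.
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            executor.shutdownNow(); // give up and interrupt whatever is still running
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Completed: " + runAll(2, 10));
    }
}
```

With poolSize set to 2, no more than two tasks run at the same time, which is the limit the question was trying to enforce by hand.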

The reason why the above code wasn't working was because of something wacky going on with the call to isAlive().
For some reason, no matter what state a thread is in, isAlive() will always return false for me, causing the creation of more and more threads, which replace the old ones in the array, dlThreads.
I solved the issue by creating a custom isWorking() method which simply returns a boolean of whether or not the thread's run() method has completed. Here is what the Thread class looks like now:
//this.isWorking initialized to true during instantiation
@Override
public void run() {
try {
System.out.println("Creating thread " + this.threadnum + " for " + filePath + "/" + fileName);
this.fileObj = this.S3CLIENT.getObject(new GetObjectRequest(this.filePath, this.fileName));
this.fileIn = new Scanner(new GZIPInputStream(this.fileObj.getObjectContent()));
while (this.fileIn.hasNext()) {
this.parent.forwardToTable(fileIn.nextLine());
}
System.out.println("Finished " + this.threadnum);
this.isWorking = false;
} catch (Throwable e) {
System.out.println("Downloading of " + this.fileName + " failed.");
e.printStackTrace();
this.isWorking = false;
}
}
public boolean isWorking(){
return this.isWorking;
}
However, after implementing this and being satisfied that my multithreaded script works, I switched over to using an Executor, as suggested by other users, which slightly improved performance and made the code much cleaner.

Related

Alternate index of Array and print numbers using 2 Threads

my exercise is composed of an SharedResource with an array, a NumberGenerator Class and a SumClass.
2 Threads of NumberGenerator and 2 Threads of SumClass.
I have to insert numbers in the array from the SharedResource with both threads of NumberGenerator Class.
This part is done correctly.
My problem is in the run method of the NumberGenerator Class.
These threads must read the array indexes alternately: the first thread, let's call it H1, reads the even indexes (0, 2, 4, 6...), and the other one, H2, reads the odd indexes. While one thread is running, the other must wait.
So the output should be something like this:
Array:[10,35,24,18]
Thread name: H1 - Number:10
Thread name: H2 - Number:35
Thread name: H1 - Number:24
Thread name: H2 - Number:18
My problem resides in the notify and wait methods. When I call the wait method, both threads stop, even though a notifyAll is present.
I need to use just one method; I can't use, for example, an evenPrint method and an oddPrint method.
Code of my class:
public class SumatoriaThread extends Thread{
String name;
NumerosCompartidos compartidos; // Object that contains the array.
double sumatoria; //Parameter used to sum all the numbers
static int pos = 0; // pos of the object
static int threadOrder = 0; //just a way to set the order of the Threads.
int priority; //Value of the priority
public SumatoriaThread(NumerosCompartidos compartidos, String name){
this.name = name;
this.compartidos = compartidos;
sumatoria = 0;
priority = threadOrder;
threadOrder++;
System.out.println("Hilo " + this.name + " creado.");
}
public void run() {
//Array Length
int length = this.compartidos.length();
int i = 0;
while (pos < length) {
//Don't actually know if this is correct.
synchronized (this) {
//Just to be sure that the first Thread run before the second one.
if (priority == 1) {
try {
priority = 0;
//Call a wait method until the next notify runs.
this.wait();
} catch (InterruptedException ex) {
Logger.getLogger(SumatoriaThread.class.getName()).log(Level.SEVERE, null, ex);
}
} else {
//Even numbers
if (pos % 2 == 0) {
System.out.println("Nombre Thead: " + Thread.currentThread().getName() + " valor numero: " + this.compartidos.getValor(pos));
pos++;
//To activate the Thread which is waiting.
this.notifyAll();
try {
//Then I set to wait this one.
this.wait();
} catch (Exception ex) {
ex.printStackTrace();
}
} else {
System.out.println("Nombre Thead: " + Thread.currentThread().getName() + " valor numero: " + this.compartidos.getValor(pos + 1));
pos++;
this.notifyAll();
try {
this.wait();
} catch (InterruptedException ex) {
Logger.getLogger(SumatoriaThread.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
}
}
}
}
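One thing visible in the code above is that each thread synchronizes, waits and notifies on this, which is its own Thread instance, so a notifyAll() from one thread never reaches the other. A minimal sketch of the alternation with a single shared lock follows. It is not the asker's code: AlternatingReader, readLoop and the parity parameter are names invented here, and results are collected in a list instead of being printed so that the order can be checked.

```java
import java.util.ArrayList;
import java.util.List;

class AlternatingReader {
    private final int[] values;
    private final Object lock = new Object();          // the one monitor both threads share
    private final List<String> log = new ArrayList<>(); // guarded by lock
    private int pos = 0;                                // next index to read, guarded by lock

    AlternatingReader(int[] values) {
        this.values = values;
    }

    /** Single method used by both threads; parity 0 reads even indexes, parity 1 odd ones. */
    private void readLoop(String name, int parity) {
        synchronized (lock) {
            while (pos < values.length) {
                if (pos % 2 != parity) {
                    try {
                        lock.wait(); // not our turn: release the lock and sleep
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    continue; // re-check both conditions after waking up
                }
                log.add(name + ":" + values[pos]);
                pos++;
                lock.notifyAll(); // wake the other thread, it is its turn now
            }
        }
    }

    static List<String> run(int[] values) throws InterruptedException {
        AlternatingReader reader = new AlternatingReader(values);
        Thread h1 = new Thread(() -> reader.readLoop("H1", 0));
        Thread h2 = new Thread(() -> reader.readLoop("H2", 1));
        h1.start();
        h2.start();
        h1.join();
        h2.join();
        return reader.log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(new int[] { 10, 35, 24, 18 }));
    }
}
```

The wait() call sits inside a loop that re-checks the turn, so a spurious wakeup or a notification meant for the other thread cannot cause an index to be read out of order.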

offbynull coroutines not consuming all

With com.offbynull.coroutines version 1.1.0, the consumer only consumes 7500 messages.
Please help me understand why this code only consumes 7500 messages instead of 30000.
public class DemoProducerConsumer {
public static int cnt = 0;
public static final int MAX = 10000;
public static class Producer implements Coroutine {
@Override
public void run(Continuation ctn) throws Exception {
String thName = Thread.currentThread().getName();
System.out.println(thName + ") Producer starting...");
Consumer consumer = new Consumer();
for (int i = 0; i < 3; i++) {
consumer.consume(ctn, "Hello:" + i);
}
System.out.println(thName + ") Producer published 3 messages");
}
}
public static class Consumer {
public void consume(Continuation ctn, String message) {
String thName = Thread.currentThread().getName();
System.out.println(thName + ")" + message);
cnt++; // <<< SUSPECT bug here.
ctn.suspend(); // <<< SUSPECT bug here.
}
}
public static final void main(String... args) throws InterruptedException {
String thName = Thread.currentThread().getName();
System.err.println(thName + ") Preparing Producer ");
new Thread(new Runnable() {
public void run() {
cnt = 0;
Producer producer = new Producer();
CoroutineRunner runner = new CoroutineRunner(producer);
for (int i = 0; i < MAX; i++) {
runner.execute();
}
System.out.println(thName + ") Producer Looped " + MAX + " times.");
}
}).start();
System.err.println(thName + ") Waiting " + (MAX * 3) + " message to be consumed...");
Thread.sleep(10000);
System.err.println(thName + ") Message consumed:" + cnt);
System.err.println(thName + ") Exiting...");
}
}
I plan to use this with Thread Pool to implement a higher performance MVC server.
Separation of consumer and producer is a must.
Author of coroutines here. You seem to be misunderstanding how the execute() method works. Every time you call suspend(), execute() will return. When you call execute() again, it'll continue executing the method from the point at which you suspended.
So, if you want to completely execute your coroutine MAX times, you need to change your main loop to the following:
for (int i = 0; i < MAX; i++) {
boolean stillExecuting;
do {
stillExecuting = runner.execute();
} while (stillExecuting);
}
In addition to that, since you're accessing the field cnt from separate threads, you should probably be marking cnt as volatile:
public static volatile int cnt = 0;
Running with the above changes produces what you expect for your output:
main) Producer Looped 10000 times.
main) Message consumed:30000
main) Exiting...
Also, you should spend some time evaluating whether coroutines are a good fit for your use case. I don't understand the problem you're trying to solve, but it sounds like normal Java threading constructs may be a better fit.

Java thread not responding to volatile boolean flag

I am new to Java concurrency, and I met a very strange problem:
I read from a large file and used several worker threads to work on the input (some complicated string matching tasks). I used a LinkedBlockingQueue to transmit the data to the worker threads, and a volatile boolean flag in the worker class to respond to the signal when the end-of-file is reached.
However, I cannot get the worker thread to stop properly. The CPU usage by this program is almost zero in the end, but the program won't terminate normally.
The simplified code is below. I have removed the real code and replaced them with a simple word counter. But the effect is the same. The worker thread won't terminate after the whole file is processed, and the boolean flag is set to true in the main thread.
The class with main
public class MultiThreadTestEntry
{
private static String inputFileLocation = "someFile";
private static int numbOfThread = 3;
public static void main(String[] args)
{
int i = 0;
Worker[] workers = new Worker[numbOfThread];
Scanner input = GetIO.getTextInput(inputFileLocation);
String temp = null;
ExecutorService es = Executors.newFixedThreadPool(numbOfThread);
LinkedBlockingQueue<String> dataQueue = new LinkedBlockingQueue<String>(1024);
for(i = 0 ; i < numbOfThread ; i ++)
{
workers[i] = new Worker(dataQueue);
workers[i].setIsDone(false);
es.execute(workers[i]);
}
try
{
while(input.hasNext())
{
temp = input.nextLine().trim();
dataQueue.put(temp);
}
}
catch (InterruptedException e)
{
Thread.currentThread().interrupt();
}
input.close();
for(i = 0 ; i < numbOfThread ; i ++)
{
workers[i].setIsDone(true);
}
es.shutdown();
try
{
es.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e)
{
Thread.currentThread().interrupt();
}
}
}
The worker class
public class Worker implements Runnable
{
private LinkedBlockingQueue<String> dataQueue = null;
private volatile boolean isDone = false;
public Worker(LinkedBlockingQueue<String> dataQueue)
{
this.dataQueue = dataQueue;
}
@Override
public void run()
{
String temp = null;
long count = 0;
System.out.println(Thread.currentThread().getName() + " running...");
try
{
while(!isDone || !dataQueue.isEmpty())
{
temp = dataQueue.take();
count = temp.length() + count;
if(count%1000 == 0)
{
System.out.println(Thread.currentThread().getName() + " : " + count);
}
}
System.out.println("Final result: " + Thread.currentThread().getName() + " : " + count);
}
catch (InterruptedException e)
{
}
}
public void setIsDone(boolean isDone)
{
this.isDone = isDone;
}
}
Any suggestions to why this happens?
Thank you very much.
As Dan Getz already said, your worker's take() waits until an element becomes available, but the queue may be empty.
In your code you check whether the queue is empty, but nothing prevents the other workers from reading and removing an element from the queue after the check.
If the Queue contains only one element and t1 and t2 are two Threads
the following could happen:
t2.isEmpty(); // -> false
t1.isEmpty(); // -> false
t2.take(); // now the queue is empty
t1.take(); // wait forever
in this case t1 would wait "forever".
You can avoid this by using poll instead of take and checking whether the result is null:
public void run()
{
String temp = null;
long count = 0;
System.out.println(Thread.currentThread().getName() + " running...");
try
{
while(!isDone || !dataQueue.isEmpty())
{
temp = dataQueue.poll(2, TimeUnit.SECONDS);
if (temp == null)
// re-check if this was really the last element
continue;
count = temp.length() + count;
if(count%1000 == 0)
{
System.out.println(Thread.currentThread().getName() + " : " + count);
}
}
System.out.println("Final result: " + Thread.currentThread().getName() + " : " + count);
}
catch (InterruptedException e)
{
// here it is important to restore the interrupted flag!
Thread.currentThread().interrupt();
}
}
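A different technique from the timed poll above, often called a poison pill, is to drop the isDone flag entirely and have the producer enqueue one sentinel object per worker after the last real line. The sketch below shows that alternative under stated assumptions: PoisonPillDemo, countChars and the POISON sentinel are names invented here, and the per-line work is the same simple character count used in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

class PoisonPillDemo {
    // A distinct String instance, compared by identity so no real line can match it.
    private static final String POISON = new String("<<end-of-input>>");

    static long countChars(List<String> lines, int workers) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1024);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                results.add(pool.submit(() -> {
                    long count = 0;
                    while (true) {
                        String line = queue.take(); // safe to block: a pill always arrives
                        if (line == POISON) {
                            break;                  // each worker consumes exactly one pill
                        }
                        count += line.length();
                    }
                    return count;
                }));
            }
            for (String line : lines) {
                queue.put(line);
            }
            for (int w = 0; w < workers; w++) {
                queue.put(POISON); // one pill per worker, after all real data
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> lines = new ArrayList<>();
        lines.add("ab");
        lines.add("cde");
        System.out.println("Total chars: " + countChars(lines, 3));
    }
}
```

Because every worker is guaranteed to receive a pill, there is no window between an isEmpty() check and a take() in which a worker can get stuck.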

Multi thread is slower than one [duplicate]

This question already has answers here:
Java: How to use Thread.join
(3 answers)
Closed 8 years ago.
I am writing application using multi threads to count number of char inside txt file.
File contains 10 000 000 chars. 10 000 rows and 1 000 columns.
EDITED
About first part of the question:
The previous question was about threads: I used thread.join() in the wrong way.
Second part:
Could you help me improve the performance and safety? Here is my code (Use of the Semaphore is required):
public class MultiThread implements Runnable {
HashMap<String, AtomicInteger> asciiMap = Maps.newHashMap();
LinkedList<String> asciiLines = ReadDataFromFile.lines;
Semaphore mutex = new Semaphore(1);
AtomicInteger i = new AtomicInteger(0);
int index;
@Override
public void run() {
long actual = 0;
try {
Calculate calculate = new Calculate();
long multiStart = System.currentTimeMillis();
Thread first = new Thread(calculate);
Thread second = new Thread(calculate);
Thread third = new Thread(calculate);
first.start();
second.start();
third.start();
first.join();
second.join();
third.join();
long multiEnd = System.currentTimeMillis();
actual = multiEnd - multiStart;
} catch (InterruptedException ex) {
Logger.getLogger(MultiThread.class.getName()).log(Level.SEVERE, null, ex);
}
int sum = 0;
for (Map.Entry<String, AtomicInteger> entry : asciiMap.entrySet()) {
System.out.println("Char: " + entry.getKey() + " , number: " + entry.getValue());
sum = sum + entry.getValue().get();
}
System.out.println("Time: " + actual);
}
int increment() {
try {
mutex.acquire();
index = i.incrementAndGet();
mutex.release();
} catch (InterruptedException ex) {
Logger.getLogger(MultiThread.class.getName()).log(Level.SEVERE, null, ex);
}
return index;
}
public class Calculate implements Runnable {
public Calculate() {
}
@Override
public void run() {
while (i.get() < asciiLines.size()) {
for (String oneCharacter : asciiLines.get(i.get()).split("")) {
if (asciiMap.containsKey(oneCharacter)) {
asciiMap.replace(oneCharacter, new AtomicInteger(asciiMap.get(oneCharacter).incrementAndGet()));
} else {
asciiMap.put(oneCharacter, new AtomicInteger(1));
}
}
i = new AtomicInteger(increment());
}
}
}
}
Every element inside LinkedList contains one row (1 000 chars).
The code as originally posted did absolutely no multithreading. Thread.join means: wait until that thread has finished executing, then continue the current thread of execution. If each thread is joined right after it is started, the threads execute serially. Start all of the threads first, and only then join them:
Thread first = new Thread(calculate);
Thread third = new Thread(calculate);
Thread second = new Thread(calculate);
first.start();
second.start();
third.start();
first.join();
second.join();
third.join();
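On the second part of the question (performance and safety), the posted Calculate class shares a plain HashMap between threads and replaces the AtomicInteger i itself, so lines can be skipped or counted twice. One possible restructuring is sketched below. It is not the asker's design: CharCounter and count are hypothetical names, each thread counts into its own local map, and the required Semaphore only guards the final merge. Lines are claimed with getAndIncrement(), so every index is handed out exactly once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class CharCounter {
    static Map<Character, Integer> count(List<String> lines, int threadCount)
            throws InterruptedException {
        List<String> indexed = new ArrayList<>(lines); // constant-time get(i), unlike LinkedList
        AtomicInteger next = new AtomicInteger(0);      // index of the next unclaimed line
        Semaphore mutex = new Semaphore(1);             // guards the shared totals map
        Map<Character, Integer> totals = new HashMap<>();

        Runnable task = () -> {
            Map<Character, Integer> local = new HashMap<>(); // private to this thread, no locking
            int i;
            while ((i = next.getAndIncrement()) < indexed.size()) {
                for (char c : indexed.get(i).toCharArray()) {
                    local.merge(c, 1, Integer::sum);
                }
            }
            mutex.acquireUninterruptibly(); // one short critical section per thread
            try {
                local.forEach((c, n) -> totals.merge(c, n, Integer::sum));
            } finally {
                mutex.release();
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            threads.add(new Thread(task));
        }
        for (Thread thread : threads) {
            thread.start(); // start everything first...
        }
        for (Thread thread : threads) {
            thread.join();  // ...then wait for all of them
        }
        return totals;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> lines = new ArrayList<>();
        lines.add("aab");
        lines.add("bc");
        System.out.println(count(lines, 3));
    }
}
```

Keeping the per-character updates in thread-local maps means the semaphore is acquired once per thread instead of once per character, which is where most of the contention in the original version would come from.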

Many ProducerS and many ConsumerS. Making the last producer alive killing the consumers

I have a standard producer consumer problem. Producer puts data into the stack(buffer) consumers take it.
I would like to have many producers and consumers.
the problem is I would like to make only the last living producer to be able to call b.stop()
for(int i = 0; i < 10; i++){
try{
// sleep((int)(Math.random() * 1));
}catch(Exception e){e.printStackTrace();}
b.put((int) (Math.random()* 10));
System.out.println("i = " + i);
}
b.stop();
So then I call b.stop(), which changes the running field in Buffer to false and calls notifyAll().
And then I get:
i = 9 // number of iteration this is 10th iteration
Consumer 2.: no data to take. I wait. Memory: 0
Consumer 1.: no data to take. I wait. Memory: 0
Consumer 3.: no data to take. I wait. Memory: 0
They should die at that point, so I wrote the stop() method, but it did not work.
The code below compiles and runs; please check it:
import java.util.Stack;
public class Buffer {
private static int SIZE = 4;
private int i;//number of elements in buffer
public Stack<Integer> stack;
private volatile boolean running;
public Buffer() {
stack = new Stack<>();
running = true;
i = 0;
}
synchronized public void put(int val){
while (i >= SIZE) {
try {
System.out.println("Buffer full, producer waits");
wait();
} catch (InterruptedException exc) {
exc.printStackTrace();
}
}
stack.push(val);//txt = s;
i++;
System.out.println("Producer inserted " + val + " memory: " + i);
if(i - 1 == 0)
notifyAll();
System.out.println(stack);
}
public synchronized Integer get(Consumer c) {
while (i == 0) {
try {
System.out.println(c + ": no data to take. I wait. Memory: " + i);
wait();
} catch (InterruptedException exc) {
exc.printStackTrace();
}
}
if(running){
int data = stack.pop();
i--;
System.out.println(c+ ": I took: " + data +" memory: " + i);
System.out.println(stack);
if(i + 1 == SIZE){//if the buffer was full so the producer is waiting
notifyAll();
System.out.println(c + "I notified producer about it");
}
return data;}
else
return null;
}
public boolean isEmpty(){
return i == 0;
}
public synchronized void stop(){//I THOUGH THIS WOULD FIX IT~!!!!!!!!!!!!!!
running = false;
notifyAll();
}
public boolean isRunning(){
return running;
}
}
public class Producer extends Thread {
private Buffer b;
public Producer(Buffer b) {
this.b = b;
}
public void run(){
for(int i = 0; i < 10; i++){
try{
// sleep((int)(Math.random() * 1));
}catch(Exception e){e.printStackTrace();}
b.put((int) (Math.random()* 10));
System.out.println("i = " + i);
}
b.stop();
}
}
public class Consumer extends Thread {
Buffer b;
int nr;
static int NR = 0;
public Consumer(Buffer b) {
this.b = b;
nr = ++NR;
}
public void run() {
Integer i = b.get(this);
while (i != null) {
System.out.println(nr + " I received : " + i);
i = b.get(this);
}
System.out.println("Consumer " + nr + " is dead");
}
public String toString() {
return "Consumer " + nr + ".";
}
}
public class Main {
public static void main(String[] args) {
Buffer b = new Buffer();
Producer p = new Producer(b);
Consumer c1 = new Consumer(b);
Consumer c2 = new Consumer(b);
Consumer c3 = new Consumer(b);
p.start();
c1.start();c2.start();c3.start();
}
}
What you have to realise is that your threads could be waiting in either of two locations:
1. In the wait loop with i == 0, in which case notifyAll will kick all of them out. However, if i is still 0 they will go straight back to waiting again.
2. Waiting for exclusive access to the object (i.e. waiting on a synchronized method), in which case (if you fix issue 1 above and the lock is released) they will go straight into the while (i == 0) loop.
I would suggest you change your while ( i == 0 ) loop to while ( running && i == 0 ). This should fix your problem. Since your running flag is (correctly) volatile all should tidily exit.
In your stop method, you set running to false, but your while loop is running as long as i == 0. Set i to something different than zero and it should fix it.
BTW, I don't understand why you have a running variable and a separate i variable, which is actually the variable keeping a thread running.
I would rethink your design. Classes should have a coherent set of responsibilities; making a class responsible for both consuming objects off the queue and shutting down other consumers seems to be something you'd want to separate.
In answer to the question of how to make only the last living producer able to call b.stop():
You should add an AtomicInteger to your Buffer containing the number of producers and make each producer call b.start() (which increments it) in its constructor.
That way you can decrement it in b.stop() and only when it has gone to zero should running be set to false.
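Combining the two suggestions (a wait condition that also checks for shutdown, and a count of live producers) could look like the sketch below. It is an illustration, not a drop-in replacement: CountingBuffer, registerProducer, producerDone and demo are names invented here. Because every method is synchronized, the sketch keeps the producer count in a plain int guarded by the monitor instead of an AtomicInteger, and consumers drain any remaining data before they receive null. It assumes all producers are registered before any consumer starts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class CountingBuffer {
    private static final int SIZE = 4;
    private final Deque<Integer> stack = new ArrayDeque<>();
    private int producers = 0; // live producers, guarded by this

    synchronized void registerProducer() {
        producers++;
    }

    synchronized void producerDone() {
        producers--;
        if (producers == 0) {
            notifyAll(); // last producer gone: wake consumers so they can exit
        }
    }

    synchronized void put(int val) throws InterruptedException {
        while (stack.size() >= SIZE) {
            wait(); // buffer full
        }
        stack.push(val);
        notifyAll();
    }

    /** Returns the next value, or null once the buffer is empty and no producer is left. */
    synchronized Integer get() throws InterruptedException {
        while (stack.isEmpty() && producers > 0) {
            wait(); // nothing to take yet, but more may still arrive
        }
        if (stack.isEmpty()) {
            return null;
        }
        Integer value = stack.pop();
        notifyAll(); // a slot was freed for waiting producers
        return value;
    }

    /** Runs the given number of producers and consumers; returns the total items consumed. */
    static int demo(int producerCount, int consumerCount, int itemsEach)
            throws InterruptedException {
        CountingBuffer b = new CountingBuffer();
        AtomicInteger consumed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producerCount; p++) {
            b.registerProducer(); // register before any thread starts
            threads.add(new Thread(() -> {
                try {
                    for (int i = 0; i < itemsEach; i++) {
                        b.put(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    b.producerDone();
                }
            }));
        }
        for (int c = 0; c < consumerCount; c++) {
            threads.add(new Thread(() -> {
                try {
                    while (b.get() != null) {
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return consumed.get();
    }
}
```

Calling CountingBuffer.demo(2, 3, 10) runs two producers and three consumers; all five threads terminate once the second producer has finished and the buffer has been drained.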
