I've made a class that counts words in given files within the same directory. Seeing as the files are very large, I've decided to achieve the count of multiple files using multiple threads.
When running the DriverClass as specified below, it gets stuck at thread one.
What am I doing wrong? As I'm iterating over queue.take(), one would expect the parser to wait for something to retrieve and move on. Getting stuck at thread 1 makes me suspect an error when putting() into the queue.
Thanks in advance!
DriverClass:
public class WordCountTest {
public static void main(String[] args){
if (args.length<1){
System.out.println("Please specify at least one file");
}
BlockingQueue<Integer> threadQueue = new LinkedBlockingQueue<>();
Runnable r;
Thread t;
for (int i = 0; i<args.length; i++){
r = new WordCount(args[i], threadQueue);
t = new Thread(r);
t.start();
int total = 0;
for (int k = 0; k<args.length; k++){
try {
total += threadQueue.take();
} catch (InterruptedException e){
}
}
System.out.println("Total wordcount: " + total);
}
}
}
WordCountClass:
public class WordCount implements Runnable {
private int myId = 0;
private String _file;
private BlockingQueue<Integer> _queue;
private static int id = 0;
public WordCount(String file, BlockingQueue<Integer> queue){
_queue = queue;
_file = file;
myId = ++id;
}
@Override
public void run() {
System.out.println("Thread " + myId + " running");
try {
_queue.put(countWord(_file));
} catch (InterruptedException e){
}
}
public int countWord(String file){
int count = 0;
try {
Scanner in = new Scanner(new FileReader(file));
while (in.hasNext()){
count++;
in.next();
}
} catch (IOException e){
System.out.println("File," + file + ",not found");
}
return count;
}
}
The problem is that you're using a nested loop, when you should be using two separate loops: one to start the WordCounts, another to collect the results, something like
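import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class WordCountTest {
    public static void main(String[] args) throws InterruptedException {
        // WordCount's constructor takes a BlockingQueue, so keep that type here.
        BlockingQueue<Integer> threadQueue = new LinkedBlockingQueue<>();
        ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        CountDownLatch latch = new CountDownLatch(args.length);
        // First loop: start every WordCount.
        for (int i = 0; i < args.length; i++) {
            CompletableFuture.runAsync(new WordCount(args[i], threadQueue), executor)
                .thenRunAsync(latch::countDown, executor);
        }
        // Wait until every WordCount has finished, then collect the results.
        latch.await();
        executor.shutdown();
        int sum = 0;
        for (Integer i : threadQueue) {
            sum += i;
        }
        System.out.println("Total wordcount: " + sum);
    }
}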
Or however you want to implement it, the point being that you shouldn't start collecting results until all of the WordCounts have started.
You are waiting for all the results after the first thread is started. Perhaps you intended to wait for the results after all the threads have started.
Note: if you create more threads than you have CPUs, it's likely to be slower. I suggest using a fixed thread pool instead.
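As a rough sketch of the "two separate loops" idea combined with a fixed thread pool: the class name TwoPhaseWordCount and its helper methods are made up for illustration, and plain strings stand in for file contents so that the example runs without any files on disk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class TwoPhaseWordCount {
    // Counts whitespace-separated tokens, like the Scanner loop in the question.
    static int countWords(String text) {
        int count = 0;
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                in.next();
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: each string stands in for the contents of one file.
    static int total(List<String> texts) {
        BlockingQueue<Integer> results = new LinkedBlockingQueue<>();
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Loop 1: submit every task before collecting anything.
            for (String text : texts) {
                pool.execute(() -> results.add(countWords(text)));
            }
            // Loop 2: take exactly one result per submitted task.
            int sum = 0;
            for (int i = 0; i < texts.size(); i++) {
                sum += results.take();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "Total wordcount: 5"
        System.out.println("Total wordcount: " + total(Arrays.asList("a b c", "d e")));
    }
}
```

Because take() is called once per submitted task, the collecting loop blocks only until the slowest task has delivered its count.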
Related
I am trying to learn multithreading by printing odd and even numbers using two threads, but I am not sure how to synchronize the for loop so that it prints 1 to 10 in order.
public class Counter implements Runnable {
public static void main(String[] args) {
Thread t1 = new Thread(new Counter(1, " ODD")); // Thread 1 runs the Odd number
Thread t2 = new Thread(new Counter(0, " EVEN")); // Thread 2 runs the Even number
t1.start();
t2.start();
}
constructor:
int num; // gets the number
String name; // gets the name
public Counter(int i, String name) {
this.num = i;
this.name = name;
}
This is the loop I'm using to create the odd and even numbers, and I don't know how to synchronize it.
public void printNum() {
synchronized (this) {
for (int j = this.num; j <= 10; j += 2) {
System.out.println(name + "-->" + j);
}
}
}
@Override
public void run() {
//this will run the printNum to the Threads
printNum();
}
Maybe something like this:
public class Counter implements Runnable{
@Override
public void run() {
try {
printNum();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
public static void main(String[] args){
Thread t1 = new Thread(new Counter(1, " ODD")); // Thread 1 runs the Odd number
Thread t2 = new Thread(new Counter(0, " EVEN")); // Thread 2 runs the Even number
t2.start();
t1.start();
}
int num; // gets the number
String name; // gets the name
public Counter(int i, String name) {
this.num = i;
this.name = name;
}
public void printNum() throws InterruptedException {
synchronized (this) {
for (int j = this.num; j <= 10; j += 2) {
System.out.println(name + "-->" + j);
Thread.sleep(100);
}
}
}
}
Result:
public class HelloWorld {
public static void main(String[] args) {
counter e = new counter();
counter o = new counter();
e.neighbor = o;
o.neighbor = e;
e.wait = false;
o.wait = true;
e.count = 0;
o.count = 1;
Thread te = new Thread(e);
Thread to = new Thread(o);
te.start();
to.start();
}
static class counter implements Runnable{
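public counter neighbor = null;
// volatile so that a write by the neighbor thread becomes visible to this thread's busy-wait loop
public volatile boolean wait = false;
public int count = -1;
@Override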
public void run(){
while (count <= 10){
if (!wait){
System.out.print("count = " + count + "\n");
count += 2;
wait = true;
neighbor.wait = false;
}
}
wait = true;
neighbor.wait = false;
}
}
}
When two threads depend on each other, as here where the odd thread must wait for the even thread and vice versa, they need some way to communicate. Your code didn't work because synchronized only makes a thread wait until another thread has left a block guarded by the same lock. Each Counter locks on its own this, so the two threads never actually wait for each other; and even with a shared lock, the entire loop sits inside the block, so one thread would finish its whole loop before the other could start.
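One possible way to set up that communication is a guarded wait on a shared lock with wait() and notifyAll(). The following is only a sketch under that assumption; OddEvenPrinter, its fields, and the run helper are hypothetical names, not code from the question.

```java
import java.util.ArrayList;
import java.util.List;

class OddEvenPrinter {
    private final Object lock = new Object();
    private final int max;
    private int next = 1;                                  // next number to print, guarded by lock
    private final List<String> out = new ArrayList<>();    // printed lines, guarded by lock

    OddEvenPrinter(int max) {
        this.max = max;
    }

    // parity is 1 for the ODD thread and 0 for the EVEN thread.
    private void print(int parity, String name) {
        synchronized (lock) {
            while (next <= max) {
                if (next % 2 == parity) {
                    String line = name + "-->" + next;
                    System.out.println(line);
                    out.add(line);
                    next++;
                    lock.notifyAll();      // wake the other thread: it is its turn now
                } else {
                    try {
                        lock.wait();       // releases the lock until the other thread notifies
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        }
    }

    // Runs both threads to completion and returns the lines in the order they were printed.
    static List<String> run(int max) {
        OddEvenPrinter printer = new OddEvenPrinter(max);
        Thread odd = new Thread(() -> printer.print(1, "ODD"));
        Thread even = new Thread(() -> printer.print(0, "EVEN"));
        even.start();
        odd.start();
        try {
            odd.join();
            even.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return printer.out;
    }

    public static void main(String[] args) {
        run(10);
    }
}
```

The start order of the two threads does not matter here, because a thread whose turn it is not simply waits until the shared counter reaches a number of its parity.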
I have the following task :
Create a class called ElementsProvider(int n, List elements) that will add N random elements to the given list.
ElementsProvider will be a thread.
Try to create 4 instances; each instance will add 1000 random elements to the list.
Start all instances at once and wait until they end.
Print the list size.
And here is what I did:
Main:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
public class ElementsProvider implements Runnable{
private final List<Integer> list;
private final int n;
public ElementsProvider(List<Integer> list, int n){
this.list = list;
this.n = n;
}
@Override
public void run() {
Random random = new Random();
for (int i = 0; i < n; i++) {
list.add(random.nextInt());
}
}
public static void main(String[] args) throws InterruptedException {
List<Integer> list = new ArrayList<>();
int n = 1000;
ElementsProvider e1 = new ElementsProvider(list, n);
ElementsProvider e2 = new ElementsProvider(list, n);
ElementsProvider e3 = new ElementsProvider(list, n);
ElementsProvider e4 = new ElementsProvider(list, n);
Thread t1 = new Thread(e1);
Thread t2 = new Thread(e2);
Thread t3 = new Thread(e3);
Thread t4 = new Thread(e4);
t1.start();
t2.start();
t3.start();
t4.start();
t1.join();
t2.join();
t3.join();
t4.join();
System.out.println(list);
}
}
Apparently my solution is not OK.
The feedback that I got is:
Wrong; try to print the list size, it will be different each time you run the program.
Can someone point out where I am going wrong, please?
You proposed this change in a comment on your original question, above:
@Override
public void run() {
synchronized (ElementsProvider.class) {
Random random = new Random();
for (int i = 0; i < n; i++) {
list.add(random.nextInt());
}
}
}
OK, that will ensure that your program always prints the correct answer, but it does so by making your program effectively single-threaded. When you put the entire body of the threads' run() method in a single synchronized block, you prevent them from running concurrently. But running concurrently is the only reason to use threads.
You need to synchronize a smaller part of the code. The only variable that the threads share is the list. There is no reason for new Random() to be inside the synchronized block, and there is no reason for random.nextInt() to be inside it. The only thing that needs to be inside the synchronized block is the list.add() call.
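A minimal sketch of that narrower lock, assuming the shared list itself is used as the monitor (the class name NarrowLockProvider and the fill helper are invented for this example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class NarrowLockProvider implements Runnable {
    private final List<Integer> list;
    private final int n;

    NarrowLockProvider(List<Integer> list, int n) {
        this.list = list;
        this.n = n;
    }

    @Override
    public void run() {
        Random random = new Random();           // per-thread, so no lock needed
        for (int i = 0; i < n; i++) {
            int value = random.nextInt();       // computed outside the lock
            synchronized (list) {               // only the shared mutation is guarded
                list.add(value);
            }
        }
    }

    // Starts the given number of threads on one shared list and returns its final size.
    static int fill(int threads, int perThread) {
        List<Integer> shared = new ArrayList<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new NarrowLockProvider(shared, perThread));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 1000));   // prints 4000
    }
}
```

Every thread must lock on the same object for this to work; here all instances receive the same list, so synchronizing on it gives them a common monitor.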
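I'd add a static semaphore to your ElementsProvider class:
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Semaphore;

public class ElementsProvider implements Runnable {
    private final List<Integer> list;
    private final int n;
    // One permit shared by all instances: only one thread may touch the list at a time.
    private static final Semaphore semaphore = new Semaphore(1);

    public ElementsProvider(List<Integer> list, int n) {
        this.list = list;
        this.n = n;
    }

    @Override
    public void run() {
        Random random = new Random();
        // Build the batch locally so the permit is held only for the final addAll.
        List<Integer> l = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            l.add(random.nextInt());
        }
        try {
            semaphore.acquire();
        } catch (InterruptedException e) {
            // The permit was never acquired, so there is nothing to release.
            Thread.currentThread().interrupt();
            return;
        }
        try {
            System.out.println("Adding " + l.size() + " elements to list");
            list.addAll(l);
        } finally {
            semaphore.release();
        }
    }
}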
public class ThreadsDemo {
public static int n = 0;
private static final int NTHREADS = 300;
public static void main(String[] argv) throws InterruptedException {
final CountDownLatch cdl = new CountDownLatch(NTHREADS);
for (int i = 0; i < NTHREADS; i++) {
new Thread(new Runnable() {
public void run() {
// try {
// Thread.sleep(10);
// } catch (InterruptedException e) {
// e.printStackTrace();
// }
n += 1;
cdl.countDown();
}
}).start();
}
cdl.await();
System.out.println("fxxk, n is: " + n);
}
}
Why is the output "n is: 300"? n isn't explicitly synchronized. And if I uncomment "Thread.sleep", the output is "n is: 299 or less".
I changed your code this way:
private static final int NTHREADS = 300;
private static AtomicInteger n = new AtomicInteger();
public static void main(String[] argv) throws InterruptedException {
final CountDownLatch cdl = new CountDownLatch(NTHREADS);
for (int i = 0; i < NTHREADS; i++) {
new Thread(new Runnable() {
public void run() {
n.incrementAndGet();
cdl.countDown();
}
}).start();
}
cdl.await();
System.out.println("fxxk, n is: " + n);
}
You have to deal with race conditions. All 300 threads modify n concurrently. For example, if two threads read and increment n at the same time, then both write the same value back.
That is why n wasn't always 300: each time this happens, one increment is lost. And this situation could have occurred zero or many times.
I changed n from int to AtomicInteger which is thread safe. Now everything works as expected.
You'd better use AtomicInteger.
This question will help you with description and example: Practical uses for AtomicInteger
A static context needs a lock on the class, not on an instance. If you only need writes to a static variable to be visible to other threads rather than cached locally per thread, declare it volatile; note that volatile alone does not make a compound update such as n += 1 atomic.
public class ThreadsDemo {
public static int n = 0;
private static final int NTHREADS = 30;
public static void main(String[] argv) throws InterruptedException {
final CountDownLatch cdl = new CountDownLatch(NTHREADS);
for (int i = 0; i < NTHREADS; i++) {
new Thread(new Runnable() {
public void run() {
for (int j = 0; j < 1000; j++) // run a long time duration
n += 1;
cdl.countDown();
}
}).start();
}
cdl.await();
System.out.println("fxxk, n is: " + n);
}
}
output "n is: 29953"
I think the reason is that the threads run for such a short time that the JVM doesn't make a context switch.
Will a Java static field be synchronized among threads?
No. You should make it volatile or synchronize all access to it, depending on your usage patterns.
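To illustrate the "synchronize all access" option, here is a small sketch in which every read and write of the static field goes through static synchronized methods, which lock on the class object. The class SynchronizedStaticCounter and its run helper are hypothetical and exist only for this example.

```java
import java.util.concurrent.CountDownLatch;

class SynchronizedStaticCounter {
    private static int n = 0;   // guarded by SynchronizedStaticCounter.class

    // static synchronized methods lock on the class object, not on an instance.
    static synchronized void increment() {
        n += 1;
    }

    static synchronized int get() {
        return n;
    }

    static synchronized void reset() {
        n = 0;
    }

    // Starts the given number of threads, each incrementing perThread times.
    static int run(int threads, int perThread) {
        reset();
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    increment();
                }
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return get();
    }

    public static void main(String[] args) {
        System.out.println("n is: " + run(30, 1000));   // prints "n is: 30000"
    }
}
```

A volatile int would make each write visible to other threads but would still lose updates, because n += 1 is a separate read and write; the lock (or an AtomicInteger) is what makes the whole update indivisible.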
I made a program to count words from individual files,
but how can I modify it so that it gives the total number of words from all files (as ONE value)?
My code looks like this:
public class WordCount implements Runnable
{
public WordCount(String filename)
{
this.filename = filename;
}
public void run()
{
int count = 0;
try
{
Scanner in = new Scanner(new File(filename));
while (in.hasNext())
{
in.next();
count++;
}
System.out.println(filename + ": " + count);
}
catch (FileNotFoundException e)
{
System.out.println(filename + " was not found.");
}
}
private String filename;
}
With a Main-Class:
public class Main
{
public static void main(String args[])
{
for (String filename : args)
{
Runnable tester = new WordCount(filename);
Thread t = new Thread(tester);
t.start();
}
}
}
And how to avoid race conditions?
Thank you for your help.
A worker thread:
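class WordCount extends Thread
{
    private final String filename;
    int count;
    // Main constructs each worker with the name of the file it should count.
    WordCount(String filename)
    {
        this.filename = filename;
    }
    @Override
    public void run()
    {
        count = 0;
        /* Count the words... */
        ...
        ++count;
        ...
    }
}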
And a class to use them:
class Main
{
public static void main(String args[]) throws InterruptedException
{
WordCount[] counters = new WordCount[args.length];
for (int idx = 0; idx < args.length; ++idx) {
counters[idx] = new WordCount(args[idx]);
counters[idx].start();
}
int total = 0;
for (WordCount counter : counters) {
counter.join();
total += counter.count;
}
System.out.println("Total: " + total);
}
}
Many hard drives don't do a great job of reading multiple files concurrently. Locality of reference has a big impact on performance.
You can either use a Future to get each count and add them all up at the end, or use a static variable and increment it in a synchronized manner, i.e. with an explicit synchronized block or an atomic increment.
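The Future variant might look roughly like this. It is a sketch only: FutureWordCount and its methods are invented names, the pool size of 4 is an arbitrary choice, and strings stand in for file contents so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureWordCount {
    static int countWords(String text) {
        int count = 0;
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                in.next();
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: each string stands in for the contents of one file.
    static int total(List<String> texts) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String text : texts) {
                Callable<Integer> task = () -> countWords(text);
                futures.add(pool.submit(task));
            }
            int sum = 0;
            for (Future<Integer> future : futures) {
                sum += future.get();   // blocks until that task has finished
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Total: " + total(Arrays.asList("one two", "three")));   // prints "Total: 3"
    }
}
```

Since each task returns its own count and only the calling thread adds them up, no shared mutable state is involved and no explicit locking is needed.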
What if your Runnable took two arguments:
a BlockingQueue<String> or BlockingQueue<File> of input files
an AtomicLong
In a loop, you would get the next String/File from the queue, count its words, and increment the AtomicLong by that amount. Whether the loop is while(!queue.isEmpty()) or while(!done) depends on how you feed files into the queue: if you know all the files from the start, you can use the isEmpty version, but if you're streaming them in from somewhere, you want to use the !done version (and have done be a volatile boolean or AtomicBoolean for memory visibility).
Then you feed these Runnables to an executor, and you should be good to go.
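Here is one way that idea could be sketched for the case where all inputs are known up front. Note one deliberate deviation: instead of checking isEmpty() and then taking, the loop uses poll(), which returns null once the queue is drained, so two workers cannot both see a non-empty queue and then block on the last element. QueueWordCounter is a made-up name, and strings again stand in for files.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class QueueWordCounter implements Runnable {
    private final BlockingQueue<String> inputs;
    private final AtomicLong total;

    QueueWordCounter(BlockingQueue<String> inputs, AtomicLong total) {
        this.inputs = inputs;
        this.total = total;
    }

    @Override
    public void run() {
        String text;
        // poll() removes and returns the next input, or null when none are left.
        while ((text = inputs.poll()) != null) {
            long count = 0;
            try (Scanner in = new Scanner(text)) {
                while (in.hasNext()) {
                    in.next();
                    count++;
                }
            }
            total.addAndGet(count);   // one atomic update per input
        }
    }

    // Hypothetical helper: each string stands in for the contents of one file.
    static long total(List<String> texts, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(texts);
        AtomicLong total = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(new QueueWordCounter(queue, total));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(total(Arrays.asList("a b c", "d e", "f"), 2));   // prints 6
    }
}
```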
You can create a listener to get feedback from the thread.
public interface ResultListener {
    // Interface methods cannot be declared synchronized; the implementation synchronizes instead.
    void result(int words);
}
public class WordCount implements Runnable
{
    private String filename;
    private ResultListener listener;
    public void run()
    {
        int count = 0;
        try
        {
            Scanner in = new Scanner(new File(filename));
            while (in.hasNext())
            {
                in.next();
                count++;
            }
            listener.result(count);
        }
        catch (FileNotFoundException e)
        {
            System.out.println(filename + " was not found.");
        }
    }
}
You can add a constructor parameter for the listener, just like for your filename.
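import java.util.ArrayList;
import java.util.List;

public class Main
{
    private static int totalCount = 0;
    private static ResultListener listener = new ResultListener(){
        public synchronized void result(int words){
            totalCount += words;
        }
    };
    public static void main(String args[]) throws InterruptedException
    {
        List<Thread> threads = new ArrayList<>();
        for (String filename : args)
        {
            Runnable tester = new WordCount(filename, listener);
            Thread t = new Thread(tester);
            t.start();
            threads.add(t);
        }
        // Wait for every worker before reading the total.
        for (Thread t : threads)
        {
            t.join();
        }
        System.out.println("Total: " + totalCount);
    }
}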
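You can make the count static so that all the threads increment the same counter. A plain volatile int is not enough for this, because count++ is not atomic, so use an AtomicInteger.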
public class WordCount implements Runnable
{
private static AtomicInteger count = new AtomicInteger(0); // <-- now all threads increment the same count
private String filename;
public WordCount(String filename)
{
this.filename = filename;
}
public static int getCount()
{
return count.get();
}
public void run()
{
try
{
Scanner in = new Scanner(new File(filename));
while (in.hasNext())
{
in.next();
count.incrementAndGet();
}
System.out.println(filename + ": " + count);
}
catch (FileNotFoundException e)
{
System.out.println(filename + " was not found.");
}
}
}
Update: haven't done java in a while, but the point about making it a private static field still stands... just make it an AtomicInteger.
You could create a Thread pool with a synchronized task queue that would hold all of the files you wish to count the words for.
When your thread pool workers come online they could ask the task queue for a file to count.
After the worker completes their job then they could notify the main thread of their final number.
The main thread would have a synchronized notify method that would add up all of the worker threads' results.
Hope this helps.
Or you can have all the threads update a single word count variable. count++ is atomic if count is word-sized (an int should suffice).
EDIT: Turns out that count++ is not atomic in Java: it is a read, an add, and a write, and another thread can interleave between those steps. Look at AtomicInteger and its incrementAndGet method instead. That method is atomic, so you don't need any other synchronization mechanism; just store your count in an AtomicInteger.
This solution uses the Java 8 java.util.concurrent package, with Executors and Future for multithreading.
First, a Callable class is created to process an individual file:
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordCounter implements Callable<Map<String, Long>> {
    private final Path bookPath;
    public WordCounter(Path bookPath) {
        this.bookPath = bookPath;
    }
    @Override
    public Map<String, Long> call() throws Exception {
        // Files.lines keeps the file open, so close the stream when done.
        try (Stream<String> lines = Files.lines(bookPath)) {
            return lines.flatMap(line -> Arrays.stream(line.trim().split(" ")))
                .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
                .filter(word -> word.length() > 0)
                // Group identical words and count the occurrences of each.
                .collect(Collectors.groupingBy(word -> word, Collectors.counting()));
        }
    }
}
Now, we'll create multiple future tasks to invoke/process each file in the argument as below
ExecutorService exes = Executors.newCachedThreadPool();
Path[] books = new Path[2];
books[0] = Paths.get("C:\\Users\\Documents\\book1.txt");
books[1] = Paths.get("C:\\Users\\Documents\\book2.txt");
List<Future<Map<String, Long>>> tasks = new ArrayList<>();
Map<String, Long> result = new HashMap<>();
for (Path book : books) {
    tasks.add(exes.submit(new WordCounter(book)));
}
for (Future<Map<String, Long>> task : tasks) {
    try {
        // Add each file's count for a word to the running total for that word.
        task.get().forEach((k, v) -> result.merge(k, v, Long::sum));
    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (ExecutionException e) {
        e.printStackTrace();
    }
}
exes.shutdown();
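Alternatively, the result map could be a ConcurrentHashMap shared among the WordCounter threads, so that they update the word counts concurrently. Declaring a plain HashMap field volatile would not be enough, since volatile does not make the map's internal updates thread-safe.
End result: result holds the frequency of each word, so result.size() gives the number of distinct words.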
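A shared map updated concurrently could be sketched as follows, using ConcurrentHashMap.merge, which applies the update atomically per key. SharedFrequencyMap is a hypothetical name, the pool size is arbitrary, and strings stand in for the book files.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedFrequencyMap {
    // Hypothetical helper: each string stands in for the contents of one book.
    static Map<String, Long> count(List<String> texts) {
        ConcurrentHashMap<String, Long> freq = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String text : texts) {
            pool.execute(() -> {
                // Lowercase, then split on anything that is not a letter a-z.
                for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
                    if (!word.isEmpty()) {
                        freq.merge(word, 1L, Long::sum);   // atomic per-key update
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return freq;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("a b a", "B c")));
    }
}
```

Whether this is faster than merging per-file maps at the end depends on how much the threads contend for the same keys, so it is worth measuring both.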
This Java program creates a list of 15 numbers and 3 threads that each search for the maximum in a given interval. I want to create another thread that takes those 3 numbers and gets the maximum, but I don't know how to get those values in the other thread.
public class apple implements Runnable{
String name;
int time, number, first, last, maximum;
int[] array = {12, 32, 54 ,64, 656, 756, 765 ,43, 34, 54,5 ,45 ,6 , 5, 65};
public apple(String s, int f, int l){
name = s;
first = f;
last = l;
maximum = array[0];
}
public void run(){
try{
for(int i = first; i < last; i++ )
{
if(maximum < array[i])
{
maximum = array[i];
}
}
System.out.println("Thread"+ name + "maximum = " + maximum);
}catch(Exception e){}
}
public static void main(String[] args){
Thread t1 = new Thread(new apple("1 ", 0, 5));
Thread t2 = new Thread(new apple("2 ", 5, 10 ));
Thread t3 = new Thread(new apple("3 ", 10, 15));
try{
t1.start();
t2.start();
t3.start();
}catch(Exception e){}
}
}
Here is how ExecutorService and ExecutorCompletionService can solve it:
public class MaxFinder {
private int[] values;
private int threadsCount;
public MaxFinder(int[] values, int threadsCount) {
this.values = values;
this.threadsCount = threadsCount;
}
public int find() throws InterruptedException {
ExecutorService executor = Executors.newFixedThreadPool(threadsCount);
ExecutorCompletionService<Integer> cs = new ExecutorCompletionService<Integer>(executor);
// Split the work
int perThread = values.length / threadsCount;
int from = 0;
for(int i = 0; i < threadsCount - 1; i++) {
cs.submit(new Worker(from, from + perThread));
from += perThread;
}
cs.submit(new Worker(from,values.length));
// Start collecting results as they arrive
int globalMax = values[0];
try {
for(int i = 0; i < threadsCount; i++){
int v = cs.take().get();
if (v > globalMax)
globalMax = v;
}
} catch (ExecutionException e) {
throw new RuntimeException(e);
}
executor.shutdown();
return globalMax;
}
private class Worker implements Callable<Integer> {
private int fromIndex;
private int toIndex;
public Worker(int fromIndex, int toIndex) {
this.fromIndex = fromIndex;
this.toIndex = toIndex;
}
@Override
public Integer call() {
int max = values[0];
for(int i = fromIndex; i<toIndex; i++){
if (values[i] > max)
max = values[i];
}
return max;
}
}
}
In this solution, N threads work concurrently, each on its portion of the array. The caller thread is responsible for gathering the local maximums as they arrive and finding the global maximum. This solution uses some non-trivial concurrency tools from the java.util.concurrent package.
If you prefer a solution that only uses primitive synchronization tools, then you should use a synchronized block in the worker threads that sets the maximum in some data member and then notifies the collector thread. The collector thread should be in a loop, waiting for notification and then examining the new number, and updating the global maximum if needed. This "producer-consumer" model requires careful synchronization.
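A simplified sketch of that primitive approach is shown below. It deviates slightly from the description: each worker folds its local maximum into the shared value under the lock, and the collector only waits until the expected number of reports has arrived. MaxCollector and its methods are hypothetical names.

```java
class MaxCollector {
    private int globalMax = Integer.MIN_VALUE;   // guarded by this
    private int reported = 0;                    // guarded by this

    // Called by a worker when its portion is done.
    synchronized void report(int localMax) {
        if (localMax > globalMax) {
            globalMax = localMax;
        }
        reported++;
        notifyAll();   // wake the collector so it can re-check its condition
    }

    // Called by the collector; blocks until the expected number of workers have reported.
    synchronized int awaitMax(int expected) throws InterruptedException {
        while (reported < expected) {
            wait();    // always wait in a loop: wakeups can be spurious
        }
        return globalMax;
    }

    // Splits values into roughly equal chunks, one thread per chunk (parts must be > 0).
    static int findMax(int[] values, int parts) {
        MaxCollector collector = new MaxCollector();
        int chunk = (values.length + parts - 1) / parts;
        int started = 0;
        for (int from = 0; from < values.length; from += chunk) {
            final int lo = from;
            final int hi = Math.min(values.length, from + chunk);
            new Thread(() -> {
                int local = values[lo];
                for (int i = lo + 1; i < hi; i++) {
                    if (values[i] > local) {
                        local = values[i];
                    }
                }
                collector.report(local);
            }).start();
            started++;
        }
        try {
            return collector.awaitMax(started);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while collecting", e);
        }
    }

    public static void main(String[] args) {
        int[] array = {12, 32, 54, 64, 656, 756, 765, 43, 34, 54, 5, 45, 6, 5, 65};
        System.out.println(findMax(array, 3));   // prints 765
    }
}
```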
Based on the code you have, the simplest solution is to join the main thread to each instance thread and then get the max value from them for comparison purposes. Like so:
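// Keep a reference to each apple: the maximum field lives on the Runnable, not on the Thread.
apple a1 = new apple("1 ", 0, 5);
apple a2 = new apple("2 ", 5, 10);
apple a3 = new apple("3 ", 10, 15);
Thread t1 = new Thread(a1);
Thread t2 = new Thread(a2);
Thread t3 = new Thread(a3);
int globalMax = Integer.MIN_VALUE;
try {
    t1.start();
    t2.start();
    t3.start();
    // join() waits for the thread and makes its write to maximum visible here.
    t1.join();
    globalMax = a1.maximum;
    t2.join();
    if (a2.maximum > globalMax) {
        globalMax = a2.maximum;
    }
    t3.join();
    if (a3.maximum > globalMax) {
        globalMax = a3.maximum;
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}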
Instead of implementing Runnable, try implementing Callable, which is capable of returning a result. The tutorial given here is a good source for describing how to do this.
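As an illustration of the Callable route, here is a short sketch that submits one Callable per chunk with ExecutorService.invokeAll and reads each result from its Future. The class name CallableMax and the chunking scheme are assumptions made for this example, not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableMax {
    // Splits values into roughly equal chunks, one Callable per chunk (parts must be > 0).
    static int max(int[] values, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            int chunk = (values.length + parts - 1) / parts;
            for (int from = 0; from < values.length; from += chunk) {
                final int lo = from;
                final int hi = Math.min(values.length, from + chunk);
                tasks.add(() -> {
                    int local = values[lo];
                    for (int i = lo + 1; i < hi; i++) {
                        local = Math.max(local, values[i]);
                    }
                    return local;   // the Callable's result, read later via Future.get()
                });
            }
            int global = Integer.MIN_VALUE;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                global = Math.max(global, future.get());
            }
            return global;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] array = {12, 32, 54, 64, 656, 756, 765, 43, 34, 54, 5, 45, 6, 5, 65};
        System.out.println(max(array, 3));   // prints 765
    }
}
```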
Another approach to your problem could be to create an object with which each apple instance (not sure why you've called it this) registers its maximum. An instance of this new class could be passed into each apple constructor, and each apple could then call a method on it, passing in its own maximum.
For instance:
public class MaximumOfMaximumsFinder implements Runnable {
private List<Integer> maximums = new ArrayList<Integer>();
public void registerSingleMaximum(Integer max) {
maximums.add(max);
}
public void run() {
// use similar logic to find the maximum
}
}
There are several issues around making sure this is coordinated with the other threads. I'll leave those to you, since there are some interesting things to think about.