I have the following code:
private List<String[]> userList2 = new ArrayList<String[]>(10000);
ThreadPoolExecutor executor = new ThreadPoolExecutor(10, 10, 10, TimeUnit.SECONDS,
new ArrayBlockingQueue<Runnable>(5), new ThreadPoolExecutor.CallerRunsPolicy());
Database Query
while (rs.next())
{
data = new String[2];
data[0] = rs.getString("userid");
data[1] = rs.getString("email");
userList2.add(data);
if(userList2.size()==10000) //Confusion in this part..
{
final List<String[]> elist = new ArrayList<String[]>(userList2);
executor.execute(new Runnable() {
public void run() {
doBilling(con,elist); //Parallel is not happening here...
}
});
I have a method:
void doBilling(Connection con, List<String[]> userList)
{
String[] list = null;
String userid = " ";
for (int i = 0; i < userList.size(); i++)
{
list = userList.get(i);
userid = list[0]; // the first element of each entry is the userid
list = BillingDao.billById(userid, con);
}
}
When userList2 reaches 10000 entries, I want to run doBilling in 10 threads in parallel so that the 10000 records are processed quickly. But that's not happening. Please suggest what I am doing wrong and how this should be solved.
Thanks in advance
You have submitted only one task to the executor, and that happens when if(userList2.size()==10000) is true. Consequently, you have one thread processing all 10000 elements of the ArrayList.
If you want 10 threads to process the 10000 elements, your code should look something like this:
while (rs.next())
{
data = new String[2];
data[0] = rs.getString("userid");
data[1] = rs.getString("email");
userList2.add(data);
if(userList2.size()% 1000 == 0) //Check if size is multiple of 1000(obtained by 10000/10)
{
final List<String[]> elist = new ArrayList<String[]>(userList2);
executor.execute(new Runnable()
{
public void run()
{
doBilling(con,elist); // each batch is copied into a new ArrayList and handed to a separate task
}
});
userList2.clear(); // clear the list so that the next submitted batch contains only new elements
}
}
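To make the batching idea concrete, here is a minimal sketch, assuming the rows are already in memory. The class name ChunkedSubmitDemo, the method submitInBatches, and the counter that stands in for doBilling are all hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkedSubmitDemo {
    // Splits items into batches of batchSize, submits one task per batch,
    // waits for completion, and returns the number of tasks submitted.
    static int submitInBatches(List<String> items, int batchSize, int threads,
                               AtomicInteger processed) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int tasks = 0;
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the slice so that each task owns its own data.
            final List<String> batch = new ArrayList<>(items.subList(from, to));
            pool.execute(() -> {
                for (int i = 0; i < batch.size(); i++) {
                    processed.incrementAndGet(); // hypothetical stand-in for doBilling
                }
            });
            tasks++;
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<String> users = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            users.add("user" + i);
        }
        AtomicInteger processed = new AtomicInteger();
        int tasks = submitInBatches(users, 1_000, 10, processed);
        System.out.println(tasks + " tasks, " + processed.get() + " items");
    }
}
```

With one batch per task, the number of tasks (and so the achievable parallelism) is the list size divided by the batch size, which is why a single 10000-element batch ends up on one thread.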
I'm writing a console application to read json files and then do some processing with them. I have 200k json files to process, so I'm creating a thread per file. But I would like to have only 30 active threads running. I don't know how to control it in Java.
This is the piece of code I have so far:
for (String jsonFile : result) {
final String jsonFilePath = jsonFile;
Thread thread = new Thread(new Runnable() {
String filePath = jsonFilePath;
@Override
public void run() {
// Do stuff here
}
});
thread.start();
}
result is an array with the paths of the 200k files. From this point, I'm not sure how to control it. I thought about keeping a List<Thread> and having each thread notify when it finishes so it can be removed from the list. But then I would have to make the main thread sleep and wake up again, which feels weird.
How can I achieve this?
I would suggest not creating one thread per file. Threads are a limited resource. Creating too many can lead to starvation or even cause the program to abort.
From the information provided, I would use a ThreadPoolExecutor. Constructing such an executor with a limited number of threads is simple thanks to Executors::newFixedThreadPool:
ExecutorService service = Executors.newFixedThreadPool(30);
Looking at the ExecutorService interface, the method <T> Future<T> submit(Callable<T> task) might be a good fit.
For this, some changes are necessary. The tasks (currently a Runnable in the given implementation) must be converted to Callable<T>, where T is the return type. The returned Future<T> objects should then be collected into a list and waited on. Once all Futures have completed, the result list can be built, e.g. with a stream.
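A minimal sketch of that approach, assuming each file yields an Integer result. The class BoundedFileProcessing and the methods processFile and processAll are hypothetical names used only for illustration; the pool is created with the JDK's Executors.newFixedThreadPool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedFileProcessing {
    // Hypothetical per-file work; here it only returns the length of the path.
    static Integer processFile(String path) {
        return path.length();
    }

    // Runs processFile for every path on at most `threads` threads and
    // returns the results in the same order as the input.
    static List<Integer> processAll(List<String> paths, int threads) {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String path : paths) {
                Callable<Integer> task = () -> processFile(path);
                futures.add(service.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks until this task is done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            service.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.json", "bb.json", "ccc.json");
        System.out.println(processAll(files, 30));
    }
}
```

The pool size caps the number of active threads, while the remaining tasks wait in the executor's internal queue, so the main thread never has to sleep and poll.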
With parallel streams and a ForkJoinPool you may get more straightforward code, plus an easy way to collect the results of your files after processing. For parallel processing, I use Threads directly only as a last resort, when parallelStream can't be used.
boolean doStuff( String file){
// do your magic here
System.out.println( "The file " + file + " has been processed." );
// return the status of the processed file
return true;
}
List<String> jsonFiles = new ArrayList<String>();
jsonFiles.add("file1");
jsonFiles.add("file2");
jsonFiles.add("file3");
...
jsonFiles.add("file200000");
ForkJoinPool forkJoinPool = null;
try {
final int parallelism = 30;
forkJoinPool = new ForkJoinPool(parallelism);
forkJoinPool.submit(() ->
jsonFiles.parallelStream()
.map( jsonFile -> doStuff( jsonFile) )
.collect(Collectors.toList()) // you can collect this into a List<Boolean> of results
).get();
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
} finally {
if (forkJoinPool != null) {
forkJoinPool.shutdown();
}
}
Put your jobs (filenames) into a queue, start 30 threads to process them, then wait until all threads are done. For example:
static ConcurrentLinkedDeque<String> jobQueue = new ConcurrentLinkedDeque<String>();
private static class Worker implements Runnable {
int threadNumber;
public Worker(int threadNumber) {
this.threadNumber = threadNumber;
}
public void run() {
try {
System.out.println("Thread " + threadNumber + " started");
while (true) {
// get the next filename from job queue
String fileName;
try {
fileName = jobQueue.pop();
} catch (NoSuchElementException e) {
// The queue is empty, exit the loop
break;
}
System.out.println("Thread " + threadNumber + " processing file " + fileName);
Thread.sleep(1000); // do something useful here
System.out.println("Thread " + threadNumber + " finished file " + fileName);
}
System.out.println("Thread " + threadNumber + " finished");
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
public static void main(String[] args) throws InterruptedException {
// Create dummy filenames for testing:
for (int i = 1; i <= 200; i++) {
jobQueue.push("Testfile" + i + ".json");
}
System.out.println("Starting threads");
// Create 30 worker threads
List<Thread> workerThreads = new ArrayList<Thread>();
for (int i = 1; i <= 30; i++) {
Thread thread = new Thread(new Worker(i));
workerThreads.add(thread);
thread.start();
}
// Wait until the threads are all finished
for (Thread thread : workerThreads) {
thread.join();
}
System.out.println("Finished");
}
}
I want to hash every item in an ArrayList and return the results to main. For example, I have this:
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
public class hash extends Thread {
private Thread t = null;
private MessageDigest md = null;
private StringBuffer sb = null;
private String message = null;
private ArrayList<String> list = null;
private int count = 0;
public hash(ArrayList<String> list) {
this.list = list;
}
public final void mdstart() {
for(String item:list){
this.message=item;
if(t==null){
t = new Thread(this);
t.start();
}
count++;
}
System.out.println("end: "+this.count);
}
@Override
public final void run() {
System.out.println("run: "+this.count);
try {
md = MessageDigest.getInstance("MD5");
md.update(this.message.getBytes());
byte[] digest = md.digest();
sb = new StringBuffer();
for (byte hash : digest) {
sb.append(String.format("%02x", hash));
}
System.out.println(this.message + " : " + sb.toString());
} catch (NoSuchAlgorithmException ex) {
System.out.println("no such algorithm exception : md5");
System.exit(1);
}
}
public static void main(String args[]) {
ArrayList<String> list = new ArrayList<>();
for (int i = 0; i < 10; i++) {
list.add("message" + i);
}
new hash(list).mdstart();
}
}
and the output is:
end: 10
run: 10
message9 : 99f72d2de922c1f14b0ba5e145f06544
which means that the program ran only one thread, for the last message, instead of the ten I expected.
You are storing your Thread in t, which is null at the start, but after the first Thread is created it is no longer null, so your if check fails and no new Thread is created. Then you modify the message while this Thread runs... Honestly, the whole thing is a mess. Even if you did create 10 Threads, they would all point to the same hash object, whose message field keeps changing regardless of whether any Thread has finished its work.
For example, the following could happen:
You start the Thread for first message
Thread has not yet run, but your for loop already sets the message to the 2nd one
The thread runs and calculates the message for the 2nd message
Message 3 is set. Nothing happens since Thread is already finished.
Message 4 is set, again, nothing happens as the Thread is done
etc.
To fix it:
Remove the list variable from hash.
Create a new hash() object for each hashcode/message
Start a new thread for each hash() object. Then it should work.
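The steps above could be sketched roughly as follows. HashTask and hashAll are hypothetical names, and the sketch assumes UTF-8 encoding for the message bytes, which the original code does not specify.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HashTask implements Runnable {
    private final String message; // each task owns exactly one message
    private String hex;           // written by run(), read after join()

    HashTask(String message) {
        this.message = message;
    }

    @Override
    public void run() {
        try {
            // A fresh MessageDigest per task, so no state is shared between threads.
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            hex = sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    String getHex() {
        return hex;
    }

    // Starts one thread per message, waits for all of them, and returns
    // the hex digests in input order.
    static List<String> hashAll(List<String> messages) {
        List<HashTask> tasks = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (String m : messages) {
            HashTask task = new HashTask(m);
            Thread thread = new Thread(task);
            tasks.add(task);
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // join() makes the task's result visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        List<String> result = new ArrayList<>();
        for (HashTask task : tasks) {
            result.add(task.getHex());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(hashAll(Arrays.asList("message0", "message1")));
    }
}
```

Because every task has its own message and its own MessageDigest, no synchronization is needed beyond the join() calls.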
The problem is that your code will start only one thread:
if(t==null){
t = new Thread(this);
t.start();
}
Once the first thread is started, t is no longer null, so no new threads would be created.
To fix this problem, make an array of threads, and set threads[count++] to the newly created thread in your loop. Once the loop is over, you can join your threads to make sure they all finish before mdstart() returns:
public final void mdstart() {
Thread[] threads = new Thread[list.size()];
for(String item:list){
this.message=item;
threads[count] = new Thread(this);
threads[count].start();
count++;
}
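try {
for (Thread t : threads) {
t.join();
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // join() can be interrupted; restore the flag
}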
System.out.println("end: "+this.count);
}
Note: This will fix the start-up portion of your code. You would need to deal with synchronization issues in your run() method to complete the fix.
I'm trying to create tasks in an ExecutorService that need to write to a file in order, so if there are 33 tasks, they need to write in order.
I've tried to use LinkedBlockingQueue and ReentrantLock to guarantee the order, but as far as I understand, in fair mode the lock is granted to the youngest of the x threads the ExecutorService has created.
private final static Integer cores = Runtime.getRuntime().availableProcessors();
private final ReentrantLock lock = new ReentrantLock(false);
private final ExecutorService taskExecutor;
In the constructor:
taskExecutor = new ThreadPoolExecutor
(cores, cores, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
Each task then processes one quota (chunk) of the input file:
if(s.isConverting()){
if(fileLineNumber%quote > 0) tasks = (fileLineNumber/quote)+1;
else tasks = (fileLineNumber/quote);
for(int i = 0 ; i<tasks || i<1 ; i++){
taskExecutor.execute(new ConversorProcessor(lock,this,i));
}
}
Each task does the following:
public void run() {
getFileQuote();
resetAccumulators();
process();
writeResult();
}
My problem occurs here:
private void writeResult() {
lock.lock();
try {
BufferedWriter bw = new BufferedWriter(new FileWriter("/tmp/conversion.txt",true));
Integer index = -1;
if(i == 0){
bw.write("ano dia tmin tmax tmed umid vento_vel rad prec\n");
}
while(index++ < getResult().size()-1){
bw.write(getResult().get(index) + "\n");
}
if(i == controller.getTasksNumber()){
bw.write(getResult().get(getResult().size()-1));
}
else{
bw.write(getResult().get(getResult().size()-1) + "\n");
}
bw.close();
} catch (IOException ex) {
Logger.getLogger(ConversorProcessor.class.getName()).log(Level.SEVERE, null, ex);
} finally {
lock.unlock();
}
}
It appears to me that everything needs to be done concurrently except the writing of the output to file, and this must be done in the object creation order.
I would take the code that writes to the file, the writeResult() method, out of your threading code. Instead, create Futures that return the Strings produced by the process() method, and load the Futures into an ArrayList<Future<String>>. You can then iterate through the ArrayList in a for loop, calling get() on each Future and writing the result to your text file with your BufferedWriter or PrintWriter.
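A minimal sketch of this idea, assuming each task can return its output as a String. OrderedWriteDemo, processChunk and processAndWrite are hypothetical names, and the Writer parameter stands in for the BufferedWriter on the output file.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderedWriteDemo {
    // Hypothetical stand-in for the per-task processing step.
    static String processChunk(int index) {
        return "chunk-" + index;
    }

    // Processes the chunks concurrently, but writes the results from the
    // calling thread in submission order.
    static void processAndWrite(int tasks, int threads, Writer out) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int index = i;
                Callable<String> task = () -> processChunk(index);
                futures.add(pool.submit(task));
            }
            // The list preserves submission order, so output order is fixed
            // even if a later task finishes before an earlier one.
            for (Future<String> future : futures) {
                out.write(future.get());
                out.write("\n");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IOException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        processAndWrite(5, 3, out);
        System.out.print(out);
    }
}
```

Since only one thread ever touches the writer, no lock is needed for the file at all.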
Good evening,
I have a List of about 500 different URLs whose content I get with this method:
public static String getWebContent(URL url){
// create URL, build HTTPConnection, getContent of page
}
After this, I have another method that extracts values from the content.
At the moment I do it like this:
List<URL> urls = new ArrayList<>();
List<String> webcontents = new ArrayList<>();
for(URL url : urls){
webcontents.add(getWebContent(url));
}
// Further methods to extract values from the webcontents
But it takes a lot of time, because only one Thread is doing the work. I want to make it multithreaded, but I am not sure what the best way to do it is.
First, I need the return value of every Thread. Should I implement Callable instead of Runnable for that?
And how do I run the method with different Threads? Should one start at index 0, one at index 50, etc.? And when they are done with one URL, do they set a flag to true? That would be my approach, but I don't think it is very efficient. If the first website has a lot of content, the first Thread might take much longer than the others.
And when every Thread is done, how can I get my data back into one list? Like this?
List<String> webcontent = new ArrayList<>();
if(!t1.isAlive() && !t2.isAlive()){
webcontent.add(t1.getData());
webcontent.add(t2.getData());
}
I hope you can understand my problem and can give me a tip :) Many thanks
You could use an ExecutorCompletionService to retrieve your tasks as they complete.
List<URL> urls = ...; // Create this list somehow
ExecutorCompletionService<String> service =
new ExecutorCompletionService<String>(Executors.newFixedThreadPool(10));
for (URL url: urls) {
service.submit(new GetWebContentCallable(url)); // you need to define the GetWebContentCallable
}
int remainingTasks = urls.size();
while (remainingTasks > 0) {
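String nextResult = service.take().get(); // take() returns a Future<String>; get() can throw InterruptedException or ExecutionException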
processResult(nextResult); // you define processResult
remainingTasks -= 1;
}
Perhaps you could try something like:
public static List<String> getWebContents(final int threads, final URL... urls){
final List<Future<String>> futures = new LinkedList<>();
final ExecutorService service = Executors.newFixedThreadPool(threads);
Arrays.asList(urls).forEach(
url -> {
final Callable<String> callable = () -> {
try{
return getWebContent(url);
}catch(IOException ex){
ex.printStackTrace();
return null;
}
};
futures.add(service.submit(callable));
}
);
final List<String> contents = new LinkedList<>();
futures.forEach(
future -> {
try{
contents.add(future.get());
}catch(Exception ex){
ex.printStackTrace();
}
}
);
service.shutdown();
return contents;
}
Or, if you're not using Java 8:
public static List<String> getWebContents(final int threads, final URL... urls){
final List<Future<String>> futures = new LinkedList<Future<String>>();
final ExecutorService service = Executors.newFixedThreadPool(threads);
for(final URL url : urls){
final Callable<String> callable = new Callable<String>(){
public String call(){
try{
return getWebContent(url);
}catch(IOException ex){
ex.printStackTrace();
return null;
}
}
};
futures.add(service.submit(callable));
}
final List<String> contents = new LinkedList<String>();
for(final Future<String> future : futures){
try{
contents.add(future.get());
}catch(Exception ex){
ex.printStackTrace();
}
}
service.shutdown();
return contents;
}
Instead of retrieving values from working threads, let working threads put results in a resulting collection (be it List<String> webcontent or anything else). Note this may require synchronization.
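As a rough sketch of that suggestion, the workers below add their results to a list wrapped with Collections.synchronizedList. SharedResultsDemo, fetch and fetchAll are hypothetical names, and fetch only simulates downloading a page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedResultsDemo {
    // Hypothetical stand-in for getWebContent(url).
    static String fetch(String url) {
        return "content of " + url;
    }

    // Every worker appends its result to one shared, synchronized list.
    // Results arrive in completion order, not in input order.
    static List<String> fetchAll(List<String> urls, int threads) {
        List<String> results = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String url : urls) {
            pool.execute(() -> results.add(fetch(url)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("timed out waiting for workers");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList("u1", "u2", "u3"), 2));
    }
}
```

The trade-off compared with collecting Futures is that the order of the results no longer matches the order of the URLs.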
I am new to multithreading and synchronization in Java. I am trying to solve a task in which I am given 5 files, each to be read by its own thread. Every thread should read one line from its file, then hand execution to the next thread, and so on. When all 5 threads have read their first line, thread 1 starts again with line 2 of file 1, and so on.
Thread ReadThread1 = new Thread(new ReadFile(0));
Thread ReadThread2 = new Thread(new ReadFile(1));
Thread ReadThread3 = new Thread(new ReadFile(2));
Thread ReadThread4 = new Thread(new ReadFile(3));
Thread ReadThread5 = new Thread(new ReadFile(4));
// starting all the threads
ReadThread1.start();
ReadThread2.start();
ReadThread3.start();
ReadThread4.start();
ReadThread5.start();
In ReadFile (which implements Runnable), in the run method, I am trying to synchronize on the BufferedReader object:
BufferedReader br = null;
String sCurrentLine;
String filename="Source/"+files[fileno];
br = new BufferedReader(new FileReader(filename));
synchronized(br)
{
while ((sCurrentLine = br.readLine()) != null) {
int f=fileno+1;
System.out.print("File No."+f);
System.out.println("-->"+sCurrentLine);
br.notifyAll();
// something needs to be done here, I guess
}}
Need Help
This is not an ideal scenario for multithreading, but since it is an assignment, here is one solution that works. The threads execute sequentially, and there are a few points to note:
The current thread cannot read its next line until the thread immediately before it is done, since the files are supposed to be read in round-robin fashion.
After the current thread has read its line, it must notify the next thread; otherwise that thread will wait forever.
I have tested this code with some files in the temp package, and it read the lines in round-robin fashion. I believe Phaser can also be used to solve this problem.
public class FileReaderRoundRobinNew {
public Object[] locks;
private static class LinePrinterJob implements Runnable {
private final Object currentLock;
private final Object nextLock;
BufferedReader bufferedReader = null;
public LinePrinterJob(String fileToRead, Object currentLock, Object nextLock) {
this.currentLock = currentLock;
this.nextLock = nextLock;
try {
this.bufferedReader = new BufferedReader(new FileReader(fileToRead));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
@Override
public void run() {
/*
* Few points to be noted:
* 1. Current thread cannot move ahead to read the line in the file until and unless its immediately previous thread is done as they are supposed to read in round-robin fashion.
* 2. After current thread is done reading the line it must notify the other thread else that thread will wait forever.
* */
String currentLine;
synchronized(currentLock) {
try {
while ( (currentLine = bufferedReader.readLine()) != null) {
try {
currentLock.wait();
System.out.println(currentLine);
}
catch(InterruptedException e) {}
synchronized(nextLock) {
nextLock.notify();
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
synchronized(nextLock) {
nextLock.notify(); /// Ensures all threads exit at the end
}
}
}
public FileReaderRoundRobinNew(int numberOfFilesToRead) {
locks = new Object[numberOfFilesToRead];
int i;
String fileLocation = "src/temp/";
//Initialize lock instances in array.
for(i = 0; i < numberOfFilesToRead; ++i) locks[i] = new Object();
//Create threads
int j;
for(j=0; j<(numberOfFilesToRead-1); j++ ){
Thread linePrinterThread = new Thread(new LinePrinterJob(fileLocation + "Temp" + j,locks[j],locks[j+1]));
linePrinterThread.start();
}
Thread lastLinePrinterThread = new Thread(new LinePrinterJob(fileLocation + "Temp" + j,locks[numberOfFilesToRead-1],locks[0]));
lastLinePrinterThread.start();
}
public void startPrinting() {
synchronized (locks[0]) {
locks[0].notify();
}
}
public static void main(String[] args) {
FileReaderRoundRobinNew fileReaderRoundRobin = new FileReaderRoundRobinNew(4);
fileReaderRoundRobin.startPrinting();
}
}
If the only objective is to read the files round by round, and not strictly in the same order each round, then we can also use Phaser. In this case the order in which files are read is not always the same: for example, with four files (F1, F2, F3 and F4), the first phase may read them as F1-F2-F3-F4 and the next as F2-F1-F4-F3. I am still including this solution for the sake of completeness.
public class FileReaderRoundRobinUsingPhaser {
final List<Runnable> tasks = new ArrayList<>();
final int numberOfLinesToRead;
private static class LinePrinterJob implements Runnable {
private BufferedReader bufferedReader;
public LinePrinterJob(BufferedReader bufferedReader) {
this.bufferedReader = bufferedReader;
}
@Override
public void run() {
String currentLine;
try {
currentLine = bufferedReader.readLine();
System.out.println(currentLine);
} catch (IOException e) {
e.printStackTrace();
}
}
}
public FileReaderRoundRobinUsingPhaser(int numberOfFilesToRead, int numberOfLinesToRead) {
this.numberOfLinesToRead = numberOfLinesToRead;
String fileLocation = "src/temp/";
for(int j=0; j<numberOfFilesToRead; j++ ){ // create one task per file, including the last one
try {
tasks.add(new LinePrinterJob(new BufferedReader(new FileReader(fileLocation + "Temp" + j))));
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
public void startPrinting( ) {
final Phaser phaser = new Phaser(1){
@Override
protected boolean onAdvance(int phase, int registeredParties) {
System.out.println("Phase Number: " + phase +" Registered parties: " + getRegisteredParties() + " Arrived: " + getArrivedParties());
return ( phase >= numberOfLinesToRead || registeredParties == 0);
}
};
for(Runnable task : tasks) {
phaser.register();
new Thread(() -> {
do {
phaser.arriveAndAwaitAdvance();
task.run();
} while(!phaser.isTerminated());
}).start();
}
phaser.arriveAndDeregister();
}
public static void main(String[] args) {
FileReaderRoundRobinUsingPhaser fileReaderRoundRobin = new FileReaderRoundRobinUsingPhaser(4, 4);
fileReaderRoundRobin.startPrinting();
// Files will be accessed in round robin fashion but not exactly in same order always. For example it can read 4 files as 1234 then 1342 or 1243 etc.
}
}
The above example can be modified as per exact requirement. Here the constructor of FileReaderRoundRobinUsingPhaser takes the number of files and number of lines to read from each file. Also the boundary conditions need to be taken into consideration.
You are missing many parts of the puzzle:
you attempt to synchronize on an object local to each thread. This can have no effect and the JVM may even remove the whole locking operation;
you execute notifyAll without a matching wait;
the missing wait must be at the top of the run method, not at the bottom as you indicate.
Altogether, I'm afraid that fixing your code at this point is beyond the scope of one StackOverflow answer. My suggestion is to first familiarize yourself with the core concepts: the semantics of locks in Java, how they interoperate with wait and notify, and the precise semantics of those methods. An Oracle tutorial on the subject would be a nice start.
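To illustrate how a shared lock, wait and notifyAll fit together, here is a minimal sketch of turn-based round-robin. It is not a fix of the code in the question: it assumes in-memory lists of lines instead of files, and the names RoundRobinTurns and interleave are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RoundRobinTurns {
    // Runs one thread per source; the threads take turns appending one line
    // each to the shared output, in strict round-robin order.
    static List<String> interleave(List<List<String>> sources) {
        final int n = sources.size();
        int maxLines = 0;
        for (List<String> source : sources) {
            maxLines = Math.max(maxLines, source.size());
        }
        final int rounds = maxLines;
        final Object monitor = new Object();            // one lock shared by all threads
        final int[] turn = {0};                         // guarded by monitor
        final List<String> output = new ArrayList<>();  // guarded by monitor

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            final List<String> lines = sources.get(i);
            Thread thread = new Thread(() -> {
                for (int round = 0; round < rounds; round++) {
                    synchronized (monitor) {
                        try {
                            // Always wait in a loop that re-checks the condition.
                            while (turn[0] % n != id) {
                                monitor.wait();
                            }
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                        if (round < lines.size()) {
                            output.add(lines.get(round));
                        }
                        turn[0]++;           // pass the turn to the next thread
                        monitor.notifyAll(); // wake the waiters so they re-check
                    }
                }
            });
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return output;
    }

    public static void main(String[] args) {
        List<List<String>> sources = Arrays.asList(
                Arrays.asList("a1", "a2"),
                Arrays.asList("b1"),
                Arrays.asList("c1", "c2"));
        System.out.println(interleave(sources));
    }
}
```

The key points are that every thread synchronizes on the same object, that the wait sits before the work inside a condition loop, and that each thread notifies after advancing the turn. A source that runs out of lines still takes its (empty) turn so the remaining threads are never left waiting.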