I have an app that creates multiple endless threads. Each thread reads some info, and I create some tasks using a thread pool (which works fine).
I have added functions that build ArrayLists; when one finishes, it sends its list to a new thread that saves the list to a file. I implemented the saving in 3 ways and only one of them succeeds. I would like to know why the other 2 do not.
Case 1: I create a thread (via new Thread(Runnable)) and give it the list and the name of the file. In the runnable's constructor I create the PrintWriter and save the file. It runs without any problems (I have 1-10 file-save threads running in parallel).
Case 2: If I place the save code outputStream.println(aLog); in the run() method, it is never reached, and after the constructor finishes the thread exits.
Case 3: I submit the created runnables (file save) to a thread pool (the saving code is in the run() method). When I send just 1 task (1 file to save), all is fine. When more than 1 task is added to the pool (very quickly), exceptions are thrown (while debugging I can see that all the needed info is available) and some of the files are not saved.
Can someone explain the difference in behavior?
Thanks
Please see the code below. It starts with a function that is part of an endless-thread class that also places some tasks in the pool; the pool is created in the endless thread:
ExecutorService iPool = Executors.newCachedThreadPool();
private void logRate(double r1,int ind){
historicalData.clear();
for (int i = 499; i>0; i--){
// some Code
Data.add(0,array1[ind][i][0] + "," + array1[ind][i][1] + "," +
array1[ind][i][2] + "," + array1[ind][i][3] + "," +
array2[ind][i] + "\n" );
}
// first item
array1[ind][0][0] = r1;
array1[ind][0][1] = array1[ind][0][0] ;
array1[ind][0][2] = array1[ind][0][0] ;
array2[ind][0] = new SimpleDateFormat("HH:mm:ss yyyy_MM_dd").format(today);
Data.add(0,r1+","+r1+","+r1+","+r1+ "," + array2[ind][0] + '\n') ;
// save the log send it to the pool (this is case 3)
//iPool.submit(new FeedLogger(fName,Integer.toString(ind),Data));
// Case 1 and 2
Thread fl = new Thread(new FeedLogger(fName,Integer.toString(ind),Data)) ;
}
Here is the FeedLogger class:
public class FeedLogger implements Runnable{
private List<String> fLog = new ArrayList<>() ;
PrintWriter outputStream = null;
String asName,asPathName;
public FeedLogger(String aName,String ind, List<String> fLog) {
this.fLog = fLog;
this.asName = aName;
try {
asPathName = System.getProperty("user.dir") + "\\AsLogs\\" + asName + "\\Feed" + ind
+ ".log" ;
outputStream = new PrintWriter(new FileWriter(asPathName));
outputStream.println(fLog); // Case 1 all is fine
outputStream.flush(); // Case 1 all is fine
outputStream.close(); // Case 1 all is fine
}
catch (Exception ex) {
JavaFXApplication2.logger.log(Level.SEVERE, null,asName + ex.getMessage());
}
}
@Override
public void run()
{
try{
// Case 2: this line is never reached. Case 3 (as a pool task): an exception is thrown when there are multiple tasks.
outputStream.println(fLog);
outputStream.flush();
}
catch (Exception e) {
System.out.println("err in file save e=" + e.getMessage() + asPathName + " feed size=" +
fLog.size());
JavaFXApplication2.logger.log(Level.ALL, null,asName + e.getMessage());
}
finally {if (outputStream != null) {outputStream.close();}}
}
}
You need to call start() on a Thread instance to make it actually do something.
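To illustrate the point about start(), here is a minimal sketch. StartDemo and ranAfter are hypothetical names used only for this example; they are not part of the question's code. The Runnable only flips a flag, so the result shows whether run() was ever executed.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class: shows that a Thread's Runnable only runs after start().
class StartDemo {
    // Creates a thread, optionally starts it, and reports whether run() was executed.
    static boolean ranAfter(boolean callStart) {
        AtomicBoolean ran = new AtomicBoolean(false);
        Thread t = new Thread(() -> ran.set(true));
        if (callStart) {
            t.start(); // schedules run() on the new thread
            try {
                t.join(); // wait until the thread has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // Without start(), only the constructor code has run; run() was never called.
        return ran.get();
    }

    public static void main(String[] args) {
        System.out.println("without start(): " + ranAfter(false));
        System.out.println("with start():    " + ranAfter(true));
    }
}
```

This matches Case 1 and Case 2 in the question: the constructor executes on the calling thread, so its file writing works, while run() is skipped because the thread is never started. For Case 3, one possible cause (an assumption, since the exception text is not shown) is that logRate keeps modifying the same list instance while a pooled task is printing it; giving each task its own copy, for example new ArrayList<>(Data), would rule that out.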
I'm writing a console application to read json files and then do some processing with them. I have 200k json files to process, so I'm creating a thread per file. But I would like to have only 30 active threads running. I don't know how to control it in Java.
This is the piece of code I have so far:
for (String jsonFile : result) {
final String jsonFilePath = jsonFile;
Thread thread = new Thread(new Runnable() {
String filePath = jsonFilePath;
@Override
public void run() {
// Do stuff here
}
});
thread.start();
}
result is an array with the paths of the 200k files. From this point, I'm not sure how to control it. I thought about keeping a List<Thread> and having each thread notify when it finishes so that it can be removed from the list. But then I would have to make the main thread sleep and wake up, which feels weird.
How can I achieve this?
I would suggest to not create one thread per file. Threads are limited resources. Creating too many can lead to starvation or even program abortion.
From the information provided, I would use a ThreadPoolExecutor. Constructing such an executor with a limited number of threads is quite simple thanks to Executors::newFixedThreadPool:
ExecutorService service = Executors.newFixedThreadPool(30);
Looking at the ExecutorService interface, the method <T> Future<T> submit(Callable<T> task) might be fitting.
For this, some changes will be necessary. The tasks (i.e. what is currently a Runnable in the given implementation) must be converted to a Callable<T>, where T should be substituted with the return type. The returned Future<T> objects should then be collected into a list and waited upon. When all Futures have completed, the result list can be constructed, e.g. through streaming.
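A minimal sketch of that approach follows. JsonBatch, process and processAll are hypothetical names, and process is only a placeholder (it returns the length of the path) standing in for the real JSON handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs one Callable per file on a pool with a fixed number of threads.
class JsonBatch {
    // Placeholder for the per-file work; the real code would parse the JSON file here.
    static Integer process(String path) {
        return path.length();
    }

    static List<Integer> processAll(List<String> paths, int threads) {
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            // Submit every task; at most `threads` of them run at the same time.
            List<Future<Integer>> futures = new ArrayList<>();
            for (String path : paths) {
                Callable<Integer> task = () -> process(path);
                futures.add(service.submit(task));
            }
            // Collect the results in submission order; get() blocks until the task is done.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            service.shutdown(); // lets the pool threads exit once the queue is empty
        }
    }
}
```

Calling JsonBatch.processAll(paths, 30) keeps at most 30 threads busy while the remaining tasks wait in the executor's queue, so no manual bookkeeping of running threads is needed.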
With parallelStreams and a ForkJoinPool you may get more straightforward code, plus an easy way to collect the results of your files after processing. For parallel processing, I use Threads directly only as a last resort, when parallelStream can't be used.
boolean doStuff( String file){
// do your magic here
System.out.println( "The file " + file + " has been processed." );
// return the status of the processed file
return true;
}
List<String> jsonFiles = new ArrayList<String>();
jsonFiles.add("file1");
jsonFiles.add("file2");
jsonFiles.add("file3");
...
jsonFiles.add("file200000");
ForkJoinPool forkJoinPool = null;
try {
final int parallelism = 30;
forkJoinPool = new ForkJoinPool(parallelism);
forkJoinPool.submit(() ->
jsonFiles.parallelStream()
.map( jsonFile -> doStuff( jsonFile) )
.collect(Collectors.toList()) // you can collect this to a List<Boolean> results
).get();
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
} finally {
if (forkJoinPool != null) {
forkJoinPool.shutdown();
}
}
Put your jobs (filenames) into a queue, start 30 threads to process them, then wait until all threads are done. For example:
static ConcurrentLinkedDeque<String> jobQueue = new ConcurrentLinkedDeque<String>();
private static class Worker implements Runnable {
int threadNumber;
public Worker(int threadNumber) {
this.threadNumber = threadNumber;
}
public void run() {
try {
System.out.println("Thread " + threadNumber + " started");
while (true) {
// get the next filename from job queue
String fileName;
try {
fileName = jobQueue.pop();
} catch (NoSuchElementException e) {
// The queue is empty, exit the loop
break;
}
System.out.println("Thread " + threadNumber + " processing file " + fileName);
Thread.sleep(1000); // do something useful here
System.out.println("Thread " + threadNumber + " finished file " + fileName);
}
System.out.println("Thread " + threadNumber + " finished");
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
public static void main(String[] args) throws InterruptedException {
// Create dummy filenames for testing:
for (int i = 1; i <= 200; i++) {
jobQueue.push("Testfile" + i + ".json");
}
System.out.println("Starting threads");
// Create 30 worker threads
List<Thread> workerThreads = new ArrayList<Thread>();
for (int i = 1; i <= 30; i++) {
Thread thread = new Thread(new Worker(i));
workerThreads.add(thread);
thread.start();
}
// Wait until the threads are all finished
for (Thread thread : workerThreads) {
thread.join();
}
System.out.println("Finished");
}
}
I have a desktop application. When it freezes for some minutes, a monitoring thread detects the freeze and starts dumping the stack traces of all threads into a temporary file (this is done in a native call so that JVM_DumpAllStacks can be invoked). After the native call, the temporary file is read as a String and logged through the application's own logging framework.
The problem is that after all this processing, I am not able to restore System.out to the console stream.
This is better explained in the code below.
public String getAllStackTraces() {
System.out.println("This will be printed in CONSOLE");
// This is NECESSARY for the jvm to dump stack traces in specific file which we are going to set in System.setOut call.
System.out.close();
File tempFile = File.createTempFile("threadDump",null,new File(System.getProperty("user.home")));
System.setOut(new PrintStream(new BufferedOutputStream(new FileOutputStream(tempFile))));
//This native call dumps stack traces all threads to tempFile
callNativeMethodToDumpAllThreadStackTraces();
String stackTraces = readFileAsString(tempFile);
//close the tempFile PrintStream so as the next PrintStream object to set as 'out' and to take effect in the native side as well
System.out.close();
//Now I want to start printing in the CONSOLE again. How to do it again ?
//The below line does not work as FileDescriptor.out becomes invalid (i.e FileDescriptor.out.fd, handle = -1) after we do System.out.close() where out is PrintStream of console.
//System.setOut(new PrintStream(new BufferedOutputStream(new FileOutputStream(FileDescriptor.out))));
PrintStream standardConsoleOutputStream = magicallyGetTheOutputStream(); // How ???????????
System.setOut(standardConsoleOutputStream);
System.out.println("This will be printed in CONSOLE !ONLY! if we are able to get the new PrintStream of Console again magically");
}
Now, is there a way to magicallyGetTheOutputStream of Console to start printing in the console again ?
Note: The application is running in java 5 and 6.
Consider this code, which stores the original System.out without closing it so that it can be restored later:
//Store, don't close
PrintStream storeForLater = System.out;
//Reassign
System.setOut(setToNew);
...
//Close reassigned
setToNew.close();
//Reset to old
System.setOut(storeForLater);
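The same store-and-restore pattern can be wrapped in a small helper, sketched below. RedirectDemo and capture are hypothetical names, and the in-memory buffer stands in for the temporary file. The sketch only covers the Java-level System.out; whether the native dump follows System.setOut is not something the text above confirms. The helper itself avoids newer language features so that it also compiles on Java 5 and 6.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical helper: temporarily redirects System.out and restores it afterwards.
class RedirectDemo {
    // Runs the action with System.out pointing at a buffer and returns what was printed.
    static String capture(Runnable action) {
        PrintStream original = System.out; // store, do not close
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream temporary = new PrintStream(buffer, true);
        System.setOut(temporary); // reassign
        try {
            action.run();
        } finally {
            System.setOut(original); // reset to the old stream
            temporary.close(); // close only the temporary stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String text = capture(new Runnable() {
            public void run() {
                System.out.print("goes to the buffer");
            }
        });
        System.out.println("Back on the console, captured: " + text);
    }
}
```

Because the original PrintStream is never closed, the console stream stays valid and can be reinstated as often as needed.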
As an alternative to native code, you could call into ThreadMXBean. The returned ThreadInfo objects contain information about Locks held and Locks the thread is waiting for.
public static void dumpThreads(PrintStream out) {
ThreadInfo[] threads = ManagementFactory.getThreadMXBean()
.dumpAllThreads(true, true);
for(final ThreadInfo info : threads) {
out.println("Thread: " + info.getThreadId()
+ "/" + info.getThreadName()
+ " in State " + info.getThreadState().name());
if(info.getLockName() != null) {
out.println("- Waiting on lock: " + info.getLockInfo().toString()
+ " held by " + info.getLockOwnerId()+"/"+info.getLockOwnerName());
}
for(MonitorInfo mi : info.getLockedMonitors()) {
out.println(" Holds a lock on a " + mi.getClassName() +
" from " + mi.getLockedStackFrame().getClassName()+"."+mi.getLockedStackFrame().getMethodName()
+ ": " + mi.getLockedStackFrame().getLineNumber());
}
for(StackTraceElement elm : info.getStackTrace()) {
out.println(" at " + elm.getClassName() + "."
+ elm.getMethodName() + ":"+elm.getLineNumber());
}
out.println();
}
}
I have some old code I am working with, and I'm not too experienced with threads (I mostly work on the front end). Anyway, this Thread.sleep is causing the thread to hang and I'm unsure what to do about it. I thought about using a counter and calling Thread.currentThread().interrupt(), but I'm unsure where to put it or which thread it would interrupt. Here is an excerpt of the dump. As you can see, the thread count is getting pretty high at 1708.
Any advice?
"Thread-1708" prio=6 tid=0x2ceec400 nid=0x2018 waiting on condition [0x36cdf000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
   Locked ownable synchronizers:
        - None
"Thread-1707" prio=6 tid=0x2d16b800 nid=0x215c waiting on condition [0x36c8f000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
   Locked ownable synchronizers:
        - None
@Override
public void run()
{
Connection con = null;
int i = 0;
while (is_running)
{
try
{
con = ConnectionManager.getConnection();
while (!stack.isEmpty())
{
COUNT++;
String line = (String) stack.pop();
getPartMfr(line);
try
{
if (this.mfr != null && !this.mfr.equals(EMPTY_STR))
{
lookupPart(con, line);
}
}
catch (SQLException e)
{
e.printStackTrace();
}
if (COUNT % 1000 == 0)
{
Log log = LogFactory.getLog(this.getClass());
log.info("Processing Count: " + COUNT);
}
}
}
catch (NamingException e)
{
e.printStackTrace();
}
catch (SQLException e)
{
e.printStackTrace();
}
finally
{
try
{
ConnectionManager.close(con);
}
catch (SQLException e)
{
e.printStackTrace();
}
}
try {
Thread.sleep(80);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
this.finished = true;
}
Here is the code that starts the workers. As you can see, it does set is_running to false, but I guess it is missing some threads?
HarrisWorker w[] = new HarrisWorker[WORKER_POOL_SIZE];
try
{
for (int i = 0; i < w.length; i++)
{
w[i] = new HarrisWorker(pw);
w[i].start();
}
pw.println(headers());
File inputDir = new File(HARRIS_BASE);
String files[] = inputDir.list();
for (String file : files)
{
try
{
File f = new File(HARRIS_BASE + File.separator + file);
if (f.isDirectory())
continue;
final String workFile = workDir + File.separator + file;
f.renameTo(new File(workFile));
FileReader fr = new FileReader(workFile);
BufferedReader br = new BufferedReader(fr);
String line = br.readLine();
boolean firstLine = true;
while (line != null)
{
if (firstLine)
{
firstLine = false;
line = br.readLine();
continue;
}
if (line.startsWith(","))
{
line = br.readLine();
continue;
}
// if(line.indexOf("103327-1") == -1)
// {
// line = br.readLine();
// continue;
// }
HarrisWorker.stack.push(line);
line = br.readLine();
}
br.close();
fr.close();
for (int i = 0; i < w.length; i++)
{
w[i].is_running = false;
while (!w[i].finished)
{
Thread.sleep(80);
}
}
move2Processed(file, workFile);
long etime = System.currentTimeMillis();
System.out.println("UNIQUE PARTS TOTAL FOUND: " + HarrisWorker.getFoundCount() + " of " + HarrisWorker.getUniqueCount() + ", "
+ (HarrisWorker.getFoundCount() / HarrisWorker.getUniqueCount()));
System.out.println("Time: " + (etime - time));
}
catch (Exception e)
{
e.printStackTrace();
File f = new File(workDir + File.separator + file);
if (f.exists())
{
f.renameTo(new File(HARRIS_BASE + File.separator + ERROR + File.separator + file));
}
}
}
}
As a direct answer to the question in your title - nowhere. There is nowhere in this code that needs a Thread.interrupt().
The fact that the thread name is Thread-1708 does not necessarily mean there are 1708 threads. One can choose arbitrary names for threads. I usually include the name of the executor or service in the thread name. Maybe 1600 are now long stopped and there are only around a hundred alive. Maybe this particular class starts naming at 1700 to distinguish from other uses.
1708 threads may not be a problem. If you have a multi-threaded server that is serving 2000 connections in parallel, then it is certainly to be expected that there are 2000 threads doing that, along with a bunch of other threads.
You have to understand why the sleep is there and what purpose it serves. It's not there to just hog memory for nothing.
Translating the code to "plaintext" (by the way, it can be greatly simplified by using try-with-resources to acquire and close the connection):
1. Acquire a connection.
2. Use the connection to send (I guess) whatever is in the stack.
3. When failed or finished, wait 80 ms (THIS is your sleep).
4. If the run flag is still set, repeat from step 1.
5. Finish the thread.
Reading through this, it's clear that the sleep is not the problem. The problem is that the run flag is not set to false, so your thread just continues looping, even if it can't get the connection at all; it will simply spend most of its time waiting for the retry. In fact, even if you strip the sleep out completely (instead of interrupting it midway), all you will achieve is that the threads use up more resources. Given that you have a logger but print exceptions to the console via printStackTrace, I would say that you have 2 problems:
1. Something is spawning threads and not stopping them afterwards (not setting their run flag to false when done).
2. You are likely getting exceptions when getting the Connection, but you never see them in the log.
It might be that the thread is supposed to set its own run flag (say, when the stack is drained), but you would have to decide that yourself; that depends on a lot of specifics.
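As one possible shape for a worker that ends on its own once the stack is drained, here is a sketch under stated assumptions. DrainingWorker, FakeConnection and drain are hypothetical names; FakeConnection merely stands in for the real database connection. The sketch also assumes that all lines are queued before the workers start, which is not how the posted code works (it starts the workers first), so treat it as an illustration of the idea rather than a drop-in fix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical worker: drains a shared stack and then terminates without any flag or sleep.
class DrainingWorker extends Thread {
    // Hypothetical stand-in for the database connection; counts how many are still open.
    static class FakeConnection implements AutoCloseable {
        static final AtomicInteger open = new AtomicInteger();
        FakeConnection() { open.incrementAndGet(); }
        @Override public void close() { open.decrementAndGet(); }
    }

    private final ConcurrentLinkedDeque<String> stack;
    int processed; // read by the caller only after join()

    DrainingWorker(ConcurrentLinkedDeque<String> stack) {
        this.stack = stack;
    }

    @Override
    public void run() {
        // try-with-resources closes the connection even if the processing throws.
        try (FakeConnection con = new FakeConnection()) {
            // pollFirst() returns null once the stack is empty, which ends the loop.
            while (stack.pollFirst() != null) {
                processed++; // the real code would look the part up here
            }
        }
        // run() returns here, so the thread terminates by itself.
    }

    // Queues `lines` dummy entries, runs `workers` threads and returns how many entries were handled.
    static int drain(int lines, int workers) {
        ConcurrentLinkedDeque<String> stack = new ConcurrentLinkedDeque<>();
        for (int i = 0; i < lines; i++) {
            stack.push("line" + i);
        }
        List<DrainingWorker> pool = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            DrainingWorker worker = new DrainingWorker(stack);
            pool.add(worker);
            worker.start();
        }
        int total = 0;
        for (DrainingWorker worker : pool) {
            try {
                worker.join(); // wait for the worker instead of polling a finished flag
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += worker.processed;
        }
        return total;
    }
}
```

Waiting with join() replaces the polling loop on a finished flag, and every thread that was started is guaranteed to end once the stack is empty.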
Not an answer, but some things you should know if you are writing code for a live production system:
:-( Variable and method both have the same name, run. A better name for the variable might be keep_running. Or change the sense of it so that you can write while (! time_to_shut_down) { ... }
:-( Thread.sleep(80) What is this for? It looks like a big red flag to me. You can never fix a concurrency bug by adding a sleep() call to your code. All you can do is make the bug less likely to happen in testing. That means, when the bug finally does bite, it will bite you in the production system.
:-( Your run() method is way too complicated (the keyword try appears four times). Break it up, please.
:-( Ignoring five different exceptions catch (MumbleFoobarException e) { e.printStackTrace(); } Most of those exceptions (but maybe not the InterruptedException) mean that something is wrong. Your program should do something more than just write a message to the standard output.
:-( Writing error messages to standard output. You should be calling log.error(...) so that your application can be configured to send the messages to someplace where somebody might actually see them.
My Android application implements data protection and works with the cloud.
The application consists of a UI and a standalone service (running in its own process).
I'm using IPC (Messages and Handlers) to communicate between the UI and the service.
I have the following situation: before doing any work with the data, I need to know the data size and the number of data items (I have to enumerate contacts, photos, etc. and collect totals for progress reporting).
About the problem:
When enumeration starts on the service side (it uses 4 running threads in a thread pool), my UI freezes for several seconds (depending on the total data size).
Does anybody know a way to keep the UI responsive, without freezing, during this time?
Update:
Here is the ThreadPoolExecutor wrapper that I am using in the service to execute the estimate tasks (created like new ThreadPoolWorker(4,4,10)):
public class ThreadPoolWorker {
private Object threadPoolLock = new Object();
private ThreadPoolExecutor threadPool = null;
private ArrayBlockingQueue<Runnable> queue = null;
private List<Future<?>> futures = null;
public ThreadPoolWorker(int poolSize, int maxPoolSize, int keepAliveTime){
queue = new ArrayBlockingQueue<Runnable>(5);
threadPool = new ThreadPoolExecutor(poolSize, maxPoolSize, keepAliveTime, TimeUnit.SECONDS, queue);
threadPool.prestartAllCoreThreads();
}
public void runTask(Runnable task){
try{
synchronized (threadPoolLock) {
if(futures == null){
futures = new ArrayList<Future<?>>();
}
futures.add(threadPool.submit(task));
}
}catch(Exception e){
log.error("runTask failed. " + e.getMessage() + " Stack: " + OperationsHelper.StringOperations.getStackToString(e.getStackTrace()));
}
}
public void shutDown()
{
synchronized (threadPoolLock) {
threadPool.shutdown();
}
}
public void joinAll() throws Exception{
synchronized (threadPoolLock) {
try {
if(futures == null || (futures != null && futures.size() <= 0)){
return;
}
for(Future<?> f : futures){
f.get();
}
} catch (ExecutionException e){
log.error("ExecutionException Error: " + e.getMessage() + " Stack: " + OperationsHelper.StringOperations.getStackToString(e.getStackTrace()));
throw e;
} catch (InterruptedException e) {
log.error("InterruptedException Error: " + e.getMessage() + " Stack: " + OperationsHelper.StringOperations.getStackToString(e.getStackTrace()));
throw e;
}
}
}
}
Here is how I start the enumeration tasks:
estimateExecutor.runTask(contactsEstimate);
I must say you did not provide enough information (the part of the code you suspect to be the cause),
but from my knowledge and experience I can make an educated guess:
you are probably executing code on the UI thread (main thread) that takes a while to run. I can also guess that this code is querying the contacts / gallery providers for all the data.
In case you don't know: Service callback methods are also executed on the main thread (the UI thread) unless you explicitly run them from an AsyncTask or another thread. Querying content providers and processing the returned cursor can be a heavy operation that needs to run on another thread so that it does not block the main UI thread.
After moving the code that performs these expensive queries to another thread, there is no reason you should experience any freezing.
I was successful in reading files in a multi-process environment using file locking.
In the multithreaded (single-process) case, I used a queue filled with file names, started a separate thread for each file, read from it, waited until all the reading was over, and then renamed the files. In this way I read files in multithreaded batches.
Now I want to read the files in a directory using both multiple processes and multiple threads. I tried merging my two approaches, but that didn't fare well. The log showed that many files hit a FileNotFoundException (because their names had been changed), some were never read (because a thread died), and sometimes locks were not released.
///////////////////////////////////////////////////////////////////////
//file filter inner class
class myfilter implements FileFilter{
@Override
public boolean accept(File pathname) {
// TODO Auto-generated method stub
Pattern pat = Pattern.compile("email[0-9]+$");
Matcher mat = pat.matcher(pathname.toString());
if(mat.find()) {
return true;
}
return false;
}
}
/////////////////////////////////////////////////////////////////////////
myfilter filter = new myfilter();
File alreadyread[] = new File[5];
Thread t[] = new Thread[5];
fileread filer[] = new fileread[5];
File file[] = directory.listFiles(filter);
FileChannel filechannel[] = new FileChannel[5];
FileLock lock[] = new FileLock[5];
tuple_json = new ArrayList();
//System.out.println("ayush");
while(true) {
//declare a queue
ConcurrentLinkedQueue filequeue = new ConcurrentLinkedQueue();
//addfilenames to queue and their renamed file names
try{
if(file.length!=0) {
//System.out.println(file.length);
for(int i=0;i<5 && i<file.length;i++) {
System.out.println("acquiring lock on file " + file[i].toString());
try{
filechannel[i] = new RandomAccessFile(file[i], "rw").getChannel();
lock[i] = filechannel[i].tryLock();
}
catch(Exception e) {
file[i] = null;
lock[i] = null;
System.out.println("cannot acquire lock");
}
if(lock[i]!=null){
System.out.println("lock acquired on file " + file[i].toString());
filequeue.add(file[i]);
alreadyread[i] = new File(file[i].toString() + "read");
System.out.println(file[i].toString() + "-----" + times);
}
else{
System.out.println("else condition of acquiring lock");
file[i] = null;
}
System.out.println("-----------------------------------");
}
//starting the thread to read the files
for(int i=0;i<5 && i<file.length && lock[i]!=null && file[i]!=null;i++){
filer[i] = new fileread(filequeue.toArray()[i].toString());
t[i] = new Thread(filer[i]);
System.out.println("starting a thread to read file" + file[i].toString());
t[i].start();
}
//read the text
for(int i=0;i<5 && i<file.length && lock[i]!=null && file[i]!=null;i++) {
try {
System.out.println("waiting to read " + file[i].toString() + " to be read completely");
t[i].join();
System.out.println(file[i] + " was read completetly");
//System.out.println(filer[i].getText());
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
//file has been read Now rename the file
for(int i=0;i<5 && i<file.length && lock[i]!=null && file[i]!=null;i++){
if(lock[i]!=null){
System.out.println("renaming file " + file[i].toString());
file[i].renameTo(alreadyread[i]);
System.out.println("releasing lock on file " + file[i].toString());
lock[i].release();
}
}
//rest of the processing
/////////////////////////////////////////////////////////////////////////////////////////////////////
Fileread class
class fileread implements Runnable{
//String loc = "/home/ayusun/workspace/Eclipse/fileread/bin";
String fileloc;
BufferedReader br;
String text = "";
public fileread(String filename) {
this.fileloc = filename;
}
@Override
public void run() {
try {
br = new BufferedReader(new FileReader(fileloc));
System.out.println("started reading file" + fileloc);
String currline;
while((( currline = br.readLine())!=null)){
if(text.isEmpty()) // compare string content, not references
text += currline;
else
text += "\n" + currline;
}
System.out.println("Read" + fileloc + " completely");
br.close();
} catch ( IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public String getText() {
return text;
}
}
I would like to know if there is any other approach that I can adopt.
If you want exclusive access to a file, you cannot rely on file locking, since on many OSes (Unix-like systems in particular) file locking is advisory, not mandatory.
I'd suggest creating a common lock directory for all your processes; in this lock directory, you would create a directory per file you want to lock, right before you open a file.
The big advantage is that directory creation, unlike file creation, is atomic; as such, you can use Files.createDirectory() (or File's .mkdir() if you still use Java6 but then don't forget to check the return code) to grab a lock on the files you read. If this fails, you know someone else is using the file.
Of course, when you're done with a file, don't forget to remove the lock directory matching this file... (in a finally block)
(note: with Java 7 you can use Files.newBufferedReader(); there is even Files.readAllLines())
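The lock-directory idea could be sketched as follows. DirLock, tryLock, unlock, withLock and the ".lock" suffix are hypothetical choices for this example, and demo() uses a temporary directory as the common lock directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: uses one directory per file name as a cross-process lock.
class DirLock {
    // Tries to take the lock; returns false if the lock directory already exists.
    static boolean tryLock(Path lockRoot, String fileName) throws IOException {
        try {
            Files.createDirectory(lockRoot.resolve(fileName + ".lock"));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // someone else is using the file
        }
    }

    // Releases the lock by removing the lock directory.
    static void unlock(Path lockRoot, String fileName) throws IOException {
        Files.deleteIfExists(lockRoot.resolve(fileName + ".lock"));
    }

    // Runs the work only if the lock was obtained and always releases it afterwards.
    static boolean withLock(Path lockRoot, String fileName, Runnable work) throws IOException {
        if (!tryLock(lockRoot, fileName)) {
            return false;
        }
        try {
            work.run();
        } finally {
            unlock(lockRoot, fileName);
        }
        return true;
    }

    // Small self-check: lock, lock again (must fail), unlock, lock again (must succeed).
    static String demo() {
        try {
            Path root = Files.createTempDirectory("locks");
            boolean first = tryLock(root, "email1");
            boolean second = tryLock(root, "email1");
            unlock(root, "email1");
            boolean third = tryLock(root, "email1");
            unlock(root, "email1");
            Files.delete(root);
            return first + "," + second + "," + third;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One caveat of this design is that a process that crashes while holding a lock leaves its lock directory behind, so some cleanup policy for stale locks would be needed in practice.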
If you need to process a large number of files using multiple threads, you should probably first distribute the specific files to each thread before it starts.
For example, if you only want to process files whose names start with email and are followed by some digits, you could create 10 threads. The first thread would look for files with names starting with email0, the second thread could handle email1, etc.
This of course would be efficient only if the numbers are evenly distributed.
Another way could be to have the main thread run through and collect all the filenames to deal with. It could then divide the files across the number of available threads, and pass each thread an array of those file names.
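A minimal sketch of that division is shown here. Partitioner, split and processInThreads are hypothetical names, the round-robin split is just one possible way to divide the names, and the per-file work is reduced to counting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: divides file names across a fixed number of threads up front.
class Partitioner {
    // Splits the names round-robin into `parts` lists, one per worker thread.
    static List<List<String>> split(List<String> names, int parts) {
        List<List<String>> slices = new ArrayList<>();
        for (int i = 0; i < parts; i++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < names.size(); i++) {
            slices.get(i % parts).add(names.get(i));
        }
        return slices;
    }

    // Starts one thread per slice, waits for all of them and returns the number of names handled.
    static int processInThreads(List<String> names, int threads) {
        AtomicInteger handled = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (List<String> slice : split(names, threads)) {
            Thread worker = new Thread(() -> {
                for (String name : slice) {
                    handled.incrementAndGet(); // the real code would read the file `name` here
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handled.get();
    }
}
```

Since every file name belongs to exactly one slice, two threads of the same process never touch the same file, so no locking is needed between them.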
There could be other ways of dividing the system load which are relevant to your situation.