I have a Java program that needs to call the same external executable 6 times. Each run produces an output file, and once all 6 runs are complete I "merge" these files together. Originally I had a for-loop that ran the executable, waited for it to finish, then ran it again, and so on.
This was very time consuming, averaging 52.4s for the 6 runs. I figured it would be easy to speed up by running all 6 at once, especially since the runs don't depend on one another. I used ExecutorService, Runnable, etc. to achieve this.
With my current implementation I shave about 5s off my time, making it only ~11% faster.
Here is some (simplified) code that explains what I'm doing:
private final List<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
....
private void setUpThreadsAndRun() {
    ExecutorService executor = Executors.newFixedThreadPool(6);
    for (int i = 0; i < 6; i++) {
        //create the params object
        tasks.add(Executors.callable(new RunThread(params)));
    }
    try {
        executor.invokeAll(tasks);
    } catch (InterruptedException ex) {
        //uh-oh
    }
    executor.shutdown();
    System.out.println("Finished all threads!");
}

private class RunThread implements Runnable {
    private final ModelParams params;

    public RunThread(ModelParams params) {
        this.params = params;
    }

    @Override
    public void run() {
        //NOTE: cmdarray is constructed from the params object
        ProcessBuilder pb = new ProcessBuilder(cmdarray);
        pb.directory(new File(location));
        try {
            Process p = pb.start();
            // block until the executable exits, so invokeAll() really waits for it
            p.waitFor();
        } catch (IOException | InterruptedException ex) {
            //uh-oh
        }
    }
}
I'm hoping there is a more efficient way to do this...or maybe I'm "clogging" my computer's resources by trying to run this process 6 times at once. This process does involve file I/O and writes files that are about 30mb in size.
The only time that forking the executable 6 times will earn a performance boost is if you have at least 6 CPU cores and your application is CPU bound -- i.e. mostly doing processor operations. Since each application writes a 30mb file, it sounds like it is doing a large amount of IO and the applications are IO bound instead -- limited by your hardware's ability to service the IO requests.
To speed up your program, you might try 2 concurrent processes to see if you get an improvement. However, if your program is IO bound, you will never get much of a speed improvement by forking multiple copies.
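One way to try this out is to make the number of concurrent processes a parameter and time the same batch at a few settings. The following is only a sketch: ParallelRunner and runAll are hypothetical names, and `java -version` from the running JDK is a stand-in for the real executable, which is not shown in the question.

```java
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: runs each command as its own process, with at most
// `concurrency` processes alive at once, and waits for all of them to exit.
class ParallelRunner {

    static List<Integer> runAll(List<List<String>> commands, int concurrency) {
        ExecutorService executor = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<String> command : commands) {
                tasks.add(() -> {
                    ProcessBuilder pb = new ProcessBuilder(command);
                    // Discard output so a full pipe buffer cannot stall the child (Java 9+).
                    pb.redirectErrorStream(true);
                    pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
                    Process p = pb.start();
                    return p.waitFor(); // the task ends only when the process does
                });
            }
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> f : executor.invokeAll(tasks)) {
                exitCodes.add(f.get());
            }
            return exitCodes;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): swap in the real executable and arguments.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<List<String>> commands = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            commands.add(Arrays.asList(javaBin, "-version"));
        }
        for (int concurrency : new int[] {1, 2, 6}) {
            long start = System.nanoTime();
            List<Integer> exitCodes = runAll(commands, concurrency);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(concurrency + " at a time: " + millis + " ms, exit codes " + exitCodes);
        }
    }
}
```

If the timings barely change between 2 and 6, that would be consistent with the processes being IO bound rather than CPU bound.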
Currently my batch file launches 20 jars with different arguments. After an hour, a taskkill command kills all Java processes, and the batch file then launches 20 different jars.
My problem is maxed-out CPU usage on startup, and potentially wasted CPU later. I could start the jar files at different times, but then they won't run for equal amounts of time. It takes up to 5 minutes for the CPU usage to halve.
I need to launch a jar file, then kill it in an hour, without touching the other 19 jar files, and without knowing the PID.
I have been browsing the web and have seen suggestions to run it as a background process and get the PID that way. Can someone help me out with that?
This is what it looks like now
java -jar file.jar -a first
timeout 3
java -jar file.jar -a second
timeout 3
java -jar file.jar -a third
timeout 3
Use the jps.exe utility (part of the standard JDK) to find the PID of the just-started Java process. Then use taskkill /pid to kill that one process.
I'm just going to have some fun here :)
If you don't have ownership of the jars you might try something like:
import java.io.IOException;
import java.util.Timer;
import java.util.TimerTask;

public class KillTask extends TimerTask {
    private Process process;

    public KillTask(Process process) {
        this.process = process;
    }

    @Override
    public void run() {
        // Prepare to die!
        process.destroy();
    }
}

public class KillDriver {
    public KillDriver(String command) {
        try {
            Process process = Runtime.getRuntime().exec(command);
            // kill the process one hour after it was started
            new Timer().schedule(new KillTask(process), 60 * 60 * 1000);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // each argument is one complete command line to launch
        for (String s : args)
            new KillDriver(s);
    }
}
This example assumes multiple jars will be started, but you can alter it for a single jar, or pass in the delay as an argument to add some flexibility.
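For what it's worth, here is a rough variant with the delay passed in as the first argument. It swaps the Timer for a ScheduledExecutorService, which is my choice and not something the batch file requires. DeadlineLauncher and startWithDeadline are made-up names, and the whitespace split of each command is deliberately naive.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: starts a process and schedules its termination after a delay.
class DeadlineLauncher {

    static Process startWithDeadline(List<String> command, long delay, TimeUnit unit,
                                     ScheduledExecutorService scheduler) {
        try {
            // inheritIO() lets the child write to this console instead of an unread pipe.
            Process process = new ProcessBuilder(command).inheritIO().start();
            scheduler.schedule(() -> {
                // Only this one process is touched; no PID lookup is needed.
                if (process.isAlive()) {
                    process.destroy();
                }
            }, delay, unit);
            return process;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Usage (assumed): java DeadlineLauncher <delaySeconds> "<command 1>" "<command 2>" ...
    public static void main(String[] args) throws InterruptedException {
        long delaySeconds = Long.parseLong(args[0]);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<Process> processes = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            // Naive whitespace split; quoted arguments would need real parsing.
            List<String> command = Arrays.asList(args[i].trim().split("\\s+"));
            processes.add(startWithDeadline(command, delaySeconds, TimeUnit.SECONDS, scheduler));
        }
        for (Process p : processes) {
            p.waitFor();
        }
        scheduler.shutdownNow(); // nothing left to kill, let the JVM exit
    }
}
```

Because each process gets its own deadline counted from its own start, the jars can be launched at staggered times and still run for equal durations. How gracefully destroy() stops the child is platform dependent.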
It appears you are giving the Jars their own process names ("first", "second", "third"), so I assume you could use a VBscript like this that does not require the PID to kill it.
ProcessName = "FIRST"
Set objWMIService = GetObject("winmgmts:{impersonationLevel=impersonate,authenticationLevel=pktPrivacy,(Debug)}!\\.\root\cimv2")
Set colProcessList = objWMIService.ExecQuery("Select * From Win32_Process")
For Each objProcess in colProcessList
If InStr(UCase(objProcess.Name), ProcessName) Then
WScript.Echo "Killing " & objProcess.Name
objProcess.Terminate()
End If
Next
I'm investigating some unit test failures. The tests pass on an old build server that's been hand-configured (and not documented). I'm trying to run them in a clean virtual machine.
My latest problem is a unit test that creates 10K threads.
for (int i = 0; i < 10000; i++) {
final Thread thread = new Thread(new Runnable() { ... });
threads.add(thread);
thread.start();
}
Well, the max user processes in the clean environment is only 4K.
$ ulimit -u
4096
I was wondering if there's some way for Java to get at that limit. The test really doesn't need 10K, it just needs some arbitrarily large number.
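You could call ulimit from Java. It is a shell builtin rather than a standalone program, so the command has to invoke a shell, e.g. the array {"bash", "-c", "ulimit -u"}: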
Runtime.getRuntime().exec(command)
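A minimal sketch of that idea, assuming a Unix-like system with bash on the PATH. UserProcessLimit, parse and query are hypothetical names, and mapping "unlimited" to Long.MAX_VALUE is just one possible convention.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading the "max user processes" limit from a test.
class UserProcessLimit {

    // Parses the text printed by `ulimit -u`; "unlimited" maps to Long.MAX_VALUE.
    static long parse(String text) {
        String value = text.trim();
        return value.equals("unlimited") ? Long.MAX_VALUE : Long.parseLong(value);
    }

    // Assumption: bash is available. ulimit is a builtin, so it must run inside a shell.
    static long query() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("bash", "-c", "ulimit -u")
                .redirectErrorStream(true)
                .start();
        String line;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            line = reader.readLine();
        }
        p.waitFor();
        if (line == null) {
            throw new IOException("ulimit printed nothing");
        }
        return parse(line);
    }

    public static void main(String[] args) throws Exception {
        long limit = query();
        System.out.println("max user processes: " + limit);
    }
}
```

The limit covers everything the user is already running, so a test would probably want to use only a fraction of the reported value rather than the whole of it.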
In Java, suppose I start 1000 threads with a loop as below. Is there any way to monitor the number of threads actually running, and the CPU resources they consume, with Task Manager?
for(int j=0; j<=1000; j++)
{
MyThread mt = new MyThread ();
mt.start ();
}
You can use VisualVM or JConsole or any other monitoring tool
If you mean the Windows task manager then yes, you can customize the columns shown in the process tab:
Menu View > Select Columns > Threads
EDIT
A quick test shows that creating an additional thread does increment that counter by one - and when that thread terminates, the counter decrements.
But it starts with more than one thread, probably because it includes the various JVM threads too (it starts with 19 threads). Note that jconsole shows 10 threads for a single-threaded program too.
If you use visual VM, you can see the split between daemon and non daemon threads (all JVM threads are daemon).
Test code:
public static void main(String[] args) throws InterruptedException {
    Thread.sleep(3000);
    Runnable r = new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(10000);
            } catch (InterruptedException ex) {}
        }
    };
    // start one extra thread per second and watch the counter go up
    for (int i = 0; i < 5; i++) {
        new Thread(r).start();
        Thread.sleep(1000);
    }
    Thread.sleep(10000);
}
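In code, you can use the Thread.activeCount() method. Note that it returns an estimate of the active threads in the current thread's thread group and its subgroups, not of every thread in the JVM.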
I think Visual VM is a better tool for this purpose. You'll get threads and a lot more information if you download and install all the plugins.
You can use a managed bean (MXBean) for this purpose, for example ThreadMXBean.
To get the MXBean, just call
ManagementFactory.getThreadMXBean()
The methods getThreadCount() and getCurrentThreadCpuTime() will help you.
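A small sketch of how those calls fit together. ThreadMonitor and liveCountWhileRunning are hypothetical names, and the latches are only there so that the count is taken while the extra threads are definitely alive.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: reads the JVM-wide live thread count while n extra threads run.
class ThreadMonitor {

    static int liveCountWhileRunning(int n) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        CountDownLatch started = new CountDownLatch(n);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    release.await(); // stay alive until the count has been read
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start();
        }
        try {
            started.await();
            // Includes daemon and non-daemon threads, and the JVM's own threads.
            return bean.getThreadCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
        }
    }

    public static void main(String[] args) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        System.out.println("live threads before: " + bean.getThreadCount());
        System.out.println("live threads with 5 extra: " + liveCountWhileRunning(5));
        if (bean.isCurrentThreadCpuTimeSupported()) {
            // CPU time of the calling thread, in nanoseconds.
            System.out.println("main thread CPU time (ns): " + bean.getCurrentThreadCpuTime());
        }
    }
}
```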
If you just want to check the thread count in Windows 10
Task manager -> Details -> (right click) Select columns -> (check) Threads
I want to create a utility in Java to automate our build process. The current process involves:
Opening 2 consoles for servers. (I want to open these consoles from the Java program.)
Running multiple bat files in the consoles and, based on one batch file's output, running other commands.
I need a head start: what libraries should I use? Can I open 2 consoles from Java independently, so that the consoles keep running even if my program closes? (The consoles are BEA servers, startWebLogic.cmd.)
Alee, yes, you can do that with Runtime.getRuntime().exec("file.bat"). You then have 2 options: capture the output of the execution, or check the exit status of the script,
for example:
public static void main(String[] args) {
    System.out.println("hello");
    try {
        Process p = Runtime.getRuntime().exec("./prog.sh");
        InputStream in = p.getInputStream();
        System.out.println("OUTPUT");
        StringBuilder sb = new StringBuilder();
        int c;
        // read() returns -1 at end of stream
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        // the script has closed its output; wait for it to actually exit
        int exitCode = p.waitFor();
        String output = sb.toString();
        if (output.contains("Exception")) {
            System.out.println("script failed");
        }
        if (exitCode == 0) {
            System.out.println("The script ran without errors");
        } else {
            System.out.println("The script failed");
        }
    } catch (IOException | InterruptedException e) {
        e.printStackTrace();
    }
}
This covers both scenarios: capturing the output and then deciding from it whether the script ran successfully, or using the exit status from the script.
By convention, the exit code is 0 for success and any other number for failure.
Sure, you can open as many consoles as you want. If you want to do it simultaneously, create separate threads and then create the processes using Runtime.exec() or ProcessBuilder.
But why do you want to do this? Good old Ant is a build tool dedicated to such tasks. It supports all of this and is extensible with custom tasks.
Moreover if you suddenly remember that it is 2011 now, use newer tools like Apache Buildr.
Java's Runtime class has methods to launch Processes which have programmatic interfaces to their input/output streams and other management. However, I'm not sure how Windows handles processes whose parents have died.
You should also consider using the Windows Script Host via VB or JScript as you will probably have finer control.
In my program, I need to run an external command (ntpdate) in an Ubuntu environment using Java. Currently my code looks like this:
Runtime rt = Runtime.getRuntime();
byte[] readBuffer = new byte[131072];
// Exec a process to do the query
Process p = null;
try {
    p = rt.exec("ntpdate -q " + ip);
} catch (Exception ex) {
    ex.printStackTrace();
}
if (p != null) {
    try {
        Thread.sleep(1000);
    } catch (Exception e) {
    }
    // Read the input stream, copy it to the file
    InputStream in = p.getInputStream();
    try {
        int count = 0, rc;
        while ((rc = in.read(readBuffer, count, readBuffer.length - count)) != -1) {
            count += rc;
            if (count >= readBuffer.length) {
                p.destroy();
                break;
            }
        }
        p.destroy();
        result = processOutput(readBuffer, count);
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    p.destroy();
}
This code needs to run simultaneously on multiple threads in order to maximize performance (I need to test a list of 1,000,000 addresses using ntpdate). However, it runs very slowly and barely uses the machine's CPU. What am I doing wrong? How could I make this more efficient?
The same problem arises when trying to execute "dig" using .exec(), so I doubt it is because of the specific program being called. Is there some restriction in using Runtime.exec() in a multi Threaded environment?
Is Java the most appropriate approach here? Perhaps this would be better in a shell script, which calls ntpdate in the background multiple times? I'm not sure what benefit you're getting from this code snippet by doing this in Java.
What are you doing with the InputStream from the process?
A bash script could do this like:
for ip in #...IP list
do
ntpdate -q $ip > $ip.txt &
done
Why are you waiting for 1 second each time?
try {
    Thread.sleep(1000);
} catch (Exception e) {
}
This does nothing but slow down the execution of your application.
Not sure why it's slow but you need to do a lot more to close your resources. Runtime.exec() needs quite a bit of care and attention to avoid hang-ups and leaking of file descriptors.
http://www.javaworld.com/javaworld/jw-12-2000/jw-1229-traps.html
Are you sure the issue isn't ntpdate? If ntpdate is just sitting there waiting for a server response and has a large timeout value, then your application is going to sit there too.
Try calling ntpdate with a timeout of 0.2 and see if it makes a difference.
Also, as you're opening streams in your code, you definitely want to explicitly .close() them when you're done. Otherwise it might not happen until a GC which could be a very long time away.
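Something along these lines might be a starting point for the cleanup. It is a sketch, not a drop-in fix: CommandRunner and run are hypothetical names, and merging stderr into stdout is my assumption about what you want to capture.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: runs a command, captures its output, and releases every resource.
class CommandRunner {

    static String run(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command).redirectErrorStream(true);
        Process p = null;
        try {
            p = pb.start();
            // We never write to the child, so close its stdin right away.
            p.getOutputStream().close();
            StringBuilder out = new StringBuilder();
            // try-with-resources closes the pipe even if reading fails.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    out.append(line).append('\n');
                }
            }
            p.waitFor();
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (p != null) {
                p.destroy(); // harmless after a normal exit, and cleans up otherwise
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in command (an assumption); replace with e.g. ntpdate and its arguments.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        System.out.print(run(Arrays.asList(javaBin, "-version")));
    }
}
```

Reading until end of stream also removes the need for the fixed one-second sleep, since the loop ends when the child closes its output.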
I think I found the solution, and that is that there is no solution using java's Runtime.exec(). The problem seems to be that all calls to start a process are synchronized. Indeed, if you start each process alone (via synchronization) you get the exact same result of starting all processes together.
Are there any alternatives to exec? Otherwise, I will need to get some solution without linux's ntpdate...
I notice that both of the commands you tried involve network round-trips. How is the speed if you call something like echo or cat instead?