I have a Java program that I want to run every 2 hours, but I am not sure how long it will take to complete. In some cases it may take 1 minute, and in some cases it may take more than 3 hours. Running the same command every two hours would result in several instances running in parallel, so I am trying to make it run 2 hours after the previous run finishes. One option is using the Thread.sleep() method in Java. Is there any option I can use in Ubuntu?
A very basic way to do this could be running your task on any scheduler, like cron/Quartz/etc. On each task completion, write/create a file to signify that the previous task is complete. On each task start, check for that file; if the previous run has not completed, skip. Or you could go more complex and write another file to queue the task to run immediately after the current one is done. You could apply the same concept to a db table that tracks processed tasks as well.
Of course, you could write your own task managing layer and implement your own scheduling framework hehe
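If you would rather stay in Java than script it, ScheduledExecutorService.scheduleWithFixedDelay already gives "start 2 hours after the previous run finishes" semantics, with no overlapping instances. A minimal sketch (class and method names here are mine, not from any framework):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedDelayRunner {

    // scheduleWithFixedDelay measures the delay from the END of one run to the
    // START of the next, so a run that takes 3 hours simply pushes the next
    // start back -- two instances can never overlap.
    static ScheduledExecutorService runEvery(Runnable job, long delay, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(job, 0, delay, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        // Demo with milliseconds; for the real program you would use
        // runEvery(job, 2, TimeUnit.HOURS).
        ScheduledExecutorService s = runEvery(runs::incrementAndGet, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        s.shutdownNow();
        System.out.println("runs=" + runs.get());
    }
}
```

The single-threaded executor is what guarantees serial execution: even if a run overshoots, the next one just waits.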
The following shell script will only run my_java_program if no other instances are running:
[ "$(pgrep my_java_program)" ] || my_java_program
If your java program is just a bare jar file, say, mypgm.jar, then try:
[ "$(pgrep -f mypgm.jar)" ] || java -jar mypgm.jar
I have this situation:
Thread Group (n_threads=X,duration Y sec)
Loop
Java Sampler
When the test duration ends, JMeter does not stop the threads that are still making requests, and therefore the test does not terminate. How can this be solved?
This can happen when the response time of your "Java Sampler" is higher than your test duration. I would recommend introducing a reasonable timeout into your Java code so the samplers fail/exit instead of waiting forever for the response. If you have no idea what's going on in there, take a thread dump and see where your thread(s) are stuck.
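A sketch of what such a timeout could look like inside the sampler's Java code, assuming the slow operation can be expressed as a Callable (the wrapper name and shape are mine):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutCall {

    // Run 'work' but give up after timeoutMs, so the sampler thread can fail
    // and exit instead of blocking past the end of the test.
    static String callWithTimeout(Callable<String> work, long timeoutMs) throws Exception {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        Future<String> f = ex.submit(work);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // interrupt the stuck work
            throw e;
        } finally {
            ex.shutdownNow();
        }
    }
}
```

The sampler would catch the TimeoutException and report a failed sample rather than hanging.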
As a workaround, the only way to "terminate" the test that I can think of would be:
Adding another Thread Group with 1 thread
Adding JSR223 Sampler with the following code:
sleep(5000) // wait for 5 seconds, amend according to your desired test duration
log.info('Exceeded test duration')
System.exit(1) // the process will exit with non-zero exit code (error), change it to 0 if needed
See Apache Groovy - Why and How You Should Use It for more information on the Groovy scripting concept in JMeter.
Also be aware that this code terminates the whole JVM, so it will probably make sense to add the line jmeter.save.saveservice.autoflush=true to user.properties, as forcibly terminating the whole JVM might lead to some loss of results.
I am using ThreadPoolExecutor and giving the exact same task to each worker. The task is to run a jar file and do something with it. The problem I am facing is related to timing.
Case 1: I submit one task to the pool and the worker completes in 8 seconds.
Case 2: I submit the same task twice into the pool, and both workers complete in ~10.50 seconds.
Case 3: I submit the same task three times into the pool, and all three workers complete in ~13.38 seconds.
Case 4: I submit the same task four times into the pool, and all four workers complete in ~18.88 seconds.
If I replace the workers' task with time.sleep(8) (instead of running the jar file), then all 4 workers finish in ~8 seconds. Is this because, before executing the Java code, the OS first has to create a Java environment, and it cannot do that in parallel?
Can someone explain to me why the execution time increases for the same task when run in parallel? Thanks :)
Here is how I am executing the pool:
from concurrent import futures
from subprocess import run, PIPE

def transfer_file(file_name):
    raw_file_obj = s3.Object(bucket_name='foo-bucket', key=file_name)
    body = raw_file_obj.get()['Body']
    # prepare java command
    java_cmd = "java -server -ms650M -mx800M -cp {} commandline.CSVExport --sourcenode=true --event={} --mode=human_readable --configdir={}" \
        .format(jar_file_path, event_name, config_dir)
    # Run decoder_tool by piping in the encoded binary bytes
    log.info("Running java decoder tool for file {}".format(file_name))
    res = run(java_cmd, cwd=tmp_file_path, shell=True, input=body.read(),
              stderr=PIPE, stdout=PIPE)
    res_output = res.stderr.decode("utf-8")
    if res.returncode != 0:
        if 'Unknown event' in res_output:
            log.error("Exception occurred whilst running decoder tool")
            raise Exception("Unknown event {}".format(event_name))
    log.info("decoder tool output: \n" + res_output)

with futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
    # add new task(s) into the thread pool
    pool.map(transfer_file, ['fileA_for_workerA', 'fileB_for_workerB'])
Using multithreading doesn't necessarily mean it will execute faster. In CPython you also have the GIL to deal with, which only lets one thread execute Python bytecode at a time. Think of it like this: one person can do one task faster than one person doing two tasks at the same time. He/she would have to multitask, doing part of thread 1 first, then switching to thread 2, etc. The more threads, the more switching the Python interpreter has to do.
The same thing might be happening on the Java side. I don't use Java, but it may have the same problems. Here, Is Java a Compiled or an Interpreted programming language ? it says that the JVM compiles Java on the fly, so every JVM instance you launch has its own startup and JIT work to do, and multiple JVMs end up competing for the same CPU cores.
As for time.sleep(8): a sleeping thread releases the GIL and uses no processor time at all, so it is easy to run a bunch of waiting tasks side by side, which is why those workers all finish in ~8 seconds.
I have recently been trying Gradle; I didn't have any prior experience with it, and so far I have been able to do the things I wanted to and am satisfied with the results. However, in my case I have to run Selenium tests with JUnit, and some of them are disproportionately longer than others (i.e. 25 min vs 4 min).
When using the maxParallelForks option, it sometimes takes longer than I would expect: the tests seem to be assigned to the forks beforehand, and sometimes I end up with idle forks while one of them is stuck with a long test; when it finishes, other shorter tests run after it (which could have run in any of the other available forks).
TL;DR:
When running tests in parallel, Gradle seems to assign tests as if there were multiple queues (one per fork) and I would like it to be like a single queue where the forks take the next test in the queue.
As an off-topic example, my situation is like being stuck in a queue at the supermarket, when the ones next to you are empty, but you can't change.
It's the latter. Gradle uses one queue and distributes the entries round-robin to the running processes. Say you have 4 tests:
Test1 taking 10s
Test2 taking 1s
Test3 taking 10s
Test4 taking 1s
and using maxParallelForks = 2, the overall test task execution would take around 20s: fork 1 is handed Test1 and Test3 (20s total), while fork 2 is handed Test2 and Test4 and sits idle after 2s. I guess we need to discuss whether this can be improved by being notified about "free" processes, so that Test3 could be assigned directly to test worker process 2 after Test2 comes back after 1s.
As of July 2021, your question (and my experience) match the problem described in this issue, which "has been automatically closed due to inactivity".
The issue is not resolved: yes, the system assigns tests to forks early and does not rebalance to use idle workers.
I have 4 separate processes which need to go one after another.
1st process
2nd process
3rd process
4th process
Since every process is connected to the one before it, each process should run only after the previous one finishes.
Each process has its own duration, which will vary as the program's input data grows.
But a rough sketch would look like this:
Program Runs
1st process - lasts 10 seconds
2nd process - has 300 HTTP get requests, last 3 minutes
3rd process - has 600 HTTP get requests, lasts 6 minutes
4th process - lasts 1 minute
The program is written in Java.
Thanks for any answer!
There is no concurrency support in the Java API for your use case, because what you're asking for is the opposite of concurrent. You have a set of four mutually dependent operations that need to run in a specific order. You only need, and should probably only use, one thread to handle this case correctly.
It would be reasonable and prudent to put each operation in its own method or class, based on how complex the operations are.
If you insist on using multiple threads, your main thread should maintain a list of Runnables. Iterate through the list: pop the first Runnable, create a new thread for it, start the thread, and then invoke join() on it. The main thread will block until the Runnable completes, and the loop takes you through all the Runnables in order. Again, there is no good reason to do this; there may or may not be a bad reason.
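The join() loop described above can be sketched like this (class and method names are mine):

```java
import java.util.List;

public class SequentialRunner {

    // Run each step on its own thread, but join() immediately so the next
    // step cannot start until the previous one has finished.
    public static void runInOrder(List<Runnable> steps) throws InterruptedException {
        for (Runnable step : steps) {
            Thread t = new Thread(step);
            t.start();
            t.join(); // block until this step completes
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StringBuilder order = new StringBuilder();
        runInOrder(List.of(
            () -> order.append("1"),   // 1st process
            () -> order.append("2"),   // 2nd process (HTTP GETs would go here)
            () -> order.append("3"),   // 3rd process
            () -> order.append("4")    // 4th process
        ));
        System.out.println(order);     // prints 1234 -- always in order
    }
}
```

Note that this is functionally identical to calling the four methods one after another on the main thread, just with extra thread-creation overhead, which is the answer's point.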
Let's say I have multiple scheduled tasks running at different times, but all of which are running a common Windows batch file. This batch file in turn executes a Java program.
I would like my batch file to get the name of the scheduled task that's calling it, and pass its name to the program that my batch is executing. How can I do this?
Like Joey said, there is no way to do it without getting help from outside.
You could create a separate instance of the batch file for each task, with an argument in each one specifying which task is assigned to run it.
You could also create smaller batches like this one:
CALL mybatch.bat 1st_task
and that would pass the name of your first task into the batch as the %1 variable.
You could also have your batch file figure it out from the time it was run, using the %time% variable, but this would need some parsing, as you can't guarantee it always runs at the same time down to the second.
It might look something like this:
if '%time:~0,5%'=='10:30' set var=1st_task
if '%time:~0,5%'=='12:00' set var=2nd_task
and so on
(That last one assumes that your tasks only run at specified times during the day... and if for some reason they execute at a different time this won't work)
You could pass the name of the scheduled task as an argument to your batch file. You cannot figure it out from inside the batch file without help from outside.