Run a single JAR simultaneously on Sun Grid Engine

At my university there is a Sun Grid Engine on which I need to run some tests. The tests are written in Java, so I have created a JAR file that starts them when executed. Each test reads in a file, performs some computation on it, and at the end writes a txt file with the results. However, every test has different parameters, which I pass in through the main method of the JAR, so the same input file produces a different output for each parameter set.
Is this possible to accomplish? Can I run the same JAR multiple times, given that all the runs need to read the same (single) file?

Yes, it is possible. Having multiple processes read the same file is not a problem, even if those processes are not on the same physical machine.
However, make sure each process writes to its own output file.
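One way to keep the output files apart is to derive the file name from the parameters of each run. The following is a minimal sketch under that assumption; the class name OutputNaming, the "results-" prefix and the sanitizing rule are illustrative and do not come from the question.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class OutputNaming {
    // Builds an output file name from the run's parameters so that parallel
    // runs of the same JAR with different parameters write to different files.
    static Path outputFor(Path outputDir, String... params) {
        // Replace characters that are awkward in file names (assumed rule).
        String suffix = String.join("_", params).replaceAll("[^A-Za-z0-9._-]", "-");
        return outputDir.resolve("results-" + suffix + ".txt");
    }

    public static void main(String[] args) {
        // Prints the output path this run would use for its arguments.
        System.out.println(outputFor(Paths.get("."), args));
    }
}
```

Two parameter sets that differ only in sanitized characters would map to the same name, so a job or task identifier may be a safer suffix if the parameters can contain such characters.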

Related

How to wait until a whole file is downloaded from an FTP server in Java?

One thread pool downloads files from the FTP server, and another thread pool reads the downloaded files.
Both thread pools run concurrently. I'll explain what happens with an example.
Let's assume I have one CSV file with 100 records.
Thread pool 1 is downloading it and writing it to a file in the /pending folder while, at the same time, thread pool 2 reads the content of that file. Assume that after 1 second only 10 records have been written to the file in /pending, so thread pool 2 reads only 10 records.
Thread pool 2 doesn't know that the other 90 records are still being downloaded, so it never reads them, because it cannot tell whether the whole file has been downloaded. After reading, it moves the file to another folder. So the remaining 90 records are never processed.
My question is: how do I wait until the whole file is downloaded, so that thread pool 2 reads the file only then?
One more thing: both thread pools use the scheduleAtFixedRate method and run every 10 seconds.
Please guide me on this.

I'm a fan of Mark Rotteveel's #6 suggestion (in comments above):
- use a temporary name when downloading,
- rename when download is complete.
That looks like:
- FTP download threads write all files with some added extension – perhaps .pending – but name it whatever you want.
- When a file is downloaded – say some.pdf – the FTP download thread writes the file to some.pdf.pending
- When an FTP download thread completes a file, the last step is a file rename operation – this is the mechanism for ensuring only "done" files are ready to be processed. So it downloads the file to some.pdf.pending, then at the end, renames it to some.pdf.
- Reader threads look for files, ignoring anything matching *.pending
I've built systems using this approach and they worked out well. In contrast, I've also worked with more complicated systems that tried to coordinate across threads, and those often did not work so well.
Over time, any software system will have bugs. Edsger Dijkstra captured this so well:
"If debugging is the process of removing software bugs, then programming must be the process of putting them in."
However difficult it is to reason about program correctness now – while the program is still in design phase, and has not yet been built – it will be harder to reason about correctness when things are broken in production (which will happen, because bugs).
That is, when things are broken and you're under time pressure to find the root cause (and fix it!), even the best of us would be at a disadvantage with a complicated (vs. simple) system.
The approach of using temporary names is simple to reason about, which should minimize code complexity and thus make it easier to implement.
In turn, maintenance and bug fixes should be easier, too.
Keep it simple – let the filesystem help you out.
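The temporary-name approach can be sketched with java.nio.file as follows. This is a sketch under assumptions, not the code from the systems mentioned above: the class name PendingFiles, the method names and the InputStream source are illustrative, and the actual FTP client call is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

final class PendingFiles {
    static final String SUFFIX = ".pending";

    // Writer side: stream the download into "<name>.pending", then rename it
    // to "<name>" as the very last step.
    static Path download(InputStream source, Path dir, String name) throws IOException {
        Path temp = dir.resolve(name + SUFFIX);
        Path done = dir.resolve(name);
        Files.copy(source, temp, StandardCopyOption.REPLACE_EXISTING);
        // ATOMIC_MOVE asks the filesystem for a single-step rename; it throws
        // AtomicMoveNotSupportedException where that is not possible.
        return Files.move(temp, done, StandardCopyOption.ATOMIC_MOVE);
    }

    // Reader side: list only regular files that do not carry the suffix.
    static List<Path> completed(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir,
                p -> Files.isRegularFile(p) && !p.getFileName().toString().endsWith(SUFFIX))) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        return result;
    }
}
```

Keeping the temporary and final names in the same directory matters here, since a rename across filesystems is generally not atomic.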

Issue with concurrent Ant calls to DITA-OT

We have a multi-threaded application and an integration with DITA-OT through Ant, which is called from Java.
We have started to face an issue with multiple concurrent Ant calls to DITA-OT to run transformations: when two or more threads run the Ant call from Java to DITA-OT, it randomly starts to generate an error reading the build_preprocess file.
It seems that while one thread is trying to read build_preprocess, another thread is deleting it; build_preprocess is generated in the folder DITA-OT\plugins\org.dita.base
Is there a way to fix the issue, or to have DITA-OT support concurrent requests to run transformations?

This problem:
Failed to read job file: Content is not allowed in trailing section.
might occur if the same temporary files folder is used by two parallel processes.
So just make sure the "dita.temp.dir" and "output.dir" parameters are set to distinct values for the parallel processes so they do not use the same temporary files folder or output folder.
https://www.dita-ot.org/dev/parameters/parameters-base.html#ariaid-title1
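A minimal sketch of giving each transformation its own directories from Java follows. Only the property names "dita.temp.dir" and "output.dir" come from the answer above; the class name DitaRunDirs, the "dita-run-" prefix and the directory layout are assumptions. How the properties reach Ant depends on how the application invokes it, which the question does not show.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

final class DitaRunDirs {
    // Creates fresh temp and output directories for one transformation and
    // returns them as property name/value pairs for that single Ant call.
    static Map<String, String> propertiesForRun(Path baseDir) throws IOException {
        Files.createDirectories(baseDir);
        // createTempDirectory picks a name that does not exist yet, so two
        // threads calling this at the same time get different directories.
        Path runDir = Files.createTempDirectory(baseDir, "dita-run-");
        Path temp = Files.createDirectories(runDir.resolve("temp"));
        Path out = Files.createDirectories(runDir.resolve("out"));

        Map<String, String> props = new LinkedHashMap<>();
        props.put("dita.temp.dir", temp.toString());
        props.put("output.dir", out.toString());
        return props;
    }
}
```

Each thread would call propertiesForRun once per transformation and pass the resulting pairs to its own Ant invocation.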

What would be best practice if I am trying to constantly check whether a directory exists in Java?

I have a Java application that creates multiple threads. There is one producer thread, which reads from a 10 GB file, parses that information, creates objects from it and puts them into multiple blocking queues (5 queues).
The other 5 threads are consumers, each reading from its own blocking queue. Each consumer thread writes to an individual file, so 5 files get created in total. It takes around 30 minutes to create all the files.
The problem:
The threads are writing to an external mount directory on a Linux box. We've experienced problems where other Linux mounts have gone down and applications have crashed, so I want to prevent that in this application.
What I would like to do is keep checking whether the mount (directory) exists before writing to it. I'm assuming that if the directory goes down, a FileNotFoundException will be thrown. If that is the case, I want the application to keep checking for the directory for about 10-20 minutes before crashing completely. Because I don't want to have to read the 10 GB file again, I want the consumer threads to be able to pick up from where they left off.
What I'm not sure about is the best practice:
Is it best to check whether the directory exists in the main class before creating the threads, or to check in each consumer thread?
If I keep checking whether the directory exists in each consumer thread, it seems like repeated code. I can check in the main class, but it takes 30 minutes to create these files. If the mount goes down during those 30 minutes and I'm only checking in the main class, the application will crash. Or, if I'm already writing to a directory, is it impossible for an external directory to go down? Does it get locked?
Thank you.

We have something similar in our application, but in our case we are running a web app, and if our mounted file system goes down we just throw an exception. We want to do something more elegant, like you do...
I would recommend using a combination of the following patterns: State, CircuitBreaker (which I believe is a more specific version of the State pattern), and Observer/Observable.
These would work in the following way...
Create something that represents your file system, maybe a class called MountedFileSystem. Make all your write calls through this class.
This class will catch every FileNotFoundException, and when one occurs, the CircuitBreaker gets triggered. This change works like the State pattern: one state is when things are working 'fine', the other state is when things aren't working 'fine', meaning that the mount has gone away.
Then, in the background, I would have a task that starts on a thread and checks the actual underlying file system to see if it is back. When the file system is back, change the state in the MountedFileSystem and fire an event (Observer/Observable) to try writing the files to disk again.
And as yuan quigfei stated, I am fairly certain you're going to have to rewrite those files. I just don't see a way to resume writing to them, but perhaps someone else has an idea.
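A minimal sketch of the State/CircuitBreaker idea described above. Only the class name MountedFileSystem comes from the answer; the method names, the Runnable callback standing in for the Observer, and the choice to catch IOException broadly are assumptions. With java.nio.file, a missing path surfaces as NoSuchFileException rather than FileNotFoundException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MountedFileSystem {
    enum State { AVAILABLE, UNAVAILABLE }

    private final Path mount;
    private final Runnable onRecovered; // observer notified when the mount returns
    private volatile State state = State.AVAILABLE;

    MountedFileSystem(Path mount, Runnable onRecovered) {
        this.mount = mount;
        this.onRecovered = onRecovered;
    }

    State state() {
        return state;
    }

    // Appends data to a file under the mount. Returns false when the breaker
    // is open or the write fails, so the caller can keep the data and retry.
    boolean append(String fileName, byte[] data) {
        if (state == State.UNAVAILABLE) {
            return false;
        }
        try {
            Files.write(mount.resolve(fileName), data,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return true;
        } catch (IOException e) {
            state = State.UNAVAILABLE; // trip the breaker
            return false;
        }
    }

    // Meant to be called periodically by a background task.
    void probe() {
        if (state == State.UNAVAILABLE && Files.isDirectory(mount)) {
            state = State.AVAILABLE;
            onRecovered.run();
        }
    }
}
```

A background task, for example one scheduled on a ScheduledExecutorService, could call probe() at a fixed interval while the consumers hold their unwritten data.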

1. Write a method that detects whether the folder exists.
2. Call this method before the actual writing.
3. Create the 5 threads based on step 2. Once you detect that a file no longer exists, you seem to have no choice but to rewrite it. Of course, you don't need to re-read the input if all your content is already in memory (which requires a lot of memory).
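The existence check with a bounded wait can live in one shared helper, so the consumer threads do not each repeat the polling logic. A minimal sketch follows; the class name DirectoryWait, the method name and the polling interval are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

final class DirectoryWait {
    // Polls until the directory exists or the timeout elapses.
    // Returns true if the directory was seen, false if the wait timed out.
    static boolean awaitDirectory(Path dir, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!Files.isDirectory(dir)) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollInterval.toMillis());
        }
        return true;
    }
}
```

A consumer thread could call awaitDirectory(mount, Duration.ofMinutes(15), Duration.ofSeconds(10)) after a failed write and give up only when it returns false.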

Capture IO of spawned program in Java

I'm writing a program in Java which relies on a pre-compiled third party JAR residing in the same directory as mine. At runtime, my program checks if this file exists, then downloads it if it doesn't. Its main class is then executed. However, the spawned program prints a large amount of text directly to the console. Is there any way to 'capture' (and therefore hide) this output from stdout and return my own input directly from my parent application to stdin? I would ideally like the child program to reside inside the same JVM, so I would like to avoid any version of Runtime.exec().

Use the Java 1.5+ ProcessBuilder class and the Process class. Remember that the process will block if you don't handle its streams correctly.
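A minimal sketch of capturing a child process's output with ProcessBuilder. It assumes Java 9 or later (for InputStream.readAllBytes) and input small enough to fit in the pipe buffer; the class and method names are illustrative. Note that this runs the child in a separate JVM, which the question hoped to avoid.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class CapturedProcess {
    // Runs the command, sends the given text to its stdin, and returns
    // everything the child printed instead of letting it reach the console.
    static String run(List<String> command, String stdin)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        // Closing the stream signals end-of-input to the child.
        try (OutputStream toChild = process.getOutputStream()) {
            toChild.write(stdin.getBytes(StandardCharsets.UTF_8));
        }

        String output;
        try (InputStream fromChild = process.getInputStream()) {
            // Draining the stream keeps the child from blocking on a full pipe.
            output = new String(fromChild.readAllBytes(), StandardCharsets.UTF_8);
        }
        process.waitFor();
        return output;
    }
}
```

For large input or an interactive child, the writing and reading would need to happen on separate threads, otherwise both sides can end up waiting on each other.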

Detect java program running on linux machine

Ok, so I have a couple of Java programs that I'm running using a cron job on a Linux server. These jobs run every ten minutes or so, take about two minutes to run, and then exit. I need to add a way for the programs to detect, when they start up, whether there is already an instance of themselves running, and if so to exit without going any further. I'm really not sure of the best way to handle this and am hoping someone can offer some advice.
One approach I've considered is to run a command from the Java code that does some sort of ps command and looks through the results to see if the program is running. This seems pretty finicky and complex for something so small. Plus, I'm not all that knowledgeable about Linux and am not even sure of the best way to do that. If anyone has some better thoughts, please let me know. Or if that is the best way, I'd appreciate it if you could provide the Linux commands I'd need. Thanks.

If you have a writable /tmp directory you can use a lockfile.
When your Java program starts up, check for a file with a name unique to your application (e.g. "my-lock-file.lock") in the /tmp directory. If none exists, create one, and remove it when you're done. If one exists, just exit.
You can check the existence of a file with the .exists() method of the java.io.File class.
If your code needs to be portable, you can use System.getProperty("java.io.tmpdir") to get an appropriate temporary directory for the platform your code is running on.
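A minimal sketch of the lock-file check. It swaps the separate exists() test for Files.createFile, which checks and creates in one atomic step, so two instances starting at the same moment cannot both pass the check. The class and method names are assumptions.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SimpleLockFile {
    // Resolves a lock file name inside the platform's temporary directory.
    static Path defaultLock(String name) {
        return Paths.get(System.getProperty("java.io.tmpdir"), name);
    }

    // Returns true if this call created the lock file, false if it already
    // existed (another instance is assumed to be running).
    static boolean tryAcquire(Path lock) throws IOException {
        try {
            Files.createFile(lock);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Removes the lock file; intended for a finally block or shutdown hook.
    static void release(Path lock) throws IOException {
        Files.deleteIfExists(lock);
    }
}
```

If the program crashes before release runs, the file stays behind and blocks later runs until it is removed by hand or by a reboot that clears the temporary directory.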

You could look at JMX and the Attach API to query for running JVMs.
Or, as Andrew logvinov mentioned, by using a lock file.
If you are using Java WebStart, there's already native support for this.

Many programs solve this by creating a temporary file that points to their PID (often referred to as a "lock" file). The filename should encode all relevant information to distinguish this process from other processes that could legitimately run in parallel.
For example, if the process is bound to a user, the file name should contain the user name. If the process is bound to a machine, it should (also) contain the hostname. This is debatable if you put the file in a machine-bound temporary directory, but if you put it in a home directory, think of the case of multiple machines sharing a home via NFS.
The location of these files is typically /tmp. This is a great location, as /tmp is typically wiped during system boot, so no orphan files are left in case of a system crash. Another solution employed by some programs is to put the lock file in the user settings directory, if it is related to the settings. E.g. Mozilla Thunderbird has a file called /home/<username>/.thunderbird/<profilename>.default/lock.
The file should contain the PID of the process. The idea is simple: If the file contains the PID, it is easy to check whether this process is indeed still running. So if the process crashes, the file gets orphaned. The new process instance will check the PID in the file, see that it is not running any more, and ignore the file (overwrite).
Putting it all together, you could create a file like this:
/tmp/myawesomeservice-username-hostname-lock
With the content:
12345
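A minimal sketch of the PID check, assuming Java 9 or later for the ProcessHandle API. The class and method names are illustrative. The check-then-write sequence is not atomic, and a PID can be reused by an unrelated process, so this only covers the orphaned-file case described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class PidLockFile {
    // True if the lock file exists and names the PID of a live process.
    static boolean heldByLiveProcess(Path lock) throws IOException {
        if (!Files.exists(lock)) {
            return false;
        }
        String text = new String(Files.readAllBytes(lock), StandardCharsets.UTF_8).trim();
        try {
            long pid = Long.parseLong(text);
            return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
        } catch (NumberFormatException e) {
            return false; // unreadable content is treated as an orphaned file
        }
    }

    // Writes the current PID unless a live process already holds the lock.
    // An orphaned file left by a crashed instance is overwritten.
    static boolean tryAcquire(Path lock) throws IOException {
        if (heldByLiveProcess(lock)) {
            return false;
        }
        long pid = ProcessHandle.current().pid();
        Files.write(lock, Long.toString(pid).getBytes(StandardCharsets.UTF_8));
        return true;
    }
}
```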
