Issue with concurrent Ant calls to DITA-OT - Java

We have a multi-threaded application with a DITA-OT integration that is invoked through Ant from Java.
We have started to face an issue with multiple concurrent Ant calls to DITA-OT to run transformations: when two or more threads run the Ant call from Java, it randomly fails with an error reading the build_preprocess file.
It seems that while one thread is trying to read build_preprocess, another thread is deleting it; the build_preprocess file is generated in the folder DITA-OT\plugins\org.dita.base.
Is there a way to fix the issue, or to have DITA-OT support concurrent requests to run transformations?

This problem:
Failed to read job file: Content is not allowed in trailing section.
might occur if the same temporary files folder is used by two parallel processes.
So make sure the "dita.temp.dir" and "output.dir" parameters are set to distinct values for each parallel process, so that no two processes share a temporary files folder or an output folder.
https://www.dita-ot.org/dev/parameters/parameters-base.html#ariaid-title1
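As a minimal sketch of that advice, the helper below builds a separate "dita.temp.dir" and "output.dir" for every transformation run. The class name `DitaRunParams`, the method `forNewRun`, and the `dita-run-` folder prefix are hypothetical names chosen for illustration; only the two parameter names come from the answer above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: produces a distinct set of DITA-OT parameters per run.
final class DitaRunParams {

    // baseDir is an assumed parent folder for all runs.
    static Map<String, String> forNewRun(Path baseDir) throws IOException {
        Files.createDirectories(baseDir);
        // createTempDirectory picks a name that does not exist yet, so two
        // threads calling this at the same time get different run folders.
        Path runDir = Files.createTempDirectory(baseDir, "dita-run-");
        Path tempDir = Files.createDirectories(runDir.resolve("temp"));
        Path outDir = Files.createDirectories(runDir.resolve("out"));

        Map<String, String> params = new LinkedHashMap<>();
        params.put("dita.temp.dir", tempDir.toString());
        params.put("output.dir", outDir.toString());
        return params;
    }
}
```

Each thread would call `forNewRun` once per transformation and pass the two entries to its own Ant invocation as properties, in whatever way the integration already sets DITA-OT parameters.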

Related

Apache POI gets java.io.IOException on tmp dir in multithread

I have a Java application that receives requests to create an XLSX file.
The application is multi-threaded, which means that 5 users can run a report simultaneously.
My issue is that when the report is huge and 5 users create reports at the same time, I get this message: java.io.IOException: Could not create temporary directory '
This is probably caused by one of the 5 threads deleting the java.io.tmpdir directory, after which the other 4 threads fail.
How do I resolve that?
One of my suggested solutions is to give each thread a different java.io.tmpdir. Is that something that can be done?
One solution would be for each thread to add a unique prefix when it creates its temp directory, so that there is no concurrent modification of the same folder.
When implementing this, consider how many requests can be processed simultaneously; you cannot create an unlimited number of directories.
Another solution would be to use a thread pool and a queue that holds requests when more arrive than you can process.
Or, if the content is similar across reports, you could create a template and change some of the data dynamically, so that each request only needs to clone the template.
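The thread pool and queue suggestion can be sketched roughly as follows. `ReportPool` and `runAll` are hypothetical names, and the choice of `CallerRunsPolicy` (the submitting thread runs the job itself when the queue is full) is an assumption about how overflow should be handled, not something stated in the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: at most `workers` reports are generated at once,
// further requests wait in a bounded queue.
final class ReportPool {
    private final ThreadPoolExecutor executor;

    ReportPool(int workers, int queueCapacity) {
        executor = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                // Assumed overflow policy: when the queue is full, the
                // submitting thread runs the job, which slows down producers.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits all jobs and waits for their results, in submission order.
    <T> List<T> runAll(List<Callable<T>> jobs) {
        List<Future<T>> futures = new ArrayList<>();
        for (Callable<T> job : jobs) {
            futures.add(executor.submit(job));
        }
        List<T> results = new ArrayList<>();
        for (Future<T> future : futures) {
            try {
                results.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return results;
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With a pool like this, the number of temp directories in use at any moment is bounded by the number of workers, which addresses the concern about creating too many directories.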
I would first check whether the methods that write those .xlsx files are thread-safe.
Your threads may also be racing to write the same files concurrently.

Is file creation process-safe among different processes at the OS level (Ubuntu)?

I have two Java applications that rely on a file-existence check to manage concurrency: one application waits until a file is deleted and then creates a file itself. If this is not process-safe, my application fails.
The pseudocode:
if file exists:
    do something with it
This is not safe under concurrency, because nothing ensures that the file does not get deleted between the first and the second line.
The safest way would be to use a FileLock. If you are planning to react to file creation/deletion events on Linux, I'd recommend an inotify-based solution.
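A minimal sketch of the FileLock suggestion, using `java.nio.channels.FileChannel.tryLock()`. The class and method names are hypothetical, and the sketch assumes both applications agree to lock the same dedicated lock file before touching the shared file.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: runs an action only while holding an exclusive lock.
final class LockedFileAccess {

    // Returns true if the lock was obtained and the action ran,
    // false if another process currently holds the lock.
    static boolean runIfLockAcquired(Path lockFile, Runnable action) throws IOException {
        try (FileChannel channel = FileChannel.open(
                     lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             // tryLock returns null when another process holds the lock;
             // try-with-resources skips close() for a null resource.
             FileLock lock = channel.tryLock()) {
            if (lock == null) {
                return false;
            }
            action.run();
            return true;
        }
    }
}
```

Note that a `FileLock` is held on behalf of the whole JVM, so it coordinates separate processes rather than threads inside one process; a second `tryLock()` on the same file from the same JVM throws `OverlappingFileLockException`. On Linux such locks are typically advisory, so they only help if every participating process takes the lock.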

Synchronizing process execution in a cluster with 2 nodes in Java

I have a cluster with 2 nodes and a shared file system. Each of these nodes has a Java process that runs periodically. That process accesses the file system, handles some files, and deletes those files after processing.
The problem here is that only one of the scheduled processes should access the files. The other process should skip its execution if the first process is running.
My first attempt to solve this issue was to create a hidden file, .lock. When the first process starts its execution, it should move the file into another folder and start handling files. When the other scheduled process starts its execution, it first checks if the .lock file is present, and if it isn't, the process skips the execution. When the first process finishes, it moves the .lock file back to its original folder. I was using the Files.move() method with the ATOMIC_MOVE option, but after a certain amount of time I got unexpected behaviour.
My second attempt was to use a distributed lock such as Hazelcast. I did some tests and it seems OK, but this solution seems a bit complicated for a task that is this simple.
My question is: is there any other smarter/simpler solution for this problem, or is my only option to use Hazelcast? How would you solve this issue?

Run single JAR simultaneously on Sun Grid Engine

At my university there is a Sun Grid Engine on which I need to perform some tests. These tests are written in Java, and therefore I have created a JAR file which starts the tests simply by being executed. The test reads in a file, performs some computation on it, and at the end writes out a txt file with some results. However, every test has different parameters, which I pass in through the main method of the JAR. After the file is read in, the parameters lead to a different output.
Now I wonder, is this possible to accomplish? Can I run the same JAR multiple times, knowing that they all need to read in the same (so just one) file?
Yes. It is possible. Having multiple processes reading the same file is not a problem, even if those processes are not on the same physical machine.
However, make sure you have a different output file per process.

Concurrency while reading files from file system

We have an application that reads files from a particular folder, processes them, and copies them (with some business logic) to another folder.
The problem is that when there is a very large number of files to process, a single instance of the application or a single thread is no longer enough to process them.
One approach we have for this is to start multiple instances of the application (I feel something is wrong with this approach; suggest an alternative if there is one).
Whether we spawn threads or start multiple instances of the application, care should be taken that once a thread reads a file and starts processing it, no other thread picks it up.
We are trying to achieve this with a database table that lists the file names in the folder: when a thread first reads the table for a file name, we change the status to in-process or completed and pessimistically lock the table so that other threads cannot read it.
Is there any better solution to the problem?
You can use most of your existing implementation as the front-end processor to feed file streams to worker threads that you can start/stop as demand dictates. Only the front-end thread opens files, so there is no possibility of one worker interfering with another.
EDIT: Added the word 'no' as it changes the meaning quite a bit...
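The front-end idea from this answer could be sketched as below. This is one possible reading of it: a single dispatching thread lists the folder and hands each path to exactly one worker task, rather than passing open streams. `FileDispatcher`, `dispatch`, and the one-hour wait are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical front-end: only this code enumerates the input folder,
// so no two workers can pick up the same file.
final class FileDispatcher {

    // Hands every regular file in inputDir to one worker task and
    // returns the number of files dispatched.
    static int dispatch(Path inputDir, int workers, Consumer<Path> handler)
            throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        int count = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    // Each path is submitted exactly once.
                    pool.submit(() -> handler.accept(file));
                    count++;
                }
            }
        } finally {
            pool.shutdown();
        }
        // Assumed upper bound on how long to wait for the workers.
        pool.awaitTermination(1, TimeUnit.HOURS);
        return count;
    }
}
```

Because the assignment of files to workers happens in one place, this variant needs no database table or lock, but it only covers a single application instance.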
Also have a look at JDK 7. It has a new file I/O API and a fork/join framework, which might help.
Take a look at Apache Camel (http://camel.apache.org), and its File component (http://camel.apache.org/file2.html). Using Camel allows you to very easily define a set of processing instructions to consume files in a directory atomically, and also to configure a thread pool to deal with multiple files at the same time. Camel in Action's a great book to get you started.
What you describe reminds me of the classical style of development on UNIX.
In this classical style, you would move a file to a work-in-progress directory so that other processes do not pick it up. In general, you could use one directory per processing state and then move files from state to state.
This works essentially because file moves are atomic (at least under Unix systems and NTFS).
What is nice about this approach is that it is pretty easy to handle problematic situations like crashes, and it automatically comes with a management interface everyone is familiar with (the filesystem GUI, ls, Windows Explorer, ...).
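The move-to-claim step described above might look like this in Java. `FileClaimer` and `claim` are hypothetical names. The sketch assumes both directories are on the same file system, since `ATOMIC_MOVE` can fail with `AtomicMoveNotSupportedException` otherwise, and that file names are unique, since an atomic move may replace an existing target.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical helper: a worker "claims" a file by moving it into a
// work-in-progress directory before processing it.
final class FileClaimer {

    // Returns the new location if this caller won the file,
    // or an empty Optional if another worker moved it first.
    static Optional<Path> claim(Path file, Path inProgressDir) throws IOException {
        Files.createDirectories(inProgressDir);
        Path target = inProgressDir.resolve(file.getFileName());
        try {
            return Optional.of(Files.move(file, target, StandardCopyOption.ATOMIC_MOVE));
        } catch (NoSuchFileException e) {
            // The source is gone, so someone else already claimed it.
            return Optional.empty();
        }
    }
}
```

After a crash, whatever is left in the work-in-progress directory shows which files were being handled, and recovery can be as simple as moving them back to the input directory.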
