I have three servers and I schedule one task using an application-level WorkManager. At any time only one node processes this task. Now I want to run this task in parallel, i.e. in three threads. Parallelizing the task within one server/JVM is easy, but I cannot find a way to schedule a task/work unit on a remote JVM. For example: the task is divided into 3 sub-tasks and all 3 JVMs run them in parallel.
I tried creating a Global WorkManager and targeting another server (Server2). I ran the main job on Server1 and scheduled the work using the Global WorkManager, but that did not work: the work was still scheduled on Server1 only.
commonj provides a RemoteWorkItem interface, but I am not sure whether WebLogic ships an implementation of it. I am using WebLogic 10.3. https://docs.oracle.com/cd/E13222_01/wls/docs90/javadocs/commonj/work/RemoteWorkItem.html
Is there a way to do this with WorkManager, or do I have to go with a messaging solution?
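For context, scheduling work through a CommonJ WorkManager looks roughly like the sketch below; the JNDI name and the Work body here are illustrative, not the actual code. Note that schedule() hands the Work to the thread pool of the server it runs on; by itself it does not dispatch the work to another JVM in the cluster.

```java
import javax.naming.InitialContext;
import commonj.work.Work;
import commonj.work.WorkItem;
import commonj.work.WorkManager;

// Illustrative sketch only; the JNDI name and the Work body are made up.
public class SubTaskScheduler {

    public void scheduleSubTask() throws Exception {
        InitialContext ctx = new InitialContext();
        WorkManager wm = (WorkManager) ctx.lookup("java:comp/env/wm/MyWorkManager");

        // schedule() runs the Work on this server's thread pool;
        // it does not fan the work out to other cluster members.
        WorkItem item = wm.schedule(new Work() {
            public void run()         { /* process one sub-task */ }
            public boolean isDaemon() { return false; }
            public void release()     { /* called if the container wants to cancel the work */ }
        });
    }
}
```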
Work Managers are used to constrain how many work requests a particular WLS instance will execute at once. They're designed more for server stability than for forcing distributed workloads, as evidenced by the fact that you can (and should) configure work managers even on single-node clusters.
I think you'll have to focus more on how the work items are distributed to each server in the cluster; the real solution is load balancing within the constraints of the work managers.
Related
I wrote some code using an ExecutorService in Java. I create 10 worker threads to process rows fetched from the database; each thread is assigned one result row. This approach works fine when the application is deployed and running on a single instance/node.
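A simplified sketch of that setup (the row type and the per-row processing are placeholders, not the original code):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Placeholder names; the real row type and processing logic differ.
public class RowProcessor {

    public void processRows(List<String[]> rows) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(10); // 10 worker threads
        for (String[] row : rows) {
            pool.submit(() -> process(row)); // one task per fetched row
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private void process(String[] row) {
        // per-row calculation goes here
    }
}
```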
Can anyone suggest how this will behave when my application is deployed on multiple nodes / in a cluster?
Do I have to take care of any part of the code before deploying into a cluster?
Edit (04/12/15): Any more suggestions?
You should consider the overhead of each task. Unless the tasks are of at least moderate size, you might want to batch them.
In a distributed context the overhead is much higher, so you are even more likely to need to batch the work.
You will also need a framework, so the exact considerations will depend on the framework you choose.
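For example, instead of submitting one task per row, you could submit one task per batch of rows, so the per-task (or per-remote-call) overhead is paid once per batch. A rough sketch; the batch size and row type are arbitrary choices here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;

// Rough sketch: group rows into batches so the submission overhead is paid per batch.
public class BatchSubmitter {

    public static void submitInBatches(ExecutorService pool, List<String[]> rows, int batchSize) {
        for (int i = 0; i < rows.size(); i += batchSize) {
            List<String[]> batch =
                    new ArrayList<>(rows.subList(i, Math.min(i + batchSize, rows.size())));
            pool.submit(() -> {
                for (String[] row : batch) {
                    // process one row
                }
            });
        }
    }
}
```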
I'm developing an application that needs some background jobs, for example sending emails for pending alerts. In a standalone configuration the jobs are configured and working fine with the Spring scheduler and scheduled-tasks.
But I don't know how to keep them synchronized in a clustered, high-availability JBoss environment. The main problem is to prevent jobs on different nodes from running at the same time.
I've read this about Quartz:
http://quartz-scheduler.org/documentation/quartz-2.x/configuration/ConfigJDBCJobStoreClustering
But it's not recommended in a high-availability scenario:
Never run clustering on separate machines, unless their clocks are synchronized using some form of time-sync service (daemon) that runs very regularly (the clocks must be within a second of each other). See http://www.boulder.nist.gov/timefreq/service/its.htm if you are unfamiliar with how to do this.
For now I have worked around the synchronization problem with a self-made blocking system (Why my pessimistic Locking in JPA with Oracle is not working). But I would like to know whether JBoss provides a solution for this surely common problem.
You can try an HA Singleton, which is an EJB singleton configured to run on only one node in the cluster. That singleton can then use the EJB Timer Service to schedule your jobs. See the documentation about the HA singleton: https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Application_Platform/6/html/Development_Guide/Implement_an_HA_Singleton.html
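A minimal sketch of what the singleton bean itself might look like, assuming a calendar-based EJB timer; the class name and schedule are made up, and the HA singleton wiring itself follows the linked documentation:

```java
import javax.ejb.Schedule;
import javax.ejb.Singleton;
import javax.ejb.Startup;

// Hypothetical example: with the HA singleton configuration from the linked guide,
// only the node that currently owns the singleton fires this timer.
@Singleton
@Startup
public class AlertMailJob {

    // Runs every day at 01:00 on the owning node only.
    @Schedule(hour = "1", minute = "0", persistent = false)
    public void sendPendingAlerts() {
        // look up pending alerts and send the e-mails
    }
}
```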
I am working on a scheduled job that will run at a certain interval (e.g. once a day at 1pm), scheduled through cron. I am working with Java and Spring.
Writing the scheduled job is easy enough: grab a list of people matching certain criteria from the DB, then for each person do some calculation and trigger a message.
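Roughly like this (a simplified sketch, assuming Spring's cron-style @Scheduled; the names are not the real ones):

```java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Simplified shape of the job; the bean name and the steps are placeholders.
@Component
public class DailyPersonJob {

    // Fires at 13:00 every day on whichever node this bean is running.
    @Scheduled(cron = "0 0 13 * * *")
    public void run() {
        // 1. load the people matching the criteria from the database
        // 2. run the calculation for each person
        // 3. trigger the message
    }
}
```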
I am working in a single-node environment locally and in testing; however, when we go to production it will be a multi-node environment (with a load balancer, etc.). My concern is: how would a multi-node environment affect the scheduled job?
My guess is I could (or very likely would) end up triggering duplicate messages:
Machine 1: Grab list of people, do calculation
Machine 2: Grab list of people, do calculation
Machine 1: Trigger message
Machine 2: Trigger message
Is my guess correct?
What would be the recommended solution to avoid this issue? Do I need to build a master/slave distributed system to manage the multi-node environment?
If you have, for example, three Tomcat instances load-balanced behind Apache, each running your application, then you will have three independent triggers and your job will run three times. You won't really have a multi-node environment with distributed job execution unless some mechanism for distributing the parts of the job is in place.
If you haven't looked at this project yet, take a peek at Spring XD. It handles Spring Batch Jobs and can be run in distributed mode.
I'm working on an application that uses Quartz for scheduling jobs. The jobs to be scheduled are created programmatically by reading a properties file. My question is: if I have a cluster of several nodes, which of them should create the schedules programmatically? Only one of them? Or all of them?
I have used Quartz in a web app where users, among other things, could create Quartz jobs that performed certain tasks.
We had no problems in that app provided that at least the job names were different for each job. You can also have different group names, and if I remember correctly the job group + job name combination forms a job key.
Anyway, we had no problem creating and running the jobs from different nodes, but Quartz at the time (some 6 months ago; I do not believe this has changed, but I am not sure) did not offer the possibility of stopping jobs across the cluster; it could only stop jobs on the node where the stop command was executed.
If instead you just want to create a fixed number of jobs when the application starts, you had better delegate that task to one of the nodes, as the job names/groups will be read from the same properties file on each node and conflicts will arise.
Have you tried creating them on all of the nodes? I think you would get conflicts because of duplicate names.
So I think one of the members should create the schedules during startup.
You should have only one system scheduling jobs for the cluster if they are predefined in properties as you say. If all of the systems did it, you would needlessly recreate the jobs and might put them in a weird state if every server created or deleted the same jobs and triggers.
You could simply deploy the properties for the jobs to only one server, so only that server would try to create them.
You could make a separate app that has the purpose of scheduling the jobs and only run it once.
If these are web servers, you could expose a simple, secured REST API that triggers the scheduling process. Then you could write an automated script that calls the API and kicks off the scheduling of jobs as part of a deployment, or whenever else you desire. If you have multiple servers behind a load balancer, the request should go to only one server, which would schedule the jobs, and Quartz would save them to the database-backed JobStore. The other nodes in the cluster would pick them up the next time they refresh from the database.
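For the programmatic creation itself, a common Quartz 2.x pattern is to check whether the job already exists before adding it, which also keeps the creation idempotent if more than one node happens to run it. A sketch only; the job class, names and cron expression are made up:

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

// Sketch only: job class, names and schedule are illustrative.
public class JobBootstrap {

    public void createJobs() throws Exception {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobKey key = new JobKey("sendAlerts", "maintenance"); // group + name form the job key
        if (!scheduler.checkExists(key)) {
            JobDetail job = JobBuilder.newJob(SendAlertsJob.class)
                    .withIdentity(key)
                    .build();
            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("sendAlertsTrigger", "maintenance")
                    .withSchedule(CronScheduleBuilder.cronSchedule("0 0 1 * * ?"))
                    .build();
            // With a JDBC JobStore, the job and trigger are persisted and become
            // visible to the other nodes in the cluster.
            scheduler.scheduleJob(job, trigger);
        }
    }

    public static class SendAlertsJob implements Job {
        public void execute(JobExecutionContext ctx) {
            // the actual job work goes here
        }
    }
}
```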
I need to simulate a system in Java where there is a master and a number of workers. Each worker may process its data locally, but needs to communicate with the master to read data from other nodes. The workers should run concurrently.
How can I simulate this system? Do I need to start a new thread for every running worker and a master thread? Is there another way?
If you want to do it on a single machine then I see two options:
1. Create a master application and a worker application (make sure that you can run multiple instances of them). Run one master application and multiple instances of the worker application.
2. Create a single application in which you have a single instance of your Master class and multiple instances of your Worker class. Let the Master run in a separate thread and let each Worker run in its own thread too.
So the first option is to run each "node" (master or worker) as a separate process, while the second option is to run each "node" as a separate thread.
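A minimal sketch of the second option, with the master in its own thread, each worker in its own thread, and a queue as the worker-to-master channel (the class names and the request protocol are assumptions, not a fixed design):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of option 2: one JVM, one master thread, several worker threads.
public class MasterWorkerSimulation {

    public static void main(String[] args) throws InterruptedException {
        int workerCount = 3;
        BlockingQueue<String> requests = new LinkedBlockingQueue<>(); // workers -> master

        Thread master = new Thread(() -> {
            try {
                while (true) {
                    String request = requests.take();
                    System.out.println("Master serving: " + request);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }, "master");
        master.start();

        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            final int id = i;
            workers.submit(() -> {
                // process local data, then ask the master for data held by other nodes
                requests.add("worker-" + id + " requests remote data");
            });
        }

        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
        master.interrupt();
    }
}
```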
This is a pretty generic question which is open to many architectural solutions. I'd like to present the one I've used in the past. I used RMI here for ease of remote calls.
All master and slave processes are RMI services. Both master and slaves are spawned using the RMI daemon (RMID), which has the bonus feature of bringing the services back up in case one goes down due to a JVM crash or any other "abnormal" reason. RMI services in general are built around an interface which defines the contract between the client and the server. Say, for example, that I have to write a service which solves an equation.
We start off by creating two services: master and slave. Both services implement/expose the same interface to the client. The only difference is that the "master" service is solely responsible for "forking" work out to the different slave agents, collecting the responses (re-arranging them if required) and returning the result to the client. The master is a simple RMI service which accepts the list of "equations" and splits them across the different slaves. Obviously, the master holds an RMI handle to every slave it governs (i.e. communication between master and slaves is again an RMI invocation).
Here again, there are many possibilities for configuring how the master "looks up" the slaves, but I'm sure you can work that out quite easily. This architecture has the advantage of a grid-based solution: you are not limited to a single process doing all the work, and hence you gain resiliency and freedom from monolithic heap sizes for your JVM process.
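The shared contract can be as small as a single remote interface that both the master and the slaves implement (a hypothetical example; the interface and method names are not from the original design):

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;

// Hypothetical contract: the client calls the master, the master calls its slaves,
// all through this same interface.
public interface EquationSolver extends Remote {
    List<Double> solve(List<String> equations) throws RemoteException;
}
```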
I haven't really used them, but Rio and Jini are worth looking into if you want to build distributed systems in Java.