I am integrating Spring Batch into an existing Spring webapp and have created three simple jobs as a learning process. Then I created a set of ancestor classes for jobs/readers/writers that can be used to migrate our old batch jobs easily.
We intend to run the jobs asynchronously from within Tomcat as separate threads. I have created a UI to manage and start/stop them.
I have a class annotated with @Configuration and @EnableBatchProcessing that sets up all the global classes for batch mode; it also scans the base package for all our jobs that extend a specific ancestor class. For each one it appends "Factory" to the class name to find the corresponding *Factory class, calls getBean for that class, and uses it to register the job:
Job job = jobFactory.createInstanceForRegistration();
jobRegistry.register(new ReferenceJobFactory(job));
The createInstanceForRegistration method uses applicationContext.getBean to get the instances of the job/reader/writer classes, puts the job and steps together, and then finally:
((SimpleJob)job).setSteps(steps);
return job;
In the UI, the jobs are listed and I should be able to start them. But when I do:
batchAuditId = jobOperator.start(jobName, parameters);
Suddenly, an NPE is thrown from within AbstractJob.execute, at line 298:
jobParametersValidator.validate(execution.getJobParameters());
because jobParametersValidator is null - despite the fact that it is initialized with a default in the class itself:
private JobParametersValidator jobParametersValidator =
new DefaultJobParametersValidator();
More to the point, the job class constructor has code that sets several items, including the override of the validator:
super(BATCH_NAME, BATCH_TYPE_CODE, BATCH_FUNCTION, BATCH_PROCESS_AREA);
setLogger(LoggerFactory.getLogger(CustomJob.class));
getLogger().debug("constructed: {}", this.toString());
setParametersRequired(Boolean.TRUE); // this job requires parameters
setRestartable(Boolean.TRUE);
setJobParametersValidator(new CustomJobParametersValidatorImpl());
During deployment, when Spring creates all the classes, I can step through the code and the override validator is created and set for that job.
Yet when jobOperator.start executes, the NPE is thrown.
I can find no reason for this. It is almost like jobOperator.start somehow creates a new instance of the job class in a way that doesn't use the existing instance or my custom constructor.
Can anyone explain what is going on?
Related
So I have a technical challenge I need help with.
A large scale project is using a Quartz scheduler to schedule a job to run every night at 9.
The job that is scheduled, however, needs to read values from property files, get some beans using autowiring, etc.
When I used the @Autowired and @Value annotations, I found the values to be null.
The issue is that Quartz creates JobDetail objects using newJob() outside the Spring container, as can be seen in the code below.
JobKey jobKey = new JobKey("NightJob", "9-PM Job");
JobDetail jobDetail = newJob(NightJob.class).withIdentity(jobKey)
.usingJobData("Job-Id", "1")
.build();
The jobDetail object, which wraps NightJob, thus cannot access property files or beans using Spring.
Here is my NightJob class:
public class NightJob implements Job {
    // @Value to read from property file; here
    // @Autowired to use some beans; here
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
    }
}
I scanned Stack Overflow and shortlisted several solutions. I also read through the comments and listed the top counter-comments.
Suggestion 1: Get rid of Quartz and use Spring Batch due to its good integration with Spring Boot.
Counter argument 1: Spring Batch is overkill for simple tasks. Use @Scheduled.
Suggestion 2: Use @Scheduled annotations and cron expressions provided by Spring.
Counter argument 2: Your application will not be future-ready if you remove Quartz. Complex scheduling may be required in the future.
Suggestion 3: Use the Spring interface ApplicationContextAware.
Counter argument 3: Lots of additional code. Defeats the simple and easy concept of Spring Boot.
Is there a simpler way in Spring Boot to access property file values and autowire objects in a class that implements a Quartz job (in this situation, the NightJob class)?
As written in the comments, Spring supports bean injection into Quartz jobs by providing setter methods: https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-quartz.html
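Why a plain newJob(NightJob.class) leaves fields null, and why a Spring-aware job factory fixes it, can be sketched without Quartz or Spring on the classpath. In this minimal stand-in every name (SketchJob, NightJobSketch, SetterInjectingFactory) is hypothetical and belongs to neither library; the only point is that whoever instantiates the job class must also populate it.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical stand-in for a scheduler job interface (not the Quartz API).
interface SketchJob {
    String execute();
}

// A job whose dependency must arrive through a setter.
class NightJobSketch implements SketchJob {
    private String reportPath; // stays null unless something calls the setter

    public void setReportPath(String reportPath) {
        this.reportPath = reportPath;
    }

    @Override
    public String execute() {
        return "writing report to " + reportPath;
    }
}

// Hypothetical factory: instantiates the job class reflectively, then calls
// every single-argument setter for which a matching value is available.
class SetterInjectingFactory {
    private final Map<String, Object> values;

    SetterInjectingFactory(Map<String, Object> values) {
        this.values = values;
    }

    <T extends SketchJob> T newJob(Class<T> jobClass) {
        try {
            T job = jobClass.getDeclaredConstructor().newInstance();
            for (Method m : jobClass.getMethods()) {
                String name = m.getName();
                if (name.startsWith("set") && name.length() > 3 && m.getParameterCount() == 1) {
                    // setReportPath -> reportPath
                    String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
                    Object value = values.get(property);
                    if (value != null && m.getParameterTypes()[0].isInstance(value)) {
                        m.invoke(job, value);
                    }
                }
            }
            return job;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException("Cannot create job " + jobClass.getName(), ex);
        }
    }
}
```

A job created with new NightJobSketch() reports a null path, while one created through the factory sees the injected value. In a real application the equivalent wiring is provided by the Spring integration described at the link above, not by hand-written reflection.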
I am developing a task scheduler that triggers tasks in parallel using an executor service. I want the scheduler to be generic, so that adding a new type of task requires little or no change to the scheduler code base.
My tasks (mostly in client packages) can be of any type; each one simply accepts a particular request and executes.
To do this I am exposing an interface (say ITask) which must be implemented by the tasks (which live in some other app/package) and which has a single method, for example:
doTask(IRequest request);
So the use case is: any client who wants to trigger their job using my scheduler framework/API just needs to add my package as a dependency, and the rest (getting the list of task classes which implement ITask > scheduling them using the executor service > retrying failed tasks > finally providing the status of all tasks) should be taken care of by my scheduler API.
What is the optimal way to do this? I am thinking of a solution similar to how JUnit finds the @Test methods (based on annotation) of any client that adds the JUnit dependency to its package; similarly, I want to get classes based on an interface.
You have tagged this question with Spring, but you don't mention anywhere in the question that you are using the Spring framework. This answer makes a few assumptions:
You are using Spring Framework
The implementations of your desired interface have been configured as Spring Beans
If you get access to the ApplicationContext (see the interface ApplicationContextAware), you can use it to look up Spring beans of a certain type. It would look something like this:
Map<String, ITask> beans = appContext.getBeansOfType(ITask.class);
This method returns a map with the key being the bean identifier and the value being the instance of the bean itself. From there, you could loop through the values and add them to your job scheduler.
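The loop described above, combined with the scheduler duties listed in the question (run in parallel, retry failures, report status), might look like the following sketch. It uses only the JDK: the Map<String, ITask> parameter stands in for the result of getBeansOfType, and the shapes of ITask and IRequest as well as the names TaskRunner, Status and maxAttempts are assumptions, since the question does not show them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Assumed shapes: the question names ITask and IRequest but does not show them.
interface IRequest {}

interface ITask {
    void doTask(IRequest request) throws Exception;
}

enum Status { SUCCEEDED, FAILED }

class TaskRunner {
    // Runs every task in parallel, retrying each up to maxAttempts times,
    // and returns the final status keyed by bean name.
    static Map<String, Status> runAll(Map<String, ITask> tasks, IRequest request,
                                      int threads, int maxAttempts) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Status>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, ITask> e : tasks.entrySet()) {
                ITask task = e.getValue();
                Callable<Status> call = () -> attempt(task, request, maxAttempts);
                pending.put(e.getKey(), pool.submit(call));
            }
            Map<String, Status> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Status>> e : pending.entrySet()) {
                try {
                    result.put(e.getKey(), e.getValue().get());
                } catch (Exception ex) {
                    result.put(e.getKey(), Status.FAILED);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    // Tries the task until it succeeds or the attempts are used up.
    private static Status attempt(ITask task, IRequest request, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                task.doTask(request);
                return Status.SUCCEEDED;
            } catch (Exception ex) {
                // swallow and retry; a real scheduler would log this
            }
        }
        return Status.FAILED;
    }
}
```

Because the runner only depends on the map, adding a new ITask implementation requires no change to the scheduler code, which is the property the question asks for.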
Alternatively
If you do not want the requirement of having to configure each ITask implementation as a Spring bean, you could use Spring's ClassPathScanningCandidateComponentProvider (a mouthful, I know).
This is a nifty tool that allows you to scan base packages to find bean "candidates". However, in your case, you could use it to find ITask candidates. Clients to your library could configure the base scan packages which you would use to scan:
private String configuredListOfBasePackages;

public void someMethod() {
    ClassPathScanningCandidateComponentProvider scanner = new ClassPathScanningCandidateComponentProvider(false);
    scanner.addIncludeFilter(new AssignableTypeFilter(ITask.class));
    Set<BeanDefinition> iTaskCandidates = scanner.findCandidateComponents(configuredListOfBasePackages);
    // do stuff with the bean definitions
}
This method is obviously a bit more dangerous, as it requires you to be able to construct a new instance of every candidate you find. As such, this is not the ideal solution.
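"Doing stuff" with the bean definitions usually means instantiating the candidates yourself. Each BeanDefinition exposes its class name through getBeanClassName(); the sketch below takes such names as plain strings so that it runs without Spring, and it assumes every usable candidate has a no-argument constructor, which is exactly the risk described above. TaskInstantiator and the two sample tasks are hypothetical names, and the shape of ITask is assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shapes: the question names ITask and IRequest but does not show them.
interface IRequest {}

interface ITask {
    void doTask(IRequest request);
}

class TaskInstantiator {
    // Turns class names (e.g. collected from BeanDefinition.getBeanClassName())
    // into ITask instances. Names that cannot be instantiated are added to
    // 'failures' instead of aborting the whole scan.
    static List<ITask> instantiate(List<String> classNames, List<String> failures) {
        List<ITask> tasks = new ArrayList<>();
        for (String className : classNames) {
            try {
                Class<? extends ITask> type = Class.forName(className).asSubclass(ITask.class);
                tasks.add(type.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | ClassCastException | LinkageError ex) {
                failures.add(className);
            }
        }
        return tasks;
    }
}

// Sample candidate with the required no-argument constructor.
class PrintTask implements ITask {
    @Override
    public void doTask(IRequest request) {
        System.out.println("PrintTask ran");
    }
}

// Sample candidate that cannot be built reflectively without arguments.
class NeedsArgTask implements ITask {
    NeedsArgTask(String arg) {}

    @Override
    public void doTask(IRequest request) {}
}
```

Collecting failures rather than throwing lets the scheduler report which client classes were skipped and why they need a different construction path.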
How can I run a job configured using Spring-Batch right after application startup?
Currently I specify an exact time using a cron expression, but that requires changing the cron every time I restart the application.
I have a JobRegistry, a JobLauncher and a Job.
I execute the job as follows:
@Scheduled(cron = "${my.cron}")
public void launch() {
    launcher.run(job, params);
}
Checking around the Spring code, I found SmartLifecycle:
An extension of the Lifecycle interface for those objects that require
to be started upon ApplicationContext refresh and/or shutdown in a
particular order. The isAutoStartup() return value indicates whether
this object should be started at the time of a context refresh.
Try creating a custom bean that implements SmartLifecycle with autoStartup enabled; when this bean's start method is invoked, launch your job.
A few options that I can think of for where to put your startup logic:
1. In a bean method annotated with @PostConstruct; reference here - http://docs.spring.io/spring/docs/current/spring-framework-reference/html/beans.html#beans-postconstruct-and-predestroy-annotations
2. By implementing an ApplicationListener, specifically for either ContextStartedEvent or ContextRefreshedEvent. Reference here - http://docs.spring.io/spring/docs/current/spring-framework-reference/html/beans.html#context-functionality-events
I'm looking for a lib that allows me to do the following:
1. Define a worker that will be invoked once at a specific time in the future (no need for re-scheduling / cron-like features), i.e. a Timer.
2. The worker should accept a context with some parameters / inputs.
3. Everything, including the worker, should be persisted in the DB (or a file).
4. The worker should be managed by Spring: Spring should instantiate the worker so it can be injected with dependencies.
5. Be able to create timers dynamically via an API and not just statically via Spring XML beans.
Nice to have:
Support for a cluster, i.e. several nodes that can each host a worker. Each job stored in the DB will cause the invocation of ONE worker on one of the nodes.
I've examined several alternatives none meets the requirements:
Quartz
When using org.springframework.scheduling.quartz.JobDetailBean, Quartz creates your worker instance (not Spring), so you can't get dependency injection (which would lead me to use a Service Locator, which I want to avoid).
When using org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean, you can't get a context: your worker exposes one public method that accepts no arguments. In addition, when using MethodInvokingJobDetailFactoryBean you can't use persistence (from the Javadoc):
Note: JobDetails created via this FactoryBean are not serializable and thus not suitable for persistent job stores. You need to implement your own Quartz Job as a thin wrapper for each case where you want a persistent job to delegate to a specific service method.
Spring's Timer and simple JDK Timers do not support the persistence / cluster features.
I know I can implement this myself using a DB and Spring (or even JDK) Timers, but I prefer to use a 3rd-party lib for that.
Any suggestions?
If you want to generate triggers/job details at runtime and still be able to use Spring DI on your beans, you can refer to this blog post; it shows how to use SpringBeanJobFactory in conjunction with ObjectFactoryCreatingFactoryBean to create Quartz triggering objects at runtime with Spring-injected beans.
For those interested in an alternative to Quartz, have a look at db-scheduler (https://github.com/kagkarlsson/db-scheduler). A persistent task/execution-schedule is kept in a single database table. It is guaranteed to be executed only once by a scheduler in the cluster.
1. Yes, see code example below.
2. Currently limited to a single string identifier with no format restriction. The scheduler will likely be extended in the future with better support for job-details/parameters.
3. The execution-time and context are persistent in the database. Binding a task-name to a worker is done when the Scheduler starts. The worker may be instantiated by Spring as long as it implements the ExecutionHandler interface.
4. See 3).
5. Yes, see code example below.
Code example:
private static void springWorkerExample(DataSource dataSource, MySpringWorker mySpringWorker) {
    // instantiate and start the scheduler somewhere in your application
    final Scheduler scheduler = Scheduler
            .create(dataSource)
            .threads(2)
            .build();
    scheduler.start();

    // define a task and a handler for that named task; MySpringWorker implements the ExecutionHandler interface
    final OneTimeTask oneTimeTask = ComposableTask.onetimeTask("my-onetime-task", mySpringWorker);

    // schedule a future execution for the task with a custom id (currently the only form of context supported)
    scheduler.scheduleForExecution(LocalDateTime.now().plusDays(1), oneTimeTask.instance("1001"));
}

public static class MySpringWorker implements ExecutionHandler {
    public MySpringWorker() {
        // could be instantiated by Spring
    }

    @Override
    public void execute(TaskInstance taskInstance, ExecutionContext executionContext) {
        // called when the execution-time is reached
        System.out.println("Executed task with id=" + taskInstance.getId());
    }
}
Your requirements 3 and 4 do not really make sense to me: how can you have the whole package (worker + work) serialized and have it wake up magically and do its work? Shouldn't something in your running system do this at the proper time? Shouldn't this be the worker in the first place?
My approach would be this: create a Timer that Spring can instantiate and inject dependencies to. This Timer would then load its work / tasks from persistent storage, schedule them for execution and execute them. Your class can be a wrapper around java.util.Timer and not deal with the scheduling stuff at all. You must implement the clustering-related logic yourself, so that only one Timer / Worker gets to execute the work / task.
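A minimal sketch of that approach, using only the JDK: a wrapper around java.util.Timer that loads pending tasks from a store on start-up and marks them done after execution. TaskStore, StoredTask and PersistentTimer are hypothetical names, the in-memory store stands in for a real database table, and the cluster-wide locking mentioned above is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// A persisted unit of work: when to run and the context to hand to the worker.
final class StoredTask {
    final String id;
    final Date runAt;
    final String context;

    StoredTask(String id, Date runAt, String context) {
        this.id = id;
        this.runAt = runAt;
        this.context = context;
    }
}

// Hypothetical persistence abstraction; a real one would be backed by a DB table.
interface TaskStore {
    void save(StoredTask task);
    List<StoredTask> findPending();
    void markDone(String id);
}

class InMemoryTaskStore implements TaskStore {
    private final Map<String, StoredTask> pending = new ConcurrentHashMap<>();

    public void save(StoredTask task) { pending.put(task.id, task); }
    public List<StoredTask> findPending() { return new ArrayList<>(pending.values()); }
    public void markDone(String id) { pending.remove(id); }
}

// Wrapper around java.util.Timer. The worker is an ordinary object, so a DI
// container could construct it and pass it in through the constructor.
class PersistentTimer {
    private final Timer timer = new Timer("persistent-timer", true);
    private final TaskStore store;
    private final Consumer<String> worker;

    PersistentTimer(TaskStore store, Consumer<String> worker) {
        this.store = store;
        this.worker = worker;
    }

    // Call once at start-up to re-schedule everything that survived a restart.
    void start() {
        for (StoredTask task : store.findPending()) {
            arm(task);
        }
    }

    // Persist first, then schedule, so a crash in between does not lose the task.
    void schedule(StoredTask task) {
        store.save(task);
        arm(task);
    }

    void stop() {
        timer.cancel();
    }

    private void arm(StoredTask task) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                worker.accept(task.context);
                store.markDone(task.id);
            }
        }, task.runAt);
    }
}
```

Tasks whose runAt is already in the past are executed as soon as the timer picks them up, which is how work that came due during downtime gets caught up. Note that an exception escaping the worker terminates a java.util.Timer thread, so a production version would need to catch and record failures.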
I want to setup my database with initial data programmatically. I want to populate my database for development runs, not for testing runs (it's easy). The product is built on top of Spring and JPA/Hibernate.
Developer checks out the project
Developer runs command/script to setup database with initial data
Developer starts application (server) and begins developing/testing
then:
Developer runs command/script to flush the database and set it up with new initial data because database structures or the initial data bundle were changed
What I want is to setup my environment by required parts in order to call my DAOs and insert new objects into database. I do not want to create initial data sets in raw SQL, XML, take dumps of database or whatever. I want to programmatically create objects and persist them in database as I would in normal application logic.
One way to accomplish this would be to start up my application normally and run a special servlet that does the initialization. But is that really the way to go? I would love to execute the initial data setup as a Maven task and I don't know how to do that if I take the servlet approach.
There is a somewhat similar question. I took a quick glance at the suggested DBUnit and Unitils, but they seem to be heavily focused on setting up testing environments, which is not what I want here. DBUnit does initial data population, but only using XML/CSV fixtures, which is not what I'm after here. Then, Maven has an SQL plugin, but I don't want to handle raw SQL. Maven also has a Hibernate plugin, but it seems to help only with Hibernate configuration and table schema creation (not with populating the DB with data).
How to do this?
Partial solution 2010-03-19
Suggested alternatives are:
Using unit tests to populate the database #2423663
Using ServletContextListener to gain control on web context startup #2424943 and #2423874
Using Spring ApplicationListener and Spring's Standard and Custom Events #2423874
I implemented this using Spring's ApplicationListener:
Class:
public class ApplicationContextListener implements ApplicationListener {
    public void onApplicationEvent(ApplicationEvent event) {
        if (event instanceof ContextRefreshedEvent) {
            // ...check if database is already populated; if not, populate it...
        }
    }
}
applicationContext.xml:
<bean id="applicationContextListener" class="my.namespaces.ApplicationContextListener" />
For some reason I couldn't get ContextStartedEvent to fire, so I chose ContextRefreshedEvent, which is fired on startup as well (I haven't bumped into other situations yet).
How do I flush the database? Currently, I simply remove the HSQLDB artifacts and a new schema gets generated on startup by Hibernate, so the DB is then also empty.
You can write a unit test to populate the database, using JPA and plain Java. This test would be called by Maven as part of the standard build lifecycle.
As a result, you would get a fully initialized database, using Maven, JPA and Java as requested.
The usual way to do this is to use a SQL script. You then run a shell script that populates the DB using your .sql file.
If you want to set up your DB programmatically during webapp startup, you can use a web context listener. During the initialization of your web context, a ServletContextListener can get access to your DAO (service layer, whatever), create your entities and persist them as you normally do in your Java code.
p.s. as a reference Servlet Life Cycle
If you use Spring, you should have a look at the Standard and Custom Events section of the Reference. That's a better way to implement a 'Spring Listener' that is aware of Spring's context (in case you need to retrieve your services from it).
You could create JPA entities in a pure Java class and persist them. This class could be invoked by a servlet but also have a main method and be invoked on the command line, by Maven (with the Exec Maven Plugin) or even wrapped as a Maven plugin.
But your final workflow is not clear (do you want the init to be part of the application startup or done during the build?) and requires some clarification.
I would use a Singleton bean for that:
import javax.annotation.PostConstruct;
import javax.ejb.Startup;
import javax.ejb.Singleton;

@Singleton
@Startup
public class InitData {
    @PostConstruct
    public void load() {
        // Load your data here.
    }
}
It depends on your DB. It is better to have a script to set up the DB.
In the aforementioned ServletContextListener or in a common startup place put all the forthcoming code
Define your data in an agreeable format - XML, JSON or even java serialization.
Check whether the initial data exists (or a flag indicating a successful initial import)
If it exists, skip. If it does not exist, get a new DAO (using WebApplicationContextUtils.getRequiredWebApplicationContext().getBean(..)) , iterate all predefined objects and persist them via the EntityManager in the database.
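Those steps can be sketched as a small idempotent seeder. It is plain Java so that it can be called from a listener, a main method or a build step; UserStore, SeedUser and InitialDataSeeder are hypothetical names, and the in-memory store stands in for a DAO backed by the EntityManager.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity used as seed data.
final class SeedUser {
    final String name;

    SeedUser(String name) {
        this.name = name;
    }
}

// Hypothetical DAO abstraction; a real one would delegate to the EntityManager.
interface UserStore {
    long count();
    void persist(SeedUser user);
}

class InMemoryUserStore implements UserStore {
    private final List<SeedUser> rows = new ArrayList<>();

    public long count() { return rows.size(); }
    public void persist(SeedUser user) { rows.add(user); }
}

class InitialDataSeeder {
    private final UserStore store;
    private final List<SeedUser> seedData;

    InitialDataSeeder(UserStore store, List<SeedUser> seedData) {
        this.store = store;
        this.seedData = seedData;
    }

    // Returns the number of rows inserted; 0 means the data was already there.
    int seedIfEmpty() {
        if (store.count() > 0) {
            return 0; // already populated, skip
        }
        for (SeedUser user : seedData) {
            store.persist(user);
        }
        return seedData.size();
    }
}
```

Calling seedIfEmpty() on every startup is safe because the second and later calls insert nothing. A row count is the simplest existence check; an explicit "import completed" flag, as suggested above, is more robust if an import can fail halfway.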
I'm having the same problem. I've tried using an init-method on the bean, but that runs on the raw bean without AOP and thus cannot use @Transactional. The same seems to go for @PostConstruct and the other bean lifecycle mechanisms.
Given that, I switched to ApplicationListener with ContextRefreshedEvent; however, in this case, @PersistenceContext is failing to get an entity manager:
javax.persistence.PersistenceException: org.hibernate.SessionException: Session is closed!
at org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException(AbstractEntityManagerImpl.java:630)
at org.hibernate.ejb.QueryImpl.getSingleResult(QueryImpl.java:108)
Using spring 2.0.8, jpa1, hibernate 3.0.5.
I'm tempted to create a non-spring managed entitymanagerfactory and do everything directly but fear that would interfere w/ the rest of the Spring managed entity and transaction manager.
I'm not sure if you can get away from using some SQL. This depends on whether your developers are starting with an empty database with no schema defined, or whether the tables are there but empty.
If you are starting with empty tables, then you could use a Java approach to generate the data. I'm not that familiar with Maven, but I assume you can create some task that would use your DAO classes to generate the data. You could probably even write it using a JVM-based scripting language like Groovy that would be able to use your DAO classes directly. You would have a similar task that would clear the data from the tables. Then your developers would just run these tasks on the command line or through their IDE as a manual step after checkout.
If you have a fresh database instance, then I think you will need to execute some SQL just to create the schema. You could technically do that by executing SQL calls with Hibernate, but that really doesn't seem worth it.
Found this ServletContextListener example by mkyong. Quoting the article:
You want to initialize the database connection pool before the web
application is start, is there a “main()” method in whole web
application?
This sounds to me like the right place where to have code to insert initial DB data.
I tested this approach to insert some initial data for my webapp; it works.
I found some interesting code in this repository: https://github.com/resilient-data-systems/fast-stateless-api-authentication
This works pretty neatly in my project:
@Component
@DependsOn({ "dataSource" })
public class SampleDataPopulator {
    private final static Logger log = LoggerFactory.getLogger(SampleDataPopulator.class);

    @Inject
    MyRepository myRepository;

    @PostConstruct
    public void populateSampleData() {
        MyItem item = new MyItem();
        myRepository.save(item);
        log.info("Populated DB with sample data");
    }
}
You can put a file called data.sql in src/main/resources, it will be read and executed automatically on startup. See this tutorial.
The other answers did not work for me.