Spring Batch - JpaPagingItemReader - Works in MySQL - Duplicates in PostgreSQL

Before the Spring Batch job runs, I have an import table that contains all the items that need importing into our system. At that point it is verified to contain only items that do not already exist in our system.
Next, I have a Spring Batch job that reads from this import table using a JpaPagingItemReader.
After the work is done, it writes to the database using an ItemWriter.
I run with both page size and chunk size set to 10000.
Now this works absolutely fine when running on MySQL InnoDB. I can even use multiple threads and everything works fine.
But now we are migrating to PostgreSQL, and the same batch job runs into a very strange problem.
What happens is that it tries to insert duplicates into our system. This will naturally be rejected by unique index constraints and an error is thrown.
Since the import table is verified to contain only non-existing items before the batch job starts, the only reason I can think of is that the JpaPagingItemReader reads some rows from the import table multiple times when I run on PostgreSQL. But why would it do that?
I have experimented with a lot of settings. Turning the chunk and page size down to around 100 only makes the import slower, but produces the same error. Running single-threaded instead of multi-threaded only makes the error happen slightly later.
So what on earth could be the reason for my JpaPagingItemReader reading the same items multiple times, and only on PostgreSQL?
The select statement backing the reader is simple; it's a NamedQuery:
@NamedQuery(name = "ImportDTO.findAllForInsert",
        query = "select h from ImportDTO h where h.toBeImported = true")
Please also note that the toBeImported flag is not altered by the batch job at all during runtime, so this query should return the same results before, during, and after the batch job.
Any insights, tips, or help are greatly appreciated!
Here is the batch configuration code:
import javax.persistence.EntityManagerFactory;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.item.database.JpaPagingItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.core.task.TaskExecutor;

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private OrganizationItemWriter organizationItemWriter;

    @Autowired
    private EntityManagerFactory entityManagerFactory;

    @Autowired
    private OrganizationUpdateProcessor organizationUpdateProcessor;

    @Autowired
    private OrganizationInsertProcessor organizationInsertProcessor;

    private Integer organizationBatchSize = 10000;
    private Integer organizationThreadSize = 3;
    private Integer maxThreadSize = organizationThreadSize;

    @Bean
    public SimpleJobLauncher jobLauncher(JobRepository jobRepository) {
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(jobRepository);
        return launcher;
    }

    @Bean
    public JpaPagingItemReader<ImportDTO> findNewImportsToImport() throws Exception {
        JpaPagingItemReader<ImportDTO> databaseReader = new JpaPagingItemReader<>();
        databaseReader.setEntityManagerFactory(entityManagerFactory);
        JpaQueryProviderImpl<ImportDTO> jpaQueryProvider = new JpaQueryProviderImpl<>();
        jpaQueryProvider.setQuery("ImportDTO.findAllForInsert");
        databaseReader.setQueryProvider(jpaQueryProvider);
        databaseReader.setPageSize(organizationBatchSize);
        // must be set to false if multi threaded
        databaseReader.setSaveState(false);
        databaseReader.afterPropertiesSet();
        return databaseReader;
    }

    @Bean
    public JpaPagingItemReader<ImportDTO> findImportsToUpdate() throws Exception {
        JpaPagingItemReader<ImportDTO> databaseReader = new JpaPagingItemReader<>();
        databaseReader.setEntityManagerFactory(entityManagerFactory);
        JpaQueryProviderImpl<ImportDTO> jpaQueryProvider = new JpaQueryProviderImpl<>();
        jpaQueryProvider.setQuery("ImportDTO.findAllForUpdate");
        databaseReader.setQueryProvider(jpaQueryProvider);
        databaseReader.setPageSize(organizationBatchSize);
        // must be set to false if multi threaded
        databaseReader.setSaveState(false);
        databaseReader.afterPropertiesSet();
        return databaseReader;
    }

    @Bean
    public OrganizationItemWriter writer() throws Exception {
        return organizationItemWriter;
    }

    @Bean
    public StepExecutionNotificationListener stepExecutionListener() {
        return new StepExecutionNotificationListener();
    }

    @Bean
    public ChunkExecutionListener chunkListener() {
        return new ChunkExecutionListener();
    }

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(maxThreadSize);
        return taskExecutor;
    }

    @Bean
    public Job importOrganizationsJob(JobCompletionNotificationListener listener) throws Exception {
        return jobBuilderFactory.get("importAndUpdateOrganizationJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .start(importNewOrganizationsFromImports())
                .next(updateOrganizationsFromImports())
                .build();
    }

    @Bean
    public Step importNewOrganizationsFromImports() throws Exception {
        return stepBuilderFactory.get("importNewOrganizationsFromImports")
                .<ImportDTO, Organization>chunk(organizationBatchSize)
                .reader(findNewImportsToImport())
                .processor(organizationInsertProcessor)
                .writer(writer())
                .taskExecutor(taskExecutor())
                .listener(stepExecutionListener())
                .listener(chunkListener())
                .throttleLimit(organizationThreadSize)
                .build();
    }

    @Bean
    public Step updateOrganizationsFromImports() throws Exception {
        return stepBuilderFactory.get("updateOrganizationsFromImports")
                .<ImportDTO, Organization>chunk(organizationBatchSize)
                .reader(findImportsToUpdate())
                .processor(organizationUpdateProcessor)
                .writer(writer())
                .taskExecutor(taskExecutor())
                .listener(stepExecutionListener())
                .listener(chunkListener())
                .throttleLimit(organizationThreadSize)
                .build();
    }
}

You need to add an order by clause to the select statement. A paging reader runs a separate query for each page, and without an explicit ordering PostgreSQL is free to return rows in a different order on each query, so page boundaries are not stable and the same row can be read on more than one page (while others are skipped). MySQL InnoDB masked this by happening to return rows in primary key order. Sort on a unique column to make the paging deterministic.
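A minimal sketch of the fix on the named query, assuming ImportDTO has a unique id property (the property name is an assumption):

@NamedQuery(name = "ImportDTO.findAllForInsert",
        query = "select h from ImportDTO h where h.toBeImported = true order by h.id")

With a unique sort key, every page query sees the same total ordering, so no row can land on two different pages.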

Related

Spring Batch avoid launch Reader and Writer before tasklet

I'm working with Spring Batch and have a job with two steps: the first step (a tasklet) validates the CSV header, and the second step reads a CSV file and writes to another CSV file, like this:
@Bean
public ClassifierCompositeItemWriter<POJO> classifierCompositeItemWriter() throws Exception {
    Classifier<POJO, ItemWriter<? super POJO>> classifier = new ClassiItemWriter(ClassiItemWriter.itemWriter());
    return new ClassifierCompositeItemWriterBuilder<POJO>()
            .classifier(classifier)
            .build();
}

@Bean
public Step readAndWriteCsvFile() throws Exception {
    return stepBuilderFactory.get("readAndWriteCsvFile")
            .<POJO, POJO>chunk(10000)
            .reader(ClassitemReader.itemReader())
            .processor(processor())
            .writer(classifierCompositeItemWriter())
            .build();
}
I used a FlatFileItemReader (in ClassitemReader) and a FlatFileItemWriter (in ClassiItemWriter). Before reading the CSV, I check whether the header of the CSV file is correct via a tasklet, like this:
@Bean
public Step fileValidatorStep() {
    return stepBuilderFactory
            .get("fileValidatorStep")
            .tasklet(fileValidator)
            .build();
}
And if so, I process the transformation from the received CSV file to another CSV file.
In the jobBuilderFactory I check whether the ExitStatus coming from the fileValidatorStep tasklet is "COMPLETED" in order to forward the process to readAndWriteCsvFile(); if it is not "COMPLETED" and the fileValidatorStep tasklet returns ExitStatus "ERROR", the job ends and exits processing.
@Bean
public Job job() throws Exception {
    return jobBuilderFactory.get("job")
            .incrementer(new RunIdIncrementer())
            .start(fileValidatorStep()).on("ERROR").end()
            .next(fileValidatorStep()).on("COMPLETED").to(readAndWriteCsvFile())
            .end().build();
}
The problem is that when I launch my job, the readAndWriteCsvFile bean runs before the tasklet: the standard reader and writer beans of Spring Batch are loaded in the lifecycle before I can validate the header and check the ExitStatus, so the reader goes ahead and reads the file and puts data in another file without the check, because the beans are loaded during job launch, before any tasklet runs.
How can I launch the readAndWriteCsvFile method after fileValidatorStep?
You don't need a flow job for that; a simple job is enough. Here is a quick example:
import java.util.Arrays;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJobConfiguration {

    @Bean
    public Step validationStep(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("validationStep")
                .tasklet(new Tasklet() {
                    @Override
                    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
                        if (!isValid()) {
                            throw new Exception("Invalid file");
                        }
                        return RepeatStatus.FINISHED;
                    }

                    private boolean isValid() {
                        // TODO implement validation logic
                        return false;
                    }
                })
                .build();
    }

    @Bean
    public Step readAndWriteCsvFile(StepBuilderFactory stepBuilderFactory) {
        return stepBuilderFactory.get("readAndWriteCsvFile")
                .<Integer, Integer>chunk(2)
                .reader(new ListItemReader<>(Arrays.asList(1, 2, 3, 4)))
                .writer(items -> items.forEach(System.out::println))
                .build();
    }

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
        return jobBuilderFactory.get("job")
                .start(validationStep(stepBuilderFactory))
                .next(readAndWriteCsvFile(stepBuilderFactory))
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJobConfiguration.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }
}
In this example, if the validationStep fails, the next step will not be executed.
I solved my problem: I annotated the FlatFileItemReader bean method inside the job configuration class with @StepScope. Now this bean is only created when I need it. This also avoids declaring the FlatFileItemReader bean outside the scope of the job.
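For reference, a minimal sketch of such a step-scoped reader; the inputFile job parameter, the POJO type, and the column names are assumptions for illustration:

@Bean
@StepScope
public FlatFileItemReader<POJO> itemReader(
        @Value("#{jobParameters['inputFile']}") String inputFile) { // assumed job parameter
    FlatFileItemReader<POJO> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource(inputFile));
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames("field1", "field2"); // assumed CSV columns
    BeanWrapperFieldSetMapper<POJO> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(POJO.class);
    DefaultLineMapper<POJO> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    reader.setLineMapper(lineMapper);
    return reader;
}

Because the bean is step-scoped, it is instantiated only when its owning step actually starts, so the validation tasklet runs before the file is ever opened.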

Spring Batch, error when adding new reader

I'm working on a Spring Batch project with a simple read, process, write-to-CSV flow. There are already many readers and SQL scripts in this project. The problem is that when I add a new reader and a new SQL script and start the batch, I get an error. However, when I use the new reader with an old SQL script it works, and likewise when I use an existing reader with the new SQL script it works fine, but not when I use both together.
These are the elements:
/**
 * Reader
 */
import org.springframework.stereotype.Component;
import javax.sql.DataSource;
import java.io.IOException;

@Component
public class BillDefaultReader extends AbstractBillReader {

    public BillDefaultReader(final DataSource dataSource) throws IOException {
        super(dataSource);
        this.setPreparedStatementSetter(s -> s.setString(1, "DEFAULT"));
    }

    @Override
    protected String getSqlPath() {
        return "/sql/get_script1.sql"; // -> the new SQL script
    }
}

// The job config
@Bean
public Job batchExJob(final Step batchExStep, final ReportGeneratorJobListener reportGeneratorJobListener) {
    return this.jobs.get("batchExJob").start(batchExStep).listener(reportGeneratorJobListener).build();
}

@Bean
public Step batchExStep(final ItemReader<BillTransmissionDTO> compositeBillReader,
        final ItemProcessor<BillTransmissionDTO, BillTransmissionDTO> compositeProcessor,
        final ItemWriter<BillTransmissionDTO> billTransmissionWriter) {
    return this.stepBuilderFactory.get("batchExStep").<BillTransmissionDTO, BillTransmissionDTO>chunk(this.commitInterval)
            .reader(compositeBillReader)
            .processor(compositeProcessor)
            .writer(billTransmissionWriter)
            .faultTolerant()
            .skipPolicy(this::skipExceptionsPolicy)
            .build();
}

@StepScope
@Bean
public ItemProcessor<BillTransmissionDTO, BillTransmissionDTO> compositeProcessor(final IntegrateBillProcessor integrateBillProcessor,
        final InitialisationPayEndowmentProcessor initialisationPayEndowmentProcessor) {
    final CompositeItemProcessor<BillTransmissionDTO, BillTransmissionDTO> processor = new CompositeItemProcessor<>();
    processor.setDelegates(Arrays.asList(integrateBillProcessor, initialisationPayEndowmentProcessor));
    return processor;
}
This is part of the code (it has been translated). The process and code are the same for all readers and SQL scripts. Any ideas, please?
Thank you!

Spring Batch stop job execution from external class

I have an existing Spring Batch project with multiple steps. I want to modify a step so that I can stop the job: jobExecution.getStatus() == STOPPED.
My step:
@Autowired
public StepBuilderFactory stepBuilderFactory;

@Autowired
private StepReader reader;

@Autowired
private StepProcessor processor;

@Autowired
private StepWriter writer;

@Autowired
public GenericListener listener;

@Bean
@JobScope
@Qualifier("mystep")
public Step MyStep() throws ReaderException {
    return stepBuilderFactory.get("mystep")
            .reader(reader.read())
            .listener(listener)
            .processor(processor)
            .writer(writer)
            .build();
}
GenericListener implements ItemReadListener, ItemProcessListener, and ItemWriteListener, and overrides the before and after methods that basically write logs.
The focus here is on the StepReader class and its read() method that returns a FlatFileItemReader:
@Component
public class StepReader {

    public static final String DELIMITER = "|";

    @Autowired
    private ClassToAccessProperties classToAccessProperties;

    private Logger log = Logger.create(StepReader.class);

    @Autowired
    private FlatFileItemReaderFactory<MyObject> flatFileItemReaderFactory;

    public ItemReader<MyObject> read() throws ReaderException {
        try {
            String csv = classToAccessProperties.getInputCsv();
            FlatFileItemReader<MyObject> reader = flatFileItemReaderFactory.create(csv, getLineMapper());
            return reader;
        } catch (ReaderException | EmptyInputfileException | IOException e) {
            throw new ReaderException(e);
        } catch (NoInputFileException e) {
            log.info("Oh no !! No input file");
            // Here I want to stop the job
            return null;
        }
    }

    private LineMapper<MyObject> getLineMapper() {
        DefaultLineMapper<MyObject> mapper = new DefaultLineMapper<>();
        DelimitedLineTokenizer delimitedLineTokenizer = new DelimitedLineTokenizer();
        delimitedLineTokenizer.setDelimiter(DELIMITER);
        mapper.setLineTokenizer(delimitedLineTokenizer);
        mapper.setFieldSetMapper(new MyObjectFieldSetMapper());
        return mapper;
    }
}
I tried to implement StepExecutionListener in StepReader, but with no luck; I think this is because the reader() method of the step builder expects an ItemReader from reader.read() and doesn't care about the rest of the class.
I'm looking for ideas or a solution to be able to stop the entire job (not fail it) when NoInputFileException is caught.
This is a common pattern and is described in detail in the Handling Step Completion When No Input is Found section of the reference documentation. The example in that section shows how to fail a job when no input file is found, but since you want to stop the job instead of failing it, you can call StepExecution#setTerminateOnly() in a listener, and your job will end with status STOPPED. In your example, you would add that listener to the MyStep step.
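A minimal sketch of that listener (the class name and the file check are assumptions):

public class NoInputFileStepListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        boolean inputFileExists = false; // TODO replace with the real file check
        if (!inputFileExists) {
            // request a graceful stop: the job ends with status STOPPED
            stepExecution.setTerminateOnly();
        }
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        return stepExecution.getExitStatus();
    }
}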
However, I would suggest adding a pre-validation step and stopping the job if there is no file. Here is a quick example:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public Step fileValidationStep() {
        return steps.get("fileValidationStep")
                .tasklet((contribution, chunkContext) -> {
                    // TODO add code to check if the file exists
                    System.out.println("file not found");
                    chunkContext.getStepContext().getStepExecution().setTerminateOnly();
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Step fileProcessingStep() {
        return steps.get("fileProcessingStep")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("processing file");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Job job() {
        return jobs.get("job")
                .start(fileValidationStep())
                .next(fileProcessingStep())
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        JobExecution jobExecution = jobLauncher.run(job, new JobParameters());
        System.out.println("Job status: " + jobExecution.getExitStatus().getExitCode());
    }
}
The example prints:
file not found
Job status: STOPPED
Hope this helps.

Spring Batch Conditional Flow Not executing the else part

I am trying to achieve the flow shown in the image below using Spring Batch. I was referring to the Java configuration on page 85 of https://docs.spring.io/spring-batch/4.0.x/reference/pdf/spring-batch-reference.pdf.
For some reason, when the decider returns TYPE2, the batch ends in a FAILED state without any error message. The following is the Java configuration of my job:
jobBuilderFactory.get("myJob")
.incrementer(new RunIdIncrementer())
.preventRestart()
.start(firstStep())
.next(typeDecider()).on("TYPE1").to(stepType1()).next(lastStep())
.from(typeDecider()).on("TYPE2").to(stepType2()).next(lastStep())
.end()
.build();
I think something is not right with the Java configuration, though it matches the Spring documentation. A flow could be useful here, but I am sure there is a way without it. Any idea how to achieve this?
You need to define the flow not only from the decider to the next steps, but also from stepType1 and stepType2 to lastStep. Here is an example:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public Step firstStep() {
        return steps.get("firstStep")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("firstStep");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public JobExecutionDecider decider() {
        return (jobExecution, stepExecution) -> new FlowExecutionStatus("TYPE1"); // or TYPE2
    }

    @Bean
    public Step stepType1() {
        return steps.get("stepType1")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("stepType1");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Step stepType2() {
        return steps.get("stepType2")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("stepType2");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Step lastStep() {
        return steps.get("lastStep")
                .tasklet((contribution, chunkContext) -> {
                    System.out.println("lastStep");
                    return RepeatStatus.FINISHED;
                })
                .build();
    }

    @Bean
    public Job job() {
        return jobs.get("job")
                .start(firstStep())
                .next(decider())
                .on("TYPE1").to(stepType1())
                .from(decider()).on("TYPE2").to(stepType2())
                .from(stepType1()).on("*").to(lastStep())
                .from(stepType2()).on("*").to(lastStep())
                .build()
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }
}
This prints:
firstStep
stepType1
lastStep
If the decider returns TYPE2, the sample prints:
firstStep
stepType2
lastStep
Hope this helps.
I ran into a similar issue where the else part was not being called (technically, only the first configured on() was being called).
Almost all the websites with flow and decider examples have similar job configurations, and I was not able to figure out what the issue was.
After some research, I found out how Spring maintains deciders and decisions.
At a high level, while initializing the application, Spring builds a list of decisions for a decider object based on the job configuration (like decision0, decision1, and so on).
When we call the decider() method, it always returns a new decider object. Because each call returns a new object, the list contains only one mapping per object (i.e., decision0), and since it is a list, it always returns the first configured decision. This is why only the first configured transition is called.
Solution:
Instead of making a method call to the decider, create a singleton bean for the decider and use it in the job configuration.
Example:
@Bean
public JobExecutionDecider stepDecider() {
    return new CustomStepDecider();
}
Inject it and use it in the job creation bean:
@Bean
public Job sampleJob(Step step1, Step step2, Step step3,
        JobExecutionDecider stepDecider) {
    return jobBuilderFactory.get("sampleJob")
            .start(step1)
            .next(stepDecider).on("TYPE1").to(step2)
            .from(stepDecider).on("TYPE2").to(step3)
            .end()
            .build();
}
Hope this helps.
Create a dummy step which returns a FINISHED status and jumps to the next decider. You need to redirect the flow cursor to the next decider or a virtual step after finishing the current step:
.next(copySourceFilesStep())
.next(firstStepDecider).on(STEP_CONTINUE).to(executeStep_1())
.from(firstStepDecider).on(STEP_SKIP).to(virtualStep_1())
// - executeStep_2
.from(executeStep_1()).on(ExitStatus.COMPLETED.getExitCode())
        .to(secondStepDecider).on(STEP_CONTINUE).to(executeStep_2())
.from(secondStepDecider).on(STEP_SKIP).to(virtualStep_3())
.from(virtualStep_1()).on(ExitStatus.COMPLETED.getExitCode())
        .to(secondStepDecider).on(STEP_CONTINUE).to(executeStep_2())
.from(secondStepDecider).on(STEP_SKIP).to(virtualStep_3())
// - executeStep_3
.from(executeStep_2()).on(ExitStatus.COMPLETED.getExitCode())
        .to(thirdStepDecider).on(STEP_CONTINUE).to(executeStep_3())
.from(thirdStepDecider).on(STEP_SKIP).to(virtualStep_4())
.from(virtualStep_3()).on(ExitStatus.COMPLETED.getExitCode())
        .to(thirdStepDecider).on(STEP_CONTINUE).to(executeStep_3())
.from(thirdStepDecider).on(STEP_SKIP).to(virtualStep_4())
@Bean
public Step virtualStep_2() {
    return stepBuilderFactory.get("continue-virtualStep2")
            .tasklet((contribution, chunkContext) -> {
                return RepeatStatus.FINISHED;
            })
            .build();
}

Stopping Tomcat Doesn't Delete Derby db.lck

(Edit: I've added a bounty to the question. I found a workaround (posted as an answer below), but I'm hoping somebody can explain why the workaround was necessary in the first place.)
I have a Spring webapp that connects to a Derby database during development. This works fine the first time I run the webapp, but in subsequent runs it fails during startup with the "Another instance of Derby may have already booted the database" SQLException.
I understand that this is because the connection to Derby isn't being closed when I shut down Tomcat, even though I would expect Spring to handle that automatically. So my question is, how do I disconnect from Derby correctly? Not only when manually stopping Tomcat, but also when hot-deploying a new .war file?
I'd like to avoid using a Derby server, and I'm also using annotations instead of XML configuration. Here was my original PersistConfig class:
package com.example.spring.config;

import java.beans.PropertyVetoException;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;
import org.apache.derby.jdbc.EmbeddedDataSource;
import org.hibernate.SessionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;
import org.springframework.jdbc.datasource.embedded.ConnectionProperties;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseConfigurer;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;
import org.springframework.orm.hibernate4.HibernateExceptionTranslator;
import org.springframework.orm.hibernate4.HibernateTransactionManager;
import org.springframework.orm.hibernate4.LocalSessionFactoryBean;
import org.springframework.orm.jpa.JpaVendorAdapter;
import org.springframework.orm.jpa.vendor.Database;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@ComponentScan({"com.example.spring.dao.jpa"})
@EnableTransactionManagement // <-- enable @Transactional annotations for Spring @Component and stereotypes
public class PersistConfig {

    @Bean
    public HibernateExceptionTranslator exceptionTranslator() {
        return new HibernateExceptionTranslator();
    }

    @Bean
    public LocalSessionFactoryBean localSessionFactoryBean(DataSource dataSource, JpaVendorAdapter vendorAdapter) {
        LocalSessionFactoryBean localSessionFactoryBean = new LocalSessionFactoryBean();
        localSessionFactoryBean.setDataSource(dataSource);
        localSessionFactoryBean.setPackagesToScan("com.example.one", "com.example.two");
        Properties properties = new Properties();
        properties.putAll(vendorAdapter.getJpaPropertyMap());
        localSessionFactoryBean.setHibernateProperties(properties);
        return localSessionFactoryBean;
    }

    @Bean
    public HibernateTransactionManager hibernateTransactionManager(SessionFactory sessionFactory) {
        HibernateTransactionManager hibernateTransactionManager = new HibernateTransactionManager(sessionFactory);
        return hibernateTransactionManager;
    }

    @Configuration
    public static class DevelopmentConfig {

        @Bean
        public DataSource dataSource() throws SQLException, PropertyVetoException {
            DataSource dataSource = new SimpleDriverDataSource(new org.apache.derby.jdbc.EmbeddedDriver(), "jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB", "", "");
            System.out.println("RETURNING DATASOURCE");
            return dataSource;
        }

        @Bean
        JpaVendorAdapter vendorAdapter() {
            HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
            vendorAdapter.setDatabase(Database.DERBY);
            vendorAdapter.setDatabasePlatform("org.hibernate.dialect.DerbyDialect");
            vendorAdapter.setGenerateDdl(true);
            vendorAdapter.setShowSql(true);
            vendorAdapter.getJpaPropertyMap().put("hibernate.hbm2ddl.auto", "update");
            vendorAdapter.getJpaPropertyMap().put("hbm2ddl.auto", "update");
            return vendorAdapter;
        }
    }
}
I tried adding a shutdown hook to the whole JVM using Runtime.addShutdownHook() where I manually disconnect from the Derby database, but that is seemingly never fired.
I was then told to look into the EmbeddedDatabaseConfigurer interface to add a Spring shutdown callback where I manually close the database connection, and this is what I came up with:
package com.example.spring.config;

import java.beans.PropertyVetoException;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;
import org.apache.derby.jdbc.EmbeddedDataSource;
import org.hibernate.SessionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.SimpleDriverDataSource;
import org.springframework.jdbc.datasource.embedded.ConnectionProperties;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseConfigurer;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;
import org.springframework.orm.hibernate4.HibernateExceptionTranslator;
import org.springframework.orm.hibernate4.HibernateTransactionManager;
import org.springframework.orm.hibernate4.LocalSessionFactoryBean;
import org.springframework.orm.jpa.JpaVendorAdapter;
import org.springframework.orm.jpa.vendor.Database;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
import org.springframework.transaction.annotation.EnableTransactionManagement;

@Configuration
@ComponentScan({"com.example.spring.dao.jpa"})
@EnableTransactionManagement // <-- enable @Transactional annotations for Spring @Component and stereotypes
public class PersistConfig implements EmbeddedDatabaseConfigurer {

    @Bean
    public HibernateExceptionTranslator exceptionTranslator() {
        return new HibernateExceptionTranslator();
    }

    @Bean
    public LocalSessionFactoryBean localSessionFactoryBean(DataSource dataSource, JpaVendorAdapter vendorAdapter) {
        LocalSessionFactoryBean localSessionFactoryBean = new LocalSessionFactoryBean();
        localSessionFactoryBean.setDataSource(dataSource);
        localSessionFactoryBean.setPackagesToScan("com.example.one", "com.example.two");
        Properties properties = new Properties();
        properties.putAll(vendorAdapter.getJpaPropertyMap());
        localSessionFactoryBean.setHibernateProperties(properties);
        return localSessionFactoryBean;
    }

    @Bean
    public HibernateTransactionManager hibernateTransactionManager(SessionFactory sessionFactory) {
        HibernateTransactionManager hibernateTransactionManager = new HibernateTransactionManager(sessionFactory);
        return hibernateTransactionManager;
    }

    @Configuration
    public static class DevelopmentConfig {

        @Bean
        public DataSource dataSource() throws SQLException, PropertyVetoException {
            DataSource dataSource = new SimpleDriverDataSource(new org.apache.derby.jdbc.EmbeddedDriver(), "jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB", "", "");
            System.out.println("RETURNING DATASOURCE");
            return dataSource;
        }

        @Bean
        JpaVendorAdapter vendorAdapter() {
            HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
            vendorAdapter.setDatabase(Database.DERBY);
            vendorAdapter.setDatabasePlatform("org.hibernate.dialect.DerbyDialect");
            vendorAdapter.setGenerateDdl(true);
            vendorAdapter.setShowSql(true);
            vendorAdapter.getJpaPropertyMap().put("hibernate.hbm2ddl.auto", "update");
            vendorAdapter.getJpaPropertyMap().put("hbm2ddl.auto", "update");
            return vendorAdapter;
        }
    }

    @Override
    public void configureConnectionProperties(ConnectionProperties properties, String databaseName) {
        System.out.println("CONFIGURE");
        properties.setDriverClass(org.apache.derby.jdbc.EmbeddedDriver.class);
        properties.setUrl("jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB");
    }

    @Override
    public void shutdown(DataSource ds, String databaseName) {
        System.out.println("SHUTTING DOWN");
        try {
            DriverManager.getConnection("jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB;shutdown=true");
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}
However, neither the configureConnectionProperties() function nor the shutdown() function seems to be called. I obviously don't know what I'm doing, so any pointers are greatly appreciated.
EDIT: added a note about restarting an embedded Derby database, and a probably simpler solution.
I could at least partially reproduce the problem, understand it, and fix it. But I cannot say why BoneCP works fine. I simply noticed that if I waited long enough between shutting down Tomcat and starting it again, it worked. I suppose that BoneCP does not access the database immediately, but waits until the first real connection.
First, the problem: when using Derby as an embedded database, the database is booted at the first connection, but it has to be explicitly shut down. If it is not, the db.lck file is not deleted and subsequent applications may experience problems booting the database again. Nothing exists, either in Tomcat or (by default) in Spring, to automatically shut down such a database.
Next, why your attempt using an EmbeddedDatabaseConfigurer didn't work: EmbeddedDatabaseConfigurer is not a magic marker, and implementing it in a class is not enough for Spring to automatically use it. It is simply an interface that must be implemented by a configurer so that an EmbeddedDatabaseFactory can use it.
Finally, the fix. You should not use SimpleDriverDataSource to get your connections from an embedded Derby database, but an EmbeddedDatabaseFactory. Spring knows the Derby embedded database by default, and you can configure the factory by simply setting the type... but that only works for in-memory databases, and you have a file-based database! That would have been too simple... You must inject the factory with a configurer to make everything work.
And now the code (starting from your first version):
@Configuration
public static class DevelopmentConfig {

    EmbeddedDatabaseFactory dsFactory;

    public DevelopmentConfig() {
        EmbeddedDatabaseConfigurer configurer = new EmbeddedDatabaseConfigurer() {
            @Override
            public void configureConnectionProperties(ConnectionProperties properties, String databaseName) {
                System.out.println("CONFIGURE");
                properties.setDriverClass(org.apache.derby.jdbc.EmbeddedDriver.class);
                properties.setUrl("jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB");
            }

            @Override
            public void shutdown(DataSource dataSource, String databaseName) {
                final String SHUTDOWN_CODE = "XJ015";
                System.out.println("SHUTTING DOWN");
                try {
                    DriverManager.getConnection("jdbc:derby:;shutdown=true");
                } catch (SQLException e) {
                    // Derby 10.9.1.0 shutdown raises a SQLException with code "XJ015"
                    if (!SHUTDOWN_CODE.equals(e.getSQLState())) {
                        e.printStackTrace();
                    }
                }
            }
        };
        dsFactory = new EmbeddedDatabaseFactory();
        dsFactory.setDatabaseConfigurer(configurer);
    }

    @Bean
    public DataSource dataSource() throws SQLException, PropertyVetoException {
        System.out.println("RETURNING DATASOURCE");
        return dsFactory.getDatabase();
    }

    // remainder of the code unchanged
This way, I can hot-reload the war, and when Tomcat is closed, the db.lck file is properly deleted.
Edit:
In case of problems, the Derby documentation advises adding the following command to restart a database after a shutdown: Class.forName("org.apache.derby.jdbc.EmbeddedDriver").newInstance();. It could be the last instruction of the configureConnectionProperties method.
But in fact, the solution could be even simpler. What really needs to be added to your config is a proper shutdown of the embedded driver (and possibly a restart). So a simple @PreDestroy (and possibly @PostConstruct) annotated method should be enough:
@Configuration
public static class DevelopmentConfig {

    @PreDestroy
    public void shutdown() {
        final String SHUTDOWN_CODE = "XJ015";
        System.out.println("SHUTTING DOWN");
        try {
            DriverManager.getConnection("jdbc:derby:;shutdown=true");
        } catch (SQLException e) {
            // Derby 10.9.1.0 shutdown raises a SQLException with code "XJ015"
            if (!SHUTDOWN_CODE.equals(e.getSQLState())) {
                e.printStackTrace();
            }
        }
    }

    /* if needed ...
    @PostConstruct
    public void init() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver").newInstance();
    }
    */

    @Bean
    public DataSource dataSource() throws SQLException, PropertyVetoException {
        DataSource dataSource = new SimpleDriverDataSource(new org.apache.derby.jdbc.EmbeddedDriver(), "jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB", "", "");
        System.out.println("RETURNING DATASOURCE");
        return dataSource;
    }

    // remainder of the code unchanged
The main interest of this variant is that you can choose your data source, from a SimpleDriverDataSource to a real pool.
I figured out a fix for the problem, although I don't really understand why it works. It turns out that using BoneCP to configure the DataSource fixes the problem, or at least covers it up.
public DataSource dataSource() throws SQLException, PropertyVetoException {
    BoneCPConfig config = new BoneCPConfig();
    config.setUsername("");
    config.setPassword("");
    config.setJdbcUrl("jdbc:derby:C:\\Users\\Kevin\\Desktop\\DerbyDB");
    BoneCPDataSource dataSource = new BoneCPDataSource(config);
    dataSource.setDriverClass("org.apache.derby.jdbc.EmbeddedDriver");
    return dataSource;
}
Stranger still, the db.lck file is still never deleted, but I don't see any errors at all, and it seems to be working fine.
I'm adding this as an answer in case somebody else has a similar problem, but I'm leaving the question open in case somebody can explain to me why this fixes the problem.
You can use a LifecycleListener which executes Derby's shutdown procedure, if you have defined embedded Derby as a DataSource (Resource) of Tomcat.
Here's a sample implementation of a LifecycleListener for Tomcat 8, and details of the setup.
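A minimal sketch of such a listener (the class name is an assumption; the linked sample may differ):

import java.sql.DriverManager;
import java.sql.SQLException;
import org.apache.catalina.Lifecycle;
import org.apache.catalina.LifecycleEvent;
import org.apache.catalina.LifecycleListener;

public class DerbyShutdownListener implements LifecycleListener {

    @Override
    public void lifecycleEvent(LifecycleEvent event) {
        // shut down embedded Derby once Tomcat has stopped
        if (Lifecycle.AFTER_STOP_EVENT.equals(event.getType())) {
            try {
                DriverManager.getConnection("jdbc:derby:;shutdown=true");
            } catch (SQLException e) {
                // Derby signals a successful full shutdown with SQLState "XJ015"
                if (!"XJ015".equals(e.getSQLState())) {
                    e.printStackTrace();
                }
            }
        }
    }
}

It would be registered in server.xml with a <Listener className="..."/> element next to Tomcat's other listeners.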
