spring batch metadata in cassandra database

spring batch metadata in cassandra database - java

The question is quite simple: can I create metadata schema for Spring Batch in Cassandra database? How can I do this if yes?
I've read that Spring Batch requires RDBMS database for that and No-SQL databases are not supported. Is that still the limitation in Spring Batch and how can I override that issue eventually?

Due to the fact that Casandra does not support simple sequences for keys, Spring Batch does not support using it for the job repository.

It is possible to extend Spring Batch to support Cassandra by customising ItemReader and ItemWriter.
Reader:
#Override
public Company read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
final List<Company> companies = cassandraOperations.selectAll(aClass);
log.debug("Read operations is performing, the object size is {}", companies.size());
if (index < companies.size()) {
final Company company = companies.get(index);
index++;
return company;
}
return null;
}
Writer:
#Override
public void write(final List<? extends Company> items) throws Exception {
logger.debug("Write operations is performing, the size is {}" + items.size());
if (!items.isEmpty()) {
logger.info("Deleting in a batch performing...");
cassandraTemplate.deleteAll(aClass);
logger.info("Inserting in a batch performing...");
cassandraTemplate.insert(items);
}
logger.debug("Items is null...");
}
#Beans:
#Bean
public ItemReader<Company> reader(final DataSource dataSource) {
final CassandraBatchItemReader<Company> reader = new CassandraBatchItemReader<Company>(Company.class);
return reader;
}
#Bean
public ItemWriter<Company> writer(final DataSource dataSource) {
final CassandraBatchItemWriter<Company> writer = new CassandraBatchItemWriter<Company>(Company.class);
return writer;
}
Full source code can be found in Github Spring-Batch-with-Cassandra

Related

What is the most efficient way to persist thousands of entities?

I have fairly large CSV files which I need to parse and then persist into PostgreSQL. For example, one file contains 2_070_000 records which I was able to parse and persist in ~8 minutes (single thread). Is it possible to persist them using multiple threads?
public void importCsv(MultipartFile csvFile, Class<T> targetClass) {
final var headerMapping = getHeaderMapping(targetClass);
File tempFile = null;
try {
final var randomUuid = UUID.randomUUID().toString();
tempFile = File.createTempFile("data-" + randomUuid, "csv");
csvFile.transferTo(tempFile);
final var csvFileName = csvFile.getOriginalFilename();
final var csvReader = new BufferedReader(new FileReader(tempFile, StandardCharsets.UTF_8));
Stopwatch stopWatch = Stopwatch.createStarted();
log.info("Starting to import {}", csvFileName);
final var csvRecords = CSVFormat.DEFAULT
.withDelimiter(';')
.withHeader(headerMapping.keySet().toArray(String[]::new))
.withSkipHeaderRecord(true)
.parse(csvReader);
final var models = StreamSupport.stream(csvRecords.spliterator(), true)
.map(record -> parseRecord(record, headerMapping, targetClass))
.collect(Collectors.toUnmodifiableList());
// How to save such a large list?
log.info("Finished import of {} in {}", csvFileName, stopWatch);
} catch (IOException ex) {
ex.printStackTrace();
} finally {
tempFile.delete();
}
}
models contains a lot of records. The parsing into records is done using parallel stream, so it's quite fast. I'm afraid to call SimpleJpaRepository.saveAll, because I'm not sure what it will do under the hood.
The question is: What is the most efficient way to persist such a large list of entities?
P.S.: Any other improvements are greatly appreciated.

You have to use batch inserts.
Create an interface for a custom repository SomeRepositoryCustom
public interface SomeRepositoryCustom {
void batchSave(List<Record> records);
}
Create an implementation of SomeRepositoryCustom
#Repository
class SomesRepositoryCustomImpl implements SomeRepositoryCustom {
private JdbcTemplate template;
#Autowired
public SomesRepositoryCustomImpl(JdbcTemplate template) {
this.template = template;
}
#Override
public void batchSave(List<Record> records) {
final String sql = "INSERT INTO RECORDS(column_a, column_b) VALUES (?, ?)";
template.execute(sql, (PreparedStatementCallback<Void>) ps -> {
for (Record record : records) {
ps.setString(1, record.getA());
ps.setString(2, record.getB());
ps.addBatch();
}
ps.executeBatch();
return null;
});
}
}
Extend your JpaRepository with SomeRepositoryCustom
#Repository
public interface SomeRepository extends JpaRepository, SomeRepositoryCustom {
}
to save
someRepository.batchSave(records);
Notes
Keep in mind that, if you are even using batch inserts, database driver will not use them. For example, for MySQL, it is necessary to add a parameter rewriteBatchedStatements=true to database URL.
So better to enable driver SQL logging (not Hibernate) to verify everything. Also can be useful to debug driver code.
You will need to make decision about splitting records by packets in the loop
for (Record record : records) {
}
A driver can do it for you, so you will not need it. But better to debug this thing too.
P. S. Don't use var everywhere.

Implementing Spring + Apache Flink project with Postgres

I have a SpringBoot gradle project using apache flink to process datastream signals. When a new signal comes through the datastream, I would like to query look up (i.e. findById() ) it's details using an ID in a postgres database table which is already created in order to get additional information about the signal and enrich the data. I would like to avoid using spring dependencies to perform the lookup (i.e Autowire repository) and want to stick with flink implementation for the lookup.
Where can i specify how to add the postgres connection config information such as port, database, url, username, password etc... (for simplicity purposes can assume the postgres db is local in my machine). Is it as simple as adding the configuration to the application.properties file? if so how can i write the query method to look up the record in the postgres table when searching by non primary key value?
Some online sources are suggesting using this skeleton code but I am not sure how/id it fits my use case. (I have a EventEntity model created which contains all the params/columns from the table which i'm looking up).
like so
public class DatabaseMapper extends RichFlatMapFunction<String, EventEntity> {
// Declare DB connection & query statements
public void open(Configuration parameters) throws Exception {
//Initialize DB connection
//prepare query statements
}
#Override
public void flatMap(String value, Collector<EventEntity> out) throws Exception {
}
}

Your sample code is correct. You can set all your custom initialization and preparation code for PostgreSQL in open() method. Then you can use your pre-configured fields in your flatMap() function.
Here is one sample for Redis operations
I have used RichAsyncFunction here and I suggest you do the same as it is suggested as best practice. Read here for more: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/asyncio.html)
You can pass configuration parameteres in your constructor method and use it in your initialization process
public static class AsyncRedisOperations extends RichAsyncFunction<Object,Object> {
private JedisPool jedisPool;
private Configuration redisConf;
public AsyncRedisOperations(Configuration redisConf) {
this.redisConf = redisConf;
}
#Override
public void open(Configuration parameters) {
JedisPoolConfig jedisPoolConfig = new JedisPoolConfig();
jedisPoolConfig.setMaxTotal(this.redisConf.getInteger("pool", 8));
jedisPoolConfig.setMaxIdle(this.redisConf.getInteger("pool", 8));
jedisPoolConfig.setMaxWaitMillis(this.redisConf.getInteger("maxWait", 0));
JedisPool jedisPool = new JedisPool(jedisPoolConfig,
this.redisConf.getString("host", "192.168.10.10"),
this.redisConf.getInteger("port", 6379), 5000);
try {
this.jedisPool = jedisPool;
this.logger.info("Redis connected: " + jedisPool.getResource().isConnected());
} catch (Exception e) {
this.logger.error(BaseUtil.append("Exception while connecting Redis"));
}
}
#Override
public void asyncInvoke(Object in, ResultFuture<Object> out) {
try (Jedis jedis = this.jedisPool.getResource()) {
String key = jedis.get(key);
this.logger.info("Redis Key: " + key);
}
}
}

Spring Kafka ChainedKafkaTransactionManager doesn't synchronize with JPA Spring-data transaction

I read a ton of Gary Russell answers and posts, but didn't find actual solution for the common use-case for synchronization of the sequence below:
recieve from topic A => save to DB via Spring-data => send to topic B
As i understand properly: there is no guarantee for fully atomic processing in that case and i need to deal with messages deduplication on the client side, but the main issue is that ChainedKafkaTransactionManager doesn't synchronize with JpaTransactionManager (see #KafkaListener below)
Kafka config:
#Production
#EnableKafka
#Configuration
#EnableTransactionManagement
public class KafkaConfig {
private static final Logger log = LoggerFactory.getLogger(KafkaConfig.class);
#Bean
public ConsumerFactory<String, byte[]> commonConsumerFactory(#Value("${kafka.broker}") String bootstrapServer) {
Map<String, Object> props = new HashMap<>();
props.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServer);
props.put(AUTO_OFFSET_RESET_CONFIG, 'earliest');
props.put(SESSION_TIMEOUT_MS_CONFIG, 10000);
props.put(ENABLE_AUTO_COMMIT_CONFIG, false);
props.put(MAX_POLL_RECORDS_CONFIG, 10);
props.put(MAX_POLL_INTERVAL_MS_CONFIG, 17000);
props.put(FETCH_MIN_BYTES_CONFIG, 1048576);
props.put(FETCH_MAX_WAIT_MS_CONFIG, 1000);
props.put(ISOLATION_LEVEL_CONFIG, 'read_committed');
props.put(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
return new DefaultKafkaConsumerFactory<>(props);
}
#Bean
public ConcurrentKafkaListenerContainerFactory<String, byte[]> kafkaListenerContainerFactory(
#Qualifier("commonConsumerFactory") ConsumerFactory<String, byte[]> consumerFactory,
#Qualifier("chainedKafkaTM") ChainedKafkaTransactionManager chainedKafkaTM,
#Qualifier("kafkaTemplate") KafkaTemplate<String, byte[]> kafkaTemplate,
#Value("${kafka.concurrency:#{T(java.lang.Runtime).getRuntime().availableProcessors()}}") Integer concurrency
) {
ConcurrentKafkaListenerContainerFactory<String, byte[]> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.getContainerProperties().setMissingTopicsFatal(false);
factory.getContainerProperties().setTransactionManager(chainedKafkaTM);
factory.setConsumerFactory(consumerFactory);
factory.setBatchListener(true);
var arbp = new DefaultAfterRollbackProcessor<String, byte[]>(new FixedBackOff(1000L, 3));
arbp.setCommitRecovered(true);
arbp.setKafkaTemplate(kafkaTemplate);
factory.setAfterRollbackProcessor(arbp);
factory.setConcurrency(concurrency);
factory.afterPropertiesSet();
return factory;
}
#Bean
public ProducerFactory<String, byte[]> producerFactory(#Value("${kafka.broker}") String bootstrapServer) {
Map<String, Object> configProps = new HashMap<>();
configProps.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServer);
configProps.put(BATCH_SIZE_CONFIG, 16384);
configProps.put(ENABLE_IDEMPOTENCE_CONFIG, true);
configProps.put(KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
var kafkaProducerFactory = new DefaultKafkaProducerFactory<String, byte[]>(configProps);
kafkaProducerFactory.setTransactionIdPrefix('kafka-tx-');
return kafkaProducerFactory;
}
#Bean
public KafkaTemplate<String, byte[]> kafkaTemplate(#Qualifier("producerFactory") ProducerFactory<String, byte[]> producerFactory) {
return new KafkaTemplate<>(producerFactory);
}
#Bean
public KafkaTransactionManager kafkaTransactionManager(#Qualifier("producerFactory") ProducerFactory<String, byte[]> producerFactory) {
KafkaTransactionManager ktm = new KafkaTransactionManager<>(producerFactory);
ktm.setTransactionSynchronization(SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
return ktm;
}
#Bean
public ChainedKafkaTransactionManager chainedKafkaTM(JpaTransactionManager jpaTransactionManager,
KafkaTransactionManager kafkaTransactionManager) {
return new ChainedKafkaTransactionManager(kafkaTransactionManager, jpaTransactionManager);
}
#Bean(name = "transactionManager")
public JpaTransactionManager transactionManager(EntityManagerFactory em) {
return new JpaTransactionManager(em);
}
}
Kafka listener:
#KafkaListener(groupId = "${group.id}", idIsGroup = false, topics = "${topic.name.import}")
public void consume(List<byte[]> records, #Header(KafkaHeaders.OFFSET) Long offset) {
for (byte[] record : records) {
// cause infinity rollback (perhaps due to batch listener)
if (true)
throw new RuntimeExcetion("foo");
// spring-data storage with #Transactional("chainedKafkaTM"), since Spring-data can't determine TM among transactionManager, chainedKafkaTM, kafkaTransactionManager
var result = storageService.persist(record);
kafkaTemplate.send(result);
}
}
Spring-kafka version: 2.3.3
Spring-boot version: 2.2.1
What is a proper way to implement such use-case ?
Spring-kafka documentation limited only to small/specific examples.
P.s. when i'm using #Transactional(transactionManager = "chainedKafkaTM", rollbackFor = Exception.class) on #KafkaListener method i am facing endless cyclic rollback, however FixedBackOff(1000L, 3L) is set.
EDIT: i'm planning to achieve max affordable synchronization between listener, producer and database with configurable retries num.
EDIT: Code snippets above edited with respect to advised configuration. Using ARBP doesn't solve infinity rollback cycle for me, since the first statement's predicate is always false (SeekUtils.doSeeks):
DefaultAfterRollbackProcessor
...
#Override
public void process(List<ConsumerRecord<K, V>> records, Consumer<K, V> consumer, Exception exception,
boolean recoverable) {
if (SeekUtils.doSeeks(((List) records), consumer, exception, recoverable,
getSkipPredicate((List) records, exception), LOGGER)
&& isCommitRecovered() && this.kafkaTemplate != null && this.kafkaTemplate.isTransactional()) {
ConsumerRecord<K, V> skipped = records.get(0);
this.kafkaTemplate.sendOffsetsToTransaction(
Collections.singletonMap(new TopicPartition(skipped.topic(), skipped.partition()),
new OffsetAndMetadata(skipped.offset() + 1)));
}
}
It is worth saying that there is no active transaction in Kafka Consumer method (TransactionSynchronizationManager.isActualTransactionActive()).

What makes you think it's not synchronized? You really don't need #Transactional since the container will start both transactions.
You shouldn't use a SeekToCurrentErrorHandler with transactions because that occurs within the transaction. Configure the after rollback processor instead. The default ARBP uses a FixedBackOff(0L, 9) (10 attempts).
This works fine for me; and stops after 4 delivery attempts:
#SpringBootApplication
public class So58804826Application {
public static void main(String[] args) {
SpringApplication.run(So58804826Application.class, args);
}
#Bean
public JpaTransactionManager transactionManager() {
return new JpaTransactionManager();
}
#Bean
public ChainedKafkaTransactionManager<?, ?> chainedTxM(JpaTransactionManager jpa,
KafkaTransactionManager<?, ?> kafka) {
kafka.setTransactionSynchronization(SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
return new ChainedKafkaTransactionManager<>(kafka, jpa);
}
#Autowired
private Saver saver;
#KafkaListener(id = "so58804826", topics = "so58804826")
public void listen(String in) {
System.out.println("Storing: " + in);
this.saver.save(in);
}
#Bean
public NewTopic topic() {
return TopicBuilder.name("so58804826")
.partitions(1)
.replicas(1)
.build();
}
#Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
// template.executeInTransaction(t -> t.send("so58804826", "foo"));
};
}
}
#Component
class ContainerFactoryConfigurer {
ContainerFactoryConfigurer(ConcurrentKafkaListenerContainerFactory<?, ?> factory,
ChainedKafkaTransactionManager<?, ?> tm) {
factory.getContainerProperties().setTransactionManager(tm);
factory.setAfterRollbackProcessor(new DefaultAfterRollbackProcessor<>(new FixedBackOff(1000L, 3)));
}
}
#Component
class Saver {
#Autowired
private MyEntityRepo repo;
private final AtomicInteger ids = new AtomicInteger();
#Transactional("chainedTxM")
public void save(String in) {
this.repo.save(new MyEntity(in, this.ids.incrementAndGet()));
throw new RuntimeException("foo");
}
}
I see "Participating in existing transaction" from both TxMs.
and with #Transactional("transactionManager"), I just get it from the JPATm, as one would expect.
EDIT
There is no concept of "recovery" for a batch listener - the framework has no idea which record in the batch needs to be skipped. In 2.3, we added a new feature for batch listeners when using MANUAL ack modes.
See Committing Offsets.
Starting with version 2.3, the Acknowledgment interface has two additional methods nack(long sleep) and nack(int index, long sleep). The first one is used with a record listener, the second with a batch listener. Calling the wrong method for your listener type will throw an IllegalStateException.
When using a batch listener, you can specify the index within the batch where the failure occurred. When nack() is called, offsets will be committed for records before the index and seeks are performed on the partitions for the failed and discarded records so that they will be redelivered on the next poll(). This is an improvement over the SeekToCurrentBatchErrorHandler, which can only seek the entire batch for redelivery.
However, the failed record will still be replayed indefinitely.
You could keep track of the record that keeps failing and nack index + 1 to skip over it.
However, since your JPA tx has rolled back; this won't work for you.
With batch listener's you must handle problems with batches in your listener code.

Using ClassifierCompositeItemWriter and FlatFileItemWriter to write to multiple files

I'm trying to create a spring batch job that will read from MySQL database and write the data to different files depending on a value from the database. I am getting an error :
org.springframework.batch.item.WriterNotOpenException: Writer must be open before it can be written to
at org.springframework.batch.item.file.FlatFileItemWriter.write(FlatFileItemWriter.java:255)
Here's my ClassifierCompositeItemWriter
ClassifierCompositeItemWriter<WithdrawalTransaction> classifierCompositeItemWriter = new ClassifierCompositeItemWriter<WithdrawalTransaction>();
classifierCompositeItemWriter.setClassifier(new Classifier<WithdrawalTransaction,
ItemWriter<? super WithdrawalTransaction>>() {
#Override
public ItemWriter<? super WithdrawalTransaction> classify(WithdrawalTransaction wt) {
ItemWriter<? super WithdrawalTransaction> itemWriter = null;
if(wt.getPaymentMethod().equalsIgnoreCase("PDDTS")) { // condition
itemWriter = pddtsWriter();
} else {
itemWriter = swiftWriter();
}
return itemWriter;
}
});
As you can see, I only used two file writers for now.
#Bean("pddtsWriter")
private FlatFileItemWriter<WithdrawalTransaction> pddtsWriter()
And
#Bean("swiftWriter")
private FlatFileItemWriter<WithdrawalTransaction> swiftWriter()
I also added them as stream
#Bean
public Step processWithdrawalTransactions() throws Exception {
return stepBuilderFactory.get("processWithdrawalTransactions")
.<WithdrawalTransaction, WithdrawalTransaction> chunk(10)
.processor(withdrawProcessor())
.reader(withdrawReader)
.writer(withdrawWriter)
.stream(swiftWriter)
.stream(pddtsWriter)
.listener(headerWriter())
.build();
}
Am I doing something wrong?

SELECT FOR UPDATE query with Spring's #Transaction management creates deadlock upon subsequent updates

Setup: Spring application deployed on Weblogic 12c, using JNDI lookup to get a datasource to the Oracle Database.
We have multiple services which will be polling the database regularly for new jobs. In order to prevent two services picking the same job we are using a native SELECT FOR UPDATE query in a CrudRepository. The application then takes the resulting job and updates it to PROCESSING instead of WAITING using the CrusRepository.save() method.
The problem is that I can't seem to get the save() to work within the FOR UPDATE transaction (at least this is my current working theory of what goes wrong), and as a result the entire polling freezes until the default 10 minute timeout occurs. I have tried putting #Transactional (with various propagation flags) basically everywhere, but I'm not able to get it to work (#EnableTransactionManagement is activated and working).
Obviously there must be some basic knowledge I'm missing. Is this even a possible setup? Unfortunately, just using #Transactional with a non-native CrudRepository SELECT query is not possible, as it apparently first makes a SELECT to see if the row is locked or not, and only then makes a new SELECT that locks it. Another service could very well pick up the same job in the meanwhile, which is why we need it to lock immediately.
Update in relation to #M. Deinum's comment.: I should perhaps also mention that it's a setup wherein the central component that's doing the polling is a library used by all the other services (therefore the library has #SpringBootApplication, as does each service using it, so double component scanning is certainly present). Furthermore, the service has two separate classes for polling depending on the type of service, with a lot of common code, shared in an AbstractTransactionHelper class. Below I've aggregated some code for the sake of brevity.
The library's main class:
#SpringBootApplication
#EnableTransactionManagement
#EnableJpaRepositories
public class JobsMain {
public static void initializeJobsMain(){
PersistenceProviderResolverHolder.setPersistenceProviderResolver(new PersistenceProviderResolver() {
#Override
public List<PersistenceProvider> getPersistenceProviders() {
return Collections.singletonList(new HibernatePersistenceProvider());
}
#Override
public void clearCachedProviders() {
//Not quite sure what this should do...
}
});
}
#Bean
public JtaTransactionManager transactionManager(){
return new WebLogicJtaTransactionManager();
}
public DataSource dataSource(){
final JndiDataSourceLookup dsLookup = new JndiDataSourceLookup();
dsLookup.setResourceRef(true);
DataSource dataSource = dsLookup.getDataSource("Jobs");
return dataSource;
}
}
The repository (we're returning a set with only one job as we had some other issues when returning a single object):
public interface JobRepository extends CrudRepository<Job, Integer> {
#Query(value = "SELECT * FROM JOB WHERE JOB.ID IN "
+ "(SELECT ID FROM "
+ "(SELECT * FROM JOB WHERE "
+ "JOB.STATUS = :status1 OR "
+ "JOB.STATUS = :status2 "
+ "ORDER BY JOB.PRIORITY ASC, JOB.CREATED ASC) "
+ "WHERE ROWNUM <= 1) "
+ "FOR UPDATE", nativeQuery = true)
public Set<Job> getNextJob(#Param("status1") String status1, #Param("status2") String status2);
The transaction handling class:
#Service
public class JobManagerTransactionHelper extends AbstractTransactionHelper{
#Transactional
#Override
public QdbJob getNextJobToProcess(){
Set<Job> jobs = null;
try {
jobs = jobRepo.getNextJob(Status.DONE.name(), Status.FAILED.name());
} catch (Exception ex) {
logger.error(ex);
}
return extractSingleJobFromSet(jobs);
}
Update 2: Some more code.
AbstractTransactionHelper:
#Service
public abstract class AbstractTransactionHelper {
#Autowired
QdbJobRepository jobRepo;
#Autowired
ArchivedJobRepository archive;
protected Job extractSingleJobFromSet(Set<Job> jobs){
Job job = null;
if(jobs != null && !jobs.isEmpty()){
for(job job : jobs){
if(this instanceof JobManagerTransactionHelper){
updateJob(job);
}
job = job;
}
}
return job;
}
protected void updateJob(Job job){
updateJob(job, Status.PROCESSING, null);
}
protected void updateJob(Job job, Status status, String serviceMessage){
if(job != null){
if(status != null){
job.setStatus(status);
}
if(serviceMessage != null){
job.setServiceMessage(serviceMessage);
}
saveJob(job);
}
}
protected void saveJob(Job job){
jobRepo.save(job);
archive.save(Job.convertJobToArchivedJob(job));
}
Update 4: Threading. newJob() is implemented by each service that uses the library.
#Service
public class JobManager{
#Autowired
private JobManagerTransactionHelper transactionHelper;
#Autowired
JobListener jobListener;
#Autowired
Config config;
protected final AtomicInteger atomicThreadCounter = new AtomicInteger(0);
protected boolean keepPolling;
protected Future<?> futurePoller;
protected ScheduledExecutorService pollService;
protected ThreadPoolExecutor threadPool;
public boolean start(){
if(!keepPolling){
ThreadFactory pollServiceThreadFactory = new ThreadFactoryBuilder()
.setNamePrefix(config.getService() + "ScheduledPollingPool-Thread").build();
ThreadFactory threadPoolThreadFactory = new ThreadFactoryBuilder()
.setNamePrefix(config.getService() + "ThreadPool-Thread").build();
keepPolling = true;
pollService = Executors.newSingleThreadScheduledExecutor(pollServiceThreadFactory);
threadPool = (ThreadPoolExecutor)Executors.newFixedThreadPool(getConfig().getThreadPoolSize(), threadPoolThreadFactory);
futurePoller = pollService.scheduleWithFixedDelay(getPollTask(), 0, getConfig().getPollingFrequency(), TimeUnit.MILLISECONDS);
return true;
}else{
return false;
}
}
protected Runnable getPollTask() {
return new Runnable(){
public void run(){
try{
while(atomicThreadCounter.get() < threadPool.getMaximumPoolSize() &&
threadPool.getActiveCount() < threadPool.getMaximumPoolSize() &&
keepPolling == true){
Job job = transactionHelper.getNextJobToProcess();
if(job != null){
threadPool.submit(getJobHandler(job));
atomicThreadCounter.incrementAndGet();//threadPool.getActiveCount() isn't updated fast enough the first loop
}else{
break;
}
}
}catch(Exception e){
logger.error(e);
}
}
};
}
protected Runnable getJobHandler(final Job job){
return new Runnable(){
public void run(){
try{
atomicThreadCounter.decrementAndGet();
jobListener.newJob(job);
}catch(Exception e){
logger.error(e);
}
}
};
}

As it turns out, the problem was the WeblogicJtaTransactionManager. My guess is that the FOR UPDATE resulted in a JPA transaction, but upon updating the object in the database, the WeblogicJtaTransactionManager was used, which failed to find an ongoing JTA transaction. Since we're deploying on Weblogic we wrongly assumed we had to use the WeblogicJtaTransactionManager.
Either way, exchanging the TransactionManager to a JpaTransactionManager (and explicitly setting the EntityManagerFactory and DataSource on it) basically solved all problems.
#Bean
public PlatformTransactionManager transactionManager() {
JpaTransactionManager jpaTransactionManager = new JpaTransactionManager(entityManagerFactory().getObject());
jpaTransactionManager.setDataSource(dataSource());
jpaTransactionManager.setJpaDialect(new HibernateJpaDialect());
return jpaTransactionManager;
}
Assuming you also have added an EntityManagerFactoryBean which is needed if you want to use multiple datasources in the same project (which we're doing, but not within single transactions, so no need for JTA).
#Bean
public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
LocalContainerEntityManagerFactoryBean factoryBean = new LocalContainerEntityManagerFactoryBean();
factoryBean.setDataSource(dataSource());
factoryBean.setJpaVendorAdapter(vendorAdapter);
factoryBean.setPackagesToScan("my.model");
return factoryBean;
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

spring batch metadata in cassandra database - java

Due to the fact that Casandra does not support simple sequences for keys, Spring Batch does not support using it for the job repository.

Related

What is the most efficient way to persist thousands of entities?

Implementing Spring + Apache Flink project with Postgres

Spring Kafka ChainedKafkaTransactionManager doesn't synchronize with JPA Spring-data transaction

Using ClassifierCompositeItemWriter and FlatFileItemWriter to write to multiple files

SELECT FOR UPDATE query with Spring's #Transaction management creates deadlock upon subsequent updates

Categories

Resources