SpringBatch with SpringBoot - java

Hi folks, I need some opinions here.
I already have a Spring Boot application that holds all my REST APIs and runs on the Tomcat that ships with spring-boot-starter-web.
I would like to set up jobs using Spring Batch that will be scheduled via Kubernetes. The idea is to share the same business logic instead of creating a standalone batch project, which would mean maintaining the business logic twice.
Question: scheduling via Kubernetes means I will be firing java -jar someJar --spring.batch.jobNames=xxx in a container. Doing that will also start up all my REST APIs, right? That is unnecessary and a waste of resources. Is there any way to mitigate this, or is my understanding wrong?

The way I would implement this is by extracting the common business logic into a separate module and making both the batch app and the web app depend on that module.
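A minimal sketch of that split, written in plain Java so it stays self-contained. All names here (PriceCalculator, WebEntryPoint, BatchEntryPoint) are hypothetical; in a real build the first class would live in the shared module and the other two in the web and batch modules that depend on it.

```java
import java.util.List;

// Shared module: business logic with no web or batch dependencies (hypothetical example).
final class PriceCalculator {
    // Sums prices given in cents and applies a percentage discount.
    static long totalWithDiscount(List<Long> pricesInCents, int discountPercent) {
        long total = 0;
        for (long price : pricesInCents) {
            total += price;
        }
        return total - (total * discountPercent) / 100;
    }
}

// Web module: a REST controller would delegate to the shared logic like this.
final class WebEntryPoint {
    static String handleRequest(List<Long> prices) {
        return "total=" + PriceCalculator.totalWithDiscount(prices, 10);
    }
}

// Batch module: a job step would call the same logic for every record it reads.
final class BatchEntryPoint {
    static long processChunk(List<List<Long>> orders) {
        long sum = 0;
        for (List<Long> order : orders) {
            sum += PriceCalculator.totalWithDiscount(order, 10);
        }
        return sum;
    }
}
```

With this layout the batch jar never pulls in the web starter, so launching a job does not start the embedded Tomcat.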

Related

Controller of microservices in Java on same system

How do I launch and control multiple microservices on the same system in Java? Is there an existing Java controller that can do this?
I have an application server which consists of multiple microservices that run on the same OS instance/system. Each microservice is Spring Boot based, though there are a few exceptions. I'm looking for an already written controller which will start each service in order and restart a service if it fails. I'm not looking for a container-based approach but rather a controller which runs as a process on Windows.
I don't want to create and maintain Windows service entries for each service, as that is error-prone and tough to keep configured correctly. Getting the startup order right is also difficult.
I can write one myself but I'd rather not re-invent the wheel if I can find something that does what I need.

Is there a way to deploy/update Spring Batch jobs separately?

We're running a Spring Batch web application for importing CSV files into a database. This web application is currently evolving and is constantly being extended with new jobs.
The current update procedure looks like this:
1. Write new code
2. Build a WAR file
3. Deploy the newly built WAR file and replace the whole web application on the Tomcat web server
This might get us into trouble when the running system is in the middle of importing files and writing them to the database.
I wanted to know if there is a smart way to upgrade the Spring Batch jobs separately.
I already thought about splitting the project into many different web applications, but this might add a lot of overhead with all the libraries bundled into the WAR file(s).
Are there any best practices for building that sort of application?
Thanks for your help!
This packaging model is known to cause a lot of issues like the one you are facing. I recommend packaging your jobs as separate jars and making your application launch those jobs in separate processes. With this model, you can deploy or upgrade jobs without impacting the web application used to launch them.
For the record, Spring Batch Admin suffered from this packaging model (as described here), and the recommended replacement is Spring Cloud Data Flow (which uses the model I described above).
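As a rough illustration of launching a job jar in its own process, here is a sketch using only the JDK's ProcessBuilder. The class name JobProcessLauncher, the jar name, and the key=value argument style are assumptions made for the example; how a job jar actually reads its parameters depends on how it is built.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: runs a separately packaged job jar in its own JVM.
final class JobProcessLauncher {

    // Builds "java -jar <jar> key=value ...". Parameters are sorted by key so
    // the resulting command line is predictable.
    static List<String> buildCommand(Path jobJar, Map<String, String> params) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-jar");
        command.add(jobJar.toString());
        new TreeMap<>(params).forEach((key, value) -> command.add(key + "=" + value));
        return command;
    }

    // Starts the job process, forwards its output to this process's console,
    // blocks until it exits, and returns the exit code.
    static int launch(Path jobJar, Map<String, String> params)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(jobJar, params))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Because each job runs from its own jar, replacing that jar on disk upgrades the job without redeploying the launching web application.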

Java Spring/Workflow

I have 50,000,000 files that need to be processed using 3-5 different filters configured in workflows.
I plan to use a microservice architecture.
My questions:
1. I want to use Spring Integration and Spring Batch to design and run the workflows. Do you agree, or is there another Java-based system you would recommend?
2. Can Spring Batch handle long-running (i.e. days) workflows?
3. Can Spring Batch/Integration load XML files on the fly?
I think Spring Batch is pretty good for this job. My answers are below.
1. I recommend Spring Batch for this job. It's easy to use, and in combination with Spring Workflow it works well for the workflow design.
2. Yes, it handles that well, provided you configure it properly.
3. I'm not sure what you mean by "on the fly" (batch files or configuration files). For batch files, yes. For configuration files, it depends on how you load the configuration and how you use the context.
IMHO Spring Batch can process files based on multiple filters. It can also be easily customized to fit most of your needs and has really fast processing speeds. However, I haven't tried it with anything close to 50,000,000 files, so can't vouch for that.
To run a Spring Batch application as a microservice, take a look at Spring Boot and Spring Cloud Task. Also, look into Spring Cloud Data Flow for orchestration.

Fast Multithreaded Online Processing Application Framework Suggestions

I am looking for a pattern and/or framework which can model the following problem in an easily configurable way.
Every 3 minutes or so, I need to have a set of jobs kick off in a web application context that will concurrently hit web services to obtain the latest version of the data and push it to a database. The problem is that the database will be heavily used at the same time for reads that feed a large number of complex calculations. We are currently using Spring, so I have been looking at Spring Batch to run this process. Does anyone have suggestions, patterns, or examples of using Spring or other technologies for a similar system?
We have used ServletContextListeners to kick off TimerTasks in our web applications when we needed processes to run repeatedly. The ServletContextListener fires when the app server starts the application or when the application is restarted. The timer tasks then act like a separate thread that repeats your code at the specified interval.
ServletContextListener
http://www.javabeat.net/examples/2009/02/26/servletcontextlistener-example/
TimerTask
http://enos.itcollege.ee/~jpoial/docs/tutorial/essential/threads/timer.html
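The timer half of that approach can be sketched with the JDK alone. RefreshScheduler and its refresh() method are hypothetical names; the comments mark where a ServletContextListener would call in, and the listener itself is left out so the example does not depend on the servlet API.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for a repeating background task.
final class RefreshScheduler {
    private final AtomicInteger runs = new AtomicInteger();
    private Timer timer;

    // Intended to be called from ServletContextListener.contextInitialized(...).
    synchronized void start(long periodMillis) {
        if (timer != null) {
            return; // already running
        }
        timer = new Timer("refresh-timer", true); // daemon thread
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                try {
                    refresh();
                } catch (RuntimeException e) {
                    // An exception escaping run() would terminate the Timer thread,
                    // so failures are logged and the schedule keeps going.
                    System.err.println("refresh failed: " + e);
                }
            }
        }, 0L, periodMillis);
    }

    // Intended to be called from ServletContextListener.contextDestroyed(...).
    synchronized void stop() {
        if (timer != null) {
            timer.cancel();
            timer = null;
        }
    }

    // Placeholder for the real work (calling web services, writing to the database).
    void refresh() {
        runs.incrementAndGet();
    }

    int runCount() {
        return runs.get();
    }
}
```

For the 3-minute case in the question, start would be called with 180000 as the period in milliseconds.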
Is refactoring the job out of the web application and into a standalone app a possibility?
That way you could stick the batch job onto a separate batch server (so that the extra load of the batch job wouldn't impact your web application), which then calls the web services and updates the database. The job can then be kicked off using something like cron or Autosys.
We're using Spring-Batch for exactly this purpose.
The database design would also depend on what the batched data is used for. If it is for reporting purposes, I would recommend separating the operational database from the reporting database, using a database link to obtain the required data from the operational database into the reporting database and then running the complex queries on the reporting database. That way the load is shifted off the operational database.
I think it's also worth looking into integration frameworks like Apache Camel. Also take a look at the so-called Enterprise Integration Patterns. Check the catalog; it might provide you with some useful vocabulary for thinking about the scaling/scheduling problem at hand.
The framework itself integrates really well with Spring.

Threads in a Java EE application

I have a Java EE application that has two components. The first is a service that scrapes some information from the internet and stores it in a database. The second is a web interface (deployed on Tomcat) where users can browse that information.
What could be the best approach to implement the first component? Should it be run as a background Daemon/Service or a thread within the container?
I would personally separate them into different processes. Aside from anything else, it means you can restart one without worrying about the other. It also means you can really easily deploy them on different machines without pointlessly installing Tomcat for a service which doesn't actually need a web interface.
Depending on the type of application framework, Spring lets you use Quartz or the java.util.concurrent framework. Spring has a TaskExecutor abstraction (see the Spring documentation) which simplifies a lot of this, but check to see which fits best with your design.
Spring or Quartz (managed by Spring) then controls the creation and starting/stopping of Threads or Executors or Jobs, along with their frequency/period and other scheduling parameters, and also manages any pooling of jobs you might require.
I use these for all my background tasks and batch jobs in any Java EE applications I write, with no problems. Since the jobs are Spring-managed POJOs, they have access to the full dependency injection framework and everything else that Spring entails, and of course you can switch between scheduler frameworks with a simple change to your application configuration XML file as your needs change or scale.
There is nothing wrong with having background jobs inside a web container, but you MUST let the web container know about it so it can be stopped and started properly.
Have a look at the load-on-startup tag in web.xml. There is some advice at http://wiki.metawerx.net/wiki/Web.xml.LoadOnStartup
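To illustrate the "stopped properly" part, here is a small sketch using only java.util.concurrent. BackgroundWorker is a hypothetical class; its stop method is what the container's shutdown hook (for example a servlet's destroy() or a context listener) would call so that queued work finishes before the application is unloaded.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical background worker with an explicit, container-friendly shutdown.
final class BackgroundWorker {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final AtomicInteger completed = new AtomicInteger();

    // Queues a job; jobs run one at a time on the worker thread.
    void submit(Runnable job) {
        executor.execute(() -> {
            job.run();
            completed.incrementAndGet();
        });
    }

    // Stops accepting new jobs and waits for queued ones to finish.
    // Returns true if everything finished within the timeout; otherwise
    // interrupts the running job and returns false.
    boolean stop(long timeout, TimeUnit unit) {
        executor.shutdown();
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            executor.shutdownNow();
            return false;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    int completedCount() {
        return completed.get();
    }
}
```

Without a call like stop, the worker thread can outlive an undeploy and keep the old application's classes loaded.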
