I'm working on an ETL project and have been using Spring Integration for a long time. The data source is currently files or Chronicle, but it may change to live streams, and volumes are likely to grow. There is potential to move to big data solutions (Hadoop, Spark, etc.) in the future.
Given this, how do Spring Integration and Reactive Streams compare? Why would anyone use one over the other (or am I wrong to compare the two in the first place)? Are there scenarios where they could be used together?
Actually, both of them can be used together. Check out the documentation for Reactive Spring Integration.
To introduce parallel processing into an application that makes multiple save-entity calls, a previous developer chose Spring Integration, using split().channel(MessageChannels.executor(Executors.newFixedThreadPool(10))).handle("saveClass","saveMethod").aggregate().get(), where this method is mapped to a requestChannel using the @Gateway annotation. This task seems simpler to do with parallelStream() and forEach(). Does IntegrationFlow provide any benefit in this case?
If you are really doing plain in-memory data processing where Java's Stream API is enough, then indeed you don't need a whole messaging solution like Spring Integration. But if you have distributed requirements to process data from different systems, for example from HTTP to Apache Kafka or a DB, then it is better to use a tool that lets you connect everything together smoothly. Also, nothing stops you from using the Stream API in a Spring Integration application; in the end all your code is Java anyway. Please learn more about what EIP is and why we need a special framework to implement these messaging-based solutions: https://www.enterpriseintegrationpatterns.com/
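For the plain in-memory case, the split / handle / aggregate shape from the question can be sketched with the Stream API alone. This is only an illustration: ParallelSaveSketch, save and SAVED are hypothetical stand-ins for the real "saveClass"/"saveMethod", and the in-memory queue stands in for whatever store the application writes to.

```java
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical stand-in for the "saveClass"/"saveMethod" pair from the question.
class ParallelSaveSketch {

    // Thread-safe sink standing in for a real repository or DAO (assumption).
    static final ConcurrentLinkedQueue<String> SAVED = new ConcurrentLinkedQueue<>();

    // Pretend to persist one entity and return a confirmation.
    static String save(String entity) {
        SAVED.add(entity);
        return "saved:" + entity;
    }

    // Split the list, save each element in parallel, and gather the results.
    // collect() on an ordered source keeps the results in encounter order.
    static List<String> saveAll(List<String> entities) {
        return entities.parallelStream()
                .map(ParallelSaveSketch::save)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> entities = IntStream.range(0, 5)
                .mapToObj(i -> "entity-" + i)
                .collect(Collectors.toList());
        System.out.println(saveAll(entities));
    }
}
```

Note that parallelStream() runs on the JVM-wide common ForkJoinPool by default, whereas the flow in the question uses its own fixed pool of 10 threads. If the save calls block on I/O, a dedicated pool may still be the safer choice.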
I have a Java/Jersey API that is called from the front end, and I need to write tests for the Java code. The code is structured as follows:
1. The API call executes the resource method, which calls a separate method that gets data from the DB and returns it to the resource method. The resource method then returns a javax.ws.rs.core.Response to the client.
This is going to be my first time writing tests, so please answer assuming I know nothing. What is the best way to start here? And what types of tests should I write? Unit tests are what I’m aiming for here.
I have done a lot of research and I’m leaning towards using JUnit + Mockito. But how do I check the data in a Response object?
And how should I test the other file, the one that gets data from the DB? I found that DBUnit can do that, but do I need it?
Another framework I came across is Rest Assured. Do I need to include that as well, or can the same things be done with JUnit/Mockito?
I just want some direction from people who have tested Jersey APIs, and I want to know the most common way to do this.
I do not think there is a single best way to do this; what you need to test is often subjective and depends on the context.
However, you can structure your code so that the most important parts are easy to test, and what's left (integration) can be tested later or with different tools.
What I suggest here is to follow the principles of hexagonal architecture. The idea is to keep all business rules at the center of your application, without any dependencies (imports, etc.) on any framework (JAX-RS, JPA, etc.). These rules can easily be designed with TDD, and the resulting tests run very quickly. You may need Mockito to mock implementations of the SPI interfaces.
As a second step, you can connect this "core" to the outside world (HTTP, databases, AMQP, etc.) by wiring adapters that use its API and implement its SPI interfaces.
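As a rough illustration of the core/SPI split described above, here is a minimal sketch. All names (CustomerStore, GreetingService, InMemoryCustomerStore) are hypothetical and not taken from the question's code. A hand-written in-memory fake plays the role that a Mockito mock could play, so the example needs nothing beyond the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// SPI: what the core needs from the outside world (hypothetical name).
interface CustomerStore {
    Optional<String> findName(int id);
}

// Core business rule: no JAX-RS, JPA or other framework imports.
class GreetingService {
    private final CustomerStore store;

    GreetingService(CustomerStore store) {
        this.store = store;
    }

    String greet(int id) {
        return store.findName(id)
                .map(name -> "Hello, " + name)
                .orElse("Unknown customer");
    }
}

// Test double implementing the SPI; a real adapter would query the database.
class InMemoryCustomerStore implements CustomerStore {
    private final Map<Integer, String> names = new HashMap<>();

    InMemoryCustomerStore with(int id, String name) {
        names.put(id, name);
        return this;
    }

    @Override
    public Optional<String> findName(int id) {
        return Optional.ofNullable(names.get(id));
    }
}

class HexagonalSketch {
    public static void main(String[] args) {
        GreetingService service =
                new GreetingService(new InMemoryCustomerStore().with(1, "Ada"));
        System.out.println(service.greet(1));
        System.out.println(service.greet(2));
    }
}
```

In a Jersey application, the resource method would then be a thin adapter that calls something like GreetingService and wraps the result in a Response, which leaves very little logic in the resource itself to unit-test.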
If you want to test these adapters, you leave the scope of unit tests and write integration tests: integration with a framework, a protocol, anything really.
This kind of test can use a wide variety of tools, from a framework-related mock (like the Jersey Test Framework) or an in-memory database (like H2) to a fully operational middleware instance using tools like Testcontainers.
What is important to remember when writing integration tests is that they are slow compared to unit tests. To keep the feedback loop as short as possible, you will want to keep the number of integration tests to a minimum.
Hoping this will help you!
I want to build an application that listens to a queue and performs a series of steps.
Basically the application should listen to Queue1 and:
- Get some data from ServiceA [small amount of data]
- Get some data from ServiceB [small amount of data]
- Update some information in ServiceC [based on the data]
- Create a number of messages [based on the data] on Queue2.
Due to the flow-based nature of this application, I was looking into the job execution system in Spring. However, all the steps are designed to be idempotent and the data transferred between steps is small, so I do not want a database for this application.
I started exploring Spring Batch and Spring Task for this. Spring Batch provides really good constructs like Tasklet and Step, but there are a number of comments recommending that Spring Batch be connected to a database, and noting that it is designed to manage massive amounts of data reliably (I don't need that reliability here, since the queue and the idempotent steps provide it). While I can pass data using the Execution Context, there were recommendations against it.
Questions:
- Are there simpler starters in the Spring Boot ecosystem that provide a workflow/job-like interface that I should use?
- Is this a valid use case for Spring Batch, or is it over-engineering/misuse of the steps?
Thanks a lot for the help
Ayushman
P.S: I can provide exact details of the job but did not want to conflate the question.
I have two projects' worth of experience with Spring Batch; I haven't tried Spring Task.
Having said that, my answer is somewhat biased. Spring Batch is notoriously difficult to configure. If your application is simple enough, just use "spring-boot-starter-amqp"; it will be enough.
If you do decide to use Spring Batch (for its Job and Step features, or for other features), you may want to configure it to use just an in-memory database, because you don't need the retry/rollback features it provides.
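To show how small the no-Batch version can be, here is a minimal sketch of the four steps from the question as one plain Java handler. Everything in it is an assumption made for illustration: the class name, the String payloads, and the use of Function/Consumer in place of real clients for ServiceA, ServiceB, ServiceC and Queue2. In an application built on spring-boot-starter-amqp, a method like onMessage would typically be called from a listener method (for example one annotated with @RabbitListener); that wiring is left out here so the example runs on the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical handler for one message taken from Queue1.
class QueueWorkflowSketch {
    private final Function<String, String> serviceA; // stand-in for a ServiceA client
    private final Function<String, String> serviceB; // stand-in for a ServiceB client
    private final Consumer<String> serviceC;         // stand-in for a ServiceC update call
    private final Consumer<String> queue2;           // stand-in for a Queue2 publisher

    QueueWorkflowSketch(Function<String, String> serviceA,
                        Function<String, String> serviceB,
                        Consumer<String> serviceC,
                        Consumer<String> queue2) {
        this.serviceA = serviceA;
        this.serviceB = serviceB;
        this.serviceC = serviceC;
        this.queue2 = queue2;
    }

    // Runs the steps in order; intermediate data lives in local variables only.
    void onMessage(String message) {
        String a = serviceA.apply(message);          // step 1: get data from ServiceA
        String b = serviceB.apply(message);          // step 2: get data from ServiceB
        String combined = a + "|" + b;
        serviceC.accept(combined);                   // step 3: update ServiceC
        for (String part : combined.split("\\|")) {  // step 4: fan out to Queue2
            queue2.accept(message + ":" + part);
        }
    }

    public static void main(String[] args) {
        List<String> updates = new ArrayList<>();
        List<String> outgoing = new ArrayList<>();
        QueueWorkflowSketch flow = new QueueWorkflowSketch(
                m -> "a-" + m, m -> "b-" + m, updates::add, outgoing::add);
        flow.onMessage("m1");
        System.out.println(updates);
        System.out.println(outgoing);
    }
}
```

Because the steps are described as idempotent, a failure partway through could be handled by letting the message be redelivered and running the whole method again, which is the reliability the question says the queue already provides.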
I have 50,000,000 files that need to be processed using 3-5 different filters configured in workflows.
I plan to use a microservice architecture.
My questions:
I want to use Spring Integration and Spring Batch to design and run the workflows. Do you agree, or is there another Java-based system you recommend?
Can Spring Batch handle long-running workflows (i.e. ones that run for days)?
Can Spring Batch/Integration load XML files on the fly?
I think Spring Batch is pretty good for this job; my answers are below.
I recommend Spring Batch for this job. It's easy to use, and in combination with Spring Workflow it is good for workflow design.
Yes, it's really good at this, provided you configure it well.
I'm not sure what you mean by "on the fly" (batch files or configuration files). For batch files, yes. For configuration files, it depends on how you load the configuration and how you use the context.
IMHO Spring Batch can process files based on multiple filters. It can also be easily customized to fit most of your needs and has really fast processing speeds. However, I haven't tried it with anything close to 50,000,000 files, so can't vouch for that.
To run a Spring Batch application as a microservice, take a look at Spring Boot and Spring Cloud Task. Also, look into Spring Cloud Data Flow for orchestration.
I've created some REST endpoints using pure Groovy/Grails. For now, most of the operations are CRUD-like.
I'm beginning to compare the performance of the Grails app to an equivalent Java/Spring app for the CRUD scenarios that I've made, using JMeter. So now I'm taking a subset of the scenarios I've implemented in Grails and porting them to a basic Spring MVC app.
I'm very interested in seeing performance comparisons published by others on the web. Can anyone refer me to some?
Any other information regarding the testing and analysis I'm going to do is welcome. Thanks!
UPDATE REGARDING THE ANSWER:
@Lari's answer below references a website with extremely comprehensive tests, comparing Grails 2.X vs Spring 4.X (see README.md), in addition to a multitude of other frameworks.
However, those tests have Grails running on Resin while Spring is on Tomcat. A little strange to me since Grails uses Tomcat by default.
Resin and Tomcat arguably have similar performance.
The website has several sections (tabs on top) and even subsections (tabs in the "Results" area). My original question was regarding web service behavior for REST. To that end here are the top-level sections that answered my question:
Querying multiple rows in a DB table (HTTP GET) and returning JSON array as result.
Modifying multiple rows in a DB table and returning JSON array as result. This test does not use HTTP PUT with a body, but instead HTTP GET. Scroll to bottom of page for details, and also Requirements page.
If you're interested in HTML rendering see the Fortune Cookie example.
Not surprisingly, Spring performs better, but as @Joshua points out, this is a contrived example and you will have to judge for yourself what to extrapolate from the results. Not to mention that Grails ran on Resin while Spring ran on Tomcat. Hopefully each server (Tomcat / Resin) was configured similarly in terms of max threads, Java memory, etc. The config files may be buried in the source code (if you find out, let me know).
I also set up dummy applications for Spring 4.X vs Grails 2.X, with Tomcat configured exactly the same (both used the same standalone Tomcat installation rather than one bundled inside Grails). In my tests I performed an HTTP GET and returned a JSON array formed from static (pre-instantiated) in-memory objects (no DB query). My results also showed better performance for Spring (sorry, I can't find my data any longer!). I used Spring Boot to slap a Spring app together quickly, and Grails already has scaffolding by default.
There is http://www.techempower.com/benchmarks/. The source code is on GitHub.
Take a look at this PLAY VS. GRAILS SMACKDOWN presentation. You can find some performance results in it.