spring boot distributed processing with kafka

spring boot distributed processing with kafka - java

I have several apps developed using spring boot. Some apps call another apps, that in time call other apps, it is getting hard to manage and scale. I need to be able to distribute them in a network and also combine the apps in different 'flows' with minimun changes to the apps.
Ideally I would like to wrap the apps and abstract them into components that have N inputs and M outputs. At boot time I would use some configuration to wire the inputs and outputs to real kafka topic queues.
For instance, input A to an app can come from several kafka topic queues, and the output B from the same app can go to other set of kafka topic queues.
I would like to be able to change the queues without having to recompile the apps, also no extra network hops to send/receive from/to multiple queues, this should in the same process and multi threaded.
Does anybody knows if something similar exists already? Can spring integration do this? Apache Camel? Or am I better off writing it myself?

See Spring for Apache Kafka. There is also a spring-integration-kafka extension that sits on top of spring-kafka.
The Spring for Apache Kafka (spring-kafka) project applies core Spring concepts to the development of Kafka-based messaging solutions. It provides a "template" as a high-level abstraction for sending messages. It also provides support for Message-driven POJOs with #KafkaListener annotations and a "listener container". These libraries promote the use of dependency injection and declarative. In all of these cases, you will see similarities to the JMS support in the Spring Framework and RabbitMQ support in Spring AMQP.

Related

Ways to implement Google Pub Sub

I found these 3 ways for implementing messaging with Google Pub Sub:
with client libraries
https://cloud.google.com/pubsub/docs/publisher
with spring integration message channels and PubSubTemplate API
https://dzone.com/articles/spring-boot-and-gcp-cloud-pubsub
without message channels but with PubSubTemplate API
https://medium.com/bb-tutorials-and-thoughts/gcp-how-to-subscribe-and-send-pubsub-messages-in-spring-boot-app-b27e2e8863e3
I want to understand the differences between them / when each is best to use and which would be useful for my case.
I have to implement a single Topic and a single Subscription to get the queue functionality. I think I'd rather not use Spring message channels if not needed , they seem to intermediate the communication between Pub Sub topic and the subscription and I don't want that. I want things simple , so I think the option 3 would be best but I am also wondering about option 1.

Option 1, client libraries, is universal. You don't need Spring to run it, you can use this library in Groovy or in Kotlin also.
Option 2, it's deeply integrated to Spring. It's quite invisible but if you have special thing to do, it's tricky to override this implementation
Option 3, it's a light spring integration. PubSubTemplate (the client in fact) is loaded automatically for you at startup, as any bean and you can use it easily in your code. It's my preferred option when I use Spring.

Google Cloud Pub/Sub Using Client Libraries :
Using Google Cloud Pub/Sub with Client libraries is one of the standard and easiest way to implement Cloud Pub/Sub.
A producer of the data publishes messages to Pub/Sub topic, a subscriber client then creates a subscription to that topic and consumes messages.
You need to install the client libraries. You can follow this setup and tutorial for further information.
Here you won't require Spring integration, you can directly use the client library to publish messages and pull it from subscription.
Spring Integration using spring channels :
This use case involves intensive integration of Spring Boot Application with Google Cloud Pub/Sub using Spring Integration to send and receive Pub/Sub messages. ie. Pub/Sub acts as intermediate messaging system
Here The Spring Application sends messages to Cloud Pub/Sub topic utilizing spring channels and the Application further receives messages from Pub/Sub through these channels.
Pub/Sub message in Spring-Boot App :
This use case is a simple and valid example of integrating Cloud Pub/Sub with Spring boot application.
The use case demonstrates how to subscribe to a subscription and send message to topics using Spring Boot Application
Message is published to the topic, queued in the respective subscription and then received by the subscriber Spring Boot Application

What is the difference between Spring Cloud Stream and Spring Cloud Task?

My current understanding is that both of these projects are under Spring Cloud Dataflow, and serve as components of the pipeline. However, both can be made recurring (a stream is by definition recurring, where a task can run every certain time interval). In addition, both can be configured to communicate with the rest of the pipeline through the message broker. Currently there is this unanswered question, so I've yet to find a clear answer.

Please see my response as below:
My current understanding is that both of these projects are under Spring Cloud Dataflow, and serve as components of the pipeline.
Both Spring Cloud Stream and Spring Cloud Task are not under Spring Cloud Data Flow, instead, they can be used as standalone projects and Spring Cloud Data Flow just uses them.
Spring Cloud Stream lets you bind your event-driven long-running applications into a messaging middleware or a streaming platform. As a developer, you have to choose your binder (the binder implementations for RabbitMQ, Apache Kafka etc.,) to stream your events or data from/to the messaging middleware you bind to.
Spring Cloud Task doesn't bind your application into a messaging middleware. Instead, it provides abstractions and lifecycle management to run your ephemeral or finite duration applications (tasks). It also provides the foundation for developing Spring Batch applications.
However, both can be made recurring (a stream is by definition recurring, where a task can run every certain time interval)
A task application can be triggered/scheduled to make it a recurring one whereas the streaming application is a long-running, not a recurring one.
In addition, both can be configured to communicate with the rest of the pipeline through the message broker.
Though a task application can be configured to communicate to a messaging middleware, the concept of pipeline is different when it comes to stream vs task (batch). For the streaming applications, the pipeline refers to the communication via the messaging middleware while for the task applications, the concept of composed tasks lets you create a conditional workflow of multiple task applications. For more information on composed tasks, you can refer to the documentation.

Two spring boot apps communicating with messaging queue between each other

I have two spring boot apps running in the same local network and they need to communicate with each other. An obvious answer is to leverage REST API and make http calls, but I would like to use Spring Integration project for this purpose.
That said, I have several architectural questions regarding that:
Should I setup a standalone messaging framework (e.g. Rabbit MQ) or embedded should also work (e.g. messaging will be embedded to one of the two apps).
If standalone, what messaging framework should I choose: ActiveMQ, RabbitMQ or something else?

Welcome to the Messaging Microservices world!
You go right way, but forget the embedded middleware if you are going to production. Especially when your application will be distributed geographically.
So, right you need some Message Broker and that should be definitely external one.
It's really your choice which one is better for your purpose. For example you can even take into account Apache Kafka or Redis.
If we talk here about Spring Integration it might be better for you to consider to use our new product - Spring Cloud Stream.
With that you just have your applications as Spring Boot Microservices which are able to connect to the external middleware transparently for the application. You just deal with message channels in the application!

Light Weight Java Web Services

I have Java EE applications (ear) running on separate JBoss instances and on different hardware. I want to call from
one application to another which is in another server JBOSS.
Same JBOSS, between two ear.
Same Server, between two JBOss.
The communication data types can be any type. For instance; JSON or Objects. I want to know what lightweight, Open source Java web frameworks I can use to call from one to another? Here some of them. But I don't have any experience from them. Commonly, SOAP and RESTful services are used and there are many implementation frameworks of them.
Please suggest me know from your experience what are the available frameworks which suit for my requirement? Let me have source which explain any comparison. My concerns are that, the communication methodology should be light weight, should support to transfer any type of data, there should not be much configurations, or standards. The framework should support to transfer simply (all communications are done in my applications. so no need well structured, standardized weight configurations) and securely. and it should be in Java. I use Java 7.

This is a typical integration problem. For integrating, mediating, proxying etc. different services and even transferring data, use Apache Camel. For a short answer what Camel is, see What exactly is Apache Camel?
In Camel you define routes using a Java DSL or a XML Spring DSL. Proxying a web service is described here. Using the XML Spring DSL, the route would look as follows:
<route>
<from uri="jetty:http://0.0.0.0:8080/myapp?matchOnUriPrefix=true"/>
<to uri="jetty:http://realserverhostname:8090/myapp?bridgeEndpoint=true&throwExceptionOnFailure=false"/>
</route>
Using the Java DSL, this would become:
from("jetty:http://0.0.0.0:8080/myapp?matchOnUriPrefix=true"
.to("jetty:http://realserverhostname:8090/myapp?bridgeEndpoint=true&throwExceptionOnFailure=false")
There are many different protocols that are supported by Camel such as JSM, SOAP WS, RESTful WS, plain HTTP, TCP. Have a look at https://camel.apache.org/components.html for all possibilities.
The next example shows you how easy it is to define a RESTful server using the Restlet component:
from("restlet:http://localhost:8400/orders/{id}?restletMethod=post")
.process(new Processor() {
#Override
public void process(final Exchange exchange) throws Exception {
final String res = "received [" + exchange.getIn().getBody(String.class) + "] with order id = " + exchange.getIn().getHeader("id");
exchange.getIn().setBody(res);
}
});
The corresponding client looks as follows:
from("direct:start")
.setBody(constant("Hello, world!!"))
.to("http://localhost:8400/orders/22?restletMethod=post")
.log("order: direct start body result = ${bodyAs(String)}")
That said, Camel supports a plentitude of enterprise integration patterns such as splitter, aggregator etc. that can be used for your needs. Have a look at http://camel.apache.org/enterprise-integration-patterns.html for more information about that.
You can just use "normal" Java classes for transforming data and hook them into the routes. Beside that there are many integrated type converter for transforming one data type to another. These converters can easily be extended. See https://camel.apache.org/type-converter.html.
You could use Camel as your base integration framework and add e.g. JMS/ActiveMQ for the communication. However, it is also possible to use ActiveMQ as your base and add Camel for transforming the data, see https://activemq.apache.org/broker-camel-component.html: "The broker camel component makes this even easier - which intercepts messages as they move through the broker itself, allowing them to be modified and manipulated before they are persisted to the message store or delivered to end consumers." However, I prefer to use Camel as the base and add JMS/ActiveMQ for the asynchronous communication (e.g. if message persistence is needed or if the communication has to occur between different hosts).
Camel supports a huge amount of different protocols and formats. However, you don't have to use them, if you don't need them. Just add the dependencies to your pom.xml if you need them. Apache Camel is a small library (11.2 MB) with minimal dependencies for easy embedding in any Java application. Camel runs standalone, in a Servlet engine, or in an OSGI container such as Karaf/ServiceMix/JBoss Fuse ESB. You can start small and the application can grow, if your needs are growing.
For starting using Camel, read the excellent book by Claus Ibsen: http://www.manning.com/ibsen/.

From my understanding of your situation, I think ESB would be a good solution for your problem.
http://en.wikipedia.org/wiki/Enterprise_service_bus
The one from WSO2 is a pretty light-weight open-source ESB and has a good active community.
http://wso2.com/products/enterprise-service-bus/

you could use jax-ws to provide the webservices from your JBoss and call them using javax.xml.soap. What i dont know is if its possible to send object data, maybe you have to serialize from and to xml end send it encoded as base64 string.
Another way might be jms.

If all of the other solutions listed here do not fit your needs, you could interact with the applications by sending JSON or XML data over HTTP.
Spark is a micro web framework for Java that lets you quickly create web endpoints.
By default, Spark runs on an embedded server, but it can easily run on an existing JBoss server instead. Here is a sample that I put together a few months ago to demonstrate how it works and how to get it working with JBoss.
You can have each application that needs to receive data expose a HTTP endpoint and have the calling applications send a simple HTTP request.

Simple and open win. You can expose objects remotely in many different ways, but Java RMI and EJB limit you to Java only clients.
The most open, easiest way to do it is to use HTTP as your protocol and web services, either SOAP or REST (my preference). These will interact easily with any client, even those that aren't Java. Clients need not know or care that you chose Java and JBOSS to implement your server logic.

Existing spring application extension by adding camel features

I have my web application written in Spring MVC. It is quite simple app for registering some activities and generating reports after some time. Now I have it done fully in Spring. The only entry point is HTTP webapp request. I'd like to add other entry points to allow user to trigger application via JMS queue, FTP files and SOAP-based web service.
I know I can do this all using Spring own features somehow, but I wonder if it is desirable to involve Apache Camel into all that stuff?
I think about leaving web application as it is (communicating directly with services), only add some Camel magic to spring context and expose several endpoints from Camel and then after messages processing and transformations call existing services.
I think about using Camel to be able to use some asynchronous processing and threads/scalability features. Is it the right way to go?

I will recommend you to use Apache Camel. I have used it for a similar purpose. The solution is an appropriate one from a 'Separation of Concerns' point. Camel implement Enterprise Integration Patters and is a better solution for integrating various protocols and interfaces. Your application should deal with functionality only and as designed should just expose a servlet to get requests and process it.
Handling of interfaces and protocols are well structured in Camel and its easy to maintain and configure in the long run.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.