Streaming data architecture

Streaming data architecture - java

I would like to design the best architecture for the following project: I have an application that runs on any device (desktop, mobile...) where users can publish notifications to, and receive notifications from, the other users they share data with.
Basically, a user can share with other users what he is doing in the application; the other users are notified of the changes in real time, and vice versa. Users only receive the notifications that other users have allowed them to receive.
For example, when a user moves a widget on the screen, the application must store the new widget position and also notify other users of this new position in real time, so that the change is applied on their screens. For this need, I have in mind an event-driven architecture with a publish-subscribe pattern. However, I guess I would also need to handle a synchronous request-response pattern, for example when the application needs to retrieve the list of users a widget can be shared with.
I had a quick look at the book Streaming Data (Manning), which describes a streaming data architecture, but I don't know if this kind of architecture would fit my needs. One difference in the implementation, for example, is that in my application the event source producer can also be an event consumer (in the book, the event source producer is a separate public streaming API and the real application is the only consumer).
My idea, if I loosely follow the book, would be the following: WebSocket for data ingestion and data access, a broker like Kafka as the message repository, and a separate analysis service consuming Kafka topics and persisting data in a DB. One open question is whether I could use a single WebSocket for both data ingestion and data access.
Which detailed architecture and tools would you use to fit these needs?
For the implementation, I would consider JavaScript for the client part and Java for the server part.

This is a pretty common use case for Kafka (leveraging both the broadcast and storage elements). There are some examples here which should help, although the context is slightly different:
https://github.com/confluentinc/kafka-streams-examples/tree/4.0.0-post/src/main/java/io/confluent/examples/streams/microservices
https://www.confluent.io/blog/building-a-microservices-ecosystem-with-kafka-streams-and-ksql/
In this example the CQRS pattern is used, so changes you make to the screen position would create events sent to Kafka, and you would then create a view service that other application instances can (long) poll to get changes.
You could also implement this with a WebSocket. There are a few implementations of this on GitHub, but I haven't tried any of them personally. The one complexity is that, if you want to scale out to many nodes, you need some way to map messages in Kafka to open WebSockets (whereas mapping requests to Kafka partitions in the REST example is handled automatically). Getting started with a single-server implementation wouldn't require this complexity though.
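To make the single-server case concrete, here is a minimal sketch of the "map messages to open WebSockets" step, assuming plain in-memory maps. NotificationHub, its method names, and the Consumer<String> callback standing in for a WebSocket session are all hypothetical; no Kafka or WebSocket API is used here. The dispatch method is where an event read from the broker would be fanned out, and only to users the publisher has allowed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical in-memory hub: routes events read from the broker to open sessions.
final class NotificationHub {
    // userId -> callback standing in for an open WebSocket session (assumption)
    private final Map<String, Consumer<String>> sessions = new ConcurrentHashMap<>();
    // publisherId -> users that the publisher allows to receive its events
    private final Map<String, Set<String>> allowedReceivers = new ConcurrentHashMap<>();

    void connect(String userId, Consumer<String> session) {
        sessions.put(userId, session);
    }

    void disconnect(String userId) {
        sessions.remove(userId);
    }

    void allow(String publisherId, String receiverId) {
        allowedReceivers
            .computeIfAbsent(publisherId, k -> ConcurrentHashMap.newKeySet())
            .add(receiverId);
    }

    // Called once per event; returns how many connected sessions were notified.
    int dispatch(String publisherId, String payload) {
        int delivered = 0;
        for (String receiverId : allowedReceivers.getOrDefault(publisherId, Set.of())) {
            Consumer<String> session = sessions.get(receiverId);
            if (session != null) { // receiver is allowed but may be offline
                session.accept(payload);
                delivered++;
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        NotificationHub hub = new NotificationHub();
        List<String> bobInbox = new ArrayList<>();
        hub.connect("bob", bobInbox::add);
        hub.allow("alice", "bob");
        hub.dispatch("alice", "widget-1 moved to (10,20)");
        System.out.println(bobInbox);
    }
}
```

With several server nodes, each node would hold such a map only for its own connections, which is exactly the extra routing problem mentioned above.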

Related

Micro service Architecture based on RESTful API's in java

What is the best architecture for implementing a web service that takes requests from one side, saves and enhances them, and then calls another service with new parameters?
Is there any special design pattern for this?

There's not a lot to go on, but from what you've said it sounds like a job for "pipes and filters"!
To get a more precise answer, you might want to ask yourself some more detailed questions:
Do you need to do any validation or transformation of the incoming message?
Will you want to handle all requests the same way, or are there different types?
Are the external services likely to change, and if so, will they do this frequently?
What do you want to do if the final web service call fails (should you roll back the database record)?
How do you want to report failures/responses? Do you need to report these back?
Do you need a mechanism to track the progress of a particular request?
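As a rough illustration of what "pipes and filters" means here, the sketch below chains small single-purpose steps with java.util.function.Function. The Pipeline class and the validate and enrich filters are made-up names for illustration; a real filter would save the request or call the downstream service instead of editing a string.

```java
import java.util.List;
import java.util.function.Function;

final class Pipeline {
    // Chains the filters left to right: each filter's output feeds the next one.
    static <T> Function<T, T> of(List<Function<T, T>> filters) {
        Function<T, T> pipeline = Function.identity();
        for (Function<T, T> filter : filters) {
            pipeline = pipeline.andThen(filter);
        }
        return pipeline;
    }

    // Hypothetical filter: rejects empty requests and normalises whitespace.
    static String validate(String request) {
        if (request == null || request.trim().isEmpty()) {
            throw new IllegalArgumentException("empty request");
        }
        return request.trim();
    }

    // Hypothetical filter: adds a parameter before the downstream call.
    static String enrich(String request) {
        return request + ";source=gateway";
    }

    public static void main(String[] args) {
        List<Function<String, String>> filters = List.of(Pipeline::validate, Pipeline::enrich);
        Function<String, String> pipeline = Pipeline.of(filters);
        System.out.println(pipeline.apply("  orderId=42 "));
    }
}
```

Each answer to the questions above tends to become one more filter (validation, persistence, retry, reporting) rather than a change to the existing ones.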

Since you are looking for a design pattern, I think you might want to compare the pros and cons of using microservices orchestration vs choreography in the context of your project.

If you do not need to return an immediate response to the calling system, I would suggest using an event-driven approach, if that's feasible. So instead of REST services, you will have a message broker, and your services will subscribe to certain events. This hides your consumers behind the message broker, which makes your system less coupled.
This can be implemented via Spring Cloud Stream, where you will have a Source (a microservice producing events), a Processor (a microservice that makes intermediate transformations possible) and a Sink (a microservice that receives the final result for further processing).
Another option could be Camel. It has basically all the integration patterns built in, so it should not be a problem to implement the solution based on either REST APIs or events.

How do you design a drone delivery system from the software architecture point of view?

For my master's degree final project I decided to design a drone delivery system. The main purpose is to learn to design complex systems.
The basic use case is this:
User goes to merchant online shop, selects the products, selects the delivery method as "Drone delivery" and selects his delivery location.
Merchant website makes an API call to our drone delivery system (DDS) application to register the new delivery order. (The order will contain all the information that we need: parcel pick-up location, destination location...)
The DDS application, based on the drones' positions and on an algorithm, will calculate and mark which drone can deliver this order in the shortest time.
The selected drone will deliver the order when it is free.
So far so good. My questions are related to the software architecture of this system. I have some general questions and some specific questions.
General questions:
How do you design a system like this in order to be scalable? I mean: the system may be used by many merchants; if they hit my API at the same time with 100 orders, the system must be able to handle it.
What are some good design principles or patterns when designing a system like this?
Specific questions:
So far I have come up with this architecture:
System Components:
Java(Spring) application
REST web service
web interface for managing drones and parcels
business logic and algorithms for routing drones
producer/consumer for RabbitMQ
Mysql Server
RabbitMq
System flow:
Merchant hits REST API to register the order
The Java application saves the order to the MySQL database.
After saving the order to the database, a producer puts the order in a queue in RabbitMQ.
A consumer consumes the RabbitMQ order queue. It takes each order and calculates, based on an algorithm, the drone that offers the best delivery time. Each drone has a separate queue in RabbitMQ. After finding the best drone, the consumer inserts the order into that drone's queue in RabbitMQ. The consumer also queries the MySQL database during this process.
Whenever a drone is free, it will communicate with the system to ask for the next order. The system will look in the drone's RabbitMQ queue and take the next order from there.
My questions are related to the consumer and producer:
Is it OK for the consumer to have logic in it? In my example it will contain the algorithm that determines the best drone, and to do this it also needs to talk to MySQL to retrieve drone positions. Is this good practice? If not, how can I do it differently?
Is it best practice for the consumer to stay in the application? Right now the consumers are running on the same server as the web service, and their code is not separated from the web service code. I am thinking that in the future I may need to move the consumers to a separate server. How would you design the consumers so they can easily be separated from the application?
I think that the producer must stay in the application, I mean it is coupled with the web service app. Is that OK?
Sorry for the long post, and for my poor English.
Thank you very much :)

Yes, the consumer should have logic in it. This is a standard EIP routing pattern.
If you properly separate your business logic layers from your data access layers (your queue access is a data access layer), then it probably isn't a problem to have them all share a common project. You ultimately probably want to separate your business logic/domain model from the web service and the router/consumer, but those are much more deployment and packaging concerns.
As long as you keep your web service code out of your business logic (and vice versa) you will probably be OK; you will just have to deploy the whole thing multiple times and only expose the endpoints that are relevant for any given deployment. You might ultimately be happier, though, if you separate your layers via libraries, as that will actually enforce not mixing the concerns.
And yes, the producer must be deployed with the web service; just make sure that, as part of the data access layer, it lives in a separate package/class. It will make your testing much easier.
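One way to picture that separation is the sketch below, where the routing logic only sees two small interfaces and never touches SQL or the broker API. All names (DroneDispatch, DroneRepository, DroneQueues, Assigner, InMemoryQueues) are hypothetical, and "closest drone to the pick-up point" is only a placeholder for the real best-delivery-time algorithm. In production the interfaces would be backed by MySQL and RabbitMQ adapters; in tests, by in-memory ones as shown.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class DroneDispatch {
    static final class Drone {
        final String id; final double x; final double y;
        Drone(String id, double x, double y) { this.id = id; this.x = x; this.y = y; }
    }

    static final class Order {
        final String id; final double pickupX; final double pickupY;
        Order(String id, double pickupX, double pickupY) {
            this.id = id; this.pickupX = pickupX; this.pickupY = pickupY;
        }
    }

    // Data access: where drone positions come from (MySQL in the question).
    interface DroneRepository { List<Drone> findAll(); }

    // Data access: the per-drone queues (RabbitMQ in the question).
    interface DroneQueues { void enqueue(String droneId, Order order); }

    // Business logic only: it depends on the two interfaces above, nothing else.
    static final class Assigner {
        private final DroneRepository drones;
        private final DroneQueues queues;

        Assigner(DroneRepository drones, DroneQueues queues) {
            this.drones = drones;
            this.queues = queues;
        }

        // Placeholder rule (assumption): pick the drone closest to the pick-up point.
        String assign(Order order) {
            Drone best = null;
            double bestDistance = Double.MAX_VALUE;
            for (Drone d : drones.findAll()) {
                double distance = Math.hypot(d.x - order.pickupX, d.y - order.pickupY);
                if (distance < bestDistance) { best = d; bestDistance = distance; }
            }
            if (best == null) throw new IllegalStateException("no drones available");
            queues.enqueue(best.id, order);
            return best.id;
        }
    }

    // In-memory adapter, handy for unit tests of the Assigner.
    static final class InMemoryQueues implements DroneQueues {
        final Map<String, Queue<Order>> byDrone = new HashMap<>();

        @Override
        public void enqueue(String droneId, Order order) {
            byDrone.computeIfAbsent(droneId, k -> new ArrayDeque<>()).add(order);
        }
    }
}
```

Because the Assigner has no dependency on the web service, moving the consumer to a separate server later is mostly a packaging decision.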

When to compose interoperable message: in app or Mirth?

We're designing an architecture for communication between several applications, and we have decided to use Mirth as a (pseudo-)ESB. In our processes we want to give control back to users as soon as we can, so when an action is fired by a user (for example, pressing the Save button after filling in a form), some (necessary) changes are made in the database and then a message has to be sent to another system. The user doesn't have to wait until the message is sent, so our application gives back control once the database changes are done. Message composition is done asynchronously in the background. But we don't really know which approach we should follow:
a) Start a new thread in our app where we collect all the necessary data (starting from "primary data", that is, some primary keys that allow us to find all the information) to fill an HL7 message, and send it to a queue that Mirth is listening on.
b) Send the "primary data" to Mirth and delegate HL7 message composition to it. Mirth can access the database directly to collect the necessary data, or another option could be invoking some REST/SOAP services of our own.
In the case of option b, we have some doubts about how to invoke Mirth:
b.1) Our app makes the database modifications and writes the primary data to a queue (distributed transaction).
b.2) Our app makes the database modifications and calls a SOAP or REST service published by Mirth, all of which does is write a message to a queue that Mirth is also reading (no distributed transaction in our app).
Some argue that composing the message in our app and using Mirth only as a broker is "misusing" Mirth. On the other hand, some colleagues find that accessing the app database from Mirth is very intrusive and that Mirth should not know our schema. The last option, invoking an app service from Mirth that returns all the information needed for the HL7 message, is like sending the "primary data" from the app to Mirth only to get it back when Mirth calls the service (passing that data as a parameter).
Thank you for your advice.

I'm not sure if Mirth is the appropriate tool to use as an Enterprise Service Bus where your requirements include real time notifications/events to allow the user to proceed after submitting a form.
Without knowing more, such as the architecture in play, we can't really advise you.
IMO, as someone experienced with Mirth integration as well as with designing database-dependent applications, I would say that Mirth isn't the appropriate tool for the job.

(1) There is not enough information for "expert advice", and there is no single, clear, technically justified answer
(2) Option (a) looks like the least expensive and easiest to implement for the first version, especially if you reuse stable, tested libraries like HAPI
(3) In your design, treat your enterprise service bus as a black-box component and concentrate on designing the interfaces and clarifying the asynchronous message sequences. This way the service bus internals and the message routing and queuing decisions can be postponed until deployment time, at the cost of some coding effort and by following the adapter design pattern
(4) Arguments worded like "misusing", "intrusive", "like it", "nice" perhaps indicate a valid point of view, but as such they do not provide measurable, verifiable decision criteria or performance indicators and should not be used alone
(5) This is the right time to apply a decision-making process and evaluate the various options by weight. As a minimal formal input I'd recommend the Plus/Minus/Interesting technique
(6) In your decision the following points should not be omitted:
securing data privacy (health state is a private property protected by law in some countries)
fault tolerance (robustness, reliability, exception handling)
maintenance costs (do you have qualified people to maintain it, can the solution monitor and auto-correct itself, or will someone have to review millions of lines of logs manually)
development costs (do you have qualified people already, how many lines of code can you reuse vs. how many will you have to create/debug)
(7) I'm sorry that my answer is not directly helpful; my choice would be to compose the message in a reliable, secured application server, whatever that means in this case and regardless of how its axons or pseudopods would be connected
Last but not least: record why you made the choice, and keep that record forever, so that you can test and validate your assumptions at any later time, when the original decision makers are lost in the sands of time
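Points (2) and (3) can be sketched together as follows, under stated assumptions: the bus is hidden behind a one-method MessageBus interface (the adapter boundary), and the message is composed in the background after the database write, as in option (a). AsyncMessaging, MessageBus and SaveHandler are hypothetical names, and the composer here returns a placeholder string; it is not an HL7 message and no Mirth or HAPI API is used.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class AsyncMessaging {
    // The bus as a black box; a queue- or Mirth-facing adapter would implement this.
    interface MessageBus { void send(String message); }

    static final class SaveHandler {
        private final ExecutorService background;
        private final Function<Long, String> composer; // primary key -> message body
        private final MessageBus bus;

        SaveHandler(ExecutorService background, Function<Long, String> composer, MessageBus bus) {
            this.background = background;
            this.composer = composer;
            this.bus = bus;
        }

        // Returns immediately; composition and sending run on the background executor.
        CompletableFuture<Void> onSave(long recordId) {
            // The (omitted) database changes would be committed before this point.
            return CompletableFuture
                .supplyAsync(() -> composer.apply(recordId), background)
                .thenAccept(bus::send);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<String> sent = new CopyOnWriteArrayList<>();
        SaveHandler handler = new SaveHandler(pool, id -> "record-" + id, sent::add);
        handler.onSave(7L).join(); // join only so the demo can print the result
        pool.shutdown();
        System.out.println(sent);
    }
}
```

Swapping option (a) for option (b) would then mean changing the composer and the MessageBus adapter, not the code that handles the Save action.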

Java enterprise architecture for delegating tasks between applications

In my environment I need to schedule long-running tasks. I have application A, which just shows the client the list of currently running tasks and allows new ones to be scheduled. There is also application B, which does the actual hard work.
So app A needs to schedule a task in app B. The only thing they have in common is the database. The simplest thing to do seems to be adding a table with a list of tasks and having app B query that table every once in a while and execute newly scheduled tasks.
Yet, it doesn't seem to be the proper way of doing it. At first glance it seems that the tool for the job in an enterprise environment is a message queue. App A sends a message with task description to the queue, app B reads a message from the queue and executes the task. Is it possible in such case for app A to get the status of all the tasks scheduled (persistent queue?) without creating a table like the one mentioned above to which app B would write the status of completed tasks? Note also that there may be multiple instances of app A and each of them needs to know about all tasks of all instances.
The disadvantage of the 'table approach' is that I need to have DB polling.
The disadvantage of the 'message queue approach' is that I'm introducing a new communication channel into the infrastructure (yet another thing that can fail).
What do you think? Any other ideas?
Thank you in advance for any advice :)
========== UPDATE ==========
Eventually I decided on the following approach. There are two sides to this problem: one is communication between A and B, the other is getting information about the tasks.
For communication the right tool for the job is JMS. For getting data the right tool is the database.
So I'll have app A add a new row to the 'tasks' table describing a task (I can query this table later on to get the list of all tasks). Then A will send a message to B via JMS just to say 'you have work to do'. B will do the work and update the task status in the table.
Thank you for all responses!
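The approach from the update can be sketched roughly like this, with in-memory stand-ins: a ConcurrentHashMap plays the 'tasks' table and a BlockingQueue plays the JMS queue. TaskHandoff and its members are hypothetical names, and no JMS or JDBC API is used. The point it illustrates is the ordering: the row is written first, and the message carries only the task id.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

final class TaskHandoff {
    enum Status { SCHEDULED, DONE }

    // Stand-in for the shared 'tasks' table: task id -> status.
    final Map<Long, Status> tasksTable = new ConcurrentHashMap<>();
    // Stand-in for the JMS queue: carries only the id, meaning "you have work to do".
    final BlockingQueue<Long> wakeUps = new LinkedBlockingQueue<>();
    private final AtomicLong ids = new AtomicLong();

    // App A: record the task first, then notify B.
    long schedule() {
        long id = ids.incrementAndGet();
        tasksTable.put(id, Status.SCHEDULED);
        wakeUps.add(id);
        return id;
    }

    // App B: handle one notification if there is one; returns false when idle.
    // A real consumer would block on the queue instead of polling it.
    boolean workOnce() {
        Long id = wakeUps.poll();
        if (id == null) {
            return false;
        }
        // ... the long-running work would happen here ...
        tasksTable.put(id, Status.DONE);
        return true;
    }

    public static void main(String[] args) {
        TaskHandoff handoff = new TaskHandoff();
        long id = handoff.schedule();
        handoff.workOnce();
        System.out.println("task " + id + " is " + handoff.tasksTable.get(id));
    }
}
```

Any instance of app A can list all tasks by reading the table, regardless of which instance scheduled them.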

You need to think about your deployment environment, both as it is now and as it is likely to change in the future.
You're effectively looking at two problems, both of which can be solved in several ways, depending on how much infrastructure you are able to obtain and willing to introduce, but it's also important to "right size" your design for your problems.
Whilst you're correct to think about the use of both databases and messaging, you need to consider whether these items are overkill for your domain, and only you and others who know your domain can really answer that.
My advice would be to look at what is already in use in your area. If you already have database infrastructure that you can build on, then monitoring task activity and scheduling jobs in a database is not a bad idea. However, if you would have to run your own database, get new hardware, or don't have sufficient support resources, then introducing a database may not be a sensible option, and you could look at a simpler but potentially more fragile approach of having your processes write files to schedule jobs and report tasks.
At the same time, don't look at the introduction of a DB or JMS as inherently error-prone. Correctly implemented, they are stable and proven technologies that will make your system scalable and manageable.
As @kan says, exposing a web service interface is also a useful option.

Another option is to make B a service, e.g. expose its control and status interfaces as REST or SOAP interfaces. In this case A will just be a client application of B. B stores its state in the database. A is a stateless application that just communicates with B.
BTW, using Spring Remoting you could expose an interface and use any of JMS, REST, SOAP or RMI as a transport layer, which could be changed later if necessary.

You have messaging (JMS) in enterprise architecture. Use it; it is available in Java EE containers like GlassFish. Messages can be persisted so that they are delivered even if the server reboots while they are in the queue. And you do not even need to care how all this is implemented.

There can be a couple of approaches here. First, as @kan suggested, have app B expose a web service for the interactions. This will allow heterogeneous clients to communicate with app B. It seems a good approach. App B can internally use whatever persistent store it deems fit.
Alternatively, you can have app B expose a management interface via JMX and have applications like app A talk to app B through this management interface. Implementing task submission, retrieving statistics, etc. would be simpler. Additionally, you can leverage JMX notifications for real-time updates on task submissions, completions, etc. The downside is that this would be a Java-specific solution, so supporting heterogeneous clients would be a distant dream.

How do you store and replay JDBC statements?

Given a JDBC-based application that was not designed for real-time propagation of changes from one instance of the app running on computer A to another instance running on computer B, in a two-way synchronization scheme, how can you do this elegantly without using SymmetricDS?
We thought of using XMPP and XStream: transforming POJOs to XML or JSON and sending them via XMPP (Smack API) to a pre-configured "chat room" where other bots, listening, would replay the data they receive. Thus even offline client apps would receive the "DiscussionHistory" by sending their last "since" timestamp.
I have looked more or less everywhere for "near real-time database change propagation" in Java, or even in H2, where changes are propagated between all registered nodes, but the only solution I could think of is to use the XMPP protocol, build a "bot" chat room around it, and have nodes send their data there while others listen for changes.
The so-called "bots" are instances, on different computers, of an accounting application that should allow real-time collaboration on the same database but also allow offline modifications (so there is no centralized server to store changes).

One common approach is to build your caching so that the application always queries the database if a particular entry is not found. Then you would only have to synchronize cache-evictions to force all nodes in a group to re-load a certain entry. This is fairly easily achieved using, for instance, spring method caching and ehcache.
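A bare-bones version of that idea is sketched below, assuming all nodes can reach one shared store and can notify each other directly. EvictionSync and Node are hypothetical names, and plain maps stand in for the database and for the Spring/Ehcache machinery, so no real caching API is used. Reads go through the cache and fall back to the store on a miss; a write updates the store and tells every node in the group to drop its cached copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EvictionSync {
    // One application instance with a read-through cache over a shared store.
    static final class Node {
        private final Map<String, String> database; // shared store (assumption)
        private final List<Node> group;             // every node, including this one
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        Node(Map<String, String> database, List<Node> group) {
            this.database = database;
            this.group = group;
            group.add(this);
        }

        // Cache miss -> load from the shared store; absent keys are not cached.
        String read(String key) {
            return cache.computeIfAbsent(key, database::get);
        }

        // Write to the store, then broadcast an eviction instead of the new value.
        void write(String key, String value) {
            database.put(key, value);
            for (Node node : group) {
                node.cache.remove(key);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        List<Node> group = new ArrayList<>();
        Node a = new Node(db, group);
        Node b = new Node(db, group);
        a.write("invoice-1", "draft");
        System.out.println(b.read("invoice-1"));
        a.write("invoice-1", "posted");
        System.out.println(b.read("invoice-1"));
    }
}
```

Only keys travel between nodes, so the eviction messages stay small; note that this relies on a store every node can query, which the offline scenario in the question does not provide.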
