Microservices: Best Practice for combining data from Instances of SAME service? - java

Scenario:
We have two instances of the same microservice, which receive two events (Event1 and Event2 below) from Kafka. The instances need to combine the results of their own individual transformations with those of the other, and send only one notification downstream.
I am trying to understand the best way to do all this. Specifically, how to make:
each instance of the microservice wait for the other instance,
then combine the two individual transforms into one,
and then check whether the other instance has already combined and sent the notification; if so, skip it.
(The original question included a diagram showing Event1 and Event2 flowing from Kafka into the two service instances.)

Consider using the temporal.io open source project to implement this. You can code your logic as a simple stateful class that reacts to the events. The idea of Temporal is that the instance of that class is not linked to a specific instance of the service. Each object is unique and identified by a business ID. So all the cross-process coordination is handled by the Temporal runtime.
Here is sample code using the Temporal Java SDK; Go, TypeScript/JavaScript, PHP, and Python are also supported.
@WorkflowInterface
public interface CombinerWorkflow {

    @WorkflowMethod
    void combine();

    @SignalMethod
    void event1(Event1 event);

    @SignalMethod
    void event2(Event2 event);
}

// Define the workflow implementation, which implements the combine workflow method.
public static class CombinerWorkflowImpl implements CombinerWorkflow {

    private Event1 event1;
    private Event2 event2;
    private final Notifier notifier = Workflow.newActivityStub(Notifier.class);

    @Override
    public void combine() {
        // Durably blocks until both signals have been received.
        Workflow.await(() -> event1 != null && event2 != null);
        Event3 result = combine(event1, event2); // application-specific merge
        notifier.notify(result);
    }

    @Override
    public void event1(Event1 event) {
        this.event1 = event;
    }

    @Override
    public void event2(Event2 event) {
        this.event2 = event;
    }
}
This code looks too simple as it doesn't talk to persistence. But Temporal ensures that all the data and even threads are durably preserved as long as needed (potentially years). So any infrastructure and process failures will not stop its execution.

It's worth noting that you cannot guarantee all three of the following at once:
a notification that needs to be sent will be sent within some finite period of time (a reasonable working definition of availability in this context)
no notification will be sent more than once
correct behavior even when either instance fails or there are arbitrary network delays
Fundamentally, each instance will need either to tell the other that it is claiming responsibility for sending the notification, or to ask the other whether it has already claimed that responsibility. If it tells without waiting for an acknowledgement, you cannot guarantee "not more than once". If it tells and waits for an acknowledgement, you cannot guarantee "sent within a finite period". If it asks, it likewise has to decide whether or not to send when no reply arrives.
You could have the instances use some external referee: this only punts the CAP tradeoff to that referee. If you choose a CP referee, you will be giving up on guaranteeing a notification will be sent. If you choose AP, you will be giving up on guaranteeing that no notification gets sent more than once.
You'll have to choose which of those three guarantees you want to weaken; deciding how to weaken it will guide your design.

There are multiple ways to approach this kind of data-synchronization problem, but since you are already using Kafka, you should use the functionality it offers out of the box.
Option 1 (Preferable)
Kafka guarantees to maintain the order of events within the same partition. Therefore, if your producer sends these events to the same partition, they will be received by the same consumer within any given consumer group (in your case, the same instance; or, if you are using threads as consumers, the same thread of the same consumer). With this, you wouldn't need to worry about syncing events across multiple consumers.
If you are using Spring Boot, this can easily be achieved by providing a message key in the KafkaTemplate.
More on this topic: How to maintain message ordering and no message duplication
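To make the idea concrete, here is a self-contained sketch of key-based partitioning. Note that this is not Kafka's actual partitioner (the real client hashes the serialized key with murmur2); the hash below is simplified purely to show that the mapping from key to partition is deterministic, so both events carrying the same business key land on the same partition.

```java
import java.nio.charset.StandardCharsets;

public class PartitionSketch {

    // Simplified stand-in for Kafka's default partitioner, which hashes the
    // serialized key (murmur2 in the real client) modulo the partition count.
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = 0;
        for (byte b : bytes) {
            hash = 31 * hash + b; // illustrative hash, not murmur2
        }
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // Both events carrying the same business key map to the same
        // partition, hence the same consumer within a consumer group.
        int p1 = partitionFor("order-42", 12);
        int p2 = partitionFor("order-42", 12);
        System.out.println(p1 == p2); // prints "true"
    }
}
```

With Spring's KafkaTemplate, the equivalent is simply to pass the business ID as the message key, e.g. kafkaTemplate.send("events", businessId, payload) (topic name here is hypothetical).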
Option 2
Now, if you don't have control over the producer, you will need to handle this on the application side. For that you will need distributed caching support, e.g. Redis. Simply maintain a boolean state (completed: true or false) for these events, and only when all related events have been received, run the downstream logic.
NOTE: Assuming you are using a persistence layer, combining and transforming the events should be trivial. If you are not using persistence, you would need an in-memory cache for Option 1; Option 2 is unaffected, because you already have the distributed cache (and can store whatever you want in it).
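As a sketch of Option 2's check-and-claim logic (the class and method names here are hypothetical, and a ConcurrentHashMap stands in for the Redis cache): each instance records its own event, checks whether both events have arrived, and uses an atomic put-if-absent (the analogue of Redis SETNX) so that only one instance sends the notification.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class EventCombiner {

    // Stand-in for the distributed cache; with Redis you would keep the
    // per-event payloads under keys and use SETNX for the one-shot flag.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private int notificationsSent = 0;

    void onEvent(String correlationId, String eventName, String payload) {
        cache.put(correlationId + ":" + eventName, payload);
        String e1 = cache.get(correlationId + ":event1");
        String e2 = cache.get(correlationId + ":event2");
        if (e1 != null && e2 != null) {
            // putIfAbsent mirrors Redis SETNX: only the first instance to
            // claim the flag proceeds, so the notification is sent once.
            if (cache.putIfAbsent(correlationId + ":sent", "true") == null) {
                sendNotification(e1 + "+" + e2);
            }
        }
    }

    void sendNotification(String combined) { notificationsSent++; }

    int sentCount() { return notificationsSent; }

    public static void main(String[] args) {
        EventCombiner combiner = new EventCombiner();
        combiner.onEvent("trade-1", "event1", "t1");
        combiner.onEvent("trade-1", "event2", "t2");
        combiner.onEvent("trade-1", "event2", "t2"); // redelivery: no second send
        System.out.println(combiner.sentCount()); // prints "1"
    }
}
```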

Related

Is Session.sendToTarget() thread-safe?

I am trying to integrate QFJ into a single-threaded application. At first I was trying to utilize QFJ with my own TCP layer, but I haven't been able to work that out. Now I am just trying to integrate an initiator. Based on my research into QFJ, I would think the overall design should be as follows:
The application will no longer be single-threaded, since the QFJ initiator will create threads, so some synchronization is needed.
Here I am using a SocketInitiator (I only handle a single FIX session), but I would expect a similar setup should I go for the threaded version later on.
There are 2 aspects to the integration of the initiator into my application:
Receiving side (fromApp callback): I believe this is straightforward, I simply push messages to a thread-safe queue consumed by my MainProcessThread.
Sending side: I'm struggling to find documentation on this front. How should I handle synchronization? Is it safe to call Session.sendToTarget() from the MainProcessThread? Or is there some synchronization I need to put in place?
As Michael already said, it is perfectly safe to call Session.sendToTarget() from multiple threads, even concurrently. But as far as I can see, you only utilize one thread for sending anyway (the MainProcessThread).
The relevant part of the Session class is in method sendRaw():
private boolean sendRaw(Message message, int num) {
    // sequence number must be locked until application
    // callback returns since it may be effectively rolled
    // back if the callback fails.
    state.lockSenderMsgSeqNum();
    try {
        // ... some logic here
    } finally {
        state.unlockSenderMsgSeqNum();
    }
    // ...
}
Other points:
Here I am using a SocketInitiator (I only handle a single FIX session), but I would expect a similar setup should I go for the threaded version later on.
Will you always use only one Session? If yes, then there is no point in using the ThreadedSocketInitiator, since all it does is create a thread per Session.
The application will no longer be single threaded, since the QFJ initiator will create threads
As already stated in Use own TCP layer implementation with QuickFIX/J, you could try passing an ExecutorFactory. But this might not be applicable to your specific use case.

A multi-agent system that uses Producer-Consumer pattern?

I am trying to implement a Producer-Consumer pattern that uses multi-agents as workers instead of multi-threads.
As I understand, a typical multi-threaded implementation uses a BlockingQueue where one Producer thread puts information on the Queue and have multiple Consumer threads pull the data and execute some processing functions.
So, following the same logic, my design will use a Producer agent that generates data and sends it to multiple Consumer agents. At first guess, I thought I should use a shared BlockingQueue between the Consumer agents and have the agents access the queue to retrieve the data. But I don't know how easy that would be, because I don't think agents have any shared memory, and it is much simpler to send the information directly to the Consumer agents as ACL messages.
This is important to consider because my multi-agent design will process a lot of data. So my question is: in JADE, what happens if I send too many ACL messages to a single agent? Will the agent ignore the other messages?
This post has an answer that suggests "..Within the JADE framework, Agents feature an 'Inbox' for ACLMessages, basically a BlockingQueue Object that contains a list of recieved messages. the agent is able to observe its own list and treat them as its lifecycle proceeds. Containers do not feature this ability...". Is that statement correct? If this is true, then the other messages are just waiting on the queue and it will be ideal for my design to send information directly to the Consumer Agents, but I didn't see any BlockingQueues on the ACLMessage class.
Yes, the messages will wait in the queue, and the agent will not ignore them.
ACLMessage is just the message object that is sent between agents. Each agent has its own message queue (jade.core.MessageQueue) and several methods for handling communication.
If you check Agent class documentation, you can find methods like
receive() - non-blocking receive; returns the first message in the queue, or null if the queue is empty
receive(MessageTemplate pattern) - behaves like the previous one, but you can also specify a pattern for the message, for example a specific sender AID, a conversation ID, or combinations of these.
blockingReceive() - blocking receive, blocks agent until message appears in queue
blockingReceive(MessageTemplate pattern) - blocking receive, with pattern
There are also blocking-receive variants where you can set a timeout.
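As a toy illustration of that queueing behaviour (this is not the JADE API; a LinkedBlockingQueue stands in for jade.core.MessageQueue, and the class and method names are made up): messages simply accumulate until the agent consumes them, and a non-blocking receive returns null on an empty queue, just like Agent.receive().

```java
import java.util.concurrent.LinkedBlockingQueue;

public class MailboxDemo {

    // Toy stand-in for jade.core.MessageQueue: messages accumulate until the
    // agent consumes them; nothing is dropped or ignored.
    private final LinkedBlockingQueue<String> inbox = new LinkedBlockingQueue<>();

    // A sender posting an ACL message to this agent's mailbox.
    void post(String msg) { inbox.offer(msg); }

    // Mirrors Agent.receive(): non-blocking, null when the queue is empty.
    String receive() { return inbox.poll(); }

    public static void main(String[] args) {
        MailboxDemo agent = new MailboxDemo();
        for (int i = 0; i < 5; i++) {
            agent.post("msg-" + i);
        }
        // All five messages are still queued, in arrival order.
        System.out.println(agent.receive()); // prints "msg-0"
        System.out.println(agent.receive()); // prints "msg-1"
    }
}
```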
It's also important to mention that if you define your agent logic in a Behaviour class, you can block just that behaviour instead of blocking the entire agent:
ACLMessage msg = agent.receive();
if (msg != null) {
// your logic
} else {
block();
}
The difference is that the block() method inside a behaviour merely marks the behaviour as blocked and removes it from the agent's active-behaviour pool (it is added back to the active pool when a message is received or the behaviour is restarted via restart()), allowing the agent's other behaviours to execute. blockingReceive(), by contrast, blocks the entire agent and all of its behaviours until a message is received.

Axon: Eventsourced Aggregate none state changing event

I have a use case where I would like to publish a non-state-changing event as a trigger.
In the vast majority of cases, the Aggregates will publish events by applying them. However, occasionally, it is necessary to publish an event (possibly from within another component), directly to the Event Bus. To publish an event, simply wrap the payload describing the event in an EventMessage. The GenericEventMessage.asEventMessage(Object) method allows you to wrap any object into an EventMessage ...
The event is published from inside a Saga.
When I use asEventMessage and look at the events table, I'm a little confused. The event has an aggregate identifier that does not exist in the rest of the system, and the type entry is null (reading the docs, for a moment it sounds like asEventMessage should behave just like applying events from within aggregates).
Since I consider the event conceptually part of the aggregate, it should refer to it, right?
So I craft a GenericDomainEventMessage myself and set its aggregate identifier, sequence number, and type manually:
@SagaEventHandler
public void on(AnotherEvent event, @SequenceNumber long sequenceNr) {
    // ...
    GenericDomainEventMessage myEvent = new GenericDomainEventMessage(
            MyAggregate.class.getSimpleName(),
            identifier.toString(),
            sequenceNr + 1,
            payload);
    eventStore.publish(myEvent);
}
This event does not introduce (data) state change to its underlying aggregate. I treat it as a flag/trigger that has a strong meaning in the domain.
I could also publish the event from within the aggregate, in a command handler but some of the operations that need to be performed are outside of the aggregate's scope. That's why a Saga seems to be more suitable.
So my questions are:
Is publishing a GenericDomainEventMessage equal to the behavior of AggregateLifecycle#apply?
Should there be a no-op handler in the aggregate or will axon handle this correctly?
In Axon 3, publishing the event to the EventBus is the same as apply()ing it from within the aggregate, with one difference: apply() will also invoke any available handlers. If no handler is available, there is no difference.
EventBus.publish is meant to publish Events that should not be directly tied to the Aggregate. In the Event Store table, it does get an Aggregate identifier (equal to the message identifier), but that's a technical detail.
In your case, the recommended solution would be to apply() the Event. The fact that the event doesn't trigger a state change doesn't matter at that point. You're not obliged to define an #EventSourcingHandler for it.

How can I make multiple Jframes consume data from the same Thread in Java?

I have a program that must output the data of a weighing scale. It uses a thread to continually read data from an RS-232 source, and it must output the data graphically. The user can open and close as many JFrames as they wish, and all must show the same data read from the RS-232 source in a JTextArea. How can I approach this?
Thank you very much in advance.
There are a number of ways you might approach this problem
The user can open and close as many Jframes as it wishes and all must show the same data that is read from the rs232
This raises the question of whether you're only interested in the real-time results or the historical results. For argument's sake, I'm only going to focus on the real-time results.
Basically you need to start with a class which is responsible for actually reading the data from the port. This class should do only two things:
Read the data
Generate events when new data is read
Why? Because then any additional functionality you want to implement (like writing the data to a database or caching the results for some reason) can be added later, simply by monitoring the events that are generated.
Next, you need to define an interface which describes the contract that observers will implement in order to be able to receive events:
public interface ScaleDataSourceListener {
public void scaleDataUpdated(ScaleDataSourceEvent evt);
}
You could also add connection events (connect/disconnect) or other events which might be important, but I've kept it simple.
The ScaleDataSourceEvent would be a simple interface which describes the data of the event:
public interface ScaleDataSourceEvent {
public ScaleDataSource getSource();
public double data();
}
for example (I like interfaces: they describe the expected contract, define the responsibility, and limit what other people can do when they receive an instance of an object implementing the interface; but that's me)
Your data source would then allow observers to register themselves to be notified about events generated by it...
public interface ScaleDataSource ... {
//...
public void addDataSourceListener(ScaleDataSourceListener listener);
public void removeDataSourceListener(ScaleDataSourceListener listener);
}
(I'm assuming the data source will be able to do other stuff, but I've left that up to you to fill in, again, I prefer interfaces where possible, that's not a design restriction on your part ;))
So, when data is read from the port, it would generate a new event and notify all the registered listeners.
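A minimal sketch of that wiring, with the event flattened to a plain double for brevity (the listener and data-source names follow the interfaces above, but onDataRead is a hypothetical hook for the RS-232 reader thread):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ObserverSketch {

    interface ScaleDataSourceListener {
        void scaleDataUpdated(double data);
    }

    // Simplified data source: CopyOnWriteArrayList lets the reader thread
    // iterate listeners safely while other threads add or remove them.
    static class ScaleDataSource {
        private final List<ScaleDataSourceListener> listeners = new CopyOnWriteArrayList<>();

        void addDataSourceListener(ScaleDataSourceListener l) { listeners.add(l); }
        void removeDataSourceListener(ScaleDataSourceListener l) { listeners.remove(l); }

        // Called by the RS-232 reader thread whenever a value arrives.
        void onDataRead(double value) {
            for (ScaleDataSourceListener l : listeners) {
                l.scaleDataUpdated(value);
            }
        }
    }

    public static void main(String[] args) {
        ScaleDataSource source = new ScaleDataSource();
        // Each frame would register itself; plain listeners stand in here.
        source.addDataSourceListener(v -> System.out.println("frame A: " + v));
        source.addDataSourceListener(v -> System.out.println("frame B: " + v));
        source.onDataRead(12.5); // both listeners see the same reading
    }
}
```

Each open frame would register such a listener in its constructor and remove it when the frame is disposed; inside the listener, the JTextArea update itself still has to be handed off to the EDT.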
Now, Swing is not thread safe. What this means is, you shouldn't make updates to the UI from any thread other than the Event Dispatching Thread.
In your case, probably the simplest solution would be to use SwingUtilities.invokeLater to move from the data source's thread context onto the EDT.
Basically, this is a simple Observer Pattern
There are a lot of other considerations you need to think about as well. For example, are the frames being opened within the same process as the data source, or does the data source run in its own, separate process? The latter complicates things, as you'll need some kind of IPC mechanism, maybe using sockets, but the overriding design is the same.
What happens if the data source is reading data faster than you can generate events? You might need some kind of queue, where the data source simply dumps the data to the queue and some kind of dispatcher (on another thread) reads it and dispatches events.
There are a number of implementations of blocking queues which provide a level of thread safety; have a look through the concurrency APIs for more details.
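A sketch of that reader/dispatcher decoupling, assuming a bounded ArrayBlockingQueue (all names here are illustrative): the reader thread only enqueues, while a separate dispatcher thread drains the queue and would fire the listener events.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueDispatchSketch {

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: the reader thread dumps raw values here and moves on;
        // a separate dispatcher thread drains it and fires listener events.
        BlockingQueue<Double> queue = new ArrayBlockingQueue<>(1024);

        Thread dispatcher = new Thread(() -> {
            try {
                while (true) {
                    double value = queue.take(); // blocks until data arrives
                    System.out.println("dispatching " + value);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // shut down cleanly
            }
        });
        dispatcher.setDaemon(true);
        dispatcher.start();

        // Simulated reader: offer() drops data if the queue is full, which is
        // one possible policy; put() would instead block the reader.
        for (int i = 0; i < 3; i++) {
            queue.offer(i + 0.5);
        }
        Thread.sleep(200); // give the dispatcher time to drain (demo only)
    }
}
```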
... as some ideas ;)
First, create a frame class that extends JFrame, and give it a method to receive data from the RS-232 source. Then every object of this class can get the data using that method.
You can create one frame by creating one object of the class.

Shared Monitor Objects

Good day!
There is a big web application with lots of threads processing data back and forth. One part of it is the service that processes trades (TradeProcessingService). When a trade is received, it is validated and sent on for further processing by other services. Thus the TradeProcessingService is the entry point of this web-app component.
Each trade is connected with exactly one exchange. Since all the processing is based on the trade's exchange, the processing must be performed in parallel for different exchanges.
Alongside the functionality described above, there is a scheduling service (ExchangeDataUpdaterService) that updates exchange data (one exchange at a time) every 10 seconds. Since this data is used for trade processing, the processing and updating operations must be synchronized.
So it is necessary not only to synchronize each processing thread (across the whole chain of service method calls) per exchange, but also to synchronize those threads with the updating threads (again, per exchange).
I am not experienced with such tasks. It seems that there should be some shared monitor objects (say, one per each exchange) to use in processing and updating threads...
Could you, please, suggest some best practices for dealing with the above scenario?
First of all, thanks OldCurmudgeon, Martin James and Peter Lawrey for the suggested solutions.
I'll describe the approach I've implemented (just an overview without any disaster recovery or resource releasing mechanisms).
As I wrote before, the main idea is to make a parallel processing for each exchange.
Firstly, I created a map with exchanges as keys and ExchangeTaskProcessorEnv objects as values. Each ExchangeTaskProcessorEnv contains a BlockingQueue (of ExchangeAwareTask objects) and an ExchangeAwareTaskProcessor (a Runnable implementation that calls the 'invoke' method of each ExchangeAwareTask from the queue and performs some additional processing). This map lives in a singleton class, ExchangeTaskHolder.
Secondly, there is a number of operations that could be performed for different business cases: update an exchange's info, update a trade's info, synchronize data with an FTP server... For each of these cases I've created a task that extends the ExchangeAwareTask class and overrides the 'invoke' method (each task knows how it must be processed). It is worth mentioning that each task contains a reference to a corresponding exchange.
Thirdly, I introduced a static factory to be able to create a required task for a required exchange.
Lastly, when a user or a scheduling mechanism needs to perform some action, it creates the required task via the factory and adds it to the ExchangeTaskHolder, which assigns it to the corresponding exchange's processor.
After a while I realized that there are a number of special cases where an action does not correspond to any exchange, or corresponds to all exchanges. No problem: in the first case an additional task-map bucket can be added (the exchange-aware processing mechanism is not affected), while in the second case a special method of the ExchangeTaskHolder creates the required task for each ExchangeAwareTaskProcessor (i.e., for each exchange).
Some time later another requirement came: each task must have a certain weight. No problem, I just replaced the ExchangeTaskHolder's BlockingQueue with a PriorityBlockingQueue.
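The per-exchange serialization described above can be sketched with one single-threaded executor per exchange standing in for each queue-plus-processor pair (class and method names here are illustrative, not the actual ExchangeTaskHolder code): tasks for the same exchange run in submission order, while different exchanges run in parallel.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ExchangeTaskSketch {

    // One single-threaded executor per exchange: tasks for the same exchange
    // run sequentially (no races on that exchange's data), while tasks for
    // different exchanges proceed in parallel.
    private final ConcurrentMap<String, ExecutorService> processors = new ConcurrentHashMap<>();

    void submit(String exchange, Runnable task) {
        processors
                .computeIfAbsent(exchange, ex -> Executors.newSingleThreadExecutor())
                .submit(task);
    }

    // Shuts down all per-exchange processors and waits for running tasks.
    void shutdown() {
        processors.values().forEach(ExecutorService::shutdown);
        for (ExecutorService processor : processors.values()) {
            try {
                processor.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        ExchangeTaskSketch holder = new ExchangeTaskSketch();
        // The data update and the trade task for NYSE are serialized with
        // each other; the LSE task may run concurrently with either of them.
        holder.submit("NYSE", () -> System.out.println("update NYSE data"));
        holder.submit("NYSE", () -> System.out.println("process NYSE trade"));
        holder.submit("LSE", () -> System.out.println("process LSE trade"));
        holder.shutdown();
    }
}
```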
