I'm writing an exercise application using DDD principles in Java with Spring Boot and MongoDB. According to DDD, communication between aggregates occurs only through messaging. At this point I'm not distributing the application; all aggregates reside in the same application, so I'm simply using Spring's messaging capabilities to exchange messages.
Each aggregate corresponds to exactly one Mongo document. Each command, or operation triggered by an event, is guarded by a @Transactional annotation to ensure that the DB transaction and the event are processed atomically.
I was wondering where I should store the events. Can I store them within the Mongo document? Actually, since Mongo transactions span single documents, isn't this the only option?
The next step is to set up a periodic task that will read all recent events and publish them, to simulate out-of-process communication. At that point, a good place would probably be a separate collection storing the events.
P.S. I'm not taking event sourcing into consideration for the moment, as it seems to be more advanced.
Thank you!
I was wondering where I should store the events?
The usual line of thinking is that distributed transactions suck; therefore if you can possibly manage it you want to store the events with the aggregate state. In the RDBMS world your events live in a table that is in the same database schema as your aggregate data -- see Udi Dahan, 2014.
If it helps, you can think of this "outbox" of event messages as being another entity within your aggregate.
After this save is successful, you then work on the problem of copying the information to the different places it needs to be, paying careful attention to the failure modes. It's purely a "plumbing" problem at this point, which is to say that it is usually treated as an application and infrastructure concern, rather than as a domain concern.
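If it helps to see it concretely, here is a minimal sketch of that idea with Spring Data MongoDB -- the Order and PendingEvent names are made up for illustration, not taken from your model:

```java
// Hypothetical aggregate document with an embedded "outbox" of pending domain events.
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

@Document("orders")
public class Order {

    @Id
    private String id;
    private String status;

    // Pending events live inside the same document, so updating the aggregate state
    // and recording its events is a single single-document write.
    private List<PendingEvent> outbox = new ArrayList<>();

    public void confirm() {
        this.status = "CONFIRMED";
        this.outbox.add(new PendingEvent("OrderConfirmed", Instant.now()));
    }

    public List<PendingEvent> getOutbox() {
        return outbox;
    }

    public static class PendingEvent {
        String type;
        Instant occurredAt;
        boolean published;        // flipped by the "plumbing" once the event has been relayed

        public PendingEvent(String type, Instant occurredAt) {
            this.type = type;
            this.occurredAt = occurredAt;
            this.published = false;
        }
    }
}
```

The periodic publisher then only has to scan for documents with unpublished outbox entries and mark them once relayed.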
To understand whether Spring events fit the task I'm working on, I need to understand how they work. Where are they stored?
As far as I can guess, they are kept in the Spring application context and disappear if the application crashes. Is my guess correct?
Spring events are intended to be used when calling methods directly would create too much coupling. If you need to track events for auditing or replay purposes, you have to save the events yourself. There are many ways to achieve this, depending on the topology and purpose of the application (list not complete):
Model entities that represent the events and store them in a repository (a sketch of this option follows below)
Incorporate a message broker such as Kafka that supports message persistence
Install an in-memory cache such as Hazelcast
Use a cloud message service such as AWS SQS
Lastly, please make sure that you carefully evaluate which option suits your needs best. Options 2 to 4 all introduce heavy complexity, and distributed applications can bring sorrow and misery to your life. Go for the simplest option if you can and only resort to the other options if absolutely necessary.
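A minimal sketch of option 1, assuming Spring Data MongoDB (any Spring Data store works the same way); the EventRecord, EventRecordRepository and UserRegisteredEvent names are made up for illustration:

```java
import org.springframework.context.event.EventListener;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.stereotype.Component;

import java.time.Instant;

class UserRegisteredEvent {                 // any plain object can be published as a Spring event
    final String userId;
    UserRegisteredEvent(String userId) { this.userId = userId; }
}

class EventRecord {                         // the persisted representation of an event
    @Id
    String id;
    String type;
    String payload;
    Instant occurredAt;

    EventRecord(String type, String payload, Instant occurredAt) {
        this.type = type;
        this.payload = payload;
        this.occurredAt = occurredAt;
    }
}

interface EventRecordRepository extends MongoRepository<EventRecord, String> {}

@Component
class EventPersister {

    private final EventRecordRepository repository;

    EventPersister(EventRecordRepository repository) {
        this.repository = repository;
    }

    @EventListener
    public void on(UserRegisteredEvent event) {
        // Saving the event means it survives a crash and can be audited or replayed later.
        repository.save(new EventRecord("UserRegistered", event.userId, Instant.now()));
    }
}
```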
I have 3 microservices, for example A, B, and C. Service A does some task and updates its database accordingly. The same goes for the other two services.
Suppose service C could not insert into its database because of some error, but services A and B updated their databases accordingly, and this has led to inconsistencies in the data.
How should I correctly handle this scenario if:
I have one common database for all the services?
Separate databases associated with each service?
Thank you for your answers!
For separate databases, you might want to Google the saga architecture pattern. It helps you manage transactions across different microservices, each with its own database. It would take a lot of space to describe it here, so the best advice I can give is to refer you to this article: SAGA Pattern for database per service architecture.
First up, in a microservices architecture you should pursue separate databases, or at the very least separate schemas. Sharing a database across microservices, as pointed out in the comments, would be a microservice anti-pattern.
You can consider a couple of approaches here:
Each microservice updates its own database and informs the others that an update has taken place. This enables each microservice to align its own database (eventual consistency).
A better approach, if you need coordination, is to create a fourth coordinating microservice whose job is to orchestrate the other three. Research the saga pattern. This is especially useful if you need transactional coordination (i.e. all services must update their databases or none of them). If you think you need transactional coordination, think again very carefully - in many (most?) situations eventual consistency is good enough. If you really need transactional behaviour then you should research the saga and routing slip patterns, which include compensation in the event of a failure (a minimal orchestration sketch follows below).
If you need a unified view of the three separate databases, consider another microservice whose job is to create the view (projection) that you need. Let each microservice do the one thing it is good at and that only; if you start to mix concerns in your microservices - well, again, that would be an anti-pattern.
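To make the saga idea more concrete, here is a hand-rolled, in-memory orchestration sketch with compensation. Real projects would usually persist the saga state and use a framework; the Step interface here is purely illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class OrderSagaOrchestrator {

    // One step per participating microservice, each with a compensating action.
    interface Step {
        void execute();
        void compensate();
    }

    private final List<Step> steps;

    public OrderSagaOrchestrator(List<Step> steps) {
        this.steps = steps;
    }

    public void run() {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                completed.push(step);
            } catch (RuntimeException e) {
                // One step failed: undo the steps that already succeeded, in reverse order.
                while (!completed.isEmpty()) {
                    completed.pop().compensate();
                }
                throw e;
            }
        }
    }
}
```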
A good method of enabling microservice communication is to use a message bus such as RabbitMQ or Azure Service Bus, but there are many other options, including the messaging support available in Spring Boot.
Given your questions, I would spend some more time researching microservice architectures and the right tools for your project before embarking on a microservices project. A lot of work has been done to ease the complexity of microservices, and you would be wise to research the most suitable tool set for you. It will still add quite a lot of complexity at first, but if done right it can pay dividends as the project grows.
You can use RabbitMQ for message exchange among microservices. RabbitMQ will hold all the information about the database update, so even if a microservice dies before updating its database, when it comes back up it can look into RabbitMQ and know what it missed. It can then apply the database update after recovering from the failure.
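As a rough illustration with Spring AMQP (the queue name and repository interface are made up), the listener below only acknowledges a message after the database update succeeds, so failed updates are redelivered when the service recovers:

```java
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Component;

@Component
public class InventoryUpdateListener {

    private final InventoryRepository repository;

    public InventoryUpdateListener(InventoryRepository repository) {
        this.repository = repository;
    }

    @RabbitListener(queues = "inventory-updates")
    public void onMessage(String updateJson) {
        // If this throws, the message is not acknowledged and will be redelivered,
        // so the database update is retried after the service recovers.
        repository.applyUpdate(updateJson);
    }

    public interface InventoryRepository {
        void applyUpdate(String updateJson);
    }
}
```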
With a CQRS architecture, in write-intensive real-time applications like trading systems, the traditional approach of loading aggregates from a database plus a distributed cache and distributed lock does not perform well.
The actor model (Akka) fits well here, but I am looking for an alternative solution. What I have in mind is to use Kafka for sending commands and make use of topic partitioning to ensure that commands for the same aggregate always arrive on the same node (a minimal producer sketch follows the list below). Then use a database plus a local cache and a local pessimistic lock to load aggregate roots and handle commands. This brings 3 main benefits:
aggregates are distributed across multiple nodes
no network traffic for looking up a central cache or acquiring distributed locks
no serialization & deserialization when saving and loading aggregates
One problem with this approach is that when consumer groups rebalance, the local cache may hold stale aggregate state; setting short cache timeouts should work most of the time.
Has anyone used this approach in real projects?
Is it a good design and what are the down sides?
Please share your thoughts and experiences. Thank you.
IMHO Kafka will do the job for you. You need to ensure that the network is fast enough.
In our project we react in soft real time to customer needs and purchases, and we send information over Kafka to different services, which perform the business logic. This works well.
Confirmations at the network level are handled well within the Kafka broker.
For example, when one of the broker nodes crashes, we do not lose messages.
Another matter is if you need very strong transactional confirmation for all actions; then you need to be careful in your design - perhaps you need more topics to send the information and all the required logical confirmations.
If you need to implement some more logic, like a confirmation when the message has been processed by another external service, you will probably also need to disable auto-commits.
I do not know if this is a complete answer to your question.
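A rough sketch of what that looks like with the plain Kafka consumer API (topic and group names are made up): auto-commit is disabled and the offset is committed only after the batch has been processed.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-processing");
        props.put("enable.auto.commit", "false"); // take control of offset commits
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record.value()); // business logic / wait for external confirmation
                }
                // Commit only after everything in this batch was processed successfully.
                consumer.commitSync();
            }
        }
    }

    private static void process(String message) { /* ... */ }
}
```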
I'm developing an MVC spring web app, and I would like to store the actions of my users (what they click on, etc.) in a database for offline analysis. Let's say an action is a tuple (long userId, long actionId, Date timestamp). I'm not specifically interested in the actions of my users, but I take this as an example.
I expect a lot of actions by a lot of (different) users per minute (even per second). Hence the processing time is crucial.
In my current implementation, I've defined a datasource with a connection pool to store the actions in a database. I call a service from the request method of a controller, and this service calls a DAO which saves the action into the database.
This implementation is not efficient because it waits for the call from the controller all the way down to the database to complete before returning the response to the user. Therefore I was thinking of wrapping this "action saving" in a thread, so that the response to the user is faster. The thread does not need to be finished for the response to be returned.
I've no experience in these massive, concurrent and time-critical applications. So any feedback/comments would be very helpful.
Now my questions are:
How would you design such a system?
Would you implement a service and then wrap it in a thread invoked on every action?
What should I use?
I checked Spring Batch and its JobLauncher, but I'm not sure if it is the right thing for me.
What happens when there are concurrent accesses at the controller, service, DAO, and datasource levels?
In more general terms, what are the best practices for designing such applications?
Thank you for your help!
Take a singleton object at the application level and update it with every user action.
This singleton should hold a HashMap that gets flushed periodically, say after it reaches a threshold of 10,000 entries, and saved to the DB as a Spring Batch job.
Also, periodically clean it up to the last record processed; you could also re-initialize the singleton instance weekly or monthly. Remember, this can become an issue if your application is deployed across multiple JVMs, since each JVM will update its own copy. You should also guard the singleton against cloning (e.g. by throwing CloneNotSupportedException).
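A rough sketch of that buffer (the ActionStore interface is made up, and remember that each JVM gets its own instance):

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

public final class ActionBuffer {

    public interface ActionStore {
        void saveAll(List<UserAction> actions);   // e.g. delegates to a Spring Batch job or a DAO
    }

    public static final class UserAction {
        final long userId;
        final long actionId;
        final Date timestamp;

        public UserAction(long userId, long actionId, Date timestamp) {
            this.userId = userId;
            this.actionId = actionId;
            this.timestamp = timestamp;
        }
    }

    private static final ActionBuffer INSTANCE = new ActionBuffer();
    private static final int THRESHOLD = 10_000;

    private final List<UserAction> pending = new ArrayList<>();

    private ActionBuffer() {}

    public static ActionBuffer getInstance() {
        return INSTANCE;
    }

    public synchronized void record(UserAction action, ActionStore store) {
        pending.add(action);
        if (pending.size() >= THRESHOLD) {
            store.saveAll(new ArrayList<>(pending));  // flush as one batch write
            pending.clear();
        }
    }
}
```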
Here's what I did for that:
Used AspectJ to mark all the user actions I wanted to collect.
Then I sent this to log4j with an asynchronous DB appender...
This lets you turn it on or off with log4j logging level.
Works perfectly.
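For reference, a minimal version of that aspect using Spring AOP/AspectJ annotations and SLF4J as the logging facade; the pointcut expression and package name are assumptions, and the asynchronous DB appender itself is configured in the logging framework:

```java
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class UserActionLoggingAspect {

    // A dedicated logger name makes it easy to route these entries to an async DB appender
    // and to switch the whole feature on or off via the logging level.
    private static final Logger ACTION_LOG = LoggerFactory.getLogger("user-actions");

    @AfterReturning("execution(* com.example.web..*Controller.*(..))")
    public void logAction(JoinPoint joinPoint) {
        ACTION_LOG.info("action={} args={}",
                joinPoint.getSignature().toShortString(), joinPoint.getArgs());
    }
}
```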
If you are interested in the actions your users take, you should be able to figure that out from the HTTP requests they send, so you might be better off logging the incoming requests in an Apache webserver that forwards to your application server. Putting a cluster of web servers in front of application servers is a typical practice (they're good for serving static content) and they are usually logging requests anyway. That way the logging will be fast, your application will not have to deal with it, and the biggest work will be writing a script to slurp the logs into a database where you can do analysis.
Typically it is considered bad form to spawn your own threads in a Java EE application.
A better approach would be to write to a local queue via JMS and then have a separate component, e.g., a message driven bean (pretty easy with EJB or Spring) which persists it to the database.
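A minimal sketch of that approach with Spring JMS (the queue name and ActionService are illustrative): the controller thread only enqueues the action, and the listener thread does the database write.

```java
import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Component;

@Component
public class ActionMessaging {

    private final JmsTemplate jmsTemplate;
    private final ActionService actionService;

    public ActionMessaging(JmsTemplate jmsTemplate, ActionService actionService) {
        this.jmsTemplate = jmsTemplate;
        this.actionService = actionService;
    }

    // Called from the controller: cheap from the user's perspective, no DB wait.
    public void publish(String actionJson) {
        jmsTemplate.convertAndSend("user-actions", actionJson);
    }

    // Consumed on a separate listener thread, where the database write actually happens.
    @JmsListener(destination = "user-actions")
    public void persist(String actionJson) {
        actionService.save(actionJson);
    }

    public interface ActionService {
        void save(String actionJson);
    }
}
```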
Another approach would be to just write to a log file and then have a process read the log file and write to the database once a day or whenever.
The things to consider are:
How up-to-date do you need the information to be?
How critical is the information, can you lose some?
How reliable does the order need to be?
All of these will factor into how many threads you have processing your queue/log file, whether you need a persistent JMS queue and whether you should have the processing occur on a remote system to your main container.
Hope this answers your questions.
I am not very familiar with databases and what they offer outside of the CRUD operations.
My research has led me to triggers. Basically it looks like triggers offer this type of functionality:
(from Wikipedia)
There are typically three triggering events that cause triggers to "fire":
INSERT event (as a new record is being inserted into the database).
UPDATE event (as a record is being changed).
DELETE event (as a record is being deleted).
My question is: is there some way I can be notified in Java (preferably including the data that changed) by the database when a record is Updated/Deleted/Inserted using some sort of trigger semantics?
What might be some alternate solutions to this problem? How can I listen to database events?
The main reason I want to do this is a scenario like this:
I have 5 client applications all in different processes/existing across different PCs. They all share a common database (Postgres in this case).
Let's say one client changes a record in the DB that all 5 clients are "interested" in. I am trying to think of ways for the clients to be "notified" of the change (preferably with the affected data attached) instead of having them query for the data at some interval.
Using Oracle you can set up a trigger on a table and then have the trigger send a JMS message. Oracle has two different JMS implementations. You can then have a process that will 'listen' for the message using the JDBC driver. I have used this method to push changes out to my application instead of polling.
If you are using a Java database (H2) you have additional options. In my current application (SIEM) I have triggers in H2 that publish change events using JMX.
Don't mix up the database (which contains the data) and events on that data.
Triggers are one way, but normally you will have a persistence layer in your application. This layer can choose to fire off events when certain things happen - say to a JMS topic.
Triggers are a last-ditch option, because you're then operating on relational items rather than on "events" on the data. (For example, an "update" could in reality map to a "company changed legal name" event.) If you rely on the DB, you'll have to map the inserts and updates back to real-life events... which you already knew about!
You can then layer other stuff on top of these notifications - like event stream processing - to find events that others are interested in.
James
Hmm. So you're using PostgreSQL and you want to "listen" for events and be "notified" when they occur?
http://www.postgresql.org/docs/8.3/static/sql-listen.html
http://www.postgresql.org/docs/8.3/static/sql-notify.html
Hope this helps!
Calling external processes from the database is very vendor specific.
Just off the top of my head:
SQL Server can call CLR programs from triggers,
PostgreSQL can call arbitrary C functions loaded dynamically,
MySQL can call arbitrary C functions, but they must be compiled in,
Sybase can make system calls if set up to do so.
The simplest thing to do is to have the insert/update/delete triggers make an entry in some log table, and have your java program monitor that table. Good columns to have in your log table would be things like EVENT_CODE, LOG_DATETIME, and LOG_MSG.
Unless you require very high performance or need to handle 100Ks of records, that is probably sufficient.
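A bare-bones sketch of the Java side of that approach, polling the log table over JDBC (the connection details, table name and poll interval are assumptions; the columns follow the suggestion above):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;

public class ChangeLogPoller {

    public static void main(String[] args) throws Exception {
        Timestamp lastSeen = new Timestamp(0);
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "pass")) {
            while (true) {
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT event_code, log_datetime, log_msg FROM change_log "
                        + "WHERE log_datetime > ? ORDER BY log_datetime")) {
                    ps.setTimestamp(1, lastSeen);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            lastSeen = rs.getTimestamp("log_datetime");
                            // react to the change, e.g. notify the interested clients
                            System.out.println(rs.getString("event_code") + ": " + rs.getString("log_msg"));
                        }
                    }
                }
                Thread.sleep(5_000); // poll interval
            }
        }
    }
}
```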
I think you're confusing two things. They are both highly db vendor specific.
The first I shall call "triggers". I am sure there is at least one DB vendor who thinks triggers are different from this, but bear with me. A trigger is a server-side piece of code that can be attached to a table. For instance, you could run a PL/SQL stored procedure on every update in table X. Some databases allow you to write these in real programming languages, others only in their variant of SQL. Triggers are typically reasonably fast and scalable.
The other I shall call "events". These are triggers that fire in the database and allow you to define an event handler in your client program. I.e., any time there are updates to the clients database, fire updateClientsList in your program. For instance, using Python and Firebird, see http://www.firebirdsql.org/devel/python/docs/3.3.0/beyond-python-db-api.html#database-event-notification
I believe the previous suggestion to use a monitor is an equivalent way to implement this using some other database. Maybe Oracle? MSSQL Notification Services, mentioned in another answer, is another implementation of this as well.
I would go so far as to say you'd better REALLY know why you want the database to notify your client program; otherwise you should stick with server-side triggers.
What you're asking completely depends on both the database you're using and the framework you're using to communicate with your database.
If you're using something like Hibernate as your persistence layer, it has a set of listeners and interceptors that you can use to monitor records going in and out of the database.
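For example, with JPA entity listeners (which Hibernate honours) you can hook into persist/update/remove of an entity. The Customer entity and ChangeAuditListener below are made up, and note this only sees changes made through this application's persistence layer, not changes made directly against the database:

```java
import javax.persistence.Entity;
import javax.persistence.EntityListeners;
import javax.persistence.Id;
import javax.persistence.PostPersist;
import javax.persistence.PostRemove;
import javax.persistence.PostUpdate;

@Entity
@EntityListeners(ChangeAuditListener.class)
class Customer {
    @Id
    Long id;
    String name;
}

class ChangeAuditListener {

    @PostPersist
    public void afterInsert(Object entity) {
        notifyClients("INSERT", entity);
    }

    @PostUpdate
    public void afterUpdate(Object entity) {
        notifyClients("UPDATE", entity);
    }

    @PostRemove
    public void afterDelete(Object entity) {
        notifyClients("DELETE", entity);
    }

    private void notifyClients(String operation, Object entity) {
        // e.g. publish to a topic or queue that the interested clients subscribe to
        System.out.println(operation + " -> " + entity);
    }
}
```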
There are a few different techniques here depending on the database you're using. One idea is to poll the database (which I'm sure you're trying to avoid). Basically you could check for changes every so often.
Another solution (if you're using SQL Server 2005) is to use Notification Services, although this technology is supposedly being replaced in SQL Server 2008 (we haven't seen a pure replacement yet, but Microsoft has talked about it publicly).
This is usually what the standard client/server application is for. If all inserts/updates/deletes go through the server application, which then modifies the database, then client applications can find out much easier what changes were made.
If you are using PostgreSQL, it has the capability to deliver notifications to a JDBC client via LISTEN/NOTIFY.
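A rough sketch of what that looks like with the PostgreSQL JDBC driver; the channel name record_changes is an assumption, and a trigger on the table would issue the corresponding NOTIFY. Note that the getNotifications(timeout) overload needs a reasonably recent pgjdbc version.

```java
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class PgListener {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "pass")) {
            try (Statement stmt = conn.createStatement()) {
                stmt.execute("LISTEN record_changes");   // subscribe to the channel
            }
            PGConnection pgConn = conn.unwrap(PGConnection.class);
            while (true) {
                // Blocks up to the given number of milliseconds waiting for notifications.
                PGNotification[] notifications = pgConn.getNotifications(10_000);
                if (notifications != null) {
                    for (PGNotification n : notifications) {
                        System.out.println("Channel " + n.getName() + ": " + n.getParameter());
                    }
                }
            }
        }
    }
}
```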
I would suggest using a "last updated" timestamp column, possibly together with the user who updated the record, and then letting the clients check their local record timestamp against that of the persisted record.
The added complexity of callback/trigger functionality is just not worth it in my opinion, unless it is supported by the database backend and the client library used, like for instance the notification services offered by SQL Server 2005 used together with ADO.NET.