We are currently in the design and architecture phase of a project. Its main points are as follows:
There are switches that generate real-time data.
We have two components to build in Java/Java EE; call them CompA and CompB.
CompA processes each input record from a switch without contacting any database; CompA has no DB access.
CompB takes the record processed by CompA and processes it further; this involves the business database.
CompA and CompB each have multiple instances in the system for scalability and fault tolerance.
Each record is a text record with multiple fields.
Records are transactional: a record is considered processed only if it has been processed by both CompA and CompB; otherwise it is rolled back and resent.
Now the problem is: what is the best way for CompA and CompB to communicate?
The two options are:
1. CompA--------> CompB
2. CompA-------->Messaging Server(JMS)------> CompB
Requirement: There will be more than one CompA and CompB in the system, and if any component fails its load will be shared by its peers, e.g. if a CompA instance fails, its load will be shared by the other CompA instances in the system. For that reason we are leaning toward the second option with JMS, so that CompA is not tightly bound to CompB. But since a new component (the messaging server) is introduced, this may degrade performance, as record processing is transactional and the system is real-time.
Your suggestions and expert advice will be highly appreciated.
JMS is the way to go - http://docs.oracle.com/javaee/6/tutorial/doc/bnceh.html
It is very reliable, you can do things like set message expirations and enforce priorities, and it is perfect for your model, which is basically "multiple producers/multiple consumers" over the network.
JMS supports transactions and is built for reliability; it is by far the most reliable mechanism available. Performance-wise, you should talk about "scalability" more than "raw performance". Provided your hardware can cope, JMS will.
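To make the transactional hand-off concrete, here is a minimal in-memory sketch. It is not JMS code: a `BlockingDeque` stands in for the JMS queue, and the hypothetical `consumeOnce` method mimics what a transacted session gives you. A record leaves the queue for good only if CompB's processing succeeds; otherwise it is put back so that any CompB instance can pick it up again. All class and method names are assumptions made for illustration.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.function.Predicate;

// In-memory stand-in for a transacted queue between CompA and CompB.
// This is NOT the JMS API; it only illustrates commit/rollback semantics.
class TransactedQueueSketch {
    private final BlockingDeque<String> queue = new LinkedBlockingDeque<>();

    // CompA side: publish a processed record.
    void send(String record) {
        queue.addLast(record);
    }

    // CompB side: take one record and run the processor on it.
    // Returns true on "commit". On failure the record is put back at the
    // head of the queue ("rollback") so that any CompB instance can retry it.
    boolean consumeOnce(Predicate<String> processor) {
        String record = queue.pollFirst();
        if (record == null) {
            return false; // nothing to consume
        }
        boolean ok;
        try {
            ok = processor.test(record);
        } catch (RuntimeException e) {
            ok = false; // treat an exception like a failed transaction
        }
        if (!ok) {
            queue.addFirst(record); // rollback: the record is visible again
        }
        return ok;
    }

    // Number of records still waiting to be processed.
    int pending() {
        return queue.size();
    }
}
```

With a real broker, several CompB instances would call the equivalent of `consumeOnce` against the same queue, which is what lets surviving peers take over the load of a failed instance.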
Wikipedia has a very good list of available JMS implementations:
http://en.wikipedia.org/wiki/Java_Message_Service#Provider_implementations
I have used Apache ActiveMQ, Open Message Queue and OpenJMS and, although I have no experience deploying JMS servers in a clustered environment, I agree that ActiveMQ is the most reliable solution I have used.
I would suggest using JMS with Spring Integration; check the example.
In my case we used ActiveMQ with Spring Integration, so we were able to handle load balancing and failover features easily.
Related
With CQRS architecture, in write intensive realtime applications like trading systems, traditional approaches of loading aggregates from database + distributed cache + distributed lock do not perform well.
The actor model (Akka) fits well here, but I am looking for an alternative solution. What I have in mind is to use Kafka for sending commands and to rely on topic partitioning to make sure commands for the same aggregate always arrive on the same node, then use database + local cache + local pessimistic lock to load aggregate roots and handle commands. This brings 3 main benefits:
aggregates are distributed across multiple nodes
no network traffic for looking up a central cache and distributed locks
no serialization & deserialization when saving and loading aggregates
One problem with this approach is that a consumer group rebalance may leave stale aggregate state in the local cache; setting short cache timeouts should work most of the time.
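The partition-affinity idea can be sketched as follows. This is a simplified stand-in, not Kafka's own partitioner: Kafka's default partitioner hashes the serialized key with murmur2, whereas this sketch uses `Arrays.hashCode` purely for illustration. The class name `AggregatePartitioner` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified key-based partitioner: commands carrying the same aggregate id
// always land on the same partition, and therefore on the same consumer node
// for as long as the partition assignment stays stable.
class AggregatePartitioner {
    private final int numPartitions;

    AggregatePartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    // Deterministic mapping from aggregate id to partition index.
    // floorMod keeps the result non-negative even for negative hash codes.
    int partitionFor(String aggregateId) {
        byte[] keyBytes = aggregateId.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }
}
```

The mapping only holds while the partition count is fixed; adding partitions changes where existing keys land, which is another situation in which a local cache can go stale.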
Has anyone used this approach in real projects?
Is it a good design and what are the down sides?
Please share your thoughts and experiences. Thank you.
IMHO Kafka will do the job for you. You need to make sure the network is fast enough.
In our project we react in soft real time to customer needs and purchases, and we send information over Kafka to different services that perform business logic. This works well.
Confirmations at the network level are handled well within the Kafka broker.
For example, when one of the broker nodes crashes, we do not lose messages.
Another matter is whether you need very strong transactional confirmations for all actions. If so, you need to be careful in the design; perhaps you need more topics to send the information and all the required logical confirmations.
If you need to implement more logic, such as a confirmation when the message has been processed by another external service, you will perhaps also need to disable auto commits.
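What disabling auto commits buys you can be shown with a small in-memory model. It does not use the Kafka client: the list stands in for one topic partition, and `committedOffset` for the consumer group's committed position. With the real client, the corresponding steps would be setting `enable.auto.commit` to `false` and committing explicitly (for example with `commitSync()`) after processing. The class and method names below are assumptions.

```java
import java.util.List;
import java.util.function.Predicate;

// In-memory model of a consumer with auto-commit disabled: the committed
// offset only advances after a message has been handled successfully.
// This is NOT the Kafka client API, only an illustration of the idea.
class ManualCommitSketch {
    private final List<String> log;   // stands in for one topic partition
    private int committedOffset = 0;  // where reading resumes after a restart

    ManualCommitSketch(List<String> log) {
        this.log = log;
    }

    // Processes messages from the committed offset onward and commits after
    // each success. Stops at the first failure and leaves that message
    // uncommitted, so it is delivered again on the next call (at-least-once).
    int pollAndProcess(Predicate<String> handler) {
        int processed = 0;
        while (committedOffset < log.size()) {
            String message = log.get(committedOffset);
            if (!handler.test(message)) {
                break; // no commit: the message will be seen again
            }
            committedOffset++; // explicit "commit" after successful handling
            processed++;
        }
        return processed;
    }

    int committedOffset() {
        return committedOffset;
    }
}
```

Because a message can be delivered more than once under this scheme, the handler (for example the call to the external service) should be safe to repeat.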
I do not know if this is a complete answer to your question.
For my master's degree final project I decided to design a drone delivery system. The main purpose is to learn to design complex systems.
The basic use case is this:
User goes to merchant online shop, selects the products, selects the delivery method as "Drone delivery" and selects his delivery location.
The merchant website makes an API call to our drone delivery system (DDS) application to register the new delivery order. (The order will contain all the information we need: parcel pick-up location, destination location...)
Based on the drones' positions and an algorithm, the DDS application will calculate and mark which drone can deliver this order in the shortest time.
The selected drone will deliver the order once it is free.
So far so good. My questions are related to the software architecture of this system. I have some general questions and some specific questions.
General questions:
How do you design a system like this so that it is scalable? I mean: the system may be used by many merchants; if they hit my API at the same time with 100 orders, the system must be able to handle it.
What are some good design principles or patterns when designing a system like this?
Specific questions:
So far I have come up with this architecture:
System Components:
Java(Spring) application
REST web service
web interface for managing drones and parcels
business logic and algorithms for routing drones
producer/consumer for RabbitMQ
MySQL Server
RabbitMQ
System flow:
Merchant hits REST API to register the order
The Java application saves the order to the MySQL database.
After saving the order to the database, a producer puts the order in a queue in RabbitMQ.
A consumer consumes the RabbitMQ order queue. It takes each order and uses an algorithm to calculate which drone offers the best delivery time. Each drone has a separate queue in RabbitMQ. After finding the best drone, the consumer inserts the order into that drone's queue in RabbitMQ. The consumer also queries the MySQL database during this process.
Whenever a drone is free, it will communicate with the system to ask for the next order. The system will look in that drone's RabbitMQ queue and take the next order from there.
My questions are related to the consumer and producer:
Is it OK for the consumer to have logic in it? In my example it will contain the algorithm that determines the best drone, and to do this it also needs to talk to MySQL to retrieve drone positions. Is this good practice? If not, how can I do it differently?
Is it best practice for the consumer to stay in the application? Right now the consumers run on the same server as the web service, and their code is not separated from the web service code. I am thinking that in the future I may need to move the consumers to a separate server. How should I design the consumers so that they can easily be separated from the application?
I think that the producer must stay in the application, I mean it is coupled with the web service app. Is that OK?
Sorry for the long post, and for my poor English.
Thank you very much :)
Yes, the consumer should have logic in it. This is a standard EIP routing pattern.
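As a rough illustration of a consumer that routes, here is an in-memory sketch. It is not RabbitMQ code: the per-drone queues are plain `ArrayDeque`s, drone positions are held in a map rather than read from MySQL, and the cost function (straight-line distance plus a fixed penalty per queued order) is made up for the example. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Content-based router sketch: picks the drone with the lowest estimated
// cost for an order and appends the order to that drone's own queue.
class DroneRouter {
    static final class Drone {
        final String id;
        final double x;
        final double y;

        Drone(String id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    private final Map<String, Drone> drones = new HashMap<>();
    private final Map<String, Queue<String>> droneQueues = new HashMap<>();

    void registerDrone(Drone drone) {
        drones.put(drone.id, drone);
        droneQueues.put(drone.id, new ArrayDeque<>());
    }

    // Routes the order to the cheapest drone and returns that drone's id.
    String route(String orderId, double pickupX, double pickupY) {
        if (drones.isEmpty()) {
            throw new IllegalStateException("no drones registered");
        }
        Drone best = null;
        double bestCost = Double.MAX_VALUE;
        for (Drone d : drones.values()) {
            double distance = Math.hypot(d.x - pickupX, d.y - pickupY);
            // Hypothetical penalty: each queued order counts as 10 distance units.
            double cost = distance + 10.0 * droneQueues.get(d.id).size();
            if (cost < bestCost) {
                bestCost = cost;
                best = d;
            }
        }
        droneQueues.get(best.id).add(orderId);
        return best.id;
    }

    // Called when a drone reports that it is free; null if nothing is queued.
    String nextOrderFor(String droneId) {
        Queue<String> queue = droneQueues.get(droneId);
        return queue == null ? null : queue.poll();
    }
}
```

The routing decision lives in one class with no messaging or HTTP types in its signature, so the same logic could sit behind a queue consumer today and move to a separate deployment later.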
If you properly separate your business logic layers from your data access layers (your queue access is a data access layer), then it probably isn't a problem to have them all share a common project. You ultimately probably want to separate your business logic/domain model from the web service and the router/consumer, but those are much more deployment and packaging concerns.
As long as you keep your web service code out of your business logic (and vice versa) you will probably be OK; you will just have to deploy the whole thing multiple times and only expose the endpoints that are relevant for any given deployment. You might ultimately be happier, though, if you separate your layers via libraries, as that will actually enforce not mixing the concerns.
And yes, the producer must be deployed with the web service; just make sure that, as part of the data access layer, it lives in a separate package/class. That will make your testing much easier.
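One way to picture that separation is a small interface between the business logic and the queue producer. This is only a sketch under assumptions: `OrderPublisher`, `OrderService` and `RecordingPublisher` are hypothetical names, and the RabbitMQ-backed implementation is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical port for the queue producer. The business logic depends only
// on this interface; a RabbitMQ-backed implementation would live in its own
// package alongside the other data-access code.
interface OrderPublisher {
    void publish(String orderId);
}

// Business logic: knows nothing about HTTP or RabbitMQ.
class OrderService {
    private final OrderPublisher publisher;

    OrderService(OrderPublisher publisher) {
        this.publisher = publisher;
    }

    // Validates the order and hands it to the publisher.
    // (Saving to the database is omitted from this sketch.)
    void registerOrder(String orderId) {
        if (orderId == null || orderId.isEmpty()) {
            throw new IllegalArgumentException("orderId is required");
        }
        publisher.publish(orderId);
    }
}

// Test double: records published orders instead of talking to a broker.
class RecordingPublisher implements OrderPublisher {
    final List<String> published = new ArrayList<>();

    @Override
    public void publish(String orderId) {
        published.add(orderId);
    }
}
```

In a unit test, `new OrderService(new RecordingPublisher())` exercises the order logic without a running broker, which is the testing benefit mentioned above.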
Requirement: Log events like page views and form submits. Each page has a ~1 second SLA. The application can have hundreds of concurrent users at a time.
Log events are stored into the Database.
Solution: My initial thought was to use an async logging approach where the control returns back to the application and the logging happens in a different thread (via Spring's Thread pool task executor).
However, someone suggested that using JMS would be a more robust approach. Is the added work (setting up queues, writing to the queues, reading from the queues) required by this approach worthwhile?
What are some of the best practices / things to look out for (in a production environment) when implementing something like this?
Both approaches are valid, but the first is vulnerable if your app stops unexpectedly: in that scenario, events yet to be written to the database will be lost. Using a persistent JMS queue means that those events will be read from the queue and persisted to the database upon restart.
Of course, if your DB writes are so much slower than placing a message of similar size on to a JMS queue, you may be solving the wrong problem?
Using JMS for logging is a complete mismatch. JMS is a Java abstraction over middleware tools like MQ Series. That is complete overkill and will put you through setup and configuration hell. JMS also lets you place messages in a transactional context, so you quickly get the idea that JMS might not be much better than database writes, as @rjsang suggested.
This is not to say that JMS is not a nice technology. It is a good technology when it is applied properly.
For asynchronous logging, you are better off depending on a logging API that supports it directly, like Log4j2. In your case, you might look at configuring an AsyncAppender with a JDBCAppender. Log4j2 has many more appenders as additional options, including one for JMS. By at least using a logging abstraction, you make all of that configurable and make it possible to change your mind later.
In the future we might have something like asynchronous CDI events, which should work similarly to JMS but would be much more lightweight. Maybe you can get something similar to work by combining CDI events with EJB asynchronous methods. As long as you don't use EJBs with a remote interface, it should also be pretty lightweight.
You could try fully asynchronous, external tooling if you want to. If you have to stick to your SLA at any price and resilience is important to you, you could use logstash or process your logs offline. By doing so, you decouple your application from the database and no longer depend on database performance. If the database is slow and you're using async loggers, the queues might fill up.
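The "queues might fill up" case is easy to demonstrate with a bounded buffer. The sketch below is an assumption-laden illustration, not how Log4j2 or logstash work internally: it uses `ArrayBlockingQueue.offer`, which never blocks, so the request thread keeps its SLA, but events are dropped once the buffer is full. The class name `BoundedEventBuffer` is hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Bounded buffer between request threads and a (possibly slow) database
// writer. offer() returns immediately, so request latency is protected, but
// events are lost when the writer cannot keep up. The counter makes the
// loss visible instead of silent.
class BoundedEventBuffer {
    private final BlockingQueue<String> buffer;
    private final AtomicLong dropped = new AtomicLong();

    BoundedEventBuffer(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    // Called on the request thread. Returns false if the event was dropped.
    boolean log(String event) {
        boolean accepted = buffer.offer(event);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Called by the writer thread; returns null when the buffer is empty.
    String nextEvent() {
        return buffer.poll();
    }

    long droppedCount() {
        return dropped.get();
    }
}
```

Whether dropping, blocking, or spilling to a local file is the right reaction to a full buffer depends on how much an individual page-view event is worth compared with the page's response time.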
With logstash using GELF the whole log processing is handled within a different (or even remote) JVM. Offline processing (e.g. you write CSV logs) allows you to load the log data afterwards into the database.
How should I design an application comprised of numerous (but identical) independent processes that need to communicate data to an enterprise application and be monitored and accessible by a web interface?
Here's a more concrete example in Java:
The independent processes are multiple instances of a standalone J2SE application that receives on initialization data about a "user" entity and then starts doing stuff regarding this "user" (this is an infinite process and so any batch sort of design would be wrong here and also similarly, the starting time of these processes is irrelevant)
The enterprise application is a set of J2EE beans and web-services that implement business logic, DB access etc.. and that are (for example) hosted on GlassFish.
The web front is a set of JSPs (perhaps also on GlassFish) that work with the beans.
Now ideally, I want a way for the processes in (1) to be able to invoke methods on the beans in (2), but also for the beans in (2) to be able to update the processes in (1) about things.
So these are the required flows of execution, assuming there are 10 independent processes of (1) running for 10 different users (consider a "user" something easily identifiable by, say, a number):
Something happens in one of the processes of (1) and they invoke a method from the enterprise application (2) with some data.
One of the real, human, users (which was already identified by the web app) clicks something on a web-page of (3), this invokes a method in (2), and then some "magical" entity (which I have no idea how to name) finds the independent process from (1) that is responsible for this particular user and updates the process with some new data.
My best approach so far is to expose these J2SE apps via JMX and go from there, but there is one thing I don't understand: who or what should hold a key-value list of the sort "the process at URI X is responsible for user Y" and then direct the calls accordingly?
BTW, please feel free to give any advice outside of the Java platform (!), as long as it is a platform that can be scaled easily.
EDIT:
Also, is there a way to "host" such independent processes on some app-server? Something that will re-spawn processes if they fail, allow for deployment and monitoring of such processes on remote machines etc.?
It has been some time since I last used Java Message Service, so I am afraid I am not up to date with the technical details, but from your description it seems like it would suit your case for handling communication between the administration GUI and the client processes.
There are various options (I believe you are interested in asynchronous communication), so you should look at the latest developments and judge for yourself whether it fits your case.
Regarding the amount of data the server would exchange with the processes, I believe this is a different topic, and I must say that the answer depends. Would it be better to send all the data in the message? Or would the message be just a notification, so that the client is notified and then connects to some enterprise bean to check the new state? I would prefer the latter, but this is something you should decide based on your requirements. I wouldn't blindly exclude the first option unless I had clear evidence that it wouldn't work.
Regarding scaling, I don't think it can be much worse than the scaling of the rest of your beans. As far as the server is concerned, the processes are all clients that need to be served.
Please take the above advice with a grain of salt: I don't know the specifics of your problem/design. I am speaking in a general way.
I hope that helps
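On the question of who holds the "process at URI X is responsible for user Y" list: one possibility, sketched here purely as an assumption, is a small registry inside the enterprise application that each standalone process reports to when it starts. The class name `ProcessRegistry` and the endpoint strings are hypothetical, and the registry is in-memory only.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry kept inside the enterprise application (2):
// each standalone process registers the endpoint it can be reached at
// (for example a JMX service URL) for the user it is responsible for.
class ProcessRegistry {
    private final Map<Long, String> endpointByUser = new ConcurrentHashMap<>();

    // Called by a process from (1) when it starts handling a user.
    void register(long userId, String endpoint) {
        endpointByUser.put(userId, endpoint);
    }

    // Called when a process shuts down. Only removes the entry if it still
    // points at this endpoint, so a replacement process that has already
    // re-registered for the same user is not affected.
    void unregister(long userId, String endpoint) {
        endpointByUser.remove(userId, endpoint);
    }

    // Used by the beans in (2) to find where to send an update for a user.
    Optional<String> lookup(long userId) {
        return Optional.ofNullable(endpointByUser.get(userId));
    }
}
```

With a messaging approach the registry may not be needed at all, since each process could instead listen only for messages addressed to its own user id.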
In my environment I need to schedule long-running tasks. I have application A, which just shows the client the list of currently running tasks and allows new ones to be scheduled. There is also application B, which does the actual hard work.
So app A needs to schedule a task in app B. The only thing they have in common is the database. The simplest thing to do seems to be adding a table with a list of tasks and having app B query that table every once in a while and execute newly scheduled tasks.
Yet, it doesn't seem to be the proper way of doing it. At first glance it seems that the tool for the job in an enterprise environment is a message queue. App A sends a message with task description to the queue, app B reads a message from the queue and executes the task. Is it possible in such case for app A to get the status of all the tasks scheduled (persistent queue?) without creating a table like the one mentioned above to which app B would write the status of completed tasks? Note also that there may be multiple instances of app A and each of them needs to know about all tasks of all instances.
The disadvantage of the 'table approach' is that I need to have DB polling.
The disadvantage of the 'message queue approach' is that I'm introducing a new communication channel into the infrastructure (yet another thing that can fail).
What do you think? Any other ideas?
Thank you in advance for any advice :)
========== UPDATE ==========
Eventually I decided on the following approach: there are two sides of this problem: one is communication between A and B. The other is getting information about the tasks.
For communication the right tool for the job is JMS. For getting data the right tool is the database.
So I'll have app A add a new row to the 'tasks' table describing a task (I can query this table later on to get a list of all tasks). Then A will send a message to B via JMS just to say 'you have work to do'. B will do the work and update the task status in the table.
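A minimal in-memory sketch of that split, for illustration only: the map stands in for the 'tasks' table and the queue for the JMS queue, so neither a database nor a JMS client is involved. The class, enum and method names are assumptions.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// "Table for state, message for notification" in miniature.
class TaskHandoffSketch {
    enum Status { SCHEDULED, DONE }

    private final Map<Long, Status> tasksTable = new LinkedHashMap<>();
    private final Queue<Long> notifications = new ArrayDeque<>();
    private long nextId = 1;

    // App A: insert the row first, then send a message carrying only the id.
    long schedule() {
        long id = nextId++;
        tasksTable.put(id, Status.SCHEDULED);
        notifications.add(id);
        return id;
    }

    // App B: react to one notification, do the work, update the row.
    // Returns false when there is nothing to do.
    boolean workOnce() {
        Long id = notifications.poll();
        if (id == null) {
            return false;
        }
        // ... the long-running work would happen here ...
        tasksTable.put(id, Status.DONE);
        return true;
    }

    // Any instance of app A can read a task's status from the table.
    Status statusOf(long id) {
        return tasksTable.get(id);
    }
}
```

Writing the row before sending the message matters: if the order were reversed, B could receive a notification for a task it cannot yet find in the table.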
Thank you for all responses!
You need to think about your deployment environment both now and likely changes in the future.
You're effectively looking at two problems, both of which can be solved in several ways, depending on how much infrastructure you are able to obtain and are willing to introduce, but it's also important to "right size" your design for your problems.
Whilst you're correct to think about the use of both databases and messaging, you need to consider whether these items are overkill for your domain and only you and others who know your domain can really answer that.
My advice would be to look at what is already in use in your area. If you already have database infrastructure that you can build on, then monitoring task activity and scheduling jobs in a database is not a bad idea. However, if you would have to run your own database, get new hardware, or don't have sufficient support resources, then introducing a database may not be a sensible option, and you could look at a simpler but potentially more fragile approach of having your processes write files to schedule jobs and report tasks.
At the same time, don't look at the introduction of a DB or JMS as inherently error prone. Correctly implemented they are stable and proven technologies that will make your system scalable and manageable.
As @kan says, exposing a web service interface is also a useful option.
Another option is to make B a service, e.g. expose control and status interfaces as REST or SOAP interfaces. In this case A will just be a client application of B. B stores its state in the database. A is a stateless application which just communicates with B.
BTW, using Spring Remoting you could expose an interface and use any of JMS, REST, SOAP or RMI as the transport layer, which could be changed later if necessary.
You have messaging (JMS) in enterprise architecture. Use it; it is available in Java EE containers like GlassFish. Messages can be persisted so that they are delivered even if the server reboots while they are in the queue. And you do not even need to care how all this is implemented.
There can be a couple of approaches here. First, as @kan suggested, have app B expose a web service for the interactions. This will allow heterogeneous clients to communicate with app B. That seems a good approach. App B can internally use whatever persistent store it deems fit.
Alternatively, you can have app B expose a management interface via JMX and have applications like app A talk to app B through it. Implementing task submission, retrieving statistics, etc. would be simpler. Additionally, you can leverage JMX notifications for real-time updates on task submissions, completions, etc. The downside is that this would be a Java-specific solution, and hence supporting heterogeneous clients would be a distant dream.