I'd like to trace my async application at some key checkpoints.
Is there any popular framework I can use?
For example, I may choose Vert.x or any other Java async framework. For each request/response cycle, I'd like to add checkpoints that log something, even though these points may be hit on different threads.
I'd like to see an aggregated view of a single request to see what's going on. Support for the distributed case would be better, but a single JVM is good enough.
What you are looking for is OpenTracing. It's an API that allows you to have distributed tracing features in a way that is vendor agnostic.
For your specific case, you'd have to handle the context propagation yourself, as there's no other (reliable) way to do that with OpenTracing yet for the async case. For other cases (synchronous JAX-RS, Servlets, Spring Boot, ...), it would be safe to use the native framework integrations and/or the Java agent rules.
For Vert.x, you'll need to inject the span context into the Vert.x message and extract that context later on.
There's an example of OpenTracing + Vert.x in the Hawkular APM examples directory that might help you get started. Note, however, that you might want to use another backend should you decide to move forward, as we (the Hawkular APM team) decided to join forces with Jaeger for the OpenTracing backend.
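A rough sketch of that inject/extract step over the Vert.x event bus, using the opentracing-java 0.31 propagation API (the "orders" address and the span names are made up; assumes a concrete Tracer implementation such as Jaeger's):

    import io.opentracing.Span;
    import io.opentracing.SpanContext;
    import io.opentracing.Tracer;
    import io.opentracing.propagation.Format;
    import io.opentracing.propagation.TextMapExtractAdapter;
    import io.opentracing.propagation.TextMapInjectAdapter;
    import io.vertx.core.Vertx;
    import io.vertx.core.eventbus.DeliveryOptions;

    import java.util.HashMap;
    import java.util.Map;

    public class TracedOrders {

        private final Tracer tracer;
        private final Vertx vertx;

        public TracedOrders(Tracer tracer, Vertx vertx) {
            this.tracer = tracer;
            this.vertx = vertx;
        }

        public void send(String order) {
            Span span = tracer.buildSpan("place-order").start();

            // Serialize the span context into plain string headers.
            Map<String, String> headers = new HashMap<>();
            tracer.inject(span.context(), Format.Builtin.TEXT_MAP,
                    new TextMapInjectAdapter(headers));

            DeliveryOptions options = new DeliveryOptions();
            headers.forEach(options::addHeader);
            vertx.eventBus().send("orders", order, options);
            span.finish();
        }

        public void consume() {
            vertx.eventBus().<String>consumer("orders", message -> {
                // Rebuild the context on the consumer side, which may run
                // on a different thread than the sender.
                Map<String, String> headers = new HashMap<>();
                message.headers().forEach(e -> headers.put(e.getKey(), e.getValue()));
                SpanContext parent = tracer.extract(Format.Builtin.TEXT_MAP,
                        new TextMapExtractAdapter(headers));

                Span child = tracer.buildSpan("process-order").asChildOf(parent).start();
                // ... handle the order here ...
                child.finish();
            });
        }
    }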
When introducing parallel processing to an application where multiple save-entity calls are made, I see the prior dev chose to do it via Spring Integration using split().channel(MessageChannels.executor(Executors.newFixedThreadPool(10))).handle("saveClass", "saveMethod").aggregate().get(), where this method is mapped to a request channel using the @Gateway annotation. My question: this task seems simpler to do using parallelStream() and forEach(). Does IntegrationFlow provide any benefit in this case?
If you are really doing plain in-memory data processing where Java's Stream API is enough, then indeed you don't need a whole messaging solution like Spring Integration. But if you deal with distributed requirements to process data from different systems, say from HTTP to Apache Kafka or a database, then it is better to use a tool that lets you smoothly connect everything together. Also: no one stops you from using the Stream API in a Spring Integration application; in the end, all your code is Java anyway. To learn more about what EIP is and why we need a special framework for these messaging-based solutions, see https://www.enterpriseintegrationpatterns.com/
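For contrast, a minimal sketch of the purely in-memory equivalent (the Function stands in for the "saveClass"/"saveMethod" bean in the question):

    import java.util.List;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    public class PlainStreamSave {

        // split/handle/aggregate collapses to split -> map -> collect when
        // everything happens in memory inside a single JVM.
        static <T, R> List<R> saveAll(List<T> entities, Function<T, R> saveMethod) {
            return entities.parallelStream()
                    .map(saveMethod)
                    .collect(Collectors.toList());
        }
    }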
To understand whether Spring events fit the task I'm working on, I need to understand how they work. Where are they stored?
As far as I can guess, they are stored in the Spring application context and disappear if the application crashes. Is my guess correct?
Spring events are intended for cases where calling methods directly would create too much coupling. If you need to keep events for auditing or replay purposes, you have to save them yourself. Based on your comments, there are many ways to achieve this, depending on the topology and purpose of the application (list not complete):
Model entities that represent the events and store them in a repository (sketched after this answer)
Incorporate a message broker such as Kafka that supports message persistence
Install an in-memory cache such as Hazelcast
Use a cloud message service such as AWS SQS
Lastly, please make sure that you carefully evaluate which option suits your needs best. Options 2 to 4 all introduce heavy complexity, and distributed applications can bring sorrow and misery to your life. Go for the simplest option if you can, and resort to the others only if absolutely necessary.
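A minimal sketch of option 1 (all type names are hypothetical): since Spring only dispatches events inside the ApplicationContext, nothing survives a crash unless a listener writes the events somewhere durable.

    import org.springframework.context.ApplicationEvent;
    import org.springframework.context.event.EventListener;
    import org.springframework.stereotype.Component;

    // Hypothetical event type for illustration.
    class OrderPlacedEvent extends ApplicationEvent {
        private final String orderId;

        OrderPlacedEvent(Object source, String orderId) {
            super(source);
            this.orderId = orderId;
        }

        String getOrderId() { return orderId; }
    }

    // Hypothetical persistence port; back it with JPA, JDBC, etc.
    interface EventRecordRepository {
        void save(String orderId, String type);
    }

    @Component
    class AuditingListener {

        private final EventRecordRepository repository;

        AuditingListener(EventRecordRepository repository) {
            this.repository = repository;
        }

        // Spring only delivers the event in-memory; persisting it for audit
        // or replay is entirely up to this listener.
        @EventListener
        public void on(OrderPlacedEvent event) {
            repository.save(event.getOrderId(), "ORDER_PLACED");
        }
    }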
Best architecture for implementing a web service that takes requests from one side, saves and enhances them, and then calls another service with the new parameters.
Is there any special design pattern for this?
There's not a lot to go on, but from what you've said it sounds like a job for "pipes and filters"!
To get a more precise answer, you might want to ask yourself some more detailed questions:
Do you need to do any validation or transformation of the incoming message?
Will you want to handle all requests the same way, or are there different types?
Are the external services likely to change, and if so, will they change frequently?
What do you want to do if the final web service call fails (should you roll back the database record)?
How do you want to report failures/responses? Do you need to report these back?
Do you need a mechanism to track the progress of a particular request?
Since you are looking for a design pattern, I think you might want to compare the pros and cons of using microservices orchestration vs choreography in the context of your project.
If you do not need an immediate response to the calling system, I would suggest an event-driven approach if that's feasible. Instead of REST services, you will have a message broker, and your services will be subscribed to certain events. This hides your consumers behind the message broker, which makes your system less coupled.
This can be implemented via Spring Cloud Stream, where you would have a source (a microservice that produces events), a processor (a microservice that performs the intermediate transformations), and a sink (a microservice that receives the final result for further processing).
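A minimal sketch of the middle (processor) piece, using the annotation-based model from older Spring Cloud Stream releases (payload and enrichment logic are illustrative):

    import org.springframework.cloud.stream.annotation.EnableBinding;
    import org.springframework.cloud.stream.annotation.StreamListener;
    import org.springframework.cloud.stream.messaging.Processor;
    import org.springframework.messaging.handler.annotation.SendTo;

    @EnableBinding(Processor.class)
    public class OrderTransformer {

        // Consumes from the input binding, enriches, and publishes to the
        // output binding; the broker keeps producer and consumer decoupled.
        @StreamListener(Processor.INPUT)
        @SendTo(Processor.OUTPUT)
        public String enrich(String order) {
            return order + "|enriched";
        }
    }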
Another possibility is Apache Camel. It has essentially all the integration patterns built in, so it should not be a problem to implement the solution based on either REST APIs or events.
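For instance, a rough Camel sketch of the receive/save/enhance/forward pipeline (the endpoint URIs and the EnrichmentService bean are illustrative; assumes the camel-jetty, camel-jpa, and camel-http components are on the classpath):

    import org.apache.camel.builder.RouteBuilder;

    public class OrderEnrichmentRoute extends RouteBuilder {

        // Hypothetical bean that enhances the stored order with new parameters.
        public static class EnrichmentService {
            public String enrich(String order) {
                return order + "&enriched=true";
            }
        }

        @Override
        public void configure() {
            from("jetty:http://0.0.0.0:8080/orders")         // take the request
                .to("jpa:com.example.Order")                  // save it (body must be the JPA entity)
                .bean(EnrichmentService.class)                // enhance it
                .to("http://downstream.example.com/orders");  // call the next service
        }
    }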
Requirement: log events like page views and form submits. Each page has a ~1 second SLA. The application can have hundreds of concurrent users at a time.
Log events are stored in the database.
Solution: my initial thought was to use an async logging approach, where control returns to the application and the logging happens in a different thread (via Spring's ThreadPoolTaskExecutor).
However, someone suggested that JMS would be a more robust approach. Is the added work (setting up queues, writing to them, reading from them) required by this approach worthwhile?
What are some of the best practices / things to look out for (in a production environment) when implementing something like this?
Both approaches are valid, but one is vulnerable if your app unexpectedly stops. In your first scenario, events yet to be written to the database will be lost. Using a persistent JMS queue means those events will be read from the queue and persisted to the database upon restart.
Of course, if your DB writes are so much slower than placing a message of similar size onto a JMS queue, you may be solving the wrong problem.
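A minimal sketch of the producing side with Spring's JmsTemplate (the queue name is made up); JMS delivery is persistent by default, so with a durable broker store the events survive restarts:

    import org.springframework.jms.core.JmsTemplate;

    public class EventLogger {

        private final JmsTemplate jmsTemplate;

        public EventLogger(JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        public void log(String event) {
            // The broker stores the message until a consumer dequeues it and
            // writes it to the database, even across application crashes.
            jmsTemplate.convertAndSend("event.log.queue", event);
        }
    }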
Using JMS for logging is a complete mismatch. JMS is a Java abstraction for a middleware tool like MQ Series. That is complete overkill and will put you through setup and configuration hell. JMS also puts messages in a transactional context, so you quickly get the idea that JMS might not be much better than database writes, as @rjsang suggested.
This is not to say that JMS isn't a nice technology. It is a good technology where it is applied properly.
For asynchronous logging, you are better off depending on a logging API that directly supports it, like Log4j2. In your case, you might be looking to configure an AsyncAppender with a JDBCAppender. Log4j2 has many more appenders as additional options, including one for JMS. However, by at least using a logging abstraction, you make all of this configurable and keep it possible to change your mind at a later time.
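A minimal log4j2.xml sketch of that AsyncAppender + JDBCAppender combination (table, column, and JNDI names are illustrative):

    <Configuration>
      <Appenders>
        <JDBC name="databaseAppender" tableName="EVENT_LOG">
          <DataSource jndiName="java:comp/env/jdbc/LoggingDataSource"/>
          <Column name="EVENT_DATE" isEventTimestamp="true"/>
          <Column name="LEVEL" pattern="%level"/>
          <Column name="MESSAGE" pattern="%message"/>
        </JDBC>
        <!-- The Async wrapper hands events off so the request thread returns
             immediately instead of waiting on the database write. -->
        <Async name="asyncDatabase">
          <AppenderRef ref="databaseAppender"/>
        </Async>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="asyncDatabase"/>
        </Root>
      </Loggers>
    </Configuration>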
In the future we might have something similar to asynchronous CDI events, which should work like JMS but be much more lightweight. Maybe you can get something similar working by combining CDI events with EJB asynchronous methods. As long as you don't use EJBs with a remote interface, it should also be pretty lightweight.
You could give fully async, external tooling a try if you want to. If you have to stick to your SLA at any price and resilience is important to you, you could try either logstash or processing your logs offline. By doing so, you decouple your application from the database and no longer depend on database performance. If the database is slow and you're using async loggers, the queues might fill up.
With logstash using GELF, the whole log processing is handled within a different (or even remote) JVM. Offline processing (e.g. writing CSV logs) allows you to load the log data into the database afterwards.
Are there any recommendations, best practices, or good articles on providing integration hooks?
Let's say I'm developing a web-based ordering system. Eventually I'd like my client to be able to write some code, package it into a jar, drop it into the classpath, and have it change the way the software behaves.
For example, if an order comes in, the code
1. may send an email or sms
2. may write some additional data into the database
3. may change data in the database, or decide that the order should not be saved into the database (cancel the data save)
Point 3 is quite dangerous since it interferes too much with data integrity, but if we want integration to be that flexible, is it doable?
Options so far
1. provide hooks for specific actions, e.g. if this and that occurs, call this method and have the client write the implementation for that method; this is too rigid, though
2. a mechanism similar to servlet filters, with code that runs before the actual action executes and code that runs after; I'm not quite sure how this could be designed, though
We're using Struts2 if that matters.
This integration must be able to detect a "state change", not just the "end state" after the core action executes.
For example, if an order changes state from In Progress to Paid, then it will do something, but if it changes from Draft to Paid, it should not do anything. The core action in this case would be loading the order object from the database, changing the state to Paid, and saving it again (or doing an SQL update).
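A sketch of the kind of hook this implies (all names are hypothetical), combining before/after callbacks with a veto for point 3:

    import java.util.List;

    public class OrderHooks {

        public enum OrderState { DRAFT, IN_PROGRESS, PAID }

        public static class Order {
            public OrderState state = OrderState.DRAFT;
        }

        public interface OrderHook {
            // Runs before the save; returning false cancels it (point 3).
            default boolean beforeSave(Order order, OrderState oldState, OrderState newState) {
                return true;
            }

            // Runs after a successful save (points 1 and 2); both states are
            // passed so a handler can react to one specific transition only.
            default void afterSave(Order order, OrderState oldState, OrderState newState) {
            }
        }

        public static boolean saveWithHooks(Order order, OrderState newState, List<OrderHook> hooks) {
            OrderState oldState = order.state;
            for (OrderHook hook : hooks) {
                if (!hook.beforeSave(order, oldState, newState)) {
                    return false; // a hook vetoed the save
                }
            }
            order.state = newState;
            // ... persist the order here ...
            for (OrderHook hook : hooks) {
                hook.afterSave(order, oldState, newState);
            }
            return true;
        }
    }

A handler could then compare oldState and newState and fire only on the In Progress to Paid transition.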
Many options, including:
Workflow tool
AOP
Messaging
DB-layer hooks
The easiest (for me at the time) was a message-based approach. I did a sort-of ad-hoc thing using Struts 2 interceptors, but a cleaner approach would use Spring and/or JMS.
As long as the relevant information is contained in the message, it's pretty much completely open-ended. Having a system accessible via services/etc. means the messages can tap back in to the main app in ways you haven't anticipated.
If you want this to work without system restarts, another option would be to implement handlers in a dynamic language (e.g., Groovy). Functionality can be stored in a DB. Using a Spring factory makes this pretty fun and reduces some of the complexity of a message-based approach.
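A rough sketch of that idea (all names are hypothetical): compile Groovy source fetched from the database into a handler class, so behavior can change without a restart.

    import groovy.lang.GroovyClassLoader;

    public class DynamicHandlerFactory {

        private final GroovyClassLoader loader = new GroovyClassLoader();

        // scriptSource would come from a DB table of handler scripts; the
        // script is assumed to define a class implementing Runnable.
        public Runnable compile(String scriptSource) throws Exception {
            Class<?> handlerClass = loader.parseClass(scriptSource);
            return (Runnable) handlerClass.getDeclaredConstructor().newInstance();
        }
    }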
One issue with a synchronous approach, however, is if a handler deadlocks or takes a long time; it can impact that thread at the least, or the system as a whole under some circumstances.