Spring @Transactional and long-running business logic - java

I am having some problems with Spring @Transactional and hope to find a solution here. I have a service method annotated with @Transactional that requests data over HTTP (I'm currently using Apache HttpClient), parses the response, and writes the result to the database. Since it all happens in one method, I'm afraid this may cause the transaction to be cancelled: my project has a transaction time limit, and a request to an external API may take a really long time.
I would like to know the most correct way to separate the HTTP request + response parsing from the database operations in such a case. One option is to have two methods, one for the transaction and one for the rest of the logic, but I suspect there is a generally accepted design pattern for such tasks.

Mixing different types of I/O in a single @Transactional method can cause trouble. If the API call responds slowly for a while, the method holds the borrowed database Connection the whole time it waits for the response.
You should break up the single method.
Method 1 makes the REST call; once a response is received with HTTP status code 200 OK, store the parsed records in a suitable collection.
Method 2 processes the records from method 1 inside a transaction. You can explore TransactionTemplate for this.
Refer - Sample code for reference, Sample code here
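One way to sketch that split (assuming a hypothetical ReportClient that wraps the HttpClient call and a ReportRepository for persistence; neither name is from the question): do the slow HTTP I/O with no transaction open, then use TransactionTemplate to run a short transaction around only the database writes. Note that simply calling a second @Transactional method on the same bean would not work, because Spring's proxy-based AOP does not intercept self-invocation.

```java
import java.util.List;

import org.springframework.stereotype.Service;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.support.TransactionTemplate;

// Sketch only: ReportClient, ReportRow and ReportRepository are hypothetical
// stand-ins for your HttpClient wrapper and your data-access layer.
@Service
public class ReportImportService {

    private final ReportClient client;            // wraps Apache HttpClient
    private final ReportRepository repository;
    private final TransactionTemplate txTemplate;

    public ReportImportService(ReportClient client,
                               ReportRepository repository,
                               PlatformTransactionManager txManager) {
        this.client = client;
        this.repository = repository;
        this.txTemplate = new TransactionTemplate(txManager);
    }

    public void importReport(long reportId) {
        // 1. Slow network I/O: no transaction is open, no pooled DB connection is held.
        List<ReportRow> rows = client.fetchAndParse(reportId);

        // 2. A short transaction scoped around the database writes only.
        txTemplate.executeWithoutResult(status -> repository.saveAll(rows));
    }
}
```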

Related

Use response from a restful webservice endpoint call to be used later on in some other webservice endPoint call

I want the response from one web service call to be used later on by some other web service call. How do I implement the code flow for this? Do I have to maintain a session?
I am creating the RESTful web services using Spring and Java.
If user 1 calls an endpoint /getUserRecord and 1 minute later calls /checkUserRecord, which uses data from the first call, how do I handle it, given that user 2 can call /getUserRecord before user 1 calls /checkUserRecord?
I am using Spring (Java) to create RESTful web services.
Technically, you can pass the UserRecord from the get call to the check call.
GET /userrecords/ --> return the one or more record
POST /checkUserRecord with the record as you want to check as request body.
But I strongly advise you not to do this. Data provided by the client is unreliable and cannot be trusted by your backend code. What if some JavaScript has altered the original data? Besides, if you have a list of data or a heterogeneous payload to pass back and forth, it would end up a complete mess of payload exchanges between client and server.
So, as Veselin Davidov said, you should probably stick with the clean stateless REST paradigm and rely on an identifier:
so
GET /userrecords/ --> [ { "id": 1, "data":"myrecorddata"}, ...]
GET /checkUserRecord/{id} like /checkUserRecord/1
And yes, you will have to make two calls to the database. If your concern is performance, you can set up some caching mechanism, as piy26 points out, but caching could lead you to other issues (how do you define a proper and reliable caching strategy?).
Unless you manage a tremendous amount of data, I think you should first focus on providing a clear, maintainable and safely usable REST API with a stateless design.
If you are using Spring Boot, it provides a way to enable caching on your repository objects, which is what you are trying to do.
@EnableCaching (org.springframework.cache.annotation.EnableCaching)
You can use the user ID as a hash attribute while creating your key so that responses for different users remain unique.
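As a minimal sketch (UserRecordService, UserRecordRepository, and the cache name are illustrative, not from the question): with @EnableCaching on, Spring Boot auto-configures a CacheManager, and keying the cache on the user id keeps different users' records distinct, so /checkUserRecord can reuse the cached result without a second database hit.

```java
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Configuration;
import org.springframework.stereotype.Service;

@Configuration
@EnableCaching                 // Spring Boot auto-configures a CacheManager
class CacheConfig { }

@Service
class UserRecordService {

    private final UserRecordRepository repository;   // assumed repository

    UserRecordService(UserRecordRepository repository) {
        this.repository = repository;
    }

    // The key is the user id, so entries for different users stay distinct.
    @Cacheable(cacheNames = "userRecords", key = "#userId")
    public UserRecord getUserRecord(long userId) {
        return repository.findById(userId).orElseThrow();
    }
}
```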

Why should the HTTP method PUT be idempotent, and not POST, when implementing a RESTful service?

There are many resources available on the internet where PUT vs POST is discussed, but I could not understand how that would affect the Java implementation or the back-end implementation done underneath for a RESTful service. The links I viewed are mentioned below:
https://www.keycdn.com/support/put-vs-post/
https://spring.io/understanding/REST#post
https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
http://javarevisited.blogspot.com/2016/10/difference-between-put-and-post-in-restful-web-service.html
For example, let's say there is a RESTful web service for Address.
So POST /addresses does the job of updating the Address, and PUT /addresses/1 does the job of creating one.
Now, how can the HTTP methods PUT and POST control what the web service code is doing behind the scenes?
PUT /addresses/1
may end up creating multiple entries of the same address in the DB.
So my question is, why is the idempotent behavior linked to the HTTP method?
How will you control the idempotent behavior by using specific HTTP methods? Or is it just a guideline or standard practice that is suggested?
I am not looking for an explanation of what idempotent behavior is, but for what makes us tag these HTTP methods so.
This is HTTP-specific. The RFC you linked states as much: https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html (see up-to-date RFC links at the bottom of this answer). It is not described as part of REST itself: https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm
Now you wrote,
I am not looking for an explanation of what is idempotent behavior but
what makes us tag these HTTP methods so?
An idempotent operation always has the same result (I know you know it), but the result is not the same thing as the HTTP response. It should be obvious from an HTTP perspective that multiple requests with any method, even with all the same parameters, can have different responses (e.g. timestamps). So the responses can actually differ.
What should not change is the result of the operation. So calling PUT /addresses/1 multiple times should not create multiple addresses.
As you can see, it's called PUT, not CREATE, for a reason. It may create the resource if it does not exist. If the resource exists, it may overwrite it with the new version (an update); and if the new version is exactly the same, it should do nothing on the server and return the same answer as the original request (because it may literally be the same request repeated: the previous one was interrupted and the client never received the response).
Compared to SQL, PUT would be more like INSERT OR UPDATE, not just INSERT or UPDATE.
So my question is, why is the idempotent behavior linked to the HTTP method?
It is linked to the HTTP method so that some services (proxies) know that, in case of a request failure, they can safely repeat it (not in the sense of a safe HTTP method, but in the sense of idempotence).
How will you control the idempotent behavior by using specific HTTP methods?
I'm not sure what you are asking for.
But:
GET, HEAD just return data and do not change anything (apart from maybe some logs, stats, metadata?), so they are safe and idempotent.
POST, PATCH can do anything; they are neither safe nor idempotent.
PUT, DELETE - are not safe (they change data) but they are idempotent, so it is "safe" to repeat them.
This basically means that safe methods can be issued by proxies, caches, web crawlers etc. without changing anything, and idempotent requests can be repeated by software without changing the outcome.
Or is it just a guideline or standard practice that is suggested?
It is a "standard". Maybe the RFC is not formally a standard yet, but it eventually will be, and we don't have anything else we could (and should) follow.
Edit:
As RFC mentioned above is outdated here are some references to current RFCs about that topic:
Retrying idempotent requests by client or proxy: https://www.rfc-editor.org/rfc/rfc7230#section-6.3.1
Pipelining idempotent requests: https://www.rfc-editor.org/rfc/rfc7230#section-6.3.2
Idempotent methods in HTTP: https://www.rfc-editor.org/rfc/rfc7231#section-4.2.2
Thanks to Roman Vottner for the suggestion.
So my question is, why is the idempotent behavior linked to the HTTP method?
I am not looking for an explanation of what idempotent behavior is, but for what makes us tag these HTTP methods so.
So that generic, domain agnostic participants in the exchange of messages can make useful contributions.
RFC 7231 calls out a specific example in its definition of idempotent
Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response. For example, if a client sends a PUT request and the underlying connection is closed before any response is received, then the client can establish a new connection and retry the idempotent request. It knows that repeating the request will have the same intended effect, even if the original request succeeded, though the response might differ.
A client, or intermediary, doesn't need to know anything about your bespoke API, or its underlying implementation, to act this way. All of the necessary information is in the specification (RFC 7231's definitions of PUT and idempotent), and in the server's announcement that the resource supports PUT.
Note that idempotent request handling is required of PUT, but it is not forbidden for POST. It's not wrong to have an idempotent POST request handler, or even one that is safe. But generic components, that have only the metadata and the HTTP spec to work from, will not know or discover that the POST request handler is idempotent.
I could not understand how would that affect the Java implementation or back end implementation which is done underneath for a RestFul service?
There's no magic; using PUT doesn't automatically change the underlying implementation of the service; technically, it doesn't even constrain the underlying implementation. What it does do is clearly document where the responsibility lies.
It's analogous to Fielding's 2002 observation about GET being safe
HTTP does not attempt to require the results of a GET to be safe. What
it does is require that the semantics of the operation be safe, and
therefore it is a fault of the implementation, not the interface
or the user of that interface, if anything happens as a result that
causes loss of property (money, BTW, is considered property for the
sake of this definition).
An important thing to realize is that, as far as HTTP is concerned, there is no "resource hierarchy". There's no relationship between /addresses and /addresses/1 -- for example, messages to one have no effect on cached representations of the other. The notion that /addresses is a "collection" and /addresses/1 is an "item in the /addresses collection" is an implementation detail, private to the origin server.
(It used to be the case that the semantics of POST would refer to subordinate resources, see for example RFC 1945; but even then the spelling of the identifier for the subordinate was not constrained.)
I mean, is PUT /employee acceptable, or does it have to be PUT /employee/<employee-id>?
PUT /employee has the semantics of "replace the current representation of /employee with the representation I'm providing". If /employee is a representation of a collection, it is perfectly fine to modify that collection by passing with PUT a new representation of the collection.
GET /collection
200 OK
{/collection/1, collection/2}
PUT /collection
{/collection/1, /collection/2, /collection/3}
200 OK
GET /collection
200 OK
{/collection/1, /collection/2, /collection/3}
PUT /collection
{/collection/4}
200 OK
GET /collection
200 OK
{/collection/4}
If that's not what you want, i.e. you want to append to the collection rather than replace the entire representation, then PUT has the wrong semantics when applied to the collection. You either need to PUT the item representation to an item resource, or use some other method on the collection (POST or PATCH are suitable):
GET /collection
200 OK
{/collection/1, collection/2}
PUT /collection/3
200 OK
GET /collection
200 OK
{/collection/1, /collection/2, /collection/3}
PATCH /collection
{ op: add, path: /4, ... }
200 OK
GET /collection
200 OK
{/collection/1, /collection/2, /collection/3, /collection/4 }
How will you control the idempotent behavior by using specific HTTP
methods? Or is it just a guideline or standard practice that is
suggested?
It is more about the HTTP specification, and an app must follow that specification; nothing stops you from altering the behavior on the server side.
There is always a difference between a plain web service and a RESTful web service.
Consider some legacy apps that use servlets. Servlets have doGet and doPost methods; doPost was always recommended over doGet for security reasons and for storing data on the server/DB, as the info is embedded in the request itself and is not exposed to the outside world.
Even there, nothing stops you from saving data in doGet or returning some static pages in doPost; hence it's all about following the underlying specs.
So my question is, why is the idempotent behavior linked to the HTTP method?
Because the HTTP specification says so:
4.2.2. Idempotent Methods
A request method is considered "idempotent" if the intended effect on
the server of multiple identical requests with that method is the
same as the effect for a single such request. Of the request methods
defined by this specification, PUT, DELETE, and safe request methods
are idempotent.
(Source: RFC 7231)
Why do the specs say this?
Because it is useful, when implementing HTTP-based systems, to be able to distinguish idempotent from non-idempotent requests, and the request method provides a good way to make the distinction.
Because other parts of the HTTP specification are predicated on being able to distinguish idempotent methods; e.g. proxies and caching rules.
How will you control the idempotent behavior by using specific HTTP methods?
It is up to the server to implement PUT & DELETE to have idempotent behavior. If a server doesn't, then it is violating the HTTP specification.
Or is it just a guideline or standard practice that is suggested?
It is required behavior.
Now, you could ignore the requirement (there are no protocol police!), but if you do, it is liable to break your systems, especially if you need to integrate them with systems implemented by other people ... who might write their client code assuming that if it replays a PUT or DELETE, your server won't say "error".
In short, we use specifications like HTTP so that our systems are interoperable. But this strategy only works properly if everyone's code implements the specifications correctly.
POST - is generally not idempotent, so multiple calls will create multiple objects with different IDs
PUT - given the id in the URI, you would apply a "create or update" query to your database, so once the resource is created, every subsequent call makes no difference to the backend state
I.e. there is a clear difference in how the backend generates new / updates existing stored objects. E.g. assuming you are using MySQL and auto-generated IDs:
POST will end up as an INSERT query
PUT will end up as an INSERT ... ON DUPLICATE KEY UPDATE query
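The difference can be sketched without any framework at all. This toy in-memory "table" (class and method names are illustrative, not from the answer) shows why a POST-style insert with a server-generated id is not idempotent, while a PUT-style "insert or update" under a client-chosen id is:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: an in-memory stand-in for the database table.
class AddressStore {
    private final Map<Long, String> rows = new HashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    // POST semantics: every call creates a new row under a fresh server-generated id.
    long post(String address) {
        long id = nextId.getAndIncrement();
        rows.put(id, address);
        return id;
    }

    // PUT semantics: "insert or update" under the client-chosen id;
    // repeating the call leaves the store unchanged.
    void put(long id, String address) {
        rows.put(id, address);
    }

    int size() { return rows.size(); }
}
```

Calling post twice with the same address yields two rows; calling put twice with the same id and address yields one.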
Normally in REST APIs, we use:
POST - Add data
GET - get data
PUT - update data
DELETE - delete data
Read the post below to get more ideas.
REST API Best Practices

Jersey/JAX-RS put, delete idempotency - provided or to be done by the programmer

According to REST, PUT, DELETE etc. are idempotent, i.e. an operation performed repeatedly on a resource results in the same response. Is this done automatically somehow (caching in the browser etc.), or is it to be done/ensured by the programmer (in the method)?
Is this idempotency just symbolic?
For example, in my method to handle the delete request -
@DELETE
@Produces({MediaType.TEXT_HTML})
public Response deleteEmployee() {
    String response = DAOaccess.deleteEmployee(name);
    return Response.noContent().build();
}
I could do anything inside this method. So, do I have to ensure idempotency here by writing such code (checking for the id, etc.)?
When somebody asks me the difference between PUT and POST, are they asking from the HTTP perspective rather than from JAX-RS (since maybe there is no functional difference in JAX-RS)?
Yes, the developer is responsible for ensuring idempotency here. PUT and DELETE should be idempotent according to the standard, but there is plenty of room for interpretation as to what that means. JAX-RS does relatively little to ensure that the developer is following REST best-practices, and will route every request to the appropriate endpoint, absent a filter that short-circuits the request.
Does the second delete call return a 204 or a 404? Either response leaves the system in the same state given repeated calls to the same method; one signals the client that there was no resource for it to delete, and the other signals the client that there definitely is no such resource at this time.
The only wrong implementation (according to the REST standard) would be for the system to toggle the deleted status of the resource on repeated calls - this would leave the client unsure as to what effect its next call would have on the system.
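As a framework-free illustration of that point (names are hypothetical, not from the question), a delete handler like this is idempotent regardless of whether the repeat call answers 204 or 404, because the store ends up in the same state either way:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an idempotent delete: repeated calls leave the store in the same
// state; the status code only reports whether the resource was still there.
class EmployeeResource {
    private final Map<String, String> employees = new HashMap<>();

    void add(String name, String data) { employees.put(name, data); }

    // 204 on the call that actually removes the record, 404 afterwards;
    // both responses are consistent with idempotent DELETE semantics.
    int delete(String name) {
        return employees.remove(name) != null ? 204 : 404;
    }

    boolean exists(String name) { return employees.containsKey(name); }
}
```

A toggle-style delete (restoring the record on the second call) would break this property.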

Do multiple calls to different methods in parallel in spring

I am fetching data from several different APIs; they are REST and SOAP web services. I pass a single id to each API, one by one, and get data in return. But each API takes a few seconds to respond, so building the final response object takes too much time.
My application is a Spring 4 REST service. What is the best way to call all of these APIs in parallel so that my response time is reduced as much as possible?
You can use the @Async annotation. You can find an example here.
Daniel's answer is right, but I would like to add something to it. If you want to do something with your results but don't want to block with Future#get, then I would suggest using the CompletableFuture class.
It will let you chain various actions that are triggered on its completion.
There is also a really nice article on how to use CompletableFuture with Spring's @Async annotation. Here is the link: Completable futures with Spring async.
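As a plain-Java sketch of the idea (no Spring required; the two suppliers stand in for the real REST and SOAP client calls), CompletableFuture lets you start both calls in parallel and combine the results once both complete:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Illustration: fan out two slow calls and combine their results.
class ParallelFetcher {
    static String fetchBoth(Supplier<String> restCall, Supplier<String> soapCall) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Both calls start immediately and run concurrently.
            CompletableFuture<String> rest = CompletableFuture.supplyAsync(restCall, pool);
            CompletableFuture<String> soap = CompletableFuture.supplyAsync(soapCall, pool);
            // thenCombine merges the two results once both futures complete.
            return rest.thenCombine(soap, (r, s) -> r + "|" + s).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

The total latency is roughly the slower of the two calls rather than their sum.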

Global Resource Object in Spring

I'm just getting into Spring (and Java), and despite quite a bit of research, I can't seem to even express the terminology for what I'm trying to do. I'll just explain the task, and hopefully someone can point me to the right Spring terms.
I'm writing a Spring-WS application that will act as middleware between two APIs. It receives a SOAP request, does some business logic, calls out to an external XML API, and returns a SOAP response. The external API is weird, though. I have to perform "service discovery" (make some API calls to determine the valid endpoints -- a parameter in the XML request) under a variety of situations (more than X hours since last request, more than Y requests since last discovery, etc.).
My thought was that I could have a class/bean/whatever (not sure of the best terminology) that could handle all this service discovery in the background. Then the request handlers could query this "thing" to get a valid endpoint without performing their own discovery and slowing down request processing. (Service discovery only needs to be re-performed rarely, so doing it for every request would be wasteful.)
I thought I had found the answer with singleton beans, but every resource says those shouldn't have state and concurrency will be a problem -- both of which kill the idea.
How can I create an instance of "something" that can:
1) Wake up at a defined interval and run a method (i.e. check whether service discovery needs to be performed after X hours, and if so, do it).
2) Provide something like a getter method that can return some strings.
3) Provide a way for #2 to execute a method in the background without delaying the return (basically, detect that an instance property exceeds a value and execute -- or rather, issue a request to execute -- an instance method).
I have experience with multi-threaded programming, and I have no problem using threads and mutexes. I'm just not sure that's the proper way to go in Spring.
Singletons ideally shouldn't have state because of multithreading issues. However, it sounds like what you're describing is essentially a periodic query that returns an object describing the results of the discovery mechanism, and you're implementing a cache. Here's what I'd suggest:
Create an immutable (value) object MyEndpointDiscoveryResults to hold the discovery results (e.g., endpoint address(es) or whatever other information is relevant to the SOAP consumers).
Create a singleton Spring bean MyEndpointDiscoveryService.
On the discovery service, keep an AtomicReference<MyEndpointDiscoveryResults> (or even just a plain volatile field). This ensures that all threads see the updated results, and confining the shared state to a single, atomically updated field holding an immutable object keeps the concurrency interactions narrow.
Use @Scheduled or another mechanism to run the appropriate discovery protocol. When there's an update, construct the entire result object, then save it into the field.
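A framework-free sketch of the steps above (class names are illustrative, not from the answer); in Spring, refresh() would simply be the body of a @Scheduled method on the singleton bean:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// An immutable value object published through an AtomicReference: readers
// never block on discovery, and each refresh swaps in a whole new snapshot.
class EndpointDiscoveryService {

    static final class Results {                  // immutable value object
        final List<String> endpoints;
        Results(List<String> endpoints) { this.endpoints = List.copyOf(endpoints); }
    }

    private final AtomicReference<Results> current =
            new AtomicReference<>(new Results(List.of()));

    // Request handlers call this; it just reads the latest published snapshot.
    Results current() { return current.get(); }

    // Periodic background discovery publishes its result atomically.
    void refresh(List<String> discovered) { current.set(new Results(discovered)); }
}
```

Because the Results object is immutable and replaced wholesale, readers can never observe a half-updated endpoint list.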
