My backend system serves about 10K POS devices. Each device issues its requests sequentially, but I am wondering how the backend can guarantee that it handles a given client's requests in that same sequential order.
For example, a device issues a 'sale' request and times out waiting for the response (the DB may be blocked), so it issues a 'cancellation' to cancel that sale. In this case the backend may still be handling the 'sale' transaction when the 'cancellation' request arrives, which can cause unexpected results.
My idea is to set up a persistent queue for each device (client), but is it OK to set up 10K queues? I am not sure, please help.
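One way to sketch the per-device ordering (without committing to 10K independent persistent queues) is to route each device's requests through its own single-threaded executor, so a 'cancellation' can never overtake or interleave with the 'sale' it follows. This is only an illustrative in-memory sketch with made-up names; a real system would still need persistence and the ACID guarantees discussed below.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Sketch: one single-threaded executor per device ID. Requests from the
// same device run strictly in submission order; different devices run in
// parallel. All names here are illustrative.
public class PerDeviceSerializer {
    private final ConcurrentHashMap<String, ExecutorService> queues = new ConcurrentHashMap<>();

    public CompletableFuture<String> submit(String deviceId, Supplier<String> request) {
        // Lazily create the device's queue on first use.
        ExecutorService q = queues.computeIfAbsent(
                deviceId, id -> Executors.newSingleThreadExecutor());
        return CompletableFuture.supplyAsync(request, q);
    }

    public void shutdown() {
        queues.values().forEach(ExecutorService::shutdown);
    }

    public static void main(String[] args) {
        PerDeviceSerializer s = new PerDeviceSerializer();
        StringBuilder log = new StringBuilder();
        // 'sale' and its 'cancellation' share a queue, so they can never interleave.
        CompletableFuture<String> sale = s.submit("device-42", () -> { log.append("sale;"); return "sold"; });
        CompletableFuture<String> cancel = s.submit("device-42", () -> { log.append("cancel;"); return "cancelled"; });
        System.out.println(sale.join() + " " + cancel.join() + " order=" + log);
        s.shutdown();
    }
}
```

Note that one platform thread per device means roughly 10K mostly idle threads at this scale; a shared pool with per-device task chaining (or virtual threads on JDK 21+) would be the usual refinement.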
This is an incredibly complex area of computer science and a lot of these problems have been solved many times. I would not try to reinvent the wheel.
My advice:
Read about and thoroughly understand ACID (summaries paraphrased):
Atomicity - If any part of a transaction fails, the whole transaction fails, and the database is not left in an unknown or corrupted state. This is hugely important. Rely on existing software to make this happen in the actual database. And don't invent data structures that require you to reinvent your own transaction system. Make your transactions as small as possible to reduce failures.
Consistency - The database is never left in an invalid state. All operations committed to it will take it to a new valid state.
Isolation - The operations you perform on a database can be performed at the same time and result in the same state as if performed one after the other. OR performed safely inside a locking transaction.
Durability - Once a transaction is committed, it will remain so.
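The all-or-nothing contract of atomicity can be illustrated with a toy in-memory ledger (hypothetical names; a real database's transaction machinery does this for you, which is exactly the point about not reinventing it): changes are applied to a copy and only swapped in if every step succeeds.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of atomicity: mutate a copy of the state and swap it
// in only if every step succeeds, so a failure midway leaves the original
// untouched. (A real database does this for you; this just shows the contract.)
public class AtomicLedger {
    private Map<String, Integer> balances = new HashMap<>(Map.of("A", 100, "B", 0));

    public synchronized boolean transfer(String from, String to, int amount) {
        Map<String, Integer> copy = new HashMap<>(balances); // work on a snapshot
        copy.merge(from, -amount, Integer::sum);
        copy.merge(to, amount, Integer::sum);
        if (copy.get(from) < 0) {
            return false; // a failed step: discard the copy, state unchanged
        }
        balances = copy; // commit: all-or-nothing swap
        return true;
    }

    public synchronized Map<String, Integer> snapshot() {
        return new HashMap<>(balances);
    }

    public static void main(String[] args) {
        AtomicLedger ledger = new AtomicLedger();
        System.out.println(ledger.transfer("A", "B", 150)); // false: insufficient funds
        System.out.println(ledger.snapshot()); // A still 100, B still 0
        System.out.println(ledger.transfer("A", "B", 40));  // true
    }
}
```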
Both your existing system and your proposed idea sound like they could potentially be violating ACID:
A stateful request system probably violates (or makes it hard not to violate) isolation.
A queue could violate durability if not done in a bullet-proof way.
Not to mention, you have scalability issues as well. Combine scalability and ACID and you have a heavyweight situation requiring serious expertise.
If you can help it, I would strongly suggest relying on existing systems, especially if this is for point of sale.
To give some context here, I have been following Project Loom for some time now. I have read The state of Loom. I have done asynchronous programming.
Asynchronous programming (as provided by Java NIO) returns the thread to the thread pool when a task waits, and it goes to great lengths not to block threads. This gives a large performance gain: we can now handle many more requests, as they are no longer directly bound by the number of OS threads. But what we lose is the context. The same task is now NOT associated with just one thread; all the context is lost once we dissociate tasks from threads. Exception traces do not provide very useful information, and debugging is difficult.
In comes Project Loom with virtual threads that become the single unit of concurrency. And now you can perform a single task on a single virtual thread.
It's all fine until now, but the article goes on to state, with Project Loom:
A simple, synchronous web server will be able to handle many more requests without requiring more hardware.
I don't understand how we get performance benefits with Project Loom over asynchronous APIs. The asynchronous APIs make sure not to keep any thread idle. So what does Project Loom do to make it more efficient and performant than asynchronous APIs?
EDIT
Let me rephrase the question. Let's say we have an HTTP server that takes in requests and does some CRUD operations against a backing persistent database. Say this HTTP server handles a lot of requests: 100K RPM. Two ways of implementing this:
The HTTP server has a dedicated pool of threads. When a request comes in, a thread carries the task up until it reaches the DB, wherein the task has to wait for the response from DB. At this point, the thread is returned to the thread pool and goes on to do the other tasks. When DB responds, it is again handled by some thread from the thread pool and it returns an HTTP response.
The HTTP server just spawns virtual threads for every request. If there is an IO, the virtual thread just waits for the task to complete. And then returns the HTTP Response. Basically, there is no pooling business going on for the virtual threads.
Given that the hardware and the throughput remain the same, would any one solution fare better than the other in terms of response times or handling more throughput?
My guess is that there would not be any difference with respect to performance.
We don't get benefit over asynchronous API. What we potentially will get is performance similar to asynchronous, but with synchronous code.
The answer by @talex puts it crisply. Adding further to it.
Loom is more about a native concurrency abstraction, which additionally helps one write asynchronous code. Since it is a VM-level abstraction, rather than just a code-level one (like what we have been doing till now with CompletableFuture etc.), it lets one implement asynchronous behavior with reduced boilerplate.
With Loom, a more powerful abstraction is the savior. We have seen this repeatedly: an abstraction with syntactic sugar lets one write programs effectively, whether it was FunctionalInterfaces in JDK 8 or for-comprehensions in Scala.
With Loom, there isn't a need to chain multiple CompletableFutures (to save on resources); one can write the code synchronously. With each blocking operation encountered (ReentrantLock, I/O, JDBC calls), the virtual thread gets parked. And because these are lightweight threads, the context switch is way cheaper than with kernel threads.
When it is blocked, the actual carrier thread (the one that was running the run-body of the virtual thread) gets engaged in executing some other virtual thread's run. So the carrier thread is not sitting idle but executing some other work, and it comes back to continue the execution of the original virtual thread whenever it is unparked. Just like how a thread pool would work, except that here a single carrier thread in a way executes the body of multiple virtual threads, switching from one to another when they block.
We get the same behavior (and hence performance) as manually written asynchronous code, while avoiding the boilerplate needed to do the same thing.
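A minimal sketch of that behavior, assuming JDK 21+: each request below blocks "synchronously" in Thread.sleep, yet ten thousand of them can be in flight at once because each sleep only parks a virtual thread, freeing its carrier.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch (JDK 21+): one virtual thread per request. Each handler blocks
// "synchronously", but Thread.sleep only parks the virtual thread, so the
// few carrier threads keep running other virtual threads meanwhile.
public class VirtualThreadServer {
    public static int handleAll(int requests) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < requests; i++) {
                exec.submit(() -> {
                    try {
                        Thread.sleep(Duration.ofMillis(50)); // simulated DB call: parks, does not pin a carrier
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for every submitted task to finish
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = handleAll(10_000);
        // 10,000 blocking requests finish in roughly one sleep's worth of wall
        // time, not 10,000 * 50 ms, because they all park concurrently.
        System.out.println(done + " requests in " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```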
Consider the case of a web framework where there is one thread pool to handle I/O and another for the execution of HTTP requests. For simple HTTP requests, one might serve the request from the HTTP-pool thread itself. But if there are any blocking or high-CPU operations, we let this activity happen asynchronously on a separate thread.
This thread would collect the information from an incoming request, spawn a CompletableFuture, and chain it into a pipeline (read from the database as one stage, followed by computation on the result, followed by another stage to write back to the database, make web service calls, etc.). Each one is a stage, and the resultant CompletableFuture is returned to the web framework.
When the resultant future is complete, the web framework relays the results back to the client. This is how Play Framework and others have been dealing with it, providing isolation between the HTTP-handling thread pool and the execution of each request. But if we dive deeper into this, why do we do it?
One core reason is to use the resources effectively, particularly around blocking calls. Hence we chain with thenApply etc. so that no thread is blocked on any activity, and we do more with fewer threads.
This works great, but it is quite verbose. Debugging is indeed painful, and if one of the intermediary stages results in an exception, the control flow goes haywire, resulting in further code to handle it.
With Loom, we write synchronous code and let someone else decide what to do when it is blocked, rather than sleeping and doing nothing.
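To make the contrast concrete, here is the same hypothetical three-stage request pipeline written both ways; the stage names (readFromDb, compute, writeBack) are stand-ins, not a real API. The synchronous version is what Loom lets you run at scale by executing it on a virtual thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Contrast sketch: the same three-stage pipeline as a CompletableFuture
// chain and as plain synchronous code. Stage names are illustrative.
public class PipelineStyles {
    static String readFromDb()         { return "row"; }
    static String compute(String in)   { return in + "+computed"; }
    static String writeBack(String in) { return in + "+written"; }

    // Asynchronous style: each stage is a chained callback, so no pool
    // thread sits blocked between stages.
    public static String asyncStyle() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return CompletableFuture
                    .supplyAsync(PipelineStyles::readFromDb, pool)
                    .thenApply(PipelineStyles::compute)
                    .thenApply(PipelineStyles::writeBack)
                    .join();
        } finally {
            pool.shutdown();
        }
    }

    // Loom style: identical logic, written top to bottom; run it on a
    // virtual thread and the runtime parks it at each blocking point.
    public static String syncStyle() {
        String row = readFromDb();
        String computed = compute(row);
        return writeBack(computed);
    }

    public static void main(String[] args) {
        System.out.println(asyncStyle());
        System.out.println(syncStyle());
    }
}
```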
The http server has a dedicated pool of threads ....
How big a pool? (Number of CPUs)*N + C? With N>1 one can fall into anti-scaling, as lock contention extends latency; whereas N=1 can under-utilize available bandwidth. There is a good analysis here.
The http server just spawns...
That would be a very naive implementation of this concept. A more realistic one would strive to collect from a dynamic pool that kept one real thread for every blocked system call plus one for every real CPU. At least that is what the folks behind Go came up with.
The crux is to keep the {handlers, callbacks, completions, virtual threads, goroutines : all PEAs in a pod} from fighting over internal resources, so that they do not lean on system-based blocking mechanisms until absolutely necessary. This falls under the banner of lock avoidance, and might be accomplished with various queuing strategies (see libdispatch), etc. Note that this leaves the PEA divorced from the underlying system thread, because the PEAs are internally multiplexed between the system threads. This is your concern about divorcing the concepts. In practice, you pass around your favourite language's abstraction of a context pointer.
As 1 indicates, there are tangible results that can be directly linked to this approach, and a few intangibles. Locking is easy: you just make one big lock around your transactions and you are good to go. That doesn't scale; but fine-grained locking is hard: hard to get working, and hard to choose the fineness of the grain. When to use { locks, CVs, semaphores, barriers, ... } is obvious in textbook examples, and a little less so in deeply nested logic. Lock avoidance makes that, for the most part, go away, limiting it to contended leaf components like malloc().
I maintain some skepticism, as the research typically shows a poorly scaled system which is transformed into a lock-avoidance model, then shown to be better. I have yet to see one which unleashes some experienced developers to analyze the synchronization behavior of the system, transform it for scalability, and then measure the result. But even if that were a win, experienced developers are a rare(ish) and expensive commodity; the heart of scalability is really financial.
I have been testing an Akka-based application for more than a month now. Reflecting on it, I have the following conclusions:
Akka actors alone can achieve a lot of concurrency. I have reached more than 100,000 messages/sec. This is fine, and it is just message passing.
Now, if there is a netty layer for connections at one end, or the akka actors end up doing DB calls, REST calls, writing to files, etc., the whole system doesn't make sense anymore. The actors' mailboxes get full and their throughput (here, the ability to receive msgs/sec) goes down.
From a QA perspective, this is like having a huge pipe into which you can forcefully pump a lot of water and it will cope. But if the input hose is bad, or the endpoints cannot handle the pressure, this huge pipe is of no use.
I need answers to the following so that I can make suggestions for, or verify, the system:
Should blocking calls like DB calls and REST calls be handled by actors? Or are actors good only for message passing?
Could it work like this: say you need to persistently connect millions of Android/iOS devices to your akka system. Instead of sockets (so unreliable) etc., can a remote actor be implemented as a persistent connection?
Is it OK to do any sort of computation in an actor's handleMessage()? Like DB calls etc.
I would ask the editors to let this post through; I cannot ask all of these separately.
1) Yes, they can. But this work should be done in a separate (worker) actor that uses a fork-join pool in combination with scala.concurrent.blocking around the blocking code; this is needed to prevent thread starvation. If the target system (DB, REST and so on) supports several concurrent connections, you may use akka's routers for that (creating one actor per connection in the pool). You can also create several actors for several different tables (resources, queues, etc.), depending on your transaction isolation and your storage's consistency requirements.
Another way to handle this is to use asynchronous requests with acknowledgements instead of blocking. You may also put the blocking operation inside a separate future (thread, worker), which will send an acknowledgement message when the operation ends.
2) Yes, an actor may be implemented as a persistent connection. It will be just an actor which holds the connection's state (as actors are stateful). It may be made even more reliable by using Akka Persistence, which can save the connection to some storage.
3) You can do any non-blocking computations inside the actor's receive (there is no handleMessage method in akka). Failures (like no connection to the DB) will be managed automatically by Akka supervision. For blocking code, see 1.
P.S. about the "huge pipe": the backend application itself is a pipe (which becomes huge with akka), so nothing can help you improve performance if the environment can't handle it; there are no pumps in this world. But akka is also a "water tank", which means that the outer pressure may be stronger than the inner. By the way, this means the developer should be careful with mailboxes: since "too much water" may cause an OutOfMemoryError, the way to prevent that is to organize back pressure. That can be done by not acknowledging an incoming message (or simply blocking an endpoint's handler) until akka has processed it.
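A plain-Java analogue of that second approach (hypothetical names, with a CompletableFuture standing in for an Akka worker actor): the blocking call runs on a dedicated pool, and its completion acts as the acknowledgement.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the "blocking work in a separate worker" pattern: the caller
// hands the blocking call to a dedicated pool and receives an
// acknowledgement when it finishes, instead of blocking its own thread.
public class BlockingOffload {
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    public CompletableFuture<String> queryDb(String sql) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(20); // simulated slow DB call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "result-of:" + sql;
        }, workers);
    }

    public void shutdown() { workers.shutdown(); }

    public static void main(String[] args) {
        BlockingOffload off = new BlockingOffload();
        // The "acknowledgement": a callback fires when the blocking call completes.
        off.queryDb("SELECT 1").thenAccept(r -> System.out.println("ack: " + r)).join();
        off.shutdown();
    }
}
```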
I'm not sure I understand all of your questions, but in general actors are also good for slow work:
1) Yes, they are perfectly fine. Just create/assign one actor per request (maybe behind an akka router for load balancing), and once it's done it can either mark itself as "free for new work" or self-terminate. Remember to execute the slow code in a future. Personally, I like avoiding the ask/pipe pattern due to the implicit timeouts and exception swallowing; just use tells with request IDs. But if your latencies and error rates are low, go for ask/pipe.
2) You could, but in that case I'd suggest having a pool of connections rather than spawning them per request, as that takes longer. If you can provide more details, I can maybe improve this answer.
3) Yes, but think about this: actors are cheap. Create millions of them; every time there is a blocking part, it should go to a different, specialized actor. Take single responsibility to the extreme. If you have only a few, blocking actors, you lose all the benefits.
What is the most suitable way to handle optimistic locking in JPA? I have the solutions below but don't know which is the better one to use.
Handling the optimistic locking exception in a catch block and retrying.
Using an atomic flag variable and, if another thread is processing, waiting until it finishes. This way data-modification conflicts and locking contention may be avoided.
Maintaining a queue of all incoming database change requests and processing them one by one.
Please suggest a better solution to this problem if there is one.
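For reference, the mechanical shape of option 1 (catch-and-retry) looks like the sketch below. The AtomicLong stands in for a @Version column so the example stays self-contained; as the answer below points out, retrying is only correct when the change can be re-derived from freshly read data.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of option 1: a version-checked update with a bounded retry loop.
// A real JPA entity would carry a @Version column; an AtomicLong stands in
// for the row's version here to keep the example self-contained.
public class OptimisticRetry {
    private final AtomicLong version = new AtomicLong(0);
    private volatile int value = 0;

    // Re-read and retry when someone else bumped the version first.
    public boolean updateWithRetry(int newValue, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long seen = version.get();   // "read the row" along with its version
            int merged = newValue;       // re-derive the change from the fresh read
            if (version.compareAndSet(seen, seen + 1)) { // commit only if unchanged
                value = merged;
                return true;
            }
            // Version moved: another transaction won. Loop and re-read.
        }
        return false; // give up after maxAttempts and report the conflict
    }

    public int value() { return value; }

    public static void main(String[] args) {
        OptimisticRetry row = new OptimisticRetry();
        System.out.println(row.updateWithRetry(7, 3)); // true
        System.out.println(row.value());               // 7
    }
}
```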
You don't say why you are using optimistic locking.
You usually use it to avoid blocking resources (like database rows) for a long time, i.e. data is read from a database and displayed to the user. Eventually the user makes changes to the database, and the data is written back.
You don't want to block the data for other users during that time. In a scenario like this you don't want to use option 2, for the same reason.
Option 1 is not easy, because an optimistic locking exception tells you that something has changed the data behind your back, and you would overwrite those changes with your data. Simply retrying the write won't help here.
Option 3 might be possible in some situations, but adds a lot of complexity and possible errors. This would be my last resort by far.
In my experience optimistic locking exceptions are quite rare. In most cases the easiest way out is to discard everything and redo it from the start, even if it means telling the user: sorry, there was an unexpected problem, please do it again.
On the other hand, if you get these problems regularly between two competing threads, you should try to avoid them. In these cases option 2 might be the way to go, but it depends on the scenario.
If the conflict occurs between a user interaction and a background thread (and not between two users) you could try to change the timing of the background thread, or signal the background thread to delay its work.
To sum it up: It mostly depends on your setup, and when and how the exception occurs.
From App Engine doc on transaction:
Note: In extremely rare cases, the transaction is fully committed even if a transaction returns a timeout or internal error exception. For this reason, it's best to make transactions idempotent whenever possible.
Suppose A transfers money to B; the operations should be in a transaction. If the situation in the Note above did occur, the data would be left in an inconsistent state (a transfer action cannot be idempotent). Is my understanding correct?
You would need to make such a transaction idempotent. See this earlier StackOverflow item for a much deeper description of the issue and resolutions: GAE transaction failure and idempotency
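One common way to make such a transfer idempotent is to attach a unique transaction ID and record it in the same atomic step as the balance change, so a client-side retry after a timeout cannot apply the transfer twice. A minimal in-memory sketch with hypothetical names (in App Engine you would persist the ID inside the same transaction):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: every transfer carries a unique ID, recorded in the same atomic
// step as the balance change. Retrying after a timeout is then safe: a
// duplicate ID is a no-op even if the first attempt actually committed.
public class IdempotentTransfer {
    private final Map<String, Integer> balances = new ConcurrentHashMap<>(Map.of("A", 100, "B", 0));
    private final Set<String> applied = ConcurrentHashMap.newKeySet();

    public synchronized boolean transfer(String txId, String from, String to, int amount) {
        if (!applied.add(txId)) {
            return false; // already applied: the earlier attempt did commit
        }
        balances.merge(from, -amount, Integer::sum);
        balances.merge(to, amount, Integer::sum);
        return true;
    }

    public int balance(String acct) { return balances.get(acct); }

    public static void main(String[] args) {
        IdempotentTransfer t = new IdempotentTransfer();
        t.transfer("tx-1", "A", "B", 30);
        t.transfer("tx-1", "A", "B", 30); // client retried after a timeout: ignored
        System.out.println(t.balance("A") + " " + t.balance("B")); // 70 30
    }
}
```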
What is the 'Extended Session Antipattern'?
An extended (or long) session (or session-per-conversation) is a session that may live beyond the duration of a transaction, as opposed to transaction-scoped sessions (session-per-request). This is not necessarily an anti-pattern; it is a way to implement long conversations (i.e. conversations with the database that span multiple transactions), which are just another way of designing units of work.
Like anything, I'd just say that long conversations can be misused or wrongly implemented.
Here is how the documentation introduces Long conversations:
12.1.2. Long conversations
The session-per-request pattern is not the only way of designing units of work. Many business processes require a whole series of interactions with the user that are interleaved with database accesses. In web and enterprise applications, it is not acceptable for a database transaction to span a user interaction. Consider the following example:

The first screen of a dialog opens. The data seen by the user has been loaded in a particular Session and database transaction. The user is free to modify the objects.

The user clicks "Save" after 5 minutes and expects their modifications to be made persistent. The user also expects that they were the only person editing this information and that no conflicting modification has occurred.

From the point of view of the user, we call this unit of work a long-running conversation or application transaction. There are many ways to implement this in your application.

A first naive implementation might keep the Session and database transaction open during user think time, with locks held in the database to prevent concurrent modification and to guarantee isolation and atomicity. This is an anti-pattern, since lock contention would not allow the application to scale with the number of concurrent users.

You have to use several database transactions to implement the conversation. In this case, maintaining isolation of business processes becomes the partial responsibility of the application tier. A single conversation usually spans several database transactions. It will be atomic if only one of these database transactions (the last one) stores the updated data. All others simply read data (for example, in a wizard-style dialog spanning several request/response cycles). This is easier to implement than it might sound, especially if you utilize some of Hibernate's features:

Automatic Versioning: Hibernate can perform automatic optimistic concurrency control for you. It can automatically detect if a concurrent modification occurred during user think time. Check for this at the end of the conversation.

Detached Objects: if you decide to use the session-per-request pattern, all loaded instances will be in the detached state during user think time. Hibernate allows you to reattach the objects and persist the modifications. The pattern is called session-per-request-with-detached-objects. Automatic versioning is used to isolate concurrent modifications.

Extended (or Long) Session: the Hibernate Session can be disconnected from the underlying JDBC connection after the database transaction has been committed and reconnected when a new client request occurs. This pattern is known as session-per-conversation and makes even reattachment unnecessary. Automatic versioning is used to isolate concurrent modifications, and the Session will not be allowed to be flushed automatically, but only explicitly.

Both session-per-request-with-detached-objects and session-per-conversation have advantages and disadvantages. These disadvantages are discussed later in this chapter in the context of optimistic concurrency control.
I've added some references below but I suggest reading the whole Chapter 12. Transactions and Concurrency.
References
Hibernate Core Reference Guide
12.1.2. Long conversations
12.3. Optimistic concurrency control