How to synchronize 2 ConcurrentHashMaps? - java

I am trying to write a match maker for buying and selling items. Internally, I am using 2 HashMaps for each item, one for buys and one for sells, i.e. if a user sends me a buy request I put it in my buy HashMap and vice versa. The key is the price, while the value is a queue of orders at that price (so that I can serve requests on a FIFO basis if they have the same price). Once I receive a request, e.g. a buy, I look into the sell HashMap for any matches. Users can change the quantity or price of what they want to buy, but not the item itself (e.g. they can change a bike buying request's price or quantity but cannot change bike to boat).
I would like to make this multithreaded so multiple requests can be handled at the same time. So I made my HashMaps into ConcurrentHashMaps and the queue in the value a ConcurrentLinkedQueue. However, there can still be concurrency issues, e.g. I am looking into the sell map to find a match for my buy request, but while I am making the match that sell request gets amended by the user to, say, a different price.
How can I synchronize the two maps with each other? I would like to lock the same segment (i.e. the queue at that price) in both maps at the same time.

Add a single Map where you keep track of available stock, and do your locking only there.
Note: ConcurrentHashMap also doesn't lock a single entry; it locks a bucket, so a lock may cover more than one element.

I think the best option for you is to use a ReentrantReadWriteLock. It lets reader threads read concurrently as long as no thread is writing to the Map. You can get the same correctness by performing all the operations inside a synchronized block, but that will not scale as well.
For the ReentrantReadWriteLock documentation, see https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html
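As a rough illustration of the read/write lock suggestion, here is a minimal sketch in which one ReentrantReadWriteLock guards both maps, so a match and an amendment can never interleave. The class OrderBook, its method names, and the use of order ids as queue entries are assumptions made for this example, not part of the question. Because every access goes through the lock, plain HashMap and ArrayDeque are enough here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical order book for a single item; all names are illustrative.
class OrderBook {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // price -> FIFO queue of order ids at that price
    private final Map<Long, Deque<String>> buys = new HashMap<>();
    private final Map<Long, Deque<String>> sells = new HashMap<>();

    // Queues a sell order; the write lock covers both maps.
    void addSell(long price, String orderId) {
        lock.writeLock().lock();
        try {
            sells.computeIfAbsent(price, p -> new ArrayDeque<>()).addLast(orderId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Matches against the oldest sell at this price and returns its id,
    // or queues the buy and returns null when there is no such sell.
    String addBuy(long price, String orderId) {
        lock.writeLock().lock();
        try {
            Deque<String> queue = sells.get(price);
            if (queue != null && !queue.isEmpty()) {
                return queue.pollFirst();
            }
            buys.computeIfAbsent(price, p -> new ArrayDeque<>()).addLast(orderId);
            return null;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Moves a sell order to a new price; false if it is no longer open at oldPrice.
    boolean amendSellPrice(long oldPrice, long newPrice, String orderId) {
        lock.writeLock().lock();
        try {
            Deque<String> queue = sells.get(oldPrice);
            if (queue == null || !queue.remove(orderId)) {
                return false;
            }
            sells.computeIfAbsent(newPrice, p -> new ArrayDeque<>()).addLast(orderId);
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read-only query; many readers may hold the read lock at once.
    int openSells(long price) {
        lock.readLock().lock();
        try {
            Deque<String> queue = sells.get(price);
            return queue == null ? 0 : queue.size();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A single lock per item serializes all writes for that item, which is simple but coarse; whether that is acceptable depends on how much traffic one item sees.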

Related

How to verify prices in real time

I've been racking my brain for a few days and cannot seem to reach a creative solution.
I'm building a price comparison app.
I have about 50 stores, each holding the same products, for example: milk, bread, meat.
A user enters my application, selects a category, for example milk, and once he has chosen a product, I have to go through all the shops and check where the cheapest milk is.
My question is how to check the prices in the most efficient way.
I thought of the following idea:
Hold all the stores in an array, hold all the products in a JSON file, and give each product an ID.
Once the user selects a product, loop over the array of stores.
The question is how to pull the prices from them, i.e. how to do the parsing.
Ideas?
This is an example site that illustrates exactly what I want to do.
https://www.zapmarket.co.il/fruits_vegetables
Translate it to English =]
Instead of holding the stores in an array, hold a product class with all the attributes needed. Then you can sort by price and get the store.
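One possible reading of this suggestion, as a small sketch: each object describes one product at one store, and the cheapest store is found by comparing prices. The names Offer and PriceFinder and the choice to store prices in cents are assumptions for illustration only.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical offer: the price of one product at one store.
class Offer {
    final String productId;
    final String store;
    final long priceInCents;

    Offer(String productId, String store, long priceInCents) {
        this.productId = productId;
        this.store = store;
        this.priceInCents = priceInCents;
    }
}

class PriceFinder {
    // Returns the cheapest offer for the product, or empty if no store lists it.
    static Optional<Offer> cheapest(List<Offer> offers, String productId) {
        return offers.stream()
                .filter(o -> o.productId.equals(productId))
                .min(Comparator.comparingLong((Offer o) -> o.priceInCents));
    }
}
```

Taking the minimum avoids a full sort when only the cheapest store is needed; sorting the filtered offers would give the complete ranking instead.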

last item in the cart

I am learning microservices and trying to design an e-commerce website. I can't figure out how big shopping sites take care of the last-item-in-the-cart problem.
For example, I selected an item from Amazon which had just a single unit available in stock. I logged in from two different accounts and placed the item in the cart. I even reached the payment page from both accounts, and the site didn't restrict me anywhere by saying that the item is not available. I am not sure how Amazon handles it after the payment page, when payment from both accounts is in progress.
A few solutions come to my mind:
Accept payment from both accounts and later cancel the transaction for whichever paid later. This is not good practice, though, as it results in a bad customer experience.
Keep a few items in reserve and use them in case of overbooking.
Forget what Amazon is doing and implement quantity checks in the Order service against the Item service via REST calls, at every stage of the order. But these checks can sometimes fail when a lot of people are ordering the same item, e.g. in flash sales.
Please share if you have worked on a similar problem and solved it, even with a few limitations. If I need to add more details, let me know.
I cannot answer how Amazon does it, nor do I think anyone could on a public forum, but I can tell you how I think this could be managed.
You have to take a lock on your inventory if you want to map inventory to orders precisely. If you intend to do that, the question is where you take the lock: when an item gets added to the cart, when the user goes to payment, or when payment is done. The problem with a lock is that it will make your system slow.
So that is something you should avoid.
The remaining options you have already covered in your question, and it boils down to tradeoffs.
With the first option, user experience will suffer and you also incur the cost of the transaction.
The second option asks you to be ready to undersell or oversell.
When you keep reserves, you are basically saying that you will be underselling. This can also backfire: say you decide to reserve 5 items but you get 20 concurrent requests for checkout and payment, you will be back to square one. But it can help in most scenarios, given you are willing to take a hit.
Doing an inventory check at checkout can give you better resolution on inventory, but it will not help when you literally have the last item in inventory and 10 people doing a checkout on it. If the read calls for even two such requests coincide, you will give inventory to both and be back to square one.
So what I do in such scenarios is:
1. My inventory is tracked not as a number but as an enum, i.e. critical, low, med, high, very high.
Depending on some analytics, we configure the inventory check. For high and very high we do not do any check and simply book the item. For critical we take the lock (not exactly a DB lock, but we reserve the inventory for that order). For low and medium we check the inventory and proceed if we have enough. All these values are configurable and help us mitigate the scenarios we have.
Another thing that we are trying is to distribute inventory to inventory brokers and assign each inventory broker to some set of services that see its inventory. Even if we reserve inventory on one broker, the others can continue selling freely. These brokers regularly update the inventory master about the status of inventory. For example, the inventory master has 50 items and distributes 5 each to all ten brokers. After 10 minutes they come back: if they need more inventory they ask for it, and if they have leftovers (in case of failure) they return inventory to the master for it to be assigned to others.
The above approach will not resolve the issue precisely, but it gives you a certain degree of freedom in how you manage the inventory.
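The level-based check described above could be sketched roughly as follows. The thresholds (5, 20, 100, 1000) and the names StockLevel, CheckPolicy and InventoryPolicy are invented for this example; the answer only says that such values are configurable.

```java
// Hypothetical stock bands and the check applied for each band.
enum StockLevel { CRITICAL, LOW, MEDIUM, HIGH, VERY_HIGH }

enum CheckPolicy { RESERVE, CHECK_COUNT, NO_CHECK }

class InventoryPolicy {
    // Maps a unit count to a band; the thresholds are made-up assumptions.
    static StockLevel levelFor(int unitsInStock) {
        if (unitsInStock <= 5) return StockLevel.CRITICAL;
        if (unitsInStock <= 20) return StockLevel.LOW;
        if (unitsInStock <= 100) return StockLevel.MEDIUM;
        if (unitsInStock <= 1000) return StockLevel.HIGH;
        return StockLevel.VERY_HIGH;
    }

    // Critical stock is reserved, low and medium stock is counted,
    // high and very high stock is booked without any check.
    static CheckPolicy policyFor(StockLevel level) {
        switch (level) {
            case CRITICAL:
                return CheckPolicy.RESERVE;
            case LOW:
            case MEDIUM:
                return CheckPolicy.CHECK_COUNT;
            default:
                return CheckPolicy.NO_CHECK;
        }
    }
}
```

The point of the banding is that the expensive path (reservation) only runs for the few products where overselling is a realistic risk.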
Consider doing the following:
On the payment page, re-check whether the product is still available. This can be a simple HTTP GET.
If the GET call is too slow for you, consider caching the products recently added by users in an in-memory database (e.g. Redis). When the first user successfully completes the payment, decrease the counter for that product ID in Redis. Before proceeding with payment for the second user, check the counter for that product ID in Redis.
(BONUS: Redis offers atomic operations, so you can also handle the race condition in ordering the product.)
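To show why an atomic counter closes the race, here is a sketch that uses java.util.concurrent.atomic.AtomicInteger as an in-process stand-in for the shared counter. It is not a Redis client, and StockCounters and its methods are hypothetical names; with Redis the same idea would rely on its atomic decrement commands across servers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-process stand-in for a shared atomic counter store; not a Redis client.
class StockCounters {
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    void setStock(String productId, int units) {
        counters.put(productId, new AtomicInteger(units));
    }

    // Atomically takes one unit; returns false when none is left.
    // The check and the decrement happen as one compare-and-set step,
    // so two callers can never both take the last unit.
    boolean tryTakeOne(String productId) {
        AtomicInteger counter = counters.get(productId);
        if (counter == null) {
            return false;
        }
        while (true) {
            int current = counter.get();
            if (current <= 0) {
                return false;
            }
            if (counter.compareAndSet(current, current - 1)) {
                return true;
            }
        }
    }
}
```

A separate "read the counter, then decrement it" sequence would reintroduce the race; the decision has to be part of the atomic operation itself.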

Need advice on most effective List to use, and the best practice to generate unique ids to each member

So I've got this school project, and I would really like to approach it with the best practices.
I need to make a list of customers for an insurance company. Each of these shall have a unique customer number, generated in ascending order.
Every customer can have zero to many insurances, also stored in separate lists for each customer. Adding insurances will happen more often than adding customers.
Every customer can also have any number of claims. Every claim also has a unique id number.
If a customer cancels all insurances, all data on this customer will remain as history.
All data need to be stored via one of the file classes in the Java Standard Library. Databases are not allowed.
Actions such as showing of statistics will also be available.
Users of the program will be employees, with rights to edit every data field.
Questions:
What Collection class would be the most effective one to use? LinkedList, ArrayList, HashMap or any other?
What file class would be the best one for saving the lists? ObjectOutputStream?
What is the best method of generating new unique ids for both customers and claims? As private fields in the customer list class? Information on the next unique id has to be restored every time the program exits and restarts.
Edit:
Not looking for help with any code. Just advice on the most common classes to use in a scenario like this.
What Collection class would be the most effective one to use?
LinkedList, ArrayList, Hashmap or any other?
Ans - LinkedList and ArrayList are types of List. HashMap is a type of Map.
Which implementation of List to use depends on your requirements. If you frequently insert and remove elements at different points of a List, a LinkedList can make more sense: once you are positioned at an element (for example through an iterator), removing it does not require shifting the elements after it. Otherwise prefer ArrayList.
What is the best method of generating new unique ids for both customers and claims? As private fields in the customer list class? Information on the next unique id has to be restored every time the program exits and restarts.
You may want to use a Singleton to generate IDs, and also persist them to a file.
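A minimal sketch of such a generator, assuming the next id is kept in a small text file (the class name IdGenerator and the file layout are assumptions, not something the question prescribes). One shared instance per id type would play the role of the Singleton.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical generator; the file always holds the next id to hand out.
class IdGenerator {
    private final Path file;
    private long nextId;

    IdGenerator(Path file) {
        this.file = file;
        try {
            if (Files.exists(file)) {
                // Restore the counter saved by a previous run.
                String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
                nextId = Long.parseLong(text);
            } else {
                nextId = 1;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the next id in ascending order and saves the following one to disk.
    synchronized long next() {
        long id = nextId++;
        try {
            Files.write(file, Long.toString(nextId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return id;
    }
}
```

Writing the file on every call is slow but means an id is never reused after a crash; saving only on exit would be faster and less safe.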

Concurrency control for inventory control

There are many questions similar to mine, but I didn't find a satisfactory answer, so I am asking here.
I have a question about concurrency control in an inventory management system.
Say I have products A, B, C with quantities 2, 3, 4, and my application is multi-user.
I have a product page where the user sees the list of products and the available quantities,
and I have a checkout and payment page, which may take some time to reach after the product page.
Now, since it is a multi-user web application, say user 1 has ordered 2 units of product A but the order is not yet placed; user 2 can still see A with a quantity of 2.
Should I temporarily (for a configurable time) lock the 2 units of product A until the order is placed? Is that a good design? If yes, should I lock in Java or in the database?
No, it's not a good design, because it lends itself to abuse and problems. What if a user (a competitor?) locks all your products for a whole month? What if another person fills up a shopping cart with products, then decides it's too much money and just turns off the device?
Best alternatives are:
1) Tell the user how much of each product is available, but also tell them it could be gone if an order is not placed soon. This should also incentivize sales. Lock/unlock your products only after the payment page, i.e. when there is a business commitment to the operation. If the available quantities are no longer sufficient, send the user back to a previous page, updating the amounts to whatever is available.
2) Similar to 1), but you could also update availability from time to time, or send warnings like "the stock of some items in your cart is running low". Again, this could also prompt sales.
3) Reserve ("lock") items as they are moved to a shopping cart, but not forever. Release the items after they have been locked for a certain amount of time. Keep the user informed. The time-out can be per shopping cart or per item.
It is very important to notice that any "lock" mentioned above is a "business lock/reservation". It doesn't need to be implemented in the form of locks or any other particular technical solution. For example, 3) above can be implemented by adding the fields locked_by and locked_until. While you check/update/manipulate these fields you will probably need to do it within a "technical lock". After the checks/updates are done, the technical lock is released. However, the "business lock" could still be in place because locked_until has not elapsed yet, and any other code will check this field to decide whether the product is available. Why? Because business rules mandate it (not because there is a technical lock in place, which in fact there is not).
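The distinction can be sketched in a few lines, assuming the locked_by and locked_until fields live on an in-memory object rather than a database row. The class ProductReservation and its methods are hypothetical. The synchronized methods are the short technical lock; the stored expiry is the business lock.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical product row carrying the locked_by / locked_until fields.
class ProductReservation {
    private String lockedBy;      // null while nobody holds a reservation
    private Instant lockedUntil;  // expiry of the current reservation

    // Held only for the duration of this method (technical lock); the
    // reservation it records outlives the call (business lock).
    synchronized boolean tryReserve(String userId, Instant now, Duration holdTime) {
        boolean free = lockedBy == null || !now.isBefore(lockedUntil);
        if (!free && !lockedBy.equals(userId)) {
            return false;
        }
        lockedBy = userId;
        lockedUntil = now.plus(holdTime);
        return true;
    }

    // Other code consults the fields instead of waiting on any lock.
    synchronized boolean isAvailableTo(String userId, Instant now) {
        return lockedBy == null || !now.isBefore(lockedUntil) || lockedBy.equals(userId);
    }
}
```

An expired reservation needs no cleanup job in this sketch: it is simply treated as free the next time someone checks it.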
"Technical locks" should be very quick. "Business locks" can be much longer (but never forever; always define a time limit for them).
It's hard to tell whether you should lock "in Java" or "in the database" with the very little information you give. Are you using Entity Beans, for example? What are your fault tolerance requirements? Etc.
That said, in the general case it's probably better to keep locks (as long as they are "business locks") in the database, for these main reasons:
Persistence (in case of a power failure). You should also provide a mechanism to recover shopping carts. Perhaps also storing them in the DB?
Ability to interface with other environments (e.g. a corporate ERP or fulfillment system).
Should I temporarily(configurable time) lock the 2 quantities of product A until order is placed? Is it a good design.
It depends on the conversion rate of your site, i.e. the share of checkouts that end in a payment. If you have a high conversion rate, you can pre-lock the quantities to get a better user experience.
If yes, should I lock in java or in the database?
You need a global lock to guarantee the correctness. If you have multiple application servers, you have to put the lock in the database.
Inventory management should be handled by a custom system built from the ground up. You cannot simply rely on ACID compliance, for performance reasons. Wrapping inventory in transactions for the duration of the order is a very bad idea and does not scale. I propose the following solution.
An inventory management backend app that updates the inventory as new items come in. Use a row lock to update the inventory.
An inventory management microservice that gives inventory to the order management system as it needs it to finish an order, and keeps track of a timeout.
a. If the order finishes with an acknowledgement, the given inventory has already been deducted by the inventory service. We are good.
b. If the order is not finished and no acknowledgement is received, the given inventory is added back to the inventory system after the timeout (typically 2 - 3 minutes).
c. If the order has failed and an acknowledgement of the failure is received, the given inventory is added back to the inventory system.
So, you don't lock the product rows until the order is finished. Instead, you do a soft lock on the inventory and release it if the order was not successful or failed due to an exception/app failure.
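The three cases a, b and c can be sketched with a small in-memory class for a single product. Everything here (SoftLockInventory, its method names, passing the clock in as milliseconds) is an assumption for illustration; a real service would persist this state and run the expiry check on a schedule.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical in-memory inventory service for a single product.
class SoftLockInventory {
    private int available;
    // orderId -> {quantity, expiryMillis} for stock given out but not yet acknowledged
    private final Map<String, long[]> pending = new HashMap<>();

    SoftLockInventory(int initialStock) {
        this.available = initialStock;
    }

    // Gives stock to an order, deducting it right away, and starts its timeout.
    synchronized boolean give(String orderId, int quantity, long nowMillis, long timeoutMillis) {
        releaseExpired(nowMillis);
        if (quantity > available || pending.containsKey(orderId)) {
            return false;
        }
        available -= quantity;
        pending.put(orderId, new long[] {quantity, nowMillis + timeoutMillis});
        return true;
    }

    // Case a: success acknowledged, the deduction becomes final.
    synchronized void acknowledgeSuccess(String orderId) {
        pending.remove(orderId);
    }

    // Case c: failure acknowledged, the stock goes back immediately.
    synchronized void acknowledgeFailure(String orderId) {
        long[] entry = pending.remove(orderId);
        if (entry != null) {
            available += (int) entry[0];
        }
    }

    // Case b: no acknowledgement before the timeout, the stock goes back.
    synchronized void releaseExpired(long nowMillis) {
        Iterator<long[]> it = pending.values().iterator();
        while (it.hasNext()) {
            long[] entry = it.next();
            if (entry[1] <= nowMillis) {
                available += (int) entry[0];
                it.remove();
            }
        }
    }

    synchronized int available() {
        return available;
    }
}
```

The synchronized methods are held only for a map update, so no order ever waits on another order's payment.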

Choosing data structures to sort TOP 10 items out of a zillion items based on user ratings

Let's say you're running a movie database website like IMDb/Netflix and users rate each movie from 1 to 10 stars. When a user rates a movie, I get the id (long) and a rating from 1 to 10 in the request. The Movie class looks like this.
class Movie
{
    long id;
    String name;
    double avgRating; // avg rating of this movie
    long numberOfRatings; // how many times this movie was rated
}
public void updateRating(long movieId, int rating)
{
    // code to update movie rating and update top 10 movies to show on page
}
My question is which data structures I can choose to keep a huge amount of movie data in memory so that on each updateRating call I update the movie's rating as well as the top 10 movies shown on the webpage, so users always see the latest top 10. I have a lot of space on the web server and can keep all the movie objects in memory. The challenges here are:
1) Look up a movie by id.
2) Update the movie's rating.
3) Choose the new location of this movie in the sorted collection of movies (sorted by rating),
and if its new position is within the top 10, show it on the web page.
All these operations should be done in optimal time.
This is not homework but a general programming and data structure question.
I'd personally use a relational database for this.
Make a Movie table with an ID, and Name field, using the ID as the primary key (clustered)
Make a Rating table with an ID, UserId, MovieId, and Rating field. Use the obvious foreign key references.
Use an ORM to construct your Movie object based on a query across these tables.
But I suppose if you're looking at it purely from a data structures and algorithms standpoint, I'd begin by changing your Movie class to have a running ratingSum field, so that you can calculate the average on the fly. Then I'd create a list that maxes out at ten objects. Any time a rating is added, I would check to see if the new average for that movie is higher than the least of the items in the "top 10" list. If it is, then I'd insert it into the appropriate place in that list and drop the last item off the bottom of the list. Obviously, if it's already in the list then you only need to worry about reordering the existing items rather than removing one. This is a simple approach that would only have a tiny cost with each ratings update.
(A Linked List would probably give you the best performance for your "top 10" list, but with only 10 items that only get rearranged a few times a week at most, you probably wouldn't notice a difference.)
Obviously, you'll have to have all of the movies in a collection with quick lookup times (like a Hashtable) in order to find them by ID. Of course, with a zillion items, you're going to be hard pressed to fit all this into memory. Hence the Relational Database.
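The in-memory variant described in this answer (a running rating sum, a lookup table by id, and a list capped at ten entries) might look roughly like this. RatedMovie and TopTenList are hypothetical names. Note one limitation the sketch shares with the description: if a listed movie's average drops, it stays listed until another rated movie overtakes it, because unlisted movies are only considered when they receive a new rating.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical movie record that keeps a running sum instead of an average.
class RatedMovie {
    final long id;
    long ratingSum;
    long numberOfRatings;

    RatedMovie(long id) {
        this.id = id;
    }

    double avgRating() {
        return numberOfRatings == 0 ? 0.0 : (double) ratingSum / numberOfRatings;
    }
}

class TopTenList {
    private static final int LIMIT = 10;
    private final Map<Long, RatedMovie> byId = new HashMap<>();
    private final List<RatedMovie> top = new ArrayList<>(); // best average first

    synchronized void updateRating(long movieId, int rating) {
        RatedMovie movie = byId.computeIfAbsent(movieId, id -> new RatedMovie(id));
        movie.ratingSum += rating;
        movie.numberOfRatings++;
        if (!top.contains(movie)) {
            boolean hasRoom = top.size() < LIMIT;
            if (!hasRoom && movie.avgRating() <= top.get(top.size() - 1).avgRating()) {
                return; // still not above the current number ten
            }
            top.add(movie);
        }
        // At most eleven entries, so re-sorting is cheap.
        top.sort(Comparator.comparingDouble(RatedMovie::avgRating).reversed());
        if (top.size() > LIMIT) {
            top.remove(top.size() - 1);
        }
    }

    synchronized List<Long> topIds() {
        List<Long> ids = new ArrayList<>();
        for (RatedMovie movie : top) {
            ids.add(movie.id);
        }
        return ids;
    }
}
```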
It seems like there are two parallel structures here. First, you need a lookup table that can map from IDs to movies. Second, you need to maintain some sort of priority queue that can be used to track the top ten movies overall.
One way to solve this problem would be to simply maintain these two structures concurrently. Since you know that each movie has an integral ID, you could either store the movies in a giant array or, if you expect the IDs to be sparse, in a hash table. Additionally, you could maintain a priority queue (perhaps backed by a binary or binomial heap) that stores all movies with priority equal to their rating. This would allow you to determine the top ten movies by dequeuing ten elements from the priority queue and then reinserting them.
However, to squeeze more performance out of your priority queue, I'd suggest using a slightly modified queue structure in which you have an array of the top ten movies in sorted order and a priority queue of all other movies that are not in the top ten. Whenever you update the priority of a movie, you could do the following:
If the movie is in the top-ten array, remove it from that array and shuffle the elements after it up one spot. Then insert it into the priority queue with its new rating.
Otherwise, use the priority queue's change-key operation (decrease-key or increase-key, depending on the direction of the change) to update its key. If the rating is now higher than that of the tenth-most popular movie in the top ten list, remove that tenth movie from the top ten list and insert it into the priority queue. Otherwise, we are done.
(At this point, the element is now in the priority queue at its proper location, and the top ten movies array has nine elements in it)
Use the priority queue's dequeue-max function to extract the most popular movie from the priority queue, then use a simple insertion sort to insert it into the array of the top ten most popular movies.
The overall time complexity for this approach (assuming you use a binary or binomial heap) is O(k^2 + lg n), where k is the number of elements in the top-ten list and n is the total number of movies. On average, it runs in O(lg n) time, since chances are you don't need to update the top ten list. In either case, since k is small (ten), I'd assume that this would work very quickly. Moreover, it gives you O(1) lookup for any of the top k movies, which I expect will be a pretty common operation.
Hope this helps!
If you need to access the entire data set in sorted order, I would suggest using a sorted tree and comparing your items by rating.
If, however, you only need to view the top ten, you could use a sorted deque: every time you update an item's rating, add it to the deque and immediately trim it to no more than 10 items (unless you use a bounded implementation, in which case that is done for you).
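For the sorted-tree variant, one option is java.util.TreeSet with a comparator on the average rating. This is a sketch under assumptions: ScoredMovie and RankedMovies are made-up names, and ties are broken by id so that two different movies never compare as equal. An element has to be removed before its rating changes and re-added afterwards, otherwise the tree can no longer find it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical movie record; only rated movies are ever placed in the tree.
class ScoredMovie {
    final long id;
    long ratingSum;
    long numberOfRatings;

    ScoredMovie(long id) {
        this.id = id;
    }

    double average() {
        return (double) ratingSum / numberOfRatings;
    }
}

class RankedMovies {
    private final Map<Long, ScoredMovie> byId = new HashMap<>();
    // Highest average first; ties broken by id.
    private final TreeSet<ScoredMovie> ranked = new TreeSet<>(
            Comparator.comparingDouble(ScoredMovie::average).reversed()
                    .thenComparingLong(m -> m.id));

    synchronized void updateRating(long movieId, int rating) {
        ScoredMovie movie = byId.get(movieId);
        if (movie == null) {
            movie = new ScoredMovie(movieId);
            byId.put(movieId, movie);
        } else {
            ranked.remove(movie); // must happen before the sort key changes
        }
        movie.ratingSum += rating;
        movie.numberOfRatings++;
        ranked.add(movie);
    }

    // Returns the ids of the best-rated movies, best first.
    synchronized List<Long> top(int count) {
        List<Long> ids = new ArrayList<>();
        for (ScoredMovie movie : ranked) {
            if (ids.size() == count) {
                break;
            }
            ids.add(movie.id);
        }
        return ids;
    }
}
```

Each update costs O(log n) for the remove and re-add, and reading the top ten only walks the first ten tree entries.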
To populate the top 10 list initially you'll have to make a pass over all the data. However, after that you could keep the rating of the #10 movie and, each time a vote is cast, update the top 10 only if the updated movie's rating is greater than or equal to the rating of #10. Anything less than that average rating would not affect the top 10.
Also, I'd store the data in a relational database as has already been suggested, and keep only the top 10 in memory.
