I've read about the Semaphore class there, and now I'd like to understand how I can use that class in real code. What is the usefulness of semaphores? My understanding is that we could use semaphores to improve performance by limiting concurrent access to a resource. Is that the main usage of Semaphore?
tl;dr answer: Semaphores let you limit access to some code path to a certain number of threads, without controlling the mechanism that handles those threads. A sample use case is a web service that offers some resource-intensive task: using a Semaphore, you can limit that task to, say, 5 threads, while the app server's larger thread pool handles both this and other types of requests.
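To make that concrete, here is a minimal sketch; the class and the runExpensiveTask method are invented for illustration:

```java
import java.util.concurrent.Semaphore;

public class ExpensiveTaskService {

    // At most 5 threads may run the expensive task at once,
    // no matter how large the app server's thread pool is.
    private final Semaphore permits = new Semaphore(5);

    public String handleRequest() throws InterruptedException {
        permits.acquire();               // blocks if 5 threads are already inside
        try {
            return runExpensiveTask();   // the resource-intensive work
        } finally {
            permits.release();           // always give the permit back
        }
    }

    private String runExpensiveTask() {
        // placeholder for the real work (report generation, image rendering, ...)
        return "done";
    }
}
```

The threads themselves still come from (and are scheduled by) the server's pool; the semaphore only gates how many of them may be inside handleRequest at the same time.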
Long answer: See Furkan Omay's comment.
Semaphores are part of the java.util.concurrent package in Java, and as the package name suggests, they are used to control the flow of concurrent access to your code. Unlike synchronized and Lock, a semaphore lets you open a section of code to a specific number of threads at a time.
Think of it as a bouncer at a pub who decides whether people can enter or not. If the pub is full and can't take any more people, he makes them wait until someone leaves. A Semaphore lets you do the same thing to your code!
Hope it helps!!
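To put the bouncer analogy into code, a small sketch (the Pub class and its capacity of 50 are made up):

```java
import java.util.concurrent.Semaphore;

public class Pub {

    // The "bouncer": at most 50 guests inside at any time.
    private final Semaphore bouncer = new Semaphore(50);

    public void enter() throws InterruptedException {
        bouncer.acquire();            // wait at the door until someone leaves
    }

    public boolean tryToEnter() {
        return bouncer.tryAcquire();  // or be turned away immediately if the pub is full
    }

    public void leave() {
        bouncer.release();            // free a spot for the next guest
    }
}
```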
Related
Imagine a situation where multiple processes try to use a shared resource.
You can protect it by using a Java monitor (for example, synchronized methods).
But what if your classes must obey this protocol:
request method - critical section - end method
Only one process at a time can execute the request and end methods, thanks to the synchronized blocks, but what about the core of the critical section?
Using other constructs such as Semaphore or Lock/Condition you can achieve this easily, but with the native monitor you are bound to the fact that synchronization is tied to a block, which cannot span multiple methods.
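For example, a Semaphore can be acquired in request and released in end, so the critical section can span method boundaries; a minimal sketch with invented names:

```java
import java.util.concurrent.Semaphore;

public class SharedResource {

    // A binary semaphore acquired in one method and released in another,
    // which a synchronized block cannot do.
    private final Semaphore busy = new Semaphore(1);

    public void request() throws InterruptedException {
        busy.acquire();   // blocks while another process is in its critical section
    }

    public void end() {
        busy.release();   // opens the resource to the next caller
    }
}
```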
If you instead use a boolean flag that tells you whether the resource is busy (calling wait() right after checking it), deadlock can occur!
So, what could be a good solution for this?
Imagine a situation where...
There's a name for that: it's a long transaction, and if you think you need to implement it, that's a sign it may be time to rethink your design.
Why it's bad, and how to avoid it is a book-level topic.
Here's one book that covers it pretty well:
https://www.amazon.com/Patterns-Enterprise-Application-Architecture-Martin/dp/0321127420
In my web application, we do some socket work in our servlet, and we log the socket data to a database.
I want to make that logging process asynchronous to improve performance.
My idea is to use a separate dedicated thread to do the logging job. In my servlet, I just submit data to a cache and let the logging thread process it one item at a time.
I have little experience with threading. What collection can I use as the cache? What's the basic code pattern to implement this? Please provide some code showing how to achieve that.
sorry for my poor English
My application is a legacy system running in a production environment. It only uses servlets and JSP, no other Java EE technology, so adding JMS support seems too expensive for me.
A queue and a thread pool should be good. Publish your messages on a queue and let worker threads pick messages from the queue and save them to the database. Depending on your requirements/load, you may tune your queue and thread pool sizes.
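A minimal sketch of that idea, using a single worker for simplicity (a pool of workers would loop over the same queue); the saveToDatabase method is a placeholder for your JDBC code:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncDbLogger {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    public AsyncDbLogger() {
        Thread worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    String entry = queue.take();   // blocks until a message arrives
                    saveToDatabase(entry);         // the slow work happens here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Called from the servlet thread; returns immediately.
    public void log(String entry) {
        queue.offer(entry);   // drops the entry if the queue is full
    }

    private void saveToDatabase(String entry) {
        // placeholder for the real JDBC insert
    }
}
```

The servlet thread only pays the cost of an offer() call; the slow database insert happens on the dedicated worker.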
If you're looking to output the log to a single file, you could try using a Semaphore (preferably a mutex) on the logger class to prevent simultaneous writes and race conditions. Semaphores are synchronization primitives designed so that the programmer can ensure only a certain number of accesses are made to any one data structure at a time. I won't explain the whole concept here, but Java provides these in the java.util.concurrent.Semaphore class. A mutex (mutual exclusion lock) is a semaphore that only allows one thread to hold it at any given time. Give it a try!
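A minimal sketch of that mutex-style Semaphore around the write (class and method names are invented):

```java
import java.util.concurrent.Semaphore;

public class FileLogger {

    // A binary semaphore used as a mutex: one writer at a time.
    private final Semaphore mutex = new Semaphore(1);

    public void write(String line) throws InterruptedException {
        mutex.acquire();
        try {
            appendToFile(line);   // only one thread can be here at once
        } finally {
            mutex.release();
        }
    }

    private void appendToFile(String line) {
        // placeholder for the actual file I/O
    }
}
```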
If you are looking at using a dedicated thread to handle the logging, you will want to implement a producer/consumer pattern, using a queue to store the information objects. The producer/consumer pattern is mostly used to help with thread synchronization and communication. Here is an example of a producer/consumer implementation that might help: http://www.tutorialspoint.com/javaexamples/thread_procon.htm
The other option is to keep a standard logging operation and have thread pool threads do the work. The benefit is that the thread pool handles the scheduling of the threads and when they execute; the downside is that you are not guaranteed FIFO logging, since the thread scheduler can arbitrarily choose which thread in the pool to run next.
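A sketch of that thread pool variant, again with invented names; as noted, the pool does not guarantee that entries are written in submission order:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PooledLogger {

    // A small pool of workers; ordering of the log writes is not guaranteed.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public void log(String entry) {
        pool.submit(() -> saveToDatabase(entry));   // each entry becomes a task
    }

    private void saveToDatabase(String entry) {
        // placeholder for the real persistence code
    }
}
```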
Unless your leader insists on reinventing the wheel, use slf4j with logback's DBAppender behind an AsyncAppender. It's ready out of the box and works like a charm.
You should really read about logback's appenders.
A full example can be found here.
I'm having trouble understanding the synchronized keyword. As far as I know, it is used to make sure that only one thread can access the synchronized method/block at the same time. Then, is there sometimes a reason to make some methods synchronized if only one thread calls them?
If your program is single threaded, there's no need to synchronize methods.
Another case would be that you write a library and indicate that it's not thread safe. The user would then be responsible for handling possible multi-threading use, but you could write it all without synchronization.
If you are sure your class will always be used by a single thread, there is no reason to use synchronized methods. But the reality is that Java is an inherently multithreaded environment; at some point somebody will use multiple threads. Therefore, any class that needs thread safety should have adequately synchronized methods or synchronized blocks to avoid problems.
No, you don't need synchronization if there is only a single thread involved.
Always specify thread-safety policy
Actually, you never know how a class you write will be used by others in the future, so it is always better to state your policy explicitly. That way, if someone later tries to use it in a multithreaded way, they will be aware of the implications.
And the best place to specify the thread-safety policy is the JavaDoc. Always state in the JavaDoc whether the class you are creating is thread safe or not.
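For example, the policy can be stated right at the top of the class (the RequestStats class below is purely illustrative):

```java
/**
 * Accumulates request statistics.
 *
 * <p><b>Thread-safety:</b> this class is NOT thread safe. If instances are
 * shared between threads, callers must provide external synchronization.
 */
public class RequestStats {

    private long count;

    public void increment() {
        count++;
    }

    public long count() {
        return count;
    }
}
```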
When two or more threads need access to a shared resource, they need some way to ensure that the resource will be used by only one thread at a time.
A synchronized method is used to guard a shared resource against concurrent access by multiple threads.
So there is no need to apply synchronization for a single thread.
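As a small illustration, a counter shared between threads needs synchronized so that concurrent increments are not lost; with a single thread the keyword would make no difference:

```java
public class Counter {

    private int value;

    // Only one thread at a time may execute this method on a given Counter,
    // so concurrent increments are not lost.
    public synchronized void increment() {
        value++;
    }

    public synchronized int value() {
        return value;
    }
}
```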
Consider that you are designing a movie ticket selling application. For the sake of visualizing the problem, let's set aside all the technology capabilities that are available these days.
There is only one ticket left for the show, and 5 different counters are selling tickets. Consider that 2 people are trying to buy that last ticket at the counters.
Consider your application workflow to be this:

1. You take in the details of the buyer: his name and his credit card number. (This is the read operation.)
2. Then you find out how many tickets are left for the show. (This is again a read operation.)
3. Then you book the ticket with the credit card. (This is the write operation.)
If this logic isn't synchronized, what would happen?
The details of customer 1 and customer 2 would both be read up to step 2. Both will try to book the ticket, and both tickets would be booked.
If it is modified to be:

1. You take in the details of the buyer: his name and his credit card number. (This is the read operation.)
2. Synchronized:
   - Then you find out how many tickets are left for the show. (This is again a read operation.)
   - Then you book the ticket with the credit card. (This is the write operation.)
There is no chance of overbooking the show due to a thread race condition.
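A minimal sketch of that synchronized version in Java (the class and method names, and the payment call, are invented for illustration):

```java
public class TicketSeller {

    private int ticketsLeft = 1;

    public boolean book(String buyerName, String creditCardNumber) {
        // Step 1: reading the buyer's details needs no synchronization.

        synchronized (this) {
            // Steps 2 and 3 must happen atomically: check availability, then book.
            if (ticketsLeft > 0) {
                chargeCard(creditCardNumber);   // hypothetical payment call
                ticketsLeft--;
                return true;
            }
            return false;   // sold out: the second customer is refused
        }
    }

    private void chargeCard(String creditCardNumber) {
        // placeholder for the real payment processing
    }
}
```

Because the availability check and the booking happen inside one synchronized block, the second customer sees ticketsLeft == 0 and is refused instead of being double-booked.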
Now, consider an example where you know absolutely that there will be one and only one person booking tickets. There is no need for synchronization.
The ticket seller here would be the single thread in the case of your application.
I have tried to put this in very simplistic terms. There are frameworks, and constraints you can put on the database, that avoid such a simple scenario. But the intent of this answer is to explain why thread synchronization exists, not the ways of avoiding it.
The Semaphore class overview in developer.android.com looks pretty good - for those who are already familiar with the concepts and terminology.
I am familiar with some of the acronyms and other jargon there (e.g. FIFO, lock, etc.) but others such as permits, fairness and barging are new to me.
Can you recommend a good online source for explaining these concepts? (I can probably figure out what permits and fairness are but barging is an unknown at this point).
EDIT: After receiving the two answers below, I realized that I need a refresh on semaphores (to re-acquire() terminology). I found the following resources to be useful:
Semaphore_(programming)
Introduction to Semaphores by Dr. Richard S. Hall
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/locks/ReentrantLock.html
http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/Semaphore.html
This is an excerpt from what is considered one of the seminal works on Java concurrency; you should check it out.
http://my.safaribooksonline.com/book/programming/java/0321349601/explicit-locks/287
I hadn't come across these terms myself, but I thought I'd research and summarise my findings, as it's better to answer in-line than to link externally (although, yes, the OP is after recommended reading):
permits are the number of concurrent accesses allowed to the semaphore-protected code. Although semaphores are often simple mutexes, it is sometimes desirable to have more than one thread touching the code. This is similar to phoning a call centre, where one phone number connects to 8 lines/operators.
fairness is when a semaphore is made available to requesters in strict order of who requested first. Staying with the call-centre analogy, this means the on-hold queue is a strict FIFO.
barging is essentially an out-of-band request that puts a thread at the top of the queue for a semaphore. The analogy is preferred customers (or internal calls) going to the top of a call-centre queue rather than waiting their turn.
If neither fairness nor barging is specified, it's within spec to grant access to the most recent requester, depending on the timing of context switches. The phone analogy is a call to a company switchboard/reception: even if calls are on hold waiting to be answered, you may get lucky and ring in between one call ending and another being taken off hold.
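To tie the three terms together in code, a small sketch that reuses the call-centre analogy (the CallCentre class and its 8 lines are made up):

```java
import java.util.concurrent.Semaphore;

public class CallCentre {

    // 8 permits = 8 operators; 'true' requests a fair (FIFO) on-hold queue.
    private final Semaphore lines = new Semaphore(8, true);

    public void normalCall() throws InterruptedException {
        lines.acquire();          // waits in strict arrival order
        try {
            talk();
        } finally {
            lines.release();
        }
    }

    public boolean impatientCall() {
        // tryAcquire() may "barge": it grabs a free permit immediately,
        // even if other threads are already queued waiting.
        if (lines.tryAcquire()) {
            try {
                talk();
            } finally {
                lines.release();
            }
            return true;
        }
        return false;   // hang up instead of waiting
    }

    private void talk() {
        // placeholder for the actual call handling
    }
}
```

Passing true to the constructor is what gives the strict FIFO on-hold queue; the plain tryAcquire() call is the documented exception that may still barge past it.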
Let me know through comments if I've got this wrong, and I'll fix / cw my answer.
How do I determine which parts of my Java code need to be synchronized? Are there any unit testing techniques for this?
Samples of code are welcome.
Code needs to be synchronized when there might be multiple threads that work on the same data at the same time.
Whether code needs to be synchronized is not something that you can discover by unit testing. You must think and design your program carefully when your program is multi-threaded to avoid issues.
A good book on concurrent programming in Java is Java Concurrency in Practice.
If I understand your question correctly, you want to know what you have to synchronise. Unfortunately there isn't boilerplate code that shows you what to synchronise; you should look at methods and instance variables that can be accessed by multiple threads at the same time. If there aren't any, you usually don't need to worry much about synchronisation.
This is a good source for some general information:
http://weblogs.java.net/blog/caroljmcdonald/archive/2009/09/17/some-java-concurrency-tips
When you are in a multithreaded environment in Java and you want to do many things in parallel, I would suggest an approach that uses a concurrent queue (like BlockingQueue or ConcurrentLinkedQueue) and a simple Runnable that has a reference to the queue and pulls 'messages' off it. Use an ExecutorService to manage the tasks. Sort of a (very simplified) actor model.
So choose not to share state as much as possible, because if you do share it, you need to synchronize or use a data structure that supports concurrent access, like ConcurrentHashMap.
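For instance, a sketch (assuming Java 8+) of shared state handed to ConcurrentHashMap instead of being guarded by hand; the WordCounter class is invented for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class WordCounter {

    // ConcurrentHashMap handles the locking internally, so callers
    // need no synchronized blocks of their own.
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Safe to call from many threads at once.
    public void record(String word) {
        counts.merge(word, 1, Integer::sum);   // atomic read-modify-write
    }

    public int countOf(String word) {
        return counts.getOrDefault(word, 0);
    }
}
```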
There's no substitute for thinking about the issues surrounding your code (as the other answers here illustrate). Once you've done that, though, it's worth running FindBugs over your code. It will identify where you've applied synchronisation inconsistently, and is a great help in tracking otherwise hard-to-find bugs.
Lots of nice answers here:
Java synchronization and performance in an aspect
A nice analysis of your problem is available here:
http://portal.acm.org/citation.cfm?id=1370093&dl=GUIDE&coll=GUIDE&CFID=57662261&CFTOKEN=95754288 (requires access to the ACM portal)
Yes, all these folks are right: there is no alternative to thinking. But here is a rule of thumb:
1. If it's a read, perhaps you do not need synchronization.
2. If it's a write, you should consider it.