How do I determine which parts of my Java code need to be synchronized? Are there any unit testing techniques for this?
Samples of code are welcome.
Code needs to be synchronized when there might be multiple threads that work on the same data at the same time.
Whether code needs to be synchronized is not something that you can discover by unit testing. You must think and design your program carefully when your program is multi-threaded to avoid issues.
A good book on concurrent programming in Java is Java Concurrency in Practice.
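As a minimal sketch of "multiple threads working on the same data at the same time": the Counter and CounterDemo classes below are hypothetical names, not from any library. Two threads increment one shared counter; because increment() is synchronized, no update is lost. Without the keyword, count++ (a read, an add and a write) could interleave between threads and the total could come out lower.

```java
class Counter {
    private int count = 0;

    // synchronized makes the read-modify-write of count atomic and visible to other threads
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}

class CounterDemo {
    // Starts two threads that each call increment() perThread times; returns the final count.
    static int runTwoThreads(int perThread) {
        Counter counter = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}
```

Note that a test like this passing does not prove the absence of races in general, which is the point made above about unit testing.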
If I understand your question correctly, you want to know what you have to synchronise. Unfortunately there isn't a boiler plate code to provide that shows you what to synchronise - you should take a look at methods and instance variables that can be accessed by multiple threads at the same time. If there aren't such, you usually don't need to worry about synchronisation too much.
This is a good source for some general information:
http://weblogs.java.net/blog/caroljmcdonald/archive/2009/09/17/some-java-concurrency-tips
When you are in a multithreaded environment in Java and you want to do many things in parallel, I would suggest an approach that uses one of the concurrent Queue implementations (like BlockingQueue or ConcurrentLinkedQueue) and a simple Runnable that has a reference to the queue and pulls 'messages' off the queue. Use an ExecutorService to manage the tasks. Sort of a (very simplified) Actor type of model.
So choose not to share state as much as possible, because if you do, you need to synchronize, or use a data structure that supports concurrent access like the ConcurrentHashMap.
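A rough sketch of that queue-based approach, under some assumptions of mine: a single worker, String messages, and a hypothetical STOP marker to end the worker. The producer and the worker share nothing except the BlockingQueue, so no explicit synchronized block is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class QueueWorkerSketch {
    // Hypothetical "poison pill" that tells the worker to stop.
    private static final String STOP = "__STOP__";

    // Sends the messages through a queue to one worker and returns what the worker handled.
    static List<String> process(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new ArrayList<>(); // written only by the worker
        Runnable worker = () -> {
            try {
                String message;
                while (!(message = queue.take()).equals(STOP)) {
                    handled.add(message.toUpperCase()); // stand-in for real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> done = executor.submit(worker);
            for (String m : messages) {
                queue.put(m);
            }
            queue.put(STOP);
            // get() waits for the worker; its writes to 'handled' are visible afterwards
            done.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
        return handled;
    }
}
```

With more than one worker you would send one STOP marker per worker, and any result collection they share would itself have to be a concurrent one.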
There's no substitute for thinking about the issues surrounding your code (as the other answers here illustrate). Once you've done that, though, it's worth running FindBugs over your code. It will identify where you've applied synchronisation inconsistently, and is a great help in tracking otherwise hard-to-find bugs.
Lots of nice answers here:
Java synchronization and performance in an aspect
A nice analysis of your problem is available here:
http://portal.acm.org/citation.cfm?id=1370093&dl=GUIDE&coll=GUIDE&CFID=57662261&CFTOKEN=95754288 (require access to ACM portal)
Yes, all these folks are right: there is no alternative to thinking. But here is a rule of thumb:
1. If it's a read, perhaps you do not need synchronization.
2. If it's a write, you should consider it.
Using async/await it is possible to code asynchronous functions in an imperative style. This can greatly facilitate asynchronous programming. After it was first introduced in C#, it was adopted by many languages such as JavaScript, Python, and Kotlin.
EA Async is a library that adds async/await like functionality to Java. The library abstracts away the complexity of working with CompletableFutures.
But why has async/await neither been added to Java SE, nor are there any plans to add it in the future?
The short answer is that the designers of Java try to eliminate the need for asynchronous methods instead of facilitating their use.
According to Ron Pressler's talk, asynchronous programming using CompletableFuture causes three main problems:
1. Branching or looping over the results of asynchronous method calls is not possible.
2. Stack traces cannot be used to identify the source of errors, and profiling becomes impossible.
3. It is viral: all methods that make asynchronous calls have to be asynchronous as well, i.e. the synchronous and asynchronous worlds don't mix.
While async/await solves the first problem it can only partially solve the second problem and does not solve the third problem at all (e.g. all methods in C# doing an await have to be marked as async).
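To make the first and third problems concrete, here is a small sketch using only CompletableFuture from the JDK. The priceOf method is a hypothetical asynchronous lookup invented for illustration. In the asynchronous version the loop cannot simply use each result; every step has to be folded into the future with a combinator, and the method itself must return a future. The blocking version reads like ordinary code but ties up the calling thread on every join().

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncLoopSketch {
    // Hypothetical asynchronous lookup; a real one would call a database or remote service.
    static CompletableFuture<Integer> priceOf(String item) {
        return CompletableFuture.supplyAsync(() -> item.length() * 10);
    }

    // Asynchronous style: the running total lives inside a future, and the
    // return type forces every caller to deal with a future as well.
    static CompletableFuture<Integer> totalAsync(List<String> items) {
        CompletableFuture<Integer> total = CompletableFuture.completedFuture(0);
        for (String item : items) {
            total = total.thenCombine(priceOf(item), Integer::sum);
        }
        return total;
    }

    // Blocking style: a plain loop over plain values, but join() blocks the calling thread.
    static int totalBlocking(List<String> items) {
        int total = 0;
        for (String item : items) {
            total += priceOf(item).join();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("ab", "cde");
        System.out.println(totalAsync(items).join()); // prints 50
        System.out.println(totalBlocking(items));     // prints 50
    }
}
```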
But why is asynchronous programming needed at all? Only to prevent the blocking of threads, because threads are expensive. Thus, instead of introducing async/await in Java, the Java designers are working in Project Loom on virtual threads (aka fibers or lightweight threads), which aim to significantly reduce the cost of threads and thus eliminate the need for asynchronous programming. This would also make all three problems above obsolete.
Better late than never!!!
Java is 10+ years late in trying to come up with lighter-weight units of execution that can be executed in parallel. As a side note, Project Loom also aims to expose 'delimited continuations' in Java which, I believe, are nothing more than the good old 'yield' keyword of C# (again, almost 20 years late!!)
Java does recognize the need to solve the bigger problem solved by async/await (or actually by Tasks in C#, which are the big idea; async/await is more of a syntactic sugar: a highly significant improvement, but still not a necessity for solving the actual problem of OS-mapped threads being heavier than desired).
Look at the proposal for project loom here: https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html
and navigate to last section 'Other Approaches'. You will see why Java does not want to introduce async/await.
Having said this, I don't really agree with the reasoning being provided. Neither in this proposal nor in Stephan's answer.
First, let us examine Stephan's answer, point by point:
1. async/await solves point 1 mentioned there. (Stephan also acknowledges this further down the answer.)
2. It is extra work for sure on the part of the framework and tools, but not at all on the part of programmers. Even with async/await, .NET debuggers are pretty good in this respect.
3. This I only partially agree with. The whole purpose of async/await is to elegantly mix the asynchronous world with synchronous constructs. But yes, you either need to declare the caller as async too or deal directly with Task in the calling routine. However, Project Loom will not solve it in a meaningful way either. To fully benefit from lightweight virtual threads, even the calling routine must be executing on a virtual thread. Otherwise, what's the benefit? You will end up blocking an OS-backed thread!!! Hence even virtual threads need to be 'viral' in the code. On the contrary, in Java it will be easier to not notice that the routine you are calling is async and will block the calling thread (which is a concern if the calling routine is not itself executing on a virtual thread). The async keyword in C# makes the intent very clear and forces you to decide (in C# it is also possible to block, if you want, by asking for Task.Result; most of the time the calling routine can just as easily be async itself).
Stephan is right when he says async programming is needed to prevent blocking of (OS) threads, as (OS) threads are expensive. And that's precisely the reason why virtual threads (or C# tasks) are needed. You should be able to 'block' on these tasks without losing sleep. Of course, to not lose sleep, either the calling routine should itself be a task, or the blocking should be on non-blocking IO, with the framework being smart enough to not block the calling thread in that case (the power of continuations).
C# supports this and proposed Java feature aims to support this.
According to the proposed Java API, blocking on a virtual thread will require calling the vThread.join() method in Java.
How is it really more beneficial than calling await workDoneByVThread()?
Now let us look at project loom proposal reasoning
Continuations and fibers dominate async/await in the sense that async/await is easily implemented with continuations (in fact, it can be implemented with a weak form of delimited continuations known as stackless continuations, that don't capture an entire call-stack but only the local context of a single subroutine), but not vice-versa
I simply don't understand this statement. If someone does, please let me know in the comments.
For me, async/await is implemented using continuations, and as far as stack traces are concerned, since the fibers/virtual threads/tasks live within the virtual machine, it must be possible to manage that aspect. In fact, .NET tools do manage that.
While async/await makes code simpler and gives it the appearance of normal, sequential code, like asynchronous code it still requires significant changes to existing code, explicit support in libraries, and does not interoperate well with synchronous code
I have already covered this. Not making significant changes to existing code and no explicit support in libraries will actually mean not using this feature effectively. Until and unless Java is aiming to transparently transform all the threads to virtual threads, which it can't and isn't, this statement does not make sense to me.
As a core idea, I find no real difference between Java virtual threads and C# tasks. To the point that project loom is also aiming for work-stealing scheduler as default, same as the scheduler used by .Net by default (https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskscheduler?view=net-5.0, scroll to last remarks section ).
Only debate it seems is on what syntax should be adopted to consume these.
C# adopted
A distinct class and interface as compared to existing threads
Very helpful syntactical sugar for marrying async with sync
Java is aiming for:
Same familiar interface of Java Thread
No special constructs apart from try-with-resources support for ExecutorService so that the result for submitted tasks/virtual threads can be automatically waited for (thus blocking the calling thread, virtual/non-virtual).
IMHO, Java's choices are worse than those of C#. Having a separate interface and class actually makes it very clear that the behavior is a lot different. Retaining same old interface can lead to subtle bugs when a programmer does not realize that she is now dealing with something different or when a library implementation changes to take advantage of the new constructs but ends up blocking the calling (non-virtual) thread.
Also no special language syntax means that reading async code will remain difficult to understand and reason about (I don't know why Java thinks programmers are in love with Java's Thread syntax and they will be thrilled to know that instead of writing sync looking code they will be using the lovely Thread class)
Heck, even Javascript now has async await (with all its 'single-threadedness').
I have released a new project, JAsync, which implements the async/await style in Java and uses Reactor as its low-level framework. It is in the alpha stage, and I need more suggestions and test cases.
This project makes the developer's asynchronous programming experience as close as possible to the usual synchronous programming, including both coding and debugging.
I think my project solves point 1 mentioned by Stephan.
Here is an example:
@RestController
@RequestMapping("/employees")
public class MyRestController {
    @Inject
    private EmployeeRepository employeeRepository;
    @Inject
    private SalaryRepository salaryRepository;

    // The standard JAsync async method must be annotated with the Async annotation, and return a JPromise object.
    @Async()
    private JPromise<Double> _getEmployeeTotalSalaryByDepartment(String department) {
        double money = 0.0;
        // A Mono object can be transformed to the JPromise object. So we get a Mono object first.
        Mono<List<Employee>> empsMono = employeeRepository.findEmployeeByDepartment(department);
        // Transformed the Mono object to the JPromise object.
        JPromise<List<Employee>> empsPromise = Promises.from(empsMono);
        // Use await just like es and c# to get the value of the JPromise without blocking the current thread.
        for (Employee employee : empsPromise.await()) {
            // The method findSalaryByEmployee also return a Mono object. We transform it to the JPromise just like above. And then await to get the result.
            Salary salary = Promises.from(salaryRepository.findSalaryByEmployee(employee.id)).await();
            money += salary.total;
        }
        // The async method must return a JPromise object, so we use just method to wrap the result to a JPromise.
        return JAsync.just(money);
    }

    // This is a normal webflux method.
    @GetMapping("/{department}/salary")
    public Mono<Double> getEmployeeTotalSalaryByDepartment(@PathVariable String department) {
        // Use unwrap method to transform the JPromise object back to the Mono object.
        return _getEmployeeTotalSalaryByDepartment(department).unwrap(Mono.class);
    }
}
In addition to coding, JAsync also greatly improves the debugging experience of async code.
When debugging, you can see all variables in the monitor window just like when debugging normal code. I will try my best to solve point 2 mentioned by Stephan.
For point 3, I don't think it is a big problem. Async/await is popular in C# and ES even though it does not address that point.
I'm reading J. Bloch's Effective Java, and I'm now at the section about executors. He says that we should prefer executors to using Threads directly. As far as I understand, the primary reason for that is:
The key abstraction is no longer Thread, which served as both the unit of work and the mechanism for executing it. Now the unit of work and mechanism are separate. The key abstraction is the unit of work, which is called a task.
It's not quite clear what the unit of work means here. I tried to search for it and found that there's a design pattern of that name related to database operations. But how does it tie in with Threads? Or is there another interpretation of this pattern?
It's purposefully nebulous: it's just "a thing you want done," and the more precise meaning is up to you.
If you want to download a file, that's a unit of work. If you want to compute a hash of some big chunk of data, that's a unit of work. If you want to draw something on the screen, that's a unit of work.
What this blurb is getting at is that this unit of work used to be tied directly to a Java thread (via the Thread class), which is in turn tied relatively directly to the OS's threads (some hand-waving there :) ). A more modern approach is to define the work as a task, and then give a bunch of tasks to a Thread whose life cycle is longer than any of those tasks. That thread then executes those tasks. This lets you more explicitly manage Thread resources, which are relatively heavyweight.
A rough analogy would be to hire a new employee for every task you want done (write this spec, or make some coffee, or fix this bug) vs hiring just a few employees and giving them small tasks as needed.
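A minimal sketch of that separation, with names of my own choosing (TaskSketch, squares): each Callable is a unit of work, and a fixed pool of two threads is the mechanism that executes however many tasks it is given.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TaskSketch {
    // Squares each input as a separate task on a shared pool of two long-lived threads.
    static List<Integer> squares(List<Integer> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Integer n : inputs) {
                tasks.add(() -> n * n); // the unit of work: no Thread in sight
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll returns the futures in the same order as the tasks
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In the analogy above, the two pool threads are the few employees and the Callables are the small jobs handed to them.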
This question is similar to this SO post (q.v. Jon Skeet's highly upvoted answer), but I will give an answer anyway. There are two ways to create a thread in Java. The first way is to extend the Thread class directly, and the second is to implement the Runnable interface. These two options are what Bloch is generally arguing over.
If you choose to extend the Thread class directly, then you are not free to extend any other class (since Java does not allow multiple inheritance of classes). This is a design limitation. On the other hand, if you implement the Runnable interface, then you are still free to extend any other class you wish.
Philosophically, using the interface is generally considered better design, because it simply marks a class as something that can be run by a thread, without explicitly saying how it will be run.
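A small sketch of the inheritance point; BaseJob and CountingJob are hypothetical classes made up for the illustration. Because CountingJob implements Runnable rather than extending Thread, it can still inherit from BaseJob, and the code that decides how to run it (here a plain Thread) stays separate.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical base class, present only to show that the superclass slot is still free.
class BaseJob {
    String name() {
        return "job";
    }
}

// Implementing Runnable describes what to run, not how it is run.
class CountingJob extends BaseJob implements Runnable {
    final AtomicInteger runs = new AtomicInteger();

    @Override
    public void run() {
        runs.incrementAndGet();
    }
}

class RunnableSketch {
    // Runs the job once on a fresh thread and returns how many times run() executed.
    static int runOnce() {
        CountingJob job = new CountingJob();
        Thread thread = new Thread(job, job.name());
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return job.runs.get();
    }
}
```

The same CountingJob could equally be handed to an ExecutorService without any change to the class.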
The simplest (and so not quite correct) approach is to understand the unit of work as execution of a method (procedure, function). Anyway, unit of work is always represented as a method.
I've read about the Semaphore class there, and now I'd like to understand how I can use it in real code. What are semaphores useful for? I understand that we can use semaphores to improve performance by limiting concurrent access to a resource. Is that the main use of Semaphore?
tl;dr Answer: Semaphores let you limit access to some code path to a certain number of threads, without controlling the mechanism handling those threads. A sample use case is a web service that offers some resource-intensive task: using a Semaphore, you can limit that task to, e.g., 5 threads while using the app server's larger thread pool to handle both this and some other types of requests.
Long answer: See Furkan Omay's comment.
Semaphores are part of the concurrency package in Java (java.util.concurrent). As the package name suggests, they are used to control concurrent access to code. Unlike 'synchronized' and 'Lock', which typically admit one thread at a time, a semaphore lets you limit access to your code to a certain number of threads.
Think of it as a bouncer at a pub who decides whether people can enter. If the pub is full and can't take any more people, he stops admitting them until someone leaves. Semaphores can be used to do something like this in your code!!
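The bouncer can be sketched roughly as follows. The class name, the pool size and the 5 ms sleep are assumptions for illustration only. A pool of several threads runs the jobs, but the Semaphore admits only a fixed number of them into the guarded section at any moment; the method reports the highest number it ever saw inside.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SemaphoreSketch {
    // Runs 'tasks' jobs on 'poolSize' threads while allowing at most 'permits' of them
    // inside the guarded section at once. Returns the highest concurrency observed there.
    static int maxConcurrent(int permits, int poolSize, int tasks) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    semaphore.acquire(); // wait at the door until a permit is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inside.incrementAndGet();
                    max.accumulateAndGet(now, Math::max);
                    Thread.sleep(5); // stand-in for the resource-intensive work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inside.decrementAndGet();
                    semaphore.release(); // leave, so the next thread may enter
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return max.get();
    }
}
```

Calling maxConcurrent(2, 8, 20) should never return more than 2, even though eight pool threads are available.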
Hope it helps!!
I have two questions:
I need to stop child processes from my main process and then start them again after something has happened in my main process. How can I do that?
Thanks a lot.
I'm not entirely sure what you mean in the above post - I suspect they are different questions and the second is related to Glassfish, which I probably can't answer.
However, I can answer the first if you mean threads rather than processes. Java has a wait/notify method pair that, used in combination, allows you to launch n child threads and wait for them all to complete before continuing in the main process. I think this is what you need, rather than stopping the child process from the main process: in concurrent programming this should never be done, as you can't guarantee where you're up to in the child process. Have a look at: http://www.javamex.com/tutorials/synchronization_wait_notify_4.shtml
For your first part there are some classes in java.util.concurrent.locks that may help you. Have a look at LockSupport.
The answer to the first part of your question depends on whether the "processes" you are talking about are Process or Thread. But in both cases, there is no good way to cause an uncooperative process to "stop".
In the Process case, the OS may well provide support for suspending processes, but the Java Process APIs don't offer this functionality. So you'd need to resort to non-portable means (e.g. JNI/JNA) to implement this.
In the Thread case, there are methods called suspend and resume, but they should not be used because they are fundamentally unsafe. And the Javadoc says so very clearly!
So if you implement a suspend/resume mechanism, you need your processes to participate / cooperate. In the Thread case, you could implement your suspend / resume mechanism using the low-level synchronization primitives, or using something like the CyclicBarrier class.
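One possible shape for such a cooperative mechanism, built on the low-level wait/notifyAll primitives. PauseControl and PauseSketch are hypothetical names, and this is only a sketch: the worker must call awaitIfPaused() itself at points where it is safe to stop, which is exactly the cooperation described above.

```java
class PauseControl {
    private boolean paused = false;

    public synchronized void pause() {
        paused = true;
    }

    public synchronized void resume() {
        paused = false;
        notifyAll(); // wake every worker waiting in awaitIfPaused()
    }

    // Workers call this at safe points; it blocks for as long as the flag is set.
    public synchronized void awaitIfPaused() throws InterruptedException {
        while (paused) {
            wait();
        }
    }
}

class PauseSketch {
    // Starts a paused worker that counts to 'steps', resumes it, and returns the final count.
    static int countWithPause(int steps) {
        PauseControl control = new PauseControl();
        int[] counter = new int[1]; // written by the worker, read after join()
        control.pause();
        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < steps; i++) {
                    control.awaitIfPaused(); // cooperative check before each step
                    counter[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        control.resume(); // "something happened" in the main thread
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```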
Well, it was a long time ago, and I was probably so confused that I forgot to look at the answers. Thanks, but there actually is a way to take care of the first part, and the answer was Java Remote Method Invocation, or simply RMI:
http://en.wikipedia.org/wiki/Java_remote_method_invocation
I am going to remove the second part of my question as I simply don't remember what I was on!
I was reading some of the concurrency patterns in Brian Goetz's Java Concurrency in Practice and got confused over when the right time is to make code thread safe.
I normally write code that's meant to run in a single thread so I do not worry too much about thread safety and synchronization etc. However, there always exists a possibility that the same code may be re-used sometime later in a multi-threaded environment.
So my question is, when should one start thinking about thread safety? Should I assume the worst at the onset and always write thread-safe code from the beginning or should I revisit the code and modify for thread safety if such a need arises later ?
Are there some concurrency patterns/anti-patterns that I must always be aware of even while writing single-threaded applications so that my code doesn't break if it's later used in a multi-threaded environment ?
You should think about thread safety when your code will be used in a multithreaded environment. There is no point in tackling the complexity if it will only be run in a singlethreaded environment.
That being said, there are simple things you can do that are good practices anyway and will help with multithreading:
As Josh Bloch says, Favor Immutability. Immutable classes are threadsafe almost by definition;
Only use data members or static variables where required, rather than as a convenience.
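As a small illustration of the immutability advice, here is a hypothetical Money class (not from any library). All fields are final, there are no setters, and "changing" a value returns a new object, so instances can be shared between threads without any locking.

```java
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String currency() {
        return currency;
    }

    long cents() {
        return cents;
    }

    // Returns a new instance instead of modifying this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(currency, cents + other.cents);
    }
}
```

For example, a.plus(b) leaves both a and b untouched, so a thread holding a reference to either can never observe a half-updated value.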
Making your code thread safe can be as simple as adding a comment that says the class was not designed for concurrent use by multiple threads. So, in that sense: yes, all of your classes should be thread safe.
However, in practice, many, many types are likely to be used by only a single thread, often only referenced as local variables. This can be true even if the program as a whole is multi-threaded. It would be a mistake to make every object safe for multi-threaded access. While the penalty may be small, it is pervasive, and can add up to be a significant, hard-to-fix performance problem.
I advise you to obtain a copy of "Effective Java", 2nd Ed. by Joshua Bloch. That book devotes a whole chapter to concurrency, including a solid exploration of the issue of when (and when not) to synchronize. Note, for example, the title of item 67 in "Effective Java": 'Avoid excessive synchronization', which is elaborated over five pages.
As was stated previously, you need thread safety when you think your code will be used in a multithreaded environment.
Consider the approach taken by the Collections classes: you provide a thread-unsafe class that does all its work without using synchronized, and you also provide another class that wraps the unsynchronized class, providing all of the same public methods but making them synchronize on the underlying object.
This gives your clients a choice of using the multi-threaded or the single-threaded version of your code. It may also simplify your coding by isolating all of the threading/locking logic in a separate class.
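The JDK's own version of this pattern is Collections.synchronizedList, which wraps a plain ArrayList. The sketch below (WrapperSketch and fill are names I made up) lets several threads append to whichever List the caller passes in; the caller chooses between the unsynchronized class and the wrapped one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperSketch {
    // Lets 'threads' threads each append 'perThread' numbers to 'target'; returns its final size.
    static int fill(List<Integer> target, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    target.add(i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return target.size();
    }

    public static void main(String[] args) {
        // The wrapper synchronizes every call on the underlying ArrayList.
        List<Integer> safe = Collections.synchronizedList(new ArrayList<Integer>());
        System.out.println(fill(safe, 4, 1000)); // prints 4000
    }
}
```

Passing a bare ArrayList to fill() with more than one thread is not safe: elements may be lost or an exception thrown, which is why single-threaded callers can use it directly while multi-threaded callers should pick the wrapper.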