Assume we have two threads, thread A and thread B.
Thread A runs the main method and holds large data structures.
Is it possible to create a second thread and pass the address (a pointer) of a data structure local to thread A to thread B, so that both threads can read from it?
The point is to avoid duplicating the entire data structure for thread B, or spending a lot of time extracting the relevant information from it for thread B to use.
Keep in mind that neither thread modifies the data.
In Java, the term is reference rather than pointer.
You can pass it to another thread like any other object.
As with any (non-final) class in Java, you can extend it, add members, add constructors, etc.
If you need to modify the data, you must make sure that there are no concurrency issues.
It's known as a reference in Java, as you don't have direct access to a pointer in the conventional sense. (For most cases it's "safe" to think of every reference as a pointer that is always passed by value and for which the only legal operation is to dereference it. It is NOT the same as a C++ 'reference.')
You can certainly share references among threads. Anything that's on the heap can be seen and used by any thread that can get a reference to it. You can either put it in a static location, or set the value of a reference on your Runnable to point to the data.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedDataTest {

    private static class SomeWork implements Runnable {
        private final Map<String, String> dataTable;

        public SomeWork(Map<String, String> dataTable) {
            this.dataTable = dataTable;
        }

        @Override
        public void run() {
            // do some stuff with dataTable
        }
    }

    public static void main(String[] args) {
        // Both Runnables receive a reference to the same map; nothing is copied.
        Map<String, String> dataTable = new ConcurrentHashMap<String, String>();
        Runnable work1 = new SomeWork(dataTable);
        Runnable work2 = new SomeWork(dataTable);
        new Thread(work1).start();
        new Thread(work2).start();
    }
}
Yes, it is possible and common, but you need proper synchronization to ensure that both threads see an up-to-date version of the data.
It is safe to share a reference to an immutable object. Roughly speaking, an immutable object is one that doesn't change its state after construction. A semantically immutable object should contain only final fields, which in turn reference immutable objects.
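As a rough illustration of the immutable case, the sketch below shares one object between two reader threads without any locking. The Catalog class, its fields, and the helper method are hypothetical names chosen for the example; List.copyOf assumes Java 10 or later.

```java
import java.util.List;

// Hypothetical immutable value type: every field is final and the list is an unmodifiable copy.
final class Catalog {
    private final String name;
    private final List<String> items;

    Catalog(String name, List<String> items) {
        this.name = name;
        this.items = List.copyOf(items); // defensive, unmodifiable copy (Java 10+)
    }

    String name() { return name; }

    List<String> items() { return items; }
}

class ImmutableSharingDemo {
    // Starts two reader threads on the same Catalog and returns the total item count they saw.
    static int readFromTwoThreads(Catalog catalog) {
        int[] seen = new int[2];
        Thread a = new Thread(() -> seen[0] = catalog.items().size());
        Thread b = new Thread(() -> seen[1] = catalog.items().size());
        a.start();
        b.start();
        try {
            a.join(); // join() makes the readers' writes to seen[] visible here
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0] + seen[1];
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog("fruit", List.of("apple", "pear", "plum"));
        System.out.println(readFromTwoThreads(catalog)); // prints 6
    }
}
```

Only the reference to the Catalog is handed to each thread; the object itself is never copied.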
If you want to share a reference to a mutable object, you need proper synchronization, for example with the synchronized or volatile keywords.
An easy way to share data safely is to use utilities from the java.util.concurrent package, such as AtomicReference or ConcurrentHashMap; however, you still have to be very careful if the objects you share are mutable.
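One possible way to combine these ideas is to publish immutable snapshots through an AtomicReference. This is only a sketch: SnapshotHolder and its methods are made-up names, and Map.copyOf assumes Java 10 or later.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class SnapshotHolder {
    // Holds the current immutable snapshot; get() and set() have volatile semantics.
    private final AtomicReference<Map<String, String>> current =
            new AtomicReference<>(Map.of());

    // Writer: builds a complete unmodifiable copy, then publishes it in a single step.
    void publish(Map<String, String> fresh) {
        current.set(Map.copyOf(fresh));
    }

    // Reader: sees either the old snapshot or the new one, never a half-built map.
    String lookup(String key) {
        return current.get().get(key);
    }
}
```

Because each published map is unmodifiable, readers need no lock; the caveat about mutable objects still applies to the values stored inside the map.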
If you are not modifying the shared data, you can share a reference and there will be no significant overhead.
Be careful, however, once you start modifying the shared object concurrently. In that case you can use the data structures provided by Java (see for instance the factory methods in Collections), or a custom synchronization scheme, for instance with java.util.concurrent.locks.ReentrantLock.
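A custom scheme with ReentrantLock might look roughly like the following. GuardedLog and fillConcurrently are hypothetical names for this sketch; the important part is the lock()/try/finally/unlock() pattern around every access to the guarded list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class GuardedLog {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> entries = new ArrayList<>(); // guarded by lock

    void add(String entry) {
        lock.lock();
        try {
            entries.add(entry);
        } finally {
            lock.unlock(); // always release, even if add() throws
        }
    }

    // Returns a copy so callers never touch the guarded list directly.
    List<String> snapshot() {
        lock.lock();
        try {
            return new ArrayList<>(entries);
        } finally {
            lock.unlock();
        }
    }

    // Adds perThread entries from each of the given number of threads; returns the final size.
    static int fillConcurrently(int threads, int perThread) {
        GuardedLog log = new GuardedLog();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.add("entry");
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log.snapshot().size();
    }
}
```

With the lock in place, fillConcurrently(4, 1000) should always return 4000; with an unguarded ArrayList, entries could be lost or an exception thrown.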
Related
This kind of question has been asked many times. I have checked numerous earlier questions, but none gave me an answer, so I am asking here.
The question is about making my code thread-safe in Java, given that there is only one shared variable but it can change at any time. I have the feeling that the code I am optimizing was not designed for a multi-threaded environment, so I might have to rethink it.
Basically, I have one class which can be shared between, say, 5 threads. This class has a private property 'myProperty' which can take 5 different values (one for each thread). The problem is that, once it has been set by the constructor, that value should not change for the rest of the thread's life.
I am aware of the usual techniques for making code thread-safe, including locks, the "synchronized" keyword, volatile variables and atomic types, but I have the feeling that these won't help here, as they do not prevent the variable from being modified.
Here is the code :
// The thread that uses the class containing the shared variable
public class myThread implements Runnable {
    @Autowired
    private Shared myProperty;
    // some code
}

// The class containing the shared variable
public class Shared {
    private String operator;
    private Lock lock = new ReentrantLock();

    public void initiate() {
        this.lock.lock();
        try {
            this.operator.initiate(); // Gets a different value depending on the calling thread
        } finally {
            this.lock.unlock();
        }
    }
    // some code
}
As it happens, the above code only guarantees that two threads won't change the variable at the same time; the variable will still change. A "naive" workaround would be to create a table (operatorList), or a list, a map, etc., associating an operator with its calling thread's ID. Each thread would then access its operator in the table by its ID, but this would force us to change all the thread classes that access the shared variable, and there are many. Any idea how I could store a separate operator string value for each calling thread with minimal changes (without using magic)?
I'm not 100% sure I understood your question correctly, but I'll give it a shot anyway. Correct me if I'm wrong.
A "naive" workaround would consist in creating a table (operatorList)
for instance (or a list, a map, etc. ) associating an operator with
its calling thread's ID, this way each thread would just have to
access its operator using its id in the table but doing this would
make us change all the thread classes which access the shared variable
and there are many.
There's already something similar in Java: the ThreadLocal class.
You can create a thread-local copy of any object:
private static final ThreadLocal<MyObject> operator =
    new ThreadLocal<MyObject>() {
        @Override
        protected MyObject initialValue() {
            // return thread-local copy of the "MyObject"
        }
    };
Later in your code, when a specific thread needs to get its own local copy, all it needs to do is: operator.get(). In reality, the implementation of ThreadLocal is similar to what you've described - a Map of ThreadLocal values for each Thread. Only the Map is not static, and is actually tied to the specific thread. This way, when a thread dies, it takes its ThreadLocal variables with it.
I'm not sure I totally understand the situation, but if you want to ensure that each thread uses a thread-specific instance of a variable, the solution is to use a variable of type ThreadLocal<T>.
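For Java 8 and later, ThreadLocal.withInitial offers a shorter form of the same idea. The sketch below is illustrative only: OperatorHolder, operatorSeenBy, and the rule of deriving the value from the thread name are assumptions, not part of the original question.

```java
class OperatorHolder {
    // Each thread lazily gets its own value on first access.
    // Deriving it from the thread name is an assumption made for this example.
    private static final ThreadLocal<String> OPERATOR =
            ThreadLocal.withInitial(() -> "operator-for-" + Thread.currentThread().getName());

    static String operator() {
        return OPERATOR.get();
    }

    // Runs the lookup on a new thread with the given name and returns what that thread saw.
    static String operatorSeenBy(String threadName) {
        String[] result = new String[1];
        Thread t = new Thread(() -> result[0] = operator(), threadName);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(operatorSeenBy("worker-1")); // prints operator-for-worker-1
        System.out.println(operatorSeenBy("worker-2")); // prints operator-for-worker-2
    }
}
```

Callers keep using a single accessor, so the classes that read the value need few or no changes.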
I have a static HashMap which will cache objects identified by unique integers; it will be accessed from multiple threads. I will have multiple instances of the type HashmapUser running in different threads, each of which will want to use the same HashMap (which is why it's static).
Generally, the HashmapUsers will be retrieving from the HashMap. If it is empty, though, it needs to be populated from a database. Also, in some cases the HashMap will be cleared because the data has changed and it needs to be repopulated.
So, I just made all interactions with the Map synchronized. But I'm not positive that this is safe, smart, or that it works for a static variable.
Is the implementation below thread-safe? Any suggestions to simplify or otherwise improve it?
public class HashmapUser {
    private static HashMap<Integer, AType> theMap = new HashMap<>();

    public HashmapUser() {
        //....
    }

    public void performTask(boolean needsRefresh, Integer id) {
        //....
        AType x = getAtype(needsRefresh, id);
        //....
    }

    private synchronized AType getAtype(boolean needsRefresh, Integer id) {
        if (needsRefresh) {
            theMap.clear();
        }
        if (theMap.size() == 0) {
            // populate the map
        }
        return theMap.get(id);
    }
}
As it is, it is definitely not thread-safe. Each instance of HashmapUser will use a different lock (this), which does nothing useful. You have to synchronize on the same object, such as the HashMap itself.
Change getAtype to:
private AType getAtype(boolean needsRefresh, Integer id) {
    synchronized (theMap) {
        if (needsRefresh) {
            theMap.clear();
        }
        if (theMap.size() == 0) {
            // populate the map
        }
        return theMap.get(id);
    }
}
Edit:
Note that you can synchronize on any object, provided that all instances use the same object for synchronization. You could synchronize on HashmapUser.class, which also allows other objects to lock access to the map (though it is typically best practice to use a private lock).
Because of this, simply making your getAtype method static would work, since the implied lock would now be HashmapUser.class instead of this. However, this exposes your lock, which may or may not be what you want.
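The private-lock variant mentioned above could be sketched as follows. CacheUser, LOCK, and the populate() stand-in for the database load are hypothetical; String replaces AType so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class CacheUser {
    // One lock shared by every instance, and not reachable from outside this class.
    private static final Object LOCK = new Object();
    private static final Map<Integer, String> CACHE = new HashMap<>(); // guarded by LOCK

    String get(boolean needsRefresh, Integer id) {
        synchronized (LOCK) {
            if (needsRefresh) {
                CACHE.clear();
            }
            if (CACHE.isEmpty()) {
                populate();
            }
            return CACHE.get(id);
        }
    }

    // Hypothetical stand-in for loading the data from a database; called with LOCK held.
    private static void populate() {
        CACHE.put(1, "one");
        CACHE.put(2, "two");
    }
}
```

Unlike synchronizing on the class object or on the map itself, no outside code can acquire LOCK and interfere with the cache's locking.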
No, this won't work at all.
If you don't specify a lock object, e.g. by declaring the method synchronized, the implicit lock is the instance, unless the method is static, in which case the lock is the class. Since there are multiple instances, there are also multiple locks, which I doubt is desired.
What you should do is create another class which is the only class with the access to HashMap.
Clients of HashMap, such as the HashMapUser must not even be aware that there is synchronization in place. Instead, thread safety should be assured by the proper class wrapping the HashMap hiding the synchronization from the clients.
This lets you easily add additional clients to the HashMap since synchronization is hidden from them, otherwise you would have to add some kind of synchronization between the different client types too.
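A wrapper of this kind might look like the sketch below. SharedCache, its loader parameter, and invalidate() are names invented for the example; the loader stands in for whatever repopulates the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// The only class that touches the underlying HashMap; clients never see the synchronization.
final class SharedCache<K, V> {
    private final Map<K, V> map = new HashMap<>(); // accessed only inside synchronized methods
    private final Supplier<Map<K, V>> loader;      // hypothetical source of fresh data

    SharedCache(Supplier<Map<K, V>> loader) {
        this.loader = loader;
    }

    // Loads the data on first use (or after invalidate()), then answers from the map.
    synchronized V get(K key) {
        if (map.isEmpty()) {
            map.putAll(loader.get());
        }
        return map.get(key);
    }

    // Clears the cache so the next get() reloads it.
    synchronized void invalidate() {
        map.clear();
    }
}
```

All clients would share one SharedCache instance, so they all lock on the same object without knowing that a lock exists.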
I would suggest you go with either ConcurrentHashMap or a synchronized map (Collections.synchronizedMap).
More info here: http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
ConcurrentHashMap is more suitable for high-concurrency scenarios. This implementation doesn't synchronize on the whole object but locks in a finer-grained, optimised way, so different threads accessing different keys can do so simultaneously.
A synchronized map is simpler and synchronizes at the object level: access to the instance is serial.
I think you need performance, so I think you should probably go with ConcurrentHashMap.
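With ConcurrentHashMap, one option is to load entries lazily per key with computeIfAbsent. Note that this is a different strategy from the question's "populate everything when empty"; LazyCache and loadFromDatabase are hypothetical names for the sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class LazyCache {
    private static final ConcurrentMap<Integer, String> CACHE = new ConcurrentHashMap<>();

    // Hypothetical stand-in for a database lookup of a single entry.
    private static String loadFromDatabase(Integer id) {
        return "value-" + id;
    }

    // Atomically loads the entry on first access; later calls read the cached value.
    static String get(Integer id) {
        return CACHE.computeIfAbsent(id, LazyCache::loadFromDatabase);
    }

    // Drops all cached entries so they are reloaded on demand.
    static void refresh() {
        CACHE.clear();
    }
}
```

No explicit synchronized block is needed here, because computeIfAbsent performs the check and the insert as one atomic operation for a given key.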
I have several threads trying to increment a counter for a certain key in a custom data structure that is not thread-safe (you can imagine it to be similar to a HashMap). I was wondering what the right way to increment the counter would be in this case.
Is it sufficient to synchronize the increment function, or do I also need to synchronize the get operation?
public class Example {
    private MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();

    private class MyThread implements Runnable {
        private synchronized void incrementCnt(Key key) {
            // from the datastructure documentation: if a value already exists for the given key, the
            // previous value will be replaced by this value
            datastructure.put(key, getCnt(key) + 1);
            // or can I do it without using the getCnt() function? like this:
            // datastructure.put(key, datastructure.get(key) + 1);
        }

        private synchronized int getCnt(Key key) {
            return datastructure.get(key);
        }
        // run method...
    }
}
If I have two threads t1, t2 for example, I would do something like:
t1.incrementCnt();
t2.incrementCnt();
Can this lead to any kind of deadlock? Is there a better way to solve this?
The main issue with this code is that it likely fails to synchronize access to datastructure, because the accessing code synchronizes on the this of the inner class. That object is different for each instance of MyThread, so there is no mutual exclusion.
A more correct way is to make datastructure a final field and synchronize on it:
private final MyDataStructure<Key, Integer> datastructure = new CustomDataStructure<Key, Integer>();

private class MyThread implements Runnable {
    private void incrementCnt(Key key) {
        synchronized (datastructure) {
            // the read and the write happen under the same lock
            datastructure.put(key, datastructure.get(key) + 1);
        }
    }
    // run method...
}
As long as all data access is done inside synchronized (datastructure), the code is thread-safe and it's safe to just use datastructure.get(...). There should be no deadlocks, since deadlocks can occur only when there's more than one lock to compete for.
As the other answer told you, you should synchronize on your data structure, rather than on the thread/runnable object. It is a common mistake to try to use synchronized methods in the thread or runnable object. Synchronization locks are instance-based, not class-based (unless the method is static), and when you are running multiple threads, this means that there are actually multiple thread instances.
It's less clear-cut about Runnables: you could be using a single instance of your Runnable class with several threads. So in principle you could synchronize on it. But I still think it's bad form because in the future you may want to create more than one instance of it, and get a really nasty bug.
So the general best practice is to synchronize on the actual item that you are accessing.
Furthermore, the design conundrum of whether or not to use two methods should be solved by moving the whole thing into the data structure itself, if you can do so (if the class source is under your control). This is an operation that is confined to the data structure and applies only to it, and doing the increment outside of it is not good encapsulation. If your data structure exposes a synchronized incrementCnt method, then:
It synchronizes on itself, which is what you wanted.
It can use its own private fields directly, which means you don't actually need to call a getter and a setter.
It is free to have the implementation changed to one of the atomic structures in the future if it becomes possible, or add other implementation details (such as logging increment operations separately from setter access operations).
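The encapsulated version described above could be sketched like this. CounterTable is a hypothetical replacement for the custom data structure, backed by a plain HashMap; countWithThreads exists only to exercise it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CounterTable<K> {
    private final Map<K, Integer> counts = new HashMap<>(); // guarded by this

    // Read-modify-write happens inside one synchronized method; returns the new count.
    synchronized int incrementCnt(K key) {
        return counts.merge(key, 1, Integer::sum);
    }

    synchronized int getCnt(K key) {
        return counts.getOrDefault(key, 0);
    }

    // Increments one key perThread times from each thread and returns the final count.
    static int countWithThreads(int threads, int perThread) {
        CounterTable<String> table = new CounterTable<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    table.incrementCnt("hits");
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return table.getCnt("hits");
    }
}
```

Because both methods synchronize on the table itself, every thread uses the same lock regardless of how many Runnable instances exist.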
I have a class which reads an XML file and populates a private static data structure (say, a HashMap) with its contents. This initial population happens in a static block. Then I have a method to get the value for a given key, which in turn refers to that static HashMap. Consider the case where multiple threads try to get the value for a given key: will there be any performance hit, such as other threads having to wait while one thread is reading that static object?
import java.util.HashMap;

public class Parser
{
    private static HashMap resource = new HashMap();

    static
    {
        parseResource();
    }

    private Parser()
    {
    }

    private static void parseResource()
    {
        // parses the resource and populates the resource object
    }

    public static Object getValue(String key)
    {
        // maybe some check will be done here, but not any
        // update/modification actions
        return resource.get(key);
    }
}
Firstly, it's worth being aware that this really has very little to do with static. There's no such thing as a "static object" - there are just objects, and there are fields and methods which may or may not be static. For example, there could be an instance field and a static field which both refer to the same object.
In terms of thread safety, you need to consider the safety of the operations you're interested in on a single object - it doesn't matter how the multiple threads have "reached" that object.
like, when one thread is reading that static object other threads has to wait.
No, it doesn't.
If you are just reading from the HashMap after constructing it in a way that prevented it from being visible to other threads until it had been finished, that's fine. (Having reread your comment, it looks like that's the case in getValue.)
If you need to perform any mutations on the map while other threads are reading from it, consider using ConcurrentHashMap or use synchronization.
From the docs for HashMap:
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
You have no locking happening in your example code, so there's no way that multiple threads would need to wait.
Just adding to Jon Skeet's answer, for this kind of use you might want to consider Guava's ImmutableMap, which enforces immutability.
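If adding Guava is not an option, the JDK's Collections.unmodifiableMap gives a similar read-only guarantee (it is a wrapper, not Guava's ImmutableMap). In the sketch below, Resources and the sample entry are hypothetical; the real parsing step is replaced by a single put.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class Resources {
    private static final Map<String, String> RESOURCE;

    static {
        // Build the map privately, then expose only a read-only view.
        Map<String, String> parsed = new HashMap<>();
        parsed.put("greeting", "hello"); // hypothetical entry standing in for parsed XML data
        RESOURCE = Collections.unmodifiableMap(parsed);
    }

    private Resources() {
    }

    // Safe to call from any thread: class initialization completes before the first read.
    static String getValue(String key) {
        return RESOURCE.get(key);
    }
}
```

Since nothing keeps a reference to the underlying HashMap after the static block, the map cannot be modified later, and concurrent readers never block each other.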
Just use the synchronized keyword and everything should work fine.
I'm trying to develop a program that takes requests for data which is stored in a map. The map is declared in the main method as shown below:
Map m = Collections.synchronizedMap(new HashMap());
synchronized (m) {
    while (listening) {
        new BrokerLookupServerHandlerThread(serverSocket.accept(), m).start();
    }
}
The code for the BrokerLookupServerHandlerThread takes the input and makes it one of the object's variables. If I use it in this class, will the original map be updated as well? I understand that Java is pass by value, (I'm used to C/C++) so I just wanted to be sure if this implementation of a synchronized object makes sense.
private Socket socket = null;
//private String t ="MSFT";
public Map m;

public BrokerLookupServerHandlerThread(Socket socket, Map m) {
    super("NamingServerHandlerThread");
    this.socket = socket;
    this.m = m;
    System.out.println("Created new Thread to handle client");
}
Thanks for your help.
Yes, the original object will be updated. I suggest you use ConcurrentHashMap, though.
A hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates. This class obeys the same functional specification as Hashtable, and includes versions of methods corresponding to each method of Hashtable. However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.
Yes, changes made to the map will be seen by both threads.
Java does indeed use pass by value - but the value in this case is a reference (similar to a pointer). The value of a reference-type variable in Java is always a reference to an object, or null. It's never the object itself.
So your code won't create a new map. There are very few operations which implicitly create a new object. I can only think of the use of string literals (where the literals are interned anyway) and autoboxing of primitive types. Other than that, you'll only get a new object via the new operator. (Obviously any method you call could create a new object too...)
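A small experiment can make the "reference passed by value" point concrete. PassByValueDemo and its two helper methods are hypothetical names for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class PassByValueDemo {
    // Mutates the object the caller's reference points to: visible to the caller.
    static void mutate(Map<String, String> m) {
        m.put("k", "v");
    }

    // Rebinds only the local copy of the reference: invisible to the caller.
    static void reassign(Map<String, String> m) {
        m = new HashMap<>();
        m.put("other", "x");
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        mutate(original);
        reassign(original);
        System.out.println(original); // prints {k=v}
    }
}
```

The same holds when the reference is handed to a thread's constructor: the thread works on the caller's map, not on a copy.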
Note that this is entirely separate to the issue of synchronization between threads. The business about copying objects vs copying references is orthogonal to threading. In this case it looks like you've solved the threading aspect using Collections.synchronizedMap; as Pangea says you may want to use ConcurrentHashMap instead which won't use nearly as much locking (if any). Another implementation of the ConcurrentMap interface is ConcurrentSkipListMap. Look at the docs for both classes to decide what suits you best.