In .NET, there is an "uber" thread-local storage (TLS) that allows arbitrary TLS data to automagically "jump" from one thread to another. It is based on the CallContext class.
In other words, a logical request can spawn a hierarchy of new threads - and each of those threads will have access to the same TLS of the original thread. It is a very powerful feature, particularly for logging, authorization, multi-tenancy, or branding concerns.
What is the equivalent in Java?
Only in .NET 4.5 did the "logical call context" gain a "copy on write" capability that allows threads to make private modifications to the logical call context. In other words, .NET is still maturing this capability and making it more stable.
If Java has an equivalent notion, how stable is it? What issues does it have?
Clarification
I already know that Java has a thread-local storage (TLS) capability. That is not the question. I am asking whether Java has an equivalent of the .NET "logical call context", which is a much more powerful construct than simple TLS.
Maybe InheritableThreadLocal is what you're looking for?
I'm not sure if it's exactly the same, but as far as I understand it meets this requirement:
a logical request can spawn a hierarchy of new threads - and each of those threads will have access to the same TLS of the original thread.
From the docs
This class extends ThreadLocal to provide inheritance of values from parent thread to child thread: when a child thread is created, the child receives initial values for all inheritable thread-local variables for which the parent has values. Normally the child's values will be identical to the parent's; however, the child's value can be made an arbitrary function of the parent's by overriding the childValue method in this class.
Inheritable thread-local variables are used in preference to ordinary thread-local variables when the per-thread-attribute being maintained in the variable (e.g., User ID, Transaction ID) must be automatically transmitted to any child threads that are created.
I don't know about the "copy on write" capability you mentioned, but I guess you can override InheritableThreadLocal.childValue(T) to proxy or copy the parent's value, so that writes in the child thread modify only the child's own storage and don't go through to the parent.
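To make the childValue idea concrete, here is a minimal sketch. The class name RequestContextDemo, the CONTEXT variable and the "tenant" key are made up for illustration; only InheritableThreadLocal, initialValue and childValue come from the JDK. The child receives a copy of the parent's map when the thread is created, so its writes stay private:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class RequestContextDemo {
    // Hypothetical per-request context; each child thread starts with a copy of the parent's map.
    static final InheritableThreadLocal<Map<String, String>> CONTEXT =
        new InheritableThreadLocal<Map<String, String>>() {
            @Override
            protected Map<String, String> initialValue() {
                return new HashMap<>();
            }

            @Override
            protected Map<String, String> childValue(Map<String, String> parentValue) {
                // Called when the child thread is created: copy, so child writes never reach the parent.
                return new HashMap<>(parentValue);
            }
        };

    /** Returns { value the child saw, value the parent has after the child's write }. */
    static String[] run() {
        CONTEXT.get().put("tenant", "acme");
        AtomicReference<String> seenByChild = new AtomicReference<>();
        Thread child = new Thread(() -> {
            seenByChild.set(CONTEXT.get().get("tenant"));      // inherited value
            CONTEXT.get().put("tenant", "changed-in-child");   // private to the child
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new String[] {seenByChild.get(), CONTEXT.get().get("tenant")};
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println("child saw " + r[0] + ", parent still has " + r[1]);
    }
}
```

One caveat: the values are inherited when the child thread is created, so threads that are reused from a pool will not pick up the context of whichever request later submits work to them.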
I am not familiar with .NET, but from your description the closest thing I could think of was a ThreadLocal.
However, I found an article that you might find helpful.
https://dzone.com/articles/thread-local-storage-java
There is this note in the akka-stream docs stating as follows:
… a reusable flow description cannot be bound to “live” resources, any connection to or allocation of such resources must be deferred until materialization time. Examples of “live” resources are already existing TCP connections, a multicast Publisher, etc.; …
I have several questions concerning the note:
Apart from these two examples, what other resources count as "live"?
Anything that cannot be safely (deep)copied? Like a Thread?
Should I also avoid sharing anything that's not thread-safe?
What about an ActorRef existing in the ActorSystem used by the ActorFlowMaterializer?
How to defer allocation until materialization time? Is it safe for example to allocate it in the constructor of a PushPullStage but not in the create function of a FlowGraph?
This is a common problem with web services, RMI connections, or any other communication protocol. It's generally recommended to share "primitive" values rather than references, because marshalling/unmarshalling or serializing/deserializing is always a headache. Also think of different types of environments communicating with each other. Sharing plain values is a safe way to handle communication.
Akka itself is a good example of "microservices": actors communicating with each other. When I read the Akka documentation, one image described Akka actors very well: actors are like mail clients, and you can think of each one as having a mailbox. When you pass a variable, it's just like receiving a new email.
Long story short, avoid sharing "dependent" objects that can be invalidated before another actor reads them. Additionally, if your system names ActorRefs dynamically, avoid calling them by reference.
"Materializing" is explained in docs of akka-streams.
The process of materialization may be parameterized, e.g. instantiating a blueprint for handling a TCP connection’s data with specific information about the connection’s address and port information. Additionally, materialization will often create specific objects that are useful to interact with the processing engine once it is running, for example for shutting it down or for extracting metrics. This means that the materialization function takes a set of parameters from the outside and it produces a set of results. Compositionality demands that these two sets cannot interact, because that would establish a covert channel by which different pieces could communicate, leading to problems of initialization order and inscrutable runtime failures.
So use parameters instead of passing "connection" itself.
Deferring a live resource is not a big thing. It means that if you use one connection for the whole system, you should always keep it alive. Or, when you create a transaction in actor-1 and send it to actor-2, you shouldn't terminate the transaction in actor-1 until actor-2 has finished its job with the transaction.
How can you tell when that is? Use "Future" and "offer()".
Hope I understand your question and hope I can express myself.
I am working with a 3rd party proprietary library (no source code) which creates instances of a non-thread-safe component. Does this mean that I shouldn't use multiple threads to run jobs in parallel? Running each job in its own JVM crossed my mind, but that is overkill.
Then I read the article here
http://cscarioni.blogspot.com/2011/09/alternatives-to-threading-in-java-stm.html
Is it recommended to follow that article's advice? What other alternatives exist out there?
Response to Martin James:
The vendor tells me that there is only one thread in which multiple instances of the component exist (a Factory pattern creates the component instances) and that each instance is independently controllable from its API.
So does this mean that I can still use multiple threads while controlling each component instance running in one big thread?
No, it does not mean this.
It means that you have to take care of data protection yourself. One possible way is to synchronize access to that library in the code that calls it (your code). Another possible way is to use immutable objects (for example, make a private copy of the non-thread-safe data structure every time you want to work with it).
Another way is to design your application so that the code that works with a certain object always runs in the same thread. That does not mean that code working with another object (even of the same class) cannot run in another thread. So the system is multi-threaded, but no data clashes are created.
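As a rough sketch of that last option (thread confinement), assuming a made-up LegacyComponent class standing in for the vendor's component: every call goes through a single-threaded executor that "owns" the instance, while any number of caller threads submit work to it. The names ConfinedComponentDemo, LegacyComponent, owner and callers are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConfinedComponentDemo {
    // Hypothetical stand-in for the vendor's non-thread-safe component.
    static class LegacyComponent {
        private int processed;                       // unguarded mutable state
        int process(int job) { processed++; return job * 2; }
        int processedCount() { return processed; }
    }

    /** Runs the given number of jobs from several caller threads; returns how many the component saw. */
    static int run(int jobs) {
        LegacyComponent component = new LegacyComponent();
        ExecutorService owner = Executors.newSingleThreadExecutor(); // the only thread that touches the component
        ExecutorService callers = Executors.newFixedThreadPool(4);   // simulated parallel jobs
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                final int job = i;
                // Each caller hands the actual call over to the owner thread and waits for the result.
                Callable<Integer> viaOwner = () -> owner.submit(() -> component.process(job)).get();
                pending.add(callers.submit(viaOwner));
            }
            for (Future<Integer> f : pending) {
                f.get();
            }
            // Read the counter on the owner thread as well, so it is never accessed from two threads.
            return owner.submit(() -> component.processedCount()).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            callers.shutdown();
            owner.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("processed " + run(100) + " jobs");
    }
}
```

If each job needs its own instance instead, one owner executor per instance gives parallelism across instances while still keeping every single instance on one thread.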
'Vendor tells me that there is only one thread in which multiple instances of the component exist (Factory pattern to create the component instance) and each instance is independently controllable from its API.'
That is not exactly 100% clear. What I think it means is:
1) Creation of components is not thread-safe. Maybe they are all stored internally in a non-threadsafe container. Presumably, destruction of the components is not thread-safe either.
2) Once created, the components are 'independently controllable' - this suggests strongly that they are thread-safe.
That's my take on it so far. Maybe your vendor could confirm it, just to be sure, before you proceed any further with a design.
It all depends on what your code is actually doing with the components. For example, ArrayList is not thread safe, but Vector is thread safe. However, if you use an ArrayList inside a thread in a way that is thread safe or thread neutral, it doesn't matter. For example, you can use ArrayLists without any issue in a JavaEE container for web services because each web service call is going to be on its own thread and no one in their right mind would have web service handling threads communicating with each other. In fact, Vectors are very bad in a JavaEE container if you can avoid using them because they're synchronized on most of their methods, which means the container's threads will block until any operation is done.
As AlexR said, you can synchronize things, but the best approach is to really look at your code and figure out if the threads are actually going to be sharing data and state or going off and doing their own thing.
I have multiple client handler threads. These threads need to pass a received object to a server queue, and the server queue will pass another type of object back to the sending thread. The server queue is started when the server starts and keeps running. I am not sure which threading mechanism to use to notify the client handler threads that an object has been sent back. I don't intend to use sockets or write to a file.
If you want to do actual message passing, take a look at SynchronousQueue. Each thread holds a reference to the queue and waits until another thread passes an object through it.
This would be thread safe and address your requirements.
Though if you are simply looking to have threads read and write a shared variable, you can use normalocity's suggestion, though its thread safety depends on how you access it (via synchronized or volatile).
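For what it's worth, a SynchronousQueue hand-off in both directions could look roughly like this. It is only a sketch: the class name HandoffDemo, the queue names and the "processed-" reply format are invented for the example, and a real server would loop rather than handle a single request.

```java
import java.util.concurrent.SynchronousQueue;

class HandoffDemo {
    /** Sends one request to a server thread and returns the reply it hands back. */
    static String run() {
        SynchronousQueue<Integer> toServer = new SynchronousQueue<>();
        SynchronousQueue<String> toClient = new SynchronousQueue<>();

        Thread server = new Thread(() -> {
            try {
                Integer request = toServer.take();        // blocks until a client hands a request over
                toClient.put("processed-" + request);     // blocks until the client takes the reply
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        server.start();

        try {
            toServer.put(42);                  // client side: blocks until the server takes it
            String reply = toClient.take();    // wait for the object coming back
            server.join();
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With several client handler threads, each client would need its own reply queue (or some other way to route the reply), otherwise a reply could be taken by the wrong client.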
As far as making objects accessible in Java, there's no difference between multi-thread and single-thread. You just follow the scope rules (public, private, protected), and that's it. Multiple threads all run within the same process, so there isn't any special thread-only scope rules to know about.
For example, define a method where you pass the object in, and make that method accessible from the other thread. The object you want to pass around simply needs to be accessible from the other thread's scope.
As far as thread-safety, you can synchronize your writes, and for the most part, that will take care of things. Thread safety can get a bit hairy the more complicated your code, but I think this will get you started.
One method for processing objects, and producing result objects is to have a shared array or LinkedList that acts as a queue of objects, containing the objects to be processed, and the resulting objects from that processing. It's hard to go into much more detail than that without more specifics on what exactly you're trying to do, but most shared access to objects between threads comes down to either inter-thread method calls, or some shared collection/queue of objects.
Unless you are absolutely certain that it will always be only a single object at a time, use some sort of Queue.
If you are certain that it will always be only a single object at a time, use some sort of Queue anyway. :-)
Use a concurrent queue from the java.util.concurrent.*.
why? Almost guaranteed to provide better general performance than anything hand-rolled.
recommendation: use a bounded queue and you will get back-pressure for free.
note: the depth of queue determines your general latency characteristics: shallower queues will have lower latencies at the cost of reduced bandwidth.
Use Future semantics
why? Futures provide a proven and standard means of getting asynchronous result.
recommendation: create a simple Request class and expose a method #getFutureResponse(). The implementation of this method can use a variety of signaling strategies, such as Lock, flag (using Atomic/CAS), etc.
note: use of timeout semantics in Future will allow you to link server behavior to your server SLA e.g. #getFutureResponse(sla_timeout_ms).
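The two suggestions above can be combined in a few lines. This is a sketch under assumptions, not a definitive design: the Request class, its complete method, the queue capacity of 8 and the "reply:" format are made up, and CompletableFuture is just one of the possible signaling strategies behind getFutureResponse.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class RequestQueueDemo {
    // Hypothetical request type: the payload plus a future that the server completes.
    static final class Request {
        final String payload;
        private final CompletableFuture<String> response = new CompletableFuture<>();

        Request(String payload) { this.payload = payload; }

        // Blocks the calling client thread until the reply arrives or the timeout expires.
        String getFutureResponse(long timeoutMs)
                throws InterruptedException, ExecutionException, TimeoutException {
            return response.get(timeoutMs, TimeUnit.MILLISECONDS);
        }

        void complete(String reply) { response.complete(reply); }
    }

    /** Sends one request through a bounded queue and returns the server's reply. */
    static String run(String payload) {
        BlockingQueue<Request> queue = new ArrayBlockingQueue<>(8); // bounded: put() blocks when full
        Thread server = new Thread(() -> {
            try {
                while (true) {
                    Request r = queue.take();
                    r.complete("reply:" + r.payload);   // hand the result back to the sender
                }
            } catch (InterruptedException e) {
                // interrupted: stop serving
            }
        });
        server.setDaemon(true);
        server.start();
        try {
            Request request = new Request(payload);
            queue.put(request);
            return request.getFutureResponse(1000);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            server.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("hello"));
    }
}
```

Because each Request carries its own future, the reply always reaches the thread that sent it, no matter how many client handler threads share the queue.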
A book tip for if you want to dive a bit more into communication between threads (or processes, or systems): Pattern-Oriented Software Architecture Volume 2: Patterns for Concurrent and Networked Objects
Just use simple dependency injection.
class MyFirstThread extends Thread {
    // Called from the second thread to hand data over.
    public void setData(Object o) {...}
}
class MySecondThread extends Thread {
    MyFirstThread callback;
    MySecondThread(MyFirstThread callback) { this.callback = callback; }
}
// Inject the first thread into the second one, then start both.
MyFirstThread t1 = new MyFirstThread();
MySecondThread t2 = new MySecondThread(t1);
t1.start();
t2.start();
You can now do callback.setData(...) in your second thread.
I find this to be the safest way. Other solutions involve using volatile or some kind of shared object which I think is an overkill.
You may also want to use BlockingQueue and pass both of those to each thread. If you plan to have more than one thread then it is probably a better solution.
Here's my thinking:
Even though an HTTP request cycle is essentially handled by a 'single thread', each time an HTTP request is processed for that same session it is likely to be processed by a different thread from the thread pool.
Without the volatile keyword being used on a domain model object, whose lifecycle extends across multiple HTTP requests for the same session, then, according to my understanding, isn't it possible that the attribute could be thread local cached (an optimization by the compiler) in the thread that serviced the first HTTP request? If the second HTTP request is serviced by another thread then that second thread may not see the changes in that attribute that were made by the first thread.
Does this spell "Danger Will Robinson"? Or am I missing a vital plot point about the use (or not) of the volatile keyword?
I think you are forgetting that the threads handling the HTTP request first need to retrieve the instance of the domain model object from the HttpSession provided by your application server. The thread handling request 2 in the scenario you describe does not already have an instance of this domain model - it has to retrieve it from the session implementation at the start of handling each and every request.
I think it is completely reasonable to assume that the session-handling implementation in your application server is handling session data in such a way that memory model visibility issues are avoided. Apache Tomcat's default (non-clustered) HttpSession implementation, for example, stores the session attributes in a ConcurrentHashMap.
Adding volatile seems completely unnecessary to me. I have never seen this done for domain model objects handled by HTTP requests in a Servlet environment in any project I have worked in.
This would be a different story if thread-1 and thread-2 had references to the same object instance simultaneously while processing two different requests, and you were concerned about changes in one thread being visible to the other while each is processing its request, but this does not sound like what you are asking about.
Yes, if you are sharing an object between different threads, you may have race conditions. Without a happens-before relationship, writes made by one thread may not be seen by a read in another thread.
Doing a volatile write in one thread and a volatile read of the same field in another thread establishes a happens-before relationship between the two threads and ensures visibility of the write.
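A small sketch of that rule, with invented names (VolatilePublishDemo, data, ready): the plain field is written before the volatile flag, so a thread that reads the flag as true is also guaranteed to see the plain write.

```java
class VolatilePublishDemo {
    static int data;                 // plain, non-volatile field
    static volatile boolean ready;   // volatile flag used to publish the write above

    /** Returns the value of data observed after the volatile flag was seen as true. */
    static int run() {
        data = 0;
        ready = false;
        Thread writer = new Thread(() -> {
            data = 42;      // ordinary write
            ready = true;   // volatile write: everything written before it becomes visible with it
        });
        writer.start();
        while (!ready) {    // volatile read: loops until the writer's volatile write is seen
            Thread.yield();
        }
        int seen = data;    // guaranteed to be 42 by the happens-before edge
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Without volatile on ready, the loop would have no such guarantee and might never observe the update at all.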
This is a complicated problem; simply using the volatile keyword is probably not a good solution.
I think your understanding of it is correct. Given your description I would say it should be used. If it's something more than a primitive type, I would rather synchronize.
Good information on volatile:
http://www.javamex.com/tutorials/synchronization_volatile_when.shtml
If you have a mutable object in session, that is trouble. But usually the solution is not to guard individual fields; rather the entire object should be swapped.
Say you have the user object in the session. Most requests simply retrieve it, read it and display it.
There is a request that can modify user information. It would be a really bad idea to retrieve the user object and modify it in place. It's better to create a completely new user object and insert it into the session.
In that case, fields in User don't need any protection; thread safety is guaranteed by session setAttribute() - getAttribute()
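Roughly what I mean, as a sketch: a ConcurrentHashMap stands in for the session attributes here (a real servlet would call getAttribute and setAttribute on the HttpSession), and the User class with its withEmail method is made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SessionSwapDemo {
    // Immutable value object: all fields are final and there are no setters.
    static final class User {
        final String name;
        final String email;

        User(String name, String email) {
            this.name = name;
            this.email = email;
        }

        // "Modification" builds a complete new object instead of changing this one.
        User withEmail(String newEmail) {
            return new User(name, newEmail);
        }
    }

    /** Returns { email on the old object, email on the object now stored in the session }. */
    static String[] run() {
        // Stand-in for the session attribute store; an assumption for this sketch.
        Map<String, Object> session = new ConcurrentHashMap<>();
        session.put("user", new User("alice", "old@example.com"));

        User before = (User) session.get("user");                   // most requests only read
        session.put("user", before.withEmail("new@example.com"));   // the update swaps the whole object

        User after = (User) session.get("user");
        return new String[] {before.email, after.email};
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " -> " + r[1]);
    }
}
```

A request that already holds the old User keeps seeing a consistent snapshot; later requests see the new one.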
If you have concurrency issues, just adding 'volatile' probably won't help you.
As for keeping the object as an attribute of Session, I'd recommend keeping just the object's ID and using it to retrieve a 'live' instance when you need it (if you use Hibernate, successive retrieves will return the same object, so this shouldn't cause performance problems). Encapsulate all modification logic for this specific object in a single façade, and do the concurrency control there, using database locking.
Or, if you really, really, really want to use memory-based locking, and are really sure that you'll never have two instances of the application running in a cluster, make sure that your façade logic is synchronized at the right level. If your synchronization is too fine grained (low-level operations, such as volatile variables), it probably won't be enough to make your code thread-safe. For example, java.util.Hashtable is fully synchronized, but it doesn't mean anything if you have logic like this:
01 if (!hashtable.containsKey(key)) {
02 hashtable.put(key, calculate(key));
03 }
If two threads, say, t1 and t2, hit this block at the same time, t1 may execute line 01, then t2 may also execute 01, and then 02, and t1 then will execute 02, overwriting what t2 had done. The operations containsKey() and put() are atomic individually, but what should be atomic is the whole block.
Sometimes recalculating a value doesn't matter, but sometimes it does, and it will break.
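One way to make the whole check-then-put block atomic, sketched with invented names (CheckThenActDemo, calculate, the "key" entry): swap the Hashtable for a ConcurrentHashMap and use computeIfAbsent, which performs the check and the insertion as a single atomic step. With a Hashtable, wrapping both calls in synchronized (hashtable) { ... } would be the other option.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class CheckThenActDemo {
    static final AtomicInteger calculations = new AtomicInteger();

    // Hypothetical expensive calculation; counts how often it actually runs.
    static int calculate(String key) {
        calculations.incrementAndGet();
        return key.length();
    }

    /** Lets the given number of threads race on the same key; returns how often calculate ran. */
    static int run(int threads) {
        calculations.set(0);
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Integer> task = () -> {
                    start.await();  // line all threads up so they hit the map together
                    // Check and insert as one atomic operation.
                    return cache.computeIfAbsent("key", CheckThenActDemo::calculate);
                };
                results.add(pool.submit(task));
            }
            start.countDown();
            for (Future<Integer> f : results) {
                f.get();
            }
            return calculations.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("calculate ran " + run(8) + " time(s)");
    }
}
```

Note that this only fixes this particular block; the point about synchronizing at the right level still applies to any logic that spans several operations.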
When it comes to concurrency, there's no magic. I mean, some crappy frameworks try to sell you the idea that they solve this problem for you. They don't. Even if it works 99% of the time, it will break spectacularly when you go to production and start to get heavy traffic. Or (much, much) worse, it will silently generate wrong results.
Concurrency is one of the most complex problems in programming, and the only way to handle it is to avoid it. The whole functional programming trend is not about dealing with concurrency; it is about avoiding it altogether.
It turns out that volatile was not needed in the end. The problem that "appeared" to be fixed with volatile was actually a very subtle timing sensitive bug that was fixed in a much more elegant and proper way ;)
So sbrigdes was correct when he said "simply using a volatile keyword is probably not a good solution."
My question is about threads being queued. For my example I have one Spring context. I have a method named CalculateTax in a stateless class. A request comes in, a thread is created (tA) and it eventually enters the CalculateTax method. Within the same "time frame" another request comes in and another thread is created (tB). Now, here is what I want to understand. AFAIK tB cannot execute CalculateTax until tA has exited the method. Is this true?
As long as CalculateTax only uses local variables (i.e. declared in the method), you will not have any thread sync issues and multiple threads can call the method without a problem.
However, if for some reason CalculateTax uses variables defined at the class level and you are using the Singleton pattern (you tagged your question with "singleton", so I guess you are), you may have thread sync issues.
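To illustrate, here is a sketch with a made-up TaxService whose calculateTax has no class-level state. The CyclicBarrier parameter is purely a demo device and not something a real method would take: it only trips when two threads are inside the method at the same moment, so the call would time out if tB really had to wait for tA to leave.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class StatelessTaxDemo {
    // Hypothetical stateless service: no fields, only parameters and local variables.
    static class TaxService {
        long calculateTax(long amountCents, int ratePercent, CyclicBarrier bothInside) throws Exception {
            // Demo only: waits until a second thread is inside this method too.
            bothInside.await(5, TimeUnit.SECONDS);
            long tax = amountCents * ratePercent / 100;  // local variable, private to this call
            return tax;
        }
    }

    /** Runs two calls on one shared instance at the same time; returns both results. */
    static long[] run() {
        TaxService service = new TaxService();           // one shared instance, as with a singleton bean
        CyclicBarrier bothInside = new CyclicBarrier(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Long> a = () -> service.calculateTax(10_000, 20, bothInside);
            Callable<Long> b = () -> service.calculateTax(20_000, 20, bothInside);
            Future<Long> tA = pool.submit(a);
            Future<Long> tB = pool.submit(b);
            return new long[] {tA.get(), tB.get()};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println(r[0] + " and " + r[1]);
    }
}
```

Marking calculateTax as synchronized would serialize the two calls, and the barrier in this demo would then time out instead of tripping.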
No, it is not true. If they are parallel threads, each thread has its own execution stack, so tB should be able to execute while tA is executing.
This is what Threads are for.
Generally speaking, the answer is undefined. If your 'request' comes from a remote client, the answer depends on the implementation details of the mechanism used to expose the service.
However, I'm not aware of any remote communication framework that actually makes the proxy serialize the requests, i.e. this is assumed to be addressed by the developer of the target service (e.g. it's your task to provide thread safety in the service implementation OR to serialize all requests using explicit synchronization, etc.).