How to configure Guice modules in parallel? - java

Background: My codebase has a lot of modules and, according to some profiling, creating the Guice injector takes a significant amount of time. This is almost certainly due to the massive amount of modules and the existence of a few modules that take a long time to configure. In theory I can produce 2+ lists of modules that can be configured separately.
Is there any way to parallelize the configuration of these modules?
For example, if there's a way to combine or merge two Guice injectors I could create them in separate threads then join them afterwards.

You might want to use the Concurrent Singleton support in Netflix's Governator library; it lets lazily created singletons be instantiated concurrently.
https://github.com/Netflix/governator/wiki/Concurrent-Singleton
Guice’s default Singleton scope synchronizes all object creation on a
single lock (see here). It does this to avoid deadlocks with circular
dependencies. Governator adds the FineGrainedLazySingleton annotation
that locks on the Guice Key so that multiple singletons can be created
concurrently. Circular dependencies are rare so
FineGrainedLazySingleton risks deadlocks in those situations for the
benefit of better concurrency.
A class annotated with FineGrainedLazySingleton will be:
- lazily created (like Lazy Singleton)
- created by the FineGrainedLazySingletonScope, which synchronizes on the Guice Key (instead of InternalInjectorCreator.class)
- able to be created alongside other FineGrainedLazySingleton classes in different threads
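The per-key locking idea described in the quote can be sketched in plain Java. This is only an illustration of the technique, not Governator's or Guice's actual code: the class name PerKeySingletons and the use of a String as the key are assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for a singleton scope that locks per key
// instead of on one global lock. Not Governator's implementation.
final class PerKeySingletons {
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> instances = new ConcurrentHashMap<>();

    // Returns the single instance for the key, creating it on first use.
    // The factory must not return null.
    @SuppressWarnings("unchecked")
    <T> T get(String key, Supplier<T> factory) {
        Object existing = instances.get(key);
        if (existing != null) {
            return (T) existing;
        }
        // One lock object per key: creating "a" does not block creating "b".
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            existing = instances.get(key);
            if (existing == null) {
                existing = factory.get();
                instances.put(key, existing);
            }
            return (T) existing;
        }
    }

    public static void main(String[] args) {
        PerKeySingletons scope = new PerKeySingletons();
        Object first = scope.get("service", Object::new);
        Object second = scope.get("service", Object::new);
        System.out.println(first == second);
    }
}
```

As the quote points out, the trade-off is that two factories that depend on each other in a cycle can deadlock, which a single global lock would avoid.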

Related

Dedicated Spring Container vs Spring Scope

What are pros and cons of implementing a custom scope vs using a dedicated child container for an independent job execution?
I'm implementing a complex business functionality that needs a couple of dedicated beans. So far I managed fine to create objects with custom factories or object providers in prototype scope per execution and keep others stateless as singleton beans, but as complexity rises, I want to get a better container support for my job executions.
I did a project some time ago where I implemented a custom scope. We used a thread scope that is inherited by child threads, so we only needed to ensure that the job was executed in its own thread graph.
My last project used Spring Batch for batch job execution. In the Spring Batch docs I read the recommendation to use a child container per job once things start to get complex. Additionally, it provides JobScope and StepScope.
I would be really interested in performance statistics and in experience reports about the complexity of each approach.
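For readers unfamiliar with the inherited thread scope mentioned above, here is a minimal plain-Java sketch of the idea using InheritableThreadLocal. It is not Spring's Scope interface; the class name InheritedJobScope and its methods are hypothetical and only show how child threads can see the beans of the thread that started them.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical job scope: beans live in a map that child threads inherit.
final class InheritedJobScope {
    private static final InheritableThreadLocal<Map<String, Object>> BEANS =
            new InheritableThreadLocal<>();

    // Opens a fresh scope on the current thread (the job's root thread).
    static void begin() {
        BEANS.set(new ConcurrentHashMap<>());
    }

    // Closes the scope on the current thread.
    static void end() {
        BEANS.remove();
    }

    // Returns the scoped bean, creating it on first access within the job.
    static Object get(String name, Supplier<Object> factory) {
        Map<String, Object> beans = BEANS.get();
        if (beans == null) {
            throw new IllegalStateException("no job scope active on this thread");
        }
        return beans.computeIfAbsent(name, n -> factory.get());
    }

    // Demo: a thread started inside the scope resolves the same bean instance.
    static boolean childSeesParentBean() {
        begin();
        try {
            Object parentBean = get("context", Object::new);
            Object[] seen = new Object[1];
            Thread child = new Thread(() -> seen[0] = get("context", Object::new));
            child.start();
            child.join();
            return seen[0] == parentBean;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            end();
        }
    }

    public static void main(String[] args) {
        System.out.println(childSeesParentBean());
    }
}
```

One caveat of this approach: the value is copied when a thread is constructed, so threads taken from a pre-existing pool do not pick up the scope of the job that submits work to them.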

Singleton or Instance Caches?

Although this question concerns EhCache primarily, it really applies to caching frameworks (and plain old caching) in general.
EhCache allows you to create a singleton CacheManager for managing all of the Caches in your application, or it allows you to create "instance" CacheManagers, which means exactly what it sounds like: multiple managers in use throughout your application.
What are the pros/cons of each? At first glance it seems like it would be cleaner to just have a singleton manager.
The only reason I can think of for wanting multiple instance managers is that all caches living inside a CacheManager must share the same configuration: same size, usage strategies, capacities, etc. So if you wanted multiple caches, each configured differently, the singleton CacheManager would not be able to provide the different cache configurations to each cache.
Is this the only criterion for choosing between singleton and instance managers? If not, what are some other considerations? Is there a noticeable performance cost associated with either? Thanks in advance!
What about automated testing? Maybe it can be useful for unit testing if you can enable/disable/configure caching on a smaller level? Just a thought.

How to create a cross-process Singleton class in Java

Is it possible to create a universal Singleton class, such that at any given time only one instance is shared across multiple Java processes?
Multiple Java processes don't share the same virtual machine.
So you would end up with one JVM instance hosting the singleton, then one JVM instance per process which accesses the singleton using Remote Method Invocation, as @Little Bobby Tables suggested.
Anyway consider When is a Singleton not a Singleton:
Multiple Singletons in Two or More Virtual Machines
When copies of the Singleton class run in multiple VMs, an instance is created for each machine. That each VM can hold its own Singleton might seem obvious but, in distributed systems such as those using EJBs, Jini, and RMI, it's not so simple. Since intermediate layers can hide the distributed technologies, to tell where an object is really instantiated may be difficult.
For example, only the EJB container decides how and when to create EJB objects or to recycle existing ones. The EJB may exist in a different VM from the code that calls it. Moreover, a single EJB can be instantiated simultaneously in several VMs. For a stateless session bean, multiple calls to what appears, to your code, to be one instance could actually be calls to different instances on different VMs. Even an entity EJB can be saved through a persistence mechanism between calls, so that you have no idea what instance answers your method calls. (The primary key that is part of the entity bean spec is needed precisely because referential identity is of no use in identifying the bean.)
The EJB containers' ability to spread the identity of a single EJB instance across multiple VMs causes confusion if you try to write a Singleton in the context of an EJB. The instance fields of the Singleton will not be globally unique. Because several VMs are involved for what appears to be the same object, several Singleton objects might be brought into existence.
Systems based on distributed technologies such as EJB, RMI, and Jini should avoid Singletons that hold state. Singletons that do not hold state but simply control access to resources are also not appropriate for EJBs, since resource management is the role of the EJB container. However, in other distributed systems, Singleton objects that control resources may be used on the understanding that they are not unique in the distributed system, just in the particular VM.
Yes, but not without external facilities. The simplest way is to use RMI. Other options include CORBA or Web Services - Just google it up.
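A minimal sketch of the RMI approach, under some assumptions: the names SharedCounter, SharedCounterImpl and RmiSingletonDemo are hypothetical, and the demo calls the stub from the same JVM only as a stand-in for clients in other processes. In a real setup the hosting JVM would typically bind the stub in an RMI registry and the other processes would look it up there.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Remote view of the one shared instance (hypothetical interface).
interface SharedCounter extends Remote {
    int incrementAndGet() throws RemoteException;
}

// The actual object; it lives only in the hosting JVM.
final class SharedCounterImpl implements SharedCounter {
    private int value;

    @Override
    public synchronized int incrementAndGet() {
        return ++value;
    }
}

final class RmiSingletonDemo {
    // Exports the instance, calls it twice through the RMI stub, then unexports it.
    static int callTwiceThroughStub() {
        // Assumption for the demo: force the loopback address for the stub.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        SharedCounterImpl host = new SharedCounterImpl();
        try {
            // Port 0 lets RMI pick an anonymous port.
            SharedCounter stub = (SharedCounter) UnicastRemoteObject.exportObject(host, 0);
            try {
                stub.incrementAndGet();
                return stub.incrementAndGet();
            } finally {
                // Unexport so the RMI threads do not keep the JVM alive.
                UnicastRemoteObject.unexportObject(host, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callTwiceThroughStub());
    }
}
```

Every caller holds only a stub, so all state changes land on the single SharedCounterImpl in the hosting JVM. If that JVM goes down, so does the singleton, which is one of the costs of this design.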

Stateless Session Beans vs. Singleton Session Beans

The Java EE 6 Tutorial says:
To improve performance, you might choose a stateless session bean if it has any of these traits:
The bean’s state has no data for a specific client.
In a single method invocation, the bean performs a generic task for all clients. For example, you might use a stateless session bean to send an email that confirms an online order.
The bean implements a web service.
Singleton session beans are appropriate in the following circumstances:
State needs to be shared across the application.
A single enterprise bean needs to be accessed by multiple threads concurrently.
The application needs an enterprise bean to perform tasks upon application startup and shutdown.
The bean implements a web service.
But what to use if:
no state has to be shared across the application
a single enterprise bean could be accessed by multiple threads concurrently
no tasks on startup or shutdown need to be performed
Say for example I have a login service with the following interface:
public interface LoginService {
    boolean authenticate(String user, String password);
}
Should it be annotated with @Singleton or @Stateless? What are the benefits of the one and the other? What if LoginService needs to get injected an EntityManager (which would be used concurrently)?
Addition: I'm thinking about the Java EE counterpart of Spring service beans, which are stateless singletons. If I understand correctly, the Java EE counterpart is the @Stateless session bean, while @Singleton beans are used to configure the application at startup, to clean up at shutdown, or to hold application-wide objects. Is this correct?
I would go for Stateless - the server can generate many instances of the bean and process incoming requests in parallel.
Singleton sounds like a potential bottleneck - the default @Lock value is @Lock(WRITE), but it may be changed to @Lock(READ) for the bean or for individual methods.
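To make the difference between the two lock types concrete, here is a plain-Java analogy using ReentrantReadWriteLock. It is not container code and uses no EJB annotations; the class LockedLoginService and its register method are hypothetical, and the plain-text password map is for illustration only. A write lock serializes all callers, much like @Lock(WRITE), while a read lock lets many callers proceed at once, much like @Lock(READ).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical single shared service instance guarded by a read/write lock.
final class LockedLoginService {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> passwords = new HashMap<>();

    // Comparable to @Lock(WRITE): exclusive, blocks every other caller.
    void register(String user, String password) {
        lock.writeLock().lock();
        try {
            passwords.put(user, password);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Comparable to @Lock(READ): many threads may authenticate concurrently.
    boolean authenticate(String user, String password) {
        lock.readLock().lock();
        try {
            return password != null && password.equals(passwords.get(user));
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedLoginService service = new LockedLoginService();
        service.register("alice", "secret");
        System.out.println(service.authenticate("alice", "secret"));
    }
}
```

If every method were left on the exclusive lock, concurrent logins would queue up behind each other, which is the bottleneck described above.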
According to the EJB 3.1 spec, page 110, chapter 4.8.5 "Singleton Concurrency":
It is legal to store Java EE objects that do not support concurrent access (e.g. Entity Managers, Stateful Session Bean references) within Singleton bean instance state. However, it is the responsibility of the Bean Developer to ensure such objects are not accessed by more than one thread at a time.
Furthermore, according to the Hibernate EntityManager documentation:
An EntityManager is an inexpensive, non-threadsafe object that should be used once, for a single business process, a single unit of work, and then discarded.
For me, this means that you should never inject an EntityManager into a singleton EJB. I would use a singleton EJB as a replacement for a stateless EJB only if EVERYTHING I need to implement in this class supports concurrency without the need for additional locking / synchronization. As you or other programmers might lose sight of this issue sooner or later, I personally prefer not to use singleton EJBs except for startup-related issues or features that can be implemented as self-contained units, independently of other beans. In that sense, it doesn't seem advisable to inject, for example, Stateless EJBs into Singletons. Doing so raises the question of when the container actually performs the injection of the SLSB into the Singleton. According to the EJB 3.1 spec, chapter 4.8, dependency injection is done before the singleton bean instance can be accessed by clients. So the singleton would apparently stick to the same instance of the SLSB, which implicitly becomes a singleton as well, but there doesn't seem to be any guarantee for that. At least I couldn't find anything in the specs, so the behavior might be unpredictable or, in the best case, container-specific, which is not what most people will want.
Thus, I would only inject Singletons into Singletons or Singletons into SLSBs, but not vice versa. For the injection of a Singleton into a Singleton, the spec lets you define the dependencies between the singletons so that the container can initialize them in the correct order (see the EJB 3.1 spec, chapter 4.8.1, concerning the @DependsOn annotation).
@Stateless will allow you to have multiple copies ready for processing within a JVM (as many as memory and pool size allow), whereas with @Singleton there is only one copy in a JVM, even if that single instance can support multiple concurrent threads running against it.
In terms of performance @Singleton would be better, provided that the resources it uses allow long-running access. However, in a distributed environment bad things sometimes occur, e.g. database or network links may fail.
With a @Stateless bean, the access is more short-lived. In addition, should there be a failure, it will just respawn and try to establish a new connection to the resource. If something like that happens to a singleton, then it's up to the singleton to handle it without requiring an application restart, because @PostConstruct is only called once per JVM.
I would prefer a bit of fault tolerance vs performance for most situations especially on systems I have no control over.
I think a Singleton under concurrent use will not perform worse than an SLSB pool; it might even perform better. The only problem is that if you want to share something between threads you need to lock it, and that can be a big performance problem. In that case an SLSB pool performs much better, because it is not a true singleton: there are several instances, so when one is locked another one takes over. However, if the lock is on a resource shared by all SLSBs, the pool won't help either.
In short, I think a singleton is better than an SLSB pool, and you should use it if you can. It's also the default scope for Spring beans.
I'm not a JavaEE expert, that's just my feeling, please correct me if I'm wrong.
I think you should use a Singleton session bean, because a login service should be a global service and it does not need to store any state for a specific user or invocation.
If you're sure you're not sharing state between threads, then a Singleton will be fine, in which case you should also annotate the class with @ConcurrencyManagement(ConcurrencyManagementType.BEAN), which allows multiple threads to run at the same time.
You should go for a Singleton if you have a resource that remains constant across the application, such as data loaded from a file or reference data that does not change during the application lifecycle. Otherwise, go for an SLSB. The drawback of an SLSB is that multiple objects are created, so more memory is occupied.
IMHO I would answer like this:
"no state has to be shared across the application" leads me to a stateless bean because of the sentence "To improve performance, you might choose a stateless session bean...".
Considering "a single enterprise bean could be accessed by multiple threads concurrently", you would have to use a singleton. If I understand correctly, it is not even possible to access a single stateless bean instance concurrently when it is used properly.
"no tasks on startup or shutdown need to be performed" would not matter to me. If tasks have to be done to properly set up a bean, then they have to be invoked by a @PostConstruct method.
I would therefore answer your question with @Singleton, since you asked for concurrent access. Of course, you will have to manually synchronize access to any further resources (which are not EJBs).

Do you really need stateless session beans in this case?

We have a project with a pretty considerable number of EJB 2 stateless session beans which were created quite a long time ago. These are not the first-line beans which are accessed from our client via RMI, rather they are used by that code to perform specific functions. However, I've come to believe that there's nothing to be gained by having them as session beans at all.
- They do not need to be accessed via RMI.
- They do not retain any state; they are just code that was factored out of the first set of beans to reduce their complexity.
- They don't have multiple different implementations which we are swapping out; each one has been as it was for years (barring bug fixes and feature additions).
- None of them alter the transaction that comes into them from the bean calling them (that is, they don't require a new transaction, decline to participate in the existing one, or otherwise change things).
Why should these not all just be classes with a couple of static functions and no EJB trappings at all?
The only reason I can see is for clustering purposes (if you are doing clustering). That is, the hand-off to those beans could go to another VM on another machine if clustering is being done right to spread the load around.
That is likely not the case, and the move to EJBs was just over-engineering. I'm suffering with that too.
Even transactions aren't really enough to justify it: you can have a single EJB that handles the transactions and calls the different code through it via a Command-type pattern.
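A rough sketch of that Command-style arrangement in plain Java, with the transaction handling reduced to a log so that it runs without a container. All names here (Command, TransactionalExecutor, PriceCalculator) are hypothetical; in a real application the executor would be the one remaining session bean and the container would manage the transaction.

```java
import java.util.ArrayList;
import java.util.List;

// A unit of work that the single transactional entry point can run.
interface Command<T> {
    T execute();
}

// Stand-in for the one bean that owns the transaction boundary.
final class TransactionalExecutor {
    final List<String> log = new ArrayList<>();

    <T> T run(Command<T> command) {
        log.add("begin");
        try {
            T result = command.execute();
            log.add("commit");
            return result;
        } catch (RuntimeException e) {
            log.add("rollback");
            throw e;
        }
    }
}

// Former helper bean reduced to a plain class with a static function.
final class PriceCalculator {
    static int totalCents(int unitCents, int quantity) {
        return unitCents * quantity;
    }

    public static void main(String[] args) {
        TransactionalExecutor executor = new TransactionalExecutor();
        int total = executor.run(() -> PriceCalculator.totalCents(250, 4));
        System.out.println(total + " " + executor.log);
    }
}
```

The helper code no longer needs any EJB trappings, and it can be unit tested directly without a container.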
There seems to be no reason why they shouldn't just be simple POJOs rather than stateless session beans. I think this is the conclusion that people came to after using EJB 1.x in this manner as well.
It's also the reason why frameworks such as Spring exist as an alternative to EJBs.
I'd say change them over to be just standard POJOs, but make sure you have a safety net of unit and functional tests (which might be a little bit harder with EJBs) to help you.
