Any reason to initialize Entity properties with synchronized Collections? - java

For JPA-Entities in a project I work on, properties of type List or Map are always initialized to the synchronized implementations Vector and Hashtable.
(Normally, the unsynchronized ArrayList and HashMap are the standard implementations in Java, unless synchronization is really needed.)
Does anyone know a reason why synchronized Collections would be needed? We use EclipseLink.
When I asked about it, nobody knew why it was done like that. It seems it was always done like this. Maybe this was needed for an old version of EclipseLink?
I'm asking for two reasons:
I would prefer to use the standard implementations ArrayList and HashMap like anywhere else, if that's safe.
There's no matching synchronized Set implementation in the JDK. At least not a serializable one as EclipseLink expects.
Example Entity:
@Entity
public class Person {
    ...
    @ManyToMany(cascade=CascadeType.ALL)
    @JoinTable( ... )
    private List<Role> accessRoles;

    @ElementCollection
    @CollectionTable( ... )
    @MapKeyColumn(name="KEY")
    @Column(name="VALUE")
    private Map<String, String> attrs;

    public Person() {
        // Why Vector/Hashtable instead of ArrayList/HashMap?
        accessRoles = new Vector<Role>();
        attrs = new Hashtable<String, String>();
    }

    public List<Role> getAccessRoles() {
        return accessRoles;
    }

    public void setAccessRoles(List<Role> accessRoles) {
        this.accessRoles = accessRoles;
    }

    public Map<String, String> getAttrs() {
        return attrs;
    }

    public void setAttrs(Map<String, String> attrs) {
        this.attrs = attrs;
    }
}

There's usually no need for a Vector and an ArrayList is more commonly used. So if your current codebase is full of Vectors, this is a bit of a code smell and it is wise to make sure your team members know what the difference is. See also What are the differences between ArrayList and Vector? and Why is Java Vector class considered obsolete or deprecated?
That does not mean you should do the Big Cleanup and replace all the Vectors in your existing code with ArrayLists.
Your code uses Lists and you won't notice a single difference when programming.
The only advantage to be expected is increased performance.
It is also hard to tell whether any of your code depends on the synchronization provided by the Vectors.
So, unless you are currently suffering performance issues, or are explicitly (re)designing the synchronization of your entire codebase, you risk introducing hard to fix concurrency bugs without any benefits.
Also, be aware that performance suffers most significantly from the use of Vectors when multiple threads access your collections concurrently. So if you are suffering from performance loss and decide to replace the Vectors for that reason, you'll need to be very careful to keep access sufficiently synchronized.
EDIT: You ask about EclipseLink JPA specifically.
It'd be rather surprising if they demanded you use Vectors and Hashtables, since that would mean asking you to rely on obsolete data structures.
In their examples, they use ArrayLists and HashMaps so from that we may conclude that this is indeed not the case.
Diving a bit more specifically into the source code, we can see that their CollectionContainerPolicy uses the Collection interface and does not care about the implementation of your collections. It does, however, surprisingly have special cases for when your internal collection class is Vector. See for instance buildContainerFromVector. And its default container class is Vector, though you can alter that.
See also the documentation for the Container policy.
The most intrusive moment where EclipseLink and your Lists meet is when you're lazy loading collections. EclipseLink will replace the collection with its own IndirectList, which internally uses a Vector. See What collections does jpa return? So in those cases, EclipseLink will give you a Vector anyway(!) and it does not even matter what collection you specify in the collection's initialization.
So EclipseLink indeed has a preference for using Vectors, and using Vectors with EclipseLink means less copying of object references from one collection to the other.

Much of the internals of EclipseLink date back to a time when Vector and Hashtable were the standard collection types in Java. EclipseLink was TopLink back then, which originated from a persistence framework for Smalltalk - so much of EclipseLink's code is actually older than Java itself, so to speak.
I worked with TopLink for many years, and their standard mappings for collection properties always used Vector and Hashtable.
To me, the only reasonable explanation for Vector and Hashtable still appearing in EclipseLink is that it has been working like this for a long time and - because it is working - hitherto no one has gotten around to changing it.
For myself, I wouldn't ever use Vector or Hashtable again. If I need a synchronized collection, I'd rather use the Collections.synchronizedList(...), Collections.synchronizedMap(...) etc. wrappers.
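For what it's worth, here is a minimal, self-contained sketch (class and variable names are mine, not from the project in question) of what that looks like with the JDK wrappers:
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SynchronizedCollectionsDemo {
    public static void main(String[] args) {
        // Synchronized views over the standard implementations; a drop-in
        // alternative to Vector and Hashtable when synchronization is really needed.
        List<String> roles = Collections.synchronizedList(new ArrayList<String>());
        Map<String, String> attrs = Collections.synchronizedMap(new HashMap<String, String>());

        // Unlike Vector/Hashtable, there is also a matching Set wrapper
        // (serializable as long as the backing set is serializable).
        Set<String> tags = Collections.synchronizedSet(new HashSet<String>());

        roles.add("admin");
        attrs.put("lang", "en");
        tags.add("jpa");
        System.out.println(roles + " " + attrs + " " + tags);
    }
}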
Just my 2 ct.

Going through the EclipseLink code base, it looks like the usage of Vector is inherited from the older code base and is, much like the Vector class itself, legacy.
Somehow the intent was to use Vector to allow multiple threads to act safely on the relationships which are loaded lazily - "indirection" in EclipseLink parlance.
(More on the concepts here- the different types of indirection discussed being ValueHolder indirection, Transparent Indirection, Proxy indirection etc.)
However, typically the entities and their relationships are not shared among multiple threads in the usual use-cases. Each thread gets its own copy of an entity and its relationships if accessed in its own unit of work.
In the case of ValueHolder indirection, one of the implementations of ValueHolderInterface is ValueHolder, which is typically initialized with a Vector. The relevant part of the code is below, along with the code comments as they appear in the source. The comments are interesting as well.
IndirectList.java
...
/**
 * INTERNAL:
 * Return the valueHolder.
 * This method used to be synchronized, which caused deadlock.
 */
public ValueHolderInterface getValueHolder() {
    // PERF: lazy initialize value holder and vector as are normally set after creation.
    if (valueHolder == null) {
        synchronized(this) {
            if (valueHolder == null) {
                valueHolder = new ValueHolder(new Vector(this.initialCapacity, this.capacityIncrement));
            }
        }
    }
    return valueHolder;
}
...
Also, there were a few issues reported due to the usage of Vector, as mentioned here and here.

You don't need synchronized collections for JPA itself. Synchronization is only relevant for your business logic, and you would know if your business logic needed it; I suppose it doesn't here.
So basically the suggestion is to use unsynchronized collections, which will also increase performance.

As @flup answered with some interesting references, I can only make some additional presumptions:
The team that developed the code and/or wrote the specifications simply was unaware of the Collections API.
The team wanted to use the code in a highly concurrent environment (either in your web application, for example passing some entities to other threads, or in another desktop application, as JPA is not limited to web applications only). Also do note that IndirectSet is not thread-safe, meaning that if the team wanted to write thread-safe code, they should have taken some additional measures anyway (if they use Sets)!


Why factory methods for Collections produce immutable instances? [duplicate]

I am unable to get what are the scenarios where we need an immutable class.
Have you ever faced any such requirement? or can you please give us any real example where we should use this pattern.
The other answers seem too focused on explaining why immutability is good. It is very good and I use it whenever possible. However, that is not your question. I'll take your question point by point to try to make sure you're getting the answers and examples you need.
I am unable to get what are the scenarios where we need an immutable class.
"Need" is a relative term here. Immutable classes are a design pattern that, like any paradigm/pattern/tool, is there to make constructing software easier. Similarly, plenty of code was written before the OO paradigm came along, but count me among the programmers that "need" OO. Immutable classes, like OO, aren't strictly needed, but I going to act like I need them.
Have you ever faced any such requirement?
If you aren't looking at the objects in the problem domain with the right perspective, you may not see a requirement for an immutable object. It might be easy to think that a problem domain doesn't require any immutable classes if you're not familiar with when to use them advantageously.
I often use immutable classes where I think of a given object in my problem domain as a value or fixed instance. This notion is sometimes dependent on perspective or viewpoint, but ideally, it will be easy to switch into the right perspective to identify good candidate objects.
You can get a better sense of where immutable objects are really useful (if not strictly necessary) by making sure you read up on various books/online articles to develop a good sense of how to think about immutable classes. One good article to get you started is Java theory and practice: To mutate or not to mutate?
I'll try to give a couple of examples below of how one can see objects in different perspectives (mutable vs immutable) to clarify what I mean by perspective.
... can you please give us any real example where we should use this pattern.
Since you asked for real examples I'll give you some, but first, let's start with some classic examples.
Classic Value Objects
Strings and integers are often thought of as values. Therefore it's not surprising to find that String class and the Integer wrapper class (as well as the other wrapper classes) are immutable in Java. A color is usually thought of as a value, thus the immutable Color class.
Counterexample
In contrast, a car is not usually thought of as a value object. Modeling a car usually means creating a class that has changing state (odometer, speed, fuel level, etc). However, there are some domains where a car may be a value object. For example, a car (or specifically a car model) might be thought of as a value object in an app to look up the proper motor oil for a given vehicle.
Playing Cards
Ever write a playing card program? I did. I could have represented a playing card as a mutable object with a mutable suit and rank. A draw-poker hand could be 5 fixed instances where replacing the 5th card in my hand would mean mutating the 5th playing card instance into a new card by changing its suit and rank ivars.
However, I tend to think of a playing card as an immutable object that has a fixed, unchanging suit and rank once created. My draw-poker hand would be 5 instances, and replacing a card in my hand would involve discarding one of those instances and adding a new random instance to my hand.
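As an illustration of that perspective, here is a minimal sketch of an immutable playing card (the class is invented for the example, not taken from any particular program):
// A minimal immutable playing card: suit and rank are fixed at construction.
public final class PlayingCard {
    public enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    private final Suit suit;
    private final int rank; // 2..14, where 11..14 stand for J, Q, K, A

    public PlayingCard(Suit suit, int rank) {
        if (rank < 2 || rank > 14) {
            throw new IllegalArgumentException("rank must be between 2 and 14");
        }
        this.suit = suit;
        this.rank = rank;
    }

    public Suit getSuit() { return suit; }
    public int getRank() { return rank; }

    // "Replacing" a card in a hand means referencing a different instance,
    // never mutating this one.
    @Override
    public String toString() { return rank + " of " + suit; }
}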
Map Projection
One last example is when I worked on some map code where the map could display itself in various projections. The original code had the map use a fixed, but mutatable projection instance (like the mutable playing card above). Changing the map projection meant mutating the map's projection instance's ivars (projection type, center point, zoom, etc).
However, I felt the design was simpler if I thought of a projection as an immutable value or fixed instance. Changing the map projection meant having the map reference a different projection instance rather than mutating the map's fixed projection instance. This also made it simpler to capture named projections such as MERCATOR_WORLD_VIEW.
Immutable classes are in general much simpler to design, implement and use correctly. An example is String: the implementation of java.lang.String is significantly simpler than that of std::string in C++, mostly due to its immutability.
One particular area where immutability makes an especially big difference is concurrency: immutable objects can safely be shared among multiple threads, whereas mutable objects must be made thread-safe via careful design and implementation - usually this is far from a trivial task.
Update: Effective Java 2nd Edition tackles this issue in detail - see Item 15: Minimize mutability.
See also these related posts:
non-technical benefits of having string-type immutable
Downsides to immutable objects in Java?
Effective Java by Joshua Bloch outlines several reasons to write immutable classes:
Simplicity - each class is in one state only
Thread Safe - because the state cannot be changed, no synchronization is required
Writing in an immutable style can lead to more robust code. Imagine if Strings weren't immutable; any getter method that returned a String would require the implementation to create a defensive copy before the String was returned - otherwise a client may accidentally or maliciously break the state of the object.
In general it is good practice to make an object immutable unless there are severe performance problems as a result. In such circumstances, mutable builder objects can be used to build immutable objects, e.g. StringBuilder.
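To make the builder remark concrete, here is a hedged sketch of the mutable-builder/immutable-product idiom; the Account class and its fields are invented for the example:
// Immutable product: all fields final, no setters.
public final class Account {
    private final String owner;
    private final long balanceCents;

    private Account(Builder b) {
        this.owner = b.owner;
        this.balanceCents = b.balanceCents;
    }

    public String getOwner() { return owner; }
    public long getBalanceCents() { return balanceCents; }

    // Mutable builder, used only while assembling the object.
    public static final class Builder {
        private String owner;
        private long balanceCents;

        public Builder owner(String owner) { this.owner = owner; return this; }
        public Builder balanceCents(long cents) { this.balanceCents = cents; return this; }
        public Account build() { return new Account(this); }
    }
}
Usage would then be something like new Account.Builder().owner("Alice").balanceCents(1000).build(), and the resulting Account can be shared freely because nothing can change it afterwards.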
HashMaps are a classic example. It's imperative that the key to a map be immutable. If the key is not immutable, and you change a value on the key such that hashCode() would result in a new value, the map is now broken (the key is now in the wrong location in the hash table).
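A small, self-contained demonstration of that failure mode (the MutableKey class is made up for the example):
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {
    static final class MutableKey {
        int id;
        MutableKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).id == id;
        }
        @Override public int hashCode() { return Objects.hash(id); }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");

        key.id = 2; // hashCode() changes -> the entry now sits in the "wrong" bucket
        System.out.println(map.get(key));                // null: lookup goes to the new bucket
        System.out.println(map.get(new MutableKey(1)));  // also null: the entry is effectively lost
    }
}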
In Java, practically everything is handled by reference. Sometimes an instance is referenced multiple times. If you change such an instance, the change is reflected in all of its references. Sometimes you simply don't want that, in order to improve robustness and thread safety. Then an immutable class is useful, because one is forced to create a new instance and reassign it to the current reference. This way the original instance seen through the other references remains untouched.
Imagine how Java would look like if String was mutable.
Let's take an extreme case: integer constants. If I write a statement like "x=x+1" I want to be 100% confident that the number "1" will not somehow become 2, no matter what happens anywhere else in the program.
Now okay, integer constants are not a class, but the concept is the same. Suppose I write:
String customerId=getCustomerId();
String customerName=getCustomerName(customerId);
String customerBalance=getCustomerBalance(customerId);
Looks simple enough. But if Strings were not immutable, then I would have to consider the possibility that getCustomerName could change customerId, so that when I call getCustomerBalance, I am getting the balance for a different customer. Now you might say, "Why in the world would someone writing a getCustomerName function make it change the id? That would make no sense." But that's exactly where you could get in trouble. The person writing the above code might take it as just obvious that the functions would not change the parameter. Then someone comes along who has to modify another use of that function to handle the case where a customer has multiple accounts under the same name. And he says, "Oh, here's this handy getCustomerName function that's already looking up the name. I'll just make that automatically change the id to the next account with the same name, and put it in a loop ..." And then your program starts mysteriously not working. Would that be bad coding style? Probably. But it's precisely a problem in cases where the side effect is NOT obvious.
Immutability simply means that a certain class of objects are constants, and we can treat them as constants.
(Of course the user could assign a different "constant object" to a variable. Someone can write
String s="hello";
and then later write
s="goodbye";
Unless I make the variable final, I can't be sure that it's not being changed within my own block of code. Just like integer constants assure me that "1" is always the same number, but not that "x=1" will never be changed by writing "x=2". But I can be confident that if I have a handle to an immutable object, no function I pass it to can change it on me, and that if I make two copies of it, a change to the variable holding one copy will not change the other. Etc.)
We don't need immutable classes, per se, but they can certainly make some programming tasks easier, especially when multiple threads are involved. You don't have to perform any locking to access an immutable object, and any facts that you've already established about such an object will continue to be true in the future.
There are various reason for immutability:
Thread Safety: Immutable objects cannot be changed, nor can their internal state change, thus there's no need to synchronise them.
It also guarantees that whatever I send (through a network, for example) arrives in the same state as it was sent. It means that nobody (an eavesdropper) can come and add random data to my immutable set.
It's also simpler to develop. You can guarantee that no (mutable) subclasses will exist if an object is immutable, e.g. the String class.
So, if you want to send data through a network service and want a guarantee that the result will be exactly the same as what you sent, make it immutable.
My 2 cents for future visitors:
2 scenarios where immutable objects are good choices are:
In multi-threading
Concurrency issues in a multi-threaded environment can very well be solved by synchronization, but synchronization is a costly affair (I won't dig into "why" here). With immutable objects there is no need for synchronization to solve concurrency issues, because the state of immutable objects cannot be changed, and if the state cannot be changed then all threads can seamlessly access the object. So immutable objects make a great choice for shared objects in a multi-threaded environment.
As key for hash based collections
One of the most important things to note when working with a hash-based collection is that the key's hashCode() should always return the same value for the lifetime of the object, because if that value changes then an old entry made in the hash-based collection using that object cannot be retrieved, which would cause a memory leak. Since the state of immutable objects cannot be changed, they make a great choice as keys in hash-based collections. So if you are using an immutable object as a key for a hash-based collection, you can be sure that there will not be any memory leak because of that (of course there can still be a memory leak when the object used as the key is not referenced from anywhere else, but that's not the point here).
I'm going to attack this from a different perspective. I find immutable objects make life easier for me when reading code.
If I have a mutable object I am never sure what its value is if it's ever used outside of my immediate scope. Let's say I create MyMutableObject in a method's local variables, fill it out with values, then pass it to five other methods. ANY ONE of those methods can change my object's state, so one of two things has to occur:
I have to keep track of the bodies of five additional methods while thinking about my code's logic.
I have to make five wasteful defensive copies of my object to ensure that the right values get passed to each method.
The first makes reasoning about my code difficult. The second makes my code suck in performance -- I'm basically mimicking an immutable object with copy-on-write semantics anyway, but doing it all the time whether or not the called methods actually modify my object's state.
If I instead use MyImmutableObject, I can be assured that what I set is what the values will be for the life of my method. There's no "spooky action at a distance" that will change it out from under me and there's no need for me to make defensive copies of my object before invoking the five other methods. If the other methods want to change things for their purposes they have to make the copy – but they only do this if they really have to make a copy (as opposed to my doing it before each and every external method call). I spare myself the mental resources of keeping track of methods which may not even be in my current source file, and I spare the system the overhead of endlessly making unnecessary defensive copies just in case.
(If I go outside of the Java world and into, say, the C++ world, among others, I can get even trickier. I can make the objects appear as if they're mutable, but behind the scenes make them transparently clone on any kind of state change—that's copy-on-write—with nobody being the wiser.)
Immutable objects are instances whose state does not change once they have been created.
The use of such objects is requirement specific.
An immutable class is good for caching purposes, and it is thread-safe.
By virtue of immutability you can be sure that the behavior/state of the underlying immutable object does not change; with that you get the added advantage of performing additional operations:
You can use multiple cores/processors (concurrent/parallel processing) with ease (as the sequence of operations no longer matters).
You can cache expensive operations (as you are sure of getting the same result).
You can debug with ease (as the history of a run is no longer a concern).
Using the final keyword doesn't necessarily make something immutable:
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class Scratchpad {
    public static void main(String[] args) throws Exception {
        SomeData sd = new SomeData("foo");
        System.out.println(sd.data); // prints "foo"

        voodoo(sd, "data", "bar");
        System.out.println(sd.data); // prints "bar"
    }

    private static void voodoo(Object obj, String fieldName, Object value) throws Exception {
        Field f = obj.getClass().getDeclaredField(fieldName);
        f.setAccessible(true);

        // Clear the FINAL flag so the field can be written via reflection.
        // (Note: this "modifiers" hack is blocked on newer JDKs.)
        Field modifiers = Field.class.getDeclaredField("modifiers");
        modifiers.setAccessible(true);
        modifiers.setInt(f, f.getModifiers() & ~Modifier.FINAL);

        f.set(obj, value);
    }
}

class SomeData {
    final String data;

    SomeData(String data) {
        this.data = data;
    }
}
Just an example to demonstrate that the "final" keyword is there to prevent programmer error, and not much more. Whereas reassigning a value lacking a final keyword can easily happen by accident, going to this length to change a value would have to be done intentionally. It's there for documentation and to prevent programmer error.
Immutable data structures can also help when coding recursive algorithms. For example, say that you're trying to solve a 3SAT problem. One way is to do the following:
Pick an unassigned variable.
Give it the value of TRUE. Simplify the instance by taking out clauses that are now satisfied, and recur to solve the simpler instance.
If the recursion on the TRUE case failed, then assign that variable FALSE instead. Simplify this new instance, and recur to solve it.
If you have a mutable structure to represent the problem, then when you simplify the instance in the TRUE branch, you'll either have to:
Keep track of all changes you make, and undo them all once you realize the problem can't be solved. This has large overhead because your recursion can go pretty deep, and it's tricky to code.
Make a copy of the instance, and then modify the copy. This will be slow because if your recursion is a few dozen levels deep, you'll have to make many many copies of the instance.
However if you code it in a clever way, you can have an immutable structure, where any operation returns an updated (but still immutable) version of the problem (similar to String.replace - it doesn't replace the string, just gives you a new one). The naive way to implement this is to have the "immutable" structure just copy and make a new one on any modification, reducing it to the 2nd solution when having a mutable one, with all that overhead, but you can do it in a more efficient way.
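As a hedged sketch of that "clever way" (not taken from any real solver), here is a minimal persistent list whose update operation returns a new version while sharing the unchanged tail with the original:
// Minimal persistent (immutable) list: "removing" an element returns a new
// list that shares the untouched tail with the original.
public final class PersistentList<T> {
    private final T head;                 // null only for the empty list
    private final PersistentList<T> tail; // null only for the empty list

    private PersistentList(T head, PersistentList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    public static <T> PersistentList<T> empty() { return new PersistentList<>(null, null); }

    public boolean isEmpty() { return head == null && tail == null; }

    public PersistentList<T> prepend(T value) { return new PersistentList<>(value, this); }

    // Returns a new list without the first occurrence of value; the original is untouched.
    public PersistentList<T> without(T value) {
        if (isEmpty()) return this;
        if (head.equals(value)) return tail;
        PersistentList<T> newTail = tail.without(value);
        return newTail == tail ? this : new PersistentList<>(head, newTail);
    }

    @Override public String toString() {
        return isEmpty() ? "[]" : head + " :: " + tail;
    }
}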
One of the reasons for the "need" for immutable classes is the combination of passing everything by reference and having no support for read-only views of an object (i.e. C++'s const).
Consider the simple case of a class having support for the observer pattern:
class Person {
    public String getName() { ... }
    public void registerForNameChange(NameChangedObserver o) { ... }
}
If string were not immutable, it would be impossible for the Person class to implement registerForNameChange() correctly, because someone could write the following, effectively modifying the person's name without triggering any notification.
void foo(Person p) {
    p.getName().prepend("Mr. ");
}
In C++, getName() returning a const std::string& has the effect of returning by reference and preventing access to mutators, meaning immutable classes are not necessary in that context.
They also give us a guarantee. The guarantee of immutability means that we can expand on them and create new patterns for efficiency that are otherwise not possible.
http://en.wikipedia.org/wiki/Singleton_pattern
One feature of immutable classes which hasn't yet been called out: storing a reference to a deeply-immutable class object is an efficient means of storing all of the state contained therein. Suppose I have a mutable object which uses a deeply-immutable object to hold 50K worth of state information. Suppose, further, that I wish, on 25 occasions, to make a "copy" of my original (mutable) object (e.g. for an "undo" buffer); the state could change between copy operations, but usually doesn't. Making a "copy" of the mutable object would simply require copying a reference to its immutable state, so 25 copies would simply amount to 25 references. By contrast, if the state were held in 50K worth of mutable objects, each of the 25 copy operations would have to produce its own copy of 50K worth of data; holding all 25 copies would require holding over a meg worth of mostly-duplicated data. Even though the first copy operation would produce a copy of the data that will never change, and the other 24 operations could in theory simply refer back to that, in most implementations there would be no way for the second object asking for a copy of the information to know that an immutable copy already exists(*).
(*) One pattern that can sometimes be useful is for mutable objects to have two fields to hold their state--one in mutable form and one in immutable form. Objects can be copied as mutable or immutable, and would begin life with one or the other reference set. As soon as the object wants to change its state, it copies the immutable reference to the mutable one (if it hasn't been done already) and invalidates the immutable one. When the object is copied as immutable, if its immutable reference isn't set, an immutable copy will be created and the immutable reference pointed to that. This approach will require a few more copy operations than would a "full-fledged copy on write" (e.g. asking to copy an object which has been mutated since the last copy would require a copy operation, even if the original object is never again mutated) but it avoids the threading complexities that FFCOW would entail.
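A hedged sketch of that two-field pattern (the StateHolder class is invented, and a List<String> stands in for the 50K of state; this is one reading of the pattern, not code from any library):
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class StateHolder {
    private List<String> mutableState;   // set while this object owns a private, mutable copy
    private List<String> immutableState; // set while the state is captured as a shared snapshot

    public StateHolder() {
        this.mutableState = new ArrayList<>();
    }

    private StateHolder(List<String> sharedSnapshot) {
        this.immutableState = sharedSnapshot;
    }

    public void add(String value) {
        if (mutableState == null) {
            // Only a shared snapshot exists: take a private mutable copy before changing anything.
            mutableState = new ArrayList<>(immutableState);
        }
        immutableState = null; // the old snapshot no longer reflects this object's state
        mutableState.add(value);
    }

    // A cheap "copy": only materializes an immutable snapshot when none exists yet,
    // so repeated copies of an unchanged object share a single snapshot.
    public StateHolder copyAsImmutable() {
        if (immutableState == null) {
            immutableState = Collections.unmodifiableList(new ArrayList<>(mutableState));
        }
        return new StateHolder(immutableState);
    }
}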
Why Immutable class?
Once an object is instantiated, its state cannot be changed during its lifetime. This also makes it thread-safe.
Examples :
Obviously String, Integer, BigDecimal, etc. Once these values are created, they cannot be changed during their lifetime.
Use-case :
Once a database connection object is created with its configuration values, you might not need to change its state; there you can use an immutable class.
From Effective Java:
An immutable class is simply a class whose instances cannot be modified. All of the information contained in each instance is provided when it is created and is fixed for the lifetime of the object. The Java platform libraries contain many immutable classes, including String, the boxed primitive classes, and BigInteger and BigDecimal. There are many good reasons for this: Immutable classes are easier to design, implement and use than mutable classes. They are less prone to error and are more secure.
An immutable class is good for caching purposes because you don't have to worry about the value changes. Another benefit of an immutable class is that it is inherently thread-safe, so you don't have to worry about thread safety in case of a multi-threaded environment.

How can I reuse collections that would use the same backing iterator?

I'm fairly new to Java so my knowledge is pretty limited. I'm working on a personal project where I'm trying out some of the techniques used in Guava for creating views/transformations of collections. I made a class called View to take an inputted collection as the backing iterable, and a transformation, and then present it as a read-only iterable. (not a collection, though I don't think it makes much of a difference for this question). Here is a quick example of using it...
public class Node {
    public enum Change implements Function<Node, Coordinate> {
        TO_COORDINATE;

        @Override public Coordinate apply(Node node) {
            return new Coordinate(node);
        }
    }

    private HashSet<Node> neighborNodes = new HashSet<Node>();
    //various other members

    public View<Coordinate> viewNeighborCoordinates() {
        return new View<Coordinate>(neighborNodes, Change.TO_COORDINATE);
    }
}
Now if some method wants to use viewNeighborCoordinates() of this node, and later some other method also wants viewNeighborCoordinates() of this node, it seems wasteful to always be returning new objects, right? I mean, any number of things should be able to share a reference to a view of the same backing iterable with the same transformation, since all they're doing is reading through it. Is there an established way of managing a shared pool of objects which can be "interned" like Strings are? Is it just a matter of making some sort of ViewFactory that stores a running list of views in use, and every time someone wants a view, it checks to see if it already has that view and hands it out? (Is that even more efficient?)
As already stated, interning is possible (look at Interners), but most probably a bad idea.
Another possibility is lazy initialization of a field storing the View. Since I'm lazy as well, I only point you to a Lombok implementation. Be careful with DCL, if you want to try this. In case your class is immutable, you may need no synchronization at all, like e.g. String.hashCode.
A very simple possibility is eager initialization of a field. Assuming you need the view often, it's the best way.
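For illustration, a minimal sketch of the eager-initialization variant, reusing the Node, View, Coordinate and Change names from the question (View is assumed to behave as described there, so this is not runnable on its own):
import java.util.HashSet;

public class Node {
    private final HashSet<Node> neighborNodes = new HashSet<Node>();

    // Created once per Node and handed out to every caller; this is safe because
    // the view is read-only and always reads through the current neighborNodes.
    private final View<Coordinate> neighborCoordinates =
            new View<Coordinate>(neighborNodes, Change.TO_COORDINATE);

    public View<Coordinate> viewNeighborCoordinates() {
        return neighborCoordinates;
    }
}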
But without knowing more, your current implementation is best. Beware the root of all evil.
Don't optimize without profiling or benchmarking (and if you benchmark, then do it right, i.e., using caliper or jmh. Home-baked benchmarking in Java just doesn't work).

When to use lazy values in Scala?

Why does Scala introduce lazy values? Shouldn't it be managed by the JVM (invisibly to the user) how a value is initialized? What is the real-world use case in which it is worth giving the control into the developer's hands and defining values as lazy?
The by-name parameters: one of the primary motivations was to support DSLs. They allow you to have a really nice syntax in APIs that almost feels as if it were built into the language. For example, you can very easily define your own custom repeat-loop:
def repeat(body: => Unit)(until: => Boolean): Unit = {
  body
  if (until) {} else repeat(body)(until)
}
And then use it as if it were a part of the language.
var i = 0
repeat {
  println(i)
  i += 1
} (i < 3)
Or you could similarly spawn a new thread like this: spawn { println("on the new thread!") }, or you could do automatic resource management of your FileInputStreams like this: withFile("/home/john/.bashrc") { println(_.contents) }.
The lazy values - the motivations here are:
lazy data structures like Streams, which are popular in functional languages, and which you can use to implement efficient data structures à la Okasaki's functional queues.
to avoid allocating or initializing some expensive resources if they're never used in some object, e.g. file handles or database connections.
to initialize object fields in the correct order, for objects composed of many mixins.
to achieve a correct "initialize only once" semantics when there are many threads sharing a single value (see introduction here).
to have a translation scheme for nested singleton objects:
class A { object B }
becomes something like:
class A {
  class A$B$
  lazy val B = new A$B$
}
One common scenario is when the writer of a class does not know whether an expensive-to-initialize val will be used. In this case, the val is initialized on demand.
Another scenario is to organically control sequencing of initialization. Often an object is created long before a particular val can be initialized, because other classes haven't been initialized yet. In this case, laziness provides a convenient way for this sequencing to occur naturally, without the author coming up with a Master Plan that sequences a complex, multiphase initialization.
TL;DR: because it would freak users out, and for performance reasons.
Most of today's languages are eager. Some are not, and those are called lazy. While many programming problems can be expressed in a beautiful and concise way through lazy evaluation, I don't think having absolute laziness is a good idea. From a subjective perspective, programmers are used to thinking in an eager way (especially those who come from imperative lands), so a naively written program in, say, Haskell may confuse you a lot. Having only forks for every possible dish is not as good as having a choice between fork and spoon, and although Scala supports lazy evaluation at the language level, it defaults to an eager model. The reason (besides the personal choice of Martin and the other language designers) is interop between Java and Scala: it would be a nightmare to compose these two worlds in one language. Moreover, at the time Scala was designed the JVM was not yet there to support such features, and more or less performant lazy vals were made possible only with the introduction of method handles in Java 7 (just two years ago, whereas Scala has been around for a decade).
I will answer my own question. One use case where lazy values are extremely useful is when you want to create an immutable data structure with cycles. This is not easily possible without laziness, because otherwise you would have to modify an object that has already been created, which is not possible if you want your objects to be immutable. Let me use the simple cycle implementation below as an example.
So in Scala you could implement this in the following way
class Node(inNode: => Node) { lazy val in = inNode }
lazy val node :Node = new Node(new Node(node))
This way you created an immutable cycle. You can verify the result by comparing the references.
scala> node.in
res3: Node = Node#2d928643
scala> node.in.in
res4: Node = Node#3a5ed7a6
scala> node
res5: Node = Node#3a5ed7a6

Giving a class member a reference to another class's members

On a scale of one to ten, how bad is the following from a perspective of safe programming practices? And if you find it worse than a five, what would you do instead?
My goal below is to get the data in the List of Maps in B into A. In this case, to me, it is ok if it is either a copy of the data or a reference to the original data. I found the approach below fastest, but I have a queasy feeling about it.
public class A {
    private List<Map<String, String>> _list = null;

    public A(B b) {
        _list = b.getList();
    }
}

public class B {
    private List<Map<String, String>> _list = new ArrayList<Map<String, String>>();

    public List<Map<String, String>> getList() {
        // Put some data in _list just for the sake of this example...
        _list.add(new HashMap<String, String>());
        return _list;
    }
}
The underlying problem is a bit more complex:
From a security perspective, this is very, very bad.
From a performance perspective, this is very, very good.
From a testing perspective, it's good because there is nothing in the class that you can't easily reach from a test
From an encapsulation perspective, it's bad since you expose the inner state of your class.
From a coding safety perspective, it's bad because someone will eventually abuse this for some "neat" trick that will cause odd errors elsewhere and you will waste a lot of time to debug this.
From an API perspective, it can be either: It's hard to imagine an API to be more simple but at the same time, it doesn't communicate your intent and things will break badly if you ever need to change the underlying data structure.
When designing software, you need to keep all of these points in the back of your mind. With time, you will get a feeling for which kinds of errors you make and how to avoid them. Computers being as dumb and slow as they are, there is never a perfect solution. You can just strive to make it as good as you can make it at the time you write it.
If you want to code defensively, you should always copy any data that you get or expose. Of course, if "data" is your whole data model, then you simply can't copy everything each time you call a method.
Solutions to this deadlock:
Use immutables as often as you can. Immutables and value objects are created and never change after that. These are always safe and the performance is OK unless the creation is very expensive. Lazy creation would help here but that is usually its own can of worms. Guava offers a comprehensive set of collections which can't be changed after creation.
Don't rely too much on Collections.unmodifiable* because the backing collection can still change.
Use copy-on-write data structures. The problem above would go away if the underlying list would clone itself as soon as A or B start to change it. That would give each its own copy, effectively isolating them from each other. Unfortunately, Java doesn't have support for these built in.
In this case, to me, it is ok if it is either a copy of the data or a reference to the original data.
That is the sticking point.
Passing the object instance around is the fastest, but allows the caller to change it, and also makes later changes visible (there is no snapshot).
Usually, that is not a problem, since the caller is not malicious (but you may want to protect against coding errors).
If you do not want the caller to make changes, you could wrap it into an immutable wrapper.
If you need a snapshot, you can clone the list.
Either way, this will only snapshot/protect the list itself, not its individual elements. If those are mutable, the same reasoning applies again.
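As a sketch of those two options (snapshot vs. read-only wrapper), reusing the A and B classes from the question:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class A {
    private final List<Map<String, String>> _list;

    public A(B b) {
        // Option 1: snapshot - later changes made through B are not visible here.
        // (This copies the list, not the contained maps; see the caveat above.)
        _list = new ArrayList<Map<String, String>>(b.getList());

        // Option 2 (alternative): share the data but keep it read-only inside A:
        // _list = Collections.unmodifiableList(b.getList());
    }
}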
I would say that you will have to choose between efficiency and encapsulation. By directly exposing a member of the class, its state can be changed from the outside. That might be unexpected and lead to nasty surprises. I would also say that it increases the coupling between the two classes.
An alternative is to let the information expert principle decide and leave the job to the class that has the information. You will have to judge whether the work that was supposed to be done with class A really is the responsibility of class B.
But really, speed and clean code can be conflicting interests. Sometimes you just have to play dirty to get it quick enough.
All you're creating is a reference to B._list. So 10 if you wanted to copy the items.
You could iterate over all b._list items and add them to the A._list manually:
public A(B b) {
    _list = new ArrayList<Map<String, String>>();
    for (Map<String, String> map : b.getList()) {
        Map<String, String> newMap = new HashMap<String, String>();
        for (String key : map.keySet()) {
            newMap.put(key, map.get(key));
        }
        _list.add(newMap);
    }
}

Might EnumMap be considered a reasonable alternative to Java beans?

Curious if anybody has considered using EnumMap in place of Java beans, particularly "value objects" (with no behavior)? To me it seems that one advantage would be that the name of a "property" would be directly accessible from the backing Enum, with no need for reflection, and therefore I'd assume it would be faster.
It may be a little faster than using reflection (I didn't measure it, and didn't find any metrics on Google either); however, there are big disadvantages to this approach:
You're losing type safety. Instead of int getAge() and String getName() everything is Object get(MyEnum.FIELD_NAME). That'll provide for some ugly code and run-time errors right there.
All the javabean niceties we've come to love and enjoy (for example, property-level annotations) are gone.
Since you can have NO BEHAVIOR AT ALL, the applicability of this approach seems rather limited.
The bottom line is - if you really truly need that alleged :-) boost in performance (which you'll have to measure to prove it exists) this may be a viable approach under very specific circumstances. Is it a viable alternative to javabeans at large? Most certainly not.
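A small, self-contained sketch (class and enum names are invented) of the type-safety point made above:
import java.util.EnumMap;
import java.util.Map;

public class EnumMapVsBeanDemo {

    enum PersonField { NAME, AGE }

    // Bean style: the compiler knows and checks the types.
    static final class PersonBean {
        private String name;
        private int age;
        String getName() { return name; }
        void setName(String name) { this.name = name; }
        int getAge() { return age; }
        void setAge(int age) { this.age = age; }
    }

    public static void main(String[] args) {
        PersonBean bean = new PersonBean();
        bean.setAge(42); // compile-time checked

        // EnumMap style: every value is just an Object, so mistakes only surface at runtime.
        Map<PersonField, Object> person = new EnumMap<>(PersonField.class);
        person.put(PersonField.AGE, "forty-two");      // compiles fine...
        int age = (int) person.get(PersonField.AGE);   // ...but fails here with a ClassCastException
        System.out.println(age);
    }
}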
A bean is meant to be mutable, hence the setter methods. EnumMap is comparable in speed to using a HashMap with integers as the key, but its keys are immutable. Beans and EnumMaps serve two different purposes. If all of the keys are known at design time and are guaranteed to never change, then using an EnumMap will be fine.
Updating a bean is much simpler than changing the backing Enum of the EnumMap with much less chance of creating errors downstream in the code.
I wrote a Record class that maps keys to values and works by delegating to a fully synchronized EnumMap. The idea is that a Record can get new fields at runtime whereas the Bean can't. My conclusion is that with this flexibility comes a performance hit. Here's a run comparing the Record class to a fully synchronized Bean. For 10 million operations:
Record set(Thing, a) 458 ms
Bean setThing(a) 278 ms
Record get(Thing) 398 ms
Bean getThing 248 ms
So, there is something to gain in knowing your data objects and writing a class that models them statically. If you want to have new fields padded on to your data at runtime, it will cost you.
I don't understand how you can remove 'class proliferation' with EnumMaps. Unless you have a generic enum with 20-odd properties to reuse for every 'bean', you're still inventing an enum to use for each enum map, e.g.
public enum PersonDTOEnum {
    A, S, L;
}
as opposed to
class Person {
    int a;
    int s;
    String l;

    // getters + setters elided
}
Not to mention that everything is a String now.
I had not previously specified this, but I am working with a ResultSet. Therefore I want to provide this answer for the sake of completeness.
Commons BeanUtils' "RowSetDynaClass" could be the happy medium between the excessive boilerplate associated with concrete beans and the limitations of EnumMap.
