Use a HashMap to store instance variables?

I would like to create a base class that all classes in my program will extend. One thing I wanted to do was find a uniform way to store all instance variables inside the object.
What I have come up with is to use a HashMap to store the key/value pairs for the object and then expose those values through a get and set method.
The code that I have for this so far is as follows:
package ocaff;
import java.util.HashMap;
public class OcaffObject {
    // Backing store for every "instance variable", keyed by name.
    private HashMap<String, Object> data;
    public OcaffObject() {
        this.data = new HashMap<String, Object>();
    }
    public Object get(String key) {
        return this.data.get(key);
    }
    public void set(String key, Object value) {
        this.data.put(key, value);
    }
}
While functionally this works, I am curious if there are any real issues with this implementation or if there is a better way to do this?
In my day to day work I am a PHP programmer and my goal was to mimic functionality that I used in PHP in Java.

I don't think this is a good way to achieve what you want.
Programming in Java is quite different from programming in PHP, from my point of view.
You need to keep things clean and strongly typed, following the conventions of object-oriented programming.
Several problems with this technique come to mind; here are some, in no particular order.
First problem is performance and memory footprint: every field access becomes a hash lookup and primitive values have to be boxed into objects, so this uses more memory and runs slower than plain fields.
Second problem is concurrency: HashMap is not thread safe.
Third problem is type safety: you don't have type safety anymore. You can write whatever you want to a field and nothing checks it, a real anti-pattern.
Fourth problem is debugging: it will be hard to debug your code.
Fifth problem: anyone who knows a field's name can read and write it.
Sixth problem: when you change the name of a field in the hash map you don't get any kind of compile-time error, you only get strange run-time behavior. Refactoring becomes very difficult.
Typed fields are much more useful and clean.
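To make the type-safety point concrete, here is a minimal sketch. The names `MapBackedPerson` and `TypedPerson` are hypothetical and exist only for this illustration; they are not from the question.

```java
import java.util.HashMap;
import java.util.Map;

class TypeSafetyDemo {
    // Hypothetical map-backed object in the style of the question.
    static class MapBackedPerson {
        private final Map<String, Object> data = new HashMap<>();

        Object get(String key) { return data.get(key); }

        void set(String key, Object value) { data.put(key, value); }
    }

    // The same data as an ordinary typed field.
    static class TypedPerson {
        private int age;

        int getAge() { return age; }

        void setAge(int age) { this.age = age; }
    }

    // Returns true if reading the wrongly typed value fails at run time.
    static boolean mapVersionFailsAtRuntime() {
        MapBackedPerson p = new MapBackedPerson();
        p.set("age", "forty-two");            // compiles: any Object is accepted
        try {
            int age = (Integer) p.get("age"); // the cast is only checked at run time
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mapVersionFailsAtRuntime());

        TypedPerson t = new TypedPerson();
        t.setAge(42);
        // t.setAge("forty-two");             // would be rejected by the compiler
        System.out.println(t.getAge());
    }
}
```

With the typed class the mistake never gets past the compiler; with the map it only shows up when the value is read back and cast.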

If you're taking the time to make a class for this, I would simply add what you need as members of the class. This will give you compile time checking of your class members, greatly reducing your subtle bugs.

Related

Is there anything wrong with replacing class attributes with a HashMap?

Just a theoretical question that could lead to some considerations in terms of design. What if you were to replace POJOs with this reusable class ? It might avoid some boilerplate code but what issues could it bring about ?
import java.util.HashMap;
// Does not include failsafes, guards, defensive copying, whatever...
class MySingleGetterAndSetterClass {
    private HashMap<String, Object> myProperties;
    public MySingleGetterAndSetterClass(String name) {
        myProperties = new HashMap<String, Object>();
        myProperties.put("name", name);
    }
    public Object get(String propertyName) {
        return myProperties.get(propertyName);
    }
    // Returns the previous value of the property, or null if there was none.
    public Object set(String propertyName, Object value) {
        return myProperties.put(propertyName, value);
    }
}

The main disadvantages
much slower
uses more memory
less type safety
more error prone
more difficult to maintain
more code to write/read
more thread safety problems (more ways to break) and more difficult to make thread safe.
harder to debug, note the order of fields can be arranged pseudo randomly, different for different objects of the same "type" making them harder to read.
more difficult to refactor
little or no support in code analysis.
no support in code completion.
BTW Some dynamic languages do exactly what you suggest and they have all these issues.

That would lead to very unstable code. None of your getting/setting would be compile-time checked. Generally you want your code to fail-fast, and compile-time is the absolute fastest that can be done.
To make it even relatively safe you'd have to have null-checks/exception handling all over the place, and then how do you consistently handle the case where the value isn't found, all over your code? It would get very bloated very fast.
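As a small illustration of the "value isn't found" case: `HashMap.get` returns `null` for a missing key, so a mistyped property name only surfaces when the value is used. The key names below are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class MissingKeyDemo {
    // Looks up a misspelled key and reports whether using the result fails.
    static boolean typoCausesNullPointer() {
        Map<String, Object> properties = new HashMap<>();
        properties.put("count", 3);

        Object value = properties.get("cuont"); // typo: no compile-time error, just null
        try {
            int count = (Integer) value;        // unboxing null throws NullPointerException
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(typoCausesNullPointer());
    }
}
```

Every caller would need its own null check to avoid this, which is the bloat described above.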

No compile-time checking.
You have to downcast, which is not good.
Difficult to maintain.
It goes against OOP.
Your POJOs are classes that represent an abstraction of something in the real world.
If I understood correctly, you want to put their properties inside a map. This is not a good design, and it works against OOP. If you think this way, you might as well put all your classes into a single big String and look things up by position; by that reasoning it would be even better than a dictionary keyed by property name.

Is it inefficient to reference a hashmap in another class multiple times?

Class A
import java.util.HashMap;
class A {
    public HashMap<Integer, Double> myHashMap;
    public A() {
        myHashMap = new HashMap<Integer, Double>();
    }
}
Class B
class B {
    private A anInstanceOfA;
    public B(A a) {
        this.anInstanceOfA = a;
    }
    void aMethod() {
        anInstanceOfA.myHashMap.get(1); // getting hashmap value for key = 1
        // proceed to use this value, but instead of storing it to a variable
        // I use anInstanceOfA.myHashMap.get(1) each time I need that value.
    }
}
In aMethod() I use anInstanceOfA.myHashMap.get(1) to get the value for key = 1. I do that multiple times in aMethod() and I'm wondering if there is any difference in efficiency between using anInstanceOfA.myHashMap.get(1) multiple times or just assigning it to a variable and using the assigned variable multiple times.
I.e.:
void aMethod() {
    Double theValue = anInstanceOfA.myHashMap.get(1);
    // proceed to use theValue in my calculations. Is there a difference in efficiency?
}

In theory the JVM can optimise away the difference to be very small (compared to what the rest of the program is doing). However I prefer to make it a local variable as I believe it makes the code clearer (as I can give it a meaningful name)
I suggest you do what you believe is simpler and clearer, unless you have measured a performance difference.

The question seems to be whether it is more expensive to call get(1) multiple times instead of just once.
The answer to this is yes. The question is if it is enough to matter. The definitive answer is to ask the JVM by profiling. You can, however, guess by looking at the get method in your chosen implementation and consider if you want to do all that work every time.
Note, that there is another reason that you might want to put the value in a variable, namely that you can give it a telling name, making your program easier to maintain in the future.

This seems like a micro-optimization, that really doesn't make much difference in the scheme of things.
As @peter already suggested, 'optimizing' for style/readability is a better rationale for choosing the second option over the first one. Optimizing for speed only starts making sense if you really do a lot of calls, or if the call is very expensive -- both are probably not the case in your current example.

Put it in a local variable, for multiple reasons:
It will be much faster. Reading a local variable is definitely cheaper than a HashMap lookup, probably by a factor of 10-100x.
You can give the local variable a good, meaningful name
Your code will probably be shorter / simpler overall, particularly if you use the local variable many times.
You may get bugs during future maintenance if someone modifies one of the get calls but forgets to change the others. This is a problem whenever you are duplicating code. Using a local variable minimises this risk.
In concurrent situations, the value could theoretically change if the HashMap is modified by some other code. You normally want to get the value once and work with the same value. Although if you are running into problems of this nature you should probably be looking at other solutions first (locking, concurrent collections etc.)
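A minimal sketch of the two styles side by side. The method names and the variable name `rate` are invented for the example, and the arithmetic is only a stand-in for "my calculations".

```java
import java.util.HashMap;
import java.util.Map;

class LocalVariableDemo {
    // Repeats the lookup every time the value is needed.
    static double repeatedLookups(Map<Integer, Double> map) {
        return map.get(1) * map.get(1) + map.get(1);
    }

    // Looks the value up once and reuses a named local variable.
    static double singleLookup(Map<Integer, Double> map) {
        double rate = map.get(1);
        return rate * rate + rate;
    }

    public static void main(String[] args) {
        Map<Integer, Double> map = new HashMap<>();
        map.put(1, 2.0);
        System.out.println(repeatedLookups(map)); // 6.0
        System.out.println(singleLookup(map));    // 6.0
    }
}
```

Both return the same result here; the second version states the intent once, and only one line has to change if the key ever does.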

Why make private static final Lists/Sets/Maps unmodifiable?

I just read some code written by a more experienced programmer, and I came across the following:
public class ConsoleFormatter extends Formatter {
    private static final Map<Level, String> PREFIXES;
    static {
        Map<Level, String> prefixes = new HashMap<Level, String>();
        prefixes.put(Level.CONFIG, "[config]");
        prefixes.put(Level.FINE, "[debug]");
        prefixes.put(Level.FINER, "[debug]");
        prefixes.put(Level.FINEST, "[trace]");
        prefixes.put(Level.INFO, "[info]");
        prefixes.put(Level.SEVERE, "[error]");
        prefixes.put(Level.WARNING, "[warning]");
        PREFIXES = Collections.unmodifiableMap(prefixes);
    }
    // ...
}
As you can see, this is a class used for formatting log output. What caught my eye, however, was the code in the static initializer block: PREFIXES = Collections.unmodifiableMap(prefixes);.
Why was PREFIXES made an unmodifiable map? It's a private constant, so there's no risk of modifying the data outside of that class. Was it done to give the constant's immutability a sense of completeness?
Personally, I would've directly initialized PREFIXES as a HashMap and then put the key–value pairs in directly, without creating a dummy, placeholder map or making the field an immutable map. Am I missing something here?

By making the map unmodifiable the author documented his assumption that the values will never change. Whoever might edit that class later on can not only see that assumption, but will also be reminded in case it is ever broken.
This makes sense only when taking the longer-term view. It reduces the risk of new problems arising through maintenance. I like to do this style of programming, because I tend to break stuff even in my own classes. One day you might go in for a quick fix and you forget about an assumption that was made originally and is relevant for correctness. The more you can lock the code down, the better.

If you accidentally return PREFIXES from a method, suddenly any other code out there can modify it. Making constants truly immutable defends against your own stupidity when you modify that code in the future at 3AM.
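A sketch of that scenario, under the assumption that someone later adds an accessor (the hypothetical `getPrefixes` below) that hands out the constant. The map is shortened to two entries for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Level;

class LeakedConstantDemo {
    private static final Map<Level, String> PREFIXES;

    static {
        Map<Level, String> prefixes = new HashMap<>();
        prefixes.put(Level.INFO, "[info]");
        prefixes.put(Level.SEVERE, "[error]");
        PREFIXES = Collections.unmodifiableMap(prefixes);
    }

    // Hypothetical accessor added during later maintenance; it leaks the constant.
    static Map<Level, String> getPrefixes() {
        return PREFIXES;
    }

    // Simulates outside code trying to change the leaked map.
    static boolean outsideCodeCanModify() {
        try {
            getPrefixes().put(Level.FINE, "[hacked]");
            return true;
        } catch (UnsupportedOperationException e) {
            return false; // the unmodifiable wrapper rejects the write
        }
    }

    public static void main(String[] args) {
        System.out.println(outsideCodeCanModify()); // false
        System.out.println(getPrefixes().size());   // 2
    }
}
```

Had PREFIXES been a plain HashMap, the put would have succeeded silently and changed the formatter's behavior for everyone.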

It's surprisingly easy to have a private map, collection or array which is modifiable from outside the class. You'd mark it final, so why wouldn't you also spell out that it is supposed to be immutable?

Suppose your friend leaves his job and a less experienced programmer takes over. The less experienced programmer attempts to modify the contents of PREFIXES somewhere in a different method within the same class. If it's unmodifiable, it won't work (the call throws an UnsupportedOperationException). It's the proper way to say "this is a constant, don't ever change it."

The Map interface does not communicate that you want something to be immutable or unmodifiable.
The following approaches will work in Eclipse Collections.
private static final ImmutableMap<Level, String> PREFIXES = UnifiedMap.<Level, String>newMap()
    .withKeyValue(Level.CONFIG, "[config]")
    .withKeyValue(Level.FINE, "[debug]")
    .withKeyValue(Level.FINER, "[debug]")
    .withKeyValue(Level.FINEST, "[trace]")
    .withKeyValue(Level.INFO, "[info]")
    .withKeyValue(Level.SEVERE, "[error]")
    .withKeyValue(Level.WARNING, "[warning]")
    .toImmutable();
This will create a Map that is contractually immutable, because ImmutableMap has no mutating methods in its API.
If you prefer to keep the Map interface, this approach will work as well.
private static final Map<Level, String> PREFIXES = UnifiedMap.<Level, String>newMap()
    .withKeyValue(Level.CONFIG, "[config]")
    .withKeyValue(Level.FINE, "[debug]")
    .withKeyValue(Level.FINER, "[debug]")
    .withKeyValue(Level.FINEST, "[trace]")
    .withKeyValue(Level.INFO, "[info]")
    .withKeyValue(Level.SEVERE, "[error]")
    .withKeyValue(Level.WARNING, "[warning]")
    .asUnmodifiable();
You should notice there is no need for the static block in either of the cases.
Note: I am a committer for Eclipse Collections.

If a collection reference is declared final, you cannot assign a different collection object to it.
However, it is still possible to add items to or remove items from that object.
When you make it unmodifiable, you cannot add or remove items through that reference either.
Hence, it is advisable to make a constant collection unmodifiable instead of just declaring it final.
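The difference between the two can be sketched as follows. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FinalVersusUnmodifiable {
    // final fixes the reference, not the contents.
    static boolean finalListAcceptsAdd() {
        final List<String> names = new ArrayList<>();
        names.add("a");               // allowed
        // names = new ArrayList<>(); // would not compile: cannot reassign a final variable
        return names.size() == 1;
    }

    // An unmodifiable wrapper rejects changes to the contents.
    static boolean unmodifiableListRejectsAdd() {
        final List<String> names =
                Collections.unmodifiableList(new ArrayList<>(Arrays.asList("a")));
        try {
            names.add("b");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(finalListAcceptsAdd());        // true
        System.out.println(unmodifiableListRejectsAdd()); // true
    }
}
```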

Database information -> object: how should it be done?

My application will upon request retrieve information from a database and produce an object from that information. I'm currently considering two different techniques (but I'm open to others as well!) to complete this task:
Method one:
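import java.sql.ResultSet;
import java.sql.SQLException;
class Book {
    private int id;
    private String author;
    private String title;
    public Book(int id) throws SQLException {
        // getBookFromDatabaseById is assumed to return a ResultSet positioned on the matching row,
        // with column names that match the field names.
        ResultSet book = getBookFromDatabaseById(id);
        this.id = book.getInt("id");
        this.author = book.getString("author");
        // ...
    }
}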
Method two:
public class Book {
    private HashMap<String, Object> propertyContainer;
    public Book(int id) {
        this.propertyContainer = getBookFromDatabaseById(id);
    }
    public Object getProperty(String propertyKey) {
        return this.propertyContainer.get(propertyKey);
    }
}
With method one, I believe it's easier to control, limit and possibly access properties. Adding new properties, however, becomes smoother with method two.
What's the proper way to do this?

I think this problem has been solved in many ways: ORM, DAO, row and table mapper, lots of others. There's no need to redo it again.
One issue you have to think hard about is coupling and cyclic dependencies between packages. You might think you're doing something clever by telling a model object how to persist itself, but one consequence of this design choice is coupling between model objects and the persistence tier. You can't use model objects without persistence if you do this. They really become one big, unwieldy package. There's no layering.
Another choice is to have model objects remain oblivious to whether or not they're persisted. It's a one way dependence that way: persistence knows about model objects, but not the other way around.
Google for those other solutions. There's no need to beat that dead horse again.
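For what the one-way dependence could look like, here is a rough sketch. The `BookRepository` interface and the in-memory implementation are assumptions made up for this example; a real implementation would sit on top of JDBC, an ORM or similar.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class BookExample {
    // Plain model object: typed fields, no knowledge of how it is stored.
    static final class Book {
        private final int id;
        private final String author;
        private final String title;

        Book(int id, String author, String title) {
            this.id = id;
            this.author = author;
            this.title = title;
        }

        int getId() { return id; }

        String getAuthor() { return author; }

        String getTitle() { return title; }
    }

    // Persistence knows about the model, not the other way around.
    interface BookRepository {
        Optional<Book> findById(int id);
    }

    // In-memory stand-in for a database-backed implementation.
    static final class InMemoryBookRepository implements BookRepository {
        private final Map<Integer, Book> rows = new HashMap<>();

        void save(Book book) { rows.put(book.getId(), book); }

        @Override
        public Optional<Book> findById(int id) {
            return Optional.ofNullable(rows.get(id));
        }
    }

    public static void main(String[] args) {
        InMemoryBookRepository repository = new InMemoryBookRepository();
        repository.save(new Book(1, "Jane Doe", "Example Title"));
        System.out.println(repository.findById(1).map(Book::getTitle).orElse("not found"));
        System.out.println(repository.findById(2).isPresent());
    }
}
```

Book can be constructed and tested without any database, and swapping the storage only touches the repository implementation.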

The first method gives you type safety through its accessors, so you know what type of object you are getting back and don't have to cast to the type you are expecting (this becomes more important when the values are anything other than primitives).
For that reason (plus that it will make the resulting code simpler and easier to read) I would pick the first one. In any large application you will also be able to quickly, easily and neatly get parameter values back in the code for debugging etc. within the object itself.
If anyone else is going to work on this code (or you're planning to come back to it after you've forgotten about it), the first one will also help, since the available parameters are visible in the class. The second one will only give you this with extensive Javadoc.

The first one is the classical way. The second one is really tricky for nothing.

Giving a class member a reference to another class's members

On a scale of one to ten, how bad is the following from a perspective of safe programming practices? And if you find it worse than a five, what would you do instead?
My goal below is to get the data in the List of Maps in B into A. In this case, to me, it is ok if it is either a copy of the data or a reference to the original data. I found the approach below fastest, but I have a queasy feeling about it.
public class A {
    private List<Map<String, String>> _list = null;
    public A(B b) {
        _list = b.getList();
    }
}
public class B {
    private List<Map<String, String>> _list = new ArrayList<Map<String, String>>();
    public List<Map<String, String>> getList() {
        // Put some data in _list just for the sake of this example...
        _list.add(new HashMap<String, String>());
        return _list;
    }
}

The underlying problem is a bit more complex:
From a security perspective, this is very, very bad.
From a performance perspective, this is very, very good.
From a testing perspective, it's good because there is nothing in the class that you can't easily reach from a test
From an encapsulation perspective, it's bad since you expose the inner state of your class.
From a coding safety perspective, it's bad because someone will eventually abuse this for some "neat" trick that will cause odd errors elsewhere and you will waste a lot of time to debug this.
From an API perspective, it can be either: It's hard to imagine an API to be more simple but at the same time, it doesn't communicate your intent and things will break badly if you ever need to change the underlying data structure.
When designing software, you need to keep all of these points in the back of your mind. With time, you will get a feeling for which kinds of errors you make and how to avoid them. Computers being as dumb and slow as they are, there is never a perfect solution. You can only strive to make it as good as you can at the time you write it.
If you want to code defensively, you should always copy any data that you get or expose. Of course, if "data" is your whole data model, then you simply can't copy everything each time you call a method.
Solutions to this deadlock:
Use immutables as often as you can. Immutables and value objects are created and never change after that. These are always safe and the performance is OK unless the creation is very expensive. Lazy creation would help here but that is usually its own can of worms. Guava offers a comprehensive set of collections which can't be changed after creation.
Don't rely too much on Collections.unmodifiable* because the backing collection can still change.
Use copy-on-write data structures. The problem above would go away if the underlying list cloned itself as soon as A or B started to change it. That would give each its own copy, effectively isolating them from each other. Unfortunately, Java doesn't have support for these built in.
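The caveat about Collections.unmodifiable* can be seen in a short sketch (class and method names are invented for the example): the wrapper is only a read-only view, so it still reflects changes made through the backing list, whereas wrapping a copy does not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo {
    // Size of an unmodifiable view after the backing list grows.
    static int viewSizeAfterBackingChange() {
        List<String> backing = new ArrayList<>();
        backing.add("a");
        List<String> view = Collections.unmodifiableList(backing);
        backing.add("b");   // the view cannot be written to, but it sees this change
        return view.size();
    }

    // Size of an unmodifiable defensive copy after the original grows.
    static int copySizeAfterOriginalChange() {
        List<String> original = new ArrayList<>();
        original.add("a");
        List<String> copy = Collections.unmodifiableList(new ArrayList<>(original));
        original.add("b");  // the copy has its own backing list
        return copy.size();
    }

    public static void main(String[] args) {
        System.out.println(viewSizeAfterBackingChange());  // 2
        System.out.println(copySizeAfterOriginalChange()); // 1
    }
}
```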

In this case, to me, it is ok if it is either a copy of the data or a reference to the original data.
That is the sticking point.
Passing the object instance around is the fastest, but allows the caller to change it, and also makes later changes visible (there is no snapshot).
Usually, that is not a problem, since the caller is not malicious (but you may want to protect against coding errors).
If you do not want the caller to make changes, you could wrap it into an immutable wrapper.
If you need a snapshot, you can clone the list.
Either way, this will only snapshot/protect the list itself, not its individual elements. If those are mutable, the same reasoning applies again.
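A small sketch of that last point, using the same List-of-Maps shape as the question (the class name and the "title" key are made up): copying the list gives a snapshot of the list, but the maps inside are still shared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShallowCopyDemo {
    // Returns "<snapshot size>:<title seen through the snapshot>".
    static String shallowCopySharesMaps() {
        Map<String, String> row = new HashMap<>();
        row.put("title", "old");
        List<Map<String, String>> original = new ArrayList<>();
        original.add(row);

        List<Map<String, String>> snapshot = new ArrayList<>(original); // copies the list only
        original.add(new HashMap<>()); // not visible in the snapshot
        row.put("title", "new");       // visible: both lists hold the same map

        return snapshot.size() + ":" + snapshot.get(0).get("title");
    }

    public static void main(String[] args) {
        System.out.println(shallowCopySharesMaps()); // 1:new
    }
}
```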

I would say that you will have to choose between efficiency and encapsulation. By directly accessing a member of the class, its state can be changed from outside. That might be unexpected and lead to nasty surprises. I would also say that it increases the coupling between the two classes.
An alternative is to let the information expert principle decide and leave the job to the class that has the information. You will have to judge whether the work that was supposed to be done by class A is really the responsibility of class B.
But really, speed and clean code can be conflicting interests. Sometimes you just have to play dirty to make it fast enough.

All you're creating is a reference to B._list. So 10 if you wanted to copy the items.
You could iterate over all b._list items and add them to the A._list manually:
public A(B b) {
    _list = new ArrayList<Map<String, String>>();
    for (Map<String, String> map : b.getList()) {
        // Copy each map so that A does not share entries with B.
        _list.add(new HashMap<String, String>(map));
    }
}
