Benefits of using Enums over Collections - java

I'm trying to check if a user's account type matches one of several Strings.
There's debate in the office as to whether this should be represented as an enum with each entry containing a different string, or as a Set of Strings. Whilst the Set may be more efficient, an enum may be stylistically superior as it is clearer it is being used for logic flow.
What are the advantages of these two approaches?

Indeed, a Set<String> is more efficient in terms of performance when searching. However, I wouldn't expect that you have thousands of account types, but several, so you won't actually feel the difference when searching. There's one problem with this approach, though - you will be able to add any String to the Set, which is brittle.
My personal preference would be to use an enum, especially if you don't expect more account types to be introduced. And if you have a Set<AccountType>, you'll be restricted in the values you can add (i.e. you will be able to add only account types, not arbitrary values as with a Set<String>). The problem with this approach is the Open/Closed Principle: suppose you have a switch statement over an AccountType variable with all the corresponding cases. Then, if you introduce a new AccountType constant, you must change the switch statement (by adding a new case), which violates the Open/Closed Principle. In this case the neatest design would be to have an abstract class or interface called AccountType, with all the specific account types as subclasses.
So, there are several approaches you can follow, but before picking one, you should try to answer for yourselves the question, "How are we going to use it?"
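To make the Open/Closed point concrete, here is a minimal sketch of the interface-based design. All names (label, canManageUsers, StandardAccount, AdminAccount, AccessPolicy) are hypothetical and not from the original question; the point is only that describe() contains no switch.

```java
// Hypothetical sketch: each account type is its own implementation of an interface.
interface AccountType {
    String label();
    boolean canManageUsers();
}

final class StandardAccount implements AccountType {
    @Override public String label() { return "standard"; }
    @Override public boolean canManageUsers() { return false; }
}

final class AdminAccount implements AccountType {
    @Override public String label() { return "admin"; }
    @Override public boolean canManageUsers() { return true; }
}

final class AccessPolicy {
    private AccessPolicy() {}

    // No switch over the type: a new AccountType implementation
    // can be added without editing this method.
    static String describe(AccountType type) {
        return type.label()
                + (type.canManageUsers() ? " (can manage users)" : " (cannot manage users)");
    }
}
```

The trade-off is that you lose the closed, enumerable list of constants that an enum gives you, so this probably only pays off when new account types really are expected.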

An enum would be better since account types (typically) do not change dynamically. Furthermore, using an enum makes the types more precise - e.g. there's no way to mix up "Hello, World!" with an account type.

Enums are great because you get compile time checking. Invalid values simply won't compile so it 'fails fast'.
A collection of strings is great when you want to add another option without compiling/releasing a new version of your application, for instance when the valid options are configured in a database table.

It is worth noting that an enum's valueOf(String) method is implemented using the Enum.valueOf(Class,String) method, which in turn is implemented using a HashMap.
This basically means that looking up the account type from the string by using AccountTypes.valueOf() is an O(1) operation and quite as efficient as a set operation. You can then use the returned value (the actual enum object) in the program, with full type safety, faster comparisons, and all the other benefits of the enum.
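A small sketch of that lookup, assuming a hypothetical AccountType enum with three constants. Note that valueOf matches the constant name exactly (case-sensitive) and throws for unknown names, so the helper below (parseOrNull, also a made-up name) wraps it.

```java
// Hypothetical enum; the constants are illustrative only.
enum AccountType { STANDARD, PREMIUM, ADMIN }

final class AccountTypeLookup {
    private AccountTypeLookup() {}

    // Returns the matching constant, or null when the name is null or unknown.
    static AccountType parseOrNull(String name) {
        if (name == null) {
            return null; // valueOf(null) would throw NullPointerException
        }
        try {
            return AccountType.valueOf(name); // exact match on the constant name
        } catch (IllegalArgumentException e) {
            return null; // no constant with that name
        }
    }
}
```

Once you hold the enum constant, comparisons can use == rather than String.equals.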

It sounds to me like the problem is that you are using a string to represent data that can only have a few valid, known values. A Set may be helpful to validate if the string value is valid, but it doesn't prevent it from becoming invalid.
My suggestion is to define an enum with the valid account types and use that in place of strings. If you have input coming from the outside that represents an account type, then put a static method on the enum like "fromString" which returns an appropriate enum instance, thereby shortening the window in which invalid data may be a consideration.
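One way such a "fromString" method might look, as a sketch only. The labels, the trimming and lower-casing rules, and the Optional return type are all assumptions; the original does not say what the external strings look like.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical constants and labels, for illustration only.
enum AccountType {
    LOCAL_ADMIN("local admin"),
    GLOBAL_ADMIN("global admin"),
    STANDARD("standard");

    // Lookup table from external label to constant, built once at class load.
    private static final Map<String, AccountType> BY_LABEL = new HashMap<>();
    static {
        for (AccountType type : values()) {
            BY_LABEL.put(type.label, type);
        }
    }

    private final String label;

    AccountType(String label) {
        this.label = label;
    }

    // Converts outside input to a constant; empty when the input is not a known label.
    static Optional<AccountType> fromString(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(BY_LABEL.get(raw.trim().toLowerCase(Locale.ROOT)));
    }
}
```

Everything past this boundary can then work with AccountType values instead of raw strings.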
You can create a Set of AccountType enum instances without any extra work: every enum already has final equals, hashCode, and compareTo implementations, so constants behave correctly in a HashSet or TreeSet, and EnumSet is a Set implementation made specifically for them. This could be useful if you have classifications of account types that you need to check against. For example, if there are "Local Admins", "Global Admins", and "Security Admins", you could define a method isAdmin(AccountType t) which searches a Set of AccountTypes. For example:
// needs java.util.EnumSet and java.util.Set
private static final Set<AccountType> ADMIN_ACCOUNT_TYPES = EnumSet.of(
    AccountType.LOCAL_ADMIN,
    AccountType.GLOBAL_ADMIN,
    AccountType.SECURITY_ADMIN);
public boolean isAdmin(AccountType t) {
    return ADMIN_ACCOUNT_TYPES.contains(t);
}
Now, if you have a case where there are lots of different account types, with many groupings, and performance of lookups is a concern, this is how you could solve it.
Though to be honest, if you only have a few account types and they rarely change, this may be over-engineering it a bit. If there are only 2 account types, a simple if statement with equality check will be more efficient than a hash table lookup.
Again, performance may not really be a problem here. Don't over-optimize or optimize prematurely.

Based on my experience, I suggest using an enum in this case. Even MySQL supports an ENUM type for use cases where you want a column to accept values from an explicitly declared list.

I'd use a Map<String, Enum> backed by a HashMap for maximum efficiency.

Related

Are there advantages to using an enum where you could use a map and vice versa?

Say, for example, I want to make a cash register program. Ignoring, for the sake of being compact, that one wouldn't use floats for currency, my first instinct is to use an enum for the denominations, something along the lines of:
private enum Currency {
ONE_HUNDRED(100.00f),
FIFTY( 50.00f),
TWENTY( 20.00f),
TEN( 10.00f),
FIVE( 5.00f),
TWO( 2.00f),
ONE( 1.00f),
HALF_DOLLAR( 0.50f),
QUARTER( 0.25f),
DIME( 0.10f),
NICKEL( 0.05f),
PENNY( 0.01f);
private final float value;
Currency(float value) {
this.value = value;
}
public float getValue() {
return this.value;
}
@Override
public String toString() {
return this.name().replace("_", " ");
}
}
But the last time I followed instinct, sans forethought, and did something similar for a Morse Code Converter, someone suggested that I use a map instead, specifically a BiMap. I see the appeal of that collection in that particular scenario, but generally speaking I wanted to ask whether there is any reason to prefer one when the other could be used. If instead of the above code I did this:
Map<String, Float> currency = new LinkedHashMap<>();
currency.put("One Hundred", 100.00f);
currency.put("Fifty", 50.00f);
currency.put("Twenty", 20.00f);
currency.put("Ten", 10.00f);
currency.put("Five", 5.00f);
currency.put("Two", 2.00f);
currency.put("One", 1.00f);
currency.put("Half Dollar", 0.50f);
currency.put("Quarter", 0.25f);
currency.put("Dime", 0.10f);
currency.put("Nickel", 0.05f);
currency.put("Penny", 0.01f);
Would it be superior for any reason?
In cases like these where either could be utilized, are there any performance advantages to using one over another? Is one more preferable/conventional? More maintainable/adaptable?
Is there any rule of thumb I could use for when I should use one over the other?
Here are things I like to keep in mind:
Enums are best used (and in the languages I know of, may only be used) to define a known set of items ahead of time. This has a nice benefit of treating what really boils down to frequently used "data" as code in a very readable way.
In my opinion, any code that relies on frequently hardcoded strings, as you would need if you kept data like that in a map, is more difficult to read and maintain. This leads to "magic strings", which is a no-no when avoidable.
It's not immediately clear what should exist in the map until you go check, and it's not clear if it's potentially being modified elsewhere. Consider, that if you got an enum value wrong, the code will not even compile. Get a string key wrong, and you might not notice until much later.
Regarding performance, I doubt there is a large difference between the two. Enum constants are ordinary objects; I suppose any benefit comes from accessing the data as a field on the object rather than through a hash lookup.
This article doesn't go as in depth as I would like, but may be a good starting point: Memory Consumption of Java Data Types
It is quite common practice to use an enum as the key type for a known map, and that offers another way of associating data with a set of specific items (rather than setting them as fields on the enum). I believe this approach would be my preferred method, since setting lots of fields on an enum makes it feel too much like a class rather than a way of referencing items. This doesn't have the same problems as a normal map: since the keys must be enum constants, you don't need to worry about any other keys "accidentally" being added to the map. It seems Java as a whole supports this approach, as it provides the EnumMap class.
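As a rough sketch of the EnumMap approach applied to the cash register example: the enum carries no fields, and the values live in a map keyed by the constants. The names Denomination, DenominationValues, cents, and totalCents are made up here, and the values are held as integer cents rather than floats, which is a deliberate change from the question's code.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, shortened list of denominations with no fields of their own.
enum Denomination { TEN, FIVE, ONE, QUARTER, DIME, NICKEL, PENNY }

final class DenominationValues {
    private DenominationValues() {}

    // Values are associated with the constants through an EnumMap.
    private static final Map<Denomination, Integer> CENTS = new EnumMap<>(Denomination.class);
    static {
        CENTS.put(Denomination.TEN, 1000);
        CENTS.put(Denomination.FIVE, 500);
        CENTS.put(Denomination.ONE, 100);
        CENTS.put(Denomination.QUARTER, 25);
        CENTS.put(Denomination.DIME, 10);
        CENTS.put(Denomination.NICKEL, 5);
        CENTS.put(Denomination.PENNY, 1);
    }

    static int cents(Denomination d) {
        return CENTS.get(d);
    }

    // Sums a drawer given as "how many of each denomination".
    static int totalCents(Map<Denomination, Integer> counts) {
        int sum = 0;
        for (Map.Entry<Denomination, Integer> e : counts.entrySet()) {
            sum += cents(e.getKey()) * e.getValue();
        }
        return sum;
    }
}
```

A misspelled key such as Denomination.QUATER fails at compile time, which is the safety a String-keyed map cannot give.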
I would say that the main difference between your two pieces of code is that in case of enum you have fixed list of denominations which are "type-safe". While operating with strings and maps it is very easy to misspell some string, introducing bugs that are hard to spot.
I would use an enum in this case; it is more sensible. And if this were to be used by other people, pretty much any IDE will display an enum's constants and their associated values for you, whereas with a map neither the keys nor the values are readily available. There are other reasons, but that was the one that came to mind.
Would it be superior for any reason?
The Map design would be appropriate for dynamic data, whereas the enum design would be appropriate for fixed data.
In cases like these where either could be utilized, are there any
performance advantages to using one over another?
Insignificant.
Is one more preferable/conventional?
Only when considering the specific problem to be solved.
More maintainable/adaptable?
Again, it depends on the problem you're trying to solve.
Is there any rule of thumb I could use for when I should use one over
the other?
Whether you're working with a limited, non-varying dataset known at compile time.

How to encapsulate a Map into a custom object

My coworker tells me that it's lazy to use Maps, and that oftentimes the programmer's purpose would be better served by an actual object. But I don't know the best way to do so. This is further complicated (to me) by the fact that the key is an Enum type.
Say I have a HashMap<MyEnum, MyObj> which is expected to have four keys (one for each value in MyEnum). Each MyObj value is the latest of several MyObjs in a database which have the given enum value.
My best guess involves an object with four fields, or maybe two arrays containing the keys and values in order.
Not sure if this is clear or not (It's 5PM on Thursday = I'm brain-dead), so please ask for clarification if necessary.
While there's nothing wrong with using Maps for their intended purpose, Maps are sometimes misused as substitutes for strongly-typed objects.
String firstname = (String)myMap.get("first_name");
... as opposed to:
String firstName = person.getFirstName();
Since Java implements enums as classes, you might want to consider putting the value you're looking for onto your enum class directly:
MyEnum val = getVal();
MyObj obj = val.getMostRecentMyObj();
But I'd pay attention to separation of concerns to determine whether this really makes sense. It could well be that a Map is the appropriate tool for this job.
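If a Map does turn out to be the right tool, one middle ground is to hide it behind a small class so callers see intent-revealing methods rather than raw map access. This is only a sketch: the MyEnum constants, the MyObj payload field, and the LatestMyObjs class with its methods are all placeholders invented for illustration.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

enum MyEnum { A, B, C, D } // placeholder constants

final class MyObj {
    private final String payload; // placeholder field
    MyObj(String payload) { this.payload = payload; }
    String payload() { return payload; }
}

// Wraps the map so the "latest MyObj per enum value" rule lives in one place.
final class LatestMyObjs {
    private final Map<MyEnum, MyObj> latest = new EnumMap<>(MyEnum.class);

    // Replaces whatever was previously stored for this key.
    void record(MyEnum key, MyObj value) {
        latest.put(key, value);
    }

    Optional<MyObj> latestFor(MyEnum key) {
        return Optional.ofNullable(latest.get(key));
    }

    // True once every enum constant has an entry.
    boolean isComplete() {
        return latest.size() == MyEnum.values().length;
    }
}
```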

Whether or not to code to an interface when only certain implementations provide correct behavior

So, I know that coding to an interface (using an interface as a variable's declared type instead of its concrete type) is a good practice in OO code, for a bunch of reasons. This is seen a lot, for example, with Java collections. Well, is referring to an interface in your program still a good thing to do when only certain implementations of that interface provide correct behavior?
For example, I have a Java program. In that program, I have multiple sets of objects. I chose to use a Set, because I didn't want duplicate elements. However, I wanted a list's ordering property (i.e. maintain insertion order). Therefore, I am using a LinkedHashSet as the concrete Set type. One thing these sets are used for is computing a dot product involving the primitive fields of the objects contained in the sets, such as in (simplifying a bit):
double dot(LinkedHashSet<E> set, double[] array) {
    double sum = 0.0;
    int i = 0;
    for (E element : set) {
        // Pair the i-th element in iteration order with array[i].
        sum += element.getValue() * array[i++];
    }
    return sum;
}
This method's result is dependent on the set's iteration order, and so certain Set implementations, mainly HashSet, will give incorrect/unexpected results. Currently, I am using LinkedHashSet throughout my program as the declared type, instead of Set, to ensure correct behavior. However, that feels bad stylistically. What's the right thing to do here? Is it okay to use the concrete type in this case? Or maybe should I use Set as the type, but then state in the documentation which implementations will/won't produce correct behavior? I'm looking more for general input than anything specific to the scenario above. In particular, this should apply to really any scenario where you're using the ordering properties of a LinkedHashSet or TreeSet. How do you prevent unintended implementations from being used? Do you force it in the code (by ditching the interface), or do you specify it in the documentation? Or perhaps some other approach?
It is true that you should code to interfaces, but only if the assurances they make fit your needs. In your case, if you would only use Set then you are saying: I don't want duplicates, but I don't care about the order. You could also use a List and mean: I care about insertion order, but not about duplicates. There even is a SortedSet but it does not have the ordering you want. So in your case you can't replace LinkedHashSet by one of its interfaces without violating the Liskov substitution principle.
So I would argue that in your case you should stick to the implementation until you really need to switch to another implementation. With modern IDEs refactoring is not that hard anymore so I would refrain from doing any premature optimizations -- YAGNI and KISS.
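A self-contained sketch of the "stick to the implementation" option: the parameter type itself documents that iteration order matters. The Weighted class and the size check are assumptions added for illustration; the question only shows a generic E with getValue().

```java
import java.util.LinkedHashSet;

// Hypothetical element type standing in for the question's E.
final class Weighted {
    private final double value;
    Weighted(double value) { this.value = value; }
    double getValue() { return value; }
}

final class DotProduct {
    private DotProduct() {}

    // Declaring LinkedHashSet (not Set) makes the insertion-order requirement
    // part of the signature, so a HashSet cannot be passed by accident.
    static double dot(LinkedHashSet<Weighted> set, double[] array) {
        if (set.size() != array.length) {
            throw new IllegalArgumentException("set and array must have the same length");
        }
        double sum = 0.0;
        int i = 0;
        for (Weighted element : set) {
            sum += element.getValue() * array[i++];
        }
        return sum;
    }
}
```

On Java 21 or later, the java.util.SequencedSet interface expresses "a set with a defined encounter order" and is implemented by both LinkedHashSet and TreeSet, so it may be a better declared type where that Java version is available.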
Very very great question. One solution is: Make another interface! Say one that extends SortedMap but has a getInsertionOrderIterator() method or an interface that extends Map & has getOrderIterator() & getInsertionOrderIterator() methods.
You can write a quick adapter class that contains a LinkedHashMap & TreeMap as the backend data structures.
You can make arguments for either way. As long as you and others maintaining this code know that particular implementations of Set might break the rest of the app or library, then coding to the interface is fine. However, if that is not true, then you should use the specific implementation.
The purpose of coding to an interface is to give you flexibility that will not break your app. Take JDBC for instance. If you use the wrong driver it will break your program similar to how you are describing here. However, if let's say Oracle decided to put behavior in their JDBC driver that subtly broke code written to the JDBC spec instead of the specific Oracle driver code then you'd have to choose.
There is no cut-and-dried, "this is always right" type of answer.

Using Java, how can I restrict an object property to have certain values?

Using Java, how can I restrict an object property to have certain values? I want to create a Java object that represents a "type of location", but I want to restrict the use of the class to only about 100 strings representing all possible types. What is the design pattern for this?
All I can think of is to create a String arraylist and each time a user instantiates the object I would iterate through the entire list looking for a match. That seems sorta like a hack to me though and I want to do it right.
How about using Java's enums? Your object's property would just be of that enum type, and then you'd be bounded by the 100 or so Strings you have in your enum.
You can use a HashSet of allowed values for strings (lookup is faster, and you only want to know whether a value is contained), or an enumeration.
For a fixed set of 100 or so values, an enum type is the best answer. There are a couple of caveats though:
If the set of values is not fixed ... to the extent that you can hard-wire them into your code ... then enum classes won't work. There is no form of enum class in Java that allows you to add new values to an existing enum class without a recompilation, etcetera.
If you have a really large number of values, the enum class will run into one or more limitations that are imposed by the JVM spec. For instance, the static initialization code generated by the compiler for the enum class cannot consist of more than 64K of bytecodes.
Another thought, since it's a bounded set, would be to create an enum for those Strings. Make the object property an enum type. No worries then.
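A minimal sketch of what "make the object property an enum type" could look like. The three LocationType constants stand in for the roughly 100 real values, and the Location class with its name field is hypothetical.

```java
import java.util.Objects;

// Placeholder constants; the real enum would list every allowed location type.
enum LocationType { AIRPORT, HARBOR, WAREHOUSE }

final class Location {
    private final String name;
    private final LocationType type;

    // The compiler rejects anything that is not a LocationType;
    // the null checks close the one remaining gap.
    Location(String name, LocationType type) {
        this.name = Objects.requireNonNull(name, "name");
        this.type = Objects.requireNonNull(type, "type");
    }

    String getName() { return name; }
    LocationType getType() { return type; }
}
```

No list needs to be searched at construction time, because an invalid type cannot be expressed in the first place.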

When to use Enum or Collection in Java

Under what circumstances is an enum more appropriate than, for example, a Collection that guarantees unique elements (an implementer of java.util.Set, I guess...)?
(This is kind of a follow up from my previous question)
Basically when it's a well-defined, fixed set of values which are known at compile-time.
You can use an enum as a set very easily (with EnumSet) and it allows you to define behaviour, reference the elements by name, switch on them etc.
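A short sketch of those three points together (EnumSet, behaviour on the enum, and switching on constants). Day, Schedule, and their members are illustrative names only.

```java
import java.util.EnumSet;
import java.util.Set;

enum Day {
    MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, SATURDAY, SUNDAY;

    // Behaviour defined directly on the enum.
    boolean isWeekend() {
        return this == SATURDAY || this == SUNDAY;
    }
}

final class Schedule {
    private Schedule() {}

    // EnumSet gives set operations over the constants.
    static final Set<Day> WORKDAYS = EnumSet.range(Day.MONDAY, Day.FRIDAY);
    static final Set<Day> DAYS_OFF =
            EnumSet.complementOf(EnumSet.range(Day.MONDAY, Day.FRIDAY));

    // Enum constants can be used as switch labels.
    static String describe(Day day) {
        switch (day) {
            case SATURDAY:
            case SUNDAY:
                return "weekend";
            default:
                return "workday";
        }
    }
}
```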
When the elements are known up front and won't change, an enum is appropriate.
If the elements can change during runtime, use a Set.
I am no Java guru, but my guess is to use an enumeration when you want to guarantee a certain pool of values, and to use a collection when you want to guarantee uniqueness. An example would be enumerating the days of the week (you can't have "funday") versus keeping a collection of SSNs (a generic example, I know!).
Great responses - I'll try and summarise, if just for my own reference - it kinda looks like you should use enums in two situations:
All the values you need are known at compile time, and either or both of the following:
you want better performance than your usual collection implementations
you want to limit the potential values to those specified at compile time
With the Collection over enumeration links that Jon gave, you can get the benefits of enum performance and safety as an implementation detail without incorporating it into your overall design.
Community wiki'd, please do edit and improve if you want to!
Note: you can have both with an EnumSet.
In some situations your business requires the creation of new items, but at the same time the business logic is based on some fixed items. For the fixed ones you want an enum; the new ones obviously require some kind of collection/db.
I've seen projects using a collection for these kinds of items, resulting in business logic depending on data which can be deleted by the user. Never do this; instead create a separate enum for the fixed ones and a collection for the others, just as required.
Another solution is to use a collection with immutable objects for the fixed values. These items could also reside in a db, but have an extra flag so users cannot update or delete them.
