To use nested genericized collections or custom intermediate classes? - java

Before the introduction of generics to the Java language, I would have written classes encapsulating collections-of-collections-of-collections. For example:
class Account {
private Map tradesByRegion; //KEY=Region, VALUE=TradeCollection
}
class TradeCollection {
private Map tradesByInstrument; //KEY=Instrument, Value=Trade
}
Of course, with generics, I can just do:
class Account {
private Map<Region, Map<Instrument, Trade>> trades;
}
I tend to now opt for option #2 (over a generified version of option #1) because this means I don't end up with a proliferation of classes that exist solely for the purpose of wrapping a collection. But I have a nagging feeling that this is bad design (e.g. how many nested collections should I use before declaring new classes). Opinions?

#2 is better because:
Less code accomplishes the same effect (better, actually, as in #1 some of your type information exists only in comments).
It's completely clear what's going on.
Your type errors will be caught at compile time.
What is there to recommend #1? Admittedly, the Map<Integer, Map<String, Map<...>>> generics are a bit hard to get used to, but to my eye they are much easier to understand than code with maps, and lists of maps, and maps of lists of maps, and custom objects full of lists of maps.

Combination of the two. While you can use generics to replace custom classes, you still will want to use a class to encapsulate your concepts. If you're just passing maps of maps of maps of lists to everything, who controls what you can add? who controls what you can remove?
For holding the data, generics is a great thing. But you still want methods to validate when you add a trade, or add an account, and without some kind of class wrapping your collections, nobody controls that.
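A minimal sketch of that combination, under stated assumptions: Region, Instrument and Trade are hypothetical stand-ins for the question's types (the question names them but never defines them), and the "positive quantity" rule is invented purely to show where validation would live. The nested generic map holds the data, while the Account class decides what may be added.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the types named in the question.
enum Region { EMEA, APAC }
enum Instrument { BOND, EQUITY }

final class Trade {
    final Instrument instrument;
    final int quantity;

    Trade(Instrument instrument, int quantity) {
        this.instrument = instrument;
        this.quantity = quantity;
    }
}

class Account {
    // Generics describe the storage; the map itself never leaves the class.
    private final Map<Region, Map<Instrument, Trade>> trades = new HashMap<>();

    // The only way in, so the class controls what counts as a valid trade.
    // The validation rule here is an assumption for illustration.
    void addTrade(Region region, Trade trade) {
        if (trade.quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        trades.computeIfAbsent(region, r -> new HashMap<>())
              .put(trade.instrument, trade);
    }

    // Callers get a read-only view, so nobody can bypass addTrade.
    Map<Instrument, Trade> tradesIn(Region region) {
        return Collections.unmodifiableMap(
                trades.getOrDefault(region, Collections.emptyMap()));
    }
}
```

Only one wrapping class is needed here; the inner Map<Instrument, Trade> stays a plain generic collection because nothing needs to guard it separately.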

Normally there will also be some code to operate on the collections. When this becomes nontrivial I package up the collection with the behavior in a new class. The deeper the nesting is the more likely this will be the case.

I have made this simple rule for me: Never more than two <'s and two commas in a generics declaration and preferably only one comma. After that I introduce custom types. I think this is the point where the readability suffers enough to warrant additional concepts.
There is a real good reason to avoid too deep generics: The complexity is not only in the actual declaration but usually tends to be equally visible in the construction logic. So a lot of code has a tendency to become convoluted if you nest these declarations too deeply. Creating intermediate classes can help a lot. The trick is often to find the proper intermediate classes.
I definitely think you should go a slight bit back towards your old standard. Actually your second sample is the exact pain-point where I'd still accept generics-only.

I think it's better to keep objects in mind and emphasize the collections a bit less. Reification is your friend.
For example, it's natural to have Student and Course objects if you're modeling a system for a school. Where should grades be captured? I'd argue that they belong in an object where Student and Course meet - a ReportCard. I wouldn't have a GradeCollection. Give it some real behavior.

I prefer #2. It is clearer what is going on, and is typesafe at compile time (I prefer having as many things go wrong at compile time as possible, as opposed to having them happen at runtime... in general I like it when nothing goes wrong).
Edit:
Well there are two ways that I can see... I guess it depends on which you would use:
class Account
{
private Map<Region, TradeCollection> tradesByRegion;
}
class TradeCollection
{
private Map<Instrument, Trade> tradesByInstrument;
}
or
class Account<R extends Region, I extends Instrument, T extends Trade, C extends TradeCollection<I, T>>
{
private Map<R, C> tradesByRegion;
}
class TradeCollection<I extends Instrument, T extends Trade>
{
private Map<I, T> tradesByInstrument;
}

I think the answer to this is that it depends on the situation. Generally, introducing a type is useful if that also introduces methods related to the type, if those intermediate types are passed around independently, etc.
You might find this bit of advice from Rich Hickey (the creator of Clojure) about creating interoperable libraries kind of interesting:
http://groups.google.com/group/clojure/browse_thread/thread/e0823e1caaff3eed
It's intended as specific advice to Clojure library writers but I think is interesting food for thought even in Java.

Related

What is the best way to work with many interfaces?

I have a situation where I have a lot of model classes (~1000) which implement any number of 5 interfaces. So I have classes which implement one and others which implement four or five.
This means I can have any permutation of those five interfaces. In the classical model, I would have to implement 32-5 = 27 "meta interfaces" which "join" the interfaces in a bundle. Often, this is not a problem because IB usually extends IA, etc. but in my case, the five interfaces are orthogonal/independent.
In my framework code, I have methods which need instances that have any number of these interfaces implemented. So let's assume that we have the class X and the interfaces IA, IB, IC, ID and IE. X implements IA, ID and IE.
The situation gets worse because some of these interfaces have formal type parameters.
I now have three options:
I could define an interface IADE (or rather IPersistable_MasterSlaveCapable_XmlIdentifierProvider; underscores just for your reading pleasure)
I could define a generic type as <T extends IPersistable & IMasterSlaveCapable & IXmlIdentifierProvider> which would give me a handy way to mix & match interfaces as I need them.
I could use code like this: IA a = ...; ID d = (ID)a; IE e = (IE)a; and then use the local variable with the correct type to call methods even though all three work on the same instance. Or use a cast in every second method call.
The first solution means that I get a lot of empty interfaces with very unreadable names.
The second uses a kind of "ad-hoc" typing. And Oracle's javac sometimes stumbles over them while Eclipse gets it right.
The last solution uses casts. Nuff said.
Questions:
Is there a better solution for mixing any number of interfaces?
Are there any reasons to avoid the temporary types which solution #2 offers me (except for shortcomings in Oracle's javac)?
Note: I'm aware that writing code which doesn't compile with Oracle's javac is a risk. We know that we can handle this risk.
[Edit] There seems to be some confusion what I try to attempt here. My model instances can have one of these traits:
They can be "master slave capable" (think cloning)
They can have an XML identifier
They might support tree operations (parent/child)
They might support revisions
etc. (yes, the model is even more complex than that)
Now I have support code which operates on trees. One extension of trees is trees with revisions. But I also have revisions without trees.
When I'm in the code to add a child in the revision tree manager, I know that each instance must implement ITree and IRevisionable but there is no common interface for both because these are completely independent concerns.
But in the implementation, I need to call methods on the nodes of the tree:
public void addChild( T parent, T child ) {
T newRev = parent.createNewRevision();
newRev.addChild( child );
... possibly more method calls to other interfaces ...
}
If createNewRevision is in the interface IRevisionable and addChild is in the interface ITree, what are my options to define T?
Note: Assume that I have several other interfaces which work in a similar way: There are many places where they are independent but some code needs to see a mix of them. IRevisionableTree is not a solution but another problem.
I could cast the type for each call but that seems clumsy. Creating all permutations of interfaces would be boring and there seems no reasonable pattern to compress the huge interface names. Generics offer a nice way out:
public
<T extends IRevisionable & ITree>
void addChild( T parent, T child ) { ... }
This doesn't always work with Oracle's javac but it seems compact and useful. Any other options/comments?
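A minimal sketch of the intersection-bound approach, under stated assumptions: the question names IRevisionable and ITree but does not show their signatures, so the type parameters and method shapes below are guesses, and Node is a hypothetical model class added only so the example is complete.

```java
import java.util.ArrayList;
import java.util.List;

// Assumed shape: creating a revision yields the same kind of object.
interface IRevisionable<T> {
    T createNewRevision();
}

// Assumed shape: a tree node accepts children of its own kind.
interface ITree<T> {
    void addChild(T child);
}

class RevisionTreeManager {
    // T must implement both interfaces; no combined IRevisionableTree is declared.
    <T extends IRevisionable<T> & ITree<T>> void addChild(T parent, T child) {
        T newRev = parent.createNewRevision();
        newRev.addChild(child);
    }
}

// Hypothetical model class implementing both independent interfaces.
class Node implements IRevisionable<Node>, ITree<Node> {
    final int revision;
    final List<Node> children = new ArrayList<>();
    Node latest; // most recent revision created from this node, if any

    Node(int revision) {
        this.revision = revision;
    }

    @Override
    public Node createNewRevision() {
        latest = new Node(revision + 1);
        return latest;
    }

    @Override
    public void addChild(Node child) {
        children.add(child);
    }
}
```

The bound lives only on the method that needs both capabilities, so code that deals with revisions alone or trees alone can keep using the single interfaces.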
Loosely coupled capabilities might be interesting. An example here.
It is an entirely different approach; decoupling things instead of typing.
Basically interfaces are hidden, implemented as delegating field.
IA ia = x.lookupCapability(IA.class);
if (ia != null) {
ia.a();
}
It fits here, as with many interfaces the wish to decouple rises, and you can more easily combine cases of interdepending interfaces (if (ia != null && ib != null) ...).
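One way lookupCapability could be implemented is a registry keyed by interface type. This is only a sketch: the answer shows the call site but not the implementation, so CapabilityHost, register and the method bodies of IA and IB are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical capability interfaces.
interface IA { String a(); }
interface IB { String b(); }

class CapabilityHost {
    // Each capability is stored under the interface it fulfils.
    private final Map<Class<?>, Object> capabilities = new HashMap<>();

    <C> void register(Class<C> type, C implementation) {
        capabilities.put(type, implementation);
    }

    // Returns null when the capability is absent, as the snippet above expects.
    <C> C lookupCapability(Class<C> type) {
        return type.cast(capabilities.get(type));
    }
}
```

The host class itself implements none of the interfaces, so adding a sixth capability does not change its type or the signatures of methods that accept it.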
If you have a method (semicode)
void doSomething(IA & ID & IE thing);
then my main concern is: Couldn't doSomething be better tailored? Might it be better to split up the functionality? Or are the interfaces themselves badly tailored?
I have stumbled over similar things several times and each time it proved to be better to take a big step backward and rethink the complete partitioning of the logic - not only due to the stuff you mentioned but also due to other concerns.
Since you formulated your question very abstractly (i.e. without a sensible example) I cannot tell you if that's advisable in your case also.
I would avoid all "artificial" interfaces/types that attempt to represent combinations. It's just bad design... what happens if you add 5 more interfaces? The number of combinations explodes.
It seems you want to know if some instance implements some interface(s). Reasonable options are:
use instanceof - there is no shame
use reflection to discover the interfaces via object.getClass().getInterfaces() - you may be able to write some general code to process stuff
use reflection to discover the methods via object.getClass().getMethods() and just invoke those that match a known list of methods of your interfaces (this approach means you don't have to care what it implements - sounds simple and therefore sounds like a good idea)
You've given us no context as to exactly why you want to know, so it's hard to say what the "best" approach is.
Edited
OK. Since your extra info was added it's starting to make sense. The best approach here is to use a callback: Instead of passing in a parent object, pass in an interface that accepts a "child".
It's a simplistic version of the visitor pattern. Your calling code knows what it is calling with and how it can handle a child, but the code that navigates around and/or decides to add a child doesn't have context of the caller.
Your code would look something like this (caveat: May not compile; I just typed it in):
public interface Parent<T> {
void accept(T child);
}
// Central code - I assume the parent is passed in somewhere earlier
public void process(Parent<T> parent) {
// some logic that decides to add a child
addChild(parent, child);
}
public void addChild(Parent<T> parent, T child ) {
parent.accept(child);
}
// Calling code
final IRevisionable revisionable = ...;
someServer.process(new Parent<T>() {
public void accept(T child) {
T newRev = revisionable.createNewRevision();
newRev.addChild(child);
}
});
You may have to juggle things around, but I hope you understand what I'm trying to say.
Actually solution 1 is a good solution, but you should find a better naming.
What actually would you name a class that implements the IPersistable_MasterSlaveCapable_XmlIdentifierProvider interface? If you follow good naming convention, it should have a meaningful name originating from a model entity. You can give the interface the same name prefixed with I.
I don't find it a disadvantage to have many interfaces, because like that you can write mock implementations for testing purposes.
My situation is the opposite: I know that at certain point in code,
foo must implement IA, ID and IE (otherwise, it couldn't get that
far). Now I need to call methods in all three interfaces. What type
should foo get?
Are you able to bypass the problem entirely by passing (for example) three objects? So instead of:
doSomethingWithFoo(WhatGoesHere foo);
you do:
doSomethingWithFoo(IA fooAsA, ID fooAsD, IE fooAsE);
Or, you could create a proxy that implements all interfaces, but allows you to disable certain interfaces (i.e. calling the 'wrong' interface causes an UnsupportedOperationException).
One final wild idea - it might be possible to create Dynamic Proxies for the appropriate interfaces, that delegate to your actual object.

Best way to add functionality to built-in types

I wonder what is the best way in terms of strict OOP to add functionality to built-in types like Strings or integers or more complex objects (in my case the BitSet class).
To be more specific - I got two scenarios:
Adding a md5 hashing method to the String object
Adding conversion methods (like fromByteArray() or toInteger()) to the BitSet class.
Now I wonder what the best practices for implementing this would be.
I could e.g. create a new class "BitSetEx" extending BitSet and add my methods. But I don't like the idea, since this new class would need a descriptive name and "BitSetWithConversionMethods" sounds really silly.
Now I could write a class consisting only of static methods doing the conversions.
Well, I've got a lot of ideas but I want to know what would be the "best" in the sense of OOP.
So could someone answer me this question?
There are a few approaches here:
Firstly, you could come up with a better name for the extends BitSet class. No, BitsetWithConversionMethods isn't a good name, but maybe something like ConvertibleBitSet is. Does that convey the intent and usage of the class? If so, it's a good name. Likewise you might have a HashableString (bearing in mind that you can't extend String, as Anthony points out in another answer). This approach of naming child classes with XableY (or XingY, like BufferingPort or SigningEmailSender) can sometimes be a useful one to describe the addition of new behaviour.
That said, I think there's a fair hint in your problem (not being able to find a name) that maybe this isn't a good design decision, and it's trying to do too much. It is generally a good design principle that a class should "do one thing". Obviously, depending on the level of abstraction, that can be stretched to include anything, but it's worth thinking about: do 'manipulating the set/unset state of a number of bits' and 'convert a bit pattern to another format' count as one thing? I'd argue that (especially with the hint that you're having a hard time coming up with a name) they're probably two different responsibilities. If so, having two classes will end up being cleaner, easier to maintain (another rule is that 'a class should have one reason to change'; one class to both manipulate + convert has at least 2 reasons to change), easier to test in isolation, etc.
So without knowing your design, I would suggest maybe two classes; in the BitSet example, have both a BitSet and (say) a BitSetConverter which is responsible for the conversion. If you wanted to get really fancy, perhaps even:
interface BitSetConverter<T> {
T convert(BitSet in);
BitSet parse(T in);
}
then you might have:
BitSetConverter<Integer> intConverter = ...;
Integer i = intConverter.convert(myBitSet);
BitSet parsed = intConverter.parse(12345);
which really isolates your changes, makes each different converter testable, etc.
(Of course, once you do that, you might like to look at guava and consider using a Function, e.g. a Function<BitSet, Integer> for one case, and Function<Integer, BitSet> for the other. Then you gain a whole ecosystem of Function-supporting code which may be useful)
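For illustration, here is what an Integer implementation of that interface might look like. It is a sketch under assumptions: IntBitSetConverter is a hypothetical name, bit 0 of the BitSet is taken to be the least significant bit of the int, and bits above index 31 are ignored.

```java
import java.util.BitSet;

interface BitSetConverter<T> {
    T convert(BitSet in);
    BitSet parse(T in);
}

// Assumption: bit 0 is the least significant bit; only the low 32 bits are used.
class IntBitSetConverter implements BitSetConverter<Integer> {
    @Override
    public Integer convert(BitSet in) {
        long[] words = in.toLongArray(); // empty array when no bits are set
        return words.length == 0 ? 0 : (int) words[0];
    }

    @Override
    public BitSet parse(Integer in) {
        // Mask so that a negative int does not set bits 32..63 via sign extension.
        return BitSet.valueOf(new long[] { in & 0xFFFFFFFFL });
    }
}
```

Because the conversion rules sit in their own class, a different bit ordering would be a second converter rather than a change to BitSet or its subclass.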
I would go with the extending class. That is actually what you are doing, extending the current class with some extra methods.
As for the name: you should not name it for the new features, as you might add more later on. It is your extended BitSet class, so BitSetEx already sounds better than the BitSetWithConversionMethods you propose.
You don't want to write a class with the static methods, this is like procedural programming in an OOP environment, and is considered wrong. You have an object that has certain methods (like the fromByteArray() you want to make) so you want those methods to be in that class. Extending is the way to go.
It depends. As nanne pointed out, subclass is an option. But only sometimes. Strings are declared final, so you cannot create a subclass. You have at least 2 other options:
1) Use 'encapsulation', i.e. create a class MyString which has a String on which it operates (as opposed to extending String, which you cannot do). Basically a wrapper around the String that adds your functionality.
2) Create a utility/helper, i.e. a class with only static methods that operate on Strings. So something like
class OurStringUtil {
....
public static int getMd5Hash(String string) {...}
....
}
Take a look at the Apache StringUtils stuff, it follows this approach; it's wonderful.
"Best way" is kinda subjective. And keep in mind that String is a final class, so you can't extend it.
Two possible approaches are writing wrappers such as StringWrapper(String) with your extra methods, or some kind of StringUtils class full of static methods (since Java 5, static methods can be imported if you want to use the util class directly).
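A sketch of the static-utility approach for the MD5 case, using the JDK's MessageDigest. The class name follows the OurStringUtil example above; the md5Hex method name, the UTF-8 encoding and the hex-string return type are assumptions (the earlier snippet declares an int return, which cannot hold a 128-bit digest).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class OurStringUtil {
    private OurStringUtil() {
        // static helpers only, no instances
    }

    // Returns the MD5 digest of the UTF-8 bytes as 32 lowercase hex characters.
    static String md5Hex(String input) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With a static import (import static OurStringUtil.md5Hex when the class lives in a named package), call sites read as md5Hex(text), which is about as close as Java gets to adding a method to String.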

Disadvantage of object composition over class inheritance

Most design pattern books say we should "Favor object composition over class inheritance."
But can anyone give me an example where inheritance is better than object composition?
Inheritance is appropriate for is-a relationships. It is a poor fit for has-a relationships.
Since most relationships between classes/components fall into the has-a bucket (for example, a Car class is likely not a HashMap, but it may have a HashMap), it follows that composition is often a better idea than inheritance for modeling relationships between classes.
This is not to say however that inheritance is not useful or not the correct solution for some scenarios.
My simple answer is that you should use inheritance for behavioral purposes. Subclasses should override methods to change the behaviour of the method and the object itself.
This article (an interview with Erich Gamma, one of the GoF) explains clearly why to favor object composition over class inheritance.
In Java, whenever you inherit from a class, your new class also automatically becomes a subtype of the original class type. Since it is a subtype, it needs to adhere to the Liskov substitution principle.
This principle basically says that you must be able to use the subtype anywhere where the supertype is expected. This severely limits how the behavior of your new inherited class can differ from the original class.
No compiler will be able to make you adhere to this principle though, but you can get in trouble if you don't, especially when other programmers are using your classes.
In languages that allow subclassing without subtyping (like the CZ language), the rule "Favor object composition over inheritance" is not as important as in languages like Java or C#.
Inheritance allows an object of the derived type to be used in nearly any circumstance where one would use an object of the base type. Composition does not allow this. Use inheritance when such substitution is required, and composition when it is not.
Just think of it as having an "is-a" or a "has-a" relationship
In this example Human "is-a" Animal, and it may inherit different data from the Animal class. Therefore inheritance is used:
abstract class Animal {
private String name;
public String getName(){
return name;
}
abstract int getLegCount();
}
class Dog extends Animal{
public int getLegCount(){
return 4;
}
}
class Human extends Animal{
public int getLegCount(){
return 2;
}
}
Composition makes sense if one object is the owner of another object. Like a Human object owning a Dog object. So in the following example a Human object "has-a" Dog object
class Dog{
private String name;
}
class Human{
private Dog pet;
}
hope that helped...
It is a fundamental design principle of good OOD. If you use composition in your design rather than inheritance, as in the Strategy pattern, you can assign a behaviour to a class dynamically at runtime. Say,
interface Xable {
void doSomething();
}
class Aable implements Xable { public void doSomething() { /* behave like A */ } }
class Bable implements Xable { public void doSomething() { /* behave like B */ } }
class Bar {
Xable ability;
public void setAbility(Xable a) { ability = a; }
public void behave() {
ability.doSomething();
}
}
/* now we can set our ability dynamically at runtime */
/*somewhere in your code */
Bar bar = new Bar();
bar.setAbility( new Aable() );
bar.behave(); /* behaves like A*/
bar.setAbility( new Bable() );
bar.behave(); /* behaves like B*/
If you used inheritance instead, "Bar" would get its behaviour statically, fixed at compile time by its superclass.
Inheritance is necessary for subtyping. Consider:
class Base {
void Foo() { /* ... */ }
void Bar() { /* ... */ }
}
class Composed {
void Foo() { mBase.Foo(); }
void Bar() { mBase.Bar(); }
private Base mBase;
}
Even though Composed supports all of the methods of Base, it cannot be passed to a function that expects a value of type Base:
void TakeBase(Base b) { /* ... */ }
TakeBase(new Composed()); // ERROR
So, if you want polymorphism, you need inheritance (or its cousin interface implementation).
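The interface route can be sketched as follows. Service and Client are hypothetical names that do not appear in the answer; Base and Composed mirror the classes above, with lowercase method names to follow Java conventions. Composed still wraps a Base rather than extending it, yet it is accepted wherever the interface is expected.

```java
// Hypothetical interface capturing what callers actually need.
interface Service {
    String foo();
    String bar();
}

class Base implements Service {
    @Override
    public String foo() { return "foo"; }

    @Override
    public String bar() { return "bar"; }
}

// Composition: holds a Base and forwards to it, with no inheritance from Base.
class Composed implements Service {
    private final Base base = new Base();

    @Override
    public String foo() { return base.foo(); }

    @Override
    public String bar() { return base.bar(); }
}

class Client {
    // Accepts any Service, so both Base and Composed are valid arguments.
    static String takeService(Service s) {
        return s.foo() + s.bar();
    }
}
```

The subtype relationship now comes from the interface alone, so Composed is free to change or drop its use of Base without affecting callers.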
This is a great question. One I've been asking for years, at conferences, in videos, in blog posts. I've heard all kinds of answers. The only good answer I've heard is performance:
Performance differences in languages. Sometimes, classes take advantage of built-in engine optimizations that dynamic compositions don't. Most of the time, this is a much smaller concern than the problems associated with class inheritance, and usually, you can inline everything you need for that performance optimization into a single class and wrap a factory function around it and get the benefits you need without a problematic class hierarchy.
You should never worry about this unless you detect a problem. Then you should profile and test differences in perf to make informed tradeoffs as needed. Often, there are other performance optimizations available that don't involve class inheritance, including tricks like inlining, method delegation, memoizing pure functions, etc... Perf will vary depending on the specific application and language engine. Profiling is essential, here.
Additionally, I've heard lots of common misconceptions. The most common is confusion about type systems:
Conflating types with classes (a couple of existing answers here already concentrate on that). Compositions can satisfy polymorphism requirements by implementing interfaces. Classes and types are orthogonal, though in most class-supporting languages, subclasses automatically implement the superclass interface, so it can seem convenient.
There are three very good reasons to avoid class inheritance, and they crop up again and again:
The gorilla/banana problem
"I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle." ~ Joe Armstrong, quoted in "Coders at Work" by Peter Seibel.
This problem basically refers to the lack of selective code reuse in class inheritance. Composition lets you select just the pieces you need by approaching software design from a "small, reusable parts" approach rather than building monolithic designs that encapsulate everything related to some given functionality.
The fragile base class problem
Class inheritance is the tightest coupling available in object-oriented design, because the base class becomes part of the implementation of the child classes. This is why you'll also hear the advice from the Gang of Four's "Design Patterns" classic: "Program to an interface, not an implementation."
The problem with implementation inheritance is that even the smallest change to the inner details of that implementation could potentially break child classes. If the interface is public, exposed to user-land in any way, it could break code you are not even aware of.
This is the reason that class hierarchies become brittle -- hard to change as you grow them with new use-cases.
The common refrain is that we should be constantly refactoring our code (see Martin Fowler et al on extreme programming, agile, etc...). The key to refactor success is that you can't break things -- but as we've just seen, it's difficult to refactor a class hierarchy without breaking things.
The reason is that it's impossible to create the correct class hierarchy without knowing everything you need to know about the use-cases, but you can't know that in evolving software. Use cases get added or changed in projects all the time.
There is also a discovery process in programming, where you discover the right design as you implement the code and learn more about what works and what doesn't. But with class inheritance, once you get a class taxonomy going, you've painted yourself into a corner.
You need to know the information before you start the implementation, but part of learning the information you need involves building the implementation. It's a catch-22.
The duplication by necessity problem. This is where the death spiral really gets going. Sometimes, you really just want a banana, not the gorilla holding the banana, and the entire jungle. So you copy and paste it. Now there's a bug in a banana, so you fix it. Later, you get the same bug report and close it. "I already fixed that". And then you get the same bug report again. And again. Uh-oh. It's not fixed. You forgot the other banana! Google "copy pasta".
Other times, you really need to work a new use-case into your software, but you can't change the original base class, so instead, you copy and paste the entire class hierarchy into a new one and rename all the classes you need in the hierarchy to force that new use-case into the code base. 6 months later a new dev is looking at the code and wondering which class hierarchy to inherit from and nobody can provide a good answer.
Duplication by necessity leads to copy pasta messes, and pretty soon people start throwing around the word "rewrite" like it's no big deal. The problem with that is that most rewrite projects fail. I can name several orgs off the top of my head that are currently maintaining two development teams instead of one while they work on a rewrite project. I've seen such orgs cut funding to one or the other, and I've seen projects like that chew through so much cash that a startup or small business runs out of money and shuts down.
Developers underestimate the impact of class inheritance all the time. It's an important choice, and you need to be aware of the trade offs you opt into every time you create or inherit from a base class.

Is it acceptable to declare a private class as an alias?

In my answer from yesterday I called the following piece of code "a hack":
final class MyMap extends HashMap<SomeSuperLongIdentifier, OtherSuperLongIdentifier> {}
// declared MyMap as an alias for readability purposes only
MyMap a = new MyMap();
a.put("key", "val");
Giving it another thought, this does not seem like a bad idea at all, but I might be missing something. Are there any potholes I missed out on? Is this an acceptable (possibly creative) way for declaring aliases in Java?
The drawback would be that you won't be able to directly use any methods that return a correctly typed Map, because they will never return a MyMap. Even if they could return a Map<SomeSuperLongIdentifier, OtherSuperLongIdentifier>.
For example you wouldn't be able to use the filter() methods in Maps (provided by Google Collections). They would accept a MyMap instance as input, but they would return only a Map<SomeSuperLongIdentifier, OtherSuperLongIdentifier>.
This problem can be somewhat reduced, by writing your MyMap to delegate to another Map implementation. Then you could pass the return value of such a method into the constructor and still have a MyMap (without copying, even). The default constructor could just set the delegate to a new HashMap instance, so the default usage would stay the same.
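A rough sketch of that delegating variant, with String and Integer standing in for SomeSuperLongIdentifier and OtherSuperLongIdentifier (an assumption made only to keep the example self-contained). It builds on the JDK's AbstractMap, which needs entrySet plus whichever mutators you want to support.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// String and Integer stand in for the long type names from the question.
final class MyMap extends AbstractMap<String, Integer> {
    private final Map<String, Integer> delegate;

    // Default usage stays the same as with the HashMap subclass.
    MyMap() {
        this(new HashMap<>());
    }

    // Wraps an existing map (e.g. one returned by a library) without copying it.
    MyMap(Map<String, Integer> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Integer put(String key, Integer value) {
        return delegate.put(key, value);
    }

    @Override
    public Integer get(Object key) {
        return delegate.get(key);
    }

    @Override
    public Set<Map.Entry<String, Integer>> entrySet() {
        return delegate.entrySet();
    }
}
```

Since the wrapper shares state with the map it was given, writes through either reference are visible through the other; copy in the constructor instead if that is not wanted.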
I would object to the name MyMap: Since you create an alias, make it document its purpose by giving it a useful name. Other than that, I like it.
I think it is surely a convenient way to declare type synonyms. Some languages have direct support for that. In Delphi (Pascal), for example, you can do it like this:
type MyMap = HashMap<SomeSuperLongIdentifier, OtherSuperLongIdentifier>;
Since Java does not, I think you can use inheritance for that. You need to document that this declaration is just a synonym and that no one should add methods to this class. Note also that this consumes a little memory for VMT storage.
I personally would not do this, and would flag it in a review, but this is a matter of opinion.
Google Collections helps mitigate this problem, by letting you declare:
Map<SomeSuperLongIdentifier, OtherSuperLongIdentifier> a = Maps.newHashMap();
I'd look for ways to refactor code to not have to declare so many instances of this Map, perhaps.
As long as developers using your code have IDEs and are able to quickly jump to the class definition and read the comments for its purpose (which are in place, no?), I can see nothing wrong with it.
I wouldn't call it an 'alias'. It isn't. It can't be used interchangeably with the type it is supposed to be aliasing. So if that's the intention, it fails.
I think that inheritance is a very big gun compared to the problem at hand. At the very least I would have made this "alias class" final, with a big fat comment describing the reason for its existence.
Well, there are two contradictory aspects here.
From a modelling point of view, your declaration is right, because it emphasizes the encapsulation your class provides.
From a coding point of view, your declaration may be considered wrong because you add a class only as modelling support, with absolutely no added feature.
However, I find your approach quite right (although I never thought about it before), since it provides a much appreciated (well, to me, at least) compilable model: classes from your model are perfectly reflected in your code, making your specifications executable, which is very cool.
All this brings me to say it's definitely a great idea, provided you support it with documentation.
I wouldn't call it a hack. Personally, I've created an alias for the purpose of declaring generic type parameters which cannot be changed and creating some clarity.
You also couldn't use this map in serialization if sending to another jvm which does not have your MyMap class.

How do I argue against Duck-typing in a strongly typed language like Java?

I work on a team of Java programmers. One of my co-workers suggests from time-to-time that I do something like "just add a type field" (usu. "String type"). Or code will be committed laden with "if (foo instanceof Foo){...} else if( foo instanceof Bar){...}".
Josh Bloch's admonition that "tagged classes are a wan imitation of a proper class hierarchy" notwithstanding, what is my one-line response to this sort of thing? And then how do I elaborate the concept more seriously?
It's clear to me that - the context being Java - the type of Object under consideration is right in front of our collective faces - IOW: The word right after the "class", "enum" or "interface", etc.
But aside from the difficult-to-demonstrate or quantify (on the spot) "it makes your code more complicated", how do I say that "duck-typing in a (more or less) strongly-typed language is a stupid idea that suggests a much deeper design pathology"?
Actually, you said it reasonably well right there.
The truth is that the "instance of" comb is almost always a bad idea (the exception being, for example, when you're marshaling or serializing, when for a short interval you may not have all the type information at hand). As Josh says, that's a sign of a bad class hierarchy otherwise.
The way that you know it's a bad idea is that it makes the code brittle: if you use that, and the type hierarchy changes, then it probably breaks that instance-of comb everywhere it occurs. What's more, you then lose the benefit of strong typing; the compiler can't help you by catching errors ahead of time. (This is somewhat analogous to the problems caused by typecasts in C.)
Update
Let me extend this a bit, since from a comment it appears I wasn't quite clear. The reason you use a typecast in C, or instanceof, is that you want to say "as if": use this foo as if it were a bar. Now, in C, there is no run time type information around at all, so you're just working without a net: if you typecast something, the generated code is going to treat that address as if it contained a particular type no matter what, and you should only hope that it will cause a run-time error instead of silently corrupting something.
Duck typing just raises that to a norm; in a dynamically typed language like Ruby or Python or Smalltalk, every variable is an untyped reference; you shoot messages at it at runtime and see what happens. If it understands a particular message, it "walks like a duck" -- it handles it.
This can be very handy and useful, because it allows marvelous hacks like assigning a generator expression to a variable in Python, or a block to a variable in Smalltalk. But it does mean you're vulnerable to errors at runtime that a strongly typed language can catch at compile time.
In a strongly-typed language like Java, you can't really, strictly, have duck typing at all: you must tell the compiler what type you're going to treat something as. You can get something like duck typing by using type casts, so that you can do something like
Object x; // A reference to an Object, analogous to a void * in C
// Some code that assigns something to x
((FoodDispenser)x).dropPellet(); // [1]
// Some more code
((MissleController)x).launchAt("Moon"); // [2]
Now at run time, you're fine as long as x is a kind of FoodDispenser at [1] or MissleController at [2]; otherwise boom. Or unexpectedly, no boom.
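As a runnable illustration of the "boom" case: in Java the cast itself is checked at run time, so a wrong guess raises ClassCastException. This is only a sketch; the method bodies and the CastDemo class are assumptions, since the answer only names the classes.

```java
// Stub bodies are assumptions; the original only names these classes and methods.
class FoodDispenser {
    String dropPellet() { return "pellet dropped"; }
}

class MissleController {
    String launchAt(String target) { return "launched at " + target; }
}

class CastDemo {
    // Treats x "as if" it were a MissleController and reports what happened.
    static String tryLaunch(Object x) {
        try {
            return ((MissleController) x).launchAt("Moon");
        } catch (ClassCastException e) {
            // The compiler accepted the cast; the failure only shows up here.
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLaunch(new MissleController()));
        System.out.println(tryLaunch(new FoodDispenser()));
    }
}
```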
In your description, you protect yourself by using a comb of else if and instanceof:
Object x;
// code code code
if (x instanceof FoodDispenser)
    ((FoodDispenser) x).dropPellet();
else if (x instanceof MissleController)
    ((MissleController) x).launchAt("Moon");
else if ( /* something else...*/ ) // ...
else // error
Now, you're protected against the run-time error, but you've got the responsibility of doing something sensible later, at the else.
But now imagine you make a change to the code, so that 'x' can take the types 'FloorWax' and 'DessertTopping'. You now must go through all the code and find all the instances of that comb and modify them. Now the code is "brittle" -- changes in the requirements mean lots of code changes. In OO, you're striving to make the code less brittle.
The OO solution is to use polymorphism instead, which you can think of as a kind of limited duck typing: you're defining all the operations that something can be trusted to perform. You do this by defining a superior class, probably abstract, that has all the methods of the inferior classes. In Java, a class like that is best expressed as an "interface", but it has all the type properties of a class. In fact, you can see an interface as being a promise that a particular class can be trusted to act "as if" it were another class.
public interface VeebleFeetzer { /* ... */ }
public class FoodDispenser implements VeebleFeetzer { /* ... */ }
public class MissleController implements VeebleFeetzer { /* ... */ }
public class FloorWax implements VeebleFeetzer { /* ... */ }
public class DessertTopping implements VeebleFeetzer { /* ... */ }
All you have to do now is use a reference to a VeebleFeetzer, and the compiler figures it out for you. If you happen to add another class that's a subtype of VeebleFeetzer, the compiler will select the method and check the arguments in the bargain:
VeebleFeetzer x; // A reference to anything
// that implements VeebleFeetzer
// Some code that assigns something to x
x.dropPellet();
// Some more code
x.launchAt("Moon");
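A compilable version of the same idea might look like the sketch below. The single feetz() method and the PolymorphismDemo class are assumptions made for illustration, because the interface body is elided in the answer. The point is that FloorWax can be added without touching runAll.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical single operation; the original leaves the interface body elided.
interface VeebleFeetzer {
    String feetz();
}

class FoodDispenser implements VeebleFeetzer {
    public String feetz() { return "drop pellet"; }
}

class MissleController implements VeebleFeetzer {
    public String feetz() { return "launch at Moon"; }
}

// Added later: no existing call site has to change.
class FloorWax implements VeebleFeetzer {
    public String feetz() { return "polish floor"; }
}

class PolymorphismDemo {
    // No instanceof and no casts: each object supplies its own behaviour.
    static String runAll(List<VeebleFeetzer> items) {
        StringBuilder out = new StringBuilder();
        for (VeebleFeetzer item : items) {
            out.append(item.feetz()).append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(runAll(Arrays.asList(
                new FoodDispenser(), new MissleController(), new FloorWax())));
    }
}
```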
This isn't so much duck typing as it is just proper object-oriented style; indeed, being able to subclass class A as class B, call the same method on B, and have it do something else is the entire point of inheritance in object-oriented languages.
If you're constantly checking the type of an object, then you're either being too clever (though I suppose it's this cleverness that duck typing aficionados enjoy, except in a less brittle form) or you're not embracing the basics of object-oriented programming.
hmmm...
Correct me if I am wrong, but tagged classes and duck-typing are two different concepts, though not necessarily mutually exclusive.
When one has the urge to use tags in a class to define the type, one should, IMHO, revise the class hierarchy, as it is a clear sign of conceptual bleed where an abstract class needs to know the implementation details that the class parenthood tries to hide. Are you using the correct pattern? In other words, are you trying to coerce behaviour in a pattern that does not naturally support it?
Whereas duck-typing is the ability to loosely define a type, where a method can accept any type just so long as the necessary methods in the parameter instance are defined. The method will then use the parameter and call the necessary methods without too much bother about the parenthood of the instance.
So here... the smelly hint is, as Charlie pointed out, the use of instanceof. Much like static or other smelly keywords, whenever they appear one must ask "Am I doing the right thing here?", not that they are inherently wrong, but they are often used to hack through a bad or ill-fitted OO design.
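Java has no built-in duck typing, but reflection gets close enough to show the idea. In this sketch the classes Duck, Robot, Rock and DuckTypingDemo are invented for illustration: the method accepts any object and only cares whether a public quack() method exists, not about the object's parenthood.

```java
import java.lang.reflect.Method;

// Unrelated classes: no shared interface or superclass besides Object.
class Duck {
    public String quack() { return "Quack"; }
}

class Robot {
    public String quack() { return "Beep"; }
}

class Rock {
}

class DuckTypingDemo {
    // Looks up quack() by name at run time instead of relying on a declared type.
    static String makeItQuack(Object o) {
        try {
            Method m = o.getClass().getMethod("quack");
            return (String) m.invoke(o);
        } catch (ReflectiveOperationException e) {
            // No such method (or it could not be called): the check the
            // compiler would normally do has moved to run time.
            return "cannot quack";
        }
    }

    public static void main(String[] args) {
        System.out.println(makeItQuack(new Duck()));
        System.out.println(makeItQuack(new Robot()));
        System.out.println(makeItQuack(new Rock()));
    }
}
```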
My one-line response would be that you lose one of the main benefits of OOP: polymorphism. Polymorphism reduces the time to develop new code (developers love to develop new code, so that should help your argument :-).
If, when adding a new type to an existing system, you have to add logic, aside from figuring out which instance to construct, then, in Java, you are doing something wrong (assuming that the new class should simply be a drop in replacement for another).
Generally, the appropriate way to handle this in Java is to keep the code polymorphic and make use of interfaces. So anytime they find themselves wanting to add another variable or do an instanceof they should probably be implementing an interface instead.
If you can convince them to change the code it is pretty easy to retrofit interfaces into the existing code base. For that matter, I'd take the time to take a piece of code with instanceof and refactor it to be polymorphic. It is much easier for people to see the point if they can see the before and after versions and compare them.
You might want to point your co-worker to the Liskov substitution principle, one of the five pillars in SOLID.
Links:
Wikipedia entry
Article written by Uncle Bob
When you say "duck typing in strongly-typed languages" you actually mean "imitating (subtype) polymorphism in statically-typed languages".
It's not that bad when you have data objects (DTOs) that don't contain any behaviour. When you do have a full-blown OO model (ask yourself if this is really the case) then you should use the polymorphism offered by the language where appropriate.
Although I'm generally a fan of duck-typed languages like Python, I can see your problem with it in Java.
If you are writing all the classes that will ever be used with this code, then you don't need to duck-type, because you don't need to allow for cases where code can't directly inherit from (or implement) an interface or other unifying abstraction.
A downside of duck-typing is that you have an extra class of unit tests to run on your code: a new class could return a different type than expected, and subsequently cause the rest of the code to fail. So although duck-typing allows backward-flexibility, it requires a lot of forward thinking for tests.
In short you have a catch-all (hard) instead of a catch-few (easy). I think that's the pathology.
Why "imitate a class hierarchy" instead of designing and using it? One of the refactoring methods is replacing "switch"es (chained ifs are almost the same) with polymorphism. Why use switches where polymorphism would lead to cleaner code?
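A small before/after sketch of that refactoring, using the "String type" field from the question. The shape classes here are hypothetical and chosen only because they are short.

```java
// Before: a tagged class. Every new kind of shape means editing the switch.
class TaggedShape {
    final String type;   // "circle" or "square"
    final double size;   // radius or side length, depending on the tag

    TaggedShape(String type, double size) {
        this.type = type;
        this.size = size;
    }

    double area() {
        switch (type) {
            case "circle": return Math.PI * size * size;
            case "square": return size * size;
            default: throw new IllegalStateException("unknown type: " + type);
        }
    }
}

// After: the tag becomes the class, and the switch becomes method dispatch.
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

class RefactorDemo {
    public static void main(String[] args) {
        System.out.println(new TaggedShape("square", 2.0).area());
        System.out.println(new Square(2.0).area());
    }
}
```

In the second version a misspelled tag cannot exist, and a new shape is a new class rather than a new case in every switch.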
This isn't duck typing, it is just a bad way to simulate polymorphism in a language that has (more or less) real polymorphism.
Two arguments to answer the titled question:
1) Java is supposed to be "write once, run anywhere," so code that was written for one hierarchy shouldn't throw RuntimeExceptions when we change the environment somewhere. (Of course, there are exceptions -- pun -- to this rule.)
2) The Java JIT performs very aggressive optimizations that rely on knowing that a given symbol must be of one type and one type only. The only way to work around this is to cast.
As others have mentioned, your "instance of" doesn't match with the question I've answered here. Anything with any types, duck or static, may have the issue you described. There are better OOP ways to deal with it.
Instead of instanceof you can use the Method and the Strategy patterns; mixed together, the code looks much better than before...
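As a rough sketch of the Strategy half of that suggestion (the shipping example and all names in it are made up for illustration): the varying behaviour lives in a strategy object, so the caller never has to ask what kind of thing it is holding.

```java
// Hypothetical strategy: how a shipping cost is computed.
interface ShippingStrategy {
    long costInCents(long weightInGrams);
}

class Parcel {
    private final long weightInGrams;
    private final ShippingStrategy strategy;

    Parcel(long weightInGrams, ShippingStrategy strategy) {
        this.weightInGrams = weightInGrams;
        this.strategy = strategy;
    }

    // Delegates to the strategy; no instanceof and no type tag.
    long shippingCost() {
        return strategy.costInCents(weightInGrams);
    }
}

class StrategyDemo {
    // Two interchangeable strategies, written as lambdas.
    static final ShippingStrategy FLAT = grams -> 500L;
    static final ShippingStrategy BY_WEIGHT = grams -> grams * 2L;

    public static void main(String[] args) {
        System.out.println(new Parcel(1000L, FLAT).shippingCost());
        System.out.println(new Parcel(1000L, BY_WEIGHT).shippingCost());
    }
}
```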
