Java OOP encapsulation. Why is Object.doSomething(); better than doSomething(Object);? - java

I am struggling to explain oop concepts in java.
A major tenet in oop is that objects have methods; so Object.method(); works.
I am contrasting this with procedural programming in which one must do method(Object).
Is this called encapsulation?
What are the advantages of the oop way?

That's a big question with an answer that fills multiple books, but in short, class members have access modifiers (public, private, protected). Private members can be accessed by other class members, such as a method, but not from external functions.

In the scenario Object.doSomething(), the object will have complete control over its properties which are used in the method.
But with the other call, doSomething(Object), you have to expose every property the function needs (for example by making it public) so that it is available inside the function, which is less safe.
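To make the point above concrete, here is a minimal sketch. The Account class, its balance field and the withdraw method are hypothetical names made up for this illustration; they are not from the question.

```java
// Hypothetical example: the object guards its own state.
class Account {
    private int balance; // not reachable from outside the class

    Account(int openingBalance) {
        if (openingBalance < 0) {
            throw new IllegalArgumentException("opening balance must not be negative");
        }
        this.balance = openingBalance;
    }

    // object.method() style: the rule lives next to the data it protects.
    boolean withdraw(int amount) {
        if (amount <= 0 || amount > balance) {
            return false; // refuse rather than corrupt the state
        }
        balance -= amount;
        return true;
    }

    int getBalance() {
        return balance;
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account(100);
        System.out.println(account.withdraw(30));  // accepted
        System.out.println(account.withdraw(500)); // rejected, balance untouched
        System.out.println(account.getBalance());
    }
}
```

A free-standing withdraw(Account, int) written in some other class could only enforce the same rule if balance were made accessible to it, and then nothing would stop other code from changing balance directly.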

2 more advantages of OOP are re-use and polymorphism.
ReUse:
If you use doSomething(Object) in one file or one program, it may work fine for that program. Now, imagine that you need to use your Object in another program. You will need to duplicate the doSomething() method in your new program (probably copy and paste it). This may work, but is bad practice and makes maintaining that logic a nightmare. If the doSomething() logic is a function inside Object then that logic "lives" with the object.
Polymorphism:
Imagine another case where Object is just one of many similar types. If you take advantage of Interfaces, many objects can implement the doSomething() function to suit their specific needs.
Example:
interface ICar
{
void doSomething();
void getFuel();
}
class GasCar implements ICar
{
public void doSomething()
{
//do something a gas car would do
}
public void getFuel()
{
//logic to pull gas out of a tank
}
}
class ElectricCar implements ICar
{
public void doSomething()
{
//do something an electric car would do
}
public void getFuel()
{
//logic to pull fuel out of a battery
}
}
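A possible way to see the payoff of the example above is to write a caller that only knows about the interface. In this sketch doSomething() returns a String purely so the result is visible (an assumption, the original returns void), getFuel() is left out for brevity, and the Garage class is hypothetical.

```java
import java.util.List;

interface ICar {
    String doSomething();
}

class GasCar implements ICar {
    @Override
    public String doSomething() {
        return "burning gasoline";
    }
}

class ElectricCar implements ICar {
    @Override
    public String doSomething() {
        return "drawing from the battery";
    }
}

// Hypothetical caller: it works with any ICar and never checks the concrete type.
class Garage {
    static String describeAll(List<ICar> cars) {
        StringBuilder sb = new StringBuilder();
        for (ICar car : cars) {
            sb.append(car.doSomething()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describeAll(List.of(new GasCar(), new ElectricCar())));
    }
}
```

Adding a third kind of car later would not require touching Garage at all, which is the point of the polymorphism argument.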

One other answer to keep in mind. When you do method(object) and method is implemented like this:
method(obj)
{
return obj.getA()+obj.getB()
}
This is bad because later when you want to do the same thing again, where is your code? I mean, it uses A and B, so the first place you'd go to look is obj, but it's not there!?! Now you have to go search.
OO is as much about organizing code to find and reuse it as anything else.
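As a rough sketch of the alternative, the same calculation can live on the object itself. The Pair class and the sum() method are hypothetical names; a and b stand in for whatever A and B are in the snippet above.

```java
// Hypothetical class: the logic that uses a and b sits next to a and b.
class Pair {
    private final int a;
    private final int b;

    Pair(int a, int b) {
        this.a = a;
        this.b = b;
    }

    // Anyone looking for "the thing that combines A and B" finds it here first.
    int sum() {
        return a + b;
    }
}

class PairDemo {
    public static void main(String[] args) {
        System.out.println(new Pair(2, 3).sum());
    }
}
```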
tl;dr rant
You see a LOT of code like this with the bean pattern--I've come to see beans as one of the more insidious evils in OO programming because of this. In theory most beans these days are called "Pojos" because you CAN implement actual methods in them, but people still stick with beans full of nothing but setters and getters which is in no way OO, in fact they encourage code that is specifically NOT OO (beans are more akin to data structures in a non-oo language like C than anything OO has).

OOP evangelizes encapsulation; as a result, state and behavior are encapsulated in the class representing the object. Depending on the level at which you want the encapsulation to happen, employ either static (class level) or instance (object level) based encapsulation.

method(Object) is a paradigm that works with data structures. Data Structures are about grouping fields of information that are semantically consistent and related (i.e. struct Person {FirstName, LastName, DateOfBirth}).
Object oriented programming is one step above data structures. In OOP, we not only group data fields that are related, but we also include functions (methods, member functions) that are related to the data (and that act on the data the correct way).
Encapsulation is about keeping part of the members private to objects. The goal is to "hide" the inner-working from the external world, and protect the object's state from "corruption", or from being assigned incorrect values. OOP languages provide several "access modifiers" that are used to specify whether a given member can be accessed by a specific category of objects (instances of child classes, classes in the same "package/namespace/library", any other class, etc.).
object.method() is usually about asking an object to perform something that may involve accessing a field that is not accessible outside of the class.
The above was to define, and explain how the concept of member function (method), and the concept of encapsulation go hand in hand.
References:
http://en.wikipedia.org/wiki/Encapsulation_%28object-oriented_programming%29
http://en.wikipedia.org/wiki/Object-oriented_programming

Related

Why is encapsulation called data hiding if it's not hiding the data?

What is the difference between the following two classes in terms of data hiding (encapsulation)?
In the example below, I can access the value of the member by making it public.
Eg: 1
public class App {
public int b = 10;
public static void main(String[] args) {
System.out.println(new App().b);
}
}
In the example below, I can access the value of the member by using a getter method.
Eg : 2
class DataHiding
{
private int b;
public DataHiding() {
}
public int getB() {
return b;
}
public void setB(int b) {
this.b = b;
}
}
In both of the above examples, I can access the value of the member. Why is Eg: 2 called data hiding (encapsulation) if it's not hiding the data?
Why is Eg: 1 not called encapsulated?
What is it about
As you tagged this question with both java and object oriented programming oop, I suppose you are implicitly thinking about Java Beans. Nevertheless this is a question quite common across languages, take the wikipedia page on this matter :
In programming languages, encapsulation is used to refer to one of two
related but distinct notions, and sometimes to the combination
thereof:
A language mechanism for restricting access to some of the object's
components.
A language construct that facilitates the bundling
of data with the methods (or other functions) operating on that
data.
Some programming language researchers and academics use the first
meaning alone or in combination with the second as a distinguishing
feature of object-oriented programming, while other programming
languages which provide lexical closures view encapsulation as a
feature of the language orthogonal to object orientation.
The second definition is motivated by the fact that in many OOP
languages hiding of components is not automatic or can be overridden;
thus, information hiding is defined as a separate notion by those who
prefer the second definition.
So encapsulation is not really about hiding data or information; it is about enclosing pieces of data in a language component (a class in Java). A Java Bean encapsulates data.
That being said, while encapsulation is one of the main feature of object oriented programming paradigm, at some point in the history of language design it was seen as not enough to help design better software.
History
One key practice for achieving better software design is decoupling, and encapsulation helps with that. Yet a cluster of data was not enough to achieve this goal, and other pioneering OOP efforts were made in different languages at the time. I believe SIMULA is the earliest language to introduce some kind of visibility keywords, among other concepts such as the class. The idea of information hiding really appears later, in 1972, with data that is only relevant to the component that uses it, in order to achieve greater decoupling.
But back to the topic.
Answers to your questions
In this case data is encapsulated and public
This is commonly known as a global variable and it is usually regarded as a bad programming practice, because it may lead to coupling and other kinds of bugs.
Data is encapsulated and public (through method accessors)
This class is usually referred to as a Java Bean; these are an abomination if used for anything other than what they were designed for.
These objects were designed to fulfill a single, quite specific role, according to the specification:
2.1 What is a Bean?
Let's start with an initial definition and then refine it:
“A Java Bean is a reusable software component that can be manipulated visually in a builder tool.”
Why is it an abomination nowadays? Because people and framework vendors usually misuse them. The specification is not clear enough about that, yet there is a statement in this regard:
So for example it makes sense to provide the JDBC database access API as a class library rather than as a bean, because JDBC is essentially a programmatic API and not something that can be directly presented for visual manipulation.
I'd rather quote Joshua Bloch (more in this question and answer) :
"The JavaBeans pattern has serious disadvantages." - Joshua Bloch, Effective Java
Related points
As explained above, one key practice for achieving better software is decoupling. Coupling has been one of the oldest battlefronts for software engineers. Encapsulation and information hiding have a lot to do with the following practices, which help decoupling for numerous reasons:
The Law of Demeter: breaking this law means the code has coupling. If one has to traverse a whole data graph by hand, then there is no information hiding; knowledge of the graph lives outside the component, which makes the software less maintainable and less adaptable. In short: refactoring becomes a painful process. Anemic domain models suffer from this, and they are recognized as an anti-pattern.
A somewhat modern practice that helps one avoid breaking the Law of Demeter is Tell, Don't Ask.
That is, you should endeavor to tell objects what you want them to do; do not ask them questions about their state, make a decision, and then tell them what to do.
Immutability: if data has to be public, it should be immutable. To some degree, if it is not, one module can introduce side effects in another; if this was true for single-threaded programs, it is even more painful with multi-threaded software. Today software and hardware are getting more and more multi-threaded, and threads have to communicate; if a piece of information has to be public, it should be immutable. Immutability guarantees thread safety, which is one less thing to worry about. Also, immutability has to be guaranteed on the whole object graph.
class IsItImmutable {
// skipping method accessors for brevity
// OK <= String is immutable
private final String str;
// NOK <= java.util.Date is mutable, even if reference is final a date can be modified
private final Date date;
// NOK <= Set operations are still possible, so this set is mutable
private final Set<String> strs;
// NOK <= Set is immutable, set operations are not permitted, however Dates in the set are mutable
private final Set<Date> udates = Collections.unmodifiableSet(...);
// OK <= Set is immutable, set operations are not permitted, String is immutable
private final Set<String> ustrs = Collections.unmodifiableSet(...);
}
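One common way to deal with the NOK cases above is to take defensive copies. The following is only a sketch under that assumption; the Event class and its fields are hypothetical and not part of the original answer.

```java
import java.util.Date;
import java.util.Set;

// Hypothetical class showing one way to keep the whole object graph immutable.
final class Event {
    private final String name;      // String is immutable
    private final Date when;        // Date is mutable, so it is copied in and out
    private final Set<String> tags; // unmodifiable copy, detached from the caller's set

    Event(String name, Date when, Set<String> tags) {
        this.name = name;
        this.when = new Date(when.getTime()); // defensive copy on the way in
        this.tags = Set.copyOf(tags);         // Java 10+: unmodifiable copy
    }

    String getName() {
        return name;
    }

    Date getWhen() {
        return new Date(when.getTime()); // defensive copy on the way out
    }

    Set<String> getTags() {
        return tags; // safe to share: mutating calls throw UnsupportedOperationException
    }
}
```

In newer code an immutable type such as java.time.Instant would avoid the copying of Date altogether.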
Using mutators and accessors hides the logic, not the name of the methods. It prevents users from directly modifying the class members.
In your second example, the user has no idea about the class member b, whereas in the first example, the user is directly exposed to that variable, having the ability to change it.
Imagine a situation where you want to do some validation before setting the value of b, or using a helper variable and methods that you don't want to expose. You'll encapsulate the logic in your setter and by doing that, you ensure that users cannot modify the variable without your supervision.
Encapsulation is not data hiding it is information hiding. You are hiding internal structure and data implementation, as well as data access logic.
For instance you can store your integer internally as String, if you like. In first case changing that internal implementation would mean that you have to also change all code that depends on b being an int. In second case accessor methods will protect internal structure and give you int, and you don't have to change the rest of the code if internals have changed.
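A minimal sketch of that idea, assuming a hypothetical class DataHidingAsString that mirrors the DataHiding class from the question but stores b as a String internally:

```java
// Hypothetical variant of DataHiding: the storage changed, the public interface did not.
class DataHidingAsString {
    private String b = "0"; // internal representation is now a String

    public int getB() {
        return Integer.parseInt(b); // convert back on the way out
    }

    public void setB(int b) {
        this.b = Integer.toString(b); // convert on the way in
    }
}
```

Callers that use getB() and setB(int) compile and behave as before; callers of a public int b field would all have to be changed.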
Accessor methods also give you opportunity to restrict access to the data making it read-only or write-only in addition to plain read-write access. Not to mention other logic that can verify integrity of data going into the object as well as changing object state accordingly.
What happens if you want to retrieve the state of b without being able to change its value? You would create the getter but not the setter; you can't accomplish that by exposing b as a public int.
Also, with both get and set methods, if we had a more complex object, we might want to set or get just some property or state of that object.
Example.
private MyObject a;
public void setMyObjectName(String name) {
    // go through the instance held in a, not the MyObject class itself
    a.name = name;
}
public String getMyObjectName() {
    return a.name;
}
This way we keep the object encapsulated, by restricting access to its state.
In Java, instance methods are virtual by default (unless they are private, static or final). This means that if you extend some class, you can override the result of a method. Imagine for example the next class (continuing on your example):
class DataHidingDouble extends DataHiding {
    public int getB() {
        // b is private in DataHiding, so go through the inherited getter
        return super.getB() * 2;
    }
}
This means that you maintain control over what b is to the outer world in your subclass.
Imagine also some subclass where the value of b comes from something that is not a variable, e.g. a database. How would you then make b return that value if it were a plain variable?
It hides the data, not the value. Because the class is responsible for maintaining the data, and returning the correct value to the outside world.

Difference between public variable and getVar/setVar? [duplicate]

What's the advantage of using getters and setters - that only get and set - instead of simply using public fields for those variables?
If getters and setters are ever doing more than just the simple get/set, I can figure this one out very quickly, but I'm not 100% clear on how:
public String foo;
is any worse than:
private String foo;
public void setFoo(String foo) { this.foo = foo; }
public String getFoo() { return foo; }
Whereas the former takes a lot less boilerplate code.
There are actually many good reasons to consider using accessors rather than directly exposing fields of a class - beyond just the argument of encapsulation and making future changes easier.
Here are some of the reasons I am aware of:
Encapsulation of behavior associated with getting or setting the property - this allows additional functionality (like validation) to be added more easily later.
Hiding the internal representation of the property while exposing a property using an alternative representation.
Insulating your public interface from change - allowing the public interface to remain constant while the implementation changes without affecting existing consumers.
Controlling the lifetime and memory management (disposal) semantics of the property - particularly important in non-managed memory environments (like C++ or Objective-C).
Providing a debugging interception point for when a property changes at runtime - debugging when and where a property changed to a particular value can be quite difficult without this in some languages.
Improved interoperability with libraries that are designed to operate against property getter/setters - Mocking, Serialization, and WPF come to mind.
Allowing inheritors to change the semantics of how the property behaves and is exposed by overriding the getter/setter methods.
Allowing the getter/setter to be passed around as lambda expressions rather than values.
Getters and setters can allow different access levels - for example the get may be public, but the set could be protected.
Because 2 weeks (months, years) from now when you realize that your setter needs to do more than just set the value, you'll also realize that the property has been used directly in 238 other classes :-)
A public field is not worse than a getter/setter pair that does nothing except returning the field and assigning to it. First, it's clear that (in most languages) there is no functional difference. Any difference must be in other factors, like maintainability or readability.
An oft-mentioned advantage of getter/setter pairs, isn't. There's this claim that you can change the implementation and your clients don't have to be recompiled. Supposedly, setters let you add functionality like validation later on and your clients don't even need to know about it. However, adding validation to a setter is a change to its preconditions, a violation of the previous contract, which was, quite simply, "you can put anything in here, and you can get that same thing later from the getter".
So, now that you broke the contract, changing every file in the codebase is something you should want to do, not avoid. If you avoid it you're making the assumption that all the code assumed the contract for those methods was different.
If that should not have been the contract, then the interface was allowing clients to put the object in invalid states. That's the exact opposite of encapsulation. If that field could not really be set to anything from the start, why wasn't the validation there from the start?
This same argument applies to other supposed advantages of these pass-through getter/setter pairs: if you later decide to change the value being set, you're breaking the contract. If you override the default functionality in a derived class, in a way beyond a few harmless modifications (like logging or other non-observable behaviour), you're breaking the contract of the base class. That is a violation of the Liskov Substitution Principle, which is seen as one of the tenets of OO.
If a class has these dumb getters and setters for every field, then it is a class that has no invariants whatsoever, no contract. Is that really object-oriented design? If all the class has is those getters and setters, it's just a dumb data holder, and dumb data holders should look like dumb data holders:
class Foo {
public:
int DaysLeft;
int ContestantNumber;
};
Adding pass-through getter/setter pairs to such a class adds no value. Other classes should provide meaningful operations, not just operations that fields already provide. That's how you can define and maintain useful invariants.
Client: "What can I do with an object of this class?"
Designer: "You can read and write several variables."
Client: "Oh... cool, I guess?"
There are reasons to use getters and setters, but if those reasons don't exist, making getter/setter pairs in the name of false encapsulation gods is not a good thing. Valid reasons to make getters or setters include the things often mentioned as the potential changes you can make later, like validation or different internal representations. Or maybe the value should be readable by clients but not writable (for example, reading the size of a dictionary), so a simple getter is a nice choice. But those reasons should be there when you make the choice, and not just as a potential thing you may want later. This is an instance of YAGNI (You Ain't Gonna Need It).
Lots of people talk about the advantages of getters and setters but I want to play devil's advocate. Right now I'm debugging a very large program where the programmers decided to make everything getters and setters. That might seem nice, but it's a reverse-engineering nightmare.
Say you're looking through hundreds of lines of code and you come across this:
person.name = "Joe";
It's a beautifully simple piece of code until you realize it's a setter. Now, you follow that setter and find that it also sets person.firstName, person.lastName, person.isHuman, person.hasReallyCommonFirstName, and calls person.update(), which sends a query out to the database, etc. Oh, that's where your memory leak was occurring.
Understanding a local piece of code at first glance is an important property of good readability that getters and setters tend to break. That is why I try to avoid them when I can, and minimize what they do when I use them.
In a pure object-oriented world, getters and setters are a terrible anti-pattern. Read this article: Getters/Setters. Evil. Period. In a nutshell, they encourage programmers to think about objects as data structures, and this type of thinking is purely procedural (like in COBOL or C). In an object-oriented language there are no data structures, but only objects that expose behavior (not attributes/properties!).
You may find more about them in Section 3.5 of Elegant Objects (my book about object-oriented programming).
There are many reasons. My favorite one is when you need to change the behavior or regulate what you can set on a variable. For instance, let's say you had a setSpeed(int speed) method, but you want to allow a maximum speed of only 100. You would do something like:
public void setSpeed(int speed) {
if ( speed > 100 ) {
this.speed = 100;
} else {
this.speed = speed;
}
}
Now what if EVERYWHERE in your code you were using the public field and then you realized you need the above requirement? Have fun hunting down every usage of the public field instead of just modifying your setter.
My 2 cents :)
One advantage of accessors and mutators is that you can perform validation.
For example, if foo was public, I could easily set it to null and then someone else could try to call a method on the object. But it's not there anymore! With a setFoo method, I could ensure that foo was never set to null.
Accessors and mutators also allow for encapsulation - if you aren't supposed to see the value once it's set (perhaps it's set in the constructor and then used by methods, but never supposed to be changed), it will never be seen by anyone. But if you can allow other classes to see or change it, you can provide the proper accessor and/or mutator.
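The null check mentioned above could look roughly like this. FooHolder is a hypothetical class name; foo mirrors the field from the question, and using Objects.requireNonNull is just one of several reasonable ways to reject null.

```java
import java.util.Objects;

// Hypothetical holder: the setter refuses null, so foo is never null afterwards.
class FooHolder {
    private String foo = "";

    public void setFoo(String foo) {
        // Throws NullPointerException at the bad call site, before anything is stored.
        this.foo = Objects.requireNonNull(foo, "foo must not be null");
    }

    public String getFoo() {
        return foo;
    }
}
```

With a public field the failure would instead surface later, wherever some other code happened to call a method on the null value.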
Thanks, that really clarified my thinking. Now here are (almost) 10 (almost) good reasons NOT to use getters and setters:
When you realize you need to do more than just set and get the value, you can just make the field private, which will instantly tell you where you've directly accessed it.
Any validation you perform in there can only be context free, which validation rarely is in practice.
You can change the value being set - this is an absolute nightmare when the caller passes you a value that they [shock horror] want you to store AS IS.
You can hide the internal representation - fantastic, so you're making sure that all these operations are symmetrical right?
You've insulated your public interface from changes under the sheets - if you were designing an interface and weren't sure whether direct access to something was OK, then you should have kept designing.
Some libraries expect this, but not many - reflection, serialization, mock objects all work just fine with public fields.
Inheriting this class, you can override default functionality - in other words you can REALLY confuse callers by not only hiding the implementation but making it inconsistent.
The last three I'm just leaving (N/A or D/C)...
Depends on your language. You've tagged this "object-oriented" rather than "Java", so I'd like to point out that ChssPly76's answer is language-dependent. In Python, for instance, there is no reason to use getters and setters. If you need to change the behavior, you can use a property, which wraps a getter and setter around basic attribute access. Something like this:
class Simple(object):
    def _get_value(self):
        return self._value - 1
    def _set_value(self, new_value):
        self._value = new_value + 1
    def _del_value(self):
        self.old_values.append(self._value)
        del self._value
    value = property(_get_value, _set_value, _del_value)
Well, I just want to add that even if they are sometimes necessary for the encapsulation and security of your variables/objects, if we want to write a truly object-oriented program, then we need to STOP OVERUSING THE ACCESSORS, because sometimes we depend on them a lot when it is not really necessary, and that is almost the same as making the variables public.
EDIT: I answered this question because there are a bunch of people learning programming asking this, and most of the answers are very technically competent, but they're not as easy to understand if you're a newbie. We were all newbies, so I thought I'd try my hand at a more newbie friendly answer.
The two main ones are polymorphism, and validation. Even if it's just a stupid data structure.
Let's say we have this simple class:
public class Bottle {
public int amountOfWaterMl;
public int capacityMl;
}
A very simple class that holds how much liquid is in it, and what its capacity is (in milliliters).
What happens when I do:
Bottle bot = new Bottle();
bot.amountOfWaterMl = 1500;
bot.capacityMl = 1000;
Well, you wouldn't expect that to work, right?
You want there to be some kind of sanity check. And worse, what if I never specified the maximum capacity? Oh dear, we have a problem.
But there's another problem too. What if bottles were just one type of container? What if we had several containers, all with capacities and amounts of liquid filled? If we could just make an interface, we could let the rest of our program accept that interface, and bottles, jerrycans and all sorts of stuff would just work interchangeably. Wouldn't that be better? Since interfaces demand methods, this is also a good thing.
We'd end up with something like:
public interface LiquidContainer {
public int getAmountMl();
public void setAmountMl(int amountMl);
public int getCapacityMl();
}
Great! And now we just change Bottle to this:
public class Bottle implements LiquidContainer {
    private int capacityMl;
    private int amountFilledMl;
    public Bottle(int capacityMl, int amountFilledMl) {
        this.capacityMl = capacityMl;
        this.amountFilledMl = amountFilledMl;
        checkNotOverFlow();
    }
    public int getAmountMl() {
        return amountFilledMl;
    }
    public void setAmountMl(int amountMl) {
        this.amountFilledMl = amountMl;
        checkNotOverFlow();
    }
    public int getCapacityMl() {
        return capacityMl;
    }
    private void checkNotOverFlow() {
        // compare the amount currently filled against the capacity
        if (amountFilledMl > capacityMl) {
            throw new BottleOverflowException();
        }
    }
}
I'll leave the definition of the BottleOverflowException as an exercise to the reader.
Now notice how much more robust this is. We can deal with any type of container in our code now by accepting LiquidContainer instead of Bottle. And how these bottles deal with this sort of stuff can all differ. You can have bottles that write their state to disk when it changes, or bottles that save on SQL databases or GNU knows what else.
And all these can have different ways to handle various whoopsies. The Bottle just checks and if it's overflowing it throws a RuntimeException. But that might be the wrong thing to do.
(There is a useful discussion to be had about error handling, but I'm keeping it very simple here on purpose. People in comments will likely point out the flaws of this simplistic approach. ;) )
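For readers who want something they can compile, here is one possible completion of the idea. It is only a sketch: BottleOverflowException is assumed to be an unchecked exception (the text above mentions a RuntimeException), SafeBottle and Pourer are hypothetical names, and unlike the Bottle above this version validates before assigning so that a rejected call leaves the state unchanged.

```java
// Assumed definition: unchecked, so callers are not forced to catch it.
class BottleOverflowException extends RuntimeException {
    BottleOverflowException(String message) {
        super(message);
    }
}

interface LiquidContainer {
    int getAmountMl();
    void setAmountMl(int amountMl);
    int getCapacityMl();
}

final class SafeBottle implements LiquidContainer {
    private final int capacityMl; // fixed at construction, no setter
    private int amountFilledMl;

    SafeBottle(int capacityMl, int amountFilledMl) {
        this.capacityMl = capacityMl;
        setAmountMl(amountFilledMl); // reuse the same check as later updates
    }

    @Override
    public int getAmountMl() {
        return amountFilledMl;
    }

    @Override
    public void setAmountMl(int amountMl) {
        if (amountMl > capacityMl) {
            throw new BottleOverflowException(amountMl + " ml does not fit in " + capacityMl + " ml");
        }
        this.amountFilledMl = amountMl; // only reached for valid amounts
    }

    @Override
    public int getCapacityMl() {
        return capacityMl;
    }
}

// Hypothetical client: accepts any LiquidContainer, not just bottles.
class Pourer {
    static void topUp(LiquidContainer container) {
        container.setAmountMl(container.getCapacityMl());
    }
}
```

Whether throwing is the right reaction to an overflow is a separate question, as the answer itself notes.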
And yes, it seems like we go from a very simple idea to getting much better answers quickly.
Please note also that you can't change the capacity of a bottle. It's now set in stone. You could do this with an int by declaring it final. But if this was a list, you could empty it, add new things to it, and so on. You can't limit the access to touching the innards.
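To illustrate the list case, here is a small sketch. The Crate class is hypothetical; it shows that a final field does not protect a list's contents, and that a getter can hand out a read-only view instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class: final only fixes the reference, not what the list contains.
class Crate {
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        items.add(item); // the only sanctioned way to change the contents
    }

    // Read-only view: callers can look, but add/remove/clear throw
    // UnsupportedOperationException.
    List<String> getItems() {
        return Collections.unmodifiableList(items);
    }
}
```

Had items been a public final field, any caller could still have emptied it or added to it directly.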
There's also a third thing that not everyone has addressed: getters and setters use method calls. That means that they look like normal methods, just like everywhere else. Instead of having weird specific syntax for DTOs and stuff, you have the same thing everywhere.
I know it's a bit late, but I think there are some people who are interested in performance.
I've done a little performance test. I wrote a class "NumberHolder" which, well, holds an Integer. You can either read that Integer by using the getter method
anInstance.getNumber() or by directly accessing the number by using anInstance.number. My program reads the number 1,000,000,000 times, via both ways. That process is repeated five times and the time is printed. I've got the following result:
Time 1: 953ms, Time 2: 741ms
Time 1: 655ms, Time 2: 743ms
Time 1: 656ms, Time 2: 634ms
Time 1: 637ms, Time 2: 629ms
Time 1: 633ms, Time 2: 625ms
(Time 1 is the direct way, Time 2 is the getter)
You see, the getter is (almost) always a bit faster. Then I tried with different numbers of cycles. Instead of 1 million, I used 10 million and 0.1 million.
The results:
10 million cycles:
Time 1: 6382ms, Time 2: 6351ms
Time 1: 6363ms, Time 2: 6351ms
Time 1: 6350ms, Time 2: 6363ms
Time 1: 6353ms, Time 2: 6357ms
Time 1: 6348ms, Time 2: 6354ms
With 10 million cycles, the times are almost the same.
Here are 100 thousand (0.1 million) cycles:
Time 1: 77ms, Time 2: 73ms
Time 1: 94ms, Time 2: 65ms
Time 1: 67ms, Time 2: 63ms
Time 1: 65ms, Time 2: 65ms
Time 1: 66ms, Time 2: 63ms
Also with different amounts of cycles, the getter is a little bit faster than the regular way. I hope this helped you.
Don't use getters and setters unless they are needed for your current delivery, i.e. don't think too much about what might happen in the future; in most production applications and systems, anything that has to change later is a change request anyway.
Think simple and easy; add complexity when needed.
I would not take advantage of business owners' lack of deep technical know-how just because I think an approach is correct or because I like it.
I have a massive system written without getters and setters, only with access modifiers and some methods that validate and perform business logic. If you absolutely need them, use them.
We use getters and setters:
for reusability
to perform validation in later stages of programming
Getter and setter methods are public interfaces to access private class members.
Encapsulation mantra
The encapsulation mantra is to make fields private and methods public.
Getter Methods: We can get access to private variables.
Setter Methods: We can modify private fields.
Even though the getter and setter methods do not add new functionality, we can change our mind and come back later to make that method
better;
safer; and
faster.
Anywhere a value can be used, a method that returns that value can be added. Instead of:
int x = 1000 - 500;
use
int x = 1000 - class_name.getValue();
In layman's terms
Suppose we need to store the details of a Person. This Person has the fields name, age and sex. Doing this involves creating methods for name, age and sex. Now if we need to create another person, it becomes necessary to create the methods for name, age and sex all over again.
Instead of doing this, we can create a bean class (Person) with getter and setter methods. So tomorrow we can just create objects of this bean class (Person class) whenever we need to add a new person (see the figure). Thus we are reusing the fields and methods of the bean class, which is much better.
I spent quite a while thinking this over for the Java case, and I believe the real reasons are:
Code to the interface, not the implementation
Interfaces only specify methods, not fields
In other words, the only way you can specify a field in an interface is by providing a method for writing a new value and a method for reading the current value.
Those methods are the infamous getter and setter....
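A short sketch of what that looks like in practice; Named, SimplePerson and Greeter are hypothetical names used only for illustration.

```java
// An interface cannot declare instance state. (Fields declared in a Java
// interface are implicitly public static final constants.) So a "name"
// property has to be expressed as methods.
interface Named {
    String getName();
    void setName(String name);
}

// One possible implementation that happens to back the property with a field.
class SimplePerson implements Named {
    private String name = "";

    @Override
    public String getName() {
        return name;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }
}

// Client code written against the interface, not the implementation.
class Greeter {
    static String greet(Named named) {
        return "Hello, " + named.getName();
    }
}
```

Another implementation could compute the name or fetch it from elsewhere, and Greeter would not need to change.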
It can be useful for lazy-loading. Say the object in question is stored in a database, and you don't want to go get it unless you need it. If the object is retrieved by a getter, then the internal object can be null until somebody asks for it, then you can go get it on the first call to the getter.
I had a base page class in a project that was handed to me that was loading some data from a couple different web service calls, but the data in those web service calls wasn't always used in all child pages. Web services, for all of the benefits, pioneer new definitions of "slow", so you don't want to make a web service call if you don't have to.
I moved from public fields to getters, and now the getters check the cache, and if it's not there call the web service. So with a little wrapping, a lot of web service calls were prevented.
So the getter saves me from trying to figure out, on each child page, what I will need. If I need it, I call the getter, and it goes to find it for me if I don't already have it.
protected YourType _yourName = null;
public YourType YourName
{
    get
    {
        // Create the value only on first access, then reuse it.
        if (_yourName == null)
        {
            _yourName = new YourType();
        }
        return _yourName;
    }
}
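The snippet above is C#. Since the question is about Java, here is a rough Java equivalent of the same lazy getter. It is only a sketch: LazyPage is a hypothetical name, and the Supplier stands in for the slow web service call, which is assumed never to return null.

```java
import java.util.function.Supplier;

// Hypothetical page base class: the expensive value is fetched on first use only.
class LazyPage {
    private final Supplier<String> slowService; // stands in for a web service call
    private String cached;                      // null until somebody asks for it

    LazyPage(Supplier<String> slowService) {
        this.slowService = slowService;
    }

    // Not thread-safe; good enough to show the idea in single-threaded code.
    public String getData() {
        if (cached == null) {
            cached = slowService.get(); // reached only while nothing is cached
        }
        return cached;
    }
}

class LazyPageDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        LazyPage page = new LazyPage(() -> { calls[0]++; return "payload"; });
        page.getData();
        page.getData();
        System.out.println("service calls: " + calls[0]);
    }
}
```

A child page that never calls getData() never triggers the service call at all, which is the saving described above.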
One aspect I missed in the answers so far, the access specification:
for members you have only one access specification for both setting and getting
for setters and getters you can fine tune it and define it separately
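A short sketch of that point, using a hypothetical Account class: with a plain field there is one modifier for both reading and writing, while with accessors the two can differ.

```java
// Hypothetical class: reading is public, writing is limited to subclasses
// and to classes in the same package.
class Account {
    private long balance;

    public long getBalance() {               // anyone may read
        return balance;
    }

    protected void setBalance(long balance) { // narrower write access
        this.balance = balance;
    }

    public void deposit(long amount) {
        setBalance(balance + amount);
    }
}

class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(50);
        System.out.println(account.getBalance());
    }
}
```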
In languages which don't support "properties" (C++, Java) or require recompilation of clients when changing fields to properties (C#), using get/set methods is easier to modify. For example, adding validation logic to a setFoo method will not require changing the public interface of a class.
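As an illustration of the validation case in Java (Widget and the non-negative rule are assumptions made up for this sketch), the setter keeps its signature while its body gains a check:

```java
class Widget {
    private int foo;

    public int getFoo() {
        return foo;
    }

    // Same signature as a plain pass-through setter; only the body changed.
    // The "non-negative" rule is an arbitrary example of validation.
    public void setFoo(int foo) {
        if (foo < 0) {
            throw new IllegalArgumentException("foo must be non-negative: " + foo);
        }
        this.foo = foo;
    }
}

class WidgetDemo {
    public static void main(String[] args) {
        Widget w = new Widget();
        w.setFoo(5); // existing callers compile and behave as before
        try {
            w.setFoo(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(w.getFoo());
    }
}
```

Callers written against the earlier, unvalidated setFoo do not need to be changed or recompiled, because the method signature is the same.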
In languages which support "real" properties (Python, Ruby, maybe Smalltalk?) there is no point to get/set methods.
One of the basic principles of OO design: Encapsulation!
It gives you many benefits, one of which being that you can change the implementation of the getter/setter behind the scenes but any consumer of that value will continue to work as long as the data type remains the same.
You should use getters and setters when:
You're dealing with something that is conceptually an attribute, but:
Your language doesn't have properties (or some similar mechanism, like Tcl's variable traces), or
Your language's property support isn't sufficient for this use case, or
Your language's (or sometimes your framework's) idiomatic conventions encourage getters or setters for this use case.
So this is very rarely a general OO question; it's a language-specific question, with different answers for different languages (and different use cases).
From an OO theory point of view, getters and setters are useless. The interface of your class is what it does, not what its state is. (If not, you've written the wrong class.) In very simple cases, where what a class does is just, e.g., represent a point in rectangular coordinates,* the attributes are part of the interface; getters and setters just cloud that. But in anything but very simple cases, neither the attributes nor getters and setters are part of the interface.
Put another way: If you believe that consumers of your class shouldn't even know that you have a spam attribute, much less be able to change it willy-nilly, then giving them a set_spam method is the last thing you want to do.
* Even for that simple class, you may not necessarily want to allow setting the x and y values. If this is really a class, shouldn't it have methods like translate, rotate, etc.? If it's only a class because your language doesn't have records/structs/named tuples, then this isn't really a question of OO…
But nobody is ever doing general OO design. They're doing design, and implementation, in a specific language. And in some languages, getters and setters are far from useless.
If your language doesn't have properties, then the only way to represent something that's conceptually an attribute, but is actually computed, or validated, etc., is through getters and setters.
Even if your language does have properties, there may be cases where they're insufficient or inappropriate. For example, if you want to allow subclasses to control the semantics of an attribute, in languages without dynamic access, a subclass can't substitute a computed property for an attribute.
As for the "what if I want to change my implementation later?" question (which is repeated multiple times in different wording in both the OP's question and the accepted answer): If it really is a pure implementation change, and you started with an attribute, you can change it to a property without affecting the interface. Unless, of course, your language doesn't support that. So this is really just the same case again.
Also, it's important to follow the idioms of the language (or framework) you're using. If you write beautiful Ruby-style code in C#, any experienced C# developer other than you is going to have trouble reading it, and that's bad. Some languages have stronger cultures around their conventions than others, and it may not be a coincidence that Java and Python, which are on opposite ends of the spectrum for how idiomatic getters are, happen to have two of the strongest cultures.
Beyond human readers, there will be libraries and tools that expect you to follow the conventions, and make your life harder if you don't. Hooking Interface Builder widgets to anything but ObjC properties, or using certain Java mocking libraries without getters, is just making your life more difficult. If the tools are important to you, don't fight them.
From an object-oriented design standpoint, both alternatives can be damaging to the maintenance of the code by weakening the encapsulation of the classes. For a discussion you can look into this excellent article: http://typicalprogrammer.com/?p=23
Code evolves. private is great for when you need data member protection. Eventually all classes should be sort of "miniprograms" that have a well-defined interface that you can't just screw with the internals of.
That said, software development isn't about setting down that final version of the class as if you're pressing some cast iron statue on the first try. While you're working with it, code is more like clay. It evolves as you develop it and learn more about the problem domain you are solving. During development classes may interact with each other more than they should (a dependency you plan to factor out), merge together, or split apart. So I think the debate boils down to people not wanting to religiously write
int getVar() const { return var ; }
So you have:
doSomething( obj->getVar() ) ;
Instead of
doSomething( obj->var ) ;
Not only is getVar() visually noisy, it gives this illusion that gettingVar() is somehow a more complex process than it really is. How you (as the class writer) regard the sanctity of var is particularly confusing to a user of your class if it has a passthru setter -- then it looks like you're putting up these gates to "protect" something you insist is valuable, (the sanctity of var) but yet even you concede var's protection isn't worth much by the ability for anyone to just come in and set var to whatever value they want, without you even peeking at what they are doing.
So I program as follows (assuming an "agile" type approach -- ie when I write code not knowing exactly what it will be doing/don't have time or experience to plan an elaborate waterfall style interface set):
1) Start with all public members for basic objects with data and behavior. This is why in all my C++ "example" code you'll notice me using struct instead of class everywhere.
2) When an object's internal behavior for a data member becomes complex enough, (for example, it likes to keep an internal std::list in some kind of order), accessor type functions are written. Because I'm programming by myself, I don't always set the member private right away, but somewhere down the evolution of the class the member will be "promoted" to either protected or private.
3) Classes that are fully fleshed out and have strict rules about their internals (ie they know exactly what they are doing, and you are not to "fuck" (technical term) with its internals) are given the class designation, default private members, and only a select few members are allowed to be public.
I find this approach allows me to avoid sitting there and religiously writing getter/setters when a lot of data members get migrated out, shifted around, etc. during the early stages of a class's evolution.
A good reason to consider using accessors is that fields are not overridden by inheritance: a subclass field with the same name only hides the parent's field. See the next example:
public class TestPropertyOverride {
public static class A {
public int i = 0;
public void add() {
i++;
}
public int getI() {
return i;
}
}
public static class B extends A {
public int i = 2;
@Override
public void add() {
i = i + 2;
}
@Override
public int getI() {
return i;
}
}
public static void main(String[] args) {
A a = new B();
System.out.println(a.i);
a.add();
System.out.println(a.i);
System.out.println(a.getI());
}
}
Output:
0
0
4
Getters and setters are used to implement two of the fundamental aspects of Object Oriented Programming which are:
Abstraction
Encapsulation
Suppose we have an Employee class:
package com.highmark.productConfig.types;
public class Employee {
private String firstName;
private String middleName;
private String lastName;
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getMiddleName() {
return middleName;
}
public void setMiddleName(String middleName) {
this.middleName = middleName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public String getFullName(){
return this.getFirstName() + this.getMiddleName() + this.getLastName();
}
}
Here the implementation details of the full name are hidden from the user and are not directly accessible, unlike a public attribute.
There is a difference between a data structure and an object.
A data structure should expose its innards and not behavior.
An object should not expose its innards but should expose its behavior, an idea closely tied to the Law of Demeter.
DTOs are mostly considered data structures rather than objects. They should only expose their data and not behavior. Having setters/getters in a data structure exposes behavior instead of the data inside it. This further increases the chance of violating the Law of Demeter.
Uncle Bob in his book Clean code explained the Law of Demeter.
There is a well-known heuristic called the Law of Demeter that says a
module should not know about the innards of the objects it
manipulates. As we saw in the last section, objects hide their data
and expose operations. This means that an object should not expose its
internal structure through accessors because to do so is to expose,
rather than to hide, its internal structure.
More precisely, the Law of Demeter says that a method f of a class C
should only call the methods of these:
C
An object created by f
An object passed as an argument to f
An object held in an instance variable of C
The method should not invoke methods on objects that are returned by any of the allowed functions.
In other words, talk to friends, not to strangers.
So according to this, an example of an LoD violation is:
final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();
Here, the function should call only the methods of its immediate friend, which is ctxt; it should not call the methods of its immediate friend's friend. But this rule doesn't apply to data structures. So if ctxt, options and scratchDir are data structures, why wrap their internal data in behavior and thereby violate the LoD?
Instead, we can do something like this.
final String outputDir = ctxt.options.scratchDir.absolutePath;
This fulfills our needs and doesn't even violate LoD.
Inspired by Clean Code by Robert C. Martin(Uncle Bob)
If you don't require any validation and don't need to maintain state (i.e. one property depends on another, so the state must be kept consistent when one of them changes), you can keep it simple by making the field public and not using getters and setters.
I think OOP complicates things: as the program grows, it becomes a nightmare for the developer to scale.
A simple example: we generate C++ headers from XML. The header contains simple fields which do not require any validation. But since accessors are the fashion in OOP, we still generate them as follows.
const Field& getField() const
Field& getField()
void setField(const Field& field){...}
which is very verbose and is not required. A simple
struct
{
Field field;
};
is enough and readable.
Functional programming languages don't have the concept of data hiding; they don't even need it, since they do not mutate data.
Additionally, this is to "future-proof" your class. In particular, changing from a field to a property is an ABI break, so if you do later decide that you need more logic than just "set/get the field", then you need to break ABI, which of course creates problems for anything else already compiled against your class.
One other use (in languages that support properties) is that setters and getters can imply that an operation is non-trivial. Typically, you want to avoid doing anything that's computationally expensive in a property.
One relatively modern advantage of getters/setters is that they make it easier to browse code in tagged (indexed) code editors. E.g., if you want to see who sets a member, you can open the call hierarchy of the setter.
On the other hand, if the member is public, the tools don't make it possible to filter read/write access to the member. So you have to trudge through all uses of the member.
Getters and setters come from data hiding. Data hiding means that we hide data from outsiders, so that an outside person or thing cannot access our data. This is a useful feature in OOP.
As an example:
If you create a public variable, you can access that variable and change its value anywhere (in any class). But if you make it private, that variable cannot be seen or accessed in any class except the declaring class.
public and private are access modifiers.
So how can we access that variable from outside?
This is where getters and setters come in. You can declare the variable as private and then implement a getter and a setter for it.
Example(Java):
private String name;
public String getName(){
return this.name;
}
public void setName(String name){
this.name= name;
}
Advantage:
When anyone wants to read or change the value of the name variable, he/she must have permission.
//assume we have a person1 object
//to give permission to read the name
person1.getName();
//to give permission to set the name
person1.setName("new name");
You can also set the value in the constructor, but when you later want to update or change the value, you have to implement a setter method.

Whats the difference between objects and data structures?

I've been reading the book Clean Code: A Handbook of Agile Software Craftsmanship and in chapter six pages 95-98 it clarifies about the differences between objects and data structures:
Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
Objects expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects.
Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.
I'm a tad bit confused whether some classes are objects or data structures. Say for example HashMaps in java.util, are they objects (because of their methods like put() and get(), whose inner workings we don't know)? Or are they data structures? (I've always thought of them as data structures because a HashMap is a Map.)
Strings as well, are they data structures or objects?
So far the majority of the code I've been writing has been so-called "hybrid classes", which try to act as an object and a data structure at the same time. Any tips on how to avoid them as well?
The distinction between data structures and classes/objects is harder to explain in Java than in C++. In C, there are no classes, only data structures, which are nothing more than "containers" of typed and named fields. C++ inherited these "structs", so you can have both "classic" data structures and "real objects".
In Java, you can "emulate" C-style data structures using classes that have no methods and only public fields:
public class VehicleStruct
{
public Engine engine;
public Wheel[] wheels;
}
A user of VehicleStruct knows about the parts a vehicle is made of, and can directly interact with these parts. Behavior, i.e. functions, have to be defined outside of the class. That's why it is easy to change behavior: Adding new functions won't require existing code to change. Changing data, on the other hand, requires changes in virtually every function interacting with VehicleStruct. It violates encapsulation!
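A rough sketch of what "behavior defined outside of the class" can look like. Engine and Wheel are left undefined in the answer, so the versions below, their fields, and the VehicleFunctions helper are all assumptions for illustration:

```java
// Hypothetical part types, reduced to one field each.
class Engine { public int horsepower; }
class Wheel { public double pressureBar; }

class VehicleStruct {
    public Engine engine;
    public Wheel[] wheels;
}

// Behavior lives outside the data structure, as static functions.
final class VehicleFunctions {
    private VehicleFunctions() {}

    static int wheelCount(VehicleStruct v) {
        return v.wheels.length;
    }

    // Added later: no change to VehicleStruct or wheelCount was needed.
    static double averagePressure(VehicleStruct v) {
        double sum = 0;
        for (Wheel w : v.wheels) {
            sum += w.pressureBar;
        }
        return v.wheels.length == 0 ? 0 : sum / v.wheels.length;
    }
}

class VehicleStructDemo {
    public static void main(String[] args) {
        VehicleStruct v = new VehicleStruct();
        v.wheels = new Wheel[] { new Wheel(), new Wheel() };
        v.wheels[0].pressureBar = 2.0;
        v.wheels[1].pressureBar = 3.0;
        System.out.println(VehicleFunctions.wheelCount(v));
        System.out.println(VehicleFunctions.averagePressure(v));
    }
}
```

The flip side is visible too: if wheels became a List or were renamed, both functions would have to change.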
The idea behind OOP is to hide the data and expose behavior instead. It focuses on what you can do with a vehicle without having to know whether it has an engine or how many wheels are installed:
public class Vehicle
{
private Details hidden;
public void startEngine() { ... }
public void shiftInto(int gear) { ... }
public void accelerate(double amount) { ... }
public void brake(double amount) { ... }
}
Notice how the Vehicle could be a motorcycle, a car, a truck, or a tank -- you don't need to know the details. Changing data is easy -- nobody outside the class knows about data so no user of the class needs to be changed. Changing behavior is difficult: All subclasses must be adjusted when a new (abstract) function is added to the class.
Now, following the "rules of encapsulation", you could understand hiding the data as simply making the fields private and adding accessor methods to VehicleStruct:
public class VehicleStruct
{
private Engine engine;
private Wheel[] wheels;
public Engine getEngine() { return engine; }
public Wheel[] getWheels() { return wheels; }
}
In his book, Uncle Bob argues that by doing this, you still have a data structure and not an object. You are still just modeling the vehicle as the sum of its parts, and expose these parts using methods. It is essentially the same as the version with public fields and a plain old C struct -- hence a data structure. Hiding data and exposing methods is not enough to create an object, you have to consider if the methods actually expose behavior or just the data!
When you mix the two approaches, e.g. exposing getEngine() along with startEngine(), you end up with a "hybrid". I don't have Martin's book at hand, but I remember that he did not recommend hybrids at all, as you end up with the worst of both worlds: objects where both data and behavior are hard to change.
Your questions concerning HashMaps and Strings are a bit tricky, as these are pretty low level and don't fit quite well in the kinds of classes you will be writing for your applications. Nevertheless, using the definitions given above, you should be able to answer them.
A HashMap is an object. It exposes its behavior to you and hides all the nasty hashing details. You tell it to put and get data, and don't care which hash function is used, how many "buckets" there are, and how collisions are handled. Actually, you are using HashMap solely through its Map interface, which is quite a good indication of abstraction and "real" objects.
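A small sketch of that last point (countWords is a made-up helper). The method below is written against the Map interface only, so a HashMap can be swapped for a TreeMap without touching it:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class MapAbstractionDemo {
    // Depends only on the Map interface; hashing, buckets and collision
    // handling never show up here.
    static int countWords(Map<String, Integer> counts, String... words) {
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        return counts.size(); // number of distinct words seen
    }

    public static void main(String[] args) {
        // Swapping the implementation does not require changing countWords.
        System.out.println(countWords(new HashMap<>(), "a", "b", "a"));
        System.out.println(countWords(new TreeMap<>(), "a", "b", "a"));
    }
}
```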
Don't get confused that you can use instances of a Map as a replacement for a data structure!
// A data structure
public class Point {
public int x;
public int y;
}
// A Map _instance_ used instead of a data structure!
Map<String, Integer> data = new HashMap<>();
data.put("x", 1);
data.put("y", 2);
A String, on the other hand, is pretty much an array of characters, and does not try to hide this very much. I guess one could call it a data structure, but to be honest I am not sure if much is to be gained one way or the other.
This is what, I believe, Robert C. Martin was trying to convey:
Data Structures are classes that simply act as containers of structured data. For example:
public class Point {
public double x;
public double y;
}
Objects, on the other hand, are used to create abstractions. An abstraction is understood as:
"a simplification of something much more complicated that is going on under the covers" (The Law of Leaky Abstractions, Joel on Software)
So, objects hide all their underpinnings and only let you manipulate the essence of their data in a simplified way. For instance:
public interface Point {
double getX();
double getY();
void setCartesian(double x, double y);
double getR();
double getTheta();
void setPolar(double r, double theta);
}
Where we don't know how the Point is implemented, but we do know how to consume it.
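To show why the consumer cannot tell how the Point is stored, here is one possible implementation, purely as a sketch. PolarPoint is a hypothetical class that keeps polar coordinates internally and computes the Cartesian ones on demand; a Cartesian-backed class could replace it without any caller noticing.

```java
// The interface from the answer, repeated so the sketch compiles on its own.
interface Point {
    double getX();
    double getY();
    void setCartesian(double x, double y);
    double getR();
    double getTheta();
    void setPolar(double r, double theta);
}

// One possible implementation (an assumption, not the author's).
class PolarPoint implements Point {
    private double r;
    private double theta;

    @Override public double getX() { return r * Math.cos(theta); }
    @Override public double getY() { return r * Math.sin(theta); }

    @Override public void setCartesian(double x, double y) {
        r = Math.hypot(x, y);
        theta = Math.atan2(y, x);
    }

    @Override public double getR() { return r; }
    @Override public double getTheta() { return theta; }

    @Override public void setPolar(double r, double theta) {
        this.r = r;
        this.theta = theta;
    }
}

class PointDemo {
    public static void main(String[] args) {
        Point p = new PolarPoint();
        p.setCartesian(3.0, 4.0);
        System.out.println(p.getR()); // distance from the origin
    }
}
```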
As I see it, what Robert Martin tries to convey is that objects should not expose their data via getters and setters unless their sole purpose is to act as simple data containers. Good examples of such containers might be Java beans, entity objects (from object mapping of DB entities), etc.
The Java Collections Framework classes, however, are not a good example of what he's referring to, since they don't really expose their internal data (which in a lot of cases is basic arrays). They provide an abstraction that lets you retrieve the objects they contain. Thus (in my POV) they fit in the "Objects" category.
The reasons are stated by the quotes you added from the book, but there are more good reasons for refraining from exposing the internals. Classes that provide getters and setters invite breaches of the Law of Demeter, for instance. On top of that, knowing the structure of the state of some class (knowing which getters/setters it has) reduces the ability to abstract the implementation of that class. There are many more reasons of that sort.
An object is an instance of a class.
A class can model various things from the real world. It's an abstraction of something (car, socket, map, connection, student, teacher, you name it).
A data structure is a structure which organizes certain data in a certain way.
You can implement structures in ways other than by using classes (that's what you do in languages which don't support OOP; e.g., you can still implement a data structure in C).
HashMap in java is a class which models a map data structure using hash-based implementation, that's why it's called HashMap.
Socket in java is a class which doesn't model a data structure but something else (a socket).
A data structure is only an abstraction, a special way of representing data. They are just human-made constructs, which help in reducing complexity at the high-level, i.e. to not work in the low-level. An object may seem to mean the same thing, but the major difference between objects and data structures is that an object might abstract anything. It also offers behaviour. A data structure does not have any behaviour because it is just data-holding memory.
Library classes such as Map, List, etc. are classes which represent data structures. They implement and set up a data structure so that you can easily work with it in your programs by creating instances of them (i.e. objects).
"Data structure" (DS) is an abstract way of saying that a structure holds some data. A HashMap with some key-value pairs is a data structure in Java. Associative arrays play a similar role in PHP, etc. Objects sit a little lower than the DS level. Your HashMap is a data structure. To use a HashMap, you create an "object" of it and add data to that object using the put method. I can have my own class Employee which has data and is thus a DS for me. But to use this DS for some operation, like seeing whether the employee is a male or a female colleague, I need an instance of Employee and have to test its gender property.
Don't confuse objects with data structures.
An object is an instance of a class. A class can define a set of properties/fields that every instance/object of that class has. A data structure is a way to organize and store data. Technically a data structure is an object, but it's an object with the specific use of holding other objects (nearly everything in Java is an object; the primitive types such as int are the exception).
To answer your question a String is an object and a data structure. Every String object you create is an instance of the String class. A String, as Java represents it internally, is essentially a character array, and an array is a data structure.
Not all classes are blueprints for data structures, however all data structures are technically objects AKA instances of a class (that is specifically designed to store data), if that makes any sense.
Your question is tagged as Java, so I will reference only Java here.
Object is the "Eve" class in Java; that is to say, every class in Java extends Object, and Object itself is a class.
Therefore, all data structures are Objects, but not all Objects are data structures.
The key to the difference is the term Encapsulation.
When you make an object in Java, it is considered best practice to make all of your data members private. You do this to protect them from anyone using the class.
However, you want people to be able to access the data, sometimes change it. So, you provide public methods called accessors and mutators to allow them to do so, also called getters and setters. Additionally, you may want them to view the object as a whole in a format of your choosing, so you can define a toString method; this returns a string representing the object's data.
A structure is slightly different.
It is a class.
It is an Object.
But it is usually private within another class, as a Node is private within a tree and should not be directly accessible to the user of the tree. However, inside the tree object the node's data members are publicly visible. The node itself does not need accessors and mutators, because these functions are entrusted to and protected by the tree object.
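A compact sketch of the Node-inside-a-tree arrangement. IntTree and its methods are hypothetical names; the point is only that Node has bare fields and no accessors, yet nothing outside IntTree can reach it:

```java
// Hypothetical binary search tree of ints.
class IntTree {
    // A plain data structure, private to the tree that owns it.
    private static final class Node {
        int value;
        Node left;
        Node right;

        Node(int value) { this.value = value; }
    }

    private Node root;

    public void add(int value) {
        if (root == null) {
            root = new Node(value);
            return;
        }
        Node cur = root;
        while (true) {
            if (value == cur.value) {
                return; // already present
            }
            if (value < cur.value) {
                if (cur.left == null) { cur.left = new Node(value); return; }
                cur = cur.left;   // direct field access, no getLeft()
            } else {
                if (cur.right == null) { cur.right = new Node(value); return; }
                cur = cur.right;
            }
        }
    }

    public boolean contains(int value) {
        Node cur = root;
        while (cur != null) {
            if (value == cur.value) {
                return true;
            }
            cur = value < cur.value ? cur.left : cur.right;
        }
        return false;
    }
}

class IntTreeDemo {
    public static void main(String[] args) {
        IntTree tree = new IntTree();
        tree.add(5);
        tree.add(2);
        tree.add(8);
        System.out.println(tree.contains(2));
        System.out.println(tree.contains(7));
    }
}
```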
Keywords to research: Encapsulation, Visibility Modifiers

Why can attributes in Java be public?

As everybody knows, Java follows the object-oriented paradigm, in which data encapsulation says that the fields (attributes) of an object should be hidden from the outside world and only accessed via methods, or that methods are the only interface of the class to the outside world. So why is it possible to declare a field in Java as public, which would go against the data encapsulation paradigm?
I think it's possible because every rule has its exception, every best practice can be overridden in certain cases.
For example, I often expose static final data members as public (e.g., constants). I don't think it's harmful.
I'll point out that this situation is true in other languages besides Java: C++, C#, etc.
Languages need not always protect us from ourselves.
In Oli's example, what's the harm if I write it this way?
public class Point {
public final int x;
public final int y;
public Point(int p, int q) {
this.x = p;
this.y = q;
}
}
It's immutable and thread safe. The data members might be public, but you can't hurt them.
Besides, it's a dirty little secret that "private" isn't really private in Java. You can always use reflection to get around it.
So relax. It's not so bad.
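For the curious, the reflection remark can be illustrated with a short sketch. Secret and ReflectionDemo are made-up names. This works for classes in your own code; since Java 9 the module system can refuse this kind of access to classes in other modules.

```java
import java.lang.reflect.Field;

class Secret {
    private String value = "hidden"; // no getter on purpose
}

class ReflectionDemo {
    // Reads the private field of Secret without any accessor method.
    static String peek(Secret target) {
        try {
            Field field = Secret.class.getDeclaredField("value");
            field.setAccessible(true); // switch off the access check for this Field
            return (String) field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(peek(new Secret()));
    }
}
```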
For flexibility. It would be a massive pain if I wasn't able to write:
class Point {
public int x;
public int y;
}
There is precious little advantage in hiding this behind getters and setters.
Because rigid "data encapsulation" is not the only paradigm, nor a mandatory feature of object orientation.
And, more to the point, if one has a data attribute that has a public setter method and a public getter method, and the methods do nothing other than actually set/get the attribute, what's the point of keeping it private?
Not all classes follow the encapsulation paradigm (e.g. factory classes). To me, this increases flexibility. And anyway, it's the responsibility of the programmer, not the language, to scope appropriately.
Object Oriented design has no requirement of encapsulation. That is a best practice in languages like Java that has far more to do with the language's design than OO.
It is only a best practice to always encapsulate in Java for one simple reason. If you don't encapsulate, you can't later encapsulate without changing an object's signature. For instance, if your employee has a name, and you make it public, it is employee.name. If you later want to encapsulate it, you end up with employee.getName() and employee.setName(). This will of course break any code using your Employee class. Thus, in Java it is best practice to encapsulate everything, so that you never have to change an object's signature.
Some other OO languages (ActionScript3, C#, etc) support true properties, where adding a getter/setter does not affect the signature. In this case, if you have a getter or setter, it replaces the public property with the same signature, so you can easily switch back and forth without breaking code. In these languages, the practice of always encapsulating is no longer necessary.
Discussing good side of public variables... Like it... :)
There can be many reasons to use public variables. Let's check them one by one:
Performance
Although rare, there will be some situations in which it matters. The overhead of method call will have to be avoided in some cases.
Constants
We may use public variables for constants, which cannot be changed after they are initialized in the constructor. It helps performance too. Sometimes these may be static constants, like the connection string to the database. For example,
public static final String ACCEPTABLE_PUBLIC = "Acceptable public variable";
Other Cases
There are some cases when public makes no difference or having a getter and setter is unnecessary. A good example with Point is already written as answer.
Java is a branch from the C style-syntax languages. Those languages supported structs which were fixed offset aliases for a block of memory that was generally determined to be considered "one item". In other words, data structures were implemented with structs.
While using a struct directly violates the encapsulation goals of Object Oriented Programming, when Java was first released most people were far more competent in imperative (procedural) programming. By exposing members as public you can effectively use a Java class the same way you might use a C struct, even though the underlying implementations of the two environments are drastically different.
There are some scenarios where you can even do this with proper encapsulation. For example, many data structures consist of nodes holding two or more pointers: one to point to the "contained" data, and one or more to point to the "other" connections to the rest of the data structure. In such a case, you might create a private class that has no visibility outside of the "data structure" class (like an inner class), and since all of your code to walk the structure is contained within the same .java file, you might remove the .getNext() methods of the inner class as a performance optimization.
Whether to use public or not really depends on whether there is an invariant to maintain. For example, a pure data object does not restrict state transitions in any fashion, so it does not make sense to encapsulate the members with a bunch of accessors that offer no more functionality than exposing the data member as public.
If you have both a getter and setter for a particular non-private data member that provides no more functionality than getting and setting, then you might want to reevaluate your design or make the member public.
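A sketch of the invariant test, with two hypothetical classes. Size has no rule tying its fields together, so public fields cost nothing. Range must keep low <= high, so its state is private and every change goes through a checked method.

```java
// Pure data: no invariant relates the fields.
class Size {
    public int width;
    public int height;
}

// Invariant: low <= high must hold at all times.
final class Range {
    private int low;
    private int high;

    Range(int low, int high) {
        set(low, high);
    }

    // The only way to change the state, so the invariant cannot be bypassed.
    public void set(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low must not exceed high");
        }
        this.low = low;
        this.high = high;
    }

    public int getLow() { return low; }
    public int getHigh() { return high; }
    public int length() { return high - low; }
}

class InvariantDemo {
    public static void main(String[] args) {
        Size s = new Size();
        s.width = 640;   // any combination of values is valid
        s.height = 480;

        Range r = new Range(1, 5);
        System.out.println(r.length());
    }
}
```

Setting both bounds in one call is a design choice for the sketch: separate setLow and setHigh methods would make some valid transitions impossible to perform in a single step.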
I believe data encapsulation is offered more as an add-on feature than as a compulsory requirement or rule, so the coder is given the freedom to use his/her wisdom to apply the features and tweak them as needed. Hence, flexible it is!
A related example is the one given by @Oli Charlesworth.
Accessibility modifiers are an implementation of the concept of encapsulation in OO languages (I see this implementation as a way to relax this concept and allow some flexibility). There are pure OO languages that don't have accessibility modifiers, e.g. Smalltalk. In this language all the state (instance variables) is private and all the methods are public; the only way to modify or query the state of an object is through its instance methods. The absence of accessibility modifiers for methods forces developers to adopt certain conventions. For instance, methods in a private protocol (protocols are a way to organize methods in a class) should not be used outside the class, but no construct of the language enforces this; if you want to, you can call those methods.
I'm just a beginner, but if the public modifier didn't exist, Java development would be really complicated to understand. We use public, private and the other modifiers to make code easier to understand, for example in jars that others have created and that we use. What I want to say is that we don't need to reinvent things; we need to learn and carry on.
Apologies for my English. I'm trying to improve and I hope to write more clearly in the future.
I really can't think of a good reason for not using getters and setters outside of laziness. Effective Java, which is widely regarded as one of the best java books ever, says to always use getters and setters.
If you don't need to hear about why you should always use getters and setters skip this paragraph. I disagree with the number 1 answer's example of a Point as a time to not use getters and setters. There are several issues with this. What if you needed to change the type of the number. For example, one time when I was experimenting with graphics I found that I frequently changed my mind as to weather I want to store the location in a java Shape or directly as an int like he demonstrated. If I didn't use getters and setters and I changed this I would have to change all the code that used the location. However, if I didn't I could just change the getter and setter.
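As a rough sketch of that argument (the Sprite class and its members are hypothetical, not the code from the answer being discussed): the class below once stored int coordinates and was later changed to store doubles, yet callers of getX() and setX() did not need to change.

```java
// Hypothetical example: the internal representation changed, the accessors did not.
class Sprite {
    // Originally this class held: private int x; private int y;
    // It now keeps sub-pixel precision internally.
    private double preciseX;
    private double preciseY;

    // Existing callers still get and set whole pixels.
    public int getX() {
        return (int) Math.round(preciseX);
    }

    public int getY() {
        return (int) Math.round(preciseY);
    }

    public void setX(int x) {
        preciseX = x;
    }

    public void setY(int y) {
        preciseY = y;
    }

    // New capability that the old int fields could not support.
    public void moveBy(double dx, double dy) {
        preciseX += dx;
        preciseY += dy;
    }
}
```

Had x and y been public int fields, every piece of code that read or wrote them would have needed editing when the type changed.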
Getters and setters are a pain in Java. In Scala you can create public data members and then getters and or setters later with out changing the API. This gives you the best of both worlds! Perhaps, Java will fix this one day.

Disadvantage of object composition over class inheritance

Most design pattern books say we should "Favor object composition over class inheritance."
But can anyone give me an example where inheritance is better than object composition?
Inheritance is appropriate for is-a relationships. It is a poor fit for has-a relationships.
Since most relationships between classes/components fall into the has-a bucket (for example, a Car class is likely not a HashMap, but it may have a HashMap), it follows that composition is often a better way to model relationships between classes than inheritance.
This is not to say however that inheritance is not useful or not the correct solution for some scenarios.
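The Car and HashMap case can be sketched as follows. The method names addPart and countOf are assumptions made up for this illustration; the point is only that the map is held as a field rather than inherited from:

```java
import java.util.HashMap;
import java.util.Map;

// Has-a: the map is an internal detail, so only car-specific operations are exposed.
class Car {
    private final Map<String, Integer> partCounts = new HashMap<>();

    // Records one more part with the given name.
    public void addPart(String name) {
        partCounts.merge(name, 1, Integer::sum);
    }

    // Returns how many parts with that name were added, or 0 if none.
    public int countOf(String name) {
        return partCounts.getOrDefault(name, 0);
    }
}
```

If Car extended HashMap instead, every map method (put, remove, clear, and so on) would become part of Car's public API, and any caller could modify the part counts in ways the class never intended.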
My simple answer is that you should use inheritance for behavioral purposes. Subclasses should override methods to change the behaviour of the method and the object itself.
This article (interview with Erich Gamma, one of the GoF) clearly explains why to favor object composition over class inheritance.
In Java, whenever you inherit from a class, your new class also automatically becomes a subtype of the original class type. Since it is a subtype, it needs to adhere to the Liskov substitution principle.
This principle basically says that you must be able to use the subtype anywhere where the supertype is expected. This severely limits how the behavior of your new inherited class can differ from the original class.
No compiler can make you adhere to this principle, though, and you can get into trouble if you don't, especially when other programmers are using your classes.
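A small sketch of how this goes wrong, using the well-known rectangle and square illustration (all class names here are assumptions for the example). The code compiles without complaint, but Square breaks an expectation that code written against Rectangle relies on:

```java
class Rectangle {
    protected int width;
    protected int height;

    public void setWidth(int w) {
        width = w;
    }

    public void setHeight(int h) {
        height = h;
    }

    public int area() {
        return width * height;
    }
}

// Compiles fine, but changes what the setters mean.
class Square extends Rectangle {
    @Override
    public void setWidth(int w) {
        width = w;
        height = w;
    }

    @Override
    public void setHeight(int h) {
        width = h;
        height = h;
    }
}

class Layout {
    // Written against Rectangle: assumes width and height are set independently.
    static int stretch(Rectangle r) {
        r.setWidth(5);
        r.setHeight(4);
        return r.area();
    }
}
```

Layout.stretch(new Rectangle()) returns 20, but Layout.stretch(new Square()) returns 16, so a Square cannot safely stand in wherever a Rectangle is expected, even though the type system allows it.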
In languages that allow subclassing without subtyping (like the CZ language), the rule "Favor object composition over inheritance" is not as important as in languages like Java or C#.
Inheritance allows an object of the derived type to be used in nearly any circumstance where one would use an object of the base type. Composition does not allow this. Use inheritance when such substitution is required, and composition when it is not.
Just think of it as having an "is-a" or a "has-a" relationship.
In this example Human "is-a" Animal, and it may inherit different data from the Animal class. Therefore inheritance is used:
abstract class Animal {
private String name;
public String getName(){
return name;
}
abstract int getLegCount();
}
class Dog extends Animal{
public int getLegCount(){
return 4;
}
}
class Human extends Animal{
public int getLegCount(){
return 2;
}
}
Composition makes sense if one object is the owner of another object, like a Human object owning a Dog object. So in the following example a Human object "has-a" Dog object:
class Dog{
private String name;
}
class Human{
private Dog pet;
}
hope that helped...
It is a fundamental design principle of good OOD. If you use composition in your design rather than inheritance, as in the Strategy pattern, you can assign a behaviour to a class dynamically at runtime. Say,
interface Xable {
void doSomething();
}
class Aable implements Xable { public void doSomething() { /* behave like A */ } }
class Bable implements Xable { public void doSomething() { /* behave like B */ } }
class Bar {
Xable ability;
public void setAbility(Xable a) { ability = a; }
public void behave() {
ability.doSomething();
}
}
/* now we can set our ability dynamically at runtime */
/*somewhere in your code */
Bar bar = new Bar();
bar.setAbility( new Aable() );
bar.behave(); /* behaves like A*/
bar.setAbility( new Bable() );
bar.behave(); /* behaves like B*/
If you used inheritance instead, Bar would get its behaviour statically, fixed by its place in the class hierarchy.
Inheritance is necessary for subtyping. Consider:
class Base {
void Foo() { /* ... */ }
void Bar() { /* ... */ }
}
class Composed {
void Foo() { mBase.Foo(); }
void Bar() { mBase.Bar(); }
private Base mBase;
}
Even though Composed supports all of the methods of Base, it cannot be passed to a function that expects a value of type Base:
void TakeBase(Base b) { /* ... */ }
TakeBase(new Composed()); // ERROR
So, if you want polymorphism, you need inheritance (or its cousin interface implementation).
This is a great question. One I've been asking for years, at conferences, in videos, in blog posts. I've heard all kinds of answers. The only good answer I've heard is performance:
Performance differences in languages. Sometimes, classes take advantage of built-in engine optimizations that dynamic compositions don't. Most of the time, this is a much smaller concern than the problems associated with class inheritance, and usually, you can inline everything you need for that performance optimization into a single class and wrap a factory function around it and get the benefits you need without a problematic class hierarchy.
You should never worry about this unless you detect a problem. Then you should profile and test differences in perf to make informed tradeoffs as needed. Often, there are other performance optimizations available that don't involve class inheritance, including tricks like inlining, method delegation, memoizing pure functions, etc... Perf will vary depending on the specific application and language engine. Profiling is essential, here.
Additionally, I've heard lots of common misconceptions. The most common is confusion about type systems:
Conflating types with classes (a couple of existing answers here already concentrate on that). Compositions can satisfy polymorphism requirements by implementing interfaces. Classes and types are orthogonal, though in most class-supporting languages, subclasses automatically implement the superclass interface, so it can seem convenient.
There are three very good reasons to avoid class inheritance, and they crop up again and again:
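For instance, here is a sketch in Java with hypothetical names (Greeter, PoliteGreeter, ShoutingGreeter, Welcome). ShoutingGreeter reuses another object by composition and is still usable polymorphically, because it implements the same interface rather than extending a class:

```java
import java.util.Locale;

interface Greeter {
    String greet(String name);
}

final class PoliteGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Composition: holds a Greeter and also implements Greeter; no base class involved.
final class ShoutingGreeter implements Greeter {
    private final Greeter delegate;

    ShoutingGreeter(Greeter delegate) {
        this.delegate = delegate;
    }

    @Override
    public String greet(String name) {
        return delegate.greet(name).toUpperCase(Locale.ROOT) + "!";
    }
}

final class Welcome {
    // Accepts any Greeter, however it was built.
    static String welcome(Greeter greeter) {
        return greeter.greet("Ada");
    }
}
```

Welcome.welcome(new ShoutingGreeter(new PoliteGreeter())) returns "HELLO, ADA!", and the method never needs to know that one greeter wraps another.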
The gorilla/banana problem
"I think the lack of reusability comes in object-oriented languages, not functional languages. Because the problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle." ~ Joe Armstrong, quoted in "Coders at Work" by Peter Seibel.
This problem basically refers to the lack of selective code reuse in class inheritance. Composition lets you select just the pieces you need by approaching software design from a "small, reusable parts" approach rather than building monolithic designs that encapsulate everything related to some given functionality.
The fragile base class problem
Class inheritance is the tightest coupling available in object-oriented design, because the base class becomes part of the implementation of the child classes. This is why you'll also hear the advice from the Gang of Four's "Design Patterns" classic: "Program to an interface, not an implementation."
The problem with implementation inheritance is that even the smallest change to the inner details of that implementation could potentially break child classes. If the interface is public, exposed to user-land in any way, it could break code you are not even aware of.
This is the reason that class hierarchies become brittle -- hard to change as you grow them with new use-cases.
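A minimal Java sketch of how a subclass ends up depending on a base class's inner details (Bag and CountingBag are hypothetical names). The subclass tries to count additions, but gets the wrong number because the base class's addAll happens to call add internally:

```java
import java.util.ArrayList;
import java.util.List;

class Bag {
    private final List<String> items = new ArrayList<>();

    public void add(String item) {
        items.add(item);
    }

    // Implementation detail: addAll happens to call add() for each element.
    public void addAll(List<String> more) {
        for (String item : more) {
            add(item);
        }
    }

    public int size() {
        return items.size();
    }
}

class CountingBag extends Bag {
    private int added;

    @Override
    public void add(String item) {
        added++;
        super.add(item);
    }

    @Override
    public void addAll(List<String> more) {
        added += more.size(); // counted here...
        super.addAll(more);   // ...and again, because Bag.addAll calls our add()
    }

    public int getAdded() {
        return added;
    }
}
```

After calling addAll with three items on a CountingBag, size() is 3 but getAdded() is 6. If the subclass is "fixed" by dropping the increment in addAll, it then breaks as soon as Bag.addAll is changed to stop calling add. Either way, the subclass is correct only for one particular implementation of its parent.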
The common refrain is that we should be constantly refactoring our code (see Martin Fowler et al on extreme programming, agile, etc...). The key to refactor success is that you can't break things -- but as we've just seen, it's difficult to refactor a class hierarchy without breaking things.
The reason is that it's impossible to create the correct class hierarchy without knowing everything you need to know about the use-cases, but you can't know that in evolving software. Use cases get added or changed in projects all the time.
There is also a discovery process in programming, where you discover the right design as you implement the code and learn more about what works and what doesn't. But with class inheritance, once you get a class taxonomy going, you've painted yourself into a corner.
You need to know the information before you start the implementation, but part of learning the information you need involves building the implementation. It's a catch-22.
The duplication by necessity problem. This is where the death spiral really gets going. Sometimes, you really just want a banana, not the gorilla holding the banana, and the entire jungle. So you copy and paste it. Now there's a bug in a banana, so you fix it. Later, you get the same bug report and close it. "I already fixed that". And then you get the same bug report again. And again. Uh-oh. It's not fixed. You forgot the other banana! Google "copy pasta".
Other times, you really need to work a new use-case into your software, but you can't change the original base class, so instead, you copy and paste the entire class hierarchy into a new one and rename all the classes you need in the hierarchy to force that new use-case into the code base. 6 months later a new dev is looking at the code and wondering which class hierarchy to inherit from and nobody can provide a good answer.
Duplication by necessity leads to copy pasta messes, and pretty soon people start throwing around the word "rewrite" like it's no big deal. The problem with that is that most rewrite projects fail. I can name several orgs off the top of my head that are currently maintaining two development teams instead of one while they work on a rewrite project. I've seen such orgs cut funding to one or the other, and I've seen projects like that chew through so much cash that a startup or small business runs out of money and shuts down.
Developers underestimate the impact of class inheritance all the time. It's an important choice, and you need to be aware of the trade offs you opt into every time you create or inherit from a base class.
