I'm reviewing the API changes for Java 8 and I noticed that the new methods in java.util.Arrays are not overloaded for all primitives. The methods I noticed are:
parallelSetAll
parallelPrefix
spliterator
stream
Currently these new methods only handle int, long, and double primitives.
int, long, and double are probably the most widely used primitives so it makes sense that if they had to limit the API that they would choose those three, but why did they have to limit the API?
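For reference, here is a minimal sketch of what the int overloads look like in use, plus one possible workaround for a primitive type that has no overload. The class and method names (ArraysPrimitiveDemo, squares, runningSum, sumShorts) are made up for illustration; only the java.util.Arrays and IntStream calls come from the JDK.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class ArraysPrimitiveDemo {
    // Uses the int[] overload of parallelSetAll: element i becomes i * i.
    static int[] squares(int n) {
        int[] a = new int[n];
        Arrays.parallelSetAll(a, i -> i * i);
        return a;
    }

    // Uses the int[] overload of parallelPrefix: each element becomes the running sum.
    static int[] runningSum(int[] input) {
        int[] a = input.clone();
        Arrays.parallelPrefix(a, Integer::sum);
        return a;
    }

    // There is no Arrays.stream(short[]), so this walks the indices with an
    // IntStream and widens each short to an int (an assumed workaround).
    static int sumShorts(short[] values) {
        return IntStream.range(0, values.length).map(i -> values[i]).sum();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(squares(5)));
        System.out.println(Arrays.toString(runningSum(new int[] {1, 2, 3, 4})));
        System.out.println(Arrays.stream(new int[] {1, 2, 3}).sum());
        System.out.println(sumShorts(new short[] {1, 2, 3}));
    }
}
```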
To address the questions as a whole, and not just this particular scenario, I think we all want to know...
Why There's Interface Pollution in Java 8
For instance, in a language like C#, there is a set of predefined function types accepting any number of arguments with an optional return type (Func and Action each one going up to 16 parameters of different types T1, T2, T3, ..., T16), but in the JDK 8 what we have is a set of different functional interfaces, with different names and different method names, and whose abstract methods represent a subset of well known function arities (i.e. nullary, unary, binary, ternary, etc). And then we have an explosion of cases dealing with primitive types, and there are even other scenarios causing an explosion of more functional interfaces.
The Type Erasure Issue
So, in a way, both languages suffer from some form of interface pollution (or delegate pollution in C#). The only difference is that in C# they all have the same name. In Java, unfortunately, due to type erasure, there is no difference between Function<T1,T2> and Function<T1,T2,T3> or Function<T1,T2,T3,...Tn>, so evidently, we couldn't simply name them all the same way and we had to come up with creative names for all possible types of function combinations. For further reference on this, please refer to How we got the generics we have by Brian Goetz.
Don't think the expert group did not struggle with this problem. In the words of Brian Goetz in the lambda mailing list:
[...] As a single example, let's take function types. The lambda
strawman offered at devoxx had function types. I insisted we remove
them, and this made me unpopular. But my objection to function types
was not that I don't like function types -- I love function types --
but that function types fought badly with an existing aspect of the
Java type system, erasure. Erased function types are the worst of
both worlds. So we removed this from the design.
But I am unwilling to say "Java never will have function types"
(though I recognize that Java may never have function types.) I
believe that in order to get to function types, we have to first deal
with erasure. That may, or may not be possible. But in a world of
reified structural types, function types start to make a lot more
sense [...]
An advantage of the functional-interface approach is that we can define our own interface types with methods accepting as many arguments as we would like, and we can use them to create lambda expressions and method references as we see fit. In other words, we have the power to pollute the world with yet more new functional interfaces. Also, we can create lambda expressions even for interfaces in earlier versions of the JDK or for earlier versions of our own APIs that defined SAM types like these. And so now we have the power to use Runnable and Callable as functional interfaces.
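As a rough illustration of both points, the sketch below declares a three-argument functional interface and also uses the pre-Java-8 types Runnable and Callable as lambda targets. TriFunction is a hypothetical name chosen here; the JDK does not ship such an interface.

```java
import java.util.concurrent.Callable;

class CustomFunctionalInterfaceDemo {
    // Hypothetical three-argument function type; not part of the JDK.
    @FunctionalInterface
    interface TriFunction<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    // A lambda can target our own SAM type just like a JDK one.
    static int volume(int width, int height, int depth) {
        TriFunction<Integer, Integer, Integer, Integer> f = (w, h, d) -> w * h * d;
        return f.apply(width, height, depth);
    }

    // Callable predates Java 8 but is still a valid lambda target.
    static String callLegacy() {
        Callable<String> task = () -> "done";
        try {
            return task.call();
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Runnable r = () -> System.out.println("running"); // Runnable works too
        r.run();
        System.out.println(volume(2, 3, 4));
        System.out.println(callLegacy());
    }
}
```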
However, these interfaces become more difficult to memorize since they all have different names and methods.
Still, I am one of those wondering why they didn't solve the problem as in Scala, defining interfaces like Function0, Function1, Function2, ..., FunctionN. Perhaps, the only argument I can come up with against that is that they wanted to maximize the possibilities of defining lambda expressions for interfaces in earlier versions of the APIs as mentioned before.
Lack of Value Types Issue
So, evidently type erasure is one driving force here. But if you are one of those wondering why we also need all these additional functional interfaces with similar names and method signatures, whose only difference is the use of a primitive type, then let me remind you that Java also lacks value types like those in a language like C#. This means that the generic types used in our generic classes can only be reference types and not primitive types.
In other words, we can't do this:
List<int> numbers = asList(1,2,3,4,5);
But we can indeed do this:
List<Integer> numbers = asList(1,2,3,4,5);
The second example, though, incurs the cost of boxing and unboxing the wrapped objects to and from primitive types. This can become really expensive in operations dealing with collections of primitive values. So, the expert group decided to create this explosion of interfaces to deal with the different scenarios. To make things "less worse" they decided to only deal with three basic types: int, long and double.
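To make the contrast concrete, here is a small sketch (class and method names are mine) that sums the same values once through a List<Integer>, where every element is unboxed, and once through an IntStream, which stays in the primitive domain. It shows where the conversions happen and makes no claim about actual timings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

class BoxingDemo {
    // Each element is an Integer object that must be unboxed before adding.
    static int sumBoxed(List<Integer> numbers) {
        int total = 0;
        for (Integer n : numbers) {
            total += n; // implicit n.intValue()
        }
        return total;
    }

    // IntStream works on plain ints, so no wrapper objects are involved.
    static int sumPrimitive(int from, int toInclusive) {
        return IntStream.rangeClosed(from, toInclusive).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(Arrays.asList(1, 2, 3, 4, 5)));
        System.out.println(sumPrimitive(1, 5));
    }
}
```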
Quoting the words of Brian Goetz in the lambda mailing list:
[...] More generally: the philosophy behind having specialized
primitive streams (e.g., IntStream) is fraught with nasty tradeoffs.
On the one hand, it's lots of ugly code duplication, interface
pollution, etc. On the other hand, any kind of arithmetic on boxed ops
sucks, and having no story for reducing over ints would be terrible.
So we're in a tough corner, and we're trying to not make it worse.
Trick #1 for not making it worse is: we're not doing all eight
primitive types. We're doing int, long, and double; all the others
could be simulated by these. Arguably we could get rid of int too, but
we don't think most Java developers are ready for that. Yes, there
will be calls for Character, and the answer is "stick it in an int."
(Each specialization is projected to ~100K to the JRE footprint.)
Trick #2 is: we're using primitive streams to expose things that are
best done in the primitive domain (sorting, reduction) but not trying
to duplicate everything you can do in the boxed domain. For example,
there's no IntStream.into(), as Aleksey points out. (If there were,
the next question(s) would be "Where is IntCollection? IntArrayList?
IntConcurrentSkipListMap?) The intention is many streams may start as
reference streams and end up as primitive streams, but not vice versa.
That's OK, and that reduces the number of conversions needed (e.g., no
overload of map for int -> T, no specialization of Function for int
-> T, etc.) [...]
We can see that this was a difficult decision for the expert group. I think few would agree that this is elegant, but most of us would most likely agree it was necessary.
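The "stick it in an int" advice is visible in the JDK itself: String.chars() returns an IntStream rather than some CharStream. A minimal sketch, with helper names invented for the example:

```java
class CharsAsIntsDemo {
    // chars() yields each char widened to an int, so the filter works on ints.
    static long countVowels(String text) {
        return text.chars().filter(c -> "aeiou".indexOf(c) >= 0).count();
    }

    // Maps in the int domain, then rebuilds a String from the code points.
    static String upper(String text) {
        return text.chars()
                .map(Character::toUpperCase)
                .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(countVowels("interface"));
        System.out.println(upper("int"));
    }
}
```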
For further reference on the subject you may want to read The State of Value Types by John Rose, Brian Goetz, and Guy Steele.
The Checked Exceptions Issue
There was a third driving force that could have made things even worse, and it is the fact that Java supports two types of exceptions: checked and unchecked. The compiler requires that we handle or explicitly declare checked exceptions, but it requires nothing for unchecked ones. So, this creates an interesting problem, because the method signatures of most of the functional interfaces do not declare to throw any exceptions. So, for instance, this is not possible:
Writer out = new StringWriter();
Consumer<String> printer = s -> out.write(s); //oops! compiler error
It cannot be done because the write operation throws a checked exception (i.e. IOException) but the signature of the Consumer method does not declare that it throws any exception at all. So, the only solution to this problem would have been to create even more interfaces, some declaring exceptions and some not (or come up with yet another mechanism at the language level for exception transparency). Again, to make things "less worse" the expert group decided to do nothing in this case.
In the words of Brian Goetz in the lambda mailing list:
[...] Yes, you'd have to provide your own exceptional SAMs. But then
lambda conversion would work fine with them.
The EG discussed additional language and library support for this
problem, and in the end felt that this was a bad cost/benefit
tradeoff.
Library-based solutions cause a 2x explosion in SAM types (exceptional
vs not), which interact badly with existing combinatorial explosions
for primitive specialization.
The available language-based solutions were losers from a
complexity/value tradeoff. Though there are some alternative
solutions we are going to continue to explore -- though clearly not
for 8 and probably not for 9 either.
In the meantime, you have the tools to do what you want. I get that
you prefer we provide that last mile for you (and, secondarily, your
request is really a thinly-veiled request for "why don't you just give
up on checked exceptions already"), but I think the current state lets
you get your job done. [...]
So, it's up to us, the developers, to craft yet more interface explosions to deal with these on a case-by-case basis:
interface IOConsumer<T> {
    void accept(T t) throws IOException;
}
static<T> Consumer<T> exceptionWrappingBlock(IOConsumer<T> b) {
    return e -> {
        try { b.accept(e); }
        catch (Exception ex) { throw new RuntimeException(ex); }
    };
}
In order to do:
Writer out = new StringWriter();
Consumer<String> printer = exceptionWrappingBlock(s -> out.write(s));
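For anyone who wants to paste and run this, here is a self-contained variant of the same idea. It is only a sketch: it catches IOException specifically and rethrows it as java.io.UncheckedIOException (a Java 8 class) instead of a bare RuntimeException, and the names IOConsumerDemo, unchecked and writeAll are my own.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.function.Consumer;

class IOConsumerDemo {
    // Our own "exceptional" SAM type, as the quote above suggests.
    @FunctionalInterface
    interface IOConsumer<T> {
        void accept(T t) throws IOException;
    }

    // Adapts an IOConsumer to a plain Consumer by wrapping the checked exception.
    static <T> Consumer<T> unchecked(IOConsumer<T> body) {
        return t -> {
            try {
                body.accept(t);
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
        };
    }

    // Writes every part through the wrapped consumer and returns the result.
    static String writeAll(String... parts) {
        Writer out = new StringWriter();
        Consumer<String> printer = unchecked(s -> out.write(s));
        for (String part : parts) {
            printer.accept(part);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeAll("hello", " ", "world"));
    }
}
```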
Probably, in the future when we get Support for Value Types in Java and Reification, we will be able to get rid of (or at least no longer need to use anymore) some of these multiple interfaces.
In summary, we can see that the expert group struggled with several design issues. The need, requirement or constraint to keep backward compatibility made things difficult, then we have other important conditions like the lack of value types, type erasure and checked exceptions. If Java had the first and lacked the other two the design of JDK 8 would probably have been different. So, we all must understand that these were difficult problems with lots of tradeoffs and the EG had to draw a line somewhere and make decisions.
I can't see the reason why the Boolean wrapper class was made immutable.
Why was the Boolean wrapper not implemented like MutableBoolean in Commons Lang, which can actually be reset?
Does anyone have any idea about this? Thanks.
Because 2 is 2. It won't be 3 tomorrow.
Immutable is always preferred as the default, especially in multithreaded situations, and it makes for easier to read and more maintainable code. Case in point: the Java Date API, which is riddled with design flaws. If Date were immutable the API would be very streamlined. I would know Date operations would create new dates and would never have to look for APIs that modify them.
Read Java Concurrency in Practice to understand the true importance of immutable types.
But also note that if for some reason you want mutable types, use AtomicInteger, AtomicBoolean, etc. Why Atomic? Because by introducing mutability you introduced a need for thread safety, which you wouldn't have needed if your types stayed immutable. So in using mutable types you also must pay the price of thinking about thread safety and using types from the concurrent package. Welcome to the wonderful world of concurrent programming.
Also, for Boolean - I challenge you to name a single operation that you might want to perform that cares whether Boolean is mutable. set to true? Use myBool = true. That is a re-assignment, not a mutation. Negate? myBool = !myBool. Same rule. Note that immutability is a feature, not a constraint, so if you can offer it, you should - and in these cases, of course you can.
Note this applies to other types as well. The most subtle thing with integers is count++, but that is just count = count + 1, unless you care about getting the value atomically... in which case use the mutable AtomicInteger.
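The difference between re-assignment and mutation can be seen with a second reference to the same object. A small sketch (names are illustrative only):

```java
import java.util.concurrent.atomic.AtomicInteger;

class ReassignVsMutateDemo {
    // Re-assigning 'flag' rebinds that one variable; 'alias' still refers to TRUE.
    static boolean aliasAfterReassignment() {
        Boolean flag = Boolean.TRUE;
        Boolean alias = flag;
        flag = !flag; // unbox, negate, box again: a new reference, not a mutation
        return alias;
    }

    // Mutating the shared AtomicInteger is visible through every reference to it.
    static int aliasAfterMutation() {
        AtomicInteger counter = new AtomicInteger(0);
        AtomicInteger alias = counter;
        counter.incrementAndGet();
        return alias.get();
    }

    public static void main(String[] args) {
        System.out.println(aliasAfterReassignment());
        System.out.println(aliasAfterMutation());
    }
}
```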
Wrapper classes in Java are immutable so the runtime can have only two Boolean objects - one for true, one for false - and every variable is a reference to one of those two. And since they can never be changed, you know they'll never be pulled out from under you. Not only does this save memory, it makes your code easier to reason about - since the wrapper classes you're passing around you know will never have their value change, they won't suddenly jump to a new value because they're accidentally a reference to the same value elsewhere.
Similarly, Integer has a cache of all signed byte values - -128 to 127 - so the runtime doesn't have to have extra instances of those common Integer values.
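The sharing can be observed with reference comparison. The sketch below only checks cases the language guarantees (Boolean.valueOf and boxed ints from -128 to 127); values outside that range may or may not be cached depending on the JVM, so they are compared with equals instead.

```java
class WrapperCacheDemo {
    // Boolean.valueOf returns one of the two shared instances.
    static boolean sameBooleanInstance() {
        return Boolean.valueOf(true) == Boolean.TRUE;
    }

    // Boxing a value in -128..127 yields the same cached Integer each time.
    static boolean sameSmallInteger() {
        Integer a = 127;
        Integer b = 127;
        return a == b;
    }

    // Outside the guaranteed range, rely on equals rather than identity.
    static boolean equalLargeInteger() {
        Integer a = 1000;
        Integer b = 1000;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameBooleanInstance());
        System.out.println(sameSmallInteger());
        System.out.println(equalLargeInteger());
    }
}
```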
Patashu is the closest. Many of the goofy design choices in Java were because of the limitations of how they implemented a VM. I think originally they tried to make a VM for C or C++ but it was too hard (impossible?) so made this other, similar language. Write one, run everywhere!
Any computer sciency justification like those other dudes spout is just after-the-fact folderol. As you now know, Java and C# are evolving to be as powerful as C. Sure, they were cleaner. Ought to be for languages designed decade(s) later!
Simple trick is to make a "holder" class. Or use a closure nowadays! Maybe Java is evolving into JavaScript. LOL.
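A "holder" could look something like this. BooleanHolder is a made-up class for the sketch; the point is only that the holder reference stays fixed while the field inside it changes, which also works from inside a lambda.

```java
class HolderDemo {
    // Hypothetical mutable holder around a primitive boolean.
    static final class BooleanHolder {
        boolean value;
    }

    // A callee can flip the flag because it shares the holder object.
    static void markDone(BooleanHolder holder) {
        holder.value = true;
    }

    // The lambda captures 'done' (effectively final) and mutates its field.
    static boolean runWithHolder() {
        BooleanHolder done = new BooleanHolder();
        Runnable task = () -> done.value = true;
        task.run();
        return done.value;
    }

    public static void main(String[] args) {
        BooleanHolder flag = new BooleanHolder();
        markDone(flag);
        System.out.println(flag.value);
        System.out.println(runWithHolder());
    }
}
```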
Boolean, like every other wrapper class, is immutable in Java. Since wrapper classes are used as variables for storing simple data, they should be safe, and data integrity must be maintained to avoid inconsistent or unwanted results. Also, immutability saves lots of memory by avoiding duplicate objects. More can be found in the article Why Strings & Wrapper classes are designed immutable in java?
The error I get from the compiler is "The left hand side of an assignment must be a variable". My use case is deep copying, but is not really relevant.
In C++, one can assign to *this.
The question is not how to work around the inability to assign to this (that is very simple), but rather what rationale lies behind the decision not to make this a variable.
Are the reasons technical or conceptual?
My guess so far - the possibility of rebuilding an Object in a random method is error-prone (conceptual), but technically possible.
Please refrain from variations of "because the Java specs say so". I would like to know the reason for the decision.
In C++, one can assign to *this
Yes, but you can't do this = something in C++, which I actually believe is a closer match for what you're asking about on the Java side here.
[...] what rationale is there behind the decision not to make this a variable.
I would say clarity / readability.
this was chosen to be a reserved word, probably since it's not passed as an explicit argument to a method. Using it as an ordinary parameter and being able to reassign a new value to it, would mess up readability severely.
In fact, many people argue that you shouldn't change argument-variables at all, for this very reason.
Are the reasons technical or conceptual?
Mostly conceptual I would presume. A few technical quirks would arise though. If you could reassign a value to this, you could completely hide instance variables behind local variables for example.
My guess so far - the possibility of rebuilding an Object in a random method is error-prone (conceptual), but technically possible.
I'm not sure I understand this statement fully, but yes, error prone is probably the primary reason behind the decision to make it a keyword and not a variable.
Because this is final.
this is a keyword, not a variable, and you can't assign something to a keyword. For a minute, though, suppose the design spec had made it a reference variable, and look at the example below.
It holds an implicit reference to the object on which the method was called, and it is used for reference purposes only. If you could assign something to this, wouldn't that break everything?
Example
Consider the following code from the String class. (Note: the code below deliberately contains a compilation error; it is only meant to demonstrate the situation to the OP.)
public CharSequence subSequence(int beginIndex, int endIndex) {
    // if you assign something here
    this = "XYZ";
    // you can imagine the zombie situation here
    return this.substring(beginIndex, endIndex);
}
Are the reasons technical or conceptual?
IMO, conceptual.
The this keyword is shorthand for "the reference to the object whose method you are currently executing". You can't change what that object is. It simply makes no sense in the Java execution model.
Since it makes no sense for this to change, there is no sense in making it a variable.
(Note that in C++ you are assigning to *this, not this. And in Java there is no * operator and no real equivalent to it.)
If you take the view that you could change the target object for a method in mid flight, then here are some counter questions.
What is the use of doing this? What problems would this (hypothetical) linguistic feature help you solve ... that can't be solved in a more easy-to-understand way?
How would you deal with mutexes? For instance, what would happen if you assign to this in the middle of a synchronized method ... and does the proposed semantic make sense? (The problem is that you either end up executing in a synchronized method on an object that you don't have a lock on ... or you have to unlock the old this and lock the new this, with the complications that entails. And besides, how does this make sense in terms of what mutexes are designed to achieve?)
How would you make sense of something like this:
class Animal {
    void foo(Animal other) {
        this = other;
        // At this point we could be executing the overridden
        // Animal version of the foo method ... on a Llama.
    }
}
class Llama extends Animal {
    void foo(Animal other) {
    }
}
Sure you can ascribe a semantic to this but:
you've broken encapsulation of the subclass in a way that is hard to understand, and
you've not actually achieved anything particularly useful.
If you try seriously to answer these questions, I expect you'll come to the conclusion that it would have been a bad idea to implement this. (But if you do have satisfactory answers, I'd encourage you to write them up and post them as your own Answer to your Question!)
But in reality, I doubt that the Java designers even gave this idea more than a moment's consideration. (And rightly so, IMO)
The *this = ... form of C++ is really just a shorthand for a sequence of assignments of the attributes of the current object. We can already do that in Java ... with a sequence of normal assignments. There is certainly no need for new syntax to support this. (How often does a class reinitialize itself from the state of another class?)
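Such a "sequence of normal assignments" might look like the sketch below. The Point class and its copyFrom method are invented for the example; copyFrom plays roughly the role that *this = other plays in C++.

```java
class CopyFromDemo {
    // Hypothetical class whose state can be overwritten from another instance.
    static final class Point {
        private int x;
        private int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Member-wise assignment: the object identity stays, only the state changes.
        void copyFrom(Point other) {
            this.x = other.x;
            this.y = other.y;
        }

        int x() { return x; }
        int y() { return y; }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point alias = a;
        a.copyFrom(new Point(7, 9));
        // 'alias' sees the new state because it is still the same object.
        System.out.println(alias.x() + "," + alias.y());
    }
}
```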
I note that you commented thus:
I wonder what the semantics of this = xy; should be. What do you think it should do? – JimmyB Nov 2 '11 at 12:18
Provided xy is of the right type, the reference of this would be set to xy, making the "original" object gc-eligible - kostja Nov 2 '11 at 12:24
That won't work.
The value of this is (effectively) passed by value to the method when the method is invoked. The callee doesn't know where the this reference came from.
Even if it did, that's only one place where the reference is held. Unless null is assigned in all places, the object cannot be eligible for garbage collection.
Ignoring the fact that this is technically impossible, I do not think that your idea would be useful OR conducive to writing readable / maintainable code. Consider this:
public class MyClass {
    public void kill(MyClass other) {
        this = other;
    }
}
MyClass mine = new MyClass();
....
mine.kill(new MyClass());
// 'mine' is now null!
Why would you want to do that? Supposing that the method name was something innocuous rather than kill, would you expect the method to be able to zap the value of mine?
I don't. In fact, I think that this would be a misfeature: useless and dangerous.
Even without these nasty "make it unreachable" semantics, I don't actually see any good use-cases for modifying this.
this isn't even a variable. It's a keyword, as defined in the Java Language Specification:
When used as a primary expression, the keyword this denotes a value that is a reference to the object for which the instance method was invoked (§15.12), or to the object being constructed
So assigning to it is impossible, just as it is impossible to assign a value to while.
The this in Java is a part of the language, a keyword, not a simple variable. It was made for accessing an object from one of its methods, not another object. Assigning another object to it would cause a mess. If you want to save another object's reference in your object, just create a new variable.
The reason is just conceptual. this was made for accessing an Object itself, for example to return it in a method. Like I said, it would cause a mess if you would assign another reference to it. Tell me a reason why altering this would make sense.
Assigning to (*this) in C++ performs a copy operation -- treating the object as a value-type.
Java does not use the concept of a value-type for classes. Object assignment is always by-reference.
To copy an object as if it were a value-type: How do I copy an object in Java?
The terminology used for Java is confusing though: Is Java “pass-by-reference” or “pass-by-value”
Answer: Java passes references by value. (from here)
In other words, because Java never treats non-primitives as value-types, every class-type variable is a reference (effectively a pointer).
So when I say, "object assignment is always by-reference", it might be more technically accurate to phrase that as "object assignment is always by the value of the reference".
The practical implication of the distinction drawn by Java always being pass-by-value is embodied in the question "How do I make my swap function in java?", and its answer: You can't. Languages such as C and C++ are able to provide swap functions because they, unlike Java, allow you to assign to any variable through a reference to that variable -- thus allowing you to change its value (if non-const) without changing the contents of the object that it previously referenced.
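A naive swap shows the effect. In this sketch (names are mine) the method swaps its own copies of the two references, so the caller's variables are untouched:

```java
class SwapDemo {
    // Swaps only the local parameter copies; the caller never notices.
    static void swap(String a, String b) {
        String tmp = a;
        a = b;
        b = tmp;
    }

    // Returns the caller's view of the two variables after the "swap".
    static String afterSwap() {
        String first = "first";
        String second = "second";
        swap(first, second);
        return first + "," + second;
    }

    public static void main(String[] args) {
        System.out.println(afterSwap());
    }
}
```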
It could make your head spin to try to think this all the way through, but here goes nothing...
Java class-type variables are always "references" which are effectively pointers.
Java pointers are primitive types.
Java assignment is always by the value of the underlying primitive (the pointer in this case).
Java simply has no mechanism equivalent to C/C++ pass-by-reference that would allow you to indirectly modify a free-standing primitive type, which may be a "pointer" such as this.
Additionally, it is interesting to note that C++ actually has two different syntaxes for pass-by-reference. One is based on explicit pointers, and was inherited from the C language. The other is based on the C++ reference-type operator &. [There is also the C++ smart pointer form of reference management, but that is more akin to Java-like semantics -- where the references themselves are passed by value.]
Note: In the above discussion assign-by and pass-by are generally interchangeable terminology. Underlying any assignment, is a conceptual operator function that performs the assignment based on the right-hand-side object being passed in.
So coming back to the original question: If you could assign to this in Java, that would imply changing the value of the reference held by this. That is actually equivalent to assigning directly to this in C++, which is not legal in that language either.
In both Java and C++, this is effectively a pointer that cannot be modified. Java seems different because it uses the . operator to dereference the pointer -- which, if you're used to C++ syntax, gives you the impression that it isn't one.
You can, of course, write something in Java that is similar to a C++ copy constructor, but unlike with C++, there is no way of getting around the fact that the implementation will need to be supplied in terms of an explicit member-wise initialization. [In C++ you can avoid this, ultimately, only because the compiler will provide a member-wise implementation of the assignment operator for you.]
The Java limitation that you can't copy to this as a whole is sort-of artificial though. You can achieve exactly the same result by writing it out member-wise, but the language just doesn't have a natural way of specifying such an operation to be performed on a this -- the C++ syntax, (*this) doesn't have an analogue in Java.
And, in fact, there is no built-in operation in Java that reassigns the contents of any existing object -- even if it's not referred to as this. [Such an operation is probably more important for stack-based objects such as are common in C++.]
Regarding the use-case of performing a deep copy: It's complicated in Java.
C++ is a value-type-oriented language, so the semantic intention of assignment is generally obvious. If I say a=b, I typically want a to become an independent clone of b, containing an equal value. C++ does this automatically for assignment, and there are plans to automate the process for comparison as well.
For Java, and other reference-oriented languages, copying an object, in a generic sense, has ambiguous meaning. Primitives aside, Java doesn't differentiate between value-types and reference-types, so copying an object has to consider every nested class-type member (including those of the parent) and decide, on a case-by-case basis, if that member object should be copied or just referenced. If left to default implementations, there is a very good chance that result would not be what you want.
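Here is a sketch of that case-by-case decision, using a hypothetical Team class. Its copy constructor shares the immutable name but clones the member list, and that choice is exactly the kind of thing a default implementation could not know:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyDemo {
    // Hypothetical class with one immutable member and one mutable member.
    static final class Team {
        final String name;
        final List<String> members;

        Team(String name, List<String> members) {
            this.name = name;
            this.members = members;
        }

        // Copy constructor: the String is shared, the list is copied.
        Team(Team other) {
            this(other.name, new ArrayList<>(other.members));
        }
    }

    // Returns { size seen through an alias, size seen through a copy }.
    static int[] sizesAfterAdd() {
        Team original = new Team("core", new ArrayList<>(Arrays.asList("ann")));
        Team alias = original;           // same object
        Team copy = new Team(original);  // independent member list
        original.members.add("bob");
        return new int[] { alias.members.size(), copy.members.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizesAfterAdd()));
    }
}
```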
Comparing objects for equality in Java suffers from the same ambiguities.
Based on all of this, the answer to the underlying question: why can't I copy an object by some simple, automatically generated, operation on this, is that fundamentally, Java doesn't have a clear notion of what it means to copy an object.
One last point, to answer the literal question:
What rationale is there behind the decision not to make this a variable?
It would simply be pointless to do so. The value of this is just a pointer that has been passed to a function, and if you were able to change the value of this, it could not directly affect whatever object, or reference, was used to invoke that method. After all, Java is pass-by-value.
Assigning to *this in C++ isn't equivalent to assigning this in Java. Assigning this is, and it isn't legal in either language.
Referring to a ~2-year old discussion of the fact that there is no operator overloading in Java ( Why doesn't Java offer operator overloading? ), and coming from many intense C++ years myself to Java, I wonder whether there is a more fundamental reason that operator overloading is not part of the Java language, at least in the case of assignment, than the highest-rated answer in that link states near the bottom of the answer (namely, that it was James Gosling's personal choice).
Specifically, consider assignment.
// C++
#include <iostream>
class MyClass
{
public:
    int x;
    MyClass(const int _x) : x(_x) {}
    MyClass & operator=(const MyClass & rhs) {x=rhs.x; return *this;}
};
int main()
{
    MyClass myObj1(1), myObj2(2);
    MyClass & myRef = myObj1;
    myRef = myObj2;
    std::cout << "myObj1.x = " << myObj1.x << std::endl;
    std::cout << "myObj2.x = " << myObj2.x << std::endl;
    return 0;
}
The output is:
myObj1.x = 2
myObj2.x = 2
In Java, however, the line myRef = myObj2 (assuming the declaration of myRef in the previous line was MyClass myRef = myObj1, as Java requires, since all such variables are automatically Java-style 'references') behaves very differently - it would not cause myObj1.x to change and the output would be
myObj1.x = 1
myObj2.x = 2
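For completeness, a Java rendering of the same program might look like this. It is my own translation of the C++ above, wrapped in a demo class so it can be compiled as-is:

```java
class AssignmentDemo {
    static final class MyClass {
        int x;

        MyClass(int x) {
            this.x = x;
        }
    }

    // Returns { myObj1.x, myObj2.x, myRef.x } after the re-assignment.
    static int[] run() {
        MyClass myObj1 = new MyClass(1);
        MyClass myObj2 = new MyClass(2);
        MyClass myRef = myObj1;
        myRef = myObj2; // rebinds myRef; myObj1 itself is untouched
        return new int[] { myObj1.x, myObj2.x, myRef.x };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("myObj1.x = " + r[0]);
        System.out.println("myObj2.x = " + r[1]);
    }
}
```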
This difference between C++ and Java leads me to think that the absence of operator overloading in Java, at least in the case of assignment, is not a 'matter of personal choice' on the part of James Gosling, but rather a fundamental necessity given Java's syntax that treats all object variables as references (i.e. MyClass myRef = myObj1 defines myRef to be a Java-style reference). I say this because if assignment in Java causes the left-hand side reference to refer to a different object, rather than allowing the possibility that the object itself change its value, then it would seem that there is no possibility of providing an overloaded assignment operator.
In other words - it's not simply a 'choice', and there's not even the possibility of 'holding your breath' with the hope that it will ever be introduced, as the aforementioned high-rated answer also states (near the end). Quoting: "The reasons for not adding them now could be a mix of internal politics, allergy to the feature, distrust of developers (you know, the saboteur ones), compatibility with the previous JVMs, time to write a correct specification, etc.. So don't hold your breath waiting for this feature.". <-- So this isn't correct, at least for the assignment operator: the reason there's no operator overloading (at least for assignment) is instead fundamental to the nature of Java.
Is this a correct assessment on my part?
ADDENDUM
Assuming the assignment operator is a special case, then my follow-up question is: Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references. i.e., in C++, variable tokens represent values (and note, even if the variable token was declared initially as a reference, it's still treated as a value essentially wherever it's used), whereas in Java, variable tokens represent honest-to-goodness references wherever the token is later used.
There is a big misconception that arises in your question, as it often does when talking about similarities and differences between Java and C++. C++ references and Java references are not the same. In Java a reference is a resettable proxy to the real object, while in C++ a reference is an alias to the object. To put it in C++ terms, a Java reference is a garbage collected pointer, not a reference. Now, going back to your example, to write equivalent code in C++ and Java you would have to use pointers:
int main() {
    type a(1), b(2);
    type *pa = &a, *pb = &b;
    pa = pb;
    // a is still 1, b is still 2, pa == pb == &b
}
Now the examples are the same: the assignment operator is being applied to the pointers to the objects, and in that particular case you cannot overload the operator in C++ either. It is important to note that operator overloading can be easily abused, and that is a good reason to avoid it in the first place. Now if you add the two different types of entities: objects and references, things become more messy to think about.
If you were allowed to overload operator= for a particular object in Java, then you would not be able to have multiple references to the same object, and the language would be crippled:
Type a = new Type(1);
Type b = new Type(2);
a = b; // dispatched to Type.operator=( Type )??
a.foo();
a = new Type(3); // do you want to copy Type(3) into a, or work with a new object?
That in turn would make the type unusable in the language: containers store references, and they reassign them (even the first time just when an object is created), functions don't really use pass-by-reference semantics, but rather pass-by-value the references (which is a completely different issue, again, the difference is void foo( type* ) versus void foo( type& ): the proxy entity is copied, you cannot modify the reference passed in by the caller.
The problem is that the language is trying really hard to hide the fact that a and the object that a refers to are not the same thing (same happens in C#), and that in turn means that you cannot explicitly state that one operation is to be applied to the reference/referent, that is resolved by the language. The outcome of that design is that any operation that can be applied to references can never be applied to the objects themselves.
As of the rest of the operators, the decision is most probably arbitrary, because the language hides the reference/object difference, it could have been designed such that a+b was translated into type* operator+( type*, type* ) by the compiler. Since you cannot use arithmetic then there would be no problem, as the compiler would recognize that a+b is an operation that must be applied to the objects (it does not make sense with references). But then it could be considered a little awkward that you can overload +, but you cannot overload =, ==, !=...
That is the path that C# took, where assignment cannot be overloaded for reference types. Interestingly, in C# there are value types, and the sets of operators that can be overloaded for reference types and value types are different. Not having coded C# in large projects, I cannot really tell whether that potential source of confusion is a real one or whether people are just used to it (but if you search SO, you will find that a few people do ask why X cannot be overloaded in C# for reference types, where X is one of the operations that can be applied to the reference itself).
That doesn't explain why they couldn't have allowed overloading of other operators like + or -. Considering James Gosling designed the Java language, and he said it was his personal choice, which he explains in more detail at the link provided in the question you linked, I think that's your answer:
There are some things that I kind of feel torn about, like operator overloading. I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++. I've spent a lot of time in the past five to six years surveying people about operator overloading and it's really fascinating, because you get the community broken into three pieces: Probably about 20 to 30 percent of the population think of operator overloading as the spawn of the devil; somebody has done something with operator overloading that has just really ticked them off, because they've used like + for list insertion and it makes life really, really confusing. A lot of that problem stems from the fact that there are only about half a dozen operators you can sensibly overload, and yet there are thousands or millions of operators that people would like to define -- so you have to pick, and often the choices conflict with your sense of intuition. Then there's a community of about 10 percent that have actually used operator overloading appropriately and who really care about it, and for whom it's actually really important; this is almost exclusively people who do numerical work, where the notation is very important to appealing to people's intuition, because they come into it with an intuition about what the + means, and the ability to say "a + b" where a and b are complex numbers or matrices or something really does make sense. You get kind of shaky when you get to things like multiply because there are actually multiple kinds of multiplication operators -- there's vector product, and dot product, which are fundamentally very different. And yet there's only one operator, so what do you do? And there's no operator for square-root. Those two camps are the poles, and then there's this mush in the middle of 60-odd percent who really couldn't care much either way. 
The camp of people that think that operator overloading is a bad idea has been, simply from my informal statistical sampling, significantly larger and certainly more vocal than the numerical guys. So, given the way that things have gone today where some features in the language are voted on by the community -- it's not just like some little standards committee, it really is large-scale -- it would be pretty hard to get operator overloading in. And yet it leaves this one community of fairly important folks kind of totally shut out. It's a flavor of the tragedy of the commons problem.
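To make the numerical case from the quote concrete, here is a rough sketch of what "a + b" for complex numbers looks like in Java without operator overloading. The Complex class and its plus/times method names are hypothetical, not part of the JDK (the JDK's own BigInteger and BigDecimal use named methods such as add and multiply in the same way):

```java
class ComplexDemo {
    // Hypothetical immutable complex number; not a JDK class.
    static final class Complex {
        final double re;
        final double im;
        Complex(double re, double im) { this.re = re; this.im = im; }

        // What "a + b" has to be spelled as without operator overloading.
        Complex plus(Complex other) {
            return new Complex(re + other.re, im + other.im);
        }

        // Complex multiplication: (a+bi)(c+di) = (ac-bd) + (ad+bc)i
        Complex times(Complex other) {
            return new Complex(re * other.re - im * other.im,
                               re * other.im + im * other.re);
        }

        @Override public String toString() { return re + " + " + im + "i"; }
    }

    public static void main(String[] args) {
        Complex a = new Complex(1, 2);
        Complex b = new Complex(3, 4);
        System.out.println(a.plus(b));   // 4.0 + 6.0i
        System.out.println(a.times(b));  // -5.0 + 10.0i
    }
}
```

Named methods also sidestep the "which multiplication?" problem from the quote, since a dot product and a cross product can simply get different names.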
UPDATE: Re: your addendum, the other assignment operators (+=, -=, etc.) would also be affected. You also can't write a swap function like void swap(int *a, int *b);, among other things.
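A short Java sketch of the swap limitation, under the assumption that the caller wants two local int variables exchanged. The array-based overload is only one common workaround, not the only one:

```java
class SwapDemo {
    // Has no effect on the caller: a and b are copies of the arguments.
    static void swap(int a, int b) {
        int tmp = a;
        a = b;
        b = tmp;
    }

    // A common workaround: swap two slots of a shared array.
    static void swap(int[] values, int i, int j) {
        int tmp = values[i];
        values[i] = values[j];
        values[j] = tmp;
    }

    public static void main(String[] args) {
        int x = 1, y = 2;
        swap(x, y);
        System.out.println(x + " " + y);              // 1 2

        int[] pair = {1, 2};
        swap(pair, 0, 1);
        System.out.println(pair[0] + " " + pair[1]);  // 2 1
    }
}
```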
Is this a correct assessment on my part?
The lack of operator overloading in general is a "personal choice". C#, which is a very similar language, does allow operator overloading. But you still can't overload assignment. What would that even do in a reference-semantics language?
Are there any other operators, or more generally any other language features, that would by necessity be affected in a similar way as the assignment operator? I would like to know how 'deep' the difference goes between Java and C++ regarding variables-as-values/references.
The most obvious is copying. In a reference-semantics language, clone() isn't that common, and isn't needed at all for immutable types like String. But in C++, where the default assignment semantics are based around copying, copy constructors are very common, and one is generated automatically if you don't define it.
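A small Java sketch of the copying point, assuming a hypothetical mutable Point class. Copying has to be requested explicitly (here through a hand-written copy constructor), while an immutable String can be shared freely:

```java
class CopyDemo {
    // Hypothetical mutable type with an explicit copy constructor.
    static final class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        Point(Point other) { this(other.x, other.y); } // explicit copy
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point alias = p;              // no copy: same object
        Point copy = new Point(p);    // copy: independent object

        p.x = 10;
        System.out.println(alias.x);  // 10
        System.out.println(copy.x);   // 1

        String s = "hello";
        String t = s;                 // sharing is safe: String is immutable
        s = s + " world";             // builds a new String, t is untouched
        System.out.println(t);        // hello
    }
}
```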
A more subtle difference is that it's a lot harder for a reference-semantics language to support RAII than a value-semantics language, because object lifetime is harder to track. Raymond Chen has a good explanation.
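For comparison, Java's closest built-in counterpart to RAII is the try-with-resources statement, which ties cleanup to a lexical block rather than to object lifetime. The sketch below uses a hypothetical Resource class and a log list purely to make the ordering visible:

```java
import java.util.ArrayList;
import java.util.List;

class ResourceDemo {
    static final List<String> log = new ArrayList<>();

    // Hypothetical resource; close() plays the role of a C++ destructor.
    static final class Resource implements AutoCloseable {
        private final String name;
        Resource(String name) { this.name = name; log.add("open " + name); }
        @Override public void close() { log.add("close " + name); }
    }

    static void work() {
        try (Resource first = new Resource("first");
             Resource second = new Resource("second")) {
            log.add("working");
        } // both are closed here, in reverse order, even if the body throws
    }

    public static void main(String[] args) {
        work();
        System.out.println(log);
        // [open first, open second, working, close second, close first]
    }
}
```

The difference from C++ is that the caller has to opt in with the try block; an object stored in a field or a collection gets no automatic cleanup.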
The reason operator overloading is abused in C++ is that it is too complex a feature. Here are some aspects of it that make it complex:
1. expressions are a tree
2. operator overloading is the interface/documentation for those expressions
3. interfaces are basically an invisible feature in C++
4. free functions/static functions/friend functions are a big mess in C++
5. function prototypes are already a complex feature
6. the choice of syntax for operator overloading is less than ideal
7. there is no other comparable API in the C++ language
8. user-defined types/function names are handled differently than built-in types/function names in function prototypes
9. it uses advanced math, like the operator<<(ostream&, ostream & (*fptr)(ostream &));
10. even the simplest examples of it use polymorphism
11. it's the only C++ feature that has a 2d array in it
12. the this-pointer is invisible, and whether your operators are member functions or outside the class is an important choice for programmers
Because of this complexity, only a very small number of programmers actually understand how it works. I'm probably missing many important aspects of it, but the list above is a good indication that it is a very complex feature.
Update: some explanation of #4. The argument is pretty much as follows:
class A { friend void f(); }; class B { friend void f(); };
void f() { /* use both A and B members inside this function */ }
With static functions, you can do this:
class A { static void f(); }; void f() { /* use only class A here */ }
And with free functions:
class A { }; void f() { /* you have no special access to any classes */ }
Update #2: Regarding #10, the example I was thinking of looks like this in the stdlib:
ostream &operator<<(ostream &o, std::string s) { ... } // inside stdlib
int main() { std::cout << "Hello World" << std::endl; }
Now the polymorphism in this example happens because you can choose between std::cout and std::ofstream and std::stringstream. This is possible because operator<< first parameter takes a reference to ostream. This is normal runtime polymorphism in this example.
Update #3: More about the prototypes. The real interaction between operator overloading and prototypes arises because the overloaded operators become part of the class's interface. This brings us to the 2d array thing, because inside the compiler the class interface is a 2d data structure which holds quite complex data, including booleans, types, and function names. Rule #4 is needed so that you can choose when your operators are inside this 2d data structure and when they're outside of it. Rule #8 deals with the booleans stored in the 2d data structure. Rule #7 is there because the class's interface is used to represent elements of an expression tree.
Java has primitive data types which don't derive from Object, unlike in Ruby. So can we consider Java a 100% object-oriented language? Another question: why didn't Java design its primitive data types the object way?
When Java first appeared (versions 1.x) the JVM was really, really slow.
Not implementing primitives as first-class objects was a compromise they made for the sake of speed, although I think in the long run it was a really bad decision.
"Object oriented" also means lots of things for lots of people.
You can have class-based OO (C++, Java, C#), or you can have prototype-based OO (Javascript, Lua).
100% object oriented doesn't mean much, really. Ruby also has problems that you'll encounter from time to time.
What bothers me about Java is that it doesn't provide the means to abstract ideas efficiently, to extend the language where it has problems. And whenever this issue was raised (see Guy Steele's "Growing a Language") the "oh noes, but what about Joe Sixpack?" argument is given. Even if you design a language that prevents shooting yourself in the foot, there's a difference between accidental complexity and real complexity (see No Silver Bullet) and mediocre developers will always find creative ways to shoot themselves.
For example Perl 5 is not object-oriented, but it is extensible enough that it allows Moose, an object system that allows very advanced techniques for dealing with the complexity of OO. And syntactic sugar is no problem.
No, because it has data types that are not objects (such as int and byte). I believe Smalltalk is truly object-oriented but I have only a little experience with that language (about two months worth some five years ago).
I've also heard claims from the Ruby crowd but I have zero experience with that language.
This is, of course, using the definition of "truly OO" meaning it only has objects and no other types. Others may not agree with this definition.
It appears, after a little research into Python (I had no idea about the name/object distinction despite having coded in it for a year or so - more fool me, I guess), that it may indeed be truly OO.
The following code works fine:
#!/usr/bin/python
i = 7
print id(i)
print type(i)
print i.__str__()
outputting:
6701648
<type 'int'>
7
so even the base integers are objects here.
To get to true 100% OO think Smalltalk for instance, where everything is an object, including the compiler itself, and even if statements: ifTrue: is a message sent to a Boolean with a block of code parameter.
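As a rough illustration of the ifTrue: idea, the Java sketch below models a conditional as a message sent to a boolean-like object. The Bool interface, its TRUE/FALSE instances, and lessThan are all hypothetical names made up for this example; real Smalltalk differs in syntax and in how its Boolean classes are organized:

```java
class MessageDemo {
    // Hypothetical Boolean-like object, loosely modelled on Smalltalk,
    // where conditionals are messages sent to true or false.
    interface Bool {
        Bool ifTrue(Runnable block);
        Bool ifFalse(Runnable block);
    }

    static final Bool TRUE = new Bool() {
        public Bool ifTrue(Runnable block) { block.run(); return this; }
        public Bool ifFalse(Runnable block) { return this; }
    };

    static final Bool FALSE = new Bool() {
        public Bool ifTrue(Runnable block) { return this; }
        public Bool ifFalse(Runnable block) { block.run(); return this; }
    };

    static Bool lessThan(int a, int b) {
        return a < b ? TRUE : FALSE; // the only built-in branch in this sketch
    }

    public static void main(String[] args) {
        lessThan(1, 2)
            .ifTrue(() -> System.out.println("smaller"))
            .ifFalse(() -> System.out.println("not smaller"));
        // prints "smaller"
    }
}
```

Which block runs is decided by dynamic dispatch on the receiver (TRUE or FALSE) rather than by an if statement at the call site.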
The problem is that object-oriented is not really well defined and can mean a lot of things. This article explains the problem in more detail:
http://www.paulgraham.com/reesoo.html
Also, Alan Kay (the inventor of Smalltalk and author(?) of the term "object-oriented") famously said that he didn't have C++ in mind when he thought about OOP. So I think this could apply to Java as well.
The language being fully OO (whatever that means) is desirable, because it means better orthogonality, which is a good thing. But given that Java is not very orthogonal anyway in other respects, the small bit of its OO incompleteness probably doesn't matter in practice.
Java is not 100% OO.
Java may be going toward 99% OO (think of auto-boxing, Scala).
I would say Java is now 87% OO.
Why java doesn't design primitive data types as object way ?
Back in the '90s there were performance reasons, and since then Java has stayed backward compatible, so they cannot take them out.
No, Java is not, since it has primitive data types, which are different from objects (they don't have methods, instance variables, etc.). Ruby, on the other hand, is completely OOP. Everything is an object. I can do this:
1.class
And it will return the class of 1 (Fixnum in older Ruby versions, Integer since Ruby 2.4; either way, basically a number). You can't do this in Java.
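For contrast, here is a minimal Java sketch of what happens on the Java side. A primitive has no methods, but it is boxed into a wrapper object as soon as it is used where an Object is expected. The classNameOf helper is a made-up name for this example:

```java
class BoxingDemo {
    // Accepting Object forces a primitive argument to be boxed first.
    static String classNameOf(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        // int x = 1; x.getClass();   // does not compile: int has no methods
        System.out.println(classNameOf(1));     // java.lang.Integer
        System.out.println(classNameOf(1L));    // java.lang.Long
        System.out.println(classNameOf(true));  // java.lang.Boolean

        Integer n = 7;             // autoboxing
        int sum = n + 1;           // unboxing back to a primitive
        System.out.println(sum);   // 8
    }
}
```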
Java, for the reason you mentioned, having primitives, doesn't make it a purely object-oriented programming language. However, the enforcement of having every program be a class makes it very oriented toward object-oriented programming.
Ruby, which you mentioned and which happened to be the first language that came to my mind as well, is a language that does not have primitives; all values are objects. This certainly does make it more object-oriented than Java. On the other hand, to my knowledge, there is no requirement that a piece of code must be associated with a class, as is the case with Java.
That said, Java does have objects that wrap around the primitives, such as Integer, Boolean, Character and such. The reason for having primitives is probably the reason given in Peter's answer: back when Java was introduced in the mid-90's, memory on systems was mostly in the double-digit megabytes, so having each and every value be an object was a large overhead.
(Large, of course is relative. I can't remember the exact numbers, but an object overhead was around 50-100 bytes of memory. Definitely more than the minimum of several bytes required for primitive values)
Today, with many desktops having multiple gigabytes of memory, the overhead of objects is less of an issue.
"Why java doesn't design primitive data types as object way ?"
At Sun developer days, quite a few years ago, I remember James Gosling answering this. He said that they would have liked to abstract away from primitives entirely, leaving only standard objects, but then ran out of time and decided to ship with what they had. Very sad.
So can we consider java as 100% object oriented language?
No.
Another question : Why java doesn't design primitive data types as object way?
Mainly for performance reasons, possibly also to be more familiar to people coming from C++.
One reason Java can't obviously do away with non-object primitives (int, etc.) is that it does not support native data members. Imagine:
class int extends object
{
// need some data member here. but what type?
public native int();
public native int plus(int x);
// several more non-mutating methods
};
On second thoughts, we know Java maintains internal data per object (locks, etc.). Maybe we could define class int with no data members, but with native methods that manipulate this internal data.
Remaining issues: Constants -- but these can be dealt with similarly to strings. Operators are just syntactic sugar, and + would be mapped to the plus method at compile time, although we need to be careful that int.plus(float) returns float, as does float.plus(int), and so on.
Ultimately I think the justification for primitives is efficiency: the static analysis needed to determine that an int object can be treated purely as JVM integer value may have been considered too big a problem when the language was designed.
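The plus mapping described above can be sketched in today's Java with ordinary wrapper classes. Int and Flt below are hypothetical stand-ins for object versions of int and float (the real keywords cannot be used as class names), and the overloads show one way a + b could resolve to the wider type in either order:

```java
class NumberObjectsDemo {
    // Hypothetical object versions of int and float; the names are made up.
    static final class Int {
        final int value;
        Int(int value) { this.value = value; }
        Int plus(Int other) { return new Int(value + other.value); }
        Flt plus(Flt other) { return new Flt(value + other.value); } // widens to Flt
    }

    static final class Flt {
        final float value;
        Flt(float value) { this.value = value; }
        Flt plus(Flt other) { return new Flt(value + other.value); }
        Flt plus(Int other) { return new Flt(value + other.value); } // stays Flt
    }

    public static void main(String[] args) {
        Int two = new Int(2);
        Flt half = new Flt(0.5f);
        System.out.println(two.plus(two).value);   // 4
        System.out.println(two.plus(half).value);  // 2.5
        System.out.println(half.plus(two).value);  // 2.5
    }
}
```

Every plus call here allocates a new object, which is the kind of cost a compiler would have to optimize away to match primitive performance.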
I'd say that full-OO languages are those which have their elements (classes, methods) accessible as objects to work with.
From this POV, Java is not a fully OOP language, and JavaScript is (even though it has primitives).
According to the book Concepts in Programming Languages, there is something called the Ingalls test, proposed by Dan Ingalls, a leader of the Smalltalk group. That is:
Can you define a new kind of integer, put your new integers into rectangles (which are already part of the window system), ask the system to blacken a rectangle, and have everything work?
And again according to the book, Smalltalk passes this test but C++ and Java do not. Unfortunately the book is not available online, but here are some supporting slides (slide 33 answers your question).
No. Javascript, for example, is.
What would those Integer and Long and Boolean classes be written in?
How would you write an ArrayList or HashMap without primitive arrays?
This is one of those questions that really only matters in an academic sense. Ruby is optimized to treat ints, longs, etc. as primitives whenever possible. Java just made this explicit. If Java had primitives be objects, there would be IntPrimitive, LongPrimitive, etc. (by whatever name) classes, which would most likely be final without special methods (e.g. no IntPrimitive.factorial). That would mean that for most purposes they would be primitives.
Java clearly is not 100% OO. You can easily program it in a procedural style. Most people do. It's true that the libraries and containers tend not to be as forgiving of this paradigm.
Java is not fully object oriented. I would consider Smalltalk and Eiffel the most popular fully object oriented languages.