The question is about the strategic approach to the problem of defining a square root algorithm in a generic numerical interface. I am aware of the existence of algorithms solving the problem under different conditions. I'm interested in algorithms that:
Solve the problem using only a selected set of functions;
Don't care whether the objects manipulated are integers, floating-point numbers or something else, provided those objects can be added, multiplied and compared;
Return an exact solution if the input is a perfect square.
Because of the subtlety of the distinction, and for the sake of clarity, I will define the problem in a very verbose way. Beware the wall of text!
Suppose we have a Java interface Constant<C extends Constant<C>> with the following abstract methods, which we will call base functions:
C add(C a);
C subtract(C a);
C multiply(C a);
C[] divideAndRemainder(C b);
C additiveInverse();
C multiplicativeInverse();
C additiveIdentity();
C multiplicativeIdentity();
int compareTo(C arg1);
It is not known whether C represents an integer or a floating-point number, nor should this be relevant in the following discussion.
Using only those methods, it is possible to create static or default implementations of some mathematical algorithms on numbers: for example, divideAndRemainder(C b) and compareTo(C arg1) allow us to create algorithms for the greatest common divisor, the Bézout identity, and so on.
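For instance, a Euclidean GCD for integer-like values can be written as a sketch using nothing but those base functions (assuming divideAndRemainder returns {quotient, remainder} and that implementations follow the self-type pattern, hence the unchecked cast):
@SuppressWarnings("unchecked")
public default C gcd(C other) {
    C a = (C) this;
    C b = other;
    // Euclid's algorithm: gcd(a, b) = gcd(b, a mod b), for non-negative values
    while (b.compareTo(additiveIdentity()) != 0) {
        C r = a.divideAndRemainder(b)[1]; // remainder of a / b
        a = b;
        b = r;
    }
    return a;
}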
Now suppose our interface has a default method for exponentiation:
public default C pow(int n){
    if(n < 0) return this.multiplicativeInverse().pow(-n); // x^(-n) = (1/x)^n
    if(n == 0) return multiplicativeIdentity();             // x^0 = 1
    @SuppressWarnings("unchecked")
    C base = (C) this;                                      // self-type pattern
    C output = multiplicativeIdentity();
    int m = n;
    while(m > 0)
    {
        if(m % 2 == 1) output = output.multiply(base);      // square-and-multiply
        base = base.multiply(base);
        m = m / 2;
    }
    return output;
}
The goal is to define two default methods, C root(int n) and C maximumErrorAllowed(), such that:
x.equals(y.pow(n)) implies x.root(n).equals(y);
C root(int n) is actually implemented using only base functions and methods created from the base functions;
The interface can still be applied to any kind of number, including but not limited to integers and floating-point numbers;
this.root(n).pow(n).compareTo(maximumErrorAllowed()) == -1 for every this such that this.root(n) != null, i.e. any approximation has an error smaller than maximumErrorAllowed().
Is that possible? If yes, how, and what would be an estimate of the computational complexity?
I spent some time working on a custom number interface for Java; it's amazingly hard--one of the most disappointing experiences I've had with Java.
The problem is that you have to start over from scratch--you can't really re-use anything in Java, so if you want to have implementations for int, float, long, BigInteger, rational, Complex and Vector you have to implement all the methods yourself for every single class, and then don't expect the Math package to be of much help.
It got particularly nasty implementing the "Composed" classes like "Complex" which is made from two of the "Generic" floating point types, or "Rational" which composes two generic integer types.
And math operators are right out--this can be especially frustrating.
The way I got it to work reasonably well was to implement the classes in Java and then write some of the higher-level stuff in Groovy. If you name the operations correctly, Groovy can just pick them up: if your class implements .plus(), then Groovy will let you do instance1 + instance2.
IIRC, because it is dynamic, Groovy often handled cross-class pieces nicely: if you said Complex + Integer, you could supply a conversion from Integer to Complex and Groovy would promote the Integer to do the operation and return a Complex.
Groovy is pretty interchangeable with Java; you can usually just rename a Java class to ".groovy", compile it, and it will work, so it was a pretty good compromise.
This was a long time ago though, now you might get some traction with Java 8's default methods in your "Number" interface--that could make implementing some of the classes easier but might not help--I'd have to try it again to find out and I'm not sure I want to re-open that can o' worms.
Is that possible? If yes, how?
In theory, yes. There are approximation algorithms for root(), for example the n-th root algorithm. You will run into problems with precision, however, which you might want to solve on a case-by-case basis (e.g. use a look-up table for integers). As such, I'd recommend against a default implementation in an interface.
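If you did want to try it anyway, the floor of the n-th root can be bracketed by binary search using only the base functions and pow; here is a rough sketch, assuming this is non-negative, n >= 1, and that divideAndRemainder truncates like integer division (a floating-point C would instead stop once the bracket is smaller than maximumErrorAllowed()):
@SuppressWarnings("unchecked")
public default C root(int n) {
    C x = (C) this;                      // self-type pattern, as in pow()
    C one = multiplicativeIdentity();
    C two = one.add(one);
    C low = additiveIdentity();          // invariant: low^n <= x
    C high = x.add(one);                 // invariant: x < high^n
    while (high.subtract(low).compareTo(one) > 0) {
        C mid = low.add(high).divideAndRemainder(two)[0];
        if (mid.pow(n).compareTo(x) <= 0) low = mid;
        else high = mid;
    }
    // exact answer for perfect n-th powers, null otherwise
    return low.pow(n).compareTo(x) == 0 ? low : null;
}
Each iteration halves the bracket, so there are O(log x) iterations, and each one costs a single pow call, i.e. O(log n) multiplications of C.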
What would be an estimation of the computational complexity?
This, too, varies based on the implementation and your type of number, and depends on your precision. For integers, you can create an implementation with a look-up table, and the complexity would be O(1).
If you want a better answer for the complexity of the operation itself, you might want to check out Computational complexity of calculating the nth root of a real number.
This is a question I read in some lecture notes about dynamic programming that I randomly found on the internet. (I have already graduated and I know the basics of dynamic programming.)
In the section explaining why memoization is needed, the notes give:
// pseudo code
int F[100000] = {0};
int fibonacci(int x){
if(x <= 1) return x;
if(F[x]>0) return F[x];
return F[x] = fibonacci(x-1) + fibonacci(x-2);
}
If memoization is not used, then many subproblems will be recalculated many times, which makes the complexity very high.
Then on one page, the notes pose a question without an answer, which is exactly what I want to ask. Here I am using the exact wording and the examples they show:
Automated memoization: Many functional programming languages (e.g. Lisp) have built-in support for memoization.
Why not in imperative languages (e.g. Java)?
LISP example the notes provide (which they claim is efficient):
(defun F (n)
  (if (<= n 1)
      n
      (+ (F (- n 1)) (F (- n 2)))))
Java example the notes provide (which they claim is exponential):
static int F(int n) {
if (n <= 1) return n;
else return F(n-1) + F(n-2);
}
Before reading this, I did not even know that some programming languages have built-in support for memoization.
Is the claim in the notes true? If yes, then why do imperative languages not support it?
The claims about "LISP" are very vague; they don't even mention which LISP dialect or implementation they mean. None of the LISP dialects I'm familiar with does automatic memoization, but LISP makes it easy to write a wrapper function which transforms any existing function into a memoized one.
Fully automatic, unconditional memoization would be a very dangerous practice and would lead to out-of-memory errors. In imperative languages it would be even worse because return values are often mutable, therefore not reusable. Imperative languages don't usually support tail-recursion optimization, further reducing the applicability of memoization.
The support for memoization is nothing more than having first-class functions.
If you want to memoize the Java version for one specific case, you can write it explicitly: create a hashtable, check for existing values, etc. Unfortunately, you cannot easily generalize this in order to memoize any function. Languages with first-class functions make writing functions and memoizing them almost orthogonal problems.
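For example, a hand-rolled memoized Fibonacci in Java might look like the sketch below; note how the cache bookkeeping is woven into the function itself, which is exactly why it doesn't generalize:
import java.util.HashMap;
import java.util.Map;

class MemoFib {
    // The cache and the algorithm live side by side; nothing here can be
    // reused to memoize a different function.
    private static final Map<Integer, Long> cache = new HashMap<>();

    static long F(int n) {
        if (n <= 1) return n;
        Long cached = cache.get(n);
        if (cached != null) return cached;
        long result = F(n - 1) + F(n - 2);
        cache.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(F(90)); // linear instead of exponential
    }
}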
The basic case is easy, but you have to take recursive calls into account.
In statically typed functional languages like OCaml, a memoized function cannot just call itself recursively, because that would call the non-memoized version. However, the only change to your existing function is to accept a function as an argument, named for example self, which should be called whenever your function wants to recurse. The generic memoization facility then provides the appropriate function. A full example of this is available in this answer.
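The same open-recursion trick can be sketched in Java as well (a hypothetical helper, not a standard API): the function accepts a self argument for its recursive calls, and a generic memoize ties the knot so that every recursive call goes through the cache:
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

class Memo {
    // Wraps an "open" recursive function (one that calls self instead of
    // itself) so that recursion is routed through the cache.
    static <A, R> Function<A, R> memoize(BiFunction<Function<A, R>, A, R> open) {
        Map<A, R> cache = new HashMap<>();
        return new Function<A, R>() {
            @Override
            public R apply(A a) {
                R cached = cache.get(a);
                if (cached != null) return cached;
                R result = open.apply(this, a); // 'this' is the memoized wrapper
                cache.put(a, result);
                return result;
            }
        };
    }

    public static void main(String[] args) {
        Function<Integer, Long> fib = Memo.<Integer, Long>memoize(
            (self, n) -> n <= 1 ? (long) n : self.apply(n - 1) + self.apply(n - 2));
        System.out.println(fib.apply(90)); // linear, not exponential
    }
}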
The Lisp version has two features that make memoizing an existing function even more straightforward:
You can manipulate functions like any other value
You can redefine functions at runtime
So for example, in Common Lisp, you define F:
(defun F (n)
  (if (<= n 1)
      n
      (+ (F (- n 1))
         (F (- n 2)))))
Then, you see that you need to memoize the function, so you load a library:
(ql:quickload :memoize)
... and you memoize F:
(org.tfeb.hax.memoize:memoize-function 'F)
The facility accepts arguments to specify which input should be cached and which test function to use. Then, the function F is replaced by a fresh one, which introduces the necessary code to use an internal hash-table. Recursive calls to F inside F are now calling the wrapping function, not the original one (you don't even recompile F). The only potential problem is if the original F was subject to tail-call optimization. You should probably declare it notinline or use DEF-MEMOIZED-FUNCTION.
Although I'm not sure any widely-used Lisps have supported automatic memoization, I think there are two reasons why memoization is more common in functional languages, and an additional one for Lisp-family languages.
First of all, people write functions in functional languages: computations whose result depends only on their arguments and which do not side-effect the environment. Anything which doesn't meet that requirement isn't amenable to memoization at all. And, well, imperative languages are just those languages in which those requirements are not, or may not be, met, because they would not be imperative otherwise!
Of course, even in merely functional-friendly languages like (most) Lisps you have to be careful: you probably should not memoize the following, for instance:
(defvar *p* 1)

(defun foo (n)
  (if (<= n 0)
      *p*
      (+ (foo (1- n)) (foo (- n *p*)))))
Secondly, functional languages generally want to talk about immutable data structures. This means two things:
It is actually safe to memoize a function which returns a large data structure
Functions which build very large data structures often need to cons an enormous amount of garbage, because they can't mutate interim structures.
(2) is slightly controversial: the received wisdom is that GCs are now so good that it's not a problem, copying is very cheap, compilers can do magic and so on. Well, people who have written such functions will know that this is only partly true: GCs are good, copying is cheap (but pointer-chasing large structures to copy them is often very hostile to caches), but it's not actually enough (and compilers almost never do the magic they are claimed to do). So you either cheat by gratuitously resorting to non-functional code, or you memoize. If you memoize the function then you only build all the interim structures once, and everything becomes cheap (other than in memory, but suitable weakness in the memoization can handle that).
Thirdly: if your language does not support easy metalinguistic abstraction, it's a serious pain to implement memoization. Or to put it another way: you need Lisp-style macros.
To memoize a function you need to do at least two things:
You need to control which arguments are the keys for the memoization -- not all functions have just one argument, and not all functions with multiple arguments should be memoized on the first;
You need to intervene inside the function to disable any self-tail-call optimization, which will completely subvert memoization.
Although it's kind of cruel to do so because it's so easy, I will demonstrate this by poking fun at Python.
You might think that decorators are what you need to memoize functions in Python. And indeed, you can write memoizing tools using decorators (and I have written a bunch of them). And these even sort-of work, although they do so mostly by chance.
For a start, a decorator can't easily know anything about the function it is decorating. So you end up either trying to memoize based on a tuple of all the arguments to the function, or having to specify in the decorator which arguments to memoize on, or something equally grotty.
Secondly, the decorator gets the function it is decorating as an argument: it doesn't get to poke around inside it. That's actually OK, because Python, as part of its 'no concepts invented after 1956' policy, of course does not assume that calls to f lexically within the definition of f (and with no intervening bindings) are in fact self-calls. But perhaps one day it will, and all your memoization will now break.
So in summary: to memoize functions robustly, you need Lisp-style macros. Probably the only imperative languages which have those are Lisps.
Consider a Java project doing lots of floating-point operations where efficiency and memory consumption can be important factors--such as a game. If this project targets multiple platforms, typically Android and the desktop, or more generally 32- and 64-bit machines, you might want to be able to build a single-precision and a double-precision build of your software.
In C/C++ and other lower level languages, this is easily achieved by typedef statements. You can have:
typedef float myfloat;
and the day you want to go 64 bit just change that to:
typedef double myfloat;
provided you use myfloat throughout your code.
How would one achieve a similar effect in java?
A global search and replace of "float" by "double" (or vice-versa) has the huge drawback of breaking compatibility with exterior libraries that only offer one flavor of floating point precision, chief among them certain functions of the java.lang.Math class.
Having a high-level polymorphic approach is less than ideal when you wish to remain efficient and keep memory tight (by having lots of primitive type arrays, for instance).
Have you ever dealt with such a situation and if so, what is in your opinion the most elegant approach to this problem?
The official Android documentation says this about float vs. double performance:
In speed terms, there's no difference between float and double on the more modern hardware. Space-wise, double is 2x larger. As with desktop machines, assuming space isn't an issue, you should prefer double to float.
So you shouldn't have to worry about performance too much. Just use the type that is reasonable for solving your problem.
Apart from that, if you really want the ability to switch between double and float, you could wrap your floating-point value in a class and work with that. But I would expect such a solution to be slower than using any floating-point primitive directly. As Java does not support operator overloading, it would also make your math code much more complicated. Think of something like
double d = (a+b)/c;
when using primitives versus
MyFloat d = a.add(b).div(c);
when working with wrapper objects. In my experience, the polymorphic approach makes maintaining your code much harder.
I will omit the part saying that, for example, double should be just fine, etc.; others have covered that more than well enough. I'm just assuming you want to do it, no matter what--even just as an experiment to see what the performance/memory difference is, it's interesting.
So, a preprocessor would be great here. Java doesn't provide one.
But, you can use your own. Here are some existing implementations. Using javapp for example, you will have #define.
This is not practical without great pain.
While you could define your high-level APIs to work with wrapper types (e.g. use Number instead of a specific type and have multiple implementations of the API that use Float or Double under the hood), chances are that the wrappers will eat more performance than you could ever gain by selecting a less precise type.
You could define high-level objects as interfaces (e.g. Polygon etc.) and hide their actual data representation in the implementation. That means you will have to maintain two implementations, one using float and one using double. It probably requires considerable code duplication.
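A hypothetical sketch of that interface approach (all names made up for illustration): callers only ever see the interface, and whether the data is backed by float or double stays an implementation detail chosen at build or startup time:
interface Vec2 {
    double x();            // accessors widen to double either way
    double y();
    Vec2 add(Vec2 other);
}

final class FloatVec2 implements Vec2 {
    private final float x, y;
    FloatVec2(float x, float y) { this.x = x; this.y = y; }
    public double x() { return x; }
    public double y() { return y; }
    public Vec2 add(Vec2 o) { return new FloatVec2((float) (x + o.x()), (float) (y + o.y())); }
}

final class DoubleVec2 implements Vec2 {
    private final double x, y;
    DoubleVec2(double x, double y) { this.x = x; this.y = y; }
    public double x() { return x; }
    public double y() { return y; }
    public Vec2 add(Vec2 o) { return new DoubleVec2(x + o.x(), y + o.y()); }
}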
Personally, I think you are attempting to solve a non-existent conundrum. Either you need double precision, in which case float isn't an option, or float is "good enough", in which case there is no advantage to ever using double.
Simply use the smallest type that fits the requirements. Especially for a game, float vs. double should make little difference. It's unlikely you spend that much time in math (in Java code)--most likely your graphics will determine how fast you can go.
Generally, use float and only switch to double for the parts where you need the precision, and the question disappears.
Java does not have such functionality, aside from brute-force find-and-replace. However, you can create a helper class. In the class shown below, the type you change in order to switch the floating-point precision is called F (a stand-in for float or double):
public class VarFloat {
    F boxedVal; // F stands for float or double
    public VarFloat(F f){
        this.boxedVal = f;
    }
    public F getVal() { return boxedVal; }
    public double getDoubleVal() { return (double)boxedVal; }
    public float getFloatVal() { return (float)boxedVal; }
}
Wherever possible, you should use getVal as opposed to either of the type-specific ones. You can also consider adding methods like add, addLocal, etc. For example, the two add methods would be:
public VarFloat add(VarFloat vf){
return new VarFloat(this.boxedVal + vf.boxedVal);
}
public VarFloat addLocal(VarFloat vf){
this.boxedVal += vf.boxedVal;
return this; // for method chaining
}
My questions are motivated by a C++ code base which is not mine and which I am currently trying to understand. Nevertheless, I think this question can be answered by OO developers in general (I have seen the same pattern in Java code, for example).
Reading through the code, I noticed that the developer always works using side effects (most functions have a void return type, except for getters and some rare cases) instead of returning results directly. He sometimes uses return values, but only for control flow (error codes... instead of exceptions).
Here are two possible examples of his prototypes (in pseudo-code):
For a function that should return min, max and avg of the float values in a matrix M:
void computeStatistics(float min, float max, float avg, Matrix M);
OR
void computeStatistics(List myStat, Matrix M);
For a function that should return some objects in a given list that verifies a certain criteria and the number of objects found:
int controlValue findObjects(List result, int nbObjectsFound, Object myCriteria, List givenList)
I am not familiar with C++, as you can probably see from my very-pseudo-code... but rather with Matlab, where it is possible to return everything you want from a function, for example an int and a List side by side (which could be useful for the second example). I know this is not possible in C++, and that could explain the second prototype, but it doesn't explain the choice in the first example, where he could have done:
List myStat computeStat(Matrix M)
Finally, here are my questions:
What are the possible reasons that could motivate this choice? Is it a good practice, a convention, or just a development choice? Are there advantages of one way over the other (returning values vs. using side effects)?
In terms of C++:
IMO using return values is clearer than passing values by reference and presents, in most cases, no overhead (please have a look at RVO and copy elision).
However if you do use return values for your control flow, using references is not a bad thing and is still clear for most developers.
So I guess we could say that the choice is yours.
Also keep in mind that many developers are not aware of what black magic your C++ compiler is doing, so using return values might offend them.
In the past it was common practice to use reference parameters as outputs, since returning complex objects was very slow without return value optimization and move semantics. Today I believe that in most cases returning the value is the best choice.
Want Speed? Pass by Value.
Writing the following, provided that the list is copyable, would in my opinion be inappropriate:
void computeStatistics(List myStat, Matrix M);
Instead (provided that the list is copyable) you should write:
List myStat computeStat(Matrix M)
However, the call-by-reference approach can be motivated if your object cannot be copied: then you won't need to allocate it on the heap; instead you can allocate it on the stack and send your function a pointer to it.
Regarding:
void computeStatistics(float min, float max, float avg, Matrix M);
My personal opinion is that best practice is one method, one purpose, so I would do it like this:
float min computeMin(Matrix M);
float max computeMax(Matrix M);
float avg computeAvg(Matrix M);
The only reason I can see for doing all of this in one function would be that the calculations are not done separately (i.e. it would be more work to do them in separate functions).
If you do need several return values from one method, however, I would do it with call-by-reference. For example:
void SomeMethod(input1, input2, &output1, &output2, &output3)
Does BigInteger only have equals for comparison?
Are there substitutes for math notation like > (greater than) or < (less than)?
^Answered!
Now I want to know: is there a way to use BigInteger in iterations such as while and for loops?
You can use the compareTo method.
You can't use math notation; in fact, I wouldn't get in the habit of using == either--I'm fairly sure that unless they used some serious trickery it will fail.
BigInteger a = BigInteger.valueOf(500);
BigInteger b = a;
if( a == b )
will always be true
b = BigInteger.valueOf(500);
if( a == b )
will never be true
if( a.equals(b) )
will always work fine.
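As for the loop part of the question, a sketch of a counting loop driven by compareTo:
import java.math.BigInteger;

public class BigLoop {
    public static void main(String[] args) {
        BigInteger limit = new BigInteger("10");
        // i < limit is spelled i.compareTo(limit) < 0; i++ becomes i.add(ONE)
        for (BigInteger i = BigInteger.ZERO;
             i.compareTo(limit) < 0;
             i = i.add(BigInteger.ONE)) {
            System.out.println(i);
        }
    }
}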
Java isn't a great language for this kind of stuff--I really love Java but ended up having a lot of trouble implementing a Complex class and then implementing a matrix that could hold and manipulate the complex class.
My solution was to use Java to create the core classes then use Groovy to create the classes that used the core classes. If you follow certain naming patterns, then you can use any operators on any class.
Or, if you just want to mess with big numbers, simply use Groovy and don't even declare a type for your variables--it will automatically promote them to whatever you need.
Java operators are designed to operate only on primitive data types; they do not act on classes. Since BigInteger is a class, arithmetic comparisons and operations can only be done through its methods.
From a language design point of view, what kind of practice is supporting operator overloading?
What are the pros & cons (if any) ?
EDIT: it has been mentioned that std::complex is a much better example than std::string for "good use" of operator overloading, so I am including an example of that as well:
std::complex<double> c;
c = 10.0;
c += 2.0;
c = std::complex<double>(10.0, 1.0);
c = c + 10.0;
Aside from the constructor syntax, it looks and acts just like any other built in type.
The primary pro is that you can create new types which act like the built-in types. A good example of this in C++ is std::string (see above for a better example), which is implemented in the library and is not a basic type. Yet you can write things like:
std::string s = "hello";
s += " world";
if(s == "hello world") {
//....
}
The downside is that it is easy to abuse. Poor choices in operator overloading can lead to accidentally inefficient or unclear code. Imagine if std::list had an operator[]. You may be tempted to write:
for(int i = 0; i < l.size(); ++i) {
l[i] += 10;
}
that's an O(n^2) algorithm! Ouch. Fortunately, std::list does not have operator[], since operator[] is assumed to be an efficient operation.
The initial driver is usually maths. Programmers want to add vectors and quaternions by writing a + b and multiply matrices by vectors with M * v.
It has much broader scope than that. For instance, smart pointers look syntactically like normal pointers, streams can look a bit like Unix pipes and various custom data structures can be used as if they were arrays (with strings as indexes, even!).
The general principle behind overloading is to allow library implementors to enhance the language in a seamless way. This is both a strength and a curse. It provides a very powerful way to make the language seem richer than it is out of the box, but it is also highly vulnerable to abuse (more so than many other C++ features).
Pros: you can end up with code which is simpler to read. Few people would argue that:
BigDecimal a = x.add(y).divide(z).add(m.multiply(n));
is as easy to read as
BigDecimal a = ((x + y) / z) + (m * n); // Brackets added for clarity
Cons: it's harder to tell what's being called. I seem to remember that in C++, a statement of the form
a = b[i];
can end up performing 7 custom operations, although I can't remember what they all are. I may be slightly "off" on the example, but the principle is right - operator overloading and custom conversions can "hide" code.
You can do some interesting tricks with syntax (i.e. streams in C++) and you can make some code very concise. However, you can have the effect of making code very difficult to debug and understand. You sort of have to constantly be on your guard about the effects of various operators of different kinds of objects.
You might view operator overloading as a kind of method/function overloading. It is part of polymorphism in object-oriented languages.
With overloading, each class of objects works like a primitive type, which makes classes more natural to use--as easy as 1 + 2.
Say a complex number class, Complex.
Complex(1,2i) + Complex(2,3i) yields Complex(3,5i).
5 + Complex(3, 2i) yields Complex(8, 2i).
Complex(2, 4i) + -1.8 yields Complex(0.2, 4i).
It is much easier to use a class this way. Without operator overloading, you have to use a bunch of class methods to denote 'add', which makes the notation clumsy.
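For comparison, a rough Java sketch of the same idea, where the lack of operator overloading forces explicit method calls:
final class Complex {
    final double re, im;
    Complex(double re, double im) { this.re = re; this.im = im; }
    Complex add(Complex o) { return new Complex(re + o.re, im + o.im); }
    Complex add(double x)  { return new Complex(re + x, im); }
    @Override public String toString() { return re + " + " + im + "i"; }

    public static void main(String[] args) {
        // Complex(1,2i) + Complex(2,3i) becomes:
        System.out.println(new Complex(1, 2).add(new Complex(2, 3))); // 3.0 + 5.0i
        System.out.println(new Complex(2, 4).add(-1.8));              // roughly 0.2 + 4.0i
    }
}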
Operator overloading ought to be defined carefully; otherwise confusion follows. For example, the '+' operator is natural for adding numbers, times, or dates, or for concatenating arrays or text. Adding a '+' operator to a Mouse or Car class might not make any sense. Sometimes an overload may not seem natural to everyone: for example, Array(1,2,3) + Array(3,4,5). Some might expect Array(4,6,8), but others expect Array(1,2,3,3,4,5).