How to find function range? - java

I have an arbitrary function or inequality (consisting of a number of trigonometric, logarithmic, exponential, and arithmetic terms) that takes several arguments, and I want to get its range knowing the domains of all the arguments. Are there any Java libraries that can help to solve this problem? What are the best practices for doing that? Am I right that for an arbitrary function the only thing that can be done is a brute-force approximation? Also, I'm interested in functions that can build intersections and complements for given domains.
Upd. The functions are entered by the user, so the complexity cannot be predicted. However, if the library handles at least simple cases (1-2 variables, 1-2 terms) it will be OK. I expect the functions will mostly define intervals and contain at most 2 independent variables. For instance, definitions like
y > (x+3), x ∈ [-7;8]
y <= 2x, x ∈ [-∞; ∞]
y = x, x ∈ {1,2,3}
will be treated in 99% of cases and covering them will be enough for now.
Well, maybe it's faster to write a simple brute-force for treating such cases. Probably it will be satisfactory for my case but if there are better options I would like to learn them.

Notational remark: I assume you want to find the range of the function, i.e. the set of values that the function can take.
I think this problem is not simple. I don't think that "brute force" is a solution at all: what does "brute force" even mean when we have continuous intervals (i.e. infinitely many points)?
However, there might be some special cases where this is actually possible. For example, when you take a sin(F(x)) function, you know that its range lies within [-1,1], regardless of the inner function F(x), and when you take Exp(x) over all real x you know the range is (0,+inf).
You could try constructing a syntax tree with information about the ranges associated to each node. Then you could try going bottom-up through the tree to try to compute the information about the actual intervals in which the function values lie.
For example, for the function Sin(x)+Exp(x) and x in (-inf, +inf) you would get a tree
        +          range: [left range] + [right range] (interval sum)
       / \
    sin   exp      range: [-1, 1], range: (0, +inf)
     |     |
     x     x
so here the result would be [-1, 1] + (0, +inf) = (-1, +inf).
Of course there are many problems with this approach; for example, combining the ranges of the operands does not always give the exact range. Say you have two functions F(x) = Sin(x) and G(x) = 1-Sin(x). Their ranges are [-1, 1] and [0, 2], so the interval sum is [-1, 3], but F(x) + G(x) collapses to {1}. You need to detect and take care of such behaviour, otherwise you will get only an upper bound on the possible range (so a sort of codomain).
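The bottom-up pass over the tree can be sketched with a tiny interval type. This is only an illustration under stated assumptions: the class and method names are hypothetical, endpoints are treated as closed (open versus closed is not tracked), sin is bounded coarsely by [-1, 1] whatever its argument, and + uses the usual interval-arithmetic sum [a+c, b+d] rather than a union.

```java
/** Closed interval [lo, hi]; this sketch does not track open vs. closed endpoints. */
final class Interval {
    final double lo, hi;

    Interval(double lo, double hi) {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
        this.lo = lo;
        this.hi = hi;
    }

    /** Enclosure of {a + b : a in this, b in other}. */
    Interval add(Interval other) {
        return new Interval(lo + other.lo, hi + other.hi);
    }

    /** exp is increasing, so the endpoints map directly. */
    Interval exp() {
        return new Interval(Math.exp(lo), Math.exp(hi));
    }

    /** Coarse enclosure: the sine of anything lies in [-1, 1]. */
    Interval sin() {
        return new Interval(-1.0, 1.0);
    }

    @Override
    public String toString() {
        return "[" + lo + ", " + hi + "]";
    }
}

final class IntervalDemo {
    public static void main(String[] args) {
        Interval x = new Interval(Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY);
        // Bottom-up evaluation of sin(x) + exp(x): leaves first, then the root.
        Interval result = x.sin().add(x.exp());
        System.out.println(result); // an enclosure of the true range, not necessarily exact
    }
}
```

The result is an enclosure: it never misses a value the function can take, but, as with Sin(x) + (1-Sin(x)), it can be much wider than the true range because each operand is treated as independent.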
If you provide more examples, maybe someone can propose a better solution, I guess a lot depends on the details of the functions.
@High Performance Mark: I had a look at JAS and it seems that its main purpose is to deal with multivariate polynomial rings, but the question mentioned trigonometric, logarithmic and other transcendental functions, so pure polynomial arithmetic will not be sufficient.

Here's another approach and depending on how crazy your function can be (see EDIT) it might give you the universal solution to your problem.
Compose the final expression, which might be rather complex.
After that use numerical methods to find minimum and maximum of the function - this should give you the resulting range.
EDIT: The above would not work only if your final expression is not continuous. In that case you would have to divide the domain into continuous sections, find the min and max on each of them, and take the union of the resulting ranges at the end.
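As a rough illustration of the numerical route for one variable on a bounded interval, the sketch below simply samples the function on an even grid and keeps the smallest and largest values seen. The names are hypothetical, and plain sampling is swapped in here for a real optimizer: it can miss narrow peaks between grid points, so the returned pair is an estimate that lies inside the true range.

```java
import java.util.function.DoubleUnaryOperator;

final class RangeEstimator {
    /** Estimates {min, max} of f on [lo, hi] from samples + 1 evenly spaced points. */
    static double[] estimateRange(DoubleUnaryOperator f, double lo, double hi, int samples) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i <= samples; i++) {
            double x = lo + (hi - lo) * i / samples;
            double y = f.applyAsDouble(x);
            if (Double.isNaN(y)) continue; // skip points where f is undefined
            min = Math.min(min, y);
            max = Math.max(max, y);
        }
        return new double[] {min, max};
    }
}
```

For instance, `RangeEstimator.estimateRange(x -> x * x, -1.0, 2.0, 3000)` gives an estimate close to {0, 4}. For a discontinuous expression the same call would be made once per continuous section.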

I would have thought that this is a natural problem to tackle with a Computer Algebra System. I googled around and JAS seems to be the most-cited Java CAS.
If I had to confine myself to numeric approaches, then I'd probably tackle it with some variety of interval computations. So: the codomain of sin is [-1,1], the codomain of exp is (0,+Inf), and the codomain of exp(sin(expression)) is ...
over to you, this is where I'd reach for Mathematica (which is callable from Java).

Related

What Crossover Method should I use for crossing Postfix expressions in Genetic Algorithm?

I'm building a project whose main objective is to find a given number (if possible, otherwise closest possible) using 6 given numbers and main operators (+, -, *, /). Idea is to randomly generate expressions, using the numbers given and the operators, in reverse polish (postfix) notation, because I found it the easiest to generate and compute later. Those expressions are Individuals in Population of my Genetic Algorithm. Those expressions have the form of an ArrayList of Strings in Java, where Strings are both the operators and operands (the numbers given).
The main question here is, what would be the best method to crossover these individuals (postfix expressions actually)? Right now I'm thinking about crossing expressions that are made out of all the six operands that are given (and 5 operators with them). Later I'll probably also cross the expressions that would be made out of fewer operands (5, 4, 3, 2 and also only 1), but I guess that I should figure this out first, as the most complex case (if you think it might be a better idea to start differently, I'm open to any suggestions). So, the thing is that every expression is made from all the operands given, and the child expression should include all the operands, too. I understand that this requires some sort of ordered crossover (often used in problems like TSP), and I read a lot about it (for example here where multiple methods are described), but I didn't quite figure out which one would be best in my case (I'm also aware that in Genetic Algorithms there is a lot of 'trial and error', but I'm talking about something else here).
What is bothering me are the operators. If I had only a list of operands, then it wouldn't be a problem to cross 2 such lists, for example by taking a random subarray of half the elements from parent 1 and filling the rest with the remaining elements from parent 2, keeping their original order. But here, if I, say, take the first half of an expression from the first parent expression, I would definitely have to fill the child expression with the remaining operands, but what should I do with operators? Take them from parent 2 like the remaining operands (but then I would have to watch out because in order to use an operator in a postfix expression, I need to have at least 1 operand more, and checking that all the time might be time consuming, or not?), or maybe I could generate random operators for the rest of the child expression (but that wouldn't be a pure crossover then, would it)?
When talking about crossover, there is also mutation, but I guess I have that worked out. I can take an expression and perform a mutation where I'll just switch 2 operands, or take an expression and randomly change 1 or more operators. For that, I have some ideas, but the crossover is what really bothers me.
So, that pretty much sums up my problem. Like I said, the main question is how to crossover, but if you have any other suggestions or questions about the program (maybe an easier representation of expressions, other than a list of strings, which may be easier/faster to crossover, maybe something I didn't mention here, maybe even a whole new approach to the problem?), I'd love to hear them. I didn't give any code here because I don't think it's needed to answer this question, but if you think it would help, I'll definitely edit it in. One more time, the main question is how to crossover, this specific part of the problem (idea or pseudocode expected, although the code itself would be great, too :D), but if you think that I need to change something more, or you know some other solutions to my whole problem, feel free to say.
Thanks in advance!
There are two approaches that come to mind:
Approach #1
Encode each genome as a fixed length expression where odd indices are numbers and even indices are the operators. For mutation, you could slightly change the numbers and/or change the operators.
Pros:
Very simple to code
Cons:
Would have to create an infix parser
Fixed length expressions
Approach #2
Encode each genome as a syntax tree. For instance, 4 + 3 / 2 - 1 is equivalent to Add(4, Subtract(Divide(3, 2), 1)) which looks like:
     _____+_____
     |         |
     4     ____-____
           |       |
         __/__     1
         |   |
         3   2
Then when crossing over, pick a random node from each tree and swap them. For mutation, you could add, remove, and/or modify random nodes.
Pros:
Might find better results
Variable length expressions
Cons:
Adds time complexity
Adds programming complexity
Here is an example of the second approach:
Source
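The subtree swap from Approach #2 can be sketched in Java as follows. The `Node` and `SubtreeCrossover` names are hypothetical, and the sketch makes one simplifying choice: instead of re-linking parent pointers, it swaps the contents of the two chosen nodes, which has the same effect as exchanging the subtrees rooted there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Expression tree node: a leaf holds a number, an inner node holds an operator. */
final class Node {
    String value;      // "+", "-", "*", "/" or a number literal
    Node left, right;  // both null for leaves

    Node(String value, Node left, Node right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    static Node leaf(String number) {
        return new Node(number, null, null);
    }

    /** Collects this node and all descendants in pre-order. */
    void collect(List<Node> out) {
        out.add(this);
        if (left != null) left.collect(out);
        if (right != null) right.collect(out);
    }

    int size() {
        List<Node> all = new ArrayList<>();
        collect(all);
        return all.size();
    }
}

final class SubtreeCrossover {
    /** Swaps one randomly chosen subtree of a with one of b, in place. */
    static void crossover(Node a, Node b, Random rng) {
        List<Node> nodesA = new ArrayList<>();
        List<Node> nodesB = new ArrayList<>();
        a.collect(nodesA);
        b.collect(nodesB);
        Node pa = nodesA.get(rng.nextInt(nodesA.size()));
        Node pb = nodesB.get(rng.nextInt(nodesB.size()));
        // Exchanging the fields of the two nodes exchanges the subtrees rooted at them.
        String v = pa.value; Node l = pa.left; Node r = pa.right;
        pa.value = pb.value; pa.left = pb.left; pa.right = pb.right;
        pb.value = v; pb.left = l; pb.right = r;
    }
}
```

Note that a plain subtree swap does not keep the "every operand used exactly once" property from the question: a child can end up with duplicated or missing operands, so a repair step or a restriction on which subtrees may be exchanged would still be needed for that constraint.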

Logarithm Algorithm

I need to evaluate a logarithm of any base, it does not matter, to some precision. Is there an algorithm for this? I program in Java, so I'm fine with Java code.
How to find a binary logarithm very fast? (O(1) at best) might be able to answer my question, but I don't understand it. Can it be clarified?
Use this identity:
log_b(n) = log_e(n) / log_e(b)
Where log can be a logarithm function in any base, n is the number and b is the base. For example, in Java this will find the base-2 logarithm of 256:
Math.log(256) / Math.log(2)
=> 8.0
Math.log() uses base e, by the way. And there's also Math.log10(), which uses base 10.
I know this is extremely late, but this may come to be useful for some since the matter here is precision. One way of doing this is essentially implementing a root-finding algorithm that uses, from its base, the high precision types you might want to be using, consisting of simple +-x/ operations.
I would recommend implementing Newton's method since it demands relatively few iterations and has great convergence. For this sort of application, specifically, I believe it's fair to say it will always provide the correct result provided good input validation is implemented.
Considering a simple constant "a" where
Where a is sought to be solved for such that it obeys, then
We can use the Newton method iteratively to find "a" within any specified tolerance, where each a-ith iteration can be computed by
and the denominator is
,
because that's the first derivative of the function, as necessary for the Newton method. Once this is solved for, "a" is the direct answer for the "a = log_b(x)" problem, obtainable by simple +-x/ operations, so you're already good to go. "Wait, but there's a power there?". Yes. If you can rely on your power function as being accurate enough, then there are no issues with going ahead and using it there. Otherwise, you can further break down the power operation into a series of other +-x/ operations by using these methods to simplify whatever decimal number that is on the power into two integer power operations that can be computed easily with a series of multiplication operations. This process will eventually leave you with nth-roots to solve for, which you can also find with the Newton method. If you do go down that road, you can use this for the Newton method
which, as you can see, has to be solved for recursively until you reach b = 1.
Phew, but yeah, that's it. This is the way you can solve the problem by making sure you use high precision types along the whole way with only +-x/ operations. Below is a quick implementation I did in Excel to solve for log_2(3), compared with the solution given by the software's original function. As you can see, I can just keep refining "a" until I reach the tolerance I want by monitoring what the optimization function gives me. In this, I used a=2 as the initial guess, which you can use and should be fine for most cases.
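In Java, a Newton iteration of this kind could be sketched as follows. This is one possible formulation and not necessarily the one used in the spreadsheet: it applies Newton's method to f(y) = e^y - x to get the natural logarithm, then uses the change-of-base identity. It assumes `Math.exp` is accurate enough (the "power function you can rely on"), uses plain `double` rather than a high-precision type, and takes its initial guess from the binary exponent of x; the class and method names are hypothetical.

```java
final class NewtonLog {
    // Hard-coded value of ln 2, used only to build the initial guess.
    private static final double LN2 = 0.6931471805599453;

    /** Natural logarithm of a positive finite x via Newton's method on f(y) = e^y - x. */
    static double ln(double x, double tol) {
        if (!(x > 0) || Double.isInfinite(x)) {
            throw new IllegalArgumentException("x must be positive and finite");
        }
        double y = Math.getExponent(x) * LN2; // rough guess: x is about 2^exponent
        for (int i = 0; i < 100; i++) {
            double step = 1.0 - x / Math.exp(y); // f(y) / f'(y) = (e^y - x) / e^y
            y -= step;
            if (Math.abs(step) < tol) break;
        }
        return y;
    }

    /** Logarithm of x in the given base (base must be positive and not 1). */
    static double log(double base, double x, double tol) {
        return ln(x, tol) / ln(base, tol);
    }
}
```

A call such as `NewtonLog.log(2, 3, 1e-12)` can then be compared against `Math.log(3) / Math.log(2)`. To go beyond `double` precision, the same iteration would have to be rewritten on a high-precision type with its own exponential.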

Convert string to a large integer?

I have an assignment (I think a pretty common one) where the goal is to develop a LargeInteger class that can do calculations with very large integers.
I am obviously not allowed to use the java.math.BigInteger class at all.
Right off the top I am stuck. I need to take 2 Strings from the user (the long digits) and then I will be using these strings to perform the various calculation methods (add, divide, multiply etc.)
Can anyone explain to me the theory behind how this is supposed to work? After I take the string from the user (since it is too large to store in int) am I supposed to break it up maybe into 10 digit blocks of long numbers (I think 10 is the max long maybe 9?)
Any help is appreciated.
First off, think about what a convenient data structure to store the number would be. Think about how you would store an N digit number into an int[] array.
Now let's take addition for example. How would you go about adding two N digit numbers?
Using our grade-school addition, first we look at the least significant digit (in standard notation, this would be the right-most digit) of both numbers. Then add them up.
So if the right-most digits were 7 and 8, we would obtain 15. Take the right-most digit of this result (5) and that's the least significant digit of the answer. The 1 is carried over to the next calculation. So now we look at the 2nd least significant digit and add those together along with the carry (if there is no carry, it is 0). And repeat until there are no digits left to add.
The basic idea is to translate how you add, multiply, etc by hand into code when the numbers are stored in some data structure.
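The grade-school addition described above can be sketched like this, storing one decimal digit per `int[]` slot with the least significant digit first. The class and method names are hypothetical, and the sketch handles only non-negative numbers.

```java
final class DigitArrays {
    /** Parses a non-negative decimal string into digits, least significant first. */
    static int[] parse(String s) {
        int[] digits = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(s.length() - 1 - i);
            if (c < '0' || c > '9') throw new NumberFormatException("not a digit: " + c);
            digits[i] = c - '0';
        }
        return digits;
    }

    /** Grade-school addition: add digit by digit and carry into the next position. */
    static int[] add(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        int[] sum = new int[n + 1]; // one extra slot for a final carry
        int carry = 0;
        for (int i = 0; i < n; i++) {
            int da = i < a.length ? a[i] : 0;
            int db = i < b.length ? b[i] : 0;
            int t = da + db + carry;
            sum[i] = t % 10;
            carry = t / 10;
        }
        sum[n] = carry;
        return sum;
    }

    /** Converts back to a decimal string, dropping leading zeros. */
    static String format(int[] digits) {
        int i = digits.length - 1;
        while (i > 0 && digits[i] == 0) i--;
        StringBuilder sb = new StringBuilder();
        for (; i >= 0; i--) sb.append(digits[i]);
        return sb.toString();
    }
}
```

For example, `DigitArrays.format(DigitArrays.add(DigitArrays.parse("999"), DigitArrays.parse("1")))` yields "1000". Storing several digits per slot (a larger base) follows the same pattern, with 10 replaced by the block base.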
I'll give you a few pointers as to what I might do with a similar task, but let you figure out the details.
Look at how addition is done in simple electronic adder circuits. Specifically, they use small blocks of addition combined together. These principles will help. Specifically, you can add the blocks, just remember to carry over from one block to the next.
Your idea of breaking it up into smaller blocks is an excellent one. Just remember to do the correct conversions. I suspect 9 digits is just about right, for the purpose of carry overs, etc.
These tasks will help you with addition and subtraction. Multiplication and Division are a bit trickier, but again, a few tips.
Multiplication is the easier of the tasks, just remember to multiply each block of one number with the other, and carry the zeros.
Integer division could basically be approached like long division, only using whole blocks at a time.
I've never actually built such a class, so hopefully there will be something in here you can use.
Look at the source code for MPI 1.8.6 by Michael J. Fromberger (a C library). It uses a simple data structure for bignums and simple algorithms. It's C, not Java, but straightforward.
Its division performs poorly (and results in slow conversion of very large bignums to text), but you can follow the code.
There is a function mpi_read_radix to read a number in an arbitrary radix (up to base 36, where the letter Z is 35) with an optional leading +/- sign, and produce a bignum.
I recently chose that code for a programming language interpreter because although it is not the fastest performer out there, nor the most complete, it is very hackable. I've been able to rewrite the square root myself to a faster version, fix some coding bugs affecting a port to 64 bit digits, and add some missing operations that I needed. Plus the licensing is BSD compatible.

Unification - Infinity of results

I'm developing (in Java), for fun, an application which uses a unification algorithm.
I have chosen that my unification algorithm returns all the possible unifications. For example, if I try to solve
add(X,Y) = succ(succ(0))
it returns
{X = succ(succ(0)), Y = 0}, {X = succ(0), Y = succ(0)}, {X = 0, Y = succ(succ(0))}
However, in some cases, there exists an infinite number of possible unifications
(e.g. X > Y = true).
Does someone know an algorithm for determining whether an infinite number of unifications may be encountered?
Thanks in advance
In the context of Prolog, when you say "unification", you usually mean syntactic unification. Therefore, add(X, Y) and succ(succ(0)), do not unify (as terms), because their functors and arities differ. You seem to be referring to unification modulo theories, where distinct terms like add(X, Y) and succ(succ(0)) can be unified provided some additional equations or predicates are satisfied. Syntactic unification is decidable, and the number of possible unifiers is infinite if, after applying the most general unifier, you still have variables in both terms. Unification modulo theories is in general not decidable. To see that already basic questions can be hard consider for example the unification problem N > 2, X^N + Y^N = Z^N over the integers, which, if you could easily algorithmically decide whether or not a solution exists (i.e., whether the terms are unifiable modulo integer arithmetic), would immediately settle Fermat's Last Theorem. Consider also Matiyasevich's theorem and similar undecidability results.
In certain constraint logic programming systems you can easily see whether the solution set is infinite. For example, some CLP(FD) implementations (e.g. SWI-Prolog and Jekejeke Minlog, but not GNU Prolog or B-Prolog, which assume finite upper and lower bounds) support a certain degree of reasoning with infinite integer sets. This shows in interval notation such as the following (SWI-Prolog):
?- use_module(library(clpfd)).
true.
?- X #\= 2.
X in inf..1\/3..sup.
But these sets have a disadvantage: they cannot be used in CLP(FD) labeling, where the elements of the set are enumerated and a further attempt to solve the instantiated equations is made. It would also run counter to the following result if something could be done in general to decide CLP(FD) queries:
"In 1900, in recognition of their depth, David Hilbert proposed the
solvability of all Diophantine problems as the tenth of his
celebrated problems. In 1970, a novel result in mathematical logic
known as Matiyasevich's theorem settled the problem negatively: in
general Diophantine problems are unsolvable."
(From Wikipedia on Diophantine equations)
Another constraint logic programming system that can usually also deal with infinite solution sets is CLP(R). The reasoning among equations is a little stronger there. For example, CLP(FD) does not detect the following inconsistency (this depends on the system: this is the result for SWI-Prolog, in Jekejeke Minlog you will immediately see a No for the second query, and GNU Prolog will loop for around 4 secs and then say No):
?- X #> Y.
Y#=<X+ -1.
?- X #> Y, Y #> X.
X#=<Y+ -1,
Y#=<X+ -1.
On the other hand CLP(R) will find:
?- use_module(library(clpr)).
?- {X > Y}.
{Y=X-_G5542, _G5542 > 0.0}.
?- {X > Y, Y > X}.
false.
Constraint systems work by implementing algorithms from number theory, linear algebra, analysis, etc., depending on the domain they model, i.e. what * denotes in the notation CLP( * ). These algorithms can go as far as quantifier elimination.
Bye

Numerical computation in Java

Ok so I'm trying to use the Apache Commons Math library to compute a double integral, but both integrals run from negative infinity (to around 1) and it's taking ages to compute. Are there any other ways of doing such operations in Java? Or should it run "faster" (I mean I could actually see the result some day before I die) and I'm doing something wrong?
EDIT: Ok, thanks for the answers. As for what I've been trying to compute it's the Gaussian Copula:
So we have a standard bivariate normal cumulative distribution function which takes as arguments two inverse standard normal cumulative distribution functions, and I need integrals to compute that (I know there's an Apache Commons Math function for the standard normal cumulative distribution, but I failed to find the inverse and bivariate versions).
EDIT2: as my friend once said "ahhh yes the beauty of Java, no matter what you want to do, someone has already done it" I found everything I needed here http://www.iro.umontreal.ca/~simardr/ssj/ very nice library for probability etc.
There are two problems with infinite integrals: convergence and value-of-convergence. That is, does the integral even converge? If so, to what value does it converge? There are integrals which are guaranteed to converge, but whose value it is not possible to determine exactly (try the integral from 1 to infinity of e^(-x^2)). If it can't be exactly returned, then an exact answer is not possible mathematically, which leaves only approximation. Apache Commons uses several different approximation schemes, but all require the use of finite bounds for correctness.
The best way to get an appropriate answer is to repeatedly evaluate finite integrals, with ever increasing bounds, and compare the results. In pseudo-code, it would look something like this:
double DELTA = 1e-6; // your error threshold here (10^-6 would be XOR in Java)
double STEP_SIZE = 10.0;
double oldValue = Double.MAX_VALUE;
double newValue = oldValue;
double lowerBound = -10; // or whatever you want to start with; for (-infinity, 1), I'd
                         // start with something like -10
double upperBound = 1;
do {
    oldValue = newValue;
    lowerBound -= STEP_SIZE; // widen the finite interval
    newValue = integrate(lowerBound, upperBound); // perform your integration method here
} while (Math.abs(newValue - oldValue) > DELTA);
Eventually, if the integral converges, then you will get enough of the important stuff in that widening the bounds further will not produce meaningful information.
A word to the wise though: this kind of thing can be explosively bad if the integral doesn't converge. In that case, one of two situations can occur: Either your termination condition is never satisfied and you fall into an infinite loop, or the value of the integral oscillates indefinitely around a value, which may cause your termination condition to be incorrectly satisfied (giving incorrect results).
To avoid the first, the best way is to put in some maximum number of steps to take before returning--doing this should stop the potentially infinite loop that can result.
To avoid the second, hope it doesn't happen or prove that the integral must converge (three cheers for Calculus 2, anyone? ;-)).
To answer your question formally, no, there are no other such ways to perform your computation in Java. In fact, there are no guaranteed ways of doing it in any language, with any algorithm; the mathematics just don't work out the way we want them to. However, in practice, a lot (though by no means all!) of the practical integrals do converge; it's been my experience that about 20 iterations will give you an approximation of reasonable accuracy, and Apache should be fast enough to handle that without taking absurdly long.
Suppose you are integrating f(x) over -infinity to 1. Substitute x = 2 - 1/(1-t) and evaluate over the range t = 0 .. 1; since dx/dt = -1/(1-t)^2 and the limits swap, the integrand becomes f(2 - 1/(1-t)) / (1-t)^2. Check a maths text for the details of the substitution; I'm a little rusty and it's too late here.
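A sketch of that substitution in Java, using a plain midpoint rule on t in (0, 1) so the endpoint t = 1 (which maps to x = -infinity) is never evaluated. The class and method names are hypothetical, and the midpoint rule is just a simple stand-in for whatever quadrature is actually used; the factor 1/(1-t)^2 comes from differentiating x = 2 - 1/(1-t).

```java
import java.util.function.DoubleUnaryOperator;

final class SemiInfiniteIntegral {
    /** Approximates the integral of f over (-inf, 1] using x = 2 - 1/(1 - t), t in (0, 1). */
    static double integrateToOne(DoubleUnaryOperator f, int n) {
        double h = 1.0 / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double t = (i + 0.5) * h;            // midpoint of the i-th subinterval
            double u = 1.0 - t;
            double x = 2.0 - 1.0 / u;            // t = 0 maps to x = 1, t -> 1 maps to x -> -inf
            sum += f.applyAsDouble(x) / (u * u); // |dx/dt| = 1 / (1 - t)^2
        }
        return sum * h;
    }
}
```

As a sanity check, the integral of e^x over (-infinity, 1] is e, so `SemiInfiniteIntegral.integrateToOne(Math::exp, 100000)` should come out close to `Math.E`. The approach only works well when f decays fast enough that the transformed integrand stays bounded near t = 1.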
The result of a numerical integration where one of the bounds is infinity has a good chance to be infinity as well. And it will take infinite time to prove it ;)
So you either find an equivalent formula (using real math) that can be computed, or you replace the lower bound with a reasonably large negative value and see whether you can get a good estimate for the integral.
If Apache Commons Math could do numerical integration for integrals with infinite bounds in finite time, they wouldn't give it away for free ;-)
Maybe it's your algorithm.
If you're doing something naive like Simpson's rule it's likely to take a very long time.
If you're using Gaussian or log quadrature you might have better luck.
What's the function you're trying to integrate, and what's the algorithm you're using?
