I need to minimize a complex linear multivariable function under some constraints.
Let x be an array of complex numbers of length L.
a[0], a[1], ..., a[L-1] are complex coefficients and
F is the complex function F(x)= x[0]*a[0] + x[1]*a[1] + ... + x[L-1]*a[L-1] that has to be minimized.
b[0], b[1], ..., b[L-1] are complex coefficients and there is a constraint
1 = complexConjugate(x[0])*x[0] + complexConjugate(x[1])*x[1] + ... + complexConjugate(x[L-1])*x[L-1] that has to be fulfilled.
I already had a detailed look at http://math.nist.gov/javanumerics/ and went through a lot of documentation, but I couldn't find a library which does minimization for complex functions.
You want to minimize a differentiable real-valued function f on a smooth hypersurface S. If such a minimum exists - in the situation after the edit it is guaranteed to exist because the hypersurface is compact - it occurs at a critical point of the restriction f|S of f to S.
The critical points of a differentiable function f defined in the ambient space restricted to a manifold M are those points where the gradient of f is orthogonal to the tangent space T(M) to the manifold. For the general case, read up on Lagrange multipliers.
In the case where the manifold is a hypersurface (it has real codimension 1) defined (locally) by an equation g(x) = 0 with a smooth function g, this is particularly easy to detect: the critical points of f|S are the points x on S where grad(f)|x is collinear with grad(g)|x.
Now the problem is actually a real (as in concerns the real numbers) problem and not a complex (as in concerning complex numbers) one.
Stripping off the unnecessary imaginary parts, we have
the hypersurface S, which conveniently is the unit sphere, globally defined by (x|x) = 1, where (a|b) denotes the scalar product a_1*b_1 + ... + a_k*b_k; the gradient of g(x) = (x|x) - 1 at x is just 2*x
a real linear function L(x) = (c|x) = c_1*x_1 + ... + c_k*x_k, whose gradient is c, independent of x
So there are two critical points of L on the sphere (unless c = 0 in which case L is constant), the points where the line through the origin and c intersects the sphere, c/|c| and -c/|c|.
Obviously L(c/|c|) = 1/|c|*(c|c) = |c| and L(-c/|c|) = -1/|c|*(c|c) = -|c|, so the minimum occurs at -c/|c| and the value there is -|c|.
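The closed-form answer above is easy to check numerically. Below is a minimal sketch (class and method names are my own, purely illustrative):

```java
// Numeric check of the closed-form result: on the unit sphere, the linear
// function L(x) = (c|x) attains its minimum -|c| at x = -c/|c|.
public class SphereMin {
    // Euclidean norm |c|
    static double norm(double[] c) {
        double s = 0;
        for (double v : c) s += v * v;
        return Math.sqrt(s);
    }

    // argmin of (c|x) over the unit sphere: -c/|c|
    static double[] minimizer(double[] c) {
        double n = norm(c);
        double[] x = new double[c.length];
        for (int i = 0; i < c.length; i++) x[i] = -c[i] / n;
        return x;
    }

    // value of (c|x)
    static double dot(double[] c, double[] x) {
        double s = 0;
        for (int i = 0; i < c.length; i++) s += c[i] * x[i];
        return s;
    }

    public static void main(String[] args) {
        double[] c = {3, 4};               // |c| = 5
        double[] x = minimizer(c);
        System.out.println(dot(c, x));     // -5.0, i.e. -|c|
    }
}
```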
Each complex variable x can be considered as two real variables, representing the real and imaginary part, respectively, of x.
My recommendation is that you reformulate your objective function and constraint using the real and imaginary parts of each variable or coefficient as independent components.
According to the comments, you only intend to optimize the real part of the objective function, so you end up with a single real-valued objective function to optimize.
The constraint can be split into two, where the "real" constraint should equal 1 and the "imaginary" constraint should equal 0.
After having reformulated the optimization problem this way, you should be able to apply any optimization algorithm that is applicable to the reformulated problem. For example, there is a decent set of optimizers in the Apache Commons Math library, and the SuanShu library also contains some optimization algorithms.
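As a sketch of that reformulation (all names here are my own, not from Apache Commons Math or SuanShu): each complex x[k] = u[k] + i*v[k] contributes two real unknowns, Re(F) becomes an ordinary real function of them, and, since conj(x[k])*x[k] = u[k]² + v[k]² is already real, the imaginary part of the norm constraint is identically zero and only the real equation remains:

```java
// Sketch: each complex x[k] = u[k] + i*v[k] becomes two real variables.
// Re(F) and the norm constraint are then plain real-valued functions of
// the 2L real unknowns, ready for any real optimizer.
public class RealReformulation {
    // Re(sum a[k]*x[k]) with a[k] = ar[k] + i*ai[k], x[k] = u[k] + i*v[k]:
    // Re((ar + i*ai)(u + i*v)) = ar*u - ai*v
    static double realObjective(double[] ar, double[] ai, double[] u, double[] v) {
        double s = 0;
        for (int k = 0; k < ar.length; k++) s += ar[k] * u[k] - ai[k] * v[k];
        return s;
    }

    // conj(x[k])*x[k] = u[k]^2 + v[k]^2 is already real, so the norm
    // constraint reduces to one real equation: constraint(u, v) == 1
    static double constraint(double[] u, double[] v) {
        double s = 0;
        for (int k = 0; k < u.length; k++) s += u[k] * u[k] + v[k] * v[k];
        return s;
    }

    public static void main(String[] args) {
        double[] ar = {1, 0}, ai = {0, 1};
        double[] u = {1, 0}, v = {0, 0};     // x = (1, 0), on the unit sphere
        System.out.println(realObjective(ar, ai, u, v)); // 1.0
        System.out.println(constraint(u, v));            // 1.0
    }
}
```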
I coded the global alignment algorithm using affine gap cost. The algorithm's running time in big-O notation is O(n^2). However, my teacher said that in order to verify the running time one has to divide the measured running time R(n) by n^2, where n is the length of the sequence (using sequences of different lengths), and if the points (n, R(n)/n^2) stay below a horizontal line then the running time is verified. I couldn't really understand what that meant. I plotted R(n)/n^2 using sequences of different lengths, and it worked: I had fluctuations in the beginning, but then the fluctuations went away (you can see this in the image too). The sequences I used were from 1,000 to 10,000 bases long.
Can someone explain how R(n)/n^2 verifies the running time? I've tried looking on the internet but couldn't find an answer. Forgive me if my question is too simple; I do not have a computer science background.
O(n²) is defined as the set of functions R(n) such that, for some constant c, R(n) ≤ c·n² for all sufficiently large n.
That inequality is equivalent to R(n)/n² ≤ c.
So your teacher is suggesting that you plot R(n)/n² to see that, for sufficiently large n, you get (roughly) a constant c.
Note that this plot is not a formal proof. It's just a helpful visualization.
Some maths. (Sorry, but you do need a bit of maths to really understand this.)
The notation f(n) is O(n²) means, roughly, that:
1) define g(n) = f(n) / n²
2) the limit of g(n) as n tends to infinity is C, where C is some constant
Intuitively, that says that g(n) gets closer and closer to C as n gets larger and larger. A mathematician would say that it "converges" to C.
(Actually, the maths is a bit more complicated than that, but I'm very rusty. And I'm trying to keep this simple.)
What your graph is demonstrating is that g(n) does actually converge. And the C value is the value on your Y-axis corresponding to the horizontal line that your graph approaches as the value on the X-axis goes to infinity.
If we say an algorithm is in O(n²), we actually mean the number of steps needed to run the algorithm as a function of input size n is in O(n²).
When we say a function is in O(n²), we actually mean there is a positive constant c so that f(n) < c * n² for big input sizes n.
This is equivalent to saying: there is a positive constant c so that c > f(n) / n² for big input sizes n.
You already noticed that your graph above converges towards a constant value. This is your c.
Of course, the actual running time of an algorithm depends on the machine you use.
To explain it even more simply: an algorithm in O(n²) is like a quadratic function. You can always divide a quadratic function a·n² + b·n + c by n² to get a function that approaches the constant a.
The running time being O(n^2) means there will always exist 2 real positive numbers b and n0 such that, for every n >= n0,
R(n) <= b*(n^2), where R(n) is the running time function.
In other words, the running time stays lower than a function of the form b*(n^2).
If you take that inequality and divide everything by (n^2), you get:
R(n)/(n^2) <= b
and the Big O definition still applies.
So as long as R(n)/(n^2) stays below b for n >= n0, then R(n) is O(n^2). Again, in other words, this means staying below the constant function b (horizontal line).
What you are doing is plotting R(n)/n^2 and checking if the graph stabilizes horizontally, for sufficiently high n.
This is easier than plotting R(n) and trying to see whether the graph is approximately a quadratic function (on a plot, quadratics look quite similar to cubics and higher-order polynomials, don't you agree?). If R(n) were actually O(n^3), R(n)/(n^2) would give you an increasing function, and you'd see that in your graph.
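The check described in the answers above can be simulated with an operation count instead of wall-clock time. This toy sketch (names are mine) counts the steps of a quadratic double loop and prints R(n)/n², which settles at a constant — here exactly 1.0, since the loop does exactly n² steps:

```java
// Empirical version of the R(n)/n^2 check: count the "steps" of a
// quadratic double loop for growing n; the ratio levels off at a constant.
public class BigOCheck {
    // A stand-in for an O(n^2) algorithm: every pair (i, j) costs one step.
    static long steps(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                count++;
        return count;
    }

    // R(n)/n^2: should approach a constant c for an O(n^2) algorithm.
    static double ratio(int n) {
        return steps(n) / (double) (n * (long) n);
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 5000; n += 1000)
            System.out.println(n + " -> " + ratio(n)); // stays at 1.0
    }
}
```

With a real measured running time the ratio would fluctuate for small n (caches, JIT warm-up) before stabilizing, which matches the plot described in the question.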
I have implemented the MFCC algorithm and want to implement BFCC. What are the differences between them, and is it enough just to swap in other functions for frequency-to-mel (2595 * Math.log10(1 + frequency / 700)) and mel-to-frequency (700 * (Math.pow(10, mel / 2595) - 1))? I followed this code: MFCC
PS: Do I need to change the code for the triangular filters?
These are just different scales of representing the frequency spacings of the filters. MFCC uses filters whose center frequencies are spaced along the mel scale, while BFCC will use filters with center frequencies spaced along the bark scale.
The bark scale would simply be represented as:
Bark(f) = 13*arctan(0.00076*f) + 3.5*arctan((f/7500)^2)
where f is the frequency in Hz.
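Side by side with the mel formulas quoted in the question, only the warping function changes; a small sketch (method names are mine):

```java
// Mel vs. Bark: the two warpings used to space filter center frequencies.
public class FrequencyScales {
    // Mel scale, as in the question's code
    static double hzToMel(double f) {
        return 2595.0 * Math.log10(1.0 + f / 700.0);
    }

    static double melToHz(double m) {
        return 700.0 * (Math.pow(10.0, m / 2595.0) - 1.0);
    }

    // Bark scale, using the approximation given above
    static double hzToBark(double f) {
        return 13.0 * Math.atan(0.00076 * f)
             + 3.5 * Math.atan((f / 7500.0) * (f / 7500.0));
    }

    public static void main(String[] args) {
        System.out.println(hzToMel(1000.0));  // ~1000 mel
        System.out.println(hzToBark(1000.0)); // ~8.5 Bark
    }
}
```

Note there is no simple closed-form inverse for the Bark formula (as the update below discusses), whereas melToHz inverts hzToMel exactly.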
Though you can use the bark scale to represent the center frequency spacings, research shows that using either MFCC or BFCC to represent feature vectors of an input speech sample has very little effect on ASR system performance. The industry standard remains MFCC. In fact, I have not heard much about BFCC.
If the code for the computation of filter coefficients is relatively generic and takes center frequencies as an input parameter, then I would say that you are OK. But it is always best to double-check: use MATLAB, plot the frequency responses, and check! You can check out the [following paper][1] for a comparison between MFCC, BFCC, and uniform frequency spacings.
Update 1: The center frequency of a filter is the arithmetic (or geometric) mean of the upper and lower cutoff frequencies of a band-pass filter.
Also, the reverse equation, solving for f given the Bark frequency, is not trivial: it leads to a quadratic equation. One way would be to construct a table for different values of f and Bark and then do a table lookup. I have not been able to find any links to a closed-form reverse equation.
[1]: http://148.204.64.201/paginas%20anexas/voz/articulos%20interesantes/front%20end/MFCC/a-comparative-study-of.pdf
You could instead just select the frequencies of each Bark critical band by hand (a bunch of if's and else's), since there is no exact equation for the Bark critical bands (nor for mel, though there is a pretty close one). Then take the logarithm of the value for each band and apply the DCT; remember this is done per frame. The mel scale is also logarithmic, so there is not much difference between doing MFCC and BFCC.
There is a formula I need to use for my app: here
The part Sqrt(5-25) can be positive or negative, and of course when it is negative we get an imaginary part that Java can't handle.
I've searched around for a complex class to handle that, but found only the basic operations (+-*/).
How can I solve this in Java, knowing I only need the real part? (The imaginary part has no importance.)
Note that I develop on the Android platform.
(I posted here because it concerns a Java application, but if it belongs on math.se, tell me.)
You can simply compute everything beforehand:
plot 25-20+((2Pi0.3²)/(Pi10²)Sqrt[2*980(1+(Pi10²)/(Pi10²))]t)² from 0 to 38
or
plot 25-20+((2*0.3²)/(10²)Sqrt[2*980(1+1)]t)² from 0 to 38
or
25 - 20 + 4 * 0.0000081 * 3920*t^2 from 0 to 38 (I have some factor wrong, but you get the idea)
Just apply basic math to the constants and remove the middle (imaginary) term after applying the second binomial formula.
There is no need for complex numbers here.
Taking the square root of a general complex number can be done with the basic arithmetic operations on real numbers (plus taking square roots of reals): http://www.mathpropress.com/stan/bibliography/complexSquareRoot.pdf. One technique is to utilise De Moivre's theorem: any complex number a + bi can be written as r(cos θ + i sin θ) where
r = sqrt(a^2 + b^2), cos θ = a/r, sin θ = b/r
Update: the formula r(cos θ + i sin θ) is originally due to Euler, whereas De Moivre's theorem is
(a + ib)ⁿ = rⁿ(cos nθ + i sin nθ)
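Putting the polar form to work, a principal complex square root needs only real arithmetic. A minimal sketch (the class is illustrative, returning {re, im} as a two-element array):

```java
// Principal square root of a + bi via the polar form r(cos θ + i sin θ):
// sqrt(z) = sqrt(r) * (cos(θ/2) + i sin(θ/2)), with θ = atan2(b, a).
public class ComplexSqrt {
    static double[] sqrt(double a, double b) {
        double r = Math.hypot(a, b);          // modulus r = sqrt(a^2 + b^2)
        double theta = Math.atan2(b, a);      // argument θ
        double s = Math.sqrt(r);
        return new double[] { s * Math.cos(theta / 2.0),
                              s * Math.sin(theta / 2.0) };
    }

    public static void main(String[] args) {
        double[] z = sqrt(-25.0, 0.0);
        // sqrt(-25) = 5i: real part ~0, imaginary part 5
        System.out.println(z[0] + " + " + z[1] + "i");
    }
}
```

This makes the point of the answers below concrete: for a negative real input the real part of the result is (numerically) zero.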
You are confused about the math. The square root of -25 is the square root of 25*(-1), which is the square root of 25 times the square root of -1, and that is 5i. The real part of that number is 0.
If you want the 5, just check for the sign of the number to be "rooted" and change it if it is negative.
It's not right to say that "Java can't handle it". No language that returns a double from a square root can handle it, but if you have a Complex class it's not an issue. Python has one built in; it's easy to write one in Java.
The square root of a non-negative number is real, and the square root of a negative number is purely imaginary, so its real part is zero. Always.
So ...
public double realPartOfSquareRoot(int i) {
    return (i >= 0) ? Math.sqrt(i) : 0;
}
But how do I solve this? If I replace the square root by 0, I don't get a good result. I suppose the imaginary part does something to the real part in the rest of the formula.
I expect that's so! (The idea of discarding the imaginary part didn't make much sense to me ... but I assumed you had a sound reason to do this.)
The real answer is to find a Java library that will do complex arithmetic. I've never needed to use it, but the first one to examine should be the Apache Commons Math library.
I'm developing (in Java), for fun, an application that uses a unification algorithm.
I have chosen to have my unification algorithm return all possible unifications. For example, if I try to solve
add(X,Y) = succ(succ(0))
it returns
{X = succ(succ(0)), Y = 0}, {X = succ(0), Y = succ(0)}, {X = 0, Y = succ(succ(0))}
However, in some cases, there exists an infinite number of possible unifications
(e.g. X > Y = true).
Does someone know an algorithm to determine whether an infinite number of unifications may be encountered?
Thanks in advance
In the context of Prolog, when you say "unification", you usually mean syntactic unification. Therefore, add(X, Y) and succ(succ(0)) do not unify (as terms), because their functors and arities differ. You seem to be referring to unification modulo theories, where distinct terms like add(X, Y) and succ(succ(0)) can be unified provided some additional equations or predicates are satisfied.
Syntactic unification is decidable, and the number of possible unifiers is infinite if, after applying the most general unifier, you still have variables in both terms. Unification modulo theories is in general not decidable. To see that already basic questions can be hard, consider for example the unification problem N > 2, X^N + Y^N = Z^N over the integers: if you could easily decide algorithmically whether a solution exists (i.e., whether the terms are unifiable modulo integer arithmetic), you would immediately settle Fermat's Last Theorem. Consider also Matiyasevich's theorem and similar undecidability results.
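To make the syntactic case concrete, here is a toy Java unifier (my own minimal term representation; no occurs check, so it is a sketch rather than production code). It finds the mgu {X = b, Y = a} for f(X, a) = f(b, Y) and rejects add(X, Y) = succ(succ(0)) because the functors differ:

```java
import java.util.*;

public class Unify {
    // A term is a variable (upper-case name, no arguments) or a
    // constant/compound term (lower-case functor, zero or more arguments).
    static class Term {
        final String name;
        final List<Term> args;
        Term(String name, Term... args) {
            this.name = name;
            this.args = Arrays.asList(args);
        }
        boolean isVar() {
            return args.isEmpty() && Character.isUpperCase(name.charAt(0));
        }
    }

    // Resolve a chain of variable bindings in the substitution.
    static Term walk(Term t, Map<String, Term> s) {
        while (t.isVar() && s.containsKey(t.name)) t = s.get(t.name);
        return t;
    }

    // Syntactic unification: extends s toward a most general unifier,
    // or returns false on a functor/arity clash.
    static boolean unify(Term a, Term b, Map<String, Term> s) {
        a = walk(a, s);
        b = walk(b, s);
        if (a.isVar() && b.isVar() && a.name.equals(b.name)) return true;
        if (a.isVar()) { s.put(a.name, b); return true; }
        if (b.isVar()) { s.put(b.name, a); return true; }
        if (!a.name.equals(b.name) || a.args.size() != b.args.size()) return false;
        for (int i = 0; i < a.args.size(); i++)
            if (!unify(a.args.get(i), b.args.get(i), s)) return false;
        return true;
    }

    public static void main(String[] args) {
        // f(X, a) = f(b, Y) unifies with mgu {X = b, Y = a}.
        Map<String, Term> s = new HashMap<>();
        boolean ok = unify(new Term("f", new Term("X"), new Term("a")),
                           new Term("f", new Term("b"), new Term("Y")), s);
        System.out.println(ok + " X=" + s.get("X").name + " Y=" + s.get("Y").name);

        // add(X, Y) = succ(succ(0)): functors differ, no syntactic unifier.
        System.out.println(unify(new Term("add", new Term("X"), new Term("Y")),
                                 new Term("succ", new Term("succ", new Term("0"))),
                                 new HashMap<>()));
    }
}
```

If the resulting substitution still contains variables, the terms have infinitely many (ground) unifiers, which is the decidable criterion mentioned above.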
In certain constraint logic programming systems you can easily see whether the solution set is infinite or not. For example, in some CLP(FD) implementations (e.g. SWI-Prolog and Jekejeke Minlog, but not GNU Prolog or B-Prolog, since those assume finite upper/lower bounds) a certain degree of reasoning with infinite integer sets is supported. This is seen in interval notations such as (SWI-Prolog):
?- use_module(library(clpfd)).
true.
?- X #\= 2.
X in inf..1\/3..sup.
But there is a disadvantage to those sets: they cannot be used in CLP(FD) labeling, where the elements of the set are enumerated and a further attempt to solve the instantiated equations is made. Being able to decide general CLP(FD) queries would also run counter to the following result:
"In 1900, in recognition of their depth, David Hilbert proposed the
solvability of all Diophantine problems as the tenth of his
celebrated problems. In 1970, a novel result in mathematical logic
known as Matiyasevich's theorem settled the problem negatively: in
general Diophantine problems are unsolvable."
(From Wikipedia on Diophantine equations)
Another constraint logic programming system that can usually also deal with infinite solution sets is CLP(R). The reasoning among equations is a little stronger there. For example, CLP(FD) does not detect the following inconsistency (this depends on the system: the result below is for SWI-Prolog; Jekejeke Minlog immediately answers No to the second query, and GNU Prolog loops for around 4 seconds and then says No):
?- X #> Y.
Y#=<X+ -1.
?- X #> Y, Y #> X.
X#=<Y+ -1,
Y#=<X+ -1.
On the other hand CLP(R) will find:
?- use_module(library(clpr)).
?- {X > Y}.
{Y=X-_G5542, _G5542 > 0.0}.
?- {X > Y, Y > X}.
false.
Constraint systems work by implementing algorithms from number theory, linear algebra, analysis, etc., depending on the domain they model, i.e. what * denotes in the notation CLP(*). These algorithms can go as far as quantifier elimination.
Bye
I have an arbitrary function or inequality (consisting of a number of trigonometric, logarithmic, exponential, and arithmetic terms) that takes several arguments, and I want to get its range knowing the domains of all the arguments. Are there any Java libraries that can help solve this problem? What are the best practices for doing that? Am I right that for an arbitrary function the only thing that can be done is a brute-force approximation? Also, I'm interested in functions that can build intersections and complements of given domains.
Upd. The functions are entered by the user, so their complexity cannot be predicted. However, if the library handles at least simple cases (1-2 variables, 1-2 terms) it will be OK. I expect the functions will mostly define intervals and contain at most 2 independent variables. For instance, definitions like
y > (x+3), x ∈ [-7;8]
y <= 2x, x ∈ [-∞; ∞]
y = x, x ∈ {1,2,3}
will cover 99% of cases, and handling them will be enough for now.
Well, maybe it's faster to write a simple brute force for handling such cases. It will probably be satisfactory for my case, but if there are better options I would like to learn them.
Notational remark: I assume you want to find the range of the function, i.e. the set of values that the function can take.
I think this problem is not simple. I don't think "brute force" is a solution at all: what does "brute force" even mean when we have continuous intervals (i.e. infinitely many points)?
However, there might be some special cases where this is actually possible. For example, when you take a sin(F(x)) function, you know that its range lies within [-1, 1] regardless of the inner function F(x), or when you take Exp(x) you know the range is (0, +inf).
You could try constructing a syntax tree with information about the ranges associated to each node. Then you could try going bottom-up through the tree to try to compute the information about the actual intervals in which the function values lie.
For example, for the function Sin(x)+Exp(x) and x in (-inf, +inf) you would get a tree
       +          range: [left range] union [right range]
      / \
   sin   exp      range: [-1, 1]   range: (0, +inf)
    |     |
    x     x
so here the result would be [-1, 1] union (0, +inf) = [-1, +inf).
Of course there are many problems with this approach; for example, the operation on ranges for + is not always union. Say you have two functions F(x) = Sin(x) and G(x) = 1 - Sin(x). F has range [-1, 1] and G has range [0, 2], but their sum collapses to {1}. You need to detect and take care of such behaviour, otherwise you will only get a superset of the possible range (a sort of codomain).
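One standard refinement of the union rule sketched above is naive interval arithmetic, where the rule for + is interval addition [a,b] + [c,d] = [a+c, b+d]. It still only over-approximates the true range (it ignores correlations like Sin(x) + (1 - Sin(x))), but it is sound. A toy sketch (names are mine):

```java
// Toy bottom-up range propagation with naive interval arithmetic.
public class IntervalRange {
    // A closed interval [lo, hi]; infinite endpoints use Double infinities.
    static double[] iv(double lo, double hi) { return new double[]{lo, hi}; }

    // Rule for +: interval addition, a sound over-approximation of the
    // range of f + g (correlations between f and g are ignored).
    static double[] add(double[] a, double[] b) {
        return iv(a[0] + b[0], a[1] + b[1]);
    }

    // Over a sufficiently wide input interval, sin covers all of [-1, 1];
    // a tighter analysis for narrow input intervals is omitted here.
    static double[] sinRange(double[] a) {
        return iv(-1.0, 1.0);
    }

    // exp is monotone, so its range over [lo, hi] is [exp(lo), exp(hi)].
    static double[] expRange(double[] a) {
        return iv(Math.exp(a[0]), Math.exp(a[1]));
    }

    public static void main(String[] args) {
        double[] x = iv(Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY);
        double[] r = add(sinRange(x), expRange(x)); // Sin(x) + Exp(x)
        System.out.println("[" + r[0] + ", " + r[1] + "]"); // [-1.0, Infinity]
    }
}
```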
If you provide more examples, maybe someone can propose a better solution, I guess a lot depends on the details of the functions.
@High Performance Mark: I had a look at JAS and it seems that its main purpose is to deal with multivariate polynomial rings, but the question mentions trigonometric, logarithmic, and other transcendental functions, so pure polynomial arithmetic will not be sufficient.
Here's another approach, and depending on how crazy your function can be (see EDIT) it might give you a universal solution to your problem.
Compose the final expression, which might be rather complex.
After that, use numerical methods to find the minimum and maximum of the function; this should give you the resulting range.
EDIT: Only in the case that your final expression is not continuous would the above not work; then you would have to divide it into continuous sections and find the min and max of each. At the end you would take the union of those.
I would have thought that this is a natural problem to tackle with a Computer Algebra System. I googled around and JAS seems to be the most-cited Java CAS.
If I had to confine myself to numeric approaches, then I'd probably tackle it with some variety of interval computations. So: the codomain of sin is [-1, 1], the codomain of exp is (0, +Inf), and the codomain of exp(sin(expression)) is ...
over to you; this is where I'd reach for Mathematica (which is callable from Java).