Add quadratic penalty term to objective function in CPLEX (Java)

I'm developing an optimization tool for a domestic energy system that also contains a battery. All values are correct and the solution makes sense. The problem is that the solution contains very strong fluctuations: the decision variable is often either 0 or the maximum value. To avoid that, I would like to add a quadratic penalty term that penalizes the difference of two consecutive values (something like the derivative). It should look something like this:
((x[t] - x[t-1]) / stepsize) ^ 2
Where x is the decision variable of interest. E.g. power_g_h[t].
My objective function (so far) is defined as follows:
IloLQNumExpr expr = model.lqNumExpr();
for (int t = 0; t < timesteps; t++) {
    expr.addTerm(problem.getCosts().getElectricityCosts(t), power_g_h[t]);
    expr.addTerm(problem.getCosts().getElectricityCosts(t), power_g_b[t]);
    expr.addTerm(problem.getCosts().getElectricityCosts(t), power_g_bev[t]);
    expr.addTerm(problem.getCosts().getFeedCompensation(), power_pv_g[t]);
}
I hope this was somewhat understandable and that someone can tell me whether this is even possible in CPLEX.
If it is not possible, I would be very happy about hints on how to "smoothen" a solution in CPLEX.
With kind regards,
L.

The problem was solved as follows:
It does not seem to be possible to add an expression like x * ((a - b) ^ 2) directly. Instead, the solution was to expand the square and write the above as x*a*a - 2*x*a*b + x*b*b, where x is the penalty factor and a and b are the decision variables. This way the term could be added to the objective function in CPLEX. In code it looks something like this:
IloCplex model = new IloCplex();
...
IloLQNumExpr expr = model.lqNumExpr();
expr.addTerm(x, a, a);      // x * a^2
expr.addTerm(x, b, b);      // x * b^2
expr.addTerm(-2 * x, a, b); // -2 * x * a * b; together: x * (a - b)^2
In my case, a and b were the same variable at two consecutive timesteps, so that the change over time was kept small.
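For reference, here is a minimal sketch of how such a smoothing term could be wired into the original loop. It assumes the variables from the question (model, expr, power_g_h, timesteps); the penalty weight lambda and the stepsize value are placeholders I made up:

double lambda = 0.1;    // hypothetical penalty weight; tune against the cost terms
double stepsize = 0.25; // hypothetical timestep length
double w = lambda / (stepsize * stepsize);
for (int t = 1; t < timesteps; t++) {
    // together these three terms add w * (power_g_h[t] - power_g_h[t-1])^2
    expr.addTerm(w, power_g_h[t], power_g_h[t]);
    expr.addTerm(w, power_g_h[t - 1], power_g_h[t - 1]);
    expr.addTerm(-2 * w, power_g_h[t], power_g_h[t - 1]);
}
model.addMinimize(expr);

The larger lambda is, the smoother the profile, at the expense of the original cost terms.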

Is there a way to compare two methods by function rather than value? [duplicate]

This question already has answers here: Is finding the equivalence of two functions undecidable? (9 answers). Closed 6 years ago.
Is there a way to compare whether two methods are equivalent by function (i.e., they do the same thing) rather than by value (i.e., all of the code in the methods is the same)?
For example, these two methods are coded differently but perform the same function:
public int doIt(int a, int b) {
    a = a + 1;
    b = b + 1;
    return a + b;
}

public int doIt2(int z, int x) {
    int total = z + x + 2;
    return total;
}
I was looking for a way to do this in Eclipse, but am interested in whether this is even possible beyond a trivial method.
The only way to be 100% sure is to mathematically prove it.
There are ways:
1. Theorem proving
2. Model checking
etc.
These approaches can be very hard, though: it can take days to prove equivalence even for trivial programs, and days more to produce an adequate abstraction level.
There are also heuristic approaches, but by their nature they are not 100% accurate.
A simple heuristic approach would be to try both methods on 1000 random inputs and see if the results are the same.
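As an illustration, a minimal sketch of that heuristic in Java (my own, with doIt and doIt2 taken from the question); a single mismatch proves non-equivalence, while agreement on all trials only suggests equivalence:

import java.util.Random;

public class EquivalenceCheck {
    static int doIt(int a, int b) { a = a + 1; b = b + 1; return a + b; }
    static int doIt2(int z, int x) { return z + x + 2; }

    static boolean probablyEquivalent(int trials) {
        Random rng = new Random(42); // fixed seed so runs are reproducible
        for (int i = 0; i < trials; i++) {
            int a = rng.nextInt();
            int b = rng.nextInt();
            if (doIt(a, b) != doIt2(a, b))
                return false; // counterexample found: definitely not equivalent
        }
        return true; // no counterexample: equivalence is only *likely*
    }

    public static void main(String[] args) {
        System.out.println(probablyEquivalent(1000)); // prints true
    }
}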
EDIT:
Here is a list of model checkers I found on Wikipedia. I haven't used any of them; they may not be exactly what you are looking for.
https://en.wikipedia.org/wiki/List_of_model_checking_tools
Ignoring side effects, two functions are functionally equivalent if they produce the same output for the same input.
This will only work for pure code, though. There is no way I know of to monitor for side effects in general, since the side effects a function carries out could be anything.
Note that there is no way to completely verify this without testing every possible input. If the input is just a limited enum, that might be easy. If it is two ints, however, the total number of combinations would be huge.
In general, the purpose of refactoring is to have a function behave the same before and after it is refactored. Developers generally verify this by creating extensive unit tests covering normal, edge, and exception cases.
In the OP's two functions, doIt and doIt2, unit testing would show that they return the same answer for any integer inputs a and b; in fact, because Java's int arithmetic wraps on overflow, the two even agree when a or b is Integer.MAX_VALUE.
But in general, what if an input were the largest value its type can store? What if there were a side effect from a = a + 1?
In such cases, two functions that appear similar on the surface can yield different results.

Declaring a new data type for DNA

I am involved with biology, specifically DNA, and there is often a problem with the size of the data that comes from sequencing a genome.
For those of you who don't have a background in biology, a quick overview of DNA sequencing: DNA consists of four letters, A, T, G, and C, the specific order of which determines what happens in the cell.
A major problem with DNA sequencing technology, however, is the size of the data that results (for a whole genome, often many gigabytes).
I know that the size of an int in C varies from computer to computer, but it still has far more storage than is needed for four choices. Is there a way to define a type for a 'base' that only takes up 2 or 3 bits? I've searched for defining a structure, but I'm afraid that isn't what I'm looking for. Thanks.
Also, would this work better in other languages (maybe a higher-level one like Java)?
Can't you just stuff two ATGC sets into one byte then? Like:
0 1 0 1 1 0 0 1
A T G C A T G C
So this one byte would represent TC,AC?
If you want to use Java, you're going to have to give up some control over how big things are. The smallest primitive you can use, AFAIK, is byte, which is 8 bits (-128 to 127).
Although this is debatable, Java seems more suitable for broad systems control than for the fast, efficient nitty-gritty detail work you would generally do in C.
If there is no requirement that you hold the entire dataset in memory at once, you might even try storing the base information in a managed database like MySQL and reading it in piece by piece.
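To make the packing idea concrete in Java, here is a rough sketch (my own, not from the answer) that stores four bases per byte at 2 bits each on top of a plain byte[]:

public class PackedDna {
    private final byte[] data;
    private final int length;

    public PackedDna(int length) {
        this.length = length;
        this.data = new byte[(length + 3) / 4]; // 4 bases per byte
    }

    // A=0, T=1, G=2, C=3
    private static int code(char base) {
        int c = "ATGC".indexOf(base);
        if (c < 0) throw new IllegalArgumentException("not a base: " + base);
        return c;
    }

    public void set(int i, char base) {
        int shift = (i % 4) * 2;
        data[i / 4] &= ~(0b11 << shift);    // clear the 2-bit slot
        data[i / 4] |= code(base) << shift; // write the new code
    }

    public char get(int i) {
        int bits = (data[i / 4] >> ((i % 4) * 2)) & 0b11;
        return "ATGC".charAt(bits);
    }
}

This cuts memory to a quarter of a char-per-base layout, at the cost of a couple of shift/mask operations per access.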
If I were writing similar code, I would store the nucleotide identifier in a byte, using the values 1, 2, 3, 4 for A, T, G, C. If you later decide to handle RNA as well, you can just add a fifth element, with value 5, for U.
If you are really digging into the project, I would recommend making a class for codons. In this class you can specify whether it is part of an intron/exon, whether it is a start or stop codon, and so on. On top of this, you can make a gene class, where you can specify the promoter regions, etc.
If you will have big sequences of DNA and RNA that need a lot of computing, then I strongly recommend C++, and for scientific computations Fortran. (The full human genome is on the order of 3 Gbp.)
Also, because there are many repetitive sequences, structuring the genome into codons is useful; this way you save a lot of memory (you just have to make a reference to a codon object and do not have to build the object N times).
Furthermore, with codon structuring you can predefine your classes, and there are only 64 of them, so your whole genome becomes just an ordered referencing list. So in my opinion, making the codon the base unit is much more efficient, as sketched below.
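To sketch the codon idea (my own illustration, not the answerer's code): a codon is three bases, so it fits in 6 bits, i.e. an index 0..63 into a table of 64 predefined codon objects:

public final class Codons {
    // A=0, T=1, G=2, C=3, matching a 2-bit-per-base encoding
    private static int code(char base) {
        return "ATGC".indexOf(base);
    }

    // packs three bases into a 6-bit codon index (0..63)
    public static int index(char a, char b, char c) {
        return (code(a) << 4) | (code(b) << 2) | code(c);
    }

    public static void main(String[] args) {
        System.out.println(index('A', 'T', 'G')); // start codon ATG -> 6
    }
}

A genome then becomes an ordered list of such indexes, and per-codon metadata (start/stop, intron/exon) lives once in the 64-entry table.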
The link below is to one of my research papers. Check it out and let me know if you need more details about the implementation, if you find it useful.
GenCodeX - Kaliuday Balleda
Try a char datatype.
It is generally the smallest addressable memory unit in C/C++. On most systems I've used it is 1 byte.
The reason you can't use anything as small as one or two bits is that the CPU already pulls in data at byte granularity or larger.
Take a look at this for more details.
The issue is not just which data type holds the smallest value, but also what the most efficient way to access bit-level memory is.
With my limited knowledge, I might try setting up a bit-array of ints (which are, from my understanding, the most efficient way to access memory for bit-arrays; I may be mistaken in my understanding, but the same principles apply if there is a better one), then using bitwise operators to write and read.
Here is some partial code that should give you an idea of how to proceed with 2-bit definitions and a large array of ints.
Assuming a pointer (a) set to a large array of ints:
unsigned int *a, dna[LARGE_NUMBER]; /* LARGE_NUMBER: as many ints as your data needs */
a = dna;
*a = 0;
Setting up the bit definitions:
For A:
da = 0;
da = ~da;     /* all ones */
da = da << 2; /* ...11111100 */
da = ~da;     /* da == binary 11 */
For G:
dg = 0;
dg = ~dg;
dg = dg << 1;
dg = ~dg;
dg = dg << 1; /* dg == binary 10 */
and so on for T and C.
For the loop (with int b and int i declared beforehand, where i tracks the free bits left in *a and starts at sizeof(int)*8):
while ((b = getchar()) != EOF) {
    unsigned int code;
    if (b == 'a' || b == 'A')
        code = da;
    else if (b == 't' || b == 'T')
        code = dt;
    else if (b == 'g' || b == 'G')
        code = dg;
    else if (b == 'c' || b == 'C')
        code = dc;
    else
        continue;              /* not a base: skip (or report an error) */
    if (i < 2) {               /* the current int is full */
        *++a = 0;              /* advance to the next 32-bit set */
        i = sizeof(int) * 8;   /* bytes into bits */
    }
    *a = (*a << 2) | code;     /* shift first, then OR in the new 2-bit base */
    i -= 2;                    /* keep track of how much unused room is left */
}
And so on. This stores 32 bits in each int, i.e. 16 letters. For array size maximums, see The maximum size of an array in C.
I am speaking only from a novice C perspective. I would think that a machine language would do a better job of what you are asking for specifically, though I'm certain there are high-level solutions out there. I know that Fortran is well regarded in the sciences, but I understand that is due to its computational speed, not necessarily its storage efficiency (though I'm sure it's not lacking there); an interesting read is here: http://arstechnica.com/science/2014/05/scientific-computings-future-can-any-coding-language-top-a-1950s-behemoth/. I would also look into compression, though I sadly have not learned much about it myself.
A source I turned to when I was looking into bit-arrays:
http://www.mathcs.emory.edu/~cheung/Courses/255/Syllabus/1-C-intro/bit-array.html

Compute probability over a multivariate normal

My question addresses both mathematical and CS issues, but since I need a performant implementation I am posting it here.
Problem:
I have an estimated bivariate normal distribution, defined as a Python matrix, but then I will need to port the same computation to Java. (Dummy values here.)
mean = numpy.matrix([[0],[0]])
cov = numpy.matrix([[1,0],[0,1]])
When I receive as input a column vector of integer values (x, y), I want to compute the probability of that given tuple.
value = numpy.matrix([[4],[3]])
probability_of_value_given_the_distribution = ???
Now, from a mathematical point of view, this would be the integral of the probability density function of my normal over 3.5 < x < 4.5 and 2.5 < y < 3.5.
What I want to know:
Is there a way to avoid implementing this explicitly, which would mean dealing with expressions defined over matrices and with double integrals? Besides the time it would take me to implement it myself, it would be computationally expensive. An approximate solution would be perfectly fine for me.
My reasonings:
For a univariate normal, one could simply use the cumulative distribution function (or even store its values for the standard normal and then rescale), but unfortunately there appears to be no closed-form CDF for multivariate normals.
Another univariate approach is to invert the normal approximation of the binomial (i.e., approximate the normal as a binomial), but when extending this to the multivariate case I can't figure out how to account for the covariances.
I really hope someone has already implemented this; I need it soon (I'm finishing my thesis) and I couldn't find anything.
OpenTURNS provides an efficient implementation of the CDF of a multinormal distribution (see the code).
import numpy as np
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.0],[0.0, 1.0]])
Let us create the multinormal distribution with these parameters.
import openturns as ot
multinormal = ot.Normal(mean, ot.CovarianceMatrix(cov))
Now let us compute the probability of the square [3.5, 4.5] x [2.5, 3.5]:
prob = multinormal.computeProbability(ot.Interval([3.5,2.5], [4.5,3.5]))
print(prob)
The computed probability is
1.3701244220201715e-06
If you are looking for the probability density function of a bivariate normal distribution, the few lines below could do the job. Note that the normalizing constant for a d-dimensional normal is (2*pi)^(d/2) * sqrt(det(cov)), so for d = 2 the denominator needs the square root of the determinant:
import numpy as np
def multivariate_pdf(vector, mean, cov):
    diff = vector - mean
    quadratic_form = np.dot(np.dot(diff, np.linalg.inv(cov)), diff)
    return np.exp(-0.5 * quadratic_form) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
mean = np.array([0,0])
cov = np.array([[1,0],[0,1]])
vector = np.array([4,3])
pdf = multivariate_pdf(vector, mean, cov)
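Since the question also asks about Java, here is a hedged sketch (my own; the asker said an approximate solution is fine) that estimates the rectangle probability by Monte Carlo, sampling the bivariate normal through a Cholesky factor of the covariance:

import java.util.Random;

public class MvnRectProbability {
    public static void main(String[] args) {
        double[] mean = {0.0, 0.0};
        // Cholesky factor L of the covariance (cov = L * L^T);
        // for the identity covariance in the question, L is the identity too
        double[][] L = {{1.0, 0.0}, {0.0, 1.0}};
        Random rng = new Random(0);
        long n = 100_000_000L, hits = 0; // large n because the event is rare
        for (long i = 0; i < n; i++) {
            double z0 = rng.nextGaussian(), z1 = rng.nextGaussian();
            double x = mean[0] + L[0][0] * z0;
            double y = mean[1] + L[1][0] * z0 + L[1][1] * z1;
            if (x >= 3.5 && x <= 4.5 && y >= 2.5 && y <= 3.5) hits++;
        }
        System.out.println((double) hits / n); // should be near 1.37e-6
    }
}

Because the probability is only about 1.4e-6, the sample count has to be very large for a stable estimate; a proper bivariate CDF routine (as in the OpenTURNS answer) is far more efficient.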

Java typecasting for retrieving integer part of a double as double

I sometimes use (double)(long)(a*b/c) to store the integer part of the result as a double. This works well for negative numbers too.
Is there a better way to achieve the same thing? I believe typecasting is a costly operation.
Please note I'm looking for the integer part of the number, not the rounded value.
For example:
MyObj.setDouble((double)(long)(522.99))
MyObj.getDouble() returns 522.0 and not 523.0
Thanks.
Try Math.rint(double) or Math.round(double). Regardless of performance differences it's at least more clear and concise.
[Edit]
In response to your clarified question - "how do I get the integer part of a double without casting" (despite your title asking about rounding), try this:
public static double integerPart(double d) {
    return (d <= 0) ? Math.ceil(d) : Math.floor(d);
}
integerPart(522.99); // => 522d
integerPart(-3.19); // => -3d
Of course, this form is likely no faster than casting since it's using a comparison and a method call.
Performance is not an issue here, but the code (double)(long)(a*b/c) is ugly. Note that if a, b, and c are integer types, a*b/c is already an integer, so the casts are no-ops and you can simply write:
double d = a*b/c; // same as double d = (double)(long)(a*b/c); when a, b, c are ints
You never need an explicit cast when moving from a narrower to a wider type. That holds for primitives (e.g. int -> double) and for references (e.g. ArrayList -> List).
What about Math.floor(double)? I can't see the difference between taking the integer part and rounding down.
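For what it's worth, a quick demo (my own addition, not from the thread) of the difference the comment asks about: truncation and Math.floor disagree for negative inputs, which is why integerPart above uses Math.ceil on that side.

public class TruncVsFloor {
    public static void main(String[] args) {
        double d = -3.19;
        System.out.println((double) (long) d); // -3.0: casting truncates toward zero
        System.out.println(Math.floor(d));     // -4.0: floor rounds toward negative infinity
        System.out.println(Math.ceil(d));      // -3.0: matches truncation for negatives
    }
}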

Generating sequentially all combinations of a finite set using lexicographic order and bitwise arithmetic

Consider all combinations of length 3 of the following array of integers: {1,2,3}.
I would like to traverse all combinations of length 3 using the following algorithm from Wikipedia:
// find next k-combination
bool next_combination(unsigned long& x) // assume x has form x'01^a10^b in binary
{
    unsigned long u = x & -x; // extract rightmost bit 1; u = 0'00^a10^b
    unsigned long v = u + x; // set last non-trailing bit 0, and clear to the right; v=x'10^a00^b
    if (v==0) // then overflow in v, or x==0
        return false; // signal that next k-combination cannot be represented
    x = v +(((v^x)/u)>>2); // v^x = 0'11^a10^b, (v^x)/u = 0'0^b1^{a+2}, and x ← x'100^b1^a
    return true; // successful completion
}
What should my starting value be for this algorithm for all combinations of {1,2,3}?
When I get the output of the algorithm, how do I recover the combination?
I've tried the following direct adaptation, but I'm new to bitwise arithmetic and I can't tell whether it is correct.
// find next k-combination, Java
int next_combination(int x)
{
    int u = x & -x;
    int v = u + x;
    if (v == 0)
        return v;
    x = v + (((v ^ x) / u) >> 2);
    return x;
}
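To directly answer the two questions (a sketch of my own, based on how this bit trick is normally used): the starting value is the smallest k-bit mask, (1 << k) - 1, and a combination is recovered by reading the positions of the set bits as indices into the array. The 5-element set below is a hypothetical example, since {1,2,3} has only a single combination of length 3:

public class Combinations {
    static int nextCombination(int x) {
        int u = x & -x;       // extract rightmost set bit
        int v = u + x;        // set last non-trailing 0 bit, clear to the right
        if (v == 0) return 0; // overflow: no next combination
        return v + (((v ^ x) / u) >> 2);
    }

    public static void main(String[] args) {
        int[] set = {1, 2, 3, 4, 5}; // hypothetical larger set
        int n = set.length, k = 3;
        // start from the lowest k bits set: 0b111
        for (int x = (1 << k) - 1; x != 0 && x < (1 << n); x = nextCombination(x)) {
            StringBuilder combo = new StringBuilder();
            for (int i = 0; i < n; i++)
                if ((x & (1 << i)) != 0) combo.append(set[i]).append(' ');
            System.out.println(combo); // 1 2 3, then 1 2 4, 1 3 4, 2 3 4, 1 2 5, ...
        }
    }
}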
I found a class that solves exactly this problem: see the class CombinationGenerator here:
https://bitbucket.org/rayortigas/everyhand-java/src/9e5f1d7bd9ca/src/Combinatorics.java
To recover a combination, do:
for (Long combination : combinationIterator(10, 3))
    toCombination(toPermutation(combination));
Thanks everybody for your input.
I have written a class to handle common functions for working with the binomial coefficient, which is the type of problem your problem falls under. It performs the following tasks:
1. Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters. This method makes solving this type of problem quite trivial.
2. Converts the K-indexes to the proper index of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle. My paper talks about this. I believe I am the first to discover and publish this technique, but I could be wrong.
3. Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it might be faster than the link you have found.
4. Uses Mark Dominus's method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor takes a bool value called InitTable that, when true, creates a generic list to hold the objects to be managed; when false, the table is not created. The table does not need to be created in order to perform the four methods above. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coefficient.
It should not be hard to convert this class to Java.
