How to calculate max value of function in range? - java

I have some function (for example, double function(double value)) and some range (for example, from A to B). I need to calculate the maximum value of the function over this range. Are there existing libraries for this? Please give me advice.

If the function needs to handle floating-point values, you're going to have to use something like golden section search. Note that this specific method has significant limitations on the functions it can handle (in particular, the function must be unimodal). There are adjustments you can make to the algorithm that extend it to a wider class of functions; in particular, these modifications allow it to work for continuous functions.
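A minimal sketch of a golden-section search for a maximum on [a, b]; the tolerance and the example function are illustrative choices, not something taken from the question:

```java
// Golden-section search for the maximum of a unimodal function on [a, b].
import java.util.function.DoubleUnaryOperator;

public class GoldenSectionMax {
    private static final double INV_PHI = (Math.sqrt(5) - 1) / 2; // ~0.618

    static double argMax(DoubleUnaryOperator f, double a, double b, double tol) {
        double c = b - INV_PHI * (b - a);
        double d = a + INV_PHI * (b - a);
        while (b - a > tol) {
            if (f.applyAsDouble(c) > f.applyAsDouble(d)) {
                b = d;                        // the maximum lies in [a, d]
            } else {
                a = c;                        // the maximum lies in [c, b]
            }
            c = b - INV_PHI * (b - a);
            d = a + INV_PHI * (b - a);
        }
        return (a + b) / 2;                   // x at which f is (approximately) maximal
    }

    public static void main(String[] args) {
        double x = argMax(v -> -(v - 2) * (v - 2) + 3, 0, 5, 1e-9);
        System.out.println(x);                // prints a value close to 2
    }
}
```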

Is this a continuous function, or a set of discrete values? If discrete values, then you can either iterate over all values and set max/min flags as 808sound suggests, or you can load all values into an array.
If it's a continuous function, then you can either populate an array with the function's value at discrete inputs and find the max as above, or, if it's differentiable, you can use basic calculus to find the points at which df(x)/dx is 0. The latter case is a little more abstract, and probably more complicated than you want, though.
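For the sampling option, a minimal sketch; the step count is an arbitrary choice, and a finer grid costs more function evaluations:

```java
// Sample f at evenly spaced points in [a, b] and keep the largest value seen.
import java.util.function.DoubleUnaryOperator;

public class GridMax {
    static double maxOnGrid(DoubleUnaryOperator f, double a, double b, int steps) {
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i <= steps; i++) {
            double x = a + (b - a) * i / steps;
            best = Math.max(best, f.applyAsDouble(x));
        }
        return best;
    }
}
```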
A quick google search led me to this:
http://code.google.com/p/javacalculus/
But I've never used it myself, so I don't know if that implements the required functionality. It does differential equations, though, so I assume they'd have "baby stuff" like basic differentiation.

I do not know if there are any libraries in Java for your problem.
But I know you can easily do that with MATLAB (or Octave for the open-source equivalent).

If you do not have any indication of what the functions inner workings are (i.e. the function is a black box that accepts an input and produces an output), there is no "easy" way to find the global maximum.
There are an infinite number of points to choose for your input (technically) so "iterating over all possible inputs" is not feasible mathematically.
There are various algorithms that will give you an estimated maximum value of a function like this:
The hill climbing algorithm, and the firefly algorithm are two, but there are many more. This is a fairly well documented/studied computer science problem and there is a lot of material online for you to look at. I suggest starting with the hill climbing algorithm, and maybe expanding out to other global optimization algorithms.
Note: these algorithms do not guarantee that the result is the maximum, but they provide an estimate of its value.
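A hedged sketch of simple hill climbing on a 1-D black-box function; the random restarts, starting step size, and stopping threshold are all arbitrary illustrative choices:

```java
// Hill climbing with random restarts: move toward a better neighbour while one
// exists, shrink the step when stuck, and keep the best value over all restarts.
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

public class HillClimb {
    static double estimateMax(DoubleUnaryOperator f, double lo, double hi, int restarts) {
        Random rnd = new Random();
        double best = Double.NEGATIVE_INFINITY;
        for (int r = 0; r < restarts; r++) {
            double x = lo + rnd.nextDouble() * (hi - lo);   // random starting point
            double step = (hi - lo) / 10;
            while (step > 1e-9) {
                double left = Math.max(lo, x - step);
                double right = Math.min(hi, x + step);
                if (f.applyAsDouble(left) > f.applyAsDouble(x)) x = left;
                else if (f.applyAsDouble(right) > f.applyAsDouble(x)) x = right;
                else step /= 2;                             // no better neighbour: refine
            }
            best = Math.max(best, f.applyAsDouble(x));
        }
        return best;   // an estimate of the global maximum, not a guarantee
    }
}
```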

Related

Programmatic way of counting floating point operations (JAVA)

I'm looking for a programmatic way of counting the number of floating point operations (flops) in a call to a function, in Java.
There are several closely related questions, asking what floating-point operations are and how to do big-O computational complexity analysis, for example here, here and here. But note that in my application I don't want a big-O number; I want to know, for a particular run of a function (i.e. a particular input data size), how many flops it took.
The two closest solutions I can find are (1) suggestions to use a run-time profiler to count the number of flops, but this does not suit my needs as I need to use the result later in the program and (2) a library of computation functions which can be called to increment a counter, and a closely related suggestion here.
These last two suggestions would meet my needs but involve a lot of manual modifications to the code I need to count. An alternative would be to just use CPU run-time which would be very quick and easy, but also quite rough.
Is anyone aware of a programmatic way of counting the flops executed by a section of code?
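For reference, a minimal sketch of the counter-based approach mentioned in (2); the class and method names are made up for illustration and are not from any existing library:

```java
// Wrap each floating-point operation in a helper that increments a counter.
// Counting is exact only for code rewritten to route every operation through
// these helpers, which is the manual modification the question wants to avoid.
public class FlopCounter {
    private long flops = 0;

    double add(double a, double b) { flops++; return a + b; }
    double mul(double a, double b) { flops++; return a * b; }

    long count() { return flops; }

    // Example: count the flops in a dot product of two arrays.
    double dot(double[] x, double[] y) {
        double acc = 0;
        for (int i = 0; i < x.length; i++) {
            acc = add(acc, mul(x[i], y[i]));
        }
        return acc;
    }
}
```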

Concerning a recommendation engine

What's a fast "if user A and user B like product C, they might be interested to follow each other" algorithm? I don't think that calculating their similarity at runtime is smart enough, because it will slow down the response. On the other hand, computing an overnight index will require making (N*N-1) different runs, where N is the number of users ... not very clever either. Plus, every time a user likes a new product, or a new user registers, the indexes have to be recomputed.
What's the smartest thing which could be applied here? Some sort of ultrafast hashing, to which then only the new items are added?
Well, among the algorithms I studied in a course at university, there was one dealing with things like this. The recommended approach was to compute a "similarity" index for each pair of users (which I guess is your N*N method mentioned above) and then, based on this, determine which users a particular user is closest to.
Of course, you are not required to immediately recalculate the similarity indexes for every change, just once in a while, somewhat like a search engine crawler works. In fact, once you have computed the initial index, you can use various heuristic methods to recompute more often for users who change their preferences fast, and much slower for those who only change them rarely.
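As a concrete illustration of such a pairwise similarity index, a minimal sketch using Jaccard similarity over the sets of products each user likes; the types and names are illustrative:

```java
// Jaccard similarity of two users: |A ∩ B| / |A ∪ B| over their liked product ids.
import java.util.HashSet;
import java.util.Set;

public class UserSimilarity {
    static double jaccard(Set<Integer> likesA, Set<Integer> likesB) {
        if (likesA.isEmpty() && likesB.isEmpty()) return 0.0;
        Set<Integer> intersection = new HashSet<>(likesA);
        intersection.retainAll(likesB);
        Set<Integer> union = new HashSet<>(likesA);
        union.addAll(likesB);
        return (double) intersection.size() / union.size();
    }
}
```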
Have you thought about an RDF database?
Like OWLIM http://www.ontotext.com/owlim

Most Efficient way to calculate integrals/derivatives of inputted functions in Java?

I now have an idea: I take the function as a string, calculate the real integral by hand, and ask the user what the definite integral is, but that isn't a real solution.
I was wondering if there was a way to input a function and output an integral/derivative (depending on user choice). My initial step was to put it into an array somehow, but given the many types of functions, this wasn't happening.
I researched everywhere, and I haven't found a method that actually does this with no additional code, nor any code that actually does this, period.
Also, I want to see if there is a way to make a GUI and plot inputted functions on it, if that's possible too.
Thanks :)
What you're describing is known as symbolic integration. There's currently no fully general way to implement it, but there are some techniques available. One such is the Risch algorithm.
Alternatively, an easier problem than symbolic integration is symbolic differentiation: if the derivative of the user's input is equivalent* to the expression which they were asked to integrate, then their integral is probably correct.
You may also want to consider using an existing CAS**, such as Mathematica, to implement this. They've already implemented most of the tools you're after.
*: Keep in mind, though, that two mathematical expressions may be equivalent without being identical, either in trivial ways (e.g., terms in a different order), more complex ones (e.g., large expressions factored differently), or fundamentally (e.g., trig functions replaced with complex exponentials or vice versa).
**: Computer algebra system
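To illustrate the symbolic-differentiation route, a toy sketch that differentiates a tiny expression tree (constants, the variable x, sums, and products). This is nothing like a full CAS, the type names are invented for the example, and it assumes Java 16+ for records:

```java
// d/dx on a minimal expression tree: constant rule, x' = 1, sum rule, product rule.
interface Expr {
    Expr diff();                 // derivative with respect to x
    String show();
}

record Const(double v) implements Expr {
    public Expr diff() { return new Const(0); }
    public String show() { return Double.toString(v); }
}

record Var() implements Expr {   // the variable x
    public Expr diff() { return new Const(1); }
    public String show() { return "x"; }
}

record Sum(Expr a, Expr b) implements Expr {
    public Expr diff() { return new Sum(a.diff(), b.diff()); }
    public String show() { return "(" + a.show() + " + " + b.show() + ")"; }
}

record Prod(Expr a, Expr b) implements Expr {
    public Expr diff() {         // product rule: (ab)' = a'b + ab'
        return new Sum(new Prod(a.diff(), b), new Prod(a, b.diff()));
    }
    public String show() { return "(" + a.show() + " * " + b.show() + ")"; }
}
```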
Javacalculus is what you are looking for.
Good luck!

How do I find the minimum/maximum values to use with Root Solvers?

I want to use the root solvers (e.g. BrentSolver) in Commons Math to find roots of polynomial functions, but they all seem to require an initial estimate of the minimum/maximum bounds of an interval at whose endpoints the function has different signs.
So how do I go about doing this? I know I can compute f(x) for points inside whatever interval I have in mind, but if my interval is too big, do I still do that? How big should the step be between attempts? Isn't there a better way to do this?
You might try the Durand-Kerner-Weierstrass method as an estimate or check. A Java implementation is shown here.
I think what they want is a starting interval to search in. The min and max values define the region where you think the roots are.
I don't know what you mean by "interval too big". It won't be +/- infinity; you must have some region of interest to start with.
Run it once; see what you get. Try a few other intervals to see if you can find a true global min/max.
It's not possible to use numerical methods as a complete black box. You have to know something about your function and how the methods work. Use them as an iterative tool to learn something about your function of interest.
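Putting this together, a hedged sketch: step across the region of interest until the function changes sign, then hand that bracket to Commons Math's BrentSolver. This assumes commons-math3, and the step count of 1000 is an arbitrary choice:

```java
import org.apache.commons.math3.analysis.UnivariateFunction;
import org.apache.commons.math3.analysis.solvers.BrentSolver;

public class BracketAndSolve {
    // Returns the first root found in [lo, hi], or throws if no sign change is seen.
    static double firstRoot(UnivariateFunction f, double lo, double hi) {
        int steps = 1000;
        double prevX = lo;
        double prevY = f.value(lo);
        for (int i = 1; i <= steps; i++) {
            double x = lo + (hi - lo) * i / steps;
            double y = f.value(x);
            if (prevY == 0) return prevX;            // landed exactly on a root
            if (prevY * y < 0) {                     // sign change: a root lies in [prevX, x]
                return new BrentSolver().solve(100, f, prevX, x);
            }
            prevX = x;
            prevY = y;
        }
        throw new IllegalStateException("no sign change found in [" + lo + ", " + hi + "]");
    }
}
```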

How do I determine a best-fit distribution in java?

I have a bunch of sets of data (between 50 to 500 points, each of which can take a positive integral value) and need to determine which distribution best describes them. I have done this manually for several of them, but need to automate this going forward.
Some of the sets are completely modal (every datum has the value of 15), some are strongly modal or bimodal, some are bell curves (often skewed and with differing degrees of kurtosis/pointiness), some are roughly flat, and there are any number of other possible distributions (Poisson, power-law, etc.). I need a way to determine which distribution best describes the data and (ideally) also provides me with a fitness metric so that I know how confident I am in the analysis.
Existing open-source libraries would be ideal, followed by well documented algorithms that I can implement myself.
Looking for a distribution that fits is unlikely to give you good results in the absence of some a priori knowledge. You may find a distribution that coincidentally is a good fit but is unlikely to be the underlying distribution.
Do you have any metadata available that would hint at what the data means? E.g., "this is open-ended data sampled from a natural population, so it's some sort of normal distribution", vs. "this data is inherently bounded at 0 and discrete, so check for the best-fitting Poisson".
I don't know of any distribution solvers for Java off the top of my head, and I don't know of any that will guess which distribution to use. You could examine some statistical properties (skew/etc.) and make some guesses here--but you're more likely to end up with an accidentally good fit which does not adequately represent the underlying distribution. Real data is noisy and there are just too many degrees of freedom if you don't even know what distribution it is.
This may be above and beyond what you want to do, but it seems the most complete approach (and it allows access to the wealth of statistical knowledge available inside R):
use JRI to communicate with the R statistical language
use R, internally, as indicated in this thread
Look at Apache commons-math.
What you're looking for comes under the general heading of "goodness of fit." You could search on "goodness of fit test."
Donald Knuth describes a couple popular goodness of fit tests in Seminumerical Algorithms: the chi-squared test and the Kolmogorov-Smirnov test. But you've got to have some idea first what distribution you want to test. For example, if you have bell curve data, you might try normal or Cauchy distributions.
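As an illustration of the chi-squared test, a minimal sketch that computes Pearson's statistic from observed bin counts and the counts a candidate distribution predicts; producing the expected counts and turning the statistic into a p-value is left to the caller:

```java
// Pearson's chi-squared statistic: sum of (observed - expected)^2 / expected over bins.
public class ChiSquared {
    static double statistic(long[] observed, double[] expected) {
        double chi2 = 0;
        for (int i = 0; i < observed.length; i++) {
            double diff = observed[i] - expected[i];
            chi2 += diff * diff / expected[i];
        }
        // Compare against a chi-squared quantile whose degrees of freedom account
        // for the number of bins and any parameters estimated from the data.
        return chi2;
    }
}
```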
If all you really need the distribution for is to model the data you have sampled, you can make your own empirical distribution based on that data (a code sketch follows these steps):
1. Create a histogram of your sample: One method for selecting the bin size is here. There are other methods for selecting bin size, which you may prefer.
2. Derive the sample CDF: Think of the histogram as your PDF, and just compute the integral. It's probably best to scale the height of the bins so that the CDF has the right characteristics ... namely that the value of the CDF at +Infinity is 1.0.
To use the distribution for modeling purposes:
3. Draw X from your distribution: Make a draw Y from U(0,1). Use a reverse lookup on your CDF of the value Y to determine the X such that CDF(X) = Y. Since the CDF is invertible, X is unique.
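A minimal sketch of these steps; the bin count is an arbitrary choice, and the inverse lookup below returns bin midpoints rather than interpolating:

```java
// Build a histogram, turn it into an empirical CDF, and sample by inverting the CDF.
import java.util.Arrays;
import java.util.Random;

public class EmpiricalDist {
    private final double[] cdf;        // cumulative probability at the right edge of each bin
    private final double min;
    private final double binWidth;

    EmpiricalDist(double[] sample, int bins) {
        min = Arrays.stream(sample).min().orElse(0);
        double max = Arrays.stream(sample).max().orElse(1);
        binWidth = (max - min) / bins;
        long[] counts = new long[bins];
        for (double v : sample) {
            int b = Math.min(bins - 1, (int) ((v - min) / binWidth));
            counts[b]++;
        }
        cdf = new double[bins];
        double running = 0;
        for (int i = 0; i < bins; i++) {
            running += (double) counts[i] / sample.length;
            cdf[i] = running;          // reaches 1.0 at the last bin, as a CDF must
        }
    }

    /** Step 3: draw Y from U(0,1) and invert the CDF by a linear scan over the bins. */
    double draw(Random rnd) {
        double y = rnd.nextDouble();
        for (int i = 0; i < cdf.length; i++) {
            if (y <= cdf[i]) return min + (i + 0.5) * binWidth;   // bin midpoint
        }
        return min + cdf.length * binWidth;    // only reachable through rounding error
    }
}
```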
I've heard of a package called Eureqa that might fill the bill nicely. I've only downloaded it; I haven't tried it myself yet.
You can proceed with a three-step approach, using the SSJ library:
Fit each distribution separately using maximum likelihood estimation (MLE). Using SSJ, this can be done with the static method getInstanceFromMLE(double[] x, int n) available on each distribution.
For each distribution you have obtained, compute its goodness of fit with the real data, for example using Kolmogorov-Smirnov: static void kolmogorovSmirnov(double[] data, ContinuousDistribution dist, double[] sval, double[] pval). Note that you don't need to sort the data before calling this function.
Pick the distribution having the highest p-value as your best-fit distribution.
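A sketch of those three steps, with heavy caveats: the package names (umontreal.ssj.*) and the layout of the sval/pval arrays are assumptions about current SSJ releases (older releases use umontreal.iro.lecuyer.*); the method signatures themselves are the ones quoted above:

```java
import umontreal.ssj.gof.GofStat;
import umontreal.ssj.probdist.ContinuousDistribution;
import umontreal.ssj.probdist.ExponentialDist;
import umontreal.ssj.probdist.NormalDist;

public class BestFit {
    static ContinuousDistribution bestFit(double[] data) {
        // 1. Fit each candidate distribution by maximum likelihood.
        ContinuousDistribution[] candidates = {
            NormalDist.getInstanceFromMLE(data, data.length),
            ExponentialDist.getInstanceFromMLE(data, data.length)
        };
        // 2. and 3. Score each fit with Kolmogorov-Smirnov and keep the highest p-value.
        ContinuousDistribution best = null;
        double bestP = -1;
        for (ContinuousDistribution dist : candidates) {
            double[] sval = new double[3];   // assumed layout: D+, D-, D statistics
            double[] pval = new double[3];   // assumed layout: matching p-values
            GofStat.kolmogorovSmirnov(data, dist, sval, pval);
            if (pval[2] > bestP) {
                bestP = pval[2];
                best = dist;
            }
        }
        return best;
    }
}
```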
