mahout Spearmans Correlation java [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I'm using Mahout's KMeansDriver to build clusters, and I want to use Spearman correlation as the DistanceMeasure.
Is there an existing Java implementation of this, or do I need to write it myself?
I couldn't find any examples on the web.

Do not use k-means with other distance measures.
It may stop converging.
K-means is designed to minimize variance. Your distance function must also minimize variance, otherwise you lose the convergence property. For guaranteed convergence with other distances, see partitioning around medoids (PAM) aka k-medoids.
Correlation measures are a good example of distances that do not work with k-means:
Consider the following two vectors and the absolute Spearman correlation distance, dist = 1 - |r|:
1 2 3 4 5
5 4 3 2 1
The Spearman correlation of these two is -1, so their distance is 0 and they are considered "identical".
However, k-means will now compute the mean of these two, which yields the constant vector
3 3 3 3 3
which is as dissimilar to these two as it can be (in fact, its correlation with anything isn't even well defined, because a constant vector has zero variance). In other words: the mean does not minimize the absolute correlation distance, and
you shouldn't use this distance function with k-means.
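The counterexample above can be checked numerically. The following is a minimal sketch in plain Java, not Mahout code: the class and method names are made up for illustration, and Spearman correlation is computed as the Pearson correlation of the ranks (with average ranks for ties).

```java
import java.util.Arrays;

class SpearmanSketch {
    /** 1-based ranks of the values; tied values share their average rank. */
    static double[] ranks(double[] v) {
        int n = v.length;
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> Double.compare(v[a], v[b]));
        double[] r = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            while (j + 1 < n && v[idx[j + 1]] == v[idx[i]]) j++;
            double avg = (i + j) / 2.0 + 1.0; // average rank of the tie group
            for (int k = i; k <= j; k++) r[idx[k]] = avg;
            i = j + 1;
        }
        return r;
    }

    /** Pearson correlation; evaluates to NaN if either input is constant. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy); // 0/0 gives NaN for constant vectors
    }

    static double spearman(double[] x, double[] y) {
        return pearson(ranks(x), ranks(y));
    }

    /** The distance from the example: 1 - |r|. */
    static double distance(double[] x, double[] y) {
        return 1.0 - Math.abs(spearman(x, y));
    }

    /** Component-wise mean of two vectors, i.e. what k-means uses as the centroid. */
    static double[] mean(double[] a, double[] b) {
        double[] m = new double[a.length];
        for (int i = 0; i < a.length; i++) m[i] = (a[i] + b[i]) / 2.0;
        return m;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5};
        double[] b = {5, 4, 3, 2, 1};
        System.out.println("spearman(a, b)    = " + spearman(a, b));
        System.out.println("distance(a, b)    = " + distance(a, b));
        System.out.println("distance(mean, a) = " + distance(mean(a, b), a));
    }
}
```

The last value comes out as NaN: the centroid that k-means would compute has no defined correlation with either of the points it is supposed to represent.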
Variance = squared Euclidean
This is why you should be using k-means only with squared Euclidean distance.
On L2 normalized vectors: Variance ~ Cosine
This is easy to see from the definition of cosine similarity, and it is the reason why spherical k-means also works.
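To make the relation concrete: for unit-length vectors a and b, ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b = 2 - 2 cos(a, b), so squared Euclidean distance and cosine similarity order pairs of points the same way. A small sketch in plain Java that checks this numerically (class and method names are illustrative only):

```java
class CosineVsEuclidean {
    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Scales v to unit L2 norm (assumes v is not the zero vector). */
    static double[] normalize(double[] v) {
        double norm = Math.sqrt(dot(v, v));
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) out[i] = v[i] / norm;
        return out;
    }

    static double squaredEuclidean(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    static double cosine(double[] a, double[] b) {
        return dot(a, b) / Math.sqrt(dot(a, a) * dot(b, b));
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {-2, 0.5, 4};
        double lhs = squaredEuclidean(normalize(a), normalize(b));
        double rhs = 2.0 * (1.0 - cosine(a, b));
        // Both values agree up to floating-point rounding.
        System.out.println(lhs + " vs " + rhs);
    }
}
```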

Related

Converting time domain to frequency domain in Java [closed]

Closed 8 years ago.
I have two arrays containing time and voltage values. I would like to convert the signal from the time domain to the frequency domain in Java, using an FFT. If there is an open source library I could use, please point me to it. I have done some research and found a few algorithms, but they ask for a real part and an imaginary part. If anyone has an idea how I could use them in my context, please let me know.
Code I have found so far
Here is one library:
http://www.fftw.org/download.html
You can also use R with Java. See this link:
Java-R integration?
If you are not familiar with R check their home page r-project dot org (I can't post more links)
While I haven't checked the implementation you link to, you should be able to use it by supplying 0s for the imaginary part. In that case you are going "forward", i.e. set DIRECT to true to transform from the time domain to the frequency domain. The function will return an array containing the real parts of the frequency components at even-numbered indices and the imaginary parts at odd-numbered indices.
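As an illustration of the idea (zeros for the imaginary part, interleaved real/imaginary output), here is a minimal radix-2 FFT sketch in plain Java. It is not the implementation linked in the question; the class and method names are hypothetical, and it assumes that the number of samples is a power of two and that the voltage samples are uniformly spaced in time.

```java
class FftSketch {
    /** In-place iterative radix-2 FFT (forward transform). */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 0 || (n & (n - 1)) != 0) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input by bit-reversed index.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) j ^= bit;
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine sub-transforms of length 2, 4, 8, ...
        for (int len = 2; len <= n; len <<= 1) {
            double ang = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(ang * k), wi = Math.sin(ang * k);
                    int a = start + k, b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr;
                    im[b] = im[a] - ti;
                    re[a] += tr;
                    im[a] += ti;
                }
            }
        }
    }

    /** Real samples in, interleaved spectrum out: [re0, im0, re1, im1, ...]. */
    static double[] forward(double[] samples) {
        double[] re = samples.clone();
        double[] im = new double[samples.length]; // imaginary part is all zeros
        fft(re, im);
        double[] out = new double[2 * samples.length];
        for (int k = 0; k < samples.length; k++) {
            out[2 * k] = re[k];
            out[2 * k + 1] = im[k];
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 8;
        double[] voltage = new double[n];
        for (int t = 0; t < n; t++) voltage[t] = Math.cos(2 * Math.PI * t / n);
        double[] spectrum = forward(voltage);
        for (int k = 0; k < n; k++) {
            System.out.printf("bin %d: re=%.3f im=%.3f%n", k, spectrum[2 * k], spectrum[2 * k + 1]);
        }
    }
}
```

With a sampling rate fs (samples per second, derived from the time array) and n samples, bin k corresponds to the frequency k * fs / n.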

Are there any Java graphic libraries that take in RGB values as inputs? [closed]

Closed 9 years ago.
I apologize if this is the wrong place to ask - please let me know if it is, and I will remove this post.
My question is: are there any Java graphics libraries that can take RGB values as inputs and map those onto a graph? I have been looking at JFreeChart and a number of the open source solutions, but after looking at the documentation, I haven't had much luck.
Currently, I have a multi-dimensional array that stores 1302 RGB values, corresponding to 93 rows and 14 columns. Each "index" stores an RGB string in the format 0,0,0 and I hope to plot each individual color in an x-y graph like this:
In the above graph, the black is a 0,0,0 value, while the cyan, green, red, etc, are all their individual RGB values.
Since you're asking about plotting heat maps, the answer is yes. Many, in fact, but one such library is jHeatChart over at http://tc33.org/projects/jheatchart
Note that just because your values are encoded as "RGB strings" doesn't mean the question is about RGB strings; what you actually want is to plot temperature values. The fact that they're RGB strings is irrelevant, since we can transform them however we need to make them suitable input.
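As a rough illustration of such a transformation, the sketch below parses "r,g,b" strings and paints them as a grid of colored cells into a java.awt.image.BufferedImage. It uses only the JDK, not jHeatChart; the class name, method names and cell size are assumptions made for the example, and the input is assumed to be a rectangular, non-empty array.

```java
import java.awt.image.BufferedImage;

class RgbGridSketch {
    /** Parses a string such as "0,255,255" into a packed 0xRRGGBB int. */
    static int parseRgb(String s) {
        String[] parts = s.trim().split("\\s*,\\s*");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected r,g,b but got: " + s);
        }
        int[] c = new int[3];
        for (int i = 0; i < 3; i++) {
            c[i] = Integer.parseInt(parts[i]);
            if (c[i] < 0 || c[i] > 255) {
                throw new IllegalArgumentException("component out of range: " + s);
            }
        }
        return (c[0] << 16) | (c[1] << 8) | c[2];
    }

    /** Draws each cell of the grid as a cellSize x cellSize square of its color. */
    static BufferedImage render(String[][] cells, int cellSize) {
        int rows = cells.length, cols = cells[0].length;
        BufferedImage img = new BufferedImage(
                cols * cellSize, rows * cellSize, BufferedImage.TYPE_INT_RGB);
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                int rgb = parseRgb(cells[row][col]);
                for (int dy = 0; dy < cellSize; dy++) {
                    for (int dx = 0; dx < cellSize; dx++) {
                        img.setRGB(col * cellSize + dx, row * cellSize + dy, rgb);
                    }
                }
            }
        }
        return img;
    }

    public static void main(String[] args) {
        String[][] cells = {
            {"0,0,0", "0,255,255"},
            {"0,255,0", "255,0,0"},
        };
        BufferedImage img = render(cells, 10);
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}
```

The resulting image can be written to disk with javax.imageio.ImageIO.write(img, "png", file) or drawn in a Swing component.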

Library to segment and classify binary or grayscale images [closed]

Closed 9 years ago.
I am interpreting scientific (STEM) images into their component parts and adding semantics. These images are born digital, noise-free and either binary (monochrome) or have a small number of colours. I would like Java libraries/methods to partition the images into the whitespace-separated components and to identify (classify) the resulting segments. A typical image is:
where I would want the extracted segments to include numerals and other characters (some rotated) and the asterisks in the diagram. [I will use other methods to extract the geometrical components, e.g. the bars.] I would also like the library to identify identical segments (e.g. 6 zero characters, 5 decimal points). I have successfully used Tesseract for characters, but many of the segments may not belong to a Unicode character set (e.g. purpose-created symbols).
UPDATE: I have opened a bounty. I am only interested in libraries, NOT suggestions for algorithms as I have already written a prototype one. If the functionality is part of a larger system (e.g. I think JBIG2 has this functionality) please make it clear where the entry points are.
NOTE: "born-digital" means that the image was created without noise and with clean lines, unlike, say, scanned documents.
I am only aware of OpenCV. With it you can analyze your image like this:
binarize it (if you have a few colors or greyscale)
gather the blobs in Mat objects
get the position of those Mats to get the correct label (there should be one Mat for each letter)
and then apply your algorithm to those Mats
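To show what "gathering blobs" and "getting their positions" amounts to, here is a small sketch in plain Java. It deliberately does not use OpenCV and is not a substitute for it: it assumes the image has already been binarized into a boolean matrix (true = ink), uses a flood fill with 8-connectivity, and returns one bounding box per connected segment. All names are illustrative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

class BlobSketch {
    /** Returns {minRow, minCol, maxRow, maxCol} for each connected group of ink pixels. */
    static List<int[]> findBlobs(boolean[][] ink) {
        int rows = ink.length;
        int cols = rows == 0 ? 0 : ink[0].length;
        boolean[][] seen = new boolean[rows][cols];
        List<int[]> boxes = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (!ink[r][c] || seen[r][c]) continue;
                // Start a new blob and flood-fill it breadth-first.
                int[] box = {r, c, r, c};
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                queue.add(new int[]{r, c});
                seen[r][c] = true;
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    box[0] = Math.min(box[0], p[0]);
                    box[1] = Math.min(box[1], p[1]);
                    box[2] = Math.max(box[2], p[0]);
                    box[3] = Math.max(box[3], p[1]);
                    for (int dr = -1; dr <= 1; dr++) {
                        for (int dc = -1; dc <= 1; dc++) {
                            int nr = p[0] + dr, nc = p[1] + dc;
                            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                            if (ink[nr][nc] && !seen[nr][nc]) {
                                seen[nr][nc] = true;
                                queue.add(new int[]{nr, nc});
                            }
                        }
                    }
                }
                boxes.add(box);
            }
        }
        return boxes;
    }

    public static void main(String[] args) {
        boolean[][] ink = {
            {true,  true,  false, false},
            {false, false, false, true},
            {false, false, false, true},
        };
        for (int[] b : findBlobs(ink)) {
            System.out.println(java.util.Arrays.toString(b));
        }
    }
}
```

Each bounding box can then be cropped out and passed to whatever classifier is applied to the segments.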

Calculating mathematical functions in Android [duplicate]

Closed 9 years ago.
I'm an Android Developer and as part of my next app I will need to evaluate a large variety of user created mathematical expressions and equations. I am looking for a good java library that is lightweight and can evaluate mathematical expressions using user defined variables and constants, trig and exponential functions, etc.
I've looked around and Jep seems to be popular, but I would like to hear more suggestions, especially from people who have used these libraries before.
JEval is a good alternative. I abandoned Jep due to it becoming commercial. The only concern is that JEval seems to be a little dormant at the moment (last release in 2008).
I wrote a simple but capable Math Expression Evaluator a while back, which is free and open-source. Its main advantage is being fast and tiny, both of which are good things on hand-held devices. If it meets your needs, you are welcome to use it.
Primary Features:
Basic math operators, with inferred precedence (^ * × · / ÷ % + -).
Explicit precedence with parentheses.
Implicit multiplication of bracketed subexpressions.
Correct right-associativity of exponentials (power operator).
Direct support for hexadecimal numbers prefixed by 0x.
Constants and variables.
Extensible operators.
Extensible functions.
20 KiB footprint.
Example
MathEval math=new MathEval();
math.setVariable("Top", 5);
math.setVariable("Left", 20);
math.setVariable("Bottom",15);
math.setVariable("Right", 60);
System.out.println("Middle: "+math.evaluate("floor((Right+1-Left)/2)"));
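For readers curious how features such as inferred precedence, parentheses, variables and right-associative exponentiation fit together, here is a minimal recursive-descent sketch in plain Java. It is not the MathEval source and supports far less (no functions, no hexadecimal, no implicit multiplication); the class name TinyEval and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class TinyEval {
    private final Map<String, Double> vars = new HashMap<>();
    private String src;
    private int pos;

    void setVariable(String name, double value) { vars.put(name, value); }

    double evaluate(String expression) {
        src = expression;
        pos = 0;
        double v = expr();
        skipSpaces();
        if (pos < src.length()) {
            throw new IllegalArgumentException("Unexpected '" + src.charAt(pos) + "' at " + pos);
        }
        return v;
    }

    // expr := term (('+' | '-') term)*
    private double expr() {
        double v = term();
        while (true) {
            if (eat('+')) v += term();
            else if (eat('-')) v -= term();
            else return v;
        }
    }

    // term := unary (('*' | '/' | '%') unary)*
    private double term() {
        double v = unary();
        while (true) {
            if (eat('*')) v *= unary();
            else if (eat('/')) v /= unary();
            else if (eat('%')) v %= unary();
            else return v;
        }
    }

    // unary := '-' unary | power
    private double unary() {
        if (eat('-')) return -unary();
        return power();
    }

    // power := primary ('^' unary)?  (recursing on the right makes '^' right-associative)
    private double power() {
        double base = primary();
        if (eat('^')) return Math.pow(base, unary());
        return base;
    }

    // primary := '(' expr ')' | variable | number
    private double primary() {
        if (eat('(')) {
            double v = expr();
            if (!eat(')')) throw new IllegalArgumentException("Expected ')' at " + pos);
            return v;
        }
        int start = pos;
        if (pos < src.length() && Character.isLetter(src.charAt(pos))) {
            while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
            String name = src.substring(start, pos);
            Double v = vars.get(name);
            if (v == null) throw new IllegalArgumentException("Unknown variable: " + name);
            return v;
        }
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    /** Consumes c (after optional whitespace) if it is the next character. */
    private boolean eat(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }
}
```

Each grammar level gets its own method, so operator precedence falls out of the call structure: with Left = 20 and Right = 60, evaluate("(Right+1-Left)/2") gives 20.5, and evaluate("2^3^2") is read as 2^(3^2).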
Try https://code.google.com/p/expressionoasis/. It is an extensible Expression Evaluation framework and will meet such requirements.
This doesn't exactly fit my initial conditions, but I found a wonderful parser written in C++. I'm trying to figure out Android's Native code support to see if I can use it. It's exactly what I need.
Here's the documentation for the project.
http://www.codeproject.com/KB/recipes/MathieuMathParser.aspx
There is a new commercial tool called formula4j, which may be of interest to some.

Fast accurate sparse svd library? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
I'm looking for a fast SVD library, in either C, C++ or Java. Ultimately I'm using Java, but I'm very comfortable using JNA to wrap C++, e.g. http://github.com/hughperkins/jeigen
I'm looking for a fast svd library that will handle sparse matrices. To keep this objective, so that the question doesn't get marked as too subjective, let's say:
targeting use with news20.binary , eg from http://mldata.org/repository/data/viewslug/news20binary/
how long does it take to run?
how much variance is conserved, eg for an S matrix of size 6 or 20?
I looked around at a few libraries and found:
MATLAB: super fast, about 10 seconds, but it's not really a 'library' as such. Average squared projection error: 0.93
redsvd: super fast, about 1 second to run for 6 features, but the average squared projection error is 0.97, which is very high
Eigen's SVD: very slow, and only for dense matrices
svdlibc: ran for 28 minutes before I stopped it; I guess it's calculating the full S, rather than just the first 6 features or so
Basically, I'm looking for a library that gives about the same speed and average squared projection error as matlab, or at least, somewhat comparable.
From my experience, svdlibc is the best library of those options. I've dug a bit through its code before and I don't believe it's calculating the full S matrix (i.e., it is a true "thin svd"). If you can control the matrix representation on disk, svdlibc performs much faster when using the sparse binary input format due to the significantly lower I/O overhead.
The S-Space Package provided an executable jar around the SVDLIBJ java port of SVDLIBC. However, they found it had different results than SVDLIBC for certain input solutions.
