Matrix Operation in Spark MLlib in Java


This question is about MLlib (Spark 1.2.1+).
What is the best way to manipulate local matrices (moderate size, under 100x100, so they do not need to be distributed)?
For instance, after computing the SVD of a dataset, I need to perform some matrix operations.
RowMatrix only provides a multiply function. The toBreeze method returns a DenseMatrix<Object>, but the API does not seem Java-friendly:
public final <TT,B,That> That $plus(B b, UFunc.UImpl2<OpAdd$,TT,B,That> op)
In Spark+Java, how can I do any of the following operations:
transpose a matrix
add/subtract two matrices
crop a Matrix
perform element-wise operations
etc
Javadoc RowMatrix: https://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/linalg/distributed/RowMatrix.html
RDD<Vector> data = ...;
RowMatrix matrix = new RowMatrix(data);
SingularValueDecomposition<RowMatrix, Matrix> svd = matrix.computeSVD(15, true, 1e-9d);
RowMatrix U = svd.U();
Vector s = svd.s();
Matrix V = svd.V();
//Example 1: How to compute transpose(U)*matrix
//Example 2: How to compute transpose(U(:,1:k))*matrix
EDIT: Thanks to dlwh for pointing me in the right direction; the following solution works:
import no.uib.cipr.matrix.DenseMatrix;
// ...
RowMatrix U = svd.U();
DenseMatrix U_mtj = new DenseMatrix((int) U.numCols(), (int) U.numRows(), U.toBreeze().toArray$mcD$sp(), true);
// From there, matrix operations are available on U_mtj

Breeze just doesn't provide a Java-friendly API. (And, speaking as the main author, I have no plans to: it would hamstring the API too much.)
You can probably exploit the fact that MTJ uses the same dense matrix representation as we do. (Well, almost. Their API doesn't expose majorStride, but that shouldn't be an issue for you.)
That is, you can do something like this:
import no.uib.cipr.matrix.DenseMatrix;
// ...
breeze.linalg.DenseMatrix[Double] Ubreeze = U.toBreeze();
new DenseMatrix(Ubreeze.cols(), Ubreeze.rows(), Ubreeze.data());

Related

How to find the optimal K for clustering? [duplicate]

I've been studying about k-means clustering, and one thing that's not clear is how you choose the value of k. Is it just a matter of trial and error, or is there more to it?

You can maximize the Bayesian Information Criterion (BIC):
BIC(C | X) = L(X | C) - (p / 2) * log n
where L(X | C) is the log-likelihood of the dataset X according to model C, p is the number of parameters in the model C, and n is the number of points in the dataset.
See "X-means: extending K-means with efficient estimation of the number of clusters" by Dan Pelleg and Andrew Moore in ICML 2000.
Another approach is to start with a large value for k and keep removing centroids (reducing k) until it no longer reduces the description length. See "MDL principle for robust vector quantisation" by Horst Bischof, Ales Leonardis, and Alexander Selb in Pattern Analysis and Applications vol. 2, p. 59-72, 1999.
Finally, you can start with one cluster, then keep splitting clusters until the points assigned to each cluster have a Gaussian distribution. In "Learning the k in k-means" (NIPS 2003), Greg Hamerly and Charles Elkan show some evidence that this works better than BIC, and that BIC does not penalize the model's complexity strongly enough.
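As a minimal sketch of the BIC score quoted earlier in this answer, the helper below evaluates L(X | C) - (p / 2) * log n for a few candidate models and keeps the one with the largest score. The function name bic_score, the log-likelihood values, and the parameter count per cluster are made up for illustration; in practice the log-likelihood comes from the fitted model.

```python
import math

def bic_score(log_likelihood, num_params, num_points):
    """BIC(C | X) = L(X | C) - (p / 2) * log n (larger is better in this form)."""
    return log_likelihood - 0.5 * num_params * math.log(num_points)

# Hypothetical log-likelihoods for models with k = 1..4 clusters on n = 1000 points.
# Assumption: 2-D data with 3 parameters per cluster (two centroid coordinates, one variance).
n = 1000
log_likelihoods = {1: -5200.0, 2: -4100.0, 3: -4085.0, 4: -4080.0}

scores = {k: bic_score(ll, 3 * k, n) for k, ll in log_likelihoods.items()}
best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

With these invented numbers the likelihood keeps improving as k grows, but the penalty term makes k = 3 score best, which is the trade-off the criterion is meant to capture.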

Basically, you want to find a balance between two variables: the number of clusters (k) and the average variance of the clusters. You want to minimize the former while also minimizing the latter. Of course, as the number of clusters increases, the average variance decreases (up to the trivial case of k=n and variance=0).
As always in data analysis, there is no one true approach that works better than all others in all cases. In the end, you have to use your own best judgement. For that, it helps to plot the number of clusters against the average variance (which assumes that you have already run the algorithm for several values of k). Then you can use the number of clusters at the knee of the curve.
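One rough way to pick the knee without reading it off a plot is sketched below. It is only a simple heuristic and not taken from the answer: both axes are scaled to [0, 1] and the chosen k is the one whose point lies farthest below the straight line joining the first and last points. It assumes the variance decreases with k, so the first value is the largest and the last is the smallest. The function name and the variance values are invented for illustration.

```python
def knee_by_chord_distance(ks, variances):
    """Return the k whose (k, variance) point lies farthest below the chord
    joining the first and last points, after scaling both axes to [0, 1]."""
    k_min, k_max = ks[0], ks[-1]
    v_min, v_max = min(variances), max(variances)
    best_k, best_gap = ks[0], float("-inf")
    for k, v in zip(ks, variances):
        x = (k - k_min) / (k_max - k_min)
        y = (v - v_min) / (v_max - v_min)
        gap = 1.0 - x - y  # proportional to the distance from the line x + y = 1
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k

ks = [1, 2, 3, 4, 5, 6, 7]
avg_variance = [100.0, 40.0, 15.0, 12.0, 10.0, 9.0, 8.0]  # made-up values
print(knee_by_chord_distance(ks, avg_variance))
```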

Yes, you can find the best number of clusters using the elbow method, but I found it troublesome to extract that value from the elbow graph in a script. You can inspect the elbow graph and find the elbow point yourself, but finding it programmatically was a lot of work.
So another option is to use the silhouette method. In R, its result agreed completely with the result from the elbow method.
Here's what I did.
# Dataset for clustering
n = 150
g = 6
set.seed(g)
d <- data.frame(x = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))),
                y = unlist(lapply(1:g, function(i) rnorm(n/g, runif(1)*i^2))))
mydata <- d
# Arrange the plots in a 3x2 grid
par(mfrow=c(3,2))
# Plot the original dataset
plot(mydata$x, mydata$y, main="Original Dataset")
# Scree plot to determine the number of clusters
wss <- (nrow(mydata)-1)*sum(apply(mydata,2,var))
for (i in 2:15) {
  wss[i] <- sum(kmeans(mydata,centers=i)$withinss)
}
plot(1:15, wss, type="b", xlab="Number of Clusters", ylab="Within groups sum of squares")
# Ward hierarchical clustering
d <- dist(mydata, method = "euclidean") # distance matrix
fit <- hclust(d, method="ward.D") # "ward" was renamed "ward.D" in R 3.1.0
plot(fit) # display dendrogram
groups <- cutree(fit, k=5) # cut tree into 5 clusters
# draw dendrogram with red borders around the 5 clusters
rect.hclust(fit, k=5, border="red")
# Silhouette analysis for determining the number of clusters
library(fpc)
library(cluster) # provides pam()
asw <- numeric(20)
for (k in 2:20)
  asw[[k]] <- pam(mydata, k)$silinfo$avg.width
k.best <- which.max(asw)
cat("silhouette-optimal number of clusters:", k.best, "\n")
plot(pam(d, k.best))
# K-means cluster analysis
fit <- kmeans(mydata, k.best)
mydata
# get cluster means
aggregate(mydata, by=list(fit$cluster), FUN=mean)
# append cluster assignment
mydata <- data.frame(mydata, clusterid=fit$cluster)
plot(mydata$x, mydata$y, col = fit$cluster, main="K-means Clustering results")
Hope it helps!!

Maybe a beginner like me is looking for a code example. Information about silhouette_score
is available here.
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
range_n_clusters = [2, 3, 4]  # range of cluster counts you want to try
# sample data; silhouette_score needs more samples than clusters
dataToFit = [[12, 23], [14, 25], [112, 46], [110, 44], [45, 23], [47, 21]]
best_clusters = 0  # best number of clusters found so far
previous_silh_avg = -1.0  # silhouette scores lie in [-1, 1]
for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters, random_state=0)
    cluster_labels = clusterer.fit_predict(dataToFit)
    silhouette_avg = silhouette_score(dataToFit, cluster_labels)
    if silhouette_avg > previous_silh_avg:
        previous_silh_avg = silhouette_avg
        best_clusters = n_clusters
# Final KMeans for best_clusters
kmeans = KMeans(n_clusters=best_clusters, random_state=0).fit(dataToFit)

Look at this paper, "Learning the k in k-means" by Greg Hamerly, Charles Elkan. It uses a Gaussian test to determine the right number of clusters. Also, the authors claim that this method is better than BIC which is mentioned in the accepted answer.

There is something called Rule of Thumb. It says that the number of clusters can be calculated by
k = (n/2)^0.5
where n is the total number of elements from your sample.
You can check the veracity of this information on the following paper:
http://www.ijarcsms.com/docs/paper/volume1/issue6/V1I6-0015.pdf
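As a quick worked example of the rule of thumb above, the snippet below evaluates k = (n/2)^0.5 for a few sample sizes. Rounding to the nearest integer and keeping k at least 1 are my own assumptions, since the formula itself does not say how to turn the result into a cluster count.

```python
import math

def rule_of_thumb_k(n):
    """k = (n / 2) ** 0.5, rounded to the nearest integer and kept at least 1."""
    return max(1, round(math.sqrt(n / 2)))

for n in (50, 200, 1000):
    print(n, rule_of_thumb_k(n))
```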
There is also another method called G-means, which assumes that the data in each cluster follow a Gaussian (normal) distribution.
It consists of increasing k until every one of your k groups follows a Gaussian distribution.
It requires a fair amount of statistics but can be done.
Here is the source:
http://papers.nips.cc/paper/2526-learning-the-k-in-k-means.pdf
I hope this helps!

If you don't know the number of clusters k to pass to k-means, there are four ways to find it automatically:
G-means algorithm: it discovers the number of clusters automatically, using a statistical test to decide whether to split a k-means center into two. The algorithm takes a hierarchical approach, based on a statistical test of the hypothesis that a subset of the data follows a Gaussian distribution (a continuous distribution that approximates the exact binomial distribution of events); if the test fails, it splits the cluster. It starts with a small number of centers, say one cluster only (k=1), then splits it into two centers (k=2) and splits each of these two centers again (k=4), giving four centers in total. If G-means does not accept these four centers, the answer is the previous step: two centers in this case (k=2). This is the number of clusters your dataset will be divided into. G-means is very useful when you have no estimate of the number of clusters you will get after grouping your instances. Note that a poor choice of the k parameter can give you wrong results. The parallel version of G-means is called p-means.
X-means: an algorithm that efficiently searches the space of cluster locations and numbers of clusters to optimize the Bayesian Information Criterion (BIC) or the Akaike Information Criterion (AIC). This version of k-means finds the number k and also accelerates k-means.
Online k-means or streaming k-means: it runs k-means in a single pass over the data and is meant to find the number of clusters automatically. Spark implements streaming k-means (StreamingKMeans in MLlib), although that implementation takes k as a parameter.
MeanShift algorithm: a nonparametric clustering technique that does not require prior knowledge of the number of clusters and does not constrain the shape of the clusters. Mean shift clustering aims to discover "blobs" in a smooth density of samples. It is a centroid-based algorithm that works by updating candidate centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates and form the final set of centroids.

First build a minimum spanning tree of your data.
Removing the K-1 most expensive edges splits the tree into K clusters,
so you can build the MST once, look at cluster spacings / metrics for various K,
and take the knee of the curve.
This works only for single-linkage clustering,
but for that it's fast and easy. Plus, MSTs make good visuals.
See for example the MST plot under
stats.stackexchange visualization software for clustering.
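The idea above can be sketched in a few lines. This is only an illustration under my own assumptions: instead of building the full MST and then deleting edges, it runs Kruskal's algorithm and stops once K components remain, which gives the same partition as removing the K-1 most expensive MST edges (up to ties). The function name mst_clusters and the sample points are made up, and the all-pairs edge list is only practical for small datasets.

```python
import math

def mst_clusters(points, k):
    """Single-linkage clustering: run Kruskal's algorithm but stop when k
    components remain, which matches cutting the k-1 longest MST edges."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # All pairwise edges, cheapest first (O(n^2) memory).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    components = n
    for _, i, j in edges:
        if components == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]

points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (30, 0)]
print(mst_clusters(points, 3))
```

Because the edges are sorted once, trying several values of K only changes where the loop stops, which is what makes the "build once, inspect many K" workflow cheap.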

I'm surprised nobody has mentioned this excellent article:
http://www.ee.columbia.edu/~dpwe/papers/PhamDN05-kmeans.pdf
After following several other suggestions I finally came across this article while reading this blog:
https://datasciencelab.wordpress.com/2014/01/21/selection-of-k-in-k-means-clustering-reloaded/
After that I implemented it in Scala; for my use cases this implementation provides really good results. Here's the code:
import breeze.linalg.DenseVector
import Kmeans.{Features, _}
import nak.cluster.{Kmeans => NakKmeans}
import scala.collection.immutable.IndexedSeq
import scala.collection.mutable.ListBuffer
/*
https://datasciencelab.wordpress.com/2014/01/21/selection-of-k-in-k-means-clustering-reloaded/
*/
class Kmeans(features: Features) {
  def fkAlphaDispersionCentroids(k: Int, dispersionOfKMinus1: Double = 0d, alphaOfKMinus1: Double = 1d): (Double, Double, Double, Features) = {
    if (1 == k || 0d == dispersionOfKMinus1) (1d, 1d, 1d, Vector.empty)
    else {
      val featureDimensions = features.headOption.map(_.size).getOrElse(1)
      val (dispersion, centroids: Features) = new NakKmeans[DenseVector[Double]](features).run(k)
      val alpha =
        if (2 == k) 1d - 3d / (4d * featureDimensions)
        else alphaOfKMinus1 + (1d - alphaOfKMinus1) / 6d
      val fk = dispersion / (alpha * dispersionOfKMinus1)
      (fk, alpha, dispersion, centroids)
    }
  }
  def fks(maxK: Int = maxK): List[(Double, Double, Double, Features)] = {
    val fadcs = ListBuffer[(Double, Double, Double, Features)](fkAlphaDispersionCentroids(1))
    var k = 2
    while (k <= maxK) {
      val (fk, alpha, dispersion, features) = fadcs(k - 2)
      fadcs += fkAlphaDispersionCentroids(k, dispersion, alpha)
      k += 1
    }
    fadcs.toList
  }
  def detK: (Double, Features) = {
    val vals = fks().minBy(_._1)
    (vals._3, vals._4)
  }
}
object Kmeans {
  val maxK = 10
  type Features = IndexedSeq[DenseVector[Double]]
}

If you use MATLAB (any version since R2013b), you can use the function evalclusters to find out what the optimal k should be for a given dataset.
This function lets you choose from among 3 clustering algorithms - kmeans, linkage and gmdistribution.
It also lets you choose from among 4 clustering evaluation criteria - CalinskiHarabasz, DaviesBouldin, gap and silhouette.

I used the solution I found here: http://efavdb.com/mean-shift/ and it worked very well for me:
from itertools import cycle
import matplotlib.pyplot as plt
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs  # formerly sklearn.datasets.samples_generator
#%% Generate sample data
centers = [[1, 1], [-.75, -1], [1, -1], [-3, 2]]
X, _ = make_blobs(n_samples=10000, centers=centers, cluster_std=0.6)
#%% Compute clustering with MeanShift
# The bandwidth can be automatically estimated
bandwidth = estimate_bandwidth(X, quantile=.1, n_samples=500)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit(X)
labels = ms.labels_
cluster_centers = ms.cluster_centers_
n_clusters_ = labels.max() + 1
#%% Plot result
plt.figure(1)
plt.clf()
colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')
for k, col in zip(range(n_clusters_), colors):
    my_members = labels == k
    cluster_center = cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], col + '.')
    plt.plot(cluster_center[0], cluster_center[1],
             'o', markerfacecolor=col,
             markeredgecolor='k', markersize=14)
plt.title('Estimated number of clusters: %d' % n_clusters_)
plt.show()

My idea is to use the silhouette coefficient to find the optimal number of clusters (k). A detailed explanation is here.
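To make the silhouette coefficient concrete, here is a minimal sketch that computes the average silhouette width by hand. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the score is (b - a) / max(a, b). The sketch assumes 1-D points, absolute distance, labels given as a list, and at least two clusters; the function name and data are made up, and a library routine such as scikit-learn's silhouette_score is the practical choice.

```python
def mean_silhouette(points, labels):
    """Average silhouette width for 1-D points using absolute distance."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [abs(p - q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(same) / len(same)  # cohesion: mean distance within own cluster
        b = min(                   # separation: nearest other cluster on average
            sum(abs(p - q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [0.0, 1.0, 10.0, 11.0]
good = mean_silhouette(points, [0, 0, 1, 1])  # natural grouping
bad = mean_silhouette(points, [0, 1, 0, 1])   # deliberately mixed grouping
print(good, bad)
```

The natural grouping scores close to 1 and the mixed grouping scores below 0, which is why the k with the highest average silhouette is taken as the best candidate.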

Assuming you have a matrix of data called DATA, you can perform partitioning around medoids with estimation of number of clusters (by silhouette analysis) like this:
library(fpc)
maxk <- 20 # arbitrary here, you can set this to whatever you like
estimatedK <- pamk(dist(DATA), krange=1:maxk)$nc

One possible answer is to use a metaheuristic algorithm, such as a genetic algorithm, to find k.
That's simple: you can try random values of k (in some range) and evaluate the genetic algorithm's fitness function with a measure such as the silhouette,
then pick the best k based on the fitness function.
https://en.wikipedia.org/wiki/Silhouette_(clustering)

# Assumes pandas is imported as pd, KMeans comes from sklearn.cluster, and that
# num_data, ncluster, missing_num and count_row are defined earlier (not shown).
km = []
for i in range(num_data.shape[1]):
    kmeans = KMeans(n_clusters=ncluster[i])  # number of clusters taken from bandwidth theory
    ndata = num_data[[i]].dropna()
    ndata['labels'] = kmeans.fit_predict(ndata.values)
    cluster = ndata
    co = cluster.groupby(['labels'])[cluster.columns[0]].count()   # count for frequency
    me = cluster.groupby(['labels'])[cluster.columns[0]].median()  # median
    ma = cluster.groupby(['labels'])[cluster.columns[0]].max()     # maximum
    mi = cluster.groupby(['labels'])[cluster.columns[0]].min()     # minimum
    stat = pd.concat([mi, ma, me, co], axis=1)  # combine all columns
    stat['variable'] = stat.columns[1]  # column name change
    stat.columns = ['Minimum', 'Maximum', 'Median', 'count', 'variable']
    l = []
    for j in range(ncluster[i]):
        n = [mi.loc[j], ma.loc[j]]
        l.append(n)
    stat['Class'] = l
    stat = stat.sort_values(['Minimum'])  # DataFrame.sort was removed from pandas
    stat = stat[['variable', 'Class', 'Minimum', 'Maximum', 'Median', 'count']]
    if missing_num.iloc[i] > 0:
        stat.loc[ncluster[i]] = 0
        if stat.iloc[ncluster[i], 5] == 0:
            stat.iloc[ncluster[i], 5] = missing_num.iloc[i]
            stat.iloc[ncluster[i], 0] = stat.iloc[0, 0]
    stat['Percentage'] = stat['count'] * 100 / count_row  # frequency percentage
    stat['Cumulative Percentage'] = stat['Percentage'].cumsum()
    km.append(stat)
cluster = pd.concat(km, axis=0)  # see the pandas documentation for more info
cluster = cluster.round({'Minimum': 2, 'Maximum': 2, 'Median': 2, 'Percentage': 2, 'Cumulative Percentage': 2})

Another approach is to use self-organizing maps (SOM) to find the optimal number of clusters. The SOM is an unsupervised neural network methodology that needs only the input data to perform clustering. This approach was used in a paper about customer segmentation.
The reference for the paper is:
Abdellah Amine et al., Customer Segmentation Model in E-commerce Using Clustering Techniques and LRFM Model: The Case of Online Stores in Morocco, World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering, Vol:9, No:8, 2015, 1999 - 2010

Hi, I'll keep this simple and straightforward: I like to determine the number of clusters using the 'NbClust' library.
To see how to use the 'NbClust' function to determine the right number of clusters, you can check the actual project on GitHub, with actual data and clusters. The project also extends this by running the 'kmeans' algorithm with the right number of 'centers'.
Github Project Link: https://github.com/RutvijBhutaiya/Thailand-Customer-Engagement-Facebook

You can choose the number of clusters by visually inspecting your data points, but you will soon realize that there is a lot of ambiguity in this process for all except the simplest data sets. This is not always bad, because you are doing unsupervised learning and there's some inherent subjectivity in the labeling process. Here, having previous experience with that particular problem or something similar will help you choose the right value.
If you want some hint about the number of clusters that you should use, you can apply the Elbow method:
First of all, compute the sum of squared error (SSE) for some values of k (for example 2, 4, 6, 8, etc.). The SSE is defined as the sum of the squared distance between each member of the cluster and its centroid. Mathematically:
SSE = \sum_{i=1}^{K} \sum_{x \in c_i} \mathrm{dist}(x, c_i)^2
If you plot k against the SSE, you will see that the error decreases as k gets larger; this is because when the number of clusters increases, the clusters become smaller, so the distortion is also smaller. The idea of the elbow method is to choose the k at which the SSE decreases abruptly. This produces an "elbow effect" in the graph, as you can see in the following picture:
In this case, k=6 is the value that the elbow method has selected. Keep in mind that the elbow method is a heuristic and, as such, it may or may not work well in your particular case. Sometimes there is more than one elbow, or no elbow at all. In those situations you usually end up choosing the best k by evaluating how well k-means performs in the context of the particular clustering problem you are trying to solve.
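The SSE definition above translates directly into code. The sketch below is only an illustration with hand-picked points, labels and centroids (all invented); in a real run the labels and centroids would come from k-means for each k being compared.

```python
def sse(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
# k = 1: a single centroid at the overall mean
sse_k1 = sse(points, [0, 0, 0, 0], [(6.0, 0.0)])
# k = 2: centroids at the means of the two obvious groups
sse_k2 = sse(points, [0, 0, 1, 1], [(1.0, 0.0), (11.0, 0.0)])
print(sse_k1, sse_k2)
```

Going from one cluster to two cuts the SSE from 104 to 4 in this toy case; a drop of that size followed by much smaller gains is what shows up as the elbow in the plot.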

I worked on a Python package, kneed (an implementation of the Kneedle algorithm). It finds the number of clusters dynamically as the point where the curve starts to flatten. Given a set of x and y values, kneed returns the knee point of the function, which is the point of maximum curvature. Here is the sample code.
y = [7342.1301373073857, 6881.7109460930769, 6531.1657905495022,
6356.2255554679778, 6209.8382535595829, 6094.9052166741121,
5980.0191582610196, 5880.1869867848218, 5779.8957906367368,
5691.1879324562778, 5617.5153566271356, 5532.2613232619951,
5467.352265375117, 5395.4493783888756, 5345.3459908298091,
5290.6769823693812, 5243.5271656371888, 5207.2501206569532,
5164.9617535255456]
x = range(1, len(y)+1)
from kneed import KneeLocator
kn = KneeLocator(x, y, curve='convex', direction='decreasing')
print(kn.knee)

I'll leave here a pretty cool GIF from a Codecademy course:
The K-Means algorithm:
Place k random centroids for the initial clusters.
Assign data samples to the nearest centroid.
Update centroids based on the above-assigned data samples.
By the way, this is not an explanation of the full algorithm, just a helpful visualization.
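The three steps listed above can be sketched in plain Python as follows. This is a bare-bones illustration, not a production implementation: the function name, the seed parameter and the choice to keep an old centroid when a cluster becomes empty are my own assumptions, and library versions (for example scikit-learn's KMeans) add better initialization and multiple restarts.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd iterations on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # step 1: k random centroids
    labels = []
    for _ in range(iterations):
        # step 2: assign every sample to its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:           # assignments stable: converged
            break
        labels = new_labels
        # step 3: move each centroid to the mean of its assigned samples
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                    # keep the old centroid if the cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, 2)
print(labels, centroids)
```

Steps 2 and 3 repeat until the assignments stop changing, which is the part the visualization animates.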

Multiple linear regression with matrices in Java

I am a quant newbie trying to compute regression coefficients using the Apache Commons Math library in Java. I am trying to use the OLSMultipleLinearRegression class to estimate the regression coefficients and residuals for a multiple linear regression model, which defines a regressand y that is a [nX1] state vector. The observations or regressors are defined by a state vector x, which is again a [nX1] state vector. A test with sample data looks as follows:
//n=3
double[][] y = new double[][]{{-0.03125, 0.0078125, 0.0, 0.0, 0.015625},
                              {-0.03125, 0.0078125, 0.0, 0.0, 0.015625},
                              {-0.03125, 0.0078125, 0.0, 0.0, 0.015625}};
//n=3
double[][] x = new double[][]{{+0.03195, -0.005812, 0.0, 0.0, 0.015925},
                              {-0.03125, 0.0079125, 0.0, 0.0, 0.025625},
                              {-0.03195, 0.0078825, 0.0, 0.0, -0.015625}};
OLSMultipleLinearRegression r = new OLSMultipleLinearRegression();
r.setNoIntercept(true);
r.newSampleData(y, x); // compiler error
The regressor x is composed of 5 independent state variables which are captured at a given time t. The multiple regression model will attempt to predict a state y, the regressand, at t+1 using the regression coefficients, which I am trying to determine from historical data as shown above.
How do I input data of this nature to the model? Apologies in advance if any of this sounds trivial or obvious to you, but any help would be much appreciated.

How to print the element of RowMatrix in java?

RowMatrix mat = new RowMatrix(parsedData.rdd());
Matrix pc = mat.computePrincipalComponents(1);
RowMatrix projected = mat.multiply(pc);
I need to print the elements in the RowMatrix projected using java

RowMatrix is a distributed data structure, and the only way to reliably output its content is to fetch the data to the driver and print it locally. Typically this is an operation you want to avoid, but the general approach is as follows:
val mat: RowMatrix = ???
mat
  .rows // Extract RDD[org.apache.spark.mllib.linalg.Vector]
  .collect // you can use toLocalIterator to limit memory usage
  .foreach(println) // Iterate over the local collection and print
With the Java equivalent being something like this:
List<Vector> vs = mat.rows().toJavaRDD().collect();
for (Vector v : vs) {
    System.out.println(v);
}
In practice there should be no need for an operation like this. If your data is small enough to be handled locally, there is no reason to use a DistributedMatrix. If the data is large but wide, then RowMatrix is a poor choice of distributed data structure.

Sample Nullspace Using Colt

I'm writing Java and using Colt as my matrix library, and I would like to find a (any) vector in the kernel of a matrix. I can do this in Python using sympy as follows:
from sympy import Matrix
def kernel(A, n):
    if A.rows == 0:
        return Matrix([1]*n)
    R, pivots = A.rref()
    Ap = A.extract(range(A.rows), pivots)
    bp = Matrix([0]*Ap.rows)
    free = list(set(range(n)) - set(pivots))
    for i in free:
        bp -= A[:, i]
    xp = Ap.LUsolve(bp)
    x = [1]*n
    for i in range(len(pivots)):
        x[pivots[i]] = xp[i]
    return Matrix(x)
Using sympy I can call nullspace to get the entire nullspace, or use rref to get the pivots used when reducing to row-echelon form and from that find a single vector in the nullspace myself. I can't find a function in Colt to calculate the nullspace, and trapezoidalLower doesn't return the pivots.
Am I left to write my own rref, or does someone know a higher-level way of achieving this with Colt?

The answer is: WHATEVER YOU DO, DON'T USE RREF in Java. Converting to reduced row echelon form turns out to involve lots of comparisons to 0. If the value is 0 we do one thing. If the value is very close to 0, but not quite 0, we do something completely different (like divide by the value). This makes for an unstable algorithm.
Instead we can use QR decomposition, which happens to be implemented in Colt.

Compute probability over a multivariate normal

My question addresses both mathematical and CS issues, but since I need a performant implementation I am posting it here.
Problem:
I have an estimated bivariate normal distribution, defined with Python matrices, but I will then need to port the same computation to Java. (dummy values here)
mean = numpy.matrix([[0],[0]])
cov = numpy.matrix([[1,0],[0,1]])
When I receive as input a column vector of integer values (x, y), I want to compute the probability of that tuple.
value = numpy.matrix([[4],[3]])
probability_of_value_given_the_distribution = ???
Now, from a mathematical point of view, this would be the integral of my normal's probability density function over 3.5 < x < 4.5 and 2.5 < y < 3.5.
What I want to know:
Is there a way to avoid implementing this directly, which would mean dealing with expressions defined over matrices and with double integrals? Besides taking me a while to implement myself, it would be computationally expensive. An approximate solution would be perfectly fine for me.
My reasonings:
For a univariate normal, one could simply use the cumulative distribution function (or even store its values for the standard normal and then normalize), but unfortunately there appears to be no closed-form CDF for multivariate normals.
Another approach for the univariate case is to use the binomial approximation in reverse (that is, approximate the normal with a binomial), but I can't figure out how to extend this to the multivariate case while taking the covariances into account.
I really hope someone has already implemented this; I need it soon (I am finishing my thesis) and I couldn't find anything.

OpenTURNS provides an efficient implementation of the CDF of a multinormal distribution (see the code).
import numpy as np
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.0],[0.0, 1.0]])
Let us create the multinormal distribution with these parameters.
import openturns as ot
multinormal = ot.Normal(mean, ot.CovarianceMatrix(cov))
Now let us compute the probability of the square [3.5, 4.5] x [2.5, 3.5]:
prob = multinormal.computeProbability(ot.Interval([3.5,2.5], [4.5,3.5]))
print(prob)
The computed probability is
1.3701244220201715e-06
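For the special case used in this example, where the covariance matrix is diagonal and the components are therefore independent, the same probability can be cross-checked without any library: the box probability factorizes into a product of univariate CDF differences. The sketch below uses only math.erf; the function names are made up, and it does not apply once the covariance has non-zero off-diagonal terms, where a genuine multivariate CDF such as the OpenTURNS one is needed.

```python
import math

def normal_cdf(x, mean=0.0, std=1.0):
    """Univariate normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def box_probability(lower, upper, means, stds):
    """P(lower_i < X_i < upper_i for all i), assuming independent normal components."""
    prob = 1.0
    for lo, hi, m, s in zip(lower, upper, means, stds):
        prob *= normal_cdf(hi, m, s) - normal_cdf(lo, m, s)
    return prob

prob = box_probability([3.5, 2.5], [4.5, 3.5], means=[0.0, 0.0], stds=[1.0, 1.0])
print(prob)  # about 1.37e-06, in line with the OpenTURNS result above
```

Since the error function has direct counterparts in common Java math libraries, this independent case is also straightforward to port.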

If you are looking for the probability density function of a bivariate normal distribution, below are a few lines that could do the job:
import numpy as np
def multivariate_pdf(vector, mean, cov):
    # Bivariate case only: the normalizing constant is 2*pi*sqrt(det(cov))
    diff = vector - mean
    quadratic_form = np.dot(np.dot(diff, np.linalg.inv(cov)), np.transpose(diff))
    return np.exp(-.5 * quadratic_form) / (2*np.pi * np.sqrt(np.linalg.det(cov)))
mean = np.array([0,0])
cov = np.array([[1,0],[0,1]])
vector = np.array([4,3])
pdf = multivariate_pdf(vector, mean, cov)
