How can we apply PCA to a one-dimensional array?
double[][] data = new double[1][600];
PCA pca = new PCA(data, 20);
data = pca.getPCATransformedDataAsDoubleArray();
When I print the values in the data array, the number of features has decreased from 600 to 20, but all of the values are zero.
Why?
package VoiceRecognation;
import Jama.Matrix;
import comirva.data.DataMatrix;
import comirva.util.PCA;
import java.io.File;
/**
* Created by IntelliJ IDEA.
* User: SAHIN
* Date: 11.06.2011
* Time: 19:33
* To change this template use File | Settings | File Templates.
*/
public class Deneme {
    public static void main(String[] args) {
        int[] group = Groups.getGroups();
        File[] files = Files.getFiles();
        double[][] data = FindMfccOfFiles.findMFCCValuesOfFiles(files);
        PCA pca = new PCA(data, 20);
        data = pca.getPCATransformedDataAsDoubleArray();

        File file = new File("src/main/resources/Karisik/E-Mail/(1).wav");
        double[] testdata = MFCC.getMFCC(file);
        double[][] result = new double[1][600];
        result[0] = testdata;

        PCA p = new PCA(result, 20);
        double[][] sum = p.getPCATransformedDataAsDoubleArray();
        for (int i = 0; i < sum[0].length; i++) {
            System.out.print(sum[0][i] + " ");
        }
    }
}
Principal component analysis is used for reducing the dimensionality of your problem. The dimensions of the audio file are the channels (e.g. left speaker, right speaker), not the individual samples. In that case, you really have only one dimension for a mono audio stream. So, you're not going to reduce the number of samples using PCA, but you could reduce the number of channels in the audio. But you could do that without PCA just by averaging the samples on each channel. So unless you're trying to convert stereo audio into mono, I think you need a different approach to your problem.
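To illustrate the channel-averaging alternative mentioned above (a plain-Java sketch, not PCA; the interleaved left/right sample layout is an assumption about how the stereo data is stored):

```java
import java.util.Arrays;

public class StereoToMono {
    // average interleaved stereo samples [L0, R0, L1, R1, ...] into a mono buffer
    static double[] toMono(double[] interleaved) {
        double[] mono = new double[interleaved.length / 2];
        for (int i = 0; i < mono.length; i++) {
            mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2.0;
        }
        return mono;
    }

    public static void main(String[] args) {
        double[] stereo = {1.0, 3.0, -2.0, 2.0};  // two stereo frames, made-up values
        double[] mono = toMono(stereo);           // averages each L/R pair
        System.out.println(Arrays.toString(mono));
    }
}
```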
You overwrite the data array with the result of getPCATransformedDataAsDoubleArray(). Because of the constructor argument, this should be an array with 20 entries per row. As for why all the values are zero: PCA implementations typically mean-center each feature before projecting, and with only one sample every feature equals its own mean, so the centered data, and therefore every projected value, is zero. You would need to check the PCA class to confirm that comirva's implementation works this way.
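To see the centering effect concretely, here is a minimal sketch of the mean-centering step that PCA implementations commonly perform (this mimics the general technique, not comirva's actual PCA class):

```java
import java.util.Arrays;

public class SingleSampleCentering {
    // subtract the per-feature mean from each row, as PCA does before projecting
    static double[][] center(double[][] data) {
        int rows = data.length, cols = data[0].length;
        double[][] centered = new double[rows][cols];
        for (int j = 0; j < cols; j++) {
            double mean = 0;
            for (int i = 0; i < rows; i++) mean += data[i][j];
            mean /= rows;
            for (int i = 0; i < rows; i++) centered[i][j] = data[i][j] - mean;
        }
        return centered;
    }

    public static void main(String[] args) {
        // with a single sample, every feature IS its own mean
        double[][] single = {{3.5, -1.2, 7.0}};
        double[][] c = center(single);  // all entries become zero
        System.out.println(Arrays.deepToString(c));
        // any projection of an all-zero matrix onto principal components
        // is still all zeros, which is why the transformed output is zero
    }
}
```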
Intent
I am trying to create a localized implementation of cloud gaming where a user's own PC serves as the server. The last piece stopping me is implementing hardware-level mouse movement, which is where Windows' SendInput() comes into play.
Issue
The issue is that my entire code base is in Kotlin/Java, so I don't know how to replicate SendInput() other than using JNA to access the C++ function directly. But here's where I'm stuck: my Java code compiles, but does nothing when called.
Code to be translated to Java
#include<windows.h>
#include<iostream>
using namespace std;
void moveMouse(int x, int y) {
    INPUT input;
    input.type = INPUT_MOUSE;
    input.mi.dx = x;
    input.mi.dy = y;
    input.mi.time = 0;
    input.mi.dwFlags = MOUSEEVENTF_MOVE;
    UINT qwe = SendInput(1, &input, sizeof(input));
    cout << qwe;
}
Code in Java using JNA
Look at the comments in the code below.
import com.sun.jna.platform.win32.User32;
import com.sun.jna.platform.win32.WinDef.DWORD;
import com.sun.jna.platform.win32.WinDef.LONG;
import com.sun.jna.platform.win32.WinUser.INPUT;
//import static com.sun.jna.Native.sizeof;//compilation error 'sizeof(int)' has private access in 'com.sun.jna.Native'
import static com.sun.jna.platform.win32.WinUser.INPUT.INPUT_MOUSE;
public class Test {
    public static void main(String[] a) {
        moveMouse(1_000L, 1_000L);
    }

    /**
     * @param x change required in x co-ordinate
     * @param y change required in y co-ordinate
     */
    public static void moveMouse(Long x, Long y) {
        INPUT input = new INPUT();
        input.type = new DWORD(INPUT_MOUSE);
        input.input.mi.dx = new LONG(x);
        input.input.mi.dy = new LONG(y);
        input.input.mi.time = new DWORD(0);
        input.input.mi.dwFlags = new DWORD(0x0001L);
        INPUT[] inputArray = {input};
        DWORD result = User32.INSTANCE.SendInput(new DWORD(1), inputArray, input.size());
        // sizeof (used below in the commented code) returns the compilation error
        // 'sizeof(int)' has private access in 'com.sun.jna.Native'
        // I got the idea to use sizeof from another StackOverflow post
        // DWORD result = User32.INSTANCE.SendInput(new DWORD(1), inputArray, sizeof(input));
        System.out.println("result = " + result.longValue());
        System.out.println("size = " + input.size());
    }
}
Output
result = 1
size = 40
You'll see that in SendInput, I'm sending the size of the input variable instead of the array's size, which is what I'm supposed to send. To get the size of the array in bytes, I got the idea to use sizeof, but as you've seen in the comments above, I can't use it because I can't import a private function.
The mouse doesn't move a pixel when the Java code is executed.
I forgot to set the type. I didn't realize this was necessary. Anyway, here is the code:
public static void moveMouse(Long x, Long y) {
    INPUT input = new INPUT();
    input.type = new DWORD(INPUT_MOUSE);
    input.input.setType("mi"); // -------------------------------------- added this
    input.input.mi.dx = new LONG(x);
    input.input.mi.dy = new LONG(y);
    input.input.mi.time = new DWORD(0);
    input.input.mi.dwFlags = new DWORD(0x0001L);
    INPUT[] inputArray = {input};
    DWORD result = User32.INSTANCE.SendInput(new DWORD(1), inputArray, input.size());
}
For anybody having this issue, this function can be straight up copy pasted and it should work.
I am looking for a way to generate and play back sound in Kotlin/Java. I have been searching a lot and tried different solutions, but it's not really satisfactory.
I'm not looking for the Java Control class, that lets me add reverb to existing sounds, or the javax.sound.midi package that lets me do MIDI sequencing. Instead, I want to build up sound from scratch as a sound vector/List, through something like this:
fun createSinWaveBuffer(freq: Double, ms: Int, sampleRate: Int = 44100): ByteArray {
    val samples = (ms * sampleRate / 1000)
    val output = ByteArray(samples)
    val period = sampleRate.toDouble() / freq
    for (i in output.indices) {
        val angle = 2.0 * Math.PI * i.toDouble() / period
        output[i] = (Math.sin(angle) * 127f).toByte()
    }
    //output.forEach { println(it) }
    return output
}
I then want to play back the sound and have the actual speaker output match the parameters sent to the function with regard to frequency, length, etc. Of course, creating two such sound vectors with different frequencies and summing them (or at least taking their average) should result in two tones playing simultaneously.
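A sketch of that mixing step in Java (the helper name and the byte-averaging approach are my own, assuming signed 8-bit samples as in the buffer-generating function above):

```java
import java.util.Arrays;

public class ToneMixer {
    // average two equally long sample buffers; averaging rather than summing
    // keeps the result inside the signed 8-bit range without clipping
    static byte[] mix(byte[] a, byte[] b) {
        byte[] out = new byte[Math.min(a.length, b.length)];
        for (int i = 0; i < out.length; i++) {
            // byte operands are promoted to int, so the sum cannot overflow
            out[i] = (byte) ((a[i] + b[i]) / 2);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] a = {100, -100, 50};  // made-up sample values
        byte[] b = {20, -20, 50};
        System.out.println(Arrays.toString(mix(a, b)));  // [60, -60, 50]
    }
}
```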
This is simple enough in MATLAB. If you have a vector y, like this:
t=0:1/samplerate:duration;
y=sin(2*pi*freq*t);
Just do
sound(y,sampleRate)
Although there might not be as simple or clean a solution in Java, I still feel like it should be possible to play custom sound.
After searching around a bit here and on other places, this is one of the cleanest solutions I'm trying now (even though it uses sun.audio, the other suggestions were way messier):
import sun.audio.AudioPlayer
import sun.audio.AudioDataStream
import sun.audio.AudioData
private fun playsound(sound: ByteArray) {
    val audiodata = AudioData(sound)
    val audioStream = AudioDataStream(audiodata)
    AudioPlayer.player.start(audioStream)
}
but playsound(createSinWaveBuffer(440.0, 10000, 44100)) does not sound right in my speakers. It sounds choppy, it is not at 440 hz, it is not a pure sine wave and it is not ten seconds.
What am I missing?
First of all, do not use sun packages. Ever.
For the desktop, the way to go is to generate the data, acquire a SourceDataLine, open and start the line and then write your data to it. It's important that the line is suitable for the AudioFormat you have chosen to generate. In this case, 8 bits/sample and a sample rate of 44,100 Hz.
Here's a working example in Java, which I am sure you can easily translate to Kotlin.
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
public class ClipDemo {

    private static byte[] createSinWaveBuffer(final float freq, final int ms, final float sampleRate) {
        final int samples = (int) (ms * sampleRate / 1000);
        final byte[] output = new byte[samples];
        final float period = sampleRate / freq;
        for (int i = 0; i < samples; i++) {
            final float angle = (float) (2f * Math.PI * i / period);
            output[i] = (byte) (Math.sin(angle) * 127f);
        }
        return output;
    }

    public static void main(String[] args) throws LineUnavailableException {
        final int rate = 44100;
        final byte[] sineBuffer = createSinWaveBuffer(440, 5000, rate);
        // describe the audio format you're using.
        // because it's byte-based, it's 8 bit/sample and signed.
        // if you use 2 bytes for one sample (CD quality), you need to pay more attention
        // to endianness and data encoding in your byte buffer
        final AudioFormat format = new AudioFormat(rate, 8, 1, true, true);
        final SourceDataLine line = AudioSystem.getSourceDataLine(format);
        // open the physical line, acquire system resources
        line.open(format);
        // start the line (... to your speaker)
        line.start();
        // write to the line (... to your speaker)
        // this call blocks.
        line.write(sineBuffer, 0, sineBuffer.length);
        // cleanup: let the buffered audio finish playing, then close the line
        line.drain();
        line.close();
    }
}
For the past week or so, I have been trying to get a neural network to function using RGB images, but no matter what I do it seems to only be predicting one class.
I have read all the links I could find on people encountering this problem and experimented with a lot of different things, but it always ends up predicting only one of the two output classes. I have checked the batches going into the model, increased the size of the dataset, increased the original image size (28*28) to 56*56, increased the number of epochs, done a lot of model tuning, and I have even tried a simple non-convolutional neural network as well as dumbing down my own CNN model, yet nothing changes.
I have also checked the structure of how the data is passed in for the training set (specifically ImageRecordReader), and this input structure (in terms of folder layout and how the data is passed into the training set) works perfectly when given gray-scale images (it originally achieved 99% accuracy on the MNIST dataset).
Some context: I use the folder names as my labels, i.e. folder(0) and folder(1), for both training and testing data, as there will only be two output classes. The training set contains 320 images of class 0 and 240 images of class 1, whereas the testing set is made up of 79 and 80 images respectively.
Code below:
private static final Logger log = LoggerFactory.getLogger(MnistClassifier.class);
private static final String basePath = System.getProperty("java.io.tmpdir") + "/ISIC-Images";
public static void main(String[] args) throws Exception {
    int height = 56;
    int width = 56;
    int channels = 3;  // RGB images
    int outputNum = 2; // two output classes
    int batchSize = 1;
    int nEpochs = 1;
    int iterations = 1;
    int seed = 1234;
    Random randNumGen = new Random(seed);

    // vectorization of training data
    File trainData = new File(basePath + "/Training");
    FileSplit trainSplit = new FileSplit(trainData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
    ParentPathLabelGenerator labelMaker = new ParentPathLabelGenerator(); // parent path as the image label
    ImageRecordReader trainRR = new ImageRecordReader(height, width, channels, labelMaker);
    trainRR.initialize(trainSplit);
    DataSetIterator trainIter = new RecordReaderDataSetIterator(trainRR, batchSize, 1, outputNum);

    // vectorization of testing data
    File testData = new File(basePath + "/Testing");
    FileSplit testSplit = new FileSplit(testData, NativeImageLoader.ALLOWED_FORMATS, randNumGen);
    ImageRecordReader testRR = new ImageRecordReader(height, width, channels, labelMaker);
    testRR.initialize(testSplit);
    DataSetIterator testIter = new RecordReaderDataSetIterator(testRR, batchSize, 1, outputNum);

    log.info("Network configuration and training...");
    Map<Integer, Double> lrSchedule = new HashMap<>();
    lrSchedule.put(0, 0.06); // iteration #, learning rate
    lrSchedule.put(200, 0.05);
    lrSchedule.put(600, 0.028);
    lrSchedule.put(800, 0.0060);
    lrSchedule.put(1000, 0.001);

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .seed(seed)
            .l2(0.0008)
            .updater(new Nesterovs(new MapSchedule(ScheduleType.ITERATION, lrSchedule)))
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
            .weightInit(WeightInit.XAVIER)
            .list()
            .layer(0, new ConvolutionLayer.Builder(5, 5)
                    .nIn(channels)
                    .stride(1, 1)
                    .nOut(20)
                    .activation(Activation.IDENTITY)
                    .build())
            .layer(1, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                    .kernelSize(2, 2)
                    .stride(2, 2)
                    .build())
            .layer(2, new ConvolutionLayer.Builder(5, 5)
                    .stride(1, 1)
                    .nOut(50)
                    .activation(Activation.IDENTITY)
                    .build())
            .layer(3, new SubsamplingLayer.Builder(SubsamplingLayer.PoolingType.MAX)
                    .kernelSize(2, 2)
                    .stride(2, 2)
                    .build())
            .layer(4, new DenseLayer.Builder().activation(Activation.RELU)
                    .nOut(500).build())
            .layer(5, new OutputLayer.Builder(LossFunctions.LossFunction.SQUARED_LOSS)
                    .nOut(outputNum)
                    .activation(Activation.SOFTMAX)
                    .build())
            .setInputType(InputType.convolutionalFlat(56, 56, 3)) // InputType.convolutional for normal image
            .backprop(true).pretrain(false).build();

    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();
    net.setListeners(new ScoreIterationListener(10));
    log.debug("Total num of params: {}", net.numParams());

    // evaluation while training (the score should go down)
    for (int i = 0; i < nEpochs; i++) {
        net.fit(trainIter);
        log.info("Completed epoch {}", i);
        Evaluation eval = net.evaluate(testIter);
        log.info(eval.stats());
        trainIter.reset();
        testIter.reset();
    }
    ModelSerializer.writeModel(net, new File(basePath + "/Isic.model.zip"), true);
}
Output from running the model:
Odd iteration scores
Evaluation metrics
Any insight would be much appreciated.
I would suggest changing the activation functions of the two convolution layers to a non-linear function; you could try ReLU or Tanh.
You may refer to this documentation for a list of available activation functions.
Identity activations on CNNs almost never make sense. Stick to ReLU if you can.
I would instead shift your efforts towards gradient normalization or interspersing dropout layers. Almost every time a CNN doesn't learn, it's due to a lack of regularization.
Also: never use squared loss with softmax. It never works. Stick to negative log likelihood; I've never seen squared loss used with softmax in practice.
You can try L2 and L1 regularization (or both: this is called elastic net regularization).
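For intuition, the elastic net penalty mentioned above is just the sum of the L1 and L2 terms added to the loss; a minimal plain-Java sketch (the weights and lambda values are made up for illustration):

```java
public class ElasticNetDemo {
    // elastic net penalty: l1 * sum(|w|) + l2 * sum(w^2)
    // l1 alone encourages sparse weights; l2 alone shrinks them smoothly
    static double penalty(double[] w, double l1, double l2) {
        double absSum = 0, sqSum = 0;
        for (double v : w) {
            absSum += Math.abs(v);
            sqSum += v * v;
        }
        return l1 * absSum + l2 * sqSum;
    }

    public static void main(String[] args) {
        double[] weights = {0.5, -1.0, 2.0};        // made-up weights
        double p = penalty(weights, 0.01, 0.0008);  // made-up lambdas
        System.out.println("elastic net penalty = " + p);
    }
}
```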
It seems using an Adam updater gave some promising results, as did increasing the batch size (I now have thousands of images); otherwise the net requires an absurd number of epochs (at least 50+) in order to begin learning.
Thank you for all responses regardless.
Here is my code:
package algorithms;
import Jama.Matrix;
import java.io.File;
import java.util.Arrays;
public class ThetaGetter {
    // First column is one, second is price and third is BHK
    private static double[][] variables = {
            {1, 1130, 2},
            {1, 1100, 2},
            {1, 2055, 3},
            {1, 1047, 2},
            {1, 1927, 3},
            {1, 2667, 3},
            {1, 1146, 2},
            {1, 2020, 3},
            {1, 1190, 2},
            {1, 2165, 3},
            {1, 1250, 2},
            {1, 1185, 2},
            {1, 2825, 4},
            {1, 1200, 2},
            {1, 1580, 3},
            {1, 3200, 3},
            {1, 715, 1},
            {1, 1270, 2},
            {1, 2403, 3},
            {1, 1465, 3},
            {1, 1345, 2}
    };

    private static double[][] prices = {
            {69.65},
            {60},
            {115},
            {55},
            {140},
            {225},
            {76.78},
            {120},
            {73.11},
            {140},
            {56},
            {79.39},
            {161},
            {73.69},
            {80},
            {145},
            {34.87},
            {77.72},
            {165},
            {98},
            {82}
    };

    private static Matrix X = new Matrix(variables);
    private static Matrix y = new Matrix(prices);

    public static void main(String[] args) {
        File file = new File("theta.dat");
        if (file.exists()) {
            System.out.println("Theta has already been calculated!");
            return;
        }
        //inverse(Tra(X)*X)*tra(X)*y
        Matrix transposeX = X.transpose();
        Matrix inverse = X.times(transposeX).inverse();
        System.out.println(y.getArray().length);
        System.out.println(X.getArray().length);
        Matrix test = inverse.times(transposeX);
        Matrix theta = test.times(y);
        System.out.println(Arrays.deepToString(theta.getArray()));
    }
}
This algorithm basically takes housing data and computes a few constants which are then used to estimate house prices. However, I am getting an exception on the line 'Matrix theta = test.times(y);'. The error message is pretty much what's in the question. Is there some sort of issue with the dimensions? Both of them have 21 rows, so I don't know what's going on.
The mistake you are making is in the following line of code:
Matrix inverse = X.times(transposeX).inverse();
The formula you commented above is:
//inverse(Tra(X)*X)*tra(X)*y
but what you are actually calculating in code is:
//inverse(X*Tra(X))*tra(X)*y
(X*Tra(X) instead of Tra(X)*X)
If the dimension of X is (m,n) where
m = number of rows
n = number of columns
and the dimension of y is (m,1), then with the multiplications you used above you have:
inverse(X * Tra(X)) * Tra(X) * y
with dimensions inverse((m,n)(n,m)) * (n,m) * (m,1) = (m,m) * (n,m) * (m,1), which produces the error because the inner dimensions of a matrix multiplication must be equal: (m,m) times (n,m) would require m = n.
What would fix your code would be replacing the following line:
Matrix inverse = X.times(transposeX).inverse();
with
Matrix inverse = transposeX.times(X).inverse();
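To sanity-check the corrected formula, here is a self-contained sketch of theta = inverse(Tra(X)*X)*Tra(X)*y on a tiny made-up data set, using plain arrays and a hand-rolled 2x2 inverse instead of Jama:

```java
public class NormalEquationDemo {
    static double[][] multiply(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int k = 0; k < b.length; k++)
                for (int j = 0; j < b[0].length; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    static double[][] inverse2x2(double[][] m) {
        double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
        return new double[][]{{ m[1][1] / det, -m[0][1] / det},
                              {-m[1][0] / det,  m[0][0] / det}};
    }

    public static double[] solve(double[][] X, double[][] y) {
        double[][] Xt = transpose(X);                      // Tra(X)
        double[][] inv = inverse2x2(multiply(Xt, X));      // inverse(Tra(X)*X) - note the order
        double[][] theta = multiply(multiply(inv, Xt), y); // ... * Tra(X) * y
        return new double[]{theta[0][0], theta[1][0]};
    }

    public static void main(String[] args) {
        double[][] X = {{1, 1}, {1, 2}, {1, 3}}; // bias column plus one feature
        double[][] y = {{1}, {2}, {3}};          // y = x exactly, made-up data
        double[] theta = solve(X, y);
        // expect intercept ~ 0 and slope ~ 1 for this data
        System.out.println(theta[0] + " " + theta[1]);
    }
}
```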
I wanted to use the Apache Commons Math implementation of the FFT (the FastFourierTransformer class) to process some dummy data in which 8 samples make up one complete sinusoidal wave with a maximum amplitude of 230. The code snippet that I tried is below:
private double[] transform() {
    double[] input = new double[8];
    input[0] = 0.0;
    input[1] = 162.6345596729059;
    input[2] = 230.0;
    input[3] = 162.63455967290594;
    input[4] = 2.8166876380389125E-14;
    input[5] = -162.6345596729059;
    input[6] = -230.0;
    input[7] = -162.63455967290597;

    double[] tempConversion = new double[input.length];
    FastFourierTransformer transformer = new FastFourierTransformer();
    try {
        Complex[] complx = transformer.transform(input);
        for (int i = 0; i < complx.length; i++) {
            double rr = (complx[i].getReal());
            double ri = (complx[i].getImaginary());
            tempConversion[i] = Math.sqrt((rr * rr) + (ri * ri));
        }
    } catch (IllegalArgumentException e) {
        System.out.println(e);
    }
    return tempConversion;
}
1) The data returned by the transform method is an array of complex numbers. Does that array contain the frequency information about the input data, or does the tempConversion array that I created contain it? The values in the tempConversion array are:
2.5483305001488234E-16
920.0
4.0014578493024757E-14
2.2914314707516465E-13
5.658858581079313E-14
2.2914314707516465E-13
4.0014578493024757E-14
920.0
2) I searched a lot, but in most places there is no clear documentation on what data format the algorithm expects (ideally with sample code, to understand it better), or on how to use the array of results to calculate the frequencies contained in the signal.
Your output data looks correct. You've calculated the magnitude of the complex FFT output at each frequency bin which corresponds to the energy in the input signal at the corresponding frequency for that bin. Since your input is purely real, the output is complex conjugate symmetric, and the last 3 output values are redundant.
So you have:
Bin   Freq          Magnitude
0     0 (DC)        2.5483305001488234E-16
1     Fs/8          920.0
2     Fs/4          4.0014578493024757E-14
3     3Fs/8         2.2914314707516465E-13
4     Fs/2 (Nyq)    5.658858581079313E-14
5     3Fs/8         2.2914314707516465E-13   # redundant - mirror image of bin 3
6     Fs/4          4.0014578493024757E-14   # redundant - mirror image of bin 2
7     Fs/8          920.0                    # redundant - mirror image of bin 1
All the values are effectively 0 apart from bin 1 (and bin 6) which corresponds to a frequency of Fs/8 as expected.
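To convert a bin index into an actual frequency in Hz, multiply by the sample rate and divide by the FFT length. A small sketch (the 8 kHz sample rate is an assumed example; the question doesn't state one):

```java
public class BinFrequency {
    // frequency represented by FFT bin k, for an N-point FFT at sample rate fs;
    // for purely real input, bins above N/2 mirror the bins below it
    static double binFrequency(int k, double fs, int n) {
        return k * fs / n;
    }

    public static void main(String[] args) {
        double fs = 8000.0; // assumed sample rate, for illustration only
        int n = 8;          // FFT length from the question
        for (int k = 0; k <= n / 2; k++) {
            System.out.println("bin " + k + " -> " + binFrequency(k, fs, n) + " Hz");
        }
    }
}
```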