How does LWJGL store its Matrix3f data? - java

Can someone tell me in what order LWJGL stores its Matrix3f data?
http://lwjgl.org/javadoc/org/lwjgl/util/vector/Matrix3f.html
I want to recreate the middle matrix (R_y) in this image:
http://upload.wikimedia.org/wikipedia/en/math/5/1/4/5148f88bf9e6811e35615c08d2839793.png
So, would that -sin(angle) go in m02, or m20?

While the storage order may differ when a matrix is stored as a flat 1D array, you can always rely on the order when you see double-indexed members like m02: the first number is the row and the second is the column.
This is the mathematical convention and is used by every matrix library I know of. You can safely assume LWJGL behaves the same. If it really doesn't, write them a hate mail for doing such mathematically inconsistent rubbish.
So -sin goes into m20.
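For concreteness, here is a minimal sketch of building R_y = [[cos θ, 0, sin θ], [0, 1, 0], [-sin θ, 0, cos θ]] with LWJGL's Matrix3f, under the assumption argued above that mRC means row R, column C. If LWJGL's naming turns out to be transposed, swap the m02 and m20 assignments:

```java
import org.lwjgl.util.vector.Matrix3f;

// Sketch: rotation about the Y axis, assuming mRC = row R, column C.
// All nine elements are assigned explicitly, so the constructor's
// default state does not matter.
public class RotationY {
    public static Matrix3f rotationY(float angle) {
        float c = (float) Math.cos(angle);
        float s = (float) Math.sin(angle);
        Matrix3f m = new Matrix3f();
        m.m00 = c;  m.m01 = 0f; m.m02 = s;
        m.m10 = 0f; m.m11 = 1f; m.m12 = 0f;
        m.m20 = -s; m.m21 = 0f; m.m22 = c;
        return m;
    }
}
```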

Related

Read and index data from .raw volume file with Java

I am working with volume data in raw format. I think it is basically a 3D matrix of voxels that I want to load into a 3D array. I have no experience with this, and I'm unable to find much information on how it is done.
My main problem is that I don't really understand what the data represents.
So what I'm really asking is whether anybody can help me understand the data and load it into a 3D array with Java.
You're correct in thinking the first thing you need to do is understand the data. Here's a good explanation of what you're likely dealing with: https://support.echoview.com/WebHelp/Reference/File_formats/Export_file_formats/Volume_data_set_file_formats.htm#data_file
The .raw file probably contains a sequence of unsigned 8-bit integers. It's hard to say exactly without seeing the file (e.g. does it have a header, what is the matrix size, etc.).
Here's an answer that shows one method of converting from a 3D vector to a volume index and back in Java: https://stackoverflow.com/a/34363187/1973135. You'll need to know the actual dimension of the matrix to get this to work.
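If the file really is a headerless stream of unsigned 8-bit voxels, a loader could be as simple as the sketch below. The dimensions and the z-y-x ordering are assumptions; you need the real values from whoever produced the data:

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Sketch: load a headerless .raw volume of unsigned 8-bit voxels into a
// 3D array. DIM_X/Y/Z and the z-y-x loop order are assumptions; get the
// real dimensions from whoever produced the file.
public class RawVolumeReader {
    static final int DIM_X = 256, DIM_Y = 256, DIM_Z = 128; // assumed

    public static int[][][] read(String path) throws IOException {
        int[][][] volume = new int[DIM_Z][DIM_Y][DIM_X];
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(path)))) {
            for (int z = 0; z < DIM_Z; z++)
                for (int y = 0; y < DIM_Y; y++)
                    for (int x = 0; x < DIM_X; x++)
                        volume[z][y][x] = in.readUnsignedByte(); // 0..255
        }
        return volume;
    }
}
```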

JTransforms FFT on Image

I have an image that I want to transform to the frequency domain using an FFT. There seems to be a lack of Java libraries for this, but I have found two: one is JTransforms, and the other is less well known and doesn't have a name.
With the less well known one, the 2D arrays could only have lengths that were powers of two, but it had simple methods like FastFourierTransform.fastFT(real, imaginary, true); with real being a 2D array of doubles holding every pixel value and imaginary being a 2D array of the same size full of zeroes. The boolean selects a forward or inverse transform. This made sense to me and it worked, except that the power-of-two requirement ruined every transform I did (I initially padded the image with black space up to the nearest power of two). What I am struggling with is working out how to use the equivalent methods in JTransforms, and I would appreciate any guidance in doing so. I will state what I am currently doing.
I believe the relevant class is DoubleFFT_2D. Its constructor takes a number of rows and columns, which I assume to be the height and width of my image. Because my image has no imaginary part, I think I can use doubleFFT.realForwardFull(real); which treats the imaginary parts as zero, and pass it the real 2D array full of pixels. Unfortunately this doesn't work at all. The JavaDoc states the input array must be of size rows*2*columns, with only the first rows*columns elements filled with real data, but I don't see how this relates to my image or what I would have to do to meet this requirement.
Sorry about the lengthy and poor explanation; if any additional information is needed I would be happy to provide it.
JTransforms Library and Docs can be found here: https://sites.google.com/site/piotrwendykier/software/jtransforms
It's too bad the documentation for JTransforms isn't available online other than as a zipped download. It's very complete and helpful; you should check it out!
To answer your question: DoubleFFT_2D.realForwardFull(double[][] a) takes an array of real numbers (your pixels). However, the FFT produces two output values for each input value: the real and the imaginary part of each frequency bin. This is why your input array needs to be twice as big as the actual image array, with half of it initially empty / filled with zeroes.
Note that all the FFT functions use a not only for input but also for output - any image data in it will be overwritten, so it might be desirable to copy the image into a different / larger array anyway!
The easy and obvious fix for your scenario would be to use DoubleFFT_2D.realForward(double[][] a) instead. This one only calculates the positive half of the spectrum: because your input values are real, the negative half is symmetrical to it.
Also, check out the RealFFTUtils_2D class in JTransforms, which will make it a lot easier for you to retrieve your results from the array afterwards :)
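Putting that together, a sketch of the realForwardFull route might look like this. It assumes a rows x columns array of grayscale values and the edu.emory.mathcs package layout from the link above; with realForward you would skip the widening copy and unpack the packed output instead:

```java
import edu.emory.mathcs.jtransforms.fft.DoubleFFT_2D;

// Sketch: full complex spectrum of a real-valued image with JTransforms.
// realForwardFull needs a rows x (2*columns) array because each
// frequency bin produces a real and an imaginary part, interleaved.
public class ImageFFT {
    public static double[][] fft(double[][] image) {
        int rows = image.length;
        int cols = image[0].length;
        DoubleFFT_2D fft = new DoubleFFT_2D(rows, cols);

        // Copy the pixels into a twice-as-wide array; the extra space
        // starts zeroed and is overwritten with imaginary parts.
        double[][] data = new double[rows][2 * cols];
        for (int r = 0; r < rows; r++)
            System.arraycopy(image[r], 0, data[r], 0, cols);

        fft.realForwardFull(data); // in place: [re, im, re, im, ...]
        return data;
    }
}
```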

Scattered data set in statistical data analysis

I have some statistical data. Some of it is highly scattered relative to the majority of the data set, as shown below. What I want to do is minimize the effect of these highly scattered values: I want to compute a mean for the data set in which their influence is reduced.
My data set is as like this:
10.02, 11, 9.12, 7.89, 10.5, 11.3, 10.9, 12, 8.99, 89.23, 328.42.
[Figure: plot of the data set]
I need a mean value that is not 46.3 but closer to the rest of the data distribution.
In other words, I want to minimize the effect of 89.23 and 328.42 on the mean calculation.
Thanks in advance
You might notice that you really don't want the mean. The problem here is that the distribution you've assumed for the data is different from the actual data. If you try to fit a normal distribution to this data, you'll get bad results. You could instead fit a heavy-tailed distribution like the Cauchy. If you want to use a normal distribution, then you need to filter out the non-normal samples: if you feel you know what the standard deviation should be, you can remove everything more than, say, 3 standard deviations away from the mean (the number 3 would have to depend on the sample size). This process can be applied recursively, removing non-normal samples until you are happy with the size of the remaining outliers in terms of the standard deviation.
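To make the recursive filtering concrete, here is a minimal sketch. Note that for a sample as small as the one in the question, the cutoff has to be tighter than 3 (k = 2 below) for both large values to be discarded:

```java
import java.util.Arrays;

// Sketch of the recursive filtering described above: repeatedly drop
// samples more than k standard deviations from the mean until nothing
// more is removed, then report the mean of what is left.
public class SigmaClip {
    public static double robustMean(double[] data, double k) {
        double[] kept = data.clone();
        while (true) {
            double mean = Arrays.stream(kept).average().orElse(0);
            double sd = Math.sqrt(Arrays.stream(kept)
                    .map(v -> (v - mean) * (v - mean)).average().orElse(0));
            double[] next = Arrays.stream(kept)
                    .filter(v -> Math.abs(v - mean) <= k * sd).toArray();
            if (next.length == kept.length) return mean;
            kept = next;
        }
    }

    public static void main(String[] args) {
        double[] sample = {10.02, 11, 9.12, 7.89, 10.5, 11.3,
                           10.9, 12, 8.99, 89.23, 328.42};
        // With this tiny sample, k = 2 discards both 328.42 and 89.23,
        // giving roughly 10.19 instead of 46.3.
        System.out.println(robustMean(sample, 2));
    }
}
```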
Unfortunately the mean of a set of data is just that - the mean value. Are you sure the point is actually an outlier? Your data contains what appears to be a single outlier with regard to the clustering, but if you take a look at your plot, you will see that the data does seem to have a linear relationship, so is it truly an outlier?
If this reading is really causing you problems, you could remove it entirely. Other than that, the only thing I can suggest is to calculate some kind of weighted mean rather than the true mean: http://en.wikipedia.org/wiki/Weighted_mean . That way you can assign a lower weight to the point when calculating your mean (although how you choose a value for the weight is another matter). This is similar to weighted regression, where particular data points carry less weight in the regression fit (possibly due to the unreliability of certain points, for example): http://en.wikipedia.org/wiki/Linear_least_squares_(mathematics)#Weighted_linear_least_squares .
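For what it's worth, a weighted mean is just sum(w_i * x_i) / sum(w_i). A minimal sketch follows; the weights are arbitrary, chosen only to illustrate the mechanics, not a recommendation for how to pick them:

```java
// Minimal weighted-mean sketch: down-weight suspect points instead of
// deleting them outright.
public class WeightedMean {
    public static double weightedMean(double[] values, double[] weights) {
        double num = 0, den = 0;
        for (int i = 0; i < values.length; i++) {
            num += weights[i] * values[i];
            den += weights[i];
        }
        return num / den;
    }

    public static void main(String[] args) {
        double[] x = {10.02, 11, 9.12, 7.89, 10.5, 11.3,
                      10.9, 12, 8.99, 89.23, 328.42};
        double[] w = {1, 1, 1, 1, 1, 1, 1, 1, 1, 0.1, 0.01}; // arbitrary
        System.out.println(weightedMean(x, w)); // ~11.4, pulled toward the cluster
    }
}
```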
Hope this helps a little, or at least gives you some pointers to other avenues that you can try pursuing.

How do I generate random sample data in my Oracle database?

Does anyone know of a tool that can inspect a specified schema and generate random data based on the tables and columns of that schema?
Another alternative is the Swingbench Data Generator.
The SAMPLE clause can also be useful (for example, generating order lines for a random combination of orders and products).
This is an interesting question. It is easy enough to generate random values - a simple loop round the data dictionary with calls to DBMS_RANDOM would do the trick.
Except for two things.
One is, as @FrustratedWithForms points out, the complication of foreign key constraints. Let's tip lookup values (reference data) into the mix too.
The second is that random isn't very realistic. The main driver for using random data is the need for large volumes of data, probably for performance testing. But real datasets aren't random: they contain skews and clumps, variable string lengths, and of course patterns (especially where dates are concerned).
So, rather than trying to generate random data, I suggest you try to get hold of a real dataset. Ideally your user/customer will be able to provide one, preferably anonymized. Otherwise, try taking something which is already in the public domain and massaging it to fit your specific requirements. The Info Chimps are the top bananas when it comes to these matters. Check them out.
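That said, if a quick-and-dirty random loader is all you need, a minimal JDBC sketch might look like the following. The connection string, table, and columns are hypothetical, and it has exactly the weaknesses described above: no foreign-key awareness and no realistic skew:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Random;

// Sketch of the "loop with random values" idea, via JDBC. The
// connection string, table, and columns are hypothetical placeholders.
public class RandomRowLoader {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//localhost:1521/XEPDB1", "scott", "tiger");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO orders (order_id, amount, created) VALUES (?, ?, SYSDATE)")) {
            Random rnd = new Random();
            for (int i = 1; i <= 10_000; i++) {
                ps.setInt(1, i);
                ps.setDouble(2, rnd.nextDouble() * 1000); // random amount
                ps.addBatch();
                if (i % 1_000 == 0) ps.executeBatch(); // flush in batches
            }
        }
    }
}
```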
Allround Automation's PL/SQL Developer has a data generator tool. But be warned: it's a bit flaky - it seems to work fine on a single-table basis but gets tripped up when there are dependencies between tables.
I admit that eventually I just started writing my own SQL scripts to generate data. Turned out to be much more stable.
Have a look at Databene Benerator.
It's a bit complicated to do the initial setup but is quite powerful.
Bit of a wild card this one but thought I would mention it.
If you have data in a production environment that you can't use because it may contain sensitive information, Oracle have a product called "Oracle Data Masking" that will replace the sensitive information with realistic values.
I don't know the cost of this product but if you want more information, it can be found here.

The right way to manage a big matrix in Java

I'm working with a big matrix (not sparse); it contains about 10^10 doubles.
Of course I cannot keep it in memory, and I need just one row at a time.
I thought of splitting it into files, one row per file (which requires a lot of files), and just reading a file every time I need a row. Do you know of any more efficient way?
Why do you want to store it in different files? Can't you use a single file?
You could use the methods of the RandomAccessFile class to perform the reading from that file.
So, 800 KB per file sounds like a reasonable division. Nothing really stops you from using one giant file, of course. A matrix, at least one like yours that isn't sparse, can be considered a file of fixed-length records, making random access a trivial matter.
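For illustration, here is a sketch of that fixed-length-record access; the row length N and the row-major, big-endian layout of doubles are assumptions about your file:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

// Sketch: treat the matrix as a file of fixed-length records, one row
// per record, and seek directly to the row you need.
public class MatrixFile {
    static final int N = 100_000; // assumed: 10^5 x 10^5 = 10^10 doubles

    public static double[] readRow(RandomAccessFile file, long row) throws IOException {
        byte[] buf = new byte[N * Double.BYTES]; // 800 KB per row
        file.seek(row * N * Double.BYTES);       // jump straight to the record
        file.readFully(buf);
        double[] result = new double[N];
        ByteBuffer.wrap(buf).asDoubleBuffer().get(result);
        return result;
    }
}
```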
If you do store it one file per row, I might suggest making a directory tree corresponding to decimal digits, so 0/0/0/0 through 9/9/9/9.
Considerations one way or the other...
Is it being backed up? Do you have high-capacity backup media or something ordinary?
Does this file ever change?
If it does change and it is backed up, does it change all at once, or are changes localized?
It depends on the algorithms you want to execute, but I guess that in most cases a representation where each file contains some square or rectangular region would be better.
For example, matrix multiplication can be done recursively by breaking a matrix into submatrices.
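As an illustration of why square regions help, here is a minimal in-memory sketch of a blocked multiply; with region files, each step would only need the three tiles it touches:

```java
// Sketch of a blocked (tiled) matrix multiply: each step touches only
// one bs x bs tile of a, b, and c, so with region files only those
// three tiles would need to be resident at a time. c must start zeroed.
public class BlockMultiply {
    public static void multiply(double[][] a, double[][] b, double[][] c, int bs) {
        int n = a.length;
        for (int i0 = 0; i0 < n; i0 += bs)
            for (int k0 = 0; k0 < n; k0 += bs)
                for (int j0 = 0; j0 < n; j0 += bs)
                    for (int i = i0; i < Math.min(i0 + bs, n); i++)
                        for (int k = k0; k < Math.min(k0 + bs, n); k++)
                            for (int j = j0; j < Math.min(j0 + bs, n); j++)
                                c[i][j] += a[i][k] * b[k][j];
    }
}
```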
If you are going to be saving it in a file, I believe serializing it will save space/time over storing it as text.
Serializing the doubles stores each one in its 8-byte binary form (plus serialization overhead), and means that you will not have to convert the doubles back and forth to and from Strings when saving or loading the file.
I'd suggest using a disk-persistent cache like Ehcache. Just configure it to keep as many fragments of your matrix in memory as you like and it will take care of the serialization. All you have to do is decide on the way of fragmentation.
Another approach that comes to mind is Terracotta (which recently bought Ehcache, by the way). It's great for getting a large network-attached heap that can easily manage your 10^10 double values without you having to care about it in code at all.
