I have a piece of code, seen below, that I'm using to invert the pixel data of an image. The code works fine for the initial inversion (black becomes white, white becomes black, etc.). However, when I take the inverted image file and rerun it through this code to try to recover the original image, the values are nowhere near the original, and the image has random dark patches and very high contrast.
So tell me, what is a better way to get the inversion of the inversion? Does the below code need tweaking? Is there another way to go about this entirely?
Code:
// bitsStored is the bit depth. In this test, it is 10.
// imageBytes is the pixel data in a short array (a field of this class).
public static short[] invert(int bitsStored) {
    short min = min(imageBytes); // custom method; gets the minimum value in the array
    short range = (short) (2 << bitsStored);
    short[] holder = new short[imageBytes.length];
    for (int i = 0; i < imageBytes.length; i++) {
        holder[i] = (short) (range - imageBytes[i] - min);
    }
    imageBytes = holder;
    return imageBytes;
}
Note: The image I'm using has a 16-bit depth but only uses 10 bits for storage.
Let me know if there is any way I can make my question clearer. Thank you!
EDIT: I have an idea. Could this be happening because the min value changes between the first run and the second? I feel like, in concept, the inversion of an inversion should be the original. But in the math, the only number that stays the same between the two runs is the range value. So there has to be a better way to do this. I'll continue to think about it, but any insights you guys have on it would be much appreciated.
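For what it's worth, here is a minimal sketch of what a self-inverse (involutive) version might look like: reflecting each value around a fixed maximum derived only from the bit depth, with no data-dependent min term. The names invertSelfInverse and pixels are hypothetical, and it assumes the stored values occupy [0, 2^bitsStored - 1]:

// A minimal sketch (hypothetical names): reflecting around a fixed maximum
// makes the operation its own inverse, since maxVal - (maxVal - v) == v.
// Assumes values occupy [0, 2^bitsStored - 1].
public static short[] invertSelfInverse(short[] pixels, int bitsStored) {
    short maxVal = (short) ((1 << bitsStored) - 1); // 1023 for 10 bits
    short[] result = new short[pixels.length];
    for (int i = 0; i < pixels.length; i++) {
        result[i] = (short) (maxVal - pixels[i]);
    }
    return result;
}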
I am trying to program an auralization via ray tracing in Processing. To modify a sample using the information from the ray tracer, I need to convert a .wav file (format: PCM signed, 16-bit, stereo, 2 bytes/frame, little-endian) to a float array.
I converted the audio via an AudioInputStream and a DataInputStream, loading the audio into a byte array.
Then I convert the byte array to a float array like this:
byte[] samples;
float[] audio_data = float(samples);
When I convert the float array back to a .wav file, I get the sound of the original audio file.
But when I add another float array to the original signal and convert it back to a .wav file via the method above (even if I'm adding the same signal), I get a white noise signal instead of the wanted signal (I can hear the original signal under the white noise, but very, very quiet).
I have read about this problem before: the conversion from the float array back to a byte array can go wrong. float is a 32-bit datatype and byte (in Java) is only 8 bits, and somehow the bytes get mixed together incorrectly, so white noise is the result. Processing has a signed 16-bit integer datatype (named "short"), but then I can't modify the amplitude anymore, because for that I need float values, which I can't convert to short.
I also tried to handle the overflow (amplitude) in the float array by scaling the signal from 16-bit values (-32768/32767) to values from -1 to 1 and back again after mixing (adding) the signals. The result was white noise. When I added more than 2 signals, I got nothing (nothing to hear).
The concrete problem I want to solve is to add many signals (more than 1000, each with a suitable delay, to create a kind of reverberation) in the form of float arrays. Then I want to combine them into one float array that I can save as an audio file without white noise.
I hope you guys can help me.
If you have true PCM data points, there should be no problem using simple addition. The only issue is that on rare occasions (assuming your audio is not too hot to begin with) the values will go out of range. This will tend to create a harsh distortion, not white noise. The fact that you are getting white noise suggests to me that maybe you are not converting your PCM sums back to bytes correctly for the format that you are outputting.
Here is some code I use in AudioCue to convert PCM back to bytes. The format is assumed to be 16-bit, 44100 fps, stereo, little-endian. I'm working with PCM as normalized floats. This algorithm does the conversion for a buffer's worth of data at a time.
for (int i = 0, n = buffer.length; i < n; i++)
{
    // Scale the normalized float (in [-1, 1]) up to the 16-bit range.
    buffer[i] *= 32767;
    // Little-endian: low byte first, then high byte.
    audioBytes[i*2] = (byte) buffer[i];
    audioBytes[i*2 + 1] = (byte)((int)buffer[i] >> 8);
}
Sometimes, a function like Math.min(Math.max(audioval, -1), 1) or Math.min(Math.max(audioval, -32767), 32767) is used to keep the values in range. More sophisticated limiters or compressor algorithms will scale the volume to fit. But still, if this is not handled, the result should be distortion, not white noise.
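To illustrate the mixing step itself, here is a minimal sketch of summing two normalized float signals with the clamp applied per sample, before the byte conversion above. The helper name mixAndClamp is hypothetical, and it assumes both arrays are the same length and normalized to [-1, 1]:

// A minimal sketch (hypothetical helper): sum two normalized signals,
// clamping each sample to [-1, 1] so the byte conversion cannot wrap around.
public static float[] mixAndClamp(float[] a, float[] b) {
    float[] mix = new float[a.length];
    for (int i = 0; i < a.length; i++) {
        mix[i] = Math.min(Math.max(a[i] + b[i], -1f), 1f);
    }
    return mix;
}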
If the error is happening at another stage, we will need to see more of your code.
All this said, I wish you luck with the 1000-point echo array reverb. I hadn't heard of this approach working. Maybe there are processors that can handle the computational load now? (Are you trying to do this in real time?) My only success with coding real-time reverberation has been to use the Schroeder method, plugging in the structure and values from the CCRMA Freeverb, working off of code from Craig Lindley's now ancient (copyright 2001) book "Digital Audio with Java". Most of that book deals with obsolete GUI code (pre-Swing!), but the code he gives for AllPass and Comb filters is still valid.
I recall that when I was working on this, I tracked down references to a better reverb to try to code, but I would have to do some real digging to find my notes. I was feeling over my head at the time, as the algorithm was presented via block diagrams, not coding details or even pseudocode. I would like to work on this again, though, and get a better reverb than the Schroeder type working. The Schroeder was passable for sounds that were not too percussive.
Getting a solution for real-time ray tracing would be a valuable accomplishment. It has many applications in AR/VR and games.
I have a couple of huge images which can't be loaded into memory whole. I know that the images are tiled, and all of the following methods in the ImageReader class give me plausible nonzero return values:
getTileGridXOffset(int),
getTileGridYOffset(int),
getTileWidth(int) and
getTileHeight(int).
My problem now is that I want to read just one tile, to avoid having to load the entire image into memory, using the ImageReader.readTile(int, int, int) method. But how do I determine what the valid values for the tile coordinates are?
There are the methods getNumXTiles() and getNumYTiles() in the RenderedImage interface, but all attempts to create a rendered image from the source result in an out-of-memory/Java heap space error.
The tile coordinates can theoretically be anything, and I tried readTile(0, -1, -1), which also works for a few images I tested.
I also tried to read the metadata for those images, but I didn't find any useful information regarding the image layout.
Is there anyone who can tell me how to get the values for the tile coordinates without having to read the entire image into memory? Is there another way which does not require an instance of ImageLayout?
Thank you very much for your assistance.
First of all, you should check that the ImageReader in question supports tiling for the given image, using isImageTiled(imageIndex). If it doesn't, you can't expect useful values from the other methods.
If it does, then all tiles for a given image must be equal in size (but the last tile in each row/column may be truncated). This is the case for all tiled file formats that I know of (e.g. TIFF). Using this knowledge, the number of tiles in both dimensions can be calculated:
// Calculate number of x tiles/y tiles:
int cols = (int) Math.ceil(reader.getWidth(imageIndex) / (double) reader.getTileWidth(imageIndex));
int rows = (int) Math.ceil(reader.getHeight(imageIndex) / (double) reader.getTileHeight(imageIndex));
You can then loop over the tile indexes (the first tile is always 0, 0):
for (int row = 0; row < rows; row++) {
    for (int col = 0; col < cols; col++) {
        BufferedImage tile = reader.readTile(imageIndex, col, row);
        // ...do more processing...
    }
}
Or, if you only want to get a single tile, you obviously don't need the double for loops. :-)
Note: For ImageReaders/images that don't support tiling, the getTileWidth and getTileHeight methods will just return the same as getWidth and getHeight, respectively.
Also, the readTile API docs say:
If the arguments are out of range, an IllegalArgumentException is thrown. If the image is not tiled, the values 0, 0 will return the entire image; any other values will cause an IllegalArgumentException to be thrown.
This means your example, readTile(0, -1, -1), should always throw an IllegalArgumentException regardless of the tiling... I suspect some implementations may disregard the tile coordinates completely and give you the entire image anyway.
PS: The RenderedImage interface could in theory help you, but it would require a special implementation in the ImageReader. In most cases you will just get a normal BufferedImage (which implements RenderedImage) and is a single (1x1) tile.
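Putting the pieces together, a minimal sketch of reading a single tile might look like this, assuming a file-based source with a registered ImageReader; the helper name readOneTile is hypothetical:

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// A minimal sketch (hypothetical helper): open a reader, verify tiling
// support, and read one tile without decoding the whole image.
public static BufferedImage readOneTile(File file, int imageIndex, int col, int row) throws Exception {
    try (ImageInputStream input = ImageIO.createImageInputStream(file)) {
        Iterator<ImageReader> readers = ImageIO.getImageReaders(input);
        if (!readers.hasNext()) {
            throw new IllegalArgumentException("No ImageReader for: " + file);
        }
        ImageReader reader = readers.next();
        try {
            reader.setInput(input);
            if (!reader.isImageTiled(imageIndex)) {
                throw new IllegalStateException("Image is not tiled");
            }
            return reader.readTile(imageIndex, col, row);
        } finally {
            reader.dispose();
        }
    }
}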
I have to prepare a training set for my machine learning course, in which, for a given face image, it gives you an answer representing the direction the head is facing (straight, left, right, up).
For this purpose I need to read a .pgm image file in Java and store its pixels in one row of a matrix X, and then store the appropriate right answer for this image in a y vector. Finally I will save these two arrays in a .mat file.
The problem is that when I try to read the pixel values from a (P2 .pgm) image and print them to the console, they don't match the values shown in the MATLAB matrix viewer. What could be the problem?
This is my code:
try {
    InputStream f = Main.class.getResourceAsStream("an2i_left_angry_open.pgm");
    BufferedReader d = new BufferedReader(new InputStreamReader(f));
    String magic = d.readLine(); // first line contains P2 or P5
    String line = d.readLine(); // second line contains width and height
    while (line.startsWith("#")) { // ignoring comment lines
        line = d.readLine();
    }
    Scanner s = new Scanner(line);
    int width = s.nextInt();
    int height = s.nextInt();
    line = d.readLine(); // third line contains maxVal
    s = new Scanner(line);
    int maxVal = s.nextInt();
    for (int i = 0; i < 30; i++) /* printing first 30 values from the image including spaces */
        System.out.println((byte) d.read());
} catch (EOFException eof) {
    eof.printStackTrace(System.out);
}
These are the values I get:
50
49
32
50
32
49
32
48
32
50
32
49
56
32
53
57
while the MATLAB viewer shows what is actually in the image:
(sorry, I can't post images because of my lack of reputation)
and this is what you find when you open the .pgm file in Notepad++
Take a look at this post in particular. I've experienced similar issues with imread and with Java's ImageIO class, and for the longest time I could not find this link as proof that other people have experienced the same thing... until now. Similarly, someone experienced related issues in this post, but it isn't quite the same as what you're experiencing.
Essentially, the reason the images loaded in Java and MATLAB are different is enhancement. MATLAB scales the intensities so the image isn't mostly black. Specifically, the maximum intensity in your PGM gets scaled to 255, while the other intensities are linearly scaled to suit the dynamic range of [0,255]. So, for example, if your image had a dynamic range of [0-100] in your PGM file before loading it in with imread, this would get scaled to [0-255], not kept at the original scale of [0-100]. As such, you would have to know the maximum intensity value of the image before you loaded it in (by scanning through the file yourself). That is very easily done by reading the third line of the file. In your case, this would be 156. Once you find this, you would need to rescale every value in your image to what it originally was before you read it in.
To confirm that this is the case, take a look at the first pixel in your image, which has intensity 21 in the original PGM file. MATLAB would thus scale the intensities such that:
scaled = round(val*(255/156));
val would be the input intensity and scaled is the output intensity. As such, if val = 21, then scaled would be:
scaled = round(21*(255/156)) = 34
This matches up with the first pixel when reading it out in MATLAB. Similarly, for the sixth pixel in the first row, the original value is 18. MATLAB would scale it such that:
scaled = round(18*(255/156)) = 29
This again matches up with what you see in MATLAB. Starting to see the pattern now? Basically, to undo the scaling, you would need to multiply by the reciprocal of the scaling factor. As such, given that A is the image you loaded in, you need to do:
A_scaled = uint8(double(A)*(max_value/255));
A_scaled is the output image and max_value is the maximum intensity found in your PGM file before you loaded it in with imread. This undoes the scaling, as MATLAB scales images to [0-255]. Note that I need to cast the image to double first and do the multiplication with the scaling factor, as this will most likely produce floating point values, then re-cast back to uint8. Therefore, to bring it back to [0-max_value], you have to scale in the opposite direction.
Specifically in your case, you would need to do:
A_scaled = uint8(double(A)*(156/255));
The disadvantage here is that you need to know what the maximum value is prior to working with your image, which can get annoying. One possibility is to use MATLAB and actually open up the file with file pointers and get the value of the third line yourself. This is also an annoying step, but I have an alternative for you.
Alternative... probably better for you
Alternatively, here are two links to functions written in MATLAB that read and write PGM files without doing that unnecessary scaling, and it'll provide the results that you are expecting (unscaled).
Reading: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_read.m
Writing: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_write.m
How the read function works is that it opens up the image using file pointers and manually parses the data, storing the values into a matrix. You probably want to use this function instead of relying on imread. To save images, file pointers are again used and the values are written so that the PGM standard is maintained; again, your intensities are unscaled.
Your Java implementation is printing the ASCII values of the text bytes "21 2 1" etc.:
50 -> '2'
49 -> '1'
32 -> SPACE
50 -> '2'
32 -> SPACE
49 -> '1'
etc.
Some PGM files use a text header but a binary representation for the pixels themselves; these are marked with a different magic string (P5 rather than P2) at the beginning. It looks like the Java code is reading the file as if it had binary pixels.
Instead, your PGM file has ASCII-coded pixels, where you want to scan a whitespace-separated value for each pixel. You do this the same way you read the width and height.
The debug code might look like this:
line = d.readLine(); // first image line
s = new Scanner(line);
for (int i = 0; i < 30; i++) // printing the first 30 pixel values
    System.out.println(s.nextInt()); // no byte cast: values above 127 would overflow
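For completeness, here is a minimal sketch of parsing an entire P2 (ASCII) PGM into a matrix with a single Scanner; readAsciiPgm is a hypothetical name, and it assumes the header contains no '#' comment lines (a robust parser would strip those before tokenizing):

import java.io.InputStream;
import java.util.Scanner;

// A minimal sketch (hypothetical helper): parse a P2 (ASCII) PGM into a
// matrix. Assumes no '#' comment lines in the header.
public static int[][] readAsciiPgm(InputStream in) {
    Scanner s = new Scanner(in);
    String magic = s.next();               // "P2" for ASCII PGM
    if (!"P2".equals(magic)) {
        throw new IllegalArgumentException("Not an ASCII PGM: " + magic);
    }
    int width = s.nextInt();
    int height = s.nextInt();
    int maxVal = s.nextInt();              // e.g. 156 in the file above
    int[][] pixels = new int[height][width];
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            pixels[y][x] = s.nextInt();    // whitespace-separated values
        }
    }
    return pixels;
}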
I have a (pretty simple) piece of code I've thrown together which generates a sine wave of a specific frequency and plays it - it works, no problem:
public class Sine {

    private static final int SAMPLE_RATE = 16 * 1024;
    private static final int FREQ = 500;

    public static void main(String[] args) throws LineUnavailableException {
        final AudioFormat af = new AudioFormat(SAMPLE_RATE, 8, 1, true, true);
        try (SourceDataLine line = AudioSystem.getSourceDataLine(af)) {
            line.open(af, SAMPLE_RATE);
            line.start();
            play(line);
            line.drain();
        }
    }

    private static void play(SourceDataLine line) {
        byte[] arr = getData();
        line.write(arr, 0, arr.length);
    }

    private static byte[] getData() {
        final int LENGTH = SAMPLE_RATE * 100;
        final byte[] arr = new byte[LENGTH];
        for (int i = 0; i < arr.length; i++) {
            double angle = (2.0 * Math.PI * i) / (SAMPLE_RATE / FREQ);
            arr[i] = (byte) (Math.sin(angle) * 127);
        }
        return arr;
    }
}
I can also modify the getData() method to return a byte array that produces a gradual change in pitch as it plays, no problems there.
However, I'm struggling with a way to continuously play a sine wave whose frequency and amplitude I can smoothly update "live" - i.e. having FREQ in the above example changed by another thread and having the sound update in real time. I've tried creating the byte array and then filling it later in a separate thread based on the required values, but I seem to get either nothing or distortion. I've also tried writing to the SourceDataLine in chunks, but this produces "blocks" of discrete frequencies rather than the smooth transition I'm after. A search around doesn't seem to turn up much beyond what I've already tried.
It's for an emulation of a theremin, so it ideally needs to be as smooth and low-latency as possible.
I can do it ahead of time no problem - but live is proving tricky. Has anyone any ideas or examples they could share?
I wrote a Java theremin, and it can be played at this url:
http://www.hexara.com/VSL/JTheremin.htm
On that site, there are two links to the Java Gaming forum where there was some discussion on the various issues involved.
I use a wavetable, rather than a sin function, to generate the PCM data, but the method of changing the variable that is fed into the sin function can be set up in a similar manner.
The easiest thing to do is to have a volatile float or double in the base class that is consulted in the innermost while loop where the sound bytes are being created. Your GUI can update this variable, and the while loop can base the pitch calculation on this.
Consulting the pitch variable once per buffer load will not be satisfactory, so the next logical step is to have your while loop check this variable with every frame you process! Yes, that means referring to the pitch variable 44100 times per second, if that is your frame rate.
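As a rough illustration, a minimal sketch of such a loop might look like this (hypothetical names; accumulating a running phase, rather than recomputing the angle from the sample index, is what keeps the waveform continuous when the frequency changes):

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

// A minimal sketch (hypothetical names): a synthesis loop that reads a
// volatile pitch variable once per frame. Accumulating phase keeps the
// wave continuous across frequency changes.
public class LiveSine implements Runnable {
    private static final int SAMPLE_RATE = 44100;
    private volatile double frequency = 440.0; // updated by the GUI thread

    public void setFrequency(double freq) { this.frequency = freq; }

    @Override
    public void run() {
        AudioFormat af = new AudioFormat(SAMPLE_RATE, 8, 1, true, true);
        byte[] buffer = new byte[1024];
        double phase = 0;
        try (SourceDataLine line = AudioSystem.getSourceDataLine(af)) {
            line.open(af);
            line.start();
            while (true) {
                for (int i = 0; i < buffer.length; i++) {
                    phase += 2.0 * Math.PI * frequency / SAMPLE_RATE; // per-frame read
                    if (phase > 2.0 * Math.PI) {
                        phase -= 2.0 * Math.PI; // keep the phase bounded
                    }
                    buffer[i] = (byte) (Math.sin(phase) * 127);
                }
                line.write(buffer, 0, buffer.length);
            }
        } catch (LineUnavailableException e) {
            e.printStackTrace();
        }
    }
}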
But even so, the problem remains that response is limited by the manner in which the JVM time slices threads. When the sound thread is not actively looping, it is also not reading the new values that have been placed into the "pitch" variable! Recall that while the sound thread is quite able to keep the frame rate constant, it is not doing so in "real time," but in bursts of activity. Thus the GUI may overwrite the pitch value several times during the period when the sound processing thread is sleeping, resulting in pitch discontinuities.
To get around this, I made a FIFO where I store and timestamp all the GUI-generated pitch changing events. In the innermost sound processing loop, this FIFO is consulted (instead of the volatile double mentioned earlier) to determine the pitch value to be used, on a per-sample basis. Since the pitch values from a GUI will be discrete values and come at varying times, you need a method of interpolating pitch values to fill the gaps. I use the time stamps and values to calculate a per-frame interpolation, and thus update a pitch variable in the innermost loop every sample.
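To sketch just the interpolation step under the same idea (hypothetical names; a real implementation would derive the glide length from the FIFO timestamps):

// A minimal sketch (hypothetical names): per-frame linear interpolation
// from the current pitch toward a target pitch taken from a GUI event,
// spreading the change over a number of frames to avoid discontinuities.
class PitchGlide {
    private double pitch = 440.0;      // current per-frame pitch
    private double pitchIncrement = 0;
    private int framesRemaining = 0;   // frames left to reach the target

    // Called when a pitch event is dequeued from the timestamped FIFO.
    void onPitchEvent(double targetPitch, int glideFrames) {
        framesRemaining = glideFrames;
        pitchIncrement = (targetPitch - pitch) / glideFrames;
    }

    // Called once per audio frame, inside the innermost synthesis loop.
    double nextFramePitch() {
        if (framesRemaining > 0) {
            pitch += pitchIncrement;
            framesRemaining--;
        }
        return pitch;
    }
}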
I think there are a lot of issues, still, with the solution I wrote, and am looking forward to revisiting this!
It looks like you are only reading from the data array once, so regardless of whether the data is modified, only one pitch will be produced. I would think you would need to play a shorter wave inside a loop that rereads the data array each iteration. I don't know how the SourceDataLine class functions, though, so I don't know whether this would produce the sound unsegmented.
I've started differentiating two images by counting the number of different pixels using a simple algorithm:
private int returnCountOfDifferentPixels(String pic1, String pic2) {
    Bitmap i1 = loadBitmap(pic1);
    Bitmap i2 = loadBitmap(pic2);
    int count = 0;

    for (int y = 0; y < i1.getHeight(); ++y) {
        for (int x = 0; x < i1.getWidth(); ++x) {
            if (i1.getPixel(x, y) != i2.getPixel(x, y)) {
                count++;
            }
        }
    }
    return count;
}
However, this approach seems inefficient in its initial form, as there is always a very high number of pixels which differ, even in very similar photos.
I was thinking of a way to determine if two pixels are really THAT different.
Bitmap.getPixel(x, y) on Android returns the color as an int.
How can I implement a proper comparison between two colors to help with my motion detection?
You are right, because of noise and other factors there is usually a lot of raw pixel change in a video stream. Here are some options you might want to consider:
Blurring the image first, ideally with a Gaussian filter or with a simple box filter. This just means that you take the (weighted) average over the neighboring pixels and the pixel itself. This should reduce the sensor noise quite a bit already (a sketch of a simple box filter follows below).
Only adding the difference to count if it's larger than some threshold. This has the effect of only considering pixels that have really changed a lot. This is very easy to implement and might already solve your problem alone.
Thinking about it, try these two options first. If they don't work out, I can give you some more options.
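For Option 1, a minimal box filter sketch might look like this (hypothetical helper; it averages the 3x3 neighborhood using the green channel as a cheap stand-in for intensity, and copies border pixels unchanged for brevity):

import android.graphics.Bitmap;
import android.graphics.Color;

// A minimal sketch (hypothetical helper): 3x3 box blur over the green
// channel, used here as a cheap stand-in for intensity. Border pixels
// are copied unchanged for brevity.
static int[][] boxBlurGreen(Bitmap bmp) {
    int w = bmp.getWidth(), h = bmp.getHeight();
    int[][] out = new int[h][w];
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (x == 0 || y == 0 || x == w - 1 || y == h - 1) {
                out[y][x] = Color.green(bmp.getPixel(x, y));
                continue;
            }
            int sum = 0;
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    sum += Color.green(bmp.getPixel(x + dx, y + dy));
                }
            }
            out[y][x] = sum / 9; // average over the 3x3 neighborhood
        }
    }
    return out;
}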
EDIT: I just saw that you're not actually summing up differences but just counting different pixels. This is fine if you combine it with Option 2. Option 1 still works, but it might be overkill.
Also, to find out the difference between two colors, use the methods of the Color class:
int p1 = i1.getPixel(x, y);
int p2 = i2.getPixel(x, y);
int totalDiff = Color.red(p1) - Color.red(p2) + Color.green(p1) - Color.green(p2) + Color.blue(p1) - Color.blue(p2);
Now you can come up with a threshold the totalDiff must exceed to contribute to count.
Of course, you can play around with these numbers in various ways. The above code, for example, only computes changes in pixel intensity (brightness). If you also wanted to take into account changes in hue and saturation, you would have to compute totalDiff like this:
int totalDiff = Math.abs(Color.red(p1) - Color.red(p2)) + Math.abs(Color.green(p1) - Color.green(p2)) + Math.abs(Color.blue(p1) - Color.blue(p2));
Also, have a look at the other methods of Color, for example RGBToHSV(...).
I know that this is essentially very similar to another answer here, but I think that by restating it in a different form it might prove useful to those seeking a solution. This involves having more than two images over time. If you literally only have two, this will not work, but an equivalent method will.
Keep a history for all pixels, updated on each frame. For example, for each pixel:
history[x, y] = (history[x, y] * (w - 1) + get_pixel(x, y)) / w
Here w might be 20. The higher w is, the larger the spike for motion, but the longer motion has to be absent before the history resets.
Then to determine if something has changed you can do this for each pixel:
changed_delta = abs(history[x, y] - get_pixel(x, y))
total_delta += changed_delta
You will find that this stabilizes most of the noise, and when motion happens you will get a large difference. You are essentially taking many frames and detecting the motion of the newest frame against the many.
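A minimal Java sketch of this running average, with hypothetical names (the green channel stands in for grayscale intensity for brevity):

import android.graphics.Bitmap;
import android.graphics.Color;

// A minimal sketch (hypothetical names): an exponential running average
// per pixel, compared against the newest frame to score motion.
class MotionHistory {
    private final float[][] history;
    private final int w; // averaging weight, e.g. 20

    MotionHistory(int width, int height, int w) {
        this.history = new float[height][width];
        this.w = w;
    }

    // Returns the total absolute difference between the frame and the
    // history, then folds the frame into the history.
    float update(Bitmap frame) {
        float totalDelta = 0;
        for (int y = 0; y < frame.getHeight(); y++) {
            for (int x = 0; x < frame.getWidth(); x++) {
                int value = Color.green(frame.getPixel(x, y)); // stand-in intensity
                totalDelta += Math.abs(history[y][x] - value);
                history[y][x] = (history[y][x] * (w - 1) + value) / w;
            }
        }
        return totalDelta;
    }
}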
Also, for detecting positions of motion consider breaking the image into smaller pieces and doing them individually. Then you can find objects and track them across the screen by treating a single image as a grid of separate images.