I have to prepare a training set for my Machine Learning course, in which, for a given face image, the answer represents the direction the head is facing (straight, left, right, up).
For this purpose I need to read a .pgm image file in Java and store its pixels in one row of a matrix X, then store the appropriate answer for that image in a vector y. Finally I will save these two arrays in a .mat file.
The problem is that when I read the pixel values from a (P2 .pgm) image and print them to the console, they don't match the values shown in the MATLAB matrix viewer. What could be the problem?
This is my code:
try {
    InputStream f = Main.class.getResourceAsStream("an2i_left_angry_open.pgm");
    BufferedReader d = new BufferedReader(new InputStreamReader(f));
    String magic = d.readLine(); // first line contains P2 or P5
    String line = d.readLine(); // second line contains width and height
    while (line.startsWith("#")) { // ignoring comment lines
        line = d.readLine();
    }
    Scanner s = new Scanner(line);
    int width = s.nextInt();
    int height = s.nextInt();
    line = d.readLine(); // third line contains maxVal
    s = new Scanner(line);
    int maxVal = s.nextInt();
    for (int i = 0; i < 30; i++) // printing the first 30 values from the image, including spaces
        System.out.println((byte) d.read());
} catch (EOFException eof) {
    eof.printStackTrace(System.out);
}
These are the values I get:
50
49
32
50
32
49
32
48
32
50
32
49
56
32
53
57
while this is what the image actually contains according to the MATLAB viewer:
(sorry, I can't post images because of lack of reputation)
and this is what you find when you open the .pgm file in Notepad++:
Take a look at this post in particular. I've experienced similar issues with imread and with Java's ImageIO class, and for the longest time I could not find this link as proof that other people have experienced the same thing... until now. Similarly, someone experienced related issues in this post, but it isn't quite the same as what you're experiencing.
Essentially, the reason the images loaded in Java and MATLAB differ comes down to display enhancement. MATLAB scales the intensities so the image isn't mostly black: the maximum intensity in your PGM gets mapped to 255, and the other intensities are linearly scaled to fill the dynamic range [0,255]. For example, if your PGM file had a dynamic range of [0,100] before loading it with imread, it would get scaled to [0,255] rather than kept at the original [0,100]. To undo this, you have to know the maximum intensity value of the image before you loaded it in. That is easily done by reading the maxval line (the third line) of the file yourself; in your case, this is 156. Once you have it, you can rescale every value in the loaded image back to its original range.
To confirm that this is the case, take a look at the first pixel in your image, which has intensity 21 in the original PGM file. MATLAB would thus scale the intensities such that:
scaled = round(val*(255/156));
val would be the input intensity and scaled is the output intensity. As such, if val = 21, then scaled would be:
scaled = round(21*(255/156)) = 34
This matches up with the first pixel when reading it out in MATLAB. Similarly, the sixth pixel in the first row, the original value is 18. MATLAB would scale it such that:
scaled = round(18*(255/156)) = 29
This again matches up with what you see in MATLAB. Starting to see the pattern now? Basically, to undo the scaling, you would need to multiply by the reciprocal of the scaling factor. As such, given that A is the image you loaded in, you need to do:
A_scaled = uint8(double(A)*(max_value/255));
A_scaled is the output image and max_value is the maximum intensity found in your PGM file before you loaded it in with imread. This undoes the scaling, since MATLAB scaled the image to [0,255]. Note that the image has to be cast to double first, because multiplying by the scaling factor will most likely produce floating point values, and then re-cast back to uint8. This brings the image back to its original range of [0,max_value].
Specifically in your case, you would need to do:
A_scaled = uint8(double(A)*(156/255));
The disadvantage here is that you need to know what the maximum value is prior to working with your image, which can get annoying. One possibility is to use MATLAB and actually open up the file with file pointers and get the value of the third line yourself. This is also an annoying step, but I have an alternative for you.
Alternative... probably better for you
Alternatively, here are two links to functions written in MATLAB that read and write PGM files without doing that unnecessary scaling, and it'll provide the results that you are expecting (unscaled).
Reading: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_read.m
Writing: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_write.m
How the read function works is that it opens up the image using file pointers and manually parses in the data and stores the values into a matrix. You probably want to use this function instead of relying on imread. To save the images, file pointers are again used and the values are written such that the PGM standard is maintained and again, your intensities are unscaled.
Your Java implementation is printing the numeric ASCII codes of the text characters "21 2 1" etc.:
50 -> '2'
49 -> '1'
32 -> SPACE
50 -> '2'
32 -> SPACE
49 -> '1'
etc.
Some PGM files use a text header, but binary representation for the pixels themselves. These are marked with a different magic string at the beginning. It looks like the java code is reading the file as if it had binary pixels.
Instead, your PGM file has ASCII-coded pixels, where you want to scan a whitespace-separated value for each pixel. You do this the same way you read the width and height.
The debug code might look like this:
line = d.readLine(); // first image line
s = new Scanner(line);
for (int i = 0; i < 30; i++) // printing the first 30 pixel values from the image
    System.out.println(s.nextInt()); // print as int: values above 127 would overflow a byte
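A fuller sketch of this approach (class name and the tiny in-memory sample are my own; using one Scanner over the whole stream also handles pixel values that continue across several lines, though '#' comment lines are not handled here):

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

public class P2Reader {
    // Parses an ASCII ("P2") PGM into a flat int array.
    // Sketch only: assumes a well-formed header with no '#' comment lines.
    public static int[] readP2(Reader in) {
        Scanner s = new Scanner(in);
        String magic = s.next(); // "P2"
        if (!magic.equals("P2"))
            throw new IllegalArgumentException("not an ASCII PGM");
        int width = s.nextInt();
        int height = s.nextInt();
        int maxVal = s.nextInt(); // not used here, but part of the header
        int[] pixels = new int[width * height];
        for (int i = 0; i < pixels.length; i++)
            pixels[i] = s.nextInt(); // ints, not bytes: values may exceed 127
        return pixels;
    }

    public static void main(String[] args) {
        // Tiny in-memory example instead of a real file
        String pgm = "P2\n2 2\n156\n21 2\n1 0\n";
        int[] px = readP2(new StringReader(pgm));
        System.out.println(px[0] + " " + px[1] + " " + px[2] + " " + px[3]);
        // prints "21 2 1 0"
    }
}
```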
Related
After searching for over 12 hours, I was unable to find anything regarding this. All I could find is how to use functions from the Sound API to measure and change the volume of the device, not of the .wav file. It would be great if someone could tell us how to get and/or change the volume at specific timestamps of a .wav file itself; thank you very much!
Even if it is not possible to change the audio of the .wav file itself, we need to know at least how to measure the volume level at specific timestamps.
To deal with the amplitude of the sound signal, you will have to inspect the PCM data held in the .wav file. Unfortunately, the Java Clip does not expose the PCM values. Java makes the individual PCM data values available through the AudioInputStream class, but you have to read the data points sequentially. A code example is available at The Java Tutorials: Using Files and Format Converters.
Here's a block quote of the relevant portion of the page:
Suppose you're writing a sound-editing application that allows the
user to load sound data from a file, display a corresponding waveform
or spectrogram, edit the sound, play back the edited data, and save
the result in a new file. Or perhaps your program will read the data
stored in a file, apply some kind of signal processing (such as an
algorithm that slows the sound down without changing its pitch), and
then play the processed audio. In either case, you need to get access
to the data contained in the audio file. Assuming that your program
provides some means for the user to select or specify an input sound
file, reading that file's audio data involves three steps:
Get an AudioInputStream object from the file.
Create a byte array in which you'll store successive chunks of data from the file.
Repeatedly read bytes from the audio input stream into the array. On each iteration, do something useful with the bytes in the array
(for example, you might play them, filter them, analyze them, display
them, or write them to another file).
The following code snippet outlines these steps:
int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
    AudioInputStream audioInputStream =
        AudioSystem.getAudioInputStream(fileIn);
    int bytesPerFrame =
        audioInputStream.getFormat().getFrameSize();
    if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
        // some audio formats may have unspecified frame size
        // in that case we may read any amount of bytes
        bytesPerFrame = 1;
    }
    // Set an arbitrary buffer size of 1024 frames.
    int numBytes = 1024 * bytesPerFrame;
    byte[] audioBytes = new byte[numBytes];
    try {
        int numBytesRead = 0;
        int numFramesRead = 0;
        // Try to read numBytes bytes from the file.
        while ((numBytesRead =
                audioInputStream.read(audioBytes)) != -1) {
            // Calculate the number of frames actually read.
            numFramesRead = numBytesRead / bytesPerFrame;
            totalFramesRead += numFramesRead;
            // Here, do something useful with the audio data that's
            // now in the audioBytes array...
        }
    } catch (Exception ex) {
        // Handle the error...
    }
} catch (Exception e) {
    // Handle the error...
}
END OF QUOTE
The values themselves will need another conversion step before they are PCM. If the file uses 16-bit encoding (the most common case), you will have to concatenate two bytes to make a single PCM value. With two bytes, the range of values is from -32768 to 32767 (a range of 2^16).
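A minimal sketch of that byte-pair concatenation for little-endian data (class and method names are my own; masking the low byte is the step that's easy to get wrong):

```java
public class PcmDecode {
    // Combine two little-endian bytes into one signed 16-bit PCM sample.
    // The low byte must be masked with 0xFF before OR-ing; otherwise
    // Java's sign extension corrupts the result.
    public static int toSample(byte lo, byte hi) {
        return (hi << 8) | (lo & 0xFF);
    }

    // Optional normalization to the float range [-1, 1].
    public static float normalize(int sample) {
        return sample / 32768f;
    }

    public static void main(String[] args) {
        System.out.println(toSample((byte) 0xFF, (byte) 0x7F)); // prints 32767
        System.out.println(toSample((byte) 0x00, (byte) 0x80)); // prints -32768
    }
}
```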
It is very common to normalize these values to floats that range from -1 to 1. This is done by float division using 32767 or 32768 in the denominator. I'm not really sure which is more correct (or how much getting this exactly right matters). I just use 32768 to avoid getting a result less than -1 if the signal has any data points that hit the minimum possible value.
I'm not entirely clear on how to convert the PCM values to decibels. I think the formulas are out there for relative adjustments, such as, if you want to lower your volume by 6 dBs. Changing volumes is a matter of multiplying each PCM value by the desired factor that matches the volume change you wish to make.
As far as measuring the volume at a given point, since PCM signal values zig-zag back and forth across zero, the usual approach is to compute a root mean square (RMS) over a window of values: square each sample, average the squares, and take the square root. The number of values to include in an RMS calculation can vary. The main consideration is to make the rolling window large enough to cover at least one period of the lowest frequency present in the signal.
There are some good tutorials at the HackAudio site. This link is for the RMS calculation.
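The RMS calculation described above fits in a few lines (the window size is an assumption you tune to your lowest frequency):

```java
public class Rms {
    // Root mean square of a window of normalized PCM samples:
    // square each sample, average the squares, take the square root.
    public static double rms(float[] window) {
        double sumOfSquares = 0;
        for (float v : window)
            sumOfSquares += (double) v * v;
        return Math.sqrt(sumOfSquares / window.length);
    }

    public static void main(String[] args) {
        // A square wave of amplitude 0.5 has an RMS of exactly 0.5.
        float[] window = {0.5f, -0.5f, 0.5f, -0.5f};
        System.out.println(rms(window)); // prints 0.5
    }
}
```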
I have a full red image I made using MS Paint (red = 255, blue = 0, green = 0)
I read that image into a File object file
Then I extracted the bytes using Files.readAllBytes(file.toPath()) into a byte array byteArray
Now my expectation is that :
a) byteArray[0], when converted to bitstream, should be all 1
b) byteArray[1], when converted to bitstream, should be all 0
c) byteArray[2], when converted to bitstream, should be all 0
because, as I understand, the pixels values are stored in the order RGB with 8 bits for each color.
When I run my code, I don't get the expected outcome. byteArray[0] is all 1s alright, but the other two aren't all 0s.
Where am I going wrong?
Edit
As requested, I'm including image size, saved format and code used to read it.
Size = 1920 x 1080 pixels
Format = JPG
Code:
File file = new File("image_path.jpg");
byte[] byteArray = new byte[(int) file.length()];
try {
    byteArray = Files.readAllBytes(file.toPath());
} catch (IOException e) {
    e.printStackTrace();
}

int[] bits = new int[8];
for (int j = 0; j < 8; j++) {
    bits[j] = (byteArray[0] & (1 << j)) == 0 ? 0 : 1;
    //System.out.println("bits: " + bits[j]);
}
Update
Unfortunately I am unable to make use of other questions containing ImageIO library functions. I'm here partly trying to understand how the image itself is stored, and how I can write my own logic for retrieving and manipulating the image files.
JPEG is a complex image format.
It does not hold the raw image pixel data, but instead has a header, optional metadata and compressed image data.
The algorithm to decompress it to raw pixel values is quite complex, but there are libraries that will do the work for you.
Here is a short tutorial:
https://docs.oracle.com/javase/tutorial/2d/images/loadimage.html
Here is the documentation of the BufferedImage class which will hold the image data:
https://docs.oracle.com/javase/7/docs/api/java/awt/image/BufferedImage.html
You will need to use one of the getRGB functions to access the raw pixel data.
Make sure to check that your image is in 24 bit color format, if you want each color component to take 1 byte exactly!
JPEG supports other formats such as 32 and 16 bits!
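A minimal sketch of the getRGB route (built on an in-memory image rather than a decoded JPEG, so the expected values are exact; with a real JPEG, lossy compression means the components can be slightly off):

```java
import java.awt.image.BufferedImage;

public class PixelCheck {
    // Unpack a packed RGB int (as returned by BufferedImage.getRGB)
    // into its 8-bit red, green and blue components.
    public static int[] components(int rgb) {
        return new int[]{(rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF};
    }

    public static void main(String[] args) {
        // Build a 1x1 pure-red image in memory; getRGB gives decoded pixel
        // values regardless of how the image was encoded on disk.
        BufferedImage img = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xFF0000); // red = 255, green = 0, blue = 0
        int[] c = components(img.getRGB(0, 0));
        System.out.println(c[0] + " " + c[1] + " " + c[2]); // prints "255 0 0"
    }
}
```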
Alternatively, save your image as 24 bit uncompressed BMP.
The file will be much larger, but reading it is much simpler so you don't have to use a library.
Just skip the header, then read raw bytes.
An even simpler image format to work with would be PBM/PPM.
I am trying to program an auralization via ray tracing in Processing. To edit a sample using the information from the ray tracer, I need to convert a .wav file (file format: PCM-signed, 16-bit, stereo, 2 bytes/frame, little-endian) to a float array.
I read the audio via an AudioInputStream and a DataInputStream, loading it into a byte array.
Then I convert the byte array to a float array like this:
byte[] samples;
float[] audio_data = float(samples);
When I convert the float array back to a .wav file, I get the sound of the original audio file.
But when I add another float array to the original signal and convert the result back to a .wav file via the method above (even if I add the same signal), I get white noise instead of the wanted signal (I can hear the original signal under the white noise, but very, very quietly).
I read about this problem before: there can be problems in the conversion from the float array back to a byte array. float is a 32-bit datatype while a Java byte is only 8 bits, and if the bytes get combined wrongly, white noise is the result. In Processing there is a datatype for signed 16-bit integers (named "short"), but then I can't modify the amplitude anymore, because for that I need float values, which I can't convert to short.
I also tried to handle the overflow (amplitude) in the float array by scaling the signal from 16-bit values (-32768..32767) to the range -1..1 and back again after mixing (adding) the signals. The result was still white noise, and when I added more than 2 signals I got nothing at all (nothing to hear).
The concrete problem I want to solve is to add many signals (more than 1000, each with a suitable delay, to create a kind of reverberation) in the form of float arrays, then combine them into one float array that I can save as an audio file without white noise.
I hope you guys can help me.
If you have true PCM data points, there should be no problem using simple addition. The only issue is that on rare occasions (assuming your audio is not too hot to begin with) the values will go out of range. This will tend to create a harsh distortion, not white noise. The fact that you are getting white noise suggests to me that maybe you are not converting your PCM sums back to bytes correctly for the format that you are outputting.
Here is some code I use in AudioCue to convert PCM back to bytes. The format is assumed to be 16-bit, 44100 fps, stereo, little-endian. I'm working with PCM as normalized floats. This algorithm does the conversion for a buffer's worth of data at a time.
for (int i = 0, n = buffer.length; i < n; i++)
{
buffer[i] *= 32767;
audioBytes[i*2] = (byte) buffer[i];
audioBytes[i*2 + 1] = (byte)((int)buffer[i] >> 8 );
}
Sometimes, a function like Math.min(Math.max(audioval, -1), 1) or Math.min(Math.max(audioval, -32767), 32767) is used to keep the values in range. More sophisticated limiters or compressor algorithms will scale the volume to fit. But still, if this is not handled, the result should be distortion, not white noise.
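For the mixing step itself, a sketch of summing two normalized signals with the clamping just mentioned (plain clamping, not a compressor; class name is my own):

```java
public class Mixer {
    // Sum two normalized float signals sample by sample and clamp
    // the result to [-1, 1] so the later float-to-byte conversion
    // cannot overflow.
    public static float[] mix(float[] a, float[] b) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++)
            out[i] = Math.min(Math.max(a[i] + b[i], -1f), 1f);
        return out;
    }

    public static void main(String[] args) {
        float[] a = {0.6f, -0.9f, 0.25f};
        float[] b = {0.6f, -0.9f, 0.25f};
        float[] m = mix(a, b);
        System.out.println(m[0] + " " + m[1] + " " + m[2]);
        // prints "1.0 -1.0 0.5" -- the first two sums were clamped
    }
}
```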
If the error is happening at another stage, we will need to see more of your code.
All this said, I wish you luck with the 1000-point echo array reverb. I hadn't heard of this approach working. Maybe there are processors that can handle the computational load now? (Are you trying to do this in real time?) My only success with coding real-time reverberation has been to use the Schroeder method, plugging in the structure and values from the CCRMA Freeverb, working off of code from Craig Lindley's now ancient (copyright 2001) book "Digital Audio with Java". Most of that book deals with obsolete GUI code (pre-Swing!), but the code he gives for all-pass and comb filters is still valid.
I recall that when I was working on this I tracked down references to a better reverb to try to code, but I would have to do some real digging to find my notes. I was feeling over my head at the time, as the algorithm was presented via block diagrams, not coding details or even pseudo-code. I would like to work on this again, though, and get a better reverb than the Schroeder type working. The Schroeder was passable for sounds that were not too percussive.
Getting a solution for real-time ray tracing would be a valuable accomplishment. Many applications in AR/VR and games.
Try as I might, I cannot figure out how to solve this problem. I apologize for its cheesiness; I'm a student learning Java and was given this as practice.
Lixnor is a mutant space trader in the Andromeda IV galaxy. He is low on
supplies and funds and has trouble paying for the fuel that his ship
requires. Every SGW (Standard Galactic Week) his partner, Ronxil gives him
the coordinates for 2 valuable lost crates floating in space. The
coordinates are sent in a file named "Coordinates.txt". The files always
have 3 lines, each line containing a coordinate in the format (x,y,z) where
x,y,z are integers. Due to Lixnor's lack of funds, he must first calculate
whether it would be worth it for him to go pick them up. The first
coordinates given are Lixnor's current coordinates, and the next two are the
coordinates of the two crates. Lixnor must pick them up, then return to his
original location. Write a program for Lixnor that calculates the distance
he must travel in order to pick up the crates and return to his original
position.
(24,-34,46)
(1,2,3)
(123,-1,0)
Specifically, I'm having trouble getting Java to read the file. Any help would be fantastic!
Reading a file is simple
Get the file instance using File file = new File(String fileAddress);
Now use Scanner scanner = new Scanner(file); to read the file.
Proceed and show us the work for further help.
I agree with @ImGeorge that before doing this work you should get some basic knowledge about file reading in Java.
At SO we cannot help without code.
Look into the javadoc of Files.readAllLines.
Path path = Paths.get(".... .txt");
List<String> lines = Files.readAllLines(path); // Using UTF-8
This is one of the possibilities. The processing is up to you.
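Putting the pieces together, here is a hedged sketch of the whole exercise (the class name, the parsing strategy, and the visiting order home -> crate 1 -> crate 2 -> home are my assumptions, not part of the assignment text):

```java
import java.util.Arrays;
import java.util.List;

public class Lixnor {
    // Parse one "(x,y,z)" line into its three integer coordinates.
    static int[] parse(String line) {
        String[] parts = line.replaceAll("[()\\s]", "").split(",");
        return new int[]{Integer.parseInt(parts[0]),
                         Integer.parseInt(parts[1]),
                         Integer.parseInt(parts[2])};
    }

    // Euclidean distance between two 3D points.
    static double dist(int[] p, int[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Total travel: home -> crate 1 -> crate 2 -> back home.
    public static double roundTrip(List<String> lines) {
        int[] home = parse(lines.get(0));
        int[] c1 = parse(lines.get(1));
        int[] c2 = parse(lines.get(2));
        return dist(home, c1) + dist(c1, c2) + dist(c2, home);
    }

    public static void main(String[] args) {
        // In practice, read the three lines with
        // Files.readAllLines(Paths.get("Coordinates.txt")).
        List<String> lines = Arrays.asList("(24,-34,46)", "(1,2,3)", "(123,-1,0)");
        System.out.printf("Distance: %.2f%n", roundTrip(lines));
    }
}
```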
I have a bunch of images, too many to do by hand, that are in 16-color 8-bit PNG format and that I need in 16-color 4-bit format; they all have the same palette.
I am scouring Google for the best library to use, but I am not finding much on this specific problem so I am coming here for hopefully some more targeted solutions.
I am trying to use PIL based on other answers I have found here, but not having any luck.
img = Image.open('DownArrow_focused.png')
img = img.point(lambda i: i * 16, "L")
img.save('DownArrow_focused.png', 'PNG')
but this gives me a grayscale image, not what I want.
PIL won't work, so I'm trying PyPNG. GIMP can do this, but I have hundreds of these things and need to batch process them, and new batches keep coming, so it isn't a one-time task.
A Java based solution would be acceptable as well, pretty much anything I can run from the command line on a Linux/OSX machine will be acceptable.
In PNG the palette is always stored as RGB8 (3 bytes for each index=color), with an arbitrary number of entries (up to 256). If you currently have an 8-bit image with a 16-color palette (16 total entries), you don't need to alter the palette, only to repack the pixel bytes (two indices per byte). If so, I think you could do it with PNGJ with this code (untested):
public static void reencode(String orig, String dest) {
PngReader png1 = FileHelper.createPngReader(new File(orig));
ImageInfo pnginfo1 = png1.imgInfo;
ImageInfo pnginfo2 = new ImageInfo(pnginfo1.cols, pnginfo1.rows, 4, false,false,true);
PngWriter png2 = FileHelper.createPngWriter(new File(dest), pnginfo2, false);
png2.copyChunksFirst(png1, ChunksToWrite.COPY_ALL);
ImageLine l2 = new ImageLine(pnginfo2);
for (int row = 0; row < pnginfo1.rows; row++) {
ImageLine l1 = png1.readRow(row);
l2.tf_pack(l1.scanline, false);
l2.setRown(row);
png2.writeRow(l2);
}
png1.end();
png2.copyChunksLast(png1, ChunksToWrite.COPY_ALL);
png2.end();
System.out.println("Done");
}
On the other hand, if your current palette has 16 "used" colors but its length is greater because it includes unused colors, you need to do some extra work modifying the palette chunk (but that can also be done).
Call Netpbm programs
http://netpbm.sourceforge.net/
from a Python script using the following commands:
$ pngtopnm test.png | pnmquant 16 | pnmtopng > test16.png
$ file test16.png
test16.png: PNG image data, 700 x 303, 4-bit colormap, non-interlaced
And GIMP reports test16.png as having Color space: Indexed color (16 colors),
which I guess is what you want.
This is not a pure Python solution, but PIL is not pure Python either and depends on shared libraries too. I think you cannot avoid a dependency on some external image software.