I have a few raw PCM audio files. I can successfully read a stream of bytes from these files and play them through an audio playing mechanism which accepts PCM data as input.
When I read data from these files, I store it in a byte[]. The tracks have the same size and are complementary in terms of sound (they sound good together). Therefore, I want to add several byte[] containing PCM data into a single byte[] of the same size, representing the final music.
I tried it in an easy thoughtless manner by simply doing it like this:
for(int i=0; i<finalbytes.length; i++)
{
finalbytes[i] = (byte) (music1bytes[i] + music2bytes[i]);
}
It actually wasn't that bad. The final sound is indeed an addition of both tracks. The problem is, when a few tracks are added, sometimes in specific parts of the song, peaks of static noise can be heard. It is probably due to the addition resulting in non-clamped values or something, which I don't know how to solve.
So, how to add two or more byte arrays of PCM data?
I'm assuming the samples of these raw audio files are 8 bit signed.
What's happening is overflow. If two samples add up to more than 127 or less than -128, the sum no longer fits in a byte, and the cast back to byte wraps it around to the wrong value.
You could divide each resulting sample by 2:
finalbytes[i] = (byte) ((music1bytes[i] + music2bytes[i]) / 2);
This way, even if each audio file has a maximum sample value, you will not get overflow. The disadvantage is that the resulting file might be a bit quiet.
Another option is to clip:
int sample = music1bytes[i] + music2bytes[i];
sample = Math.min(sample, Byte.MAX_VALUE);
sample = Math.max(sample, Byte.MIN_VALUE);
finalbytes[i] = (byte)sample;
If both audio sources are pretty loud then there might be a lot of clipping and it mightn't sound that great.
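Since the question asks about two or more tracks, here is a minimal sketch of the clipping approach generalized to any number of tracks. It assumes 8-bit signed samples and equal-length arrays; the class and method names (PcmMixer8, mix) are made up for illustration.

```java
final class PcmMixer8 {
    /** Mixes equal-length 8-bit signed PCM tracks, clamping each sum to the byte range. */
    static byte[] mix(byte[]... tracks) {
        int length = tracks[0].length;   // assumption: all tracks have this length
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            int sum = 0;                 // accumulate in an int so the sum cannot wrap
            for (byte[] track : tracks) {
                sum += track[i];
            }
            // Clamp instead of letting the cast to byte wrap around.
            sum = Math.max(Byte.MIN_VALUE, Math.min(Byte.MAX_VALUE, sum));
            out[i] = (byte) sum;
        }
        return out;
    }
}
```

Calling PcmMixer8.mix(music1bytes, music2bytes, music3bytes) would return the mixed track. Dividing the sum by the number of tracks before clamping is the quieter alternative described above.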
You could also try using SoftMixingMixer from JavaSound and let it do the mixing for you. Might actually be a lot more work this way since you'd need to define the audio format of the RAW audio files, but it likely will give the best sounding result. With this option, you'll need to use openStream(AudioFormat) with the audio format of the output file and tell the mixer to play your 2 RAW audio files through lines.
After searching for over 12 hours, I was unable to find anything regarding this. All I could find is how to use functions from the Sound API to measure and change the volume of the device, not the .wav file. It would be great if someone could tell us how to get and/or change the volume at specific timestamps of a .wav file itself. Thank you very much!
Even if it is not possible to change the audio of the .wav file itself, we need to know at least how to measure the volume level at the specific timestamps.
To deal with the amplitude of the sound signal, you will have to inspect the PCM data held in the .wav file. Unfortunately, the Java Clip does not expose the PCM values. Java makes the individual PCM data values available through the AudioInputStream class, but you have to read the data points sequentially. A code example is available at The Java Tutorials: Using Files and Format Converters.
Here's a block quote of the relevant portion of the page:
Suppose you're writing a sound-editing application that allows the
user to load sound data from a file, display a corresponding waveform
or spectrogram, edit the sound, play back the edited data, and save
the result in a new file. Or perhaps your program will read the data
stored in a file, apply some kind of signal processing (such as an
algorithm that slows the sound down without changing its pitch), and
then play the processed audio. In either case, you need to get access
to the data contained in the audio file. Assuming that your program
provides some means for the user to select or specify an input sound
file, reading that file's audio data involves three steps:
Get an AudioInputStream object from the file.
Create a byte array in which you'll store successive chunks of data from the file.
Repeatedly read bytes from the audio input stream into the array. On each iteration, do something useful with the bytes in the array
(for example, you might play them, filter them, analyze them, display
them, or write them to another file).
The following code snippet outlines these steps:
int totalFramesRead = 0;
File fileIn = new File(somePathName);
// somePathName is a pre-existing string whose value was
// based on a user selection.
try {
  AudioInputStream audioInputStream =
    AudioSystem.getAudioInputStream(fileIn);
  int bytesPerFrame =
    audioInputStream.getFormat().getFrameSize();
  if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
    // some audio formats may have unspecified frame size
    // in that case we may read any amount of bytes
    bytesPerFrame = 1;
  }
  // Set an arbitrary buffer size of 1024 frames.
  int numBytes = 1024 * bytesPerFrame;
  byte[] audioBytes = new byte[numBytes];
  try {
    int numBytesRead = 0;
    int numFramesRead = 0;
    // Try to read numBytes bytes from the file.
    while ((numBytesRead =
      audioInputStream.read(audioBytes)) != -1) {
      // Calculate the number of frames actually read.
      numFramesRead = numBytesRead / bytesPerFrame;
      totalFramesRead += numFramesRead;
      // Here, do something useful with the audio data that's
      // now in the audioBytes array...
    }
  } catch (Exception ex) {
    // Handle the error...
  }
} catch (Exception e) {
  // Handle the error...
}
END OF QUOTE
The values themselves will need another conversion step before they are PCM. If the file uses 16-bit encoding (most common), you will have to concatenate two bytes to make a single PCM value. With two bytes, the range of values is from -32768 to 32767 (2^16 distinct values).
It is very common to normalize these values to floats that range from -1 to 1. This is done by float division using 32767 or 32768 in the denominator. I'm not really sure which is more correct (or how much getting this exactly right matters). I just use 32768 to avoid getting a result less than -1 if the signal has any data points that hit the minimum possible value.
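As a rough sketch of those two steps (joining byte pairs, then normalizing), assuming 16-bit signed little-endian data such as a typical .wav file holds. The class and method names are hypothetical, and a big-endian file would need the two bytes swapped.

```java
final class PcmDecode16 {
    /** Converts 16-bit signed little-endian PCM bytes to floats in [-1, 1). */
    static float[] toFloats(byte[] audioBytes, int numBytesRead) {
        float[] samples = new float[numBytesRead / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = audioBytes[2 * i] & 0xFF;   // low byte, masked so it is not sign-extended
            int hi = audioBytes[2 * i + 1];      // high byte carries the sign
            short pcm = (short) ((hi << 8) | lo);
            samples[i] = pcm / 32768f;           // 32768 keeps the minimum value at exactly -1
        }
        return samples;
    }
}
```

This could be called inside the read loop from the tutorial as PcmDecode16.toFloats(audioBytes, numBytesRead). For stereo data the resulting samples alternate left, right, left, right.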
I'm not entirely clear on how to convert the PCM values to decibels. I think the formulas are out there for relative adjustments, such as, if you want to lower your volume by 6 dBs. Changing volumes is a matter of multiplying each PCM value by the desired factor that matches the volume change you wish to make.
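For reference, the usual convention for amplitude values is dB = 20 * log10(ratio), so a change of d decibels corresponds to multiplying every sample by 10^(d/20). A small sketch under that convention, working on normalized float samples (the names are hypothetical):

```java
final class PcmGain {
    /** Linear factor for a volume change given in decibels, e.g. -6 dB gives about 0.5. */
    static float dbToFactor(double db) {
        return (float) Math.pow(10.0, db / 20.0);
    }

    /** Level of an amplitude ratio in decibels, e.g. 0.5 gives about -6 dB. */
    static double factorToDb(double ratio) {
        return 20.0 * Math.log10(ratio);
    }

    /** Scales the samples in place by the given number of decibels. */
    static void applyGain(float[] samples, double db) {
        float factor = dbToFactor(db);
        for (int i = 0; i < samples.length; i++) {
            samples[i] *= factor;
        }
    }
}
```

Raising the volume this way can push samples outside -1 to 1, so the values may need clamping before they are converted back to bytes.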
As far as measuring the volume at a given point, since PCM signal values can range pretty widely as they zigzag back and forth across 0, a single value says little. The usual operation is to square many consecutive PCM values, take the average of the squares, and then take the square root of that average. This is referred to as the root mean square (RMS). The number of values to include in an RMS calculation can vary. I think the main consideration is to make the window large enough that it spans more than one period of the lowest frequency included in the signal.
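A minimal sketch of an RMS measurement over one window of normalized samples. The window start and length are parameters the caller chooses, and the names are made up for illustration.

```java
final class PcmRms {
    /** Root mean square of samples[start .. start + length - 1]. */
    static double rms(float[] samples, int start, int length) {
        if (length <= 0) {
            return 0.0;
        }
        double sumOfSquares = 0.0;
        for (int i = start; i < start + length; i++) {
            double v = samples[i];
            sumOfSquares += v * v;   // squaring removes the sign
        }
        return Math.sqrt(sumOfSquares / length);
    }
}
```

To measure the level at a timestamp t (in seconds) of a mono signal, the window would start at roughly t * sampleRate. At 44100 samples per second, a window of 2205 samples covers 50 ms, which is one full period of a 20 Hz tone.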
There are some good tutorials at the HackAudio site. This link is for the RMS calculation.
Story
While conducting an experiment I was saving a stream of random bytes generated by a hardware RNG device. After the experiment was finished, I realized that the saving method was incorrect. I hope to find a way to fix the corrupted file so that I can recover the correct stream of random numbers.
Example
The story of the problem can be explained in the following simple example.
Let's say I have a stream of random numbers in an input file randomInput.bin. I will simulate the stream of random numbers coming from the hardware RNG device by sending the input file to stdout via cat. I found two ways to save this stream to a file:
A) Harmless saving method
This method gives me exactly the original stream of random Bytes.
import scala.sys.process._
import java.io.File
val res = ("cat randomInput.bin" #> new File(outputFile))!
B) Saving method leading to corruption
Unfortunately, this is the original saving method I chose.
import scala.sys.process._
import java.io.PrintWriter
val randomBits = "cat randomInput.bin".!!
val out = new PrintWriter(outputFile)
out.println(randomBits)
if (out != null) {
out.close()
Seq("chmod", "600", outputFile).!
}
The file saved using method B) is still binary; however, it is approximately 2x larger than the file saved by method A). Further analysis shows that the stream of random bits is significantly less random.
Summary
I suspect that saving method B) adds something to almost every byte; however, understanding this is beyond my expertise in Java/Scala I/O.
I would very much appreciate it if somebody could explain to me the low-level difference between methods A) and B). The goal is to revert the changes created by saving method B) and obtain the original stream of random bytes.
Thank you very much in advance!
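The problem is probably that println is meant for text. The bytes are first decoded into a String (that is what .!! returns) and then encoded again when written, using a Unicode encoding such as UTF-8 or UTF-16, which uses multiple bytes for some or all characters, depending on the encoding.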
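To illustrate what such a text round trip can do to binary data, here is a small sketch in Java (it runs on the same JVM as the Scala code). It assumes the charset involved is UTF-8; the real code uses the platform default charset, which may differ, so treat this as an illustration and not a diagnosis. The names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class TextRoundTrip {
    /** Decodes raw bytes as UTF-8 text and re-encodes the text, as a text-oriented writer would. */
    static byte[] roundTrip(byte[] raw) {
        // Byte sequences that are not valid UTF-8 are replaced with U+FFFD while decoding.
        String asText = new String(raw, StandardCharsets.UTF_8);
        // U+FFFD is encoded as the three bytes EF BF BD, so the output grows.
        return asText.getBytes(StandardCharsets.UTF_8);
    }
}
```

Under that assumption, a byte such as 0x41 survives unchanged, while a lone 0xFF comes back as EF BF BD. Many different invalid inputs map to the same replacement bytes, so in this scenario the original values cannot be recovered from the output.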
If the file is exactly 2x larger than it should be, then you've probably got a null byte every other byte, which could be easy to fix. Otherwise, it may be harder to figure out what you would need to do to recover the binary data. Viewing the corrupted file in a hex editor may help you see what happened. Either way, I think it may be easier to just generate new random data and save it correctly.
Especially if this is for an experiment, if your random data has been corrupted and then fixed, it may be harder to justify that the data is truly random compared to just generating it properly in the first place.
I am trying to program an auralization via ray tracing in Processing. To edit a sample using the information from the ray tracer, I need to convert a .wav file (format: PCM signed, 16-bit, stereo, 2 bytes/frame, little-endian) to a float array.
I read the audio via an AudioInputStream and a DataInputStream, loading it into a byte array.
Then I convert the byte Array to a float array like this.
byte[] samples;
float[] audio_data = float(samples);
When I convert the float Array back to a .wav File, I'm getting the sound of the original Audio-File.
But when I add another float array to the original signal and convert the result back to a .wav file using the method above (even if I add the same signal to itself), I get white noise instead of the expected signal. I can hear the original signal underneath the noise, as if modulated, but very faintly.
I have read that the conversion from a float array to a byte array can cause problems like this: float is a 32-bit data type and byte (in Java) is only 8 bits, so the bytes somehow get mixed together incorrectly and white noise is the result. Processing has a signed 16-bit integer type (short), but with it I can no longer modify the amplitude, because for that I need float values, which I can't convert to short.
I also tried to handle the overflow in the float array by scaling the signal from 16-bit values (-32768 to 32767) to the range -1 to 1 and back again after mixing (adding) the signals. The result was white noise. When I added more than two signals, I got nothing at all (silence).
The concrete problem I want to solve is to add many signals (more than 1000, each with a suitable delay, to create a kind of reverberation) in the form of float arrays. I then want to combine them into one float array and save it as an audio file without white noise.
I hope you guys can help me.
If you have true PCM data points, there should be no problem using simple addition. The only issue is that on rare occasions (assuming your audio is not too hot to begin with) the values will go out of range. This will tend to create a harsh distortion, not white noise. The fact that you are getting white noise suggests to me that maybe you are not converting your PCM sums back to bytes correctly for the format that you are outputting.
Here is some code I use in AudioCue to convert PCM back to bytes. The format is assumed to be 16-bit, 44100 fps, stereo, little-endian. I'm working with PCM as normalized floats. This algorithm does the conversion for a buffer's worth of data at a time.
for (int i = 0, n = buffer.length; i < n; i++)
{
buffer[i] *= 32767;
audioBytes[i*2] = (byte) buffer[i];
audioBytes[i*2 + 1] = (byte)((int)buffer[i] >> 8 );
}
Sometimes, a function like Math.min(Math.max(audioval, -1), 1) or Math.min(Math.max(audioval, -32767), 32767) is used to keep the values in range. More sophisticated limiters or compressor algorithms will scale the volume to fit. But still, if this is not handled, the result should be distortion, not white noise.
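Putting the pieces together, a sketch of adding two normalized float signals, clamping the sum, and converting it to 16-bit little-endian bytes might look like the following. It assumes both arrays have the same length and hold values in the -1 to 1 range; the names are hypothetical and this is not the AudioCue code itself.

```java
final class PcmEncode16 {
    /** Adds two normalized signals, clamps to [-1, 1], and encodes as 16-bit little-endian PCM. */
    static byte[] mixToBytes(float[] a, float[] b) {
        byte[] audioBytes = new byte[a.length * 2];
        for (int i = 0; i < a.length; i++) {
            float sum = a[i] + b[i];
            sum = Math.min(Math.max(sum, -1f), 1f);    // keep the value in range
            int pcm = Math.round(sum * 32767f);        // scale to the 16-bit range
            audioBytes[2 * i] = (byte) pcm;            // low byte first (little-endian)
            audioBytes[2 * i + 1] = (byte) (pcm >> 8); // then the high byte
        }
        return audioBytes;
    }
}
```

If the output from code like this still sounds like white noise, the byte order or the sample size of the output format is the first thing worth checking.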
If the error is happening at another stage, we will need to see more of your code.
All this said, I wish you luck with the 1000-point echo array reverb. I hadn't heard of this approach working. Maybe there are processors that can handle the computational load now? (Are you trying to do this in real time?) My only success with coding real-time reverberation has been to use the Schroeder method, plugging in the structure and values from the CCRMA Freeverb, working off of code from Craig Lindley's now ancient (copyright 2001) book "Digital Audio with Java". Most of that book deals with obsolete GUI code (pre-Swing!), but the code he gives for AllPass and Comb filters is still valid.
I recall that when I was working on this I tracked down references to a better reverb to try and code, but I would have to do some real digging to find my notes. I was feeling in over my head at the time, as the algorithm was presented via block diagrams, not coding details or even pseudo-code. I would like to work on this again, though, and get a better reverb than the Schroeder type to work. The Schroeder was passable for sounds that were not too percussive.
Getting a solution for real-time ray tracing would be a valuable accomplishment. Many applications in AR/VR and games.
As the title says:
I am trying to read the file size, which is stored in 4 bytes of a bitmap file's header.
But if I store the data in a byte[] and any byte of the file exceeds 127, it is interpreted as a negative value.
How can values like these be corrected?
My book simply adds 256 to the value, but does that also work when the value is -128?
If not, the result still isn't correct.
I know I could just use an int[] or a larger type for this; I just want to know how to deal with this kind of problem.
Thanks a lot for helping!
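A short sketch of one common way to handle this. Masking with 0xFF gives the same result as adding 256 to a negative byte, and it works for every value from -128 (which becomes 128) to -1 (which becomes 255). The helper names are made up. The BMP format stores the file size as a little-endian value starting at offset 2, but the offset is left as a parameter here.

```java
final class UnsignedBytes {
    /** Interprets a Java byte (-128..127) as an unsigned value (0..255). */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    /** Reads 4 bytes starting at offset as an unsigned little-endian integer. */
    static long readUnsignedIntLE(byte[] data, int offset) {
        return (data[offset] & 0xFFL)
                | ((data[offset + 1] & 0xFFL) << 8)
                | ((data[offset + 2] & 0xFFL) << 16)
                | ((data[offset + 3] & 0xFFL) << 24);
    }
}
```

With the first bytes of a bitmap file in a byte[] named header (a hypothetical variable), the size would be read as UnsignedBytes.readUnsignedIntLE(header, 2).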
I've run into a bit of a problem when it comes to writing specific bits to a file. I apologise if this is a duplicate of anything but I could not find a reasonable answer with the searches I ran.
I have a number of difficulties with the following:
Writing a header (Long) bit by bit (converted to a byte array so the
FileOutputStream can utilise it) to the file.
Writing single bits to the file. For example, at one stage I am required to write a single bit set to 0 to the file so my initial thought would be to use a BitSet but Java seems to treat this as a null?
BitSet initialPadding = new BitSet();
initialPadding.set(0, false);
fileOutputStream.write(initialPadding.toByteArray());
1)
I create a FileOutputStream as shown below with the necessary file name:
FileOutputStream fileOutputStream = new FileOutputStream(file.getAbsolutePath());
I am attempting to create an ".amr" file so the first step before I perform any bit manipulation is to write a header to the beginning of the file. This has the following value:
Long defaultHeader = 0x2321414d520aL;
I've tried writing this to the file using the following method but I am pretty sure it does not write the correct result:
fileOutputStream.write(defaultHeader.byteValue());
Am I using the correct streams? Are my conversions completely wrong?
2)
I have a public BitSet fileBitSet; which holds the bits read in from a ".raw" input file. I need to be able to extract certain bits from the BitSet in order to write them to the file later. I do this using the following method:
public int getOctetPayloadHeader(int startPoint) {
int readLength = 0;
octetCMR = fileBitSet.get(0, 3);
octetRES = fileBitSet.get(4, 7);
if (octetRES.get(0, 3).isEmpty()) {
/* Keep constructing the payload header. */
octetFBit = fileBitSet.get(8, 8);
octetMode = fileBitSet.get(9, 12);
octetQuality = fileBitSet.get(13, 13);
octetPadding = fileBitSet.get(14, 15);
... }
What would be the best way to go for writing these bits to a file bearing in mind that I may be required to sometimes write a single bit or 81 bits at a particular offset in the fileBitSet ?
There is only one thing you can write to an OutputStream: bytes. You have to do the composing of your bits into bytes yourself; only you know the rules how the bits are to be put together into bytes.
As for stuff like:
Long defaultHeader = 0x2321414d520aL;
fileOutputStream.write(defaultHeader.byteValue());
You should take a close look at the Javadoc for the methods you are using. byteValue() returns a single byte (the lowest 8 bits of the long), so of course it's not doing what you expect. Working with streams is well explained in Oracle's tutorials: http://docs.oracle.com/javase/tutorial/essential/io/streams.html
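For the header specifically, one possible approach is to split the long into its individual bytes, most significant first, and write the resulting array. A sketch, with hypothetical names:

```java
final class HeaderBytes {
    /** Returns the lowest byteCount bytes of value, most significant byte first. */
    static byte[] lowBytesBigEndian(long value, int byteCount) {
        byte[] out = new byte[byteCount];
        for (int i = 0; i < byteCount; i++) {
            int shift = 8 * (byteCount - 1 - i);   // highest byte goes first
            out[i] = (byte) (value >>> shift);     // the cast keeps only the low 8 bits
        }
        return out;
    }
}
```

fileOutputStream.write(HeaderBytes.lowBytesBigEndian(0x2321414d520aL, 6)) would then write the six bytes 23 21 41 4d 52 0a, which are the ASCII characters "#!AMR" followed by a newline.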
For writing single bits or groups of bits, you will need a custom OutputStream that handles grouping the bits into bytes to be written. That's commonly called a BitStream (there is no such class in the JDK); you have to either write it yourself (which I highly recommend, it's a very good exercise to teach you about bits and bytes) or find one on the web.
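As a starting point, a minimal sketch of such a class could look like this. It assumes bits are packed most significant bit first and that the final partial byte is padded with zeros; whether that matches the format you are producing is something only its specification can tell you. The class name is made up.

```java
import java.io.IOException;
import java.io.OutputStream;

final class BitOutputStream implements AutoCloseable {
    private final OutputStream out;
    private int current;   // bits collected so far for the next byte
    private int count;     // how many bits are in current (0..7)

    BitOutputStream(OutputStream out) {
        this.out = out;
    }

    /** Appends one bit; a byte is written once 8 bits have been collected. */
    void writeBit(boolean bit) throws IOException {
        current = (current << 1) | (bit ? 1 : 0);
        if (++count == 8) {
            out.write(current);
            current = 0;
            count = 0;
        }
    }

    /** Appends the lowest n bits of value, highest of those bits first. */
    void writeBits(long value, int n) throws IOException {
        for (int i = n - 1; i >= 0; i--) {
            writeBit(((value >>> i) & 1L) == 1L);
        }
    }

    /** Pads the last byte with zero bits, then closes the underlying stream. */
    @Override
    public void close() throws IOException {
        while (count != 0) {
            writeBit(false);
        }
        out.close();
    }
}
```

Writing a single 0 bit is then writeBit(false), and a run of bits taken from a BitSet can be written by calling writeBit(fileBitSet.get(i)) for each index in the range.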