Tone generator generates a second tone - java

I am trying to make a simple signal generator for isochronic pulsating sounds.
Basically, a "beatFrequency" controls the amplitude variation of the main (pitch) frequency.
It works pretty well, except that for some pitch frequencies above 4-5 kHz, a second tone with a lower frequency is generated.
It doesn't happen for all frequencies, but for quite a few I can definitely hear a second tone.
What can this be? Some kind of resonance? I tried increasing the sampling rate, but it doesn't change anything, and 44100 should be enough up to around 20 kHz, if I understand correctly?
I really can't figure it out on my own, so I'm thankful for any help!
Here is example code with beatFrequency 1 Hz, pitch frequency 5000 Hz, and sample rate 44100.
public void playSound() {
    double beatFreq = 1;
    double pitch = 5000;
    int mSampleRate = 44100;
    AudioTrack mAudioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, mSampleRate,
            AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT,
            256, AudioTrack.MODE_STREAM);
    int loopLength = 2 * mSampleRate;
    short[] mBuffer = new short[loopLength];
    while (isPlaying) {
        double[] mSound = new double[loopLength];
        for (int i = 0; i < loopLength; i = i + 1) {
            mSound[i] = beatFreq * Math.sin((1.0 * Math.PI * i / (mSampleRate / pitch)));
            mBuffer[i] = (short) (mSound[i] * Short.MAX_VALUE);
        }
        mAudioTrack.play();
        mAudioTrack.write(mBuffer, 0, loopLength);
    }
}
Here is (added) an image of the frequencies when I play the tone 4734 Hz. For example, there is a rather large peak at around 1100 Hz, as well as many higher ones.
The code now uses just the pitch; I have removed the beat frequency:

In your code, you are using beatFreq*Math.sin((1.0*Math.PI * i/(mSampleRate/pitch))) to determine the frequency (some sort of assignment is missing here).
However, if mSampleRate and pitch are both int values, mSampleRate/pitch is an integer division instead of a double division. For particular combinations of pitch and sample rate, this results in lower frequencies than intended, and the effect gets worse for higher pitch values.
Try using double instead of int; that should get rid of the problem.
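To see the effect, here is a minimal sketch (not the OP's actual code) comparing the divisor as computed with int operands against the true ratio:

```java
public class DivisionDemo {
    // Divisor as computed when both operands are int:
    // the integer division happens before widening to double, so the fraction is truncated.
    static double intDivisor(int sampleRate, int pitch) {
        return sampleRate / pitch;
    }

    // Divisor with double operands: the true ratio survives.
    static double doubleDivisor(double sampleRate, double pitch) {
        return sampleRate / pitch;
    }

    public static void main(String[] args) {
        System.out.println(intDivisor(44100, 5000));    // 8.0 (8.82 truncated to 8)
        System.out.println(doubleDivisor(44100, 5000)); // 8.82
    }
}
```

With the divisor truncated from 8.82 to 8, the sine's period changes, so the generated tone lands on a different frequency than the one requested, which is consistent with hearing unexpected tones at higher pitches.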

Rewriting this answer after better understanding the question: the OP wants to do amplitude modulation.
Yes, Java can do amplitude modulation. I've made a passable cricket sound, for example, by taking a 4.9 kHz tone, modulating its volume at 66 Hz, and giving the resulting tone an AR envelope.
In your code, the variable beatFreq remains constant over an entire for-loop. Isn't your intention to vary it over the course of time as well?
I think you should simultaneously compute the beatFreq wave value in its own function (also using the varying i), and multiply that result (scaled to range from 0 to 1) against the value computed for the faster tone.
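As a sketch of that idea (all names and constants here are illustrative, not from the OP's project), the modulator can be computed per sample and rescaled to the range 0 to 1 before multiplying:

```java
public class BeatTone {
    // Render numSamples of a carrier sine amplitude-modulated by a slow beat sine.
    static short[] render(double beatFreq, double pitch, int sampleRate, int numSamples) {
        short[] buffer = new short[numSamples];
        for (int i = 0; i < numSamples; i++) {
            double t = (double) i / sampleRate;                   // time in seconds
            double carrier = Math.sin(2.0 * Math.PI * pitch * t); // the audible tone
            // Modulator rescaled from [-1, 1] to [0, 1] so the volume pulses at beatFreq
            double envelope = 0.5 * (1.0 + Math.sin(2.0 * Math.PI * beatFreq * t));
            buffer[i] = (short) (carrier * envelope * Short.MAX_VALUE);
        }
        return buffer;
    }
}
```

Note that both oscillators share the same running time variable, so the envelope varies continuously across the buffer instead of staying constant for a whole loop.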
**EDIT**
To move redundant calculations out of the inner loop the following is a possibility:
Have the following as instance variables:
private double incr;
private double incrSum;
private final double TWO_PI = Math.PI * 2;
Have the following calculation done only one time, in your constructor:
incr = pitch / audioFmt.getSampleRate();
incr *= TWO_PI;
This assumes pitch is a value in Hertz, and `audioFmt` is the Java AudioFormat being used. With Android, I don't know how the audio format's sample rate is stored or accessed.
With this in place you can have a method that returns the next double PCM value with some very simple code:
private double getNextSinePCM()
{
    incrSum += incr;
    if (incrSum > TWO_PI)
    {
        incrSum -= TWO_PI;
    }
    return Math.sin(incrSum);
}
Note: do not reset incrSum to zero as you stated in your comment. This can introduce a discontinuity.
If you have a second tone, it gets its own increment and running sum. With the two results in hand, you can then multiply them to get amplitude modulation.
Now, as to the question as how to properly convert the PCM double value returned to something Android can use, I can't give you a definitive answer as I am not running Android.
Just a comment: it seems to me you are enthusiastic about working with sound, but perhaps self-taught or lagging a bit in basic programming techniques. Moving redundant calculations outside of a loop is fundamental, as is the ability to make simple test cases for testing assumptions and troubleshooting. As you go forward, I want to encourage you to dedicate some time to developing these fundamentals as well as pursuing your interest in sound! You might check out the StackOverflow code reading group as a resource for more tips. I am also self-taught and have learned a lot from there, as well as from the code-reading course at JavaRanch called "CattleDrive".

Related

How to find the length of a short array to fill a video with audio using Xuggler?

I'm trying to add audio to a video, where I need a single short array representing the audio. I don't know how to get the length of this array.
I've found an estimate of 91 shorts per millisecond, but I don't know how to get an exact value instead of guessing and checking.
Here's the relevant code:
IMediaWriter writer = ToolFactory.makeWriter(file.getAbsolutePath());
writer.addVideoStream(0, 0, IRational.make(fps, 1), animation.getWidth(), animation.getHeight());
writer.addAudioStream(1, 0, 2, 44100);
...
int scale = 1000 / 11; // TODO
short[] audio = new short[animation.getLength() * scale];
animation.getLength() is the length of the video in milliseconds
What's the formula for calculating the scale variable?
The reason a list of shorts is needed is since this is an animation library that supports adding lots of sounds into the outputted video. Thus, I loop through all the requested sounds, turn the sounds into short lists, and then add their values to the correct spot in the main audio list. Not using a short list would make it so I can't stack several sounds on top of each other and make the timing more difficult.
The scale is what's known as the audio sampling rate, normally measured in Hertz (Hz), which corresponds to samples per second.
Assuming each element in your array is a single audio sample, you can estimate the array size by multiplying the audio sampling rate by the animation duration in seconds.
For example, if your audio sampling rate is 48,000 Hz:
int audioSampleRate = 48000;
double samplesPerMillisecond = (double) audioSampleRate / 1000;
int estimatedArrayLength = (int) (animation.getLength() * samplesPerMillisecond);
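For what it's worth, the ~91 shorts-per-millisecond estimate is close to what a stereo stream at 44100 Hz works out to (2 channels times 44.1 samples per millisecond is 88.2), assuming the short array interleaves both channels. Under that assumption the exact length can be computed directly; `shortsNeeded` is a hypothetical helper, not part of Xuggler:

```java
public class AudioLength {
    // Exact number of shorts for an interleaved PCM buffer:
    // sampleRate (Hz) * channels * durationMillis / 1000
    static int shortsNeeded(int sampleRate, int channels, long durationMillis) {
        return (int) (sampleRate * channels * durationMillis / 1000);
    }

    public static void main(String[] args) {
        // A 1-second stereo clip at 44100 Hz needs 88200 shorts.
        System.out.println(shortsNeeded(44100, 2, 1000));
    }
}
```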

What could Error in this java program to compute sine?

I have written this code to compute the sine of an angle. It works fine for smaller angles, say up to +-360. But with larger angles it starts giving faulty results (when I say larger, I mean something in the range +-720 or +-1080).
In order to get more accurate results, I increased the number of times my loop runs. That gave me better results, but it too had its limitations.
So I was wondering if there is any fault in my logic, or do I need to fiddle with the conditional part of my loop? How can I overcome this shortcoming of my code? The built-in Java sine function gives correct results for all the angles I have tested, so where am I going wrong?
Also, can anyone give me an idea as to how I could modify the condition of my loop so that it runs until I get a desired decimal precision?
import java.util.Scanner;

class SineFunctionManual
{
    public static void main(String a[])
    {
        System.out.print("Enter the angle for which you want to compute sine : ");
        Scanner input = new Scanner(System.in);
        int degreeAngle = input.nextInt(); // Angle in degrees
        input.close();
        double radianAngle = Math.toRadians(degreeAngle); // Sine computation is done in radians
        System.out.println(radianAngle);
        // sineOfAngle accumulates the result; prevVal holds the next term to be added
        double sineOfAngle = radianAngle, prevVal = radianAngle;
        //double fractionalPart = 0.1; // Used to check the answer to a certain number of decimal places
        for (int i = 3; i <= 20; i += 2)
        {
            // x^3/3! can be written as ((x^2)/(3*2))*(x/1!); similarly x^5/5! is
            // ((x^2)/(5*4))*((x^3)/3!) and so on. Successive terms alternate in sign.
            prevVal = (-prevVal) * ((radianAngle * radianAngle) / (i * (i - 1)));
            sineOfAngle += prevVal;
            //int iPart = (int) sineOfAngle;
            //fractionalPart = sineOfAngle - iPart; // Extract the fractional part to check decimal places
        }
        System.out.println("The value of sin of " + degreeAngle + " is : " + sineOfAngle);
    }
}
The polynomial approximation for sine diverges widely for large positive and large negative values. Remember, sine varies from -1 to 1 over all real numbers. Polynomials, on the other hand, particularly ones with higher orders, can't do that.
I would recommend using the periodicity of sine to your advantage.
int degreeAngle = input.nextInt() % 360;
This will give accurate answers, even for very, very large angles, without requiring an absurd number of terms.
The further you get from x=0, the more terms of the Taylor expansion for sin x you need to get within a particular accuracy of the correct answer. You're stopping at the x^19/19! term, which is fine for small angles. If you want better accuracy for large angles, you'll just need to add more terms.
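A quick sketch of the range-reduction fix (hypothetical helper names; the series is the same one the OP's loop computes, up to the x^19/19! term):

```java
public class TaylorSine {
    // Same truncated Taylor series as the OP's loop: x - x^3/3! + ... - x^19/19!
    static double taylorSin(double x) {
        double term = x, sum = x;
        for (int i = 3; i <= 20; i += 2) {
            term = -term * x * x / (i * (i - 1));
            sum += term;
        }
        return sum;
    }

    // Reduce the angle modulo 360 degrees before evaluating the series,
    // exploiting the periodicity of sine.
    static double sinDegrees(int degrees) {
        return taylorSin(Math.toRadians(degrees % 360));
    }
}
```

Without the reduction, taylorSin(Math.toRadians(1080)) is off by millions, because the truncated series is evaluated far from 0; with the reduction, 1080 maps to 0 and the result is exact.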

How to convert pcm samples in byte array as floating point numbers in the range -1.0 to 1.0 and back?

The resampling algorithm I use expects a float array containing input samples in the range -1.0 to 1.0. The audio data is 16-bit PCM with a sample rate of 22 kHz.
I want to downsample the audio from 22 kHz to 8 kHz. How do I represent the samples in the byte array as floating-point numbers >= -1 and <= 1, and convert back to a byte array?
You ask two questions:
How to downsample from 22kHz to 8kHz?
How to convert from float [-1,1] to 16-bit int and back?
Note that the question has been updated to indicate that #1 is taken care of elsewhere, but I'll leave that part of my answer in in case it helps someone else.
1. How to downsample from 22kHz to 8kHz?
A commenter hinted that this can be solved with the FFT. This is incorrect: one step in resampling is filtering, and I explain why not to use the FFT for filtering here, in case you are interested: http://blog.bjornroche.com/2012/08/when-to-not-use-fft.html
One very good way to resample a signal is with a polyphase filter. However, this is quite complex, even for someone experienced in signal processing. You have several other options:
use a library that implements high quality resampling, like libsamplerate
do something quick and dirty
It sounds like you have already gone with the first approach, which is great.
A quick and dirty solution won't sound as good, but since you are going down to 8 kHz, I'm guessing sound quality isn't your first priority. One quick and dirty option is to:
Apply a low pass filter to the signal. Try to get rid of as much audio above 4 kHz as you can. You can use the filters described here (although ideally you want something much steeper than those filters, they are at least better than nothing).
Select every 2.75th sample from the original signal to produce the new, resampled signal. When you need a non-integer sample position, use linear interpolation. If you need help with linear interpolation, try here.
This technique should be more than good enough for voice applications. However, I haven't tried it, so I don't know for sure, so I strongly recommend using someone else's library.
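A sketch of the second step under those assumptions (illustrative names; the low-pass filter from the first step is deliberately omitted here, so this alone will alias):

```java
public class NaiveResampler {
    // Quick-and-dirty downsample: pick samples at fractional positions with
    // linear interpolation. A real implementation must low-pass filter the
    // input first to remove content above the new Nyquist frequency.
    static float[] downsample(float[] in, double srcRate, double dstRate) {
        int outLen = (int) (in.length * dstRate / srcRate);
        float[] out = new float[outLen];
        double step = srcRate / dstRate; // e.g. 22050/8000 = 2.75625 input samples per output sample
        for (int i = 0; i < outLen; i++) {
            double pos = i * step;
            int i0 = (int) pos;
            int i1 = Math.min(i0 + 1, in.length - 1);
            double frac = pos - i0;
            out[i] = (float) ((1.0 - frac) * in[i0] + frac * in[i1]);
        }
        return out;
    }
}
```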
If you really want to implement your own high quality sample rate conversion, such as a polyphase filter, you should research it, and then ask whatever questions you have on https://dsp.stackexchange.com/, not here.
2. How to convert from float [-1,1] to 16-bit int and back?
This was started by c.fogelklou already, but let me embellish.
To start with, the range of 16-bit integers is -32768 to 32767 (16-bit audio is usually signed). To convert from int to float, you do this:
float f;
int16 i = ...;
f = ((float) i) / (float) 32768;
if( f > 1 ) f = 1;
if( f < -1 ) f = -1;
You usually do not need that extra "bounding" (in fact you don't, if you really are using a 16-bit integer), but it's there in case you have some >16-bit values for some reason.
To convert back, you do this:
float f = ...;
int16 i;
f = f * 32768 ;
if( f > 32767 ) f = 32767;
if( f < -32768 ) f = -32768;
i = (int16) f;
In this case, it usually is necessary to watch out for out of range values, especially values greater than 32767. You might complain that this introduces some distortion for f = 1. This issue is hotly debated. For some (incomplete) discussion of this, see this blog post.
This is more than "good enough for government work". In other words, it will work fine except in the case where you are concerned about ultimate sound quality. Since you are going to 8kHz, I think we have established that's not the case, so this answer is fine.
However, for completeness, I must add this: if you are trying to keep things absolutely pristine, keep in mind that this conversion introduces distortion. Why? Because the error when converting from float to int is correlated with the signal. It turns out that the correlation of that error is terrible and you can actually hear it, even though it's very small. (fortunately it's small enough that for things like speech and low-dynamic range music it doesn't matter much) To eliminate this error, you must use something called dither in the conversion from float to int. Again, if that's something you care about, research it and ask relevant, specific questions on https://dsp.stackexchange.com/, not here.
You might also be interested in the slides from my talk on the basics of digital audio programming, which has a slide on this topic, although it basically says the same thing (maybe even less than what I just said): http://blog.bjornroche.com/2011/11/slides-from-fundamentals-of-audio.html
16-bit PCM has a range of -32768 to 32767. So multiply each of your PCM samples by (1.0f/32768.0f) into a new array of floats, and pass that to your resampler.
Going back from float after resampling, multiply by 32768.0, saturate (clip anything outside the range -32768 to 32767), round (or dither, as Björn mentioned), and then cast back to short.
Test code that shows conversion forward and back using multiplies with no bit errors:
// PcmConvertTest.cpp : Defines the entry point for the console application.
#include <assert.h>
#include <string.h>
#include <stdint.h>

#define SZ 65536
#define MAX(x,y) (((x)>(y)) ? (x) : (y))
#define MIN(x,y) (((x)<(y)) ? (x) : (y))

int main(int argc, char* argv[])
{
    int16_t *pIntBuf1 = new int16_t[SZ];
    int16_t *pIntBuf2 = new int16_t[SZ];
    float *pFloatBuf = new float[SZ];

    // Create an initial short buffer for testing
    for (int i = 0; i < SZ; i++) {
        pIntBuf1[i] = (int16_t)(-32768 + i);
    }

    // Convert the buffer to floats (before resampling)
    const float div = (1.0f / 32768.0f);
    for (int i = 0; i < SZ; i++) {
        pFloatBuf[i] = div * (float)pIntBuf1[i];
    }

    // Convert back to shorts
    const float mul = (32768.0f);
    for (int i = 0; i < SZ; i++) {
        int32_t tmp = (int32_t)(mul * pFloatBuf[i]);
        tmp = MAX(tmp, -32768); // clip below -32768
        tmp = MIN(tmp, 32767);  // clip above 32767
        pIntBuf2[i] = (int16_t)tmp;
    }

    // Check that every PCM value survived the round trip int16_t -> float -> int16_t without errors.
    assert(0 == memcmp(pIntBuf1, pIntBuf2, sizeof(int16_t) * SZ));

    delete[] pIntBuf1;
    delete[] pIntBuf2;
    delete[] pFloatBuf;
    return 0;
}

Generate a single period of a frequency?

I would like to be able to take a frequency (e.g. 1000 Hz, 250 Hz, 100 Hz) and play it out through the phone hardware.
I know that Android's AudioTrack will allow me to play a 16-bit PCM if I can calculate an array of bits or shorts. I would like to calculate only a single period so that later I can loop it without any issues, and so I can keep the calculation time down.
How could this be achieved?
Looping a single period isn't necessarily a good idea - the cycle may not fit nicely into an exact number of samples so you might get an undesirable discontinuity at the end of each cycle, or worse, the audible frequency may end up slightly off.
That said, the math isn't hard:
float sample_rate = 44100;
float samples_per_cycle = sample_rate / frequency;
int samples_to_produce = ....
short[] sample = new short[samples_to_produce];
for (int i = 0; i < samples_to_produce; ++i) {
    sample[i] = (short) Math.floor(32767.0 * Math.sin(2 * Math.PI * i / samples_per_cycle));
}
To see what I meant above about the frequency, take the standard tuning pitch of 440 Hz.
Strictly this needs 100.227 samples, but the code above would produce 100. So if you repeat your 100 samples over and over, you'll actually play the cycle 441 times per second, and your pitch will be off by 1 Hz.
To avoid the problem you'd really need to calculate several periods of the waveform, although I don't know how many are needed to fool the ear into hearing the right pitch.
Ideally it would be as many as are needed such that:
i / samples_per_cycle
is a whole number, so that the last sample (technically the one after the last sample) ends exactly on a cycle boundary. I think if your input frequencies are all whole numbers, then producing exactly one second's worth would work.
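For integer sample rates and integer frequencies, the smallest buffer that ends exactly on a cycle boundary can be found with a gcd. This is a sketch with hypothetical helper names:

```java
public class LoopLength {
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Smallest buffer (in samples) holding a whole number of cycles of an
    // integer frequency at an integer sample rate, so it loops seamlessly.
    // The buffer contains frequency / gcd(sampleRate, frequency) full cycles.
    static int samplesForSeamlessLoop(int sampleRate, int frequency) {
        return sampleRate / gcd(sampleRate, frequency);
    }
}
```

For 440 Hz at 44100 Hz this gives 2205 samples containing exactly 22 cycles (2205 / 44100 * 440 = 22), far shorter than a full second's worth, with no discontinuity at the loop point.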

Display audio waveform and zoom

I'm able to display waveform but I don't know how to implement zoom in on the waveform.
Any idea?
Thanks piccolo
By zoom, I presume you mean horizontal zoom rather than vertical. The way audio editors do this is to scan the waveform, breaking it up into time windows where each pixel in X represents some number of samples. It can be a fractional number, but you can get away with disallowing fractional zoom ratios without annoying the user too much. Once you zoom out a bit, the max value is always a positive integer and the min value is always a negative integer.
For each pixel on the screen, you need to know the minimum and maximum sample value for that pixel. So you need a function that scans the waveform data in chunks and keeps track of the accumulated max and min for each chunk.
This is a slow process, so professional audio editors keep a pre-calculated table of min and max values at some fixed zoom ratio, perhaps 512/1 or 1024/1. When you are drawing with a zoom ratio of more than 1024 samples/pixel, you use the pre-calculated table; below that ratio, you get the data directly from the file. If you don't do this, you will find that your drawing code is too slow when you zoom out.
It's worthwhile to write code that handles all of the channels of the file in a single pass when doing this scanning; slowness here will make your whole program feel sluggish. It's the disk IO that matters here; the CPU has no trouble keeping up, so straightforward C++ code is fine for building the min/max tables, but you don't want to go through the file more than once, and you want to read it sequentially.
Once you have the min/max tables, keep them around. You want to go back to the disk as little as possible, and many of the reasons for repainting your window will not require you to rescan the min/max tables. The memory cost of holding on to them is low compared to the disk IO cost of building them in the first place.
Then you draw the waveform by drawing a series of 1-pixel-wide vertical lines between the max value and the min value for the time represented by that pixel. This should be quite fast if you are drawing from pre-built min/max tables.
Answered by https://stackoverflow.com/users/234815/John%20Knoeller
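The chunked min/max scan described above can be sketched like this (in Java, with illustrative names; a real editor would stream from disk rather than take an in-memory array):

```java
public class WaveformScanner {
    // For each horizontal pixel, record the min and max sample over the chunk
    // of samples that pixel covers; the renderer then draws a 1-pixel-wide
    // vertical line from min to max.
    static short[][] minMaxPerPixel(short[] samples, int samplesPerPixel) {
        int pixels = (samples.length + samplesPerPixel - 1) / samplesPerPixel;
        short[][] result = new short[pixels][2]; // [pixel][0] = min, [pixel][1] = max
        for (int p = 0; p < pixels; p++) {
            short min = Short.MAX_VALUE, max = Short.MIN_VALUE;
            int end = Math.min((p + 1) * samplesPerPixel, samples.length);
            for (int i = p * samplesPerPixel; i < end; i++) {
                if (samples[i] < min) min = samples[i];
                if (samples[i] > max) max = samples[i];
            }
            result[p][0] = min;
            result[p][1] = max;
        }
        return result;
    }
}
```

The same routine can build the pre-calculated tables: run it once at a fixed ratio such as 1024 samples per pixel, cache the result, and derive coarser zoom levels from the cached table instead of rereading the file.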
I'm working on this right now: C# with a little LINQ, but it should be easy enough to read and understand. The idea here is to have an array of float values from -1 to 1 representing the amplitude of every sample in the wav file. Then, knowing how many samples per second there are, we need a scaling factor: segments per second. At this point you are simply reducing the data points and smoothing them out. To zoom in really tight, use a segments-per-second value around 1000; to zoom way out, maybe 5-10. Note that right now I'm just doing a normal average, which should be made much more efficient and probably use RMS (root-mean-square) averaging to make it perfect.
private List<float> BuildAverageSegments(float[] aryRawValues, int iSamplesPerSecond, int iSegmentsPerSecond)
{
    double nDurationInSeconds = aryRawValues.Length / (double) iSamplesPerSecond;
    int iNumSegments = (int) Math.Round(iSegmentsPerSecond * nDurationInSeconds);
    // total number of samples divided by the total number of segments
    int iSamplesPerSegment = (int) Math.Round(aryRawValues.Length / (double) iNumSegments);
    List<float> colAvgSegVals = new List<float>();
    for (int i = 0; i < iNumSegments - 1; i++)
    {
        int iStartIndex = i * iSamplesPerSegment;
        int iEndIndex = (i + 1) * iSamplesPerSegment;
        float fAverageSegVal = aryRawValues.Skip(iStartIndex).Take(iEndIndex - iStartIndex).Average();
        colAvgSegVals.Add(fAverageSegVal);
    }
    return colAvgSegVals;
}
Outside of this, you need to get your audio into wav format; you should be able to find source code everywhere for reading that data. Then use something like this to convert the raw byte data to floats (again, this is horribly rough and inefficient, but clear):
public float[] GetFloatData()
{
    // Scale factor: SignificantBitsPerSample
    if (Data != null && Data.Length > 0)
    {
        float nMaxValue = (float) Math.Pow(2, SignificantBitsPerSample);
        float[] aryFloats = new float[Data[0].Length];
        for (int i = 0; i < Data[0].Length; i++)
        {
            aryFloats[i] = Data[0][i] / nMaxValue;
        }
        return aryFloats;
    }
    else
    {
        return null;
    }
}
