I have a layout that has about 60 buttons and each one, when pressed, plays a different audio file. I have all my audio files as mp3s in my assets folder and to play them I'm basically using the same code as is used in the Google NDK samples "native-audio" project:
https://github.com/googlesamples/android-ndk
I have 10 identical native functions (just with uniquely named variables) that work like this:
Function to play a sound:
jboolean Java_com_example_nativeaudio_Fretboard_player7play(JNIEnv* env, jclass clazz, jobject assetManager, jstring filename)
{
    SLresult result;
    // convert Java string to UTF-8
    const char *utf8 = (*env)->GetStringUTFChars(env, filename, NULL);
    assert(NULL != utf8);
    // use asset manager to open asset by filename
    AAssetManager* mgr = AAssetManager_fromJava(env, assetManager);
    assert(NULL != mgr);
    AAsset* asset = AAssetManager_open(mgr, utf8, AASSET_MODE_UNKNOWN);
    // release the Java string and UTF-8
    (*env)->ReleaseStringUTFChars(env, filename, utf8);
    // the asset might not be found
    if (NULL == asset) {
        return JNI_FALSE;
    }
    // open asset as file descriptor
    off_t start, length;
    int fd = AAsset_openFileDescriptor(asset, &start, &length);
    assert(0 <= fd);
    AAsset_close(asset);
    // configure audio source
    SLDataLocator_AndroidFD loc_fd = {SL_DATALOCATOR_ANDROIDFD, fd, start, length};
    SLDataFormat_MIME format_mime = {SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED};
    SLDataSource audioSrc = {&loc_fd, &format_mime};
    // configure audio sink
    SLDataLocator_OutputMix loc_outmix = {SL_DATALOCATOR_OUTPUTMIX, outputMixObject};
    SLDataSink audioSnk = {&loc_outmix, NULL};
    // create audio player
    const SLInterfaceID ids[3] = {SL_IID_SEEK, SL_IID_MUTESOLO, SL_IID_VOLUME};
    const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
    result = (*engineEngine)->CreateAudioPlayer(engineEngine, &p7PlayerObject, &audioSrc, &audioSnk,
            3, ids, req);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;
    // realize the player
    result = (*p7PlayerObject)->Realize(p7PlayerObject, SL_BOOLEAN_FALSE);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;
    // get the play interface
    result = (*p7PlayerObject)->GetInterface(p7PlayerObject, SL_IID_PLAY, &p7PlayerPlay);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;
    if (NULL != p7PlayerPlay) {
        // play
        result = (*p7PlayerPlay)->SetPlayState(p7PlayerPlay, SL_PLAYSTATE_PLAYING);
        assert(SL_RESULT_SUCCESS == result);
        (void)result;
    }
    return JNI_TRUE;
}
Function to stop that sound:
void Java_com_example_nativeaudio_Fretboard_player7stop(JNIEnv* env, jclass clazz)
{
    SLresult result;
    // make sure the asset audio player was created
    if (NULL != p7PlayerPlay) {
        // set the player's state
        result = (*p7PlayerPlay)->SetPlayState(p7PlayerPlay, SL_PLAYSTATE_STOPPED);
        assert(SL_RESULT_SUCCESS == result);
        (void)result;
        // destroy file descriptor audio player object, and invalidate all associated interfaces
        (*p7PlayerObject)->Destroy(p7PlayerObject);
        p7PlayerObject = NULL;
        p7PlayerPlay = NULL;
    }
}
This is easy to deal with, but I want to minimize latency and avoid having to call (*engineEngine)->CreateAudioPlayer() every time I want to play a different file. Is there any way to just change the audioSrc used by the audio player without having to destroy and recreate it from scratch every time?
As a bonus, where can I read more about this stuff? Seems pretty difficult to find any information on OpenSL ES anywhere.
We're in the same boat: I'm also currently familiarizing myself with the NDK and OpenSL ES. My answer is based on only about 2 days of experimentation, so there might be better approaches, but the information might help you on your way.
"I have 10 identical native functions (just with uniquely named variables) that work like this.."
If I understood your case correctly, you don't need duplicate functions for this. The only things that differ between these calls are the button pressed and, ultimately, the sound to play, and both can be passed as parameters through the JNI call. You can store the created player and its data in a globally accessible structure so you can retrieve them when you need to stop or replay the sound, maybe using the buttonId as a key to a map.
"[..]but I want to minimize latency and avoid having to do (*engineEngine)->CreateAudioPlayer() every time I want to play a different file. Is there any way to just change the audioSrc used by the audio player without having to destroy and recreate it from scratch every time?"
Yes, constantly creating and destroying players is costly and can lead to fragmentation of the heap (as stated in the OpenSL ES 1.0 Specification). At first I thought the DynamicSourceItf would allow you to switch data sources, but it seems that this interface is not intended to be used like that; at least on Android 6, it returns 'feature unsupported'.
I doubt that creating a player for each unique sound would be a good solution, especially since playing the same sound multiple times on top of itself (as is common in a game, for example) would require an arbitrary number of additional players for that same sound.
Buffer Queues
BufferQueues are queues of individual buffers which a player will process when playing. When all the buffers have been processed, the player 'stops' (its official state is still 'playing', though) but will resume as soon as new buffers are enqueued.
What this allows you to do is to create as many players as overlapping sounds you require. When you want to play a sound, you iterate over these players until you've found one which is not currently processing buffers (BufferQueueItf->GetState(...) provides this information or a callback can be registered so you can tag players as being 'free'). Then, you enqueue as many buffers as your sound needs which will start playing immediately.
The format of a BufferQueue is, as far as I know, locked at creation. So you have to make sure that either all your input buffers are in the same format or you create a separate BufferQueue (and player) for each format.
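The bookkeeping for such a pool of players can be sketched independently of OpenSL ES. The Java class below is only an illustration under stated assumptions: `PlayerPool`, `acquire` and `release` are hypothetical names, the real players and buffer queues would still live in native code, and `release` would be called once the buffer-queue callback (or a GetState poll) reports that a player has run out of buffers.

```java
/** Hypothetical bookkeeping for a fixed pool of native buffer-queue players. */
final class PlayerPool {
    // busy[i] is true while player i still has buffers to process
    private final boolean[] busy;

    PlayerPool(int size) {
        busy = new boolean[size];
    }

    /** Returns the index of a free player and marks it busy, or -1 if all are in use. */
    synchronized int acquire() {
        for (int i = 0; i < busy.length; i++) {
            if (!busy[i]) {
                busy[i] = true;
                return i;
            }
        }
        return -1;
    }

    /** Marks a player as free again, e.g. once its queue has drained. */
    synchronized void release(int index) {
        busy[index] = false;
    }
}
```

The pool size is the maximum number of overlapping sounds; when `acquire` returns -1 the caller has to decide whether to drop the new sound or cut off an old one.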
Android Simple BufferQueue
According to the Android NDK documentation, the BufferQueue interface is expected to have significant changes in the future. They have extracted a simplified interface with most of BufferQueue's functionality and called it AndroidSimpleBufferQueue. This interface is not expected to change and thus makes your code more future proof.
The main functionality you lose by using the AndroidSimpleBufferQueue is the ability to use non-PCM source data, so you'd have to decode your files before use. This can be done in OpenSL ES using an AndroidSimpleBufferQueue as a sink. More recent APIs have additional support using the MediaCodec and its NDK implementation NDKMedia (check out the native-codec example).
Resources
The NDK documentation does contain some important information that is hard to find anywhere else. Here's the OpenSL ES specific page.
It might be close to 600 pages and hard to digest, but the OpenSL ES 1.0 Specification should be your primary source of information. I highly recommend reading chapter 4, as it gives a good overview of how things work. Chapter 3 has a bit more information on the specific design. Beyond that, I just jump around using the search function to read up on interfaces and objects as I go.
Understanding OpenSL ES
Once you have understood the basic principles of how OpenSL works, it seems to be quite straightforward. There are media objects (players and recorders, etc) and data sources (inputs) and data sinks (outputs). You essentially connect an input to a media object which routes the processed data to its connected output.
Sources, Sinks and Media Objects are all documented in the specification including their interfaces. With that information, it really is just about picking the building blocks you require and plugging them together.
Update 07/29/16
From my tests, it seems that neither BufferQueue nor AndroidSimpleBufferQueue supports non-PCM data, at least not on the systems I've tested (Nexus 7 on Android 6.0.1, NVidia Shield K1 on Android 6.0.1), so you will need to decode your data before you can use it.
I tried using the NDK versions of the MediaExtractor and MediaCodec but there are several caveats to watch out for:
MediaExtractor does not seem to correctly return the UUID information required for decoding with crypto, at least not for the files I've tested. AMediaExtractor_getPsshInfo returns a nullptr.
The API does not always behave as the comments in the header claim. For example, checking for EOS (end of stream) in the MediaExtractor seems to be most reliable when you check the number of bytes returned instead of the AMediaExtractor_advance function's return value.
I'd recommend staying in Java for the decoding process, as these APIs are more mature and definitely better tested, and you might get more functionality out of them. Once you have the buffers of raw PCM data, you can pass them to native code, which allows you to reduce latency.
Is there any way to detect system sound instead of microphone sound? I want to be able to detect whenever my system makes a sound instead of when the microphone picks up the actual sound.
One way I found to do this is to use an "audio loop-back in either software or hardware (e.g. connect a lead from the speaker 'out' jack to the microphone 'in' jack)."
Capturing speaker output in Java
I am building a program that plays an mp3 file whenever a system sound happens but I don't want it to go off if the dog barks.
Thanks!
What about something with pyaudio (http://people.csail.mit.edu/hubert/pyaudio/)
Like this:
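import pyaudio
chunk = 1024  # number of frames to read per call
p = pyaudio.PyAudio()
# open the default input device: 16-bit samples, mono, 44.1 kHz
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=44100,
                input=True,
                frames_per_buffer=chunk)
data = stream.read(chunk)  # raw bytes for one chunk of audio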
And then you could calculate the root-mean-square(RMS) of the audio sample and go from there.
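The RMS step could look roughly like this, sketched here in Java since the question points at a Java thread. It assumes the captured data is 16-bit signed little-endian PCM, which is what paInt16 yields on typical little-endian machines; `Rms` and `rms16le` are illustrative names, and the threshold to compare the result against would have to be found by experiment.

```java
/** Sketch: root-mean-square level of a buffer of 16-bit little-endian PCM. */
final class Rms {
    /** Returns the RMS of the first length bytes of pcm, or 0.0 for an empty buffer. */
    static double rms16le(byte[] pcm, int length) {
        int samples = length / 2; // two bytes per sample
        if (samples == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = pcm[2 * i + 1];      // high byte keeps the sign
            int sample = (hi << 8) | lo;
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples);
    }
}
```

A value near 0 means silence; the maximum possible value for 16-bit samples is 32768.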
Edited:
You can see what kind of devices you can use by doing something like the following. (http://people.csail.mit.edu/hubert/pyaudio/docs/#pyaudio.PyAudio.get_device_info_by_index)
import pyaudio
p = pyaudio.PyAudio()
# print the info dictionary of every audio device PyAudio can see
for i in range(p.get_device_count()):
    try:
        print(p.get_device_info_by_index(i))
    except Exception as e:
        print(e)
All I need is a piece of Java code that can detect DTMF tones from the microphone and print the characters to System.out. I've been searching forever and I couldn't find it.
Oracle Docs on Capturing Audio in Java:
http://docs.oracle.com/javase/tutorial/sound/capturing.html
As discussed in Overview of the Sampled Package, a typical audio-input system in an implementation of the Java Sound API consists of:
An input port, such as a microphone port or a line-in port, which feeds its incoming audio data into:
A mixer, which places the input data in:
One or more target data lines, from which an application can retrieve the data.
(Emphasis Mine)
Also see:
Java (J2SE) DTMF tone detection
I think usually this is done in hardware, so you may end up writing code yourself to analyze the audio you've captured.
Also:
http://sourceforge.net/projects/java-dtmf/
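If you do end up analyzing the captured audio yourself, one common software approach (not taken from the links above) is the Goertzel algorithm, which measures the energy at a single frequency. The sketch below checks the eight standard DTMF frequencies and picks the strongest row and column tone. `Dtmf`, `power`, `detect` and the `minPower` threshold are illustrative names and assumptions; a robust detector would also check tone duration and the relative level of the two tones.

```java
/** Minimal DTMF detector sketch using the Goertzel algorithm (names are illustrative). */
final class Dtmf {
    private static final double[] ROWS = {697, 770, 852, 941};
    private static final double[] COLS = {1209, 1336, 1477, 1633};
    private static final char[][] KEYS = {
        {'1', '2', '3', 'A'},
        {'4', '5', '6', 'B'},
        {'7', '8', '9', 'C'},
        {'*', '0', '#', 'D'},
    };

    /** Goertzel power of one target frequency in a block of samples. */
    static double power(double[] samples, double freq, double sampleRate) {
        double coeff = 2.0 * Math.cos(2.0 * Math.PI * freq / sampleRate);
        double s1 = 0.0;
        double s2 = 0.0;
        for (double x : samples) {
            double s0 = x + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    /** Index of the frequency with the highest power above minPower, or -1 if none. */
    private static int strongest(double[] samples, double[] freqs,
                                 double sampleRate, double minPower) {
        int best = -1;
        double bestPower = minPower;
        for (int i = 0; i < freqs.length; i++) {
            double p = power(samples, freqs[i], sampleRate);
            if (p > bestPower) {
                bestPower = p;
                best = i;
            }
        }
        return best;
    }

    /** Returns the detected key, or the NUL character if no tone pair exceeds minPower. */
    static char detect(double[] samples, double sampleRate, double minPower) {
        int row = strongest(samples, ROWS, sampleRate, minPower);
        int col = strongest(samples, COLS, sampleRate, minPower);
        return (row < 0 || col < 0) ? '\0' : KEYS[row][col];
    }
}
```

The samples would come from the bytes read off a TargetDataLine, converted to doubles; blocks of a few hundred samples at 8000 Hz give enough frequency resolution to separate the DTMF tones.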
So, I went over the Java sound tutorial and I did not find it all that helpful.
Anyway, what I understood from the tutorial about recording sound from a mic is this:
Although they do show how to get a target data line and so on, they do not explain how you can actually record sound [or maybe I didn't understand it well].
My understanding so far has been this:
A Mixer can be your sound card or the sound software drivers that can be used to process the sound, whether input or output.
A TargetDataLine is used when you want to output your sound into the computer, for example to save it to disk.
A Port is where your external devices like a mic, etc. are connected.
Problems that remain
How do I select the proper mixer? The Java tutorial says that you get all the available mixers and query each one to see if it has what you want. That's quite vague for a beginner.
How do I get the port on which my integrated mic is? Specifically, how do I get input from it into the mixer?
How do I output this to the disk?
Using the AudioSystem.getTargetDataLine(AudioFormat format) method you will get
... a target data line that can be used for recording audio data in the format specified by the AudioFormat object. The returned line will be provided by the default system mixer, or, if not possible, by any other mixer installed in the system that supports a matching TargetDataLine object.
See the accepted answer for Java Sound API - capturing microphone for an example of this.
If you want more control of which data line to use you can enumerate all the mixers and the data lines they support and pick the one you want. Here is some more information regarding how you would go about doing that: Java - recording from mixer
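Enumerating the mixers and checking which of them offer a capture line could be sketched like this. `CaptureMixers`, `hasTargetDataLine` and `find` are illustrative names; what the list contains depends entirely on the machine's audio hardware and drivers.

```java
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.TargetDataLine;
import java.util.ArrayList;
import java.util.List;

/** Sketch: find the installed mixers that can provide a TargetDataLine. */
final class CaptureMixers {
    /** True if any of the given line infos describes a TargetDataLine. */
    static boolean hasTargetDataLine(Line.Info[] infos) {
        for (Line.Info info : infos) {
            if (TargetDataLine.class.isAssignableFrom(info.getLineClass())) {
                return true;
            }
        }
        return false;
    }

    /** Lists the installed mixers that offer at least one capture line. */
    static List<Mixer.Info> find() {
        List<Mixer.Info> result = new ArrayList<>();
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            if (hasTargetDataLine(mixer.getTargetLineInfo())) {
                result.add(mixerInfo);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Mixer.Info info : find()) {
            System.out.println(info.getName() + " - " + info.getDescription());
        }
    }
}
```

Once you have picked a `Mixer.Info`, `AudioSystem.getMixer(info).getLine(...)` with a `DataLine.Info` for `TargetDataLine` gives you a line from that specific mixer.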
Once you've obtained the TargetDataLine you should open() it, start() it, and then call read() repeatedly to obtain data from that data line. The byte[] that you fill with data on each call to read() can be written to disk, e.g. through a FileOutputStream.
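A minimal sketch of that read loop, assuming a 44.1 kHz, 16-bit, mono format and a default capture device that supports it. `RawRecorder`, `bytesFor`, `record` and the file name capture.raw are made up for the example, and the output is headerless PCM rather than a playable WAV file.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Sketch: capture raw PCM from the default TargetDataLine into a stream. */
final class RawRecorder {
    /** Number of bytes (a whole number of frames) that millis of audio occupies. */
    static long bytesFor(AudioFormat format, long millis) {
        long frames = (long) ((double) format.getFrameRate() * millis / 1000.0);
        return frames * format.getFrameSize();
    }

    /** Reads millis of audio from the default capture line and writes it to out. */
    static void record(AudioFormat format, long millis, OutputStream out)
            throws LineUnavailableException, IOException {
        TargetDataLine line = AudioSystem.getTargetDataLine(format);
        line.open(format);
        line.start(); // the line delivers no data until it is started
        try {
            // read() requires a whole number of frames, so size the buffer in frames
            byte[] buffer = new byte[format.getFrameSize() * 1024];
            long remaining = bytesFor(format, millis);
            while (remaining > 0) {
                int n = line.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                if (n <= 0) {
                    break;
                }
                out.write(buffer, 0, n);
                remaining -= n;
            }
        } finally {
            line.stop();
            line.close();
        }
    }

    public static void main(String[] args) throws Exception {
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        try (OutputStream out = new FileOutputStream("capture.raw")) {
            record(format, 3000, out); // three seconds of headerless PCM
        }
    }
}
```

If you want a file that media players can open, wrapping the line in an AudioInputStream and passing it to AudioSystem.write with AudioFileFormat.Type.WAVE is an alternative to writing the bytes yourself.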
I'm using the javax.sound.sampled package in a radio data mode decoding program. To use the program, the user feeds audio from their radio receiver into their PC's line input. The user is also required to use their mixer program to select the line in as the recording input. The trouble is that some users don't know how to do this, and sometimes other programs alter the recording input setting. So my question is: how can my program detect whether the line in is set as the recording input? Also, is it possible for my program to change the recording input setting if it detects that it is incorrect?
Thanks for your time.
Ian
To answer your first question, you can check if the Line.Info object for your recording input matches Port.Info.LINE_IN like this:
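import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Line;
import javax.sound.sampled.Port;

// returns true if lineInfo matches one of the line-in ports the system reports
public static boolean isLineIn(Line.Info lineInfo) {
    Line.Info[] detected = AudioSystem.getSourceLineInfo(Port.Info.LINE_IN);
    for (Line.Info lineIn : detected) {
        if (lineIn.matches(lineInfo)) {
            return true;
        }
    }
    return false;
}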
However, this doesn't work with operating systems or soundcard driver APIs that don't provide the type of each available mixer channel. So when I test it on Windows it works, but not on Linux or Mac. For more information and recommendations, see this FAQ.
Regarding your second question, you can try changing the recording input settings through a Control class. In particular, see FloatControl.Type for some common settings. Keep in mind that the availability of these controls depends on the operating system and soundcard drivers, just like line-in detection.
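As a rough sketch of working with such a control: the helper below looks up FloatControl.Type.VOLUME on an already opened line and sets it, clamping the value to the range the control reports. `LineControls`, `setClamped` and `trySetVolume` are illustrative names, and whether changing this control has any effect on the actual recording source is an assumption that depends on the operating system and driver.

```java
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.Line;

/** Sketch: adjust a FloatControl on a line, if the line exposes one. */
final class LineControls {
    /** Sets the control to value, clamped to its allowed range; returns the value used. */
    static float setClamped(FloatControl control, float value) {
        float clamped = Math.max(control.getMinimum(),
                Math.min(control.getMaximum(), value));
        control.setValue(clamped); // setValue rejects out-of-range values
        return clamped;
    }

    /** Tries to set the volume on an opened line; returns false if unsupported. */
    static boolean trySetVolume(Line line, float value) {
        if (!line.isControlSupported(FloatControl.Type.VOLUME)) {
            return false;
        }
        setClamped((FloatControl) line.getControl(FloatControl.Type.VOLUME), value);
        return true;
    }
}
```

Calling `line.getControls()` on the opened Port or data line and printing the result is a quick way to see which controls a given system really offers.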