I have a layout with about 60 buttons, and each one, when pressed, plays a different audio file. I have all my audio files as MP3s in my assets folder, and to play them I'm basically using the same code as the Google NDK "native-audio" sample project:
https://github.com/googlesamples/android-ndk
I have 10 identical native functions (just with uniquely named variables) that work like this:
function to play sound:
jboolean Java_com_example_nativeaudio_Fretboard_player7play(JNIEnv* env, jclass clazz, jobject assetManager, jstring filename)
{
    SLresult result;

    // convert Java string to UTF-8
    const char *utf8 = (*env)->GetStringUTFChars(env, filename, NULL);
    assert(NULL != utf8);

    // use asset manager to open asset by filename
    AAssetManager* mgr = AAssetManager_fromJava(env, assetManager);
    assert(NULL != mgr);
    AAsset* asset = AAssetManager_open(mgr, utf8, AASSET_MODE_UNKNOWN);

    // release the Java string and UTF-8
    (*env)->ReleaseStringUTFChars(env, filename, utf8);

    // the asset might not be found
    if (NULL == asset) {
        return JNI_FALSE;
    }

    // open asset as file descriptor
    off_t start, length;
    int fd = AAsset_openFileDescriptor(asset, &start, &length);
    assert(0 <= fd);
    AAsset_close(asset);

    // configure audio source
    SLDataLocator_AndroidFD loc_fd = {SL_DATALOCATOR_ANDROIDFD, fd, start, length};
    SLDataFormat_MIME format_mime = {SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED};
    SLDataSource audioSrc = {&loc_fd, &format_mime};

    // configure audio sink
    SLDataLocator_OutputMix loc_outmix = {SL_DATALOCATOR_OUTPUTMIX, outputMixObject};
    SLDataSink audioSnk = {&loc_outmix, NULL};

    // create audio player
    const SLInterfaceID ids[3] = {SL_IID_SEEK, SL_IID_MUTESOLO, SL_IID_VOLUME};
    const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
    result = (*engineEngine)->CreateAudioPlayer(engineEngine, &p7PlayerObject, &audioSrc, &audioSnk,
                                                3, ids, req);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;

    // realize the player
    result = (*p7PlayerObject)->Realize(p7PlayerObject, SL_BOOLEAN_FALSE);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;

    // get the play interface
    result = (*p7PlayerObject)->GetInterface(p7PlayerObject, SL_IID_PLAY, &p7PlayerPlay);
    assert(SL_RESULT_SUCCESS == result);
    (void)result;

    if (NULL != p7PlayerPlay) {
        // play
        result = (*p7PlayerPlay)->SetPlayState(p7PlayerPlay, SL_PLAYSTATE_PLAYING);
        assert(SL_RESULT_SUCCESS == result);
        (void)result;
    }

    return JNI_TRUE;
}
function to stop that sound:
void Java_com_example_nativeaudio_Fretboard_player7stop(JNIEnv* env, jclass clazz)
{
    SLresult result;

    // make sure the asset audio player was created
    if (NULL != p7PlayerPlay) {
        // set the player's state
        result = (*p7PlayerPlay)->SetPlayState(p7PlayerPlay, SL_PLAYSTATE_STOPPED);
        assert(SL_RESULT_SUCCESS == result);
        (void)result;

        // destroy file descriptor audio player object, and invalidate all associated interfaces
        (*p7PlayerObject)->Destroy(p7PlayerObject);
        p7PlayerObject = NULL;
        p7PlayerPlay = NULL;
    }
}
This is easy to deal with, but I want to minimize latency and avoid having to call (*engineEngine)->CreateAudioPlayer() every time I want to play a different file. Is there any way to just change the audioSrc used by the audio player without having to destroy and recreate it from scratch every time?
As a bonus, where can I read more about this stuff? It seems pretty difficult to find any information on OpenSL ES anywhere.
We're in the same boat; I'm currently familiarizing myself with the NDK and OpenSL ES too. My answer is based entirely on about two days of experimentation, so there may be better approaches, but the information might help you on your way.
I have 10 identical native functions (just with uniquely named variables) that work like this..
If I understood your case correctly, you don't need duplicate functions for this. The only things that differ between these calls are the button pressed and, ultimately, the sound to play, and both can be passed as parameters through the JNI call. You can store the created player and its data in a globally accessible structure so you can retrieve them when you need to stop or replay the sound, perhaps using the button ID as a key to a map.
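For example (a sketch only, with hypothetical class, method and library names; it simply illustrates the parameter-passing idea, with the native side keeping one player per ID in a map):

package com.example.nativeaudio;

import android.content.res.AssetManager;

public class Fretboard {
    static {
        System.loadLibrary("native-audio"); // assumed library name
    }

    // Native side opens the asset and reuses/creates the player stored under soundId.
    public static native boolean playSound(AssetManager assetManager, String filename, int soundId);

    // Native side looks up the player for soundId and stops it.
    public static native void stopSound(int soundId);
}

// Usage from a button handler (filename is just an example):
// Fretboard.playSound(getAssets(), "sounds/fret7.mp3", 7);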
[..]but I want to minimize latency and avoid having to do (*engineEngine)->CreateAudioPlayer() every time I want to play a different file. Is there any way to just change the audioSrc used by the audio player without having to destroy and recreate it from scratch every time?
Yes, constantly creating and destroying players is costly and can lead to heap fragmentation (as stated in the OpenSL ES 1.0 Specification). At first I thought the DynamicSourceItf would allow you to switch data sources, but it seems that this interface is not intended to be used like that; at least on Android 6 it returns 'feature unsupported'.
I doubt that creating a player for each unique sound would be a good solution, especially since playing the same sound multiple times on top of itself (as is common in games, for example) would require an arbitrary number of additional players for that same sound.
Buffer Queues
BufferQueues are queues of individual buffers which a player processes during playback. When all the buffers have been processed, the player 'stops' (its official state is still 'playing', though) but resumes as soon as new buffers are enqueued.
What this allows you to do is create as many players as the number of overlapping sounds you need. When you want to play a sound, you iterate over these players until you find one which is not currently processing buffers (BufferQueueItf->GetState(...) provides this information, or a callback can be registered so you can tag players as 'free'). Then you enqueue as many buffers as your sound needs, and they start playing immediately.
The format of a BufferQueue is, as far as I know, locked at creation, so you have to make sure that either all your input buffers are in the same format or you create a different BufferQueue (and player) for each format.
Android Simple BufferQueue
According to the Android NDK documentation, the BufferQueue interface is expected to have significant changes in the future. They have extracted a simplified interface with most of BufferQueue's functionality and called it AndroidSimpleBufferQueue. This interface is not expected to change and thus makes your code more future-proof.
The main functionality you lose by using the AndroidSimpleBufferQueue is the ability to use non-PCM source data, so you'd have to decode your files before use. This can be done in OpenSL ES by using an AndroidSimpleBufferQueue as a sink. More recent APIs have additional support via the MediaCodec and its NDK implementation NDKMedia (check out the native-codec example).
Resources
The NDK documentation contains some important information that is hard to find anywhere else. Here's the OpenSL ES specific page.
It might be close to 600 pages and hard to digest, but the OpenSL ES 1.0 Specification should be your primary source of information. I highly recommend reading chapter 4, as it gives a good overview of how things work. Chapter 3 has a bit more information on the specific design. Then I just jump around using the search function to read up on interfaces and objects as I go.
Understanding OpenSL ES
Once you have understood the basic principles of how OpenSL ES works, it is quite straightforward: there are media objects (players, recorders, etc.), data sources (inputs) and data sinks (outputs). You essentially connect an input to a media object, which routes the processed data to its connected output.
Sources, Sinks and Media Objects are all documented in the specification including their interfaces. With that information, it really is just about picking the building blocks you require and plugging them together.
Update 07/29/16
From my tests, it seems that neither BufferQueue nor AndroidSimpleBufferQueue supports non-PCM data, at least not on the systems I've tested (Nexus 7 @ 6.0.1, NVidia Shield K1 @ 6.0.1), so you will need to decode your data before you can use it.
I tried using the NDK versions of the MediaExtractor and MediaCodec but there are several caveats to watch out for:
MediaExtractor does not seem to correctly return the UUID information required for decoding with crypto, at least not for the files I've tested. AMediaExtractor_getPsshInfo returns a nullptr.
The API does not always behave as the comments in the header claim. For example, checking for EOS (end of stream) in the MediaExtractor seems to be most reliable by checking the number of bytes returned rather than the return value of AMediaExtractor_advance.
I'd recommend staying in Java for the decoding process, as those APIs are more mature and much better tested, and you might get more functionality out of them. Once you have the buffers of raw PCM data, you can pass them to native code, which allows you to reduce latency.
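As an illustration of that recommendation, here is a minimal Java-side decode sketch using the standard android.media APIs (error handling, output-format changes and channel/sample-rate bookkeeping are omitted; the class and method names other than the android.media ones are my own):

import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

public final class PcmDecoder {
    // Decode an audio file (e.g. an mp3 copied out of assets, or opened via a
    // file descriptor) to raw PCM bytes, ready to be handed to native code.
    public static byte[] decodeToPcm(String path) throws Exception {
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(path);
        MediaFormat format = extractor.getTrackFormat(0); // assumes track 0 is the audio track
        extractor.selectTrack(0);

        MediaCodec codec = MediaCodec.createDecoderByType(format.getString(MediaFormat.KEY_MIME));
        codec.configure(format, null, null, 0);
        codec.start();

        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        ByteArrayOutputStream pcm = new ByteArrayOutputStream();
        boolean inputDone = false;
        boolean outputDone = false;

        while (!outputDone) {
            if (!inputDone) {
                int inIndex = codec.dequeueInputBuffer(10000);
                if (inIndex >= 0) {
                    ByteBuffer inBuf = codec.getInputBuffer(inIndex);
                    int size = extractor.readSampleData(inBuf, 0);
                    if (size < 0) {
                        // no more samples: signal end of stream to the decoder
                        codec.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                        inputDone = true;
                    } else {
                        codec.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                        extractor.advance();
                    }
                }
            }
            int outIndex = codec.dequeueOutputBuffer(info, 10000);
            if (outIndex >= 0) {
                ByteBuffer outBuf = codec.getOutputBuffer(outIndex);
                byte[] chunk = new byte[info.size];
                outBuf.get(chunk);
                pcm.write(chunk, 0, chunk.length);
                codec.releaseOutputBuffer(outIndex, false);
                if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                    outputDone = true;
                }
            }
        }

        codec.stop();
        codec.release();
        extractor.release();
        return pcm.toByteArray();
    }
}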
Related
I am trying to perform a simple task: select an input device and set the output device.
The use case is as follows: I have 3.5mm jacks, and my user can select the output device (headphones or speaker) from a list.
I can play a sound on a given device (with Clip) and I can control the input device (mute/volume), but I haven't found any way to specify the target line; it's always the system default.
I can get the mixer
Optional<Mixer.Info> optJackInMixerInfo = Arrays.stream(AudioSystem.getMixerInfo())
        .filter(mixerInfo -> {
            // Filter based on the device name.
        })
        .findFirst();

Mixer m = AudioSystem.getMixer(jackInMixerInfo);

// The target
Line.Info[] lineInfos = m.getTargetLineInfo();
for (Line.Info lineInfo : lineInfos) {
    m.getLine(lineInfo);
    System.out.println("ici");
}
I got only the "master volume control".
How can I select the output device? I would also be happy with changing the system default device.
The naming of TargetDataLine and SourceDataLine is kind of backwards: output to the local sound system for playback goes through a SourceDataLine, while input into Java (such as a microphone line) uses a TargetDataLine. I used to know why they were named this way, but it has slipped my mind at the moment.
There is a tutorial Accessing Audio System Resources with specifics.
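As a sketch of the key point for your question, a playback line can be requested from a specific Mixer instead of from AudioSystem, which is how you direct output to a chosen device (the method name and format values below are just placeholders):

import javax.sound.sampled.*;

// Open a playback line on a chosen device instead of the system default.
// 'chosenMixerInfo' is whatever Mixer.Info your filter selected.
static SourceDataLine openLineOn(Mixer.Info chosenMixerInfo) throws LineUnavailableException {
    AudioFormat format = new AudioFormat(44100f, 16, 2, true, false); // adjust to your audio
    DataLine.Info lineInfo = new DataLine.Info(SourceDataLine.class, format);

    Mixer mixer = AudioSystem.getMixer(chosenMixerInfo);
    SourceDataLine line = (SourceDataLine) mixer.getLine(lineInfo);
    line.open(format);
    line.start();
    return line; // write audio bytes to it with line.write(buffer, 0, length)
}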
Most computers only have a limited number of float controls available, with "master volume" being the one most likely to be implemented. You would use this to alter the volume of the output. Another tutorial in the series, Processing Audio with Controls, covers this topic. For myself, I generally convert the audio stream to PCM, handle volume directly (multiply each value by a factor ranging from 0 to 1), and then convert back to a byte stream, rather than rely on controls which may or may not be present.
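A minimal sketch of that manual gain approach, assuming 16-bit signed little-endian PCM (the helper name is mine):

// Scale the volume of a buffer of 16-bit signed little-endian PCM in place.
// 'factor' ranges from 0.0 (silence) to 1.0 (unchanged).
public static void scaleVolume(byte[] buffer, int validBytes, double factor) {
    for (int i = 0; i + 1 < validBytes; i += 2) {
        // assemble the little-endian sample
        int sample = (short) ((buffer[i] & 0xFF) | (buffer[i + 1] << 8));
        sample = (int) (sample * factor);
        // clamp to the 16-bit range, then write the bytes back
        if (sample > Short.MAX_VALUE) sample = Short.MAX_VALUE;
        if (sample < Short.MIN_VALUE) sample = Short.MIN_VALUE;
        buffer[i] = (byte) sample;
        buffer[i + 1] = (byte) (sample >> 8);
    }
}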
A MIDI channel administers parameters such as sound, panning, volume, etc.; thus, for ensemble music, each of its real instruments should be represented by a channel of its own. If more than 15 non-percussion instruments are involved, a single MIDI line is not enough.
The Java software I write is intended for users, most of whom will use the built-in Java software synthesizer. I want to allow for more than 16 instruments. Given the existing API as far as I know it, I need several Receiver objects that work independently.
First try: the soft synthesizer asserts "getMaxReceivers() == -1", i.e. unlimited, so I create as many as I need. Unfortunately, they all use the same channels – failure.
Second try: I create two MidiDevice objects for the same Info object, and a MidiReceiver for each. When I try to open the second one, I get an exception saying that no further audio line is available.
Third try: same as the second, but to open the devices I use a special method of the SoftSynthesizer class that allows me to open it with a given audio line; I do so using the same line for both. No exception is thrown – but the audio output is chaotic. Since the two objects don't know about each other, they cannot combine their output gracefully. Failure again.
Questions:
A) Have I overlooked something?
B) If not, would someone who has the contacts and reputation please alert the authors of the Java interface and the SoftSynthesizer? My proposal, minimally invasive: a (Soft)Synthesizer object should be given an additional method such as "MidiDevice getSubdevice()", whose getReceiver() method offers fresh channels as required.
(Upon re-editing: could it be that the ordinary getReceiver() method is actually meant for this purpose, as described in my "First try" above, and has simply been misimplemented by the SoftSynthesizer "Gervill"? If so, Gervill's authors should be informed; however, they are not easy to find by googling. You may know how to contact them.)
public boolean GetTwoIndependentReceivers (Receiver [] inhereplease)
{
    for (MidiDevice.Info info : MidiSystem.getMidiDeviceInfo ()) try
    {
        MidiDevice device = MidiSystem.getMidiDevice (info);
        if ( device instanceof Synthesizer
             && ( device.getMaxReceivers () < 0
                  || device.getMaxReceivers () >= 2)) try
        {
            device.open ();
            inhereplease [0] = device.getReceiver ();
            inhereplease [1] = device.getReceiver ();
            // will be distinct as objects, but with Gervill not independent
            return true;
        } catch (Exception ex) {}
    } catch (Exception ex) {}
    return false;
}
Note that, for example, the free software MuseScore manages the problem fine with its own software synthesizer: it exports MIDI files with "MIDI port" messages, as intended by the MIDI standard for exactly this purpose, and imports them gracefully. The built-in Java sequencer simply ignores those port messages and therefore plays such files incorrectly. This may be an additional incentive to attack the problem: one Receiver object for each port.
The MIDI standard only supports 16 channels. Full stop.
So, anything you want to do to control more channels than that goes outside the normal MIDI specification. The regular Windows GM synthesizer supports what it supports and isn't going to change. If you need additional capabilities, you'll have to use a different synthesizer, inside your application.
Currently, I am working on getting the foreground (top) window/process on MS Windows. I need to do something similar on macOS using JNA.
What is the equivalent code in macOS?
byte[] windowText = new byte[512];
PointerType hwnd = User32.INSTANCE.GetForegroundWindow();
User32.INSTANCE.GetWindowTextA(hwnd, windowText, 512);
System.out.println(Native.toString(windowText));
There are actually two questions here, foreground window and foreground process. I'll try to answer both.
For the foreground process, an easy way using JNA is to map the Application Services API. Note that these functions were introduced in 10.9 and are now deprecated, but they still work as of 10.15. The newer alternative is in the AppKit library; see below.
Create this class, mapping the two functions you'll need:
public interface ApplicationServices extends Library {
ApplicationServices INSTANCE = Native.load("ApplicationServices", ApplicationServices.class);
int GetFrontProcess(LongByReference processSerialNumber);
int GetProcessPID(LongByReference processSerialNumber, IntByReference pid);
}
The "foreground" process can be obtained with GetFrontProcess(). That returns something called a ProcessSerialNumber, a unique 64-bit value used throughout the Application Services API. To translate it for your userspace use, you probably want the Process ID, and GetProcessPID() does that translation for you.
LongByReference psn = new LongByReference();
IntByReference pid = new IntByReference();
ApplicationServices.INSTANCE.GetFrontProcess(psn);
ApplicationServices.INSTANCE.GetProcessPID(psn, pid);
System.out.println("Front process pid: " + pid.getValue());
While the above works, it is deprecated. A new application should use the AppKit Library:
public interface AppKit extends Library {
AppKit INSTANCE = Native.load("AppKit", AppKit.class);
}
There are multiple other StackOverflow questions regarding the topmost application using this library, such as this one. Mapping all the imports and objects needed is far more work than I have time to do in an answer here, but you might find it useful. It's probably easier to figure out how to use the Rococoa framework (which uses JNA under the hood but has already mapped all of AppKit via JNAerator) to access this API. Some javadocs are here.
There are also solutions using AppleScript that you can execute from Java via command line using Runtime.exec() and capturing output.
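A rough sketch of that Runtime.exec() route (error handling omitted; the script requires the usual Automation/Accessibility permissions, and the method name is mine):

import java.io.BufferedReader;
import java.io.InputStreamReader;

// Ask AppleScript for the frontmost process name and capture osascript's stdout.
static String frontmostProcessName() throws Exception {
    String[] cmd = {
        "osascript", "-e",
        "tell application \"System Events\" to get name of first application process whose frontmost is true"
    };
    Process p = Runtime.getRuntime().exec(cmd);
    try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
        return r.readLine(); // e.g. "Safari"
    }
}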
With regard to the foreground window on the screen, it's a bit more complicated. In my answer to your earlier question on iterating all windows on macOS, I described how to get a list of all the windows using CoreGraphics via JNA, including a CFDictionary containing more information.
One of those dictionary keys is kCGWindowLayer which will return a CFNumber representing the window layer number. The docs state this is 32-bit, so intValue() is appropriate. The number is the "drawing order" so a higher number will overwrite a lower number. So you can iterate over all the retrieved windows and find the maximum number. This will be the "foreground" layer.
There are some caveats:
There are actually only 20 layers available. Many things share a layer.
Layer 1000 is the screensaver. You can ignore layers 1000 and higher.
Layer 24 is the Dock, usually on top, with Layer 25 (the icons on the dock) at a higher level.
Layer 0 appears to be the rest of the desktop.
Which window is "on top" depends on where on the screen you look. Over the dock, the dock (or the application icon on it) will be in the foreground. On the rest of the screen, you need to check the pixel you're evaluating against the screen rectangle obtained from the CoreGraphics window (use the kCGWindowBounds key, which returns a CGRect: a structure with 4 doubles, X, Y, width, height).
You will need to filter to onscreen windows. If you already fetched the list you could use the kCGWindowIsOnscreen key to determine whether the window is visible. It returns a CFBoolean. Since that key is optional you will need to test for null. However, if you are starting from nothing, it would be better to use the kCGWindowListOptionOnScreenOnly Window Option Constant when you initially call CGWindowListCopyWindowInfo().
In addition to iterating all windows, the CGWindowListCopyWindowInfo() function takes a CGWindowID parameter relativeToWindow and you can add (with bitwise or) kCGWindowListOptionOnScreenAboveWindow to the options.
Finally, you might find that limiting to windows associated with the current session may be useful, and you should map CGWindowListCreate() using similar syntax to the CopyInfo() variant. It returns an array of window numbers that you could limit your dictionary search to, or pass that array as an argument to CGWindowListCreateDescriptionFromArray().
As mentioned in my previous answer, you "own" every object you create using Create or Copy functions, and are responsible for releasing them when you are done with them, to avoid memory leaks.
For example, the AppleScript route can also be driven through a scripting engine (here the Apple-provided AppleScriptEngine) rather than the command line:
AppleScriptEngine appleEngine = new apple.applescript.AppleScriptEngine();
ArrayList<String> processNames = null;
try {
    String processName = null;
    processNames = (ArrayList<String>) appleEngine.eval(
            "tell application \"System Events\" to get name of application processes whose frontmost is true and visible is true");
    if (processNames.size() > 0) {
        processName = processNames.get(0); // the frontmost process name
    }
    return processName;
} catch (ScriptException e) {
    log.debug("no app running");
}
I am currently programming a game and now I also want to add sound. My current method works fine but I am not happy with it.
new Sound(new Resource().readAndGetStream("small_click.wav")).play();
This line of code reads the file small_click.wav every time it is executed, but I don't think it is very efficient to read the resource file each time it's needed.
So what I want to do now is cache the sound in a variable or something, so that I don't have to load it from the file again. But I also want to create a new object from the sound each time, so I can play it multiple times and the playbacks overlap in the speakers.
I can't find a way to do this. I already tried using threads, but this code works without any threads.
If you want to know, here is the code of the Sound class:
public Sound(InputStream audioSrc) {
    try {
        InputStream bufferedIn = new BufferedInputStream(audioSrc);
        AudioInputStream audioStream = AudioSystem.getAudioInputStream(bufferedIn);
        clip = AudioSystem.getClip();
        clip.open(audioStream);
    } catch (Exception e) {
        // ...exception handling...
    }
}

public void play() {
    clip.setFramePosition(0);
    clip.start();
}
And if you want to know what new Resource().readAndGetStream() does: it basically loads a resource and returns an InputStream for it via getResourceAsStream().
With the Sound class that you have, you can easily create a "cache". For example, create an array of type Sound[] soundCache and execute the first part of the code line you gave in your example.
soundCache[0] = new Sound(new Resource().readAndGetStream("small_click.wav"));
You could even consider making constants to correspond to each sound.
final int SMALL_CLICK = 0;
Then, when it is time to play the sound, execute your play function.
soundCache[SMALL_CLICK].play();
Going from here to having concurrent playbacks of a given sound is quite a bit more involved. If you continue to work with Clip as your basis, I don't know of any way to get overlapping playbacks of the same sound resource except by making and managing as many copies of the Clip as you want to allow to be heard at once.
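Here is a hedged sketch of that idea (class and method names are my own): read the sound file's bytes once, open a small fixed pool of Clips from those bytes, and rotate through them so that rapid retriggers can overlap:

import java.io.ByteArrayInputStream;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;

// Several Clips opened from the same cached file bytes, played round-robin.
public class OverlappingSound {
    private final Clip[] clips;
    private int next = 0;

    public OverlappingSound(byte[] audioFileBytes, int copies) throws Exception {
        clips = new Clip[copies];
        for (int i = 0; i < copies; i++) {
            AudioInputStream ais = AudioSystem.getAudioInputStream(
                    new ByteArrayInputStream(audioFileBytes));
            clips[i] = AudioSystem.getClip();
            clips[i].open(ais);
        }
    }

    public synchronized void play() {
        Clip clip = clips[next];
        next = (next + 1) % clips.length;
        clip.stop();                // if this copy is still playing, restart it
        clip.setFramePosition(0);
        clip.start();
    }
}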
When faced with this coding challenge, I ended up writing my own library, AudioCue. It is available on Github, and has a very permissive license.
The basic concept is to store the audio data in an array of signed PCM floats, and manage the concurrent playback by having multiple "cursors" that can independently iterate through the PCM data and feed it to a SourceDataLine.
These cursors can be managed in real time. You can change the rate at which they travel through the data and scale the volume level of the data, allowing frequency and volume changes.
I did my best to keep the API as similar to a Clip as practical, so that the class would be easy to use for coders familiar with Java's Clip.
Feel free to examine the code for ideas/examples or make use of this library in your project.
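As a rough illustration of the cursor concept described above (this is not AudioCue's actual code, just the idea): each cursor keeps its own position, rate and volume over a shared float[] of PCM data, and the mixer sums the active cursors into each output sample before it is written to the SourceDataLine:

// Concept sketch: several independent "cursors" reading one shared PCM array.
class PlayCursor {
    double pos = 0;        // fractional index into the PCM data
    double speed = 1.0;    // playback rate (pitch)
    double volume = 1.0;   // 0..1
    boolean active = false;
}

class MiniMixer {
    // Produce one output sample by summing every active cursor's contribution.
    static float nextMixedSample(float[] pcm, PlayCursor[] cursors) {
        float sum = 0;
        for (PlayCursor c : cursors) {
            if (!c.active) continue;
            int idx = (int) c.pos;
            if (idx >= pcm.length - 1) { c.active = false; continue; }
            // linear interpolation for non-integer positions
            float frac = (float) (c.pos - idx);
            float sample = pcm[idx] * (1 - frac) + pcm[idx + 1] * frac;
            sum += sample * c.volume;
            c.pos += c.speed;
        }
        // hard-clip the mix to the normalized range before conversion for output
        return Math.max(-1f, Math.min(1f, sum));
    }
}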
I'm trying to record from the microphone to a wav file as per this example. At the same time, I need to be able to test for input level/volume and send an alert if it's too low. I've tried what's described in this link and seems to work ok.
The issue comes when trying to record and read bytes at the same time using one TargetDataLine (bytes read for monitoring are skipped for recording, and vice versa).
Another thing is that these are long processes (hours probably) so memory usage should be considered.
How should I proceed here? Any way to clone TargetDataLine? Can I buffer a number of bytes while writing them with AudioSystem.write()? Is there any other way to write to a .wav file without filling the system memory?
Thanks!
If you are using a TargetDataLine for capturing audio similar to the example given in the Java Tutorials, then you have access to a byte array called "data". You can loop through this array to test the volume level before outputting it.
To do the volume testing, you will have to convert the bytes to some sort of sensible PCM data. For example, if the format is 16-bit stereo little-endian, you would take two bytes and assemble them into either a signed short or a signed, normalized float, and then test that value.
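For example, a per-buffer check could look like the following sketch (assuming 16-bit signed little-endian data; the helper name and threshold are mine, and for stereo the channels are lumped together, which is usually fine for a low-level alert):

// RMS level of a buffer of 16-bit little-endian PCM, normalized to 0..1.
public static double rmsLevel(byte[] data, int bytesRead) {
    long sumOfSquares = 0;
    int sampleCount = 0;
    for (int i = 0; i + 1 < bytesRead; i += 2) {
        short sample = (short) ((data[i] & 0xFF) | (data[i + 1] << 8));
        sumOfSquares += (long) sample * sample;
        sampleCount++;
    }
    if (sampleCount == 0) return 0;
    return Math.sqrt((double) sumOfSquares / sampleCount) / 32768.0;
}

// e.g. if (rmsLevel(data, numBytesRead) < 0.01) { /* warn: input level too low */ }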
I apologize for not looking more closely at your examples before posting my "solution".
I'm going to suggest that you extend InputStream, making a customized version that also performs the volume test. Override the read methods so that they inspect the bytes they pass through, using the volume-testing code you already have. You'll have to modify that code to work on a per-byte basis and to pass the bytes along unchanged.
You should then be able to use this extended InputStream as an argument when you create the AudioInputStream for the output-to-wav stage.
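A sketch of that wrapper, assuming 16-bit signed little-endian PCM (class and method names are my own):

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Passes bytes through untouched while tracking a peak level that a separate
// monitoring thread can poll and compare against a threshold.
public class LevelMonitorInputStream extends FilterInputStream {
    private volatile double peak = 0;   // 0..1, most recent peak seen
    private int pendingLow = -1;        // low byte waiting for its high byte

    public LevelMonitorInputStream(InputStream in) {
        super(in);
    }

    public double getAndResetPeak() {
        double p = peak;
        peak = 0;
        return p;
    }

    private void feed(int unsignedByte) {
        if (pendingLow < 0) {
            pendingLow = unsignedByte;                                 // first byte of the pair
        } else {
            short sample = (short) ((unsignedByte << 8) | pendingLow); // little-endian pair
            double level = Math.abs(sample / 32768.0);
            if (level > peak) peak = level;
            pendingLow = -1;
        }
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) feed(b);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        for (int i = 0; i < n; i++) feed(buf[off + i] & 0xFF);
        return n;
    }
}

You could then wrap the line's stream before handing it to AudioSystem.write(), for example new AudioInputStream(new LevelMonitorInputStream(new AudioInputStream(targetLine)), format, AudioSystem.NOT_SPECIFIED), and have a separate thread poll getAndResetPeak() to raise the low-level alert.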
I've used this approach to save audio successfully via two data sources: once from an array that is populated beforehand, once from a streaming audio mix passing through a "mixer" I wrote to combine audio data sources. The latter would be more like what you need to do. I haven't done it from a microphone source, though. But the same approach should work, as far as I can tell.