Reading and playing an audio file - Java

I'm having some trouble reading and playing certain audio clips on Android 2.0.1 (Motorola Droid A855). Below is the code segment I use. It works fine for some files, but for others it never exits the while loop. I have tried checking the InputStream.available() method, but with no luck. I even logged the number of bytes read, and everything is read correctly until the last round of read() (when fewer than 512 bytes are left), where it gets stuck and never exits the loop.
int sampleFreq = 44100;
int minBufferSize = AudioTrack.getMinBufferSize(sampleFreq, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT);
int bufferSize = 512;
AudioTrack at = new AudioTrack(AudioManager.STREAM_MUSIC, sampleFreq, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STREAM);

InputStream input;
try {
    File fileID = new File(Environment.getExternalStorageDirectory(), resourceID);
    input = new FileInputStream(fileID);
    int filesize = (int) fileID.length();
    int i = 0, byteread = 0;
    byte[] s = new byte[bufferSize];

    at.play();
    while ((i = input.read(s, 0, bufferSize)) > -1) {
        at.write(s, 0, i);
        //at.flush();
        byteread += i;
        Log.i(TAG, "playing audio " + byteread + "\t" + filesize);
    }
    at.stop();
    at.release();
    input.close();
} catch (FileNotFoundException e) {
    // TODO
    e.printStackTrace();
} catch (IOException e) {
    // TODO
    e.printStackTrace();
}
The audio files are around 1-2 MB in size and are in WAV format. The following is an example of the log output:
> : playing audio 1057280 1058474
> : playing audio 1057792 1058474
> : playing audio 1058304 1058474
Any idea why this is happening, given that it runs perfectly for some of the audio files?

Make sure your call to write() always delivers a byte count that is a whole number of sample frames. For your 16-bit stereo format, that means a multiple of 4 bytes.
Additionally, at least until the final write, for stutter-free operation you should really respect the minimum buffer size of the audio subsystem and deliver at least that much data in each call to write().
If your source data is a .wav file, make sure you actually skip the header and read samples only starting from a valid payload (data) chunk.
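A minimal sketch of both points, reusing the variables from the question's snippet: it assumes a canonical 44-byte WAV header (a robust version would walk the RIFF chunks instead) and keeps every write() frame-aligned.

int frameSize = 4;                     // 2 bytes per sample * 2 channels
byte[] buf = new byte[minBufferSize];  // deliver at least the minimum buffer per call

input.skip(44);                        // assumed canonical RIFF/fmt/data header; real files may differ
at.play();
int n;
while ((n = input.read(buf, 0, buf.length)) > -1) {
    int aligned = n - (n % frameSize); // hand only whole frames to write()
    if (aligned > 0) {
        at.write(buf, 0, aligned);
    }
}
at.stop();
at.release();
input.close();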

Related

Java Sound API: Attempt to Do Live Microphone Input Monitoring is Slow

I think I have a performance (latency) issue with the Java Sound API.
Audio Monitor
The following code does indeed work for me. It correctly opens the microphone and plays the input back through my speakers in real time (i.e. monitoring). My concern is the speed of the playback: it lags about half a second behind, from the moment I speak into the microphone to the moment I hear it through the speakers.
How do I increase performance? How do I lower the latency?
private void initForLiveMonitor() {
    AudioFormat format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);
    try {
        // Speaker
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        SourceDataLine sourceLine = (SourceDataLine) AudioSystem.getLine(info);
        sourceLine.open();

        // Microphone
        info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine targetLine = (TargetDataLine) AudioSystem.getLine(info);
        targetLine.open();

        Thread monitorThread = new Thread() {
            @Override
            public void run() {
                targetLine.start();
                sourceLine.start();
                byte[] data = new byte[targetLine.getBufferSize() / 5];
                int readBytes;
                while (true) {
                    readBytes = targetLine.read(data, 0, data.length);
                    sourceLine.write(data, 0, readBytes);
                }
            }
        };

        System.out.println("Start LIVE Monitor for 15 seconds");
        monitorThread.start();
        Thread.sleep(15000);
        targetLine.stop();
        targetLine.close();
        System.out.println("End LIVE Monitor");
    }
    catch (LineUnavailableException lue) { lue.printStackTrace(); }
    catch (InterruptedException ie) { ie.printStackTrace(); }
}
Additional Notes
With this code, the playback is smooth (no pops or jitter), just delayed by about half a second.
I also know that my computer and USB audio interface can handle real-time monitoring, because when I do a side-by-side comparison with Logic Pro X the delay is minimal; I perceive no delay at all.
Making the byte[] smaller or larger hasn't helped.
My conclusion is that this is an issue in my Java code. Thanks in advance.
There is more than one buffer involved!
When you open the SourceDataLine and TargetDataLine, I'd recommend using the form of open() where you specify both the format and the buffer size. I don't know exactly what size to recommend; I haven't played around with this enough to know the optimum for safely piping microphone input, since my experience is more with real-time synthesis.
Anyway, how about this: define the length of data[] and pass the same length to both open() calls. Try numbers like 1024 or multiples of it, while making sure the byte count divides evenly by the frame size, which is 4 bytes for the format you are using.
int bufferLen = 1024 * 4; // experiment with the buffer size here
byte[] data = new byte[bufferLen];
sourceLine.open(format, bufferLen);
targetLine.open(format, bufferLen);
Also, the setup code in your run() might be better placed elsewhere, so it doesn't add to the work required before the piping can even start. The array data[] and the int readBytes could be instance variables, ready to go, rather than being set up inside run() and potentially adding to the latency.
Those are the things I'd try, anyway.
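Pulling those suggestions together, a rough sketch might look like the following; the class and field names are mine, and the 4096-byte buffer is only a starting point to experiment with, not a recommended value.

import javax.sound.sampled.*;

public class LowLatencyMonitor {
    // Same format as in the question: signed 16-bit stereo PCM, 4-byte frames.
    private final AudioFormat format =
            new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);
    private final int bufferLen = 1024 * 4;           // keep this a multiple of the 4-byte frame
    private final byte[] data = new byte[bufferLen];  // allocated once, before the thread starts
    private SourceDataLine sourceLine;
    private TargetDataLine targetLine;

    public void start() throws LineUnavailableException {
        sourceLine = (SourceDataLine) AudioSystem.getLine(new DataLine.Info(SourceDataLine.class, format));
        sourceLine.open(format, bufferLen);           // explicit, matching line buffer sizes
        targetLine = (TargetDataLine) AudioSystem.getLine(new DataLine.Info(TargetDataLine.class, format));
        targetLine.open(format, bufferLen);

        targetLine.start();
        sourceLine.start();
        new Thread(() -> {
            // run() does nothing but the read/write piping
            while (true) {
                int n = targetLine.read(data, 0, data.length);
                sourceLine.write(data, 0, n);
            }
        }).start();
    }
}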

What exactly does AudioInputStream.read method return?

I have some problems finding out what I actually read with the AudioInputStream. The program below just prints the byte array I get, but I don't actually know whether those bytes are the samples, i.e. whether the byte array is the audio wave.
File fileIn;
AudioInputStream audio_in;
byte[] audioBytes;
int numBytesRead;
int numFramesRead;
int numBytes;
int totalFramesRead;
int bytesPerFrame;

try {
    audio_in = AudioSystem.getAudioInputStream(fileIn);
    bytesPerFrame = audio_in.getFormat().getFrameSize();
    if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
        bytesPerFrame = 1;
    }
    numBytes = 1024 * bytesPerFrame;
    audioBytes = new byte[numBytes];
    try {
        numBytesRead = 0;
        numFramesRead = 0;
    } catch (Exception ex) {
        System.out.println("Something went completely wrong");
    }
} catch (Exception e) {
    System.out.println("Something went completely wrong");
}
and in some other part, I read some bytes with this:
try {
    if ((numBytesRead = audio_in.read(audioBytes)) != -1) {
        numFramesRead = numBytesRead / bytesPerFrame;
        totalFramesRead += numFramesRead;
    }
} catch (Exception e) {
    System.out.println("Had problems reading new content");
}
So first of all, this code is not from me. This is my first time reading audio files, so I got some help from the inter-webs (found the link Java - reading, manipulating and writing WAV files on Stack Overflow, who would have known).
The question is: what do the bytes in audioBytes represent? Since the source is 44 kHz stereo, there have to be two waves hiding in there somewhere, am I right? So how do I filter the important information out of these bytes?
EDIT: So what I added is this function:
public short[] Get_Sample() {
    if (samplesRead == 1024) {
        Read_Buffer();
        samplesRead = 4;
    } else {
        samplesRead = samplesRead + 4;
    }
    short sample[] = new short[2];
    sample[0] = (short) (audioBytes[samplesRead - 4] + 256 * audioBytes[samplesRead - 3]);
    sample[1] = (short) (audioBytes[samplesRead - 2] + 256 * audioBytes[samplesRead - 1]);
    return sample;
}
where Read_Buffer() reads the next 1024 (or fewer) bytes and loads them into audioBytes. sample[0] is used for the left channel, sample[1] for the right. But I'm still not sure, since the waves I get from this look quite "noisy". (Edit: the WAV actually used little-endian byte order, so I had to change the calculation.)
The AudioInputStream read() method returns the raw audio data. You don't know the layout of that data until you read the audio format via getFormat(), which returns an AudioFormat. From the AudioFormat you can call getChannels(), getSampleSizeInBits() and more, because an AudioInputStream is always tied to a known format.
When you reconstruct a sample value there are several possibilities for signedness and endianness (in the case of 16-bit samples). To make your code more generic, use the AudioFormat object returned by the AudioInputStream to learn about the data buffer:
getEncoding(): PCM_SIGNED, PCM_UNSIGNED, ...
isBigEndian(): true or false
As you have already discovered, building samples incorrectly leads to distorted sound. If you work with a variety of files, this can cause problems down the road. If you don't intend to support some formats, just check what the AudioFormat says and throw an exception (e.g. javax.sound.sampled.UnsupportedAudioFileException). It will save you time.
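To illustrate, here is a small sketch of format-aware decoding for the common signed 16-bit PCM case; the class and method names are made up for this example, and only the AudioFormat queries come from the javax.sound.sampled API.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import javax.sound.sampled.AudioFormat;

public class SampleDecoder {
    // Decodes one frame (one sample per channel) starting at frameOffset in buffer.
    public static short[] decodeFrame(AudioFormat format, byte[] buffer, int frameOffset) {
        if (format.getSampleSizeInBits() != 16
                || !AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())) {
            throw new UnsupportedOperationException("only signed 16-bit PCM handled here");
        }
        // Byte order and channel count come from the format, not from guessing.
        ByteOrder order = format.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        ByteBuffer bb = ByteBuffer.wrap(buffer, frameOffset, format.getFrameSize()).order(order);
        short[] frame = new short[format.getChannels()];   // e.g. [left, right] for stereo
        for (int ch = 0; ch < frame.length; ch++) {
            frame[ch] = bb.getShort();                      // respects the byte order set above
        }
        return frame;
    }
}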

Choose sound card channel for output

I want to output some audio from a Java program to a sound card with multiple channels (8 channels).
For this I want to send different sounds to different channels of the sound card, so they can be played on different speakers. The audio being sent comes from different files (maybe MP3).
Does anyone have suggestions on how to do this in Java?
What I have tried so far is the javax.sound.sampled library. I did manage to send some sound out of the speakers, but not to decide which channels to use.
I have tried using Port.Info, but can't seem to handle its syntax.
Here is the code so far, and this works:
// open up an audio stream
private static void init() {
    try {
        // 44,100 samples per second, 16-bit audio, mono, signed PCM, little endian
        AudioFormat format = new AudioFormat((float) SAMPLE_RATE, BITS_PER_SAMPLE, 1, true, false);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        //Port.Info port = new Port.Info((Port.class), Port.Info.SPEAKER, false);
        //Port.Info pi = new Port.Info((Port.class), "HEADPHONES", false);
        line = (SourceDataLine) AudioSystem.getLine(info);
        //line = (SourceDataLine) AudioSystem.getLine(port);
        line.open(format, SAMPLE_BUFFER_SIZE * BYTES_PER_SAMPLE);

        // the internal buffer is a fraction of the actual buffer size, this choice is arbitrary
        // it gets divided because we can't expect the buffered data to line up exactly with when
        // the sound card decides to push out its samples.
        buffer = new byte[SAMPLE_BUFFER_SIZE * BYTES_PER_SAMPLE / 3];
    } catch (Exception e) {
        System.out.println(e.getMessage());
        System.exit(1);
    }

    // no sound gets made before this call
    line.start();
}
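One commonly suggested approach, heavily dependent on whether the card's mixer actually exposes a multichannel PCM line, is to open the SourceDataLine with an 8-channel AudioFormat and interleave silence into every channel except the one you want to address. The sketch below is illustrative only; the names are made up and nothing here is guaranteed to be supported by a given driver.

import javax.sound.sampled.*;

public class ChannelRouter {
    // Opens an 8-channel, signed 16-bit little-endian PCM line, if the mixer supports one.
    public static SourceDataLine openEightChannelLine() throws LineUnavailableException {
        AudioFormat format = new AudioFormat(44100f, 16, 8, true, false);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(format);
        return line;
    }

    // Expands mono 16-bit little-endian samples into N-channel frames,
    // putting the sound on targetChannel and silence everywhere else.
    public static byte[] routeMonoToChannel(byte[] mono16le, int targetChannel, int totalChannels) {
        int samples = mono16le.length / 2;
        byte[] out = new byte[samples * totalChannels * 2];
        for (int s = 0; s < samples; s++) {
            int dst = (s * totalChannels + targetChannel) * 2;
            out[dst] = mono16le[2 * s];         // low byte
            out[dst + 1] = mono16le[2 * s + 1]; // high byte
        }
        return out;
    }
}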

Sound wave from TargetDataLine

Currently I am trying to record a sound wave from a mic and display amplitude values in real time in Java. I came across TargetDataLine, but I am having a bit of trouble understanding how to get data from it.
Sample code from Oracle states:
line = (TargetDataLine) AudioSystem.getLine(info);
line.open(format, line.getBufferSize());

ByteArrayOutputStream out = new ByteArrayOutputStream();
int numBytesRead;
byte[] data = new byte[line.getBufferSize() / 5];

// Begin audio capture.
line.start();

// Here, stopped is a global boolean set by another thread.
while (!stopped) {
    // Read the next chunk of data from the TargetDataLine.
    numBytesRead = line.read(data, 0, data.length);

    // **** ADDED CODE HERE ****

    // Save this chunk of data.
    out.write(data, 0, numBytesRead);
}
So I am currently trying to add code that gives me a stream of amplitude values, but when I print the variable data at the added-code line, all I get is a ton of bytes:
for (int j = 0; j < data.length; j++) {
    System.out.format("%02X ", data[j]);
}
Does anyone who has used TargetDataLine before know how I can make use of it?
For anyone who has trouble using TargetDataLine for sound extraction in the future: the WaveData class by Ganesh Tiwari contains a very helpful method that turns bytes into a float array (http://code.google.com/p/speech-recognition-java-hidden-markov-model-vq-mfcc/source/browse/trunk/SpeechRecognitionHMM/src/org/ioe/tprsa/audio/WaveData.java):
public float[] extractFloatDataFromAudioInputStream(AudioInputStream audioInputStream) {
    format = audioInputStream.getFormat();
    audioBytes = new byte[(int) (audioInputStream.getFrameLength() * format.getFrameSize())];

    // calculate durationSec
    float milliseconds = (long) ((audioInputStream.getFrameLength() * 1000) / audioInputStream.getFormat().getFrameRate());
    durationSec = milliseconds / 1000.0;
    // System.out.println("The current signal has duration " + durationSec + " Sec");

    try {
        audioInputStream.read(audioBytes);
    } catch (IOException e) {
        System.out.println("IOException during reading audioBytes");
        e.printStackTrace();
    }
    return extractFloatDataFromAmplitudeByteArray(format, audioBytes);
}
Using this I can get sound amplitude data.
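For the real-time display asked about in the question, the same byte-to-amplitude conversion can also be done per chunk instead of over a whole AudioInputStream. Here is a minimal sketch, assuming the line was opened as signed 16-bit little-endian PCM (the helper name is made up); it could be called at the "ADDED CODE HERE" spot with the question's data and numBytesRead variables.

static short[] toAmplitudes(byte[] data, int numBytesRead) {
    short[] amplitudes = new short[numBytesRead / 2];
    for (int i = 0; i < amplitudes.length; i++) {
        // low byte is unsigned, high byte carries the sign (little-endian)
        amplitudes[i] = (short) ((data[2 * i] & 0xFF) | (data[2 * i + 1] << 8));
    }
    return amplitudes;
}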

TargetDataLine and Xuggler to record audio with a video of the screen

So far, TargetDataLine is the easiest way for me to capture microphone input in Java. I want to encode the audio that I capture together with a video of the screen [in a screen recorder application] so that the user can create a tutorial, slide case etc.
I use Xuggler to encode the video.
They do have a tutorial on encoding audio with video, but there the audio comes from a file; in my case, the audio is live.
To encode the video I use com.xuggle.mediaTool.IMediaWriter. The IMediaWriter object allows me to add a video stream and has an
encodeAudio(int streamIndex, short[] samples, long timeStamp, TimeUnit timeUnit)
method. I can use that if I can get the samples from the TargetDataLine as short[], but the TargetDataLine gives me byte[].
So my two questions are:
How can I encode the live audio with video?
How do I maintain the proper timing of the audio packets so that they are encoded at the proper time?
References:
1. JavaDoc for TargetDataLine: http://docs.oracle.com/javase/1.4.2/docs/api/javax/sound/sampled/TargetDataLine.html
2. Xuggler Documentation: http://build.xuggle.com/view/Stable/job/xuggler_jdk5_stable/javadoc/java/api/index.html
Update
My code for capturing video
public void run() {
    final IRational FRAME_RATE = IRational.make(frameRate, 1);
    final IMediaWriter writer = ToolFactory.makeWriter(completeFileName);
    writer.addVideoStream(0, 0, FRAME_RATE, recordingArea.width, recordingArea.height);
    long startTime = System.nanoTime();

    while (keepCapturing == true) {
        image = bot.createScreenCapture(recordingArea);
        PointerInfo pointerInfo = MouseInfo.getPointerInfo();
        Point globalPosition = pointerInfo.getLocation();
        int relativeX = globalPosition.x - recordingArea.x;
        int relativeY = globalPosition.y - recordingArea.y;

        BufferedImage bgr = convertToType(image, BufferedImage.TYPE_3BYTE_BGR);
        if (cursor != null) {
            bgr.getGraphics().drawImage(((ImageIcon) cursor).getImage(), relativeX, relativeY, null);
        }

        try {
            writer.encodeVideo(0, bgr, System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
        } catch (Exception e) {
            writer.close();
            JOptionPane.showMessageDialog(null,
                    "Recording will stop abruptly because " +
                    "an error has occurred", "Error", JOptionPane.ERROR_MESSAGE, null);
        }

        try {
            sleep(sleepTime);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    writer.close();
}
I answered most of that recently under this question: Xuggler encoding and muxing
Code sample:
writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);

while (... have more data ...)
{
    BufferedImage videoFrame = ...;
    long videoFrameTime = ...; // this is the time to display this frame
    writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);

    short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
    long audioSamplesTime = ...; // this is the time to play back this bit of audio
    writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}
In the case of TargetDataLine, getMicrosecondPosition() will tell you the time you need for audioSamplesTime. This appears to start from the time the TargetDataLine was opened. You need to figure out how to get a video timestamp referenced to the same clock, which depends on the video device and/or how you capture video. The absolute values do not matter as long as they are both using the same clock. You could subtract the initial value (at start of stream) from both your video and your audio times so that the timestamps match, but that is only a somewhat approximate match (probably close enough in practice).
You need to call encodeVideo and encodeAudio in strictly increasing order of time; you may have to buffer some audio and some video to make sure you can do that. More details here.
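As a rough sketch of the audio half of that loop (not taken from the Xuggler docs), assuming the TargetDataLine was opened as signed 16-bit little-endian PCM and reusing the variable names from the snippets above:

byte[] buffer = new byte[targetLine.getBufferSize() / 5];
int bytesRead = targetLine.read(buffer, 0, buffer.length);

// Pack the little-endian byte pairs into the short[] that encodeAudio() expects.
short[] audioSamples = new short[bytesRead / 2];
for (int i = 0; i < audioSamples.length; i++) {
    audioSamples[i] = (short) ((buffer[2 * i] & 0xFF) | (buffer[2 * i + 1] << 8));
}

// Timestamp from the capture line itself, referenced to when the line was opened.
long audioSamplesTime = targetLine.getMicrosecondPosition();
writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, TimeUnit.MICROSECONDS);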
