I am going to output some audio from a Java program to a sound card with multiple channels (8 channels).
I want to send different sounds to different channels of the sound card (so they can be played through different speakers). The audio being sent comes from different files (maybe MP3).
Does anyone have any suggestions on how to do this in Java?
What I have tried so far is the javax.sound.sampled library. I did manage to send some sound out to the speakers, but not to choose which channels to use.
I have tried using Port.Info, but can't seem to get its syntax right.
Here is the code so far, and this works:
// open up an audio stream
private static void init() {
    try {
        // 44,100 samples per second, 16-bit audio, mono, signed PCM, little-endian
        AudioFormat format = new AudioFormat((float) SAMPLE_RATE, BITS_PER_SAMPLE, 1, true, false);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        //Port.Info port = new Port.Info((Port.class), Port.Info.SPEAKER, false);
        //Port.Info pi = new Port.Info((Port.class), "HEADPHONES", false);
        line = (SourceDataLine) AudioSystem.getLine(info);
        //line = (SourceDataLine) AudioSystem.getLine(port);
        line.open(format, SAMPLE_BUFFER_SIZE * BYTES_PER_SAMPLE);

        // the internal buffer is a fraction of the actual buffer size, this choice is arbitrary
        // it gets divided because we can't expect the buffered data to line up exactly with when
        // the sound card decides to push out its samples.
        buffer = new byte[SAMPLE_BUFFER_SIZE * BYTES_PER_SAMPLE / 3];
    } catch (Exception e) {
        System.out.println(e.getMessage());
        System.exit(1);
    }

    // no sound gets made before this call
    line.start();
}
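For reference, here is a minimal sketch of what I am hoping to achieve: requesting an 8-channel format and interleaving samples so that only one chosen channel carries sound. Whether the card's Java mixer actually exposes an 8-channel SourceDataLine is an assumption on my part:

import javax.sound.sampled.*;

public class MultiChannelSketch {
    public static void main(String[] args) throws LineUnavailableException {
        // assumption: the sound card's mixer offers a line with 8 interleaved channels
        AudioFormat format = new AudioFormat(44100f, 16, 8, true, false);
        SourceDataLine line = (SourceDataLine) AudioSystem.getLine(
                new DataLine.Info(SourceDataLine.class, format));
        line.open(format);
        line.start();

        int frames = 44100;                       // one second of audio
        byte[] buffer = new byte[frames * 8 * 2]; // 8 channels * 2 bytes per sample
        for (int f = 0; f < frames; f++) {
            short sample = (short) (Math.sin(2 * Math.PI * 440 * f / 44100.0) * Short.MAX_VALUE);
            int channel = 3;                      // play a 440 Hz tone on channel index 3 only
            int idx = (f * 8 + channel) * 2;
            buffer[idx] = (byte) (sample & 0xFF);            // little-endian: low byte first
            buffer[idx + 1] = (byte) ((sample >> 8) & 0xFF); // then high byte
        }
        line.write(buffer, 0, buffer.length);
        line.drain();
        line.close();
    }
}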
Good morning folks. I am trying to send audio data from a microphone attached to an ESP32 board over WiFi to my desktop, which runs some Java code. If I play the audio data using Java's AudioSystem library it is a bit staticky but legible. Switching to the Sphinx-4 library, which converts audio to text, it only sometimes recognizes the words.
This is the first time I've had to mess with raw audio data, so it may not even be possible: the board can only read up to 12-bit signals, which means that when converting to 16 bits, every single 12-bit value maps onto roughly 15 16-bit values. It could also be due to the roughly 115-microsecond delay used to downsample to 16 kHz.
How can I smooth out the audio playback enough that it can be easily recognized by the Sphinx-4 library? The current implementation has very small breaks and some noise that I think is throwing it off.
ESP32 Code:
#define BUFFERMAX 8000
#define ONE_SECOND 1000000

int writeBuffer[BUFFERMAX];

void writeAudio() {
    for (int i = 0; i < BUFFERMAX; i = i + 1) {
        // data read in is 12 bits, so I mapped the value to 16 bits (2 bytes)
        sensorValue = (map(analogRead(sensorPin), 0, 4096, -32000, 32000));
        // none to minimal sound is around -7000, so try to zero out additional noise with an average
        int prevAvg = avg;
        avg = (avg + sensorValue) / 2;
        sensorValue = (abs(prevAvg) + sensorValue);
        if (abs(sensorValue) < 1000) { sensorValue = 0; }
        writeBuffer[i] = ((sensorValue));
        // delay so that 8000 ints (16000 bytes) takes one second to record
        delayMicroseconds(delayMicro);
    }
    client.write((byte*)writeBuffer, sizeof(writeBuffer));
}
Java Sphinx:
StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
// Start recognition process pruning previously cached data.
recognizer.startRecognition(socket.getInputStream() );
System.out.print("awaiting command...");
SpeechResult result = recognizer.getResult();
System.out.println(result.getHypothesis().toLowerCase());
Java play audio:
private static void init() throws LineUnavailableException {
    // specifying the audio format
    AudioFormat _format = new AudioFormat(16000.F, // Sample Rate
            16,    // Size of SampleBits
            1,     // Number of Channels
            true,  // Is Signed?
            false  // Is Big Endian?
    );
    // creating the DataLine Info for the speaker format
    DataLine.Info speakerInfo = new DataLine.Info(SourceDataLine.class, _format);
    // getting the mixer for the speaker
    _speaker = (SourceDataLine) AudioSystem.getLine(speakerInfo);
    _speaker.open(_format);
}
_streamIn = socket.getInputStream();
_speaker.start();
byte[] data = new byte[16000];
System.out.println("Waiting for data...");
while (_running) {
    long start = new Date().getTime();
    // checking if the data is available to speak
    if (_streamIn.available() <= 0)
        continue; // data not available so continue back to start of loop
    // count of the data bytes read
    int readCount = _streamIn.read(data, 0, data.length);
    if (readCount > 0 && (readCount % 2) == 0) {
        System.out.println(readCount);
        _speaker.write(data, 0, readCount);
        readCount = 0;
    }
    System.out.println("Time: " + (new Date().getTime() - start));
}
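One idea I am considering on the Java side (just a sketch, untested): instead of polling available() and writing whatever arrived, read fixed, frame-aligned blocks with java.io.DataInputStream.readFully, so the speaker is always fed complete 2-byte samples. The 3200-byte block size here is an assumption (100 ms of 16 kHz, 16-bit mono audio):

DataInputStream in = new DataInputStream(socket.getInputStream());
byte[] block = new byte[3200]; // 100 ms of 16 kHz, 16-bit mono
_speaker.start();
while (_running) {
    in.readFully(block);                    // blocks until a full chunk has arrived
    _speaker.write(block, 0, block.length);
}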
I think I have a performance (latency) issue with the Java Sound API.
Audio Monitor
The following code does indeed work for me. It correctly opens up the microphone and outputs the audio input through my speakers in real time (i.e. monitoring). But my concern is the speed at which the playback happens... there is half a second between when I speak into my microphone and when it plays back through my speakers.
How do I increase performance? How do I lower the latency?
private void initForLiveMonitor() {
    AudioFormat format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);
    try {
        // Speaker
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
        SourceDataLine sourceLine = (SourceDataLine) AudioSystem.getLine(info);
        sourceLine.open();

        // Microphone
        info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine targetLine = (TargetDataLine) AudioSystem.getLine(info);
        targetLine.open();

        Thread monitorThread = new Thread() {
            @Override
            public void run() {
                targetLine.start();
                sourceLine.start();

                byte[] data = new byte[targetLine.getBufferSize() / 5];
                int readBytes;
                while (true) {
                    readBytes = targetLine.read(data, 0, data.length);
                    sourceLine.write(data, 0, readBytes);
                }
            }
        };

        System.out.println("Start LIVE Monitor for 15 seconds");
        monitorThread.start();
        Thread.sleep(15000);

        targetLine.stop();
        targetLine.close();
        System.out.println("End LIVE Monitor");
    }
    catch (LineUnavailableException lue) { lue.printStackTrace(); }
    catch (InterruptedException ie) { ie.printStackTrace(); }
}
Additional Notes
With this code, the playback is smooth (no pops or jitters), just half a second delayed.
I also know that my computer and USB audio interface are capable of handling real-time monitoring through the computer, because when I do a side-by-side comparison with Logic Pro X there is minimal delay (I perceive no delay at all).
My attempts at making the byte[] size smaller or larger haven't helped the issue.
My conclusion is that this is an issue in my Java code. Thanks in advance.
There is more than one buffer involved!
When you open the SourceDataLine and TargetDataLine, I'd recommend using the form where you specify the buffer size. But I don't know what size to recommend. I haven't played around with this enough to know what the optimum size is for safely piping microphone input--my experience is more with real-time synthesis.
Anyway, how about this: define the length of data[] and use the same length in your line-opening methods. Try numbers like 1024 or multiples of it (while making sure the number of bytes can be evenly divided by the number of bytes per frame, which looks to be 4 according to the format you are using).
int bufferLen = 1024 * 4; // experiment with buffer size here
byte[] data = new byte[bufferLen];
sourceLine.open(format, bufferLen);
targetLine.open(format, bufferLen);
Also, maybe the code in your run() would be better placed elsewhere so as not to add to the required processing before the piping can even start. The array data[] and the int readBytes could be instance variables, ready to roll rather than being set up in run(), potentially adding to the latency.
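A rough sketch of that last idea (the field names here are just for illustration): allocate the buffer once, before starting the thread, so run() does nothing but pipe data.
// instance fields, assigned in initForLiveMonitor() before monitorThread.start()
private byte[] monitorData;
private int monitorReadBytes;

// ...inside initForLiveMonitor(), after opening both lines:
monitorData = new byte[bufferLen];
Thread monitorThread = new Thread() {
    @Override
    public void run() {
        targetLine.start();
        sourceLine.start();
        while (true) {
            monitorReadBytes = targetLine.read(monitorData, 0, monitorData.length);
            sourceLine.write(monitorData, 0, monitorReadBytes);
        }
    }
};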
Those are things I'd try, anyway.
I'm having some trouble with reading and playing certain audio clips on Android 2.0.1 (Motorola Droid A855). Below is the code segment that I use. It works fine for some files, but for other files it just doesn't exit the while loop. I have tried checking
InputStream.available()
method, but with no luck. I even printed out the number of bytes it reads properly before getting stuck. It seems that it gets stuck in the loop on the last round of reading (when fewer than 512 bytes are left), and doesn't exit the loop.
int sampleFreq = 44100;
int minBufferSize = AudioTrack.getMinBufferSize(sampleFreq, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT);
int bufferSize = 512;
AudioTrack at = new AudioTrack(AudioManager.STREAM_MUSIC, sampleFreq, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STREAM);

InputStream input;
try {
    File fileID = new File(Environment.getExternalStorageDirectory(), resourceID);
    input = new FileInputStream(fileID);
    int filesize = (int) fileID.length();

    int i = 0, byteread = 0;
    byte[] s = new byte[bufferSize];
    at.play();
    while ((i = input.read(s, 0, bufferSize)) > -1) {
        at.write(s, 0, i);
        //at.flush();
        byteread += i;
        Log.i(TAG, "playing audio " + byteread + "\t" + filesize);
    }
    at.stop();
    at.release();
    input.close();
} catch (FileNotFoundException e) {
    // TODO
    e.printStackTrace();
} catch (IOException e) {
    // TODO
    e.printStackTrace();
}
The audio files are around 1-2 MB in size and are in WAV format. The following is an example of the logging:
> : playing audio 1057280 1058474
> : playing audio 1057792 1058474
> : playing audio 1058304 1058474
Any idea why this is happening? It runs perfectly for some of the audio files.
Make sure your call to write() always delivers a byte size which is an integral number of samples.
For your 16 bit stereo mode, that should be an integral multiple of 4 bytes.
Additionally, for stutter-free operation you should really respect the minimum buffer size of the audio subsystem and deliver at least that much data in each call to the audio write method, at least up until the final write.
If your source data is a .wav file, make sure you actually skip the header and read samples only starting from a valid payload chunk.
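As a rough sketch of that last point (assuming a plain 16-bit PCM file with the canonical 44-byte RIFF/WAVE header and no extra chunks before "data"; a robust version would scan for the "data" chunk instead):

// skip the canonical 44-byte WAV header so only PCM samples reach AudioTrack
InputStream input = new FileInputStream(fileID);
long toSkip = 44;
while (toSkip > 0) {
    long skipped = input.skip(toSkip);
    if (skipped <= 0) break; // stream could not skip further
    toSkip -= skipped;
}
// ...then read and write in multiples of 4 bytes (one 16-bit stereo frame) as before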
TargetDataLine is, for me so far, the easiest way to capture microphone input in Java. I want to encode the audio that I capture together with a video of the screen [in screen recorder software] so that the user can create a tutorial, slide case, etc.
I use Xuggler to encode the video.
They do have a tutorial on encoding audio with video but they take their audio from a file. In my case, the audio is live.
To encode the video I use com.xuggle.mediaTool.IMediaWriter. The IMediaWriter object allows me to add a video stream and has an
encodeAudio(int streamIndex, short[] samples, long timeStamp, TimeUnit timeUnit)
method. I can use that if I can get the samples from the TargetDataLine as short[], but it returns byte[].
So two questions are:
How can I encode the live audio with video?
How do I maintain the proper timing of the audio packets so that they are encoded at the proper time?
References:
1. JavaDoc for TargetDataLine: http://docs.oracle.com/javase/1.4.2/docs/api/javax/sound/sampled/TargetDataLine.html
2. Xuggler Documentation: http://build.xuggle.com/view/Stable/job/xuggler_jdk5_stable/javadoc/java/api/index.html
Update
My code for capturing video
public void run() {
    final IRational FRAME_RATE = IRational.make(frameRate, 1);
    final IMediaWriter writer = ToolFactory.makeWriter(completeFileName);
    writer.addVideoStream(0, 0, FRAME_RATE, recordingArea.width, recordingArea.height);
    long startTime = System.nanoTime();

    while (keepCapturing == true) {
        image = bot.createScreenCapture(recordingArea);
        PointerInfo pointerInfo = MouseInfo.getPointerInfo();
        Point globalPosition = pointerInfo.getLocation();

        int relativeX = globalPosition.x - recordingArea.x;
        int relativeY = globalPosition.y - recordingArea.y;

        BufferedImage bgr = convertToType(image, BufferedImage.TYPE_3BYTE_BGR);
        if (cursor != null) {
            bgr.getGraphics().drawImage(((ImageIcon) cursor).getImage(), relativeX, relativeY, null);
        }

        try {
            writer.encodeVideo(0, bgr, System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
        } catch (Exception e) {
            writer.close();
            JOptionPane.showMessageDialog(null,
                    "Recording will stop abruptly because " +
                    "an error has occurred", "Error", JOptionPane.ERROR_MESSAGE, null);
        }

        try {
            sleep(sleepTime);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    writer.close();
}
I answered most of that recently under this question: Xuggler encoding and muxing
Code sample:
writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);
while (... have more data ...)
{
BufferedImage videoFrame = ...;
long videoFrameTime = ...; // this is the time to display this frame
writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);
short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
long audioSamplesTime = ...; // this is the time to play back this bit of audio
writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}
In the case of TargetDataLine, getMicrosecondPosition() will tell you the time you need for audioSamplesTime. This appears to start from the time the TargetDataLine was opened. You need to figure out how to get a video timestamp referenced to the same clock, which depends on the video device and/or how you capture video. The absolute values do not matter as long as they are both using the same clock. You could subtract the initial value (at start of stream) from both your video and your audio times so that the timestamps match, but that is only a somewhat approximate match (probably close enough in practice).
You need to call encodeVideo and encodeAudio in strictly increasing order of time; you may have to buffer some audio and some video to make sure you can do that. More details here.
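As a rough sketch of the audio side (the buffer size and mono format are assumptions, not taken from your code, and targetDataLine here stands for your opened capture line; needs java.nio.ByteBuffer, java.nio.ByteOrder, and java.util.concurrent.TimeUnit), converting the TargetDataLine's little-endian byte[] into the short[] that encodeAudio() expects could look like this:

// assumes the TargetDataLine was opened with 16-bit signed little-endian PCM
byte[] audioBytes = new byte[4096];
int read = targetDataLine.read(audioBytes, 0, audioBytes.length);

// repack pairs of bytes into 16-bit samples (little-endian: low byte first)
short[] audioSamples = new short[read / 2];
ByteBuffer.wrap(audioBytes, 0, read)
          .order(ByteOrder.LITTLE_ENDIAN)
          .asShortBuffer()
          .get(audioSamples);

// timestamp taken from the capture line's own clock, as described above
long audioSamplesTime = targetDataLine.getMicrosecondPosition();
writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, TimeUnit.MICROSECONDS);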
This is code that attempts to record an audio sample, but I have an AudioFormat object that is never constructed (it gets passed to DataLine.Info) because I don't know the sample rate.
EDIT
I have seen that just arbitrarily setting a sample rate of 8000 works. But is that fine? Can I use any value for the sample rate?
boolean lineIsStopped = false;
TargetDataLine line = null;
AudioFormat af; // object not constructed throughout
DataLine.Info info = new DataLine.Info(TargetDataLine.class, af); // af not initialized
try {
    line = (TargetDataLine) AudioSystem.getLine(info);
    line.open(af);
} catch (LineUnavailableException ex) {
    // handle the error
}

// now we are ready for input
// call start to begin accepting data from the mic
byte[] data = new byte[line.getBufferSize() / 5];
line.start(); // this statement starts delivering data into the line buffer

// start retrieving data from the line buffer
int numBytesRead;
int offset = 0;
ByteArrayOutputStream out = new ByteArrayOutputStream();
while (!lineIsStopped) { // while the line is not stopped, i.e. is active
    numBytesRead = line.read(data, offset, data.length);
    // now save the data
    try {
        out.write(data); // writes data to this output stream
    } catch (Exception exc) {
        System.out.println(exc);
    }
}
Given this, how can I construct the AudioFormat object without having any audio sample to derive it from?
After reading your comments, it sounds like you are recording from the mic, in which case you want to set the audio format according to the quality you want from the mic. If you want telephone quality, 8 kHz would be fine. If you want tape quality, use 22 kHz, and if you want CD-quality audio, 44.1 kHz. Of course, if you are transmitting that over the network, then 8 kHz is probably going to be good enough.
It's always a good idea to make this a setting in your application so the user can control what quality they want.
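For example, a minimal construction for 8 kHz, 16-bit, signed, mono, little-endian PCM that slots into your existing snippet (swap in 22050 or 44100 for the higher quality tiers):

float sampleRate = 8000f; // 8000, 22050, or 44100 depending on the quality you want
AudioFormat af = new AudioFormat(sampleRate, 16, 1, true, false);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, af);
TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(af);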