I'm trying to write a small program that reacts when the user is speaking, e.g. by having a circle get bigger.
I'm using the code below to access the microphone, but how do I make it react only while the user is speaking, i.e. when the recorded volume exceeds some threshold?
TargetDataLine line = null;
AudioFormat format = new AudioFormat(16000, 16, 1, true, true);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
if (!AudioSystem.isLineSupported(info)) {
    System.out.println("Line is not supported");
}
try {
    line = (TargetDataLine) AudioSystem.getLine(info);
    line.open(format);
} catch (LineUnavailableException e) {
    System.out.println("Failed to get line");
    System.exit(-1);
}
ByteArrayOutputStream out = new ByteArrayOutputStream();
int numBytesRead;
byte[] data = new byte[line.getBufferSize() / 5];
// Begin audio capture.
line.start();
int i = 0;
// Read 100 chunks, then stop.
while (i < 100) {
    // Read the next chunk of data from the TargetDataLine.
    numBytesRead = line.read(data, 0, data.length);
    // Save this chunk of data.
    out.write(data, 0, numBytesRead);
    i++;
    System.out.println(i);
}
Within the last while loop, you are collecting sound data in the buffer variable "data". What you need to do is assemble those bytes into usable DSP values. The code for doing so depends on the format. Most common is 16-bit encoding, stereo, little-endian, where you assemble pairs of bytes into values with the first byte as the low-order bits and the second as the high-order bits; note that your AudioFormat(16000, 16, 1, true, true) is actually 16-bit, mono, big-endian, so the byte order is reversed. There are several posts on this subject with the details of how to handle this.
The values will range from -32768 to 32767 (the range of a 16-bit short). It is hard to say where you will want your threshold to be, as perceived volume depends not only on the absolute value (larger is louder) but also on how much time is spent at the larger values. It is possible for a "quiet" sound to have transients with very large values. Also, the numbers don't correspond directly to decibels; a conversion formula is needed for that.
So there are a couple of issues to deal with, but if you just get into the while loop and decode "data" you might be able to get something quick and dirty that works "well enough".
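For example, one quick-and-dirty approach is to compute the RMS of each buffer and compare it to a threshold. Here is a minimal sketch assuming the 16-bit, signed, big-endian, mono format from your AudioFormat constructor; isSpeaking and THRESHOLD are names I made up, and the threshold value has to be tuned by ear:
static final double THRESHOLD = 1000.0; // hypothetical value, tune experimentally

static boolean isSpeaking(byte[] data, int numBytesRead) {
    long sumOfSquares = 0;
    int sampleCount = numBytesRead / 2;
    for (int i = 0; i < sampleCount * 2; i += 2) {
        // Big-endian: the first byte holds the high-order bits.
        int sample = (data[i] << 8) | (data[i + 1] & 0xFF);
        sumOfSquares += (long) sample * sample;
    }
    double rms = Math.sqrt((double) sumOfSquares / Math.max(sampleCount, 1));
    return rms > THRESHOLD;
}
You would call isSpeaking(data, numBytesRead) once per pass through the while loop and grow the circle whenever it returns true.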
Related
In my project I have implemented an audio record option. For reading real-time voice I used TargetDataLine. I want to record the audio at a higher volume. How can I do that?
TargetDataLine line;
DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
File file = new File("RecordedVoice.raw");
if (file.exists())
    file.delete();
file.createNewFile();
fos = new FileOutputStream(file);
if (!AudioSystem.isLineSupported(info)) {
    System.out.println("Line not supported: " + format.toString());
    System.exit(0);
} else {
    try {
        line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        out = new ByteArrayOutputStream();
        int numBytesRead;
        byte[] data = new byte[line.getBufferSize() / 5];
        line.start();
        while (!isCancelled()) {
            numBytesRead = line.read(data, 0, data.length);
            out.write(data, 0, numBytesRead);
            // Write only the bytes actually read, not the whole buffer.
            fos.write(data, 0, numBytesRead);
        }
        out.close();
        fos.close();
    } catch (Exception excp) {
        System.out.println("Error! Could not open Audio System line!");
        excp.printStackTrace();
    }
}
In the while loop, take these steps (see the sketch after the list):
convert the bytes to PCM values (this depends on the format: e.g., 16-bit is two bytes per sample, 24-bit is three; also pay attention to whether it is big-endian or little-endian, as that determines the byte order)
multiply the PCM values by a volume factor (e.g., 1.1 to raise the volume a bit); you may also want to clamp with Math.min()/Math.max() to prevent the values from exceeding the range of the bits used to encode them
convert the PCM values back to bytes, according to your audio format's specs
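Here is a minimal sketch of those three steps, assuming 16-bit, little-endian, signed, mono PCM; amplify and volume are hypothetical names:
static void amplify(byte[] data, int numBytesRead, float volume) {
    for (int i = 0; i + 1 < numBytesRead; i += 2) {
        // Step 1: assemble two little-endian bytes into one PCM value.
        int sample = (data[i + 1] << 8) | (data[i] & 0xFF);
        // Step 2: scale, then clamp to the 16-bit range to avoid wrap-around.
        int scaled = Math.max(-32768, Math.min(32767, (int) (sample * volume)));
        // Step 3: convert the PCM value back to bytes.
        data[i] = (byte) (scaled & 0xFF);
        data[i + 1] = (byte) ((scaled >> 8) & 0xFF);
    }
}
In your loop you would call amplify(data, numBytesRead, 1.1f) between line.read() and the writes.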
There may be a way to do this via Controls instead. The plan I wrote about is basically how to do it manually, as described in the linked tutorial in the last section, "Manipulating the Audio Data Directly". Benefits: this stays entirely within Java rather than depending on the given PC and OS (audio Controls are not guaranteed to exist), and you get frame-level granularity. I think the Controls in the article tend to only enact changes at buffer boundaries.
As a hobby project, I'm writing an Android VoIP client. When writing voice data to the socket (Vars.mediaSocket), much of the time the data isn't sent out over the WiFi immediately; it stalls, and then 20 seconds' worth of voice is sent all at once. Then it stalls again, waits 30 seconds, and sends 30 seconds of voice. The wait is not consistent, though after a while it will continuously send voice data immediately. I've tried everything from using DataOutputStream, to setting the socket output buffer size, to making the send buffer huge and then small, and lastly to buffering the voice data from its 32-byte chunks into anything from 128 bytes to 32 KB.
Utils.logcat(Const.LOGD, encTag, "MediaCodec encoder thread has started");
isEncoding = true;
byte[] amrbuffer = new byte[32];
short[] wavbuffer = new short[160];
int outputCounter = 0;

// Set up the wave audio recorder. Since it is released and restarted,
// it needs to be set up here and not in onCreate.
wavRecorder = null; // remove pointer to the old recorder for safety
wavRecorder = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLESWAV, AudioFormat.CHANNEL_IN_MONO, FORMAT, 160);
wavRecorder.startRecording();
AmrEncoder.init(0);

while (!micMute)
{
    int totalRead = 0, dataRead;
    while (totalRead < 160)
    { // although unlikely to be necessary, buffer the mic input
        dataRead = wavRecorder.read(wavbuffer, totalRead, 160 - totalRead);
        totalRead = totalRead + dataRead;
    }
    int encodeLength = AmrEncoder.encode(AmrEncoder.Mode.MR122.ordinal(), wavbuffer, amrbuffer);
    try
    {
        Vars.mediaSocket.getOutputStream().write(amrbuffer);
        Vars.mediaSocket.getOutputStream().flush();
    }
    catch (IOException i)
    {
        Utils.logcat(Const.LOGE, encTag, "Cannot send amr out the media socket");
        Utils.dumpException(tag, i);
    }
}
Is there something I'm missing? To simulate a second cell phone, I have another client which simply reads the voice data, throws it away, and reads again in a loop. I can confirm that when the real cell phone stops sending voice, the simulated one's socket.read hangs until the real one starts sending again.
I'm really hoping not to have to write JNI for the socket, as I don't know anything about that and was hoping to write the app as a standard Java app.
CASE CLOSED: it turned out to be a server-side bug, but the simplify-back-to-basics suggestion is still a good idea.
You are adding most of the latency yourself by reading large amounts of data before writing any of it. You should just use the standard Java copy loop:
byte[] buffer = new byte[8192];
int count;
while ((count = in.read(buffer)) > 0)
{
    out.write(buffer, 0, count);
}
You need to adapt this to incorporate your codec step. Note that you don't need a buffer the size of the entire input. You can tune its size to suit yourself, but 8192 is a good starting point. You can increase it to, say, 32k, but don't decrease it. If your codec needs the data in fixed-size chunks, use a buffer of that size and DataInputStream.readFully(), as in the sketch below. But the larger the buffer, the more the latency.
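For instance, a minimal sketch of the fixed-size variant; FRAME_SIZE and codec.encode() are hypothetical stand-ins for whatever your codec actually requires:
final int FRAME_SIZE = 320; // hypothetical: use your codec's frame size
DataInputStream din = new DataInputStream(in);
byte[] frame = new byte[FRAME_SIZE];
try {
    while (true) {
        din.readFully(frame); // blocks until a full frame has been read
        byte[] encoded = codec.encode(frame); // hypothetical codec call
        out.write(encoded);
    }
} catch (EOFException eof) {
    // The source stream is exhausted; fall through and clean up.
}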
EDIT Specific issues with your code:
byte[] amrbuffer = new byte[AMRBUFFERSIZE];
byte[] outputbuffer = new byte [outputBufferSize];
Remove (see below).
short[] wavbuffer = new short[WAVBUFFERSIZE];
int outputCounter = 0;
Remove outputCounter.
//setup the wave audio recorder. since it is released and restarted, it needs to be setup here and not onCreate
wavRecorder = null; //remove pointer to the old recorder for safety
Pointless. Remove.
wavRecorder = new AudioRecord(MediaRecorder.AudioSource.MIC, SAMPLESWAV, AudioFormat.CHANNEL_IN_MONO, FORMAT, WAVBUFFERSIZE);
wavRecorder.startRecording();
AmrEncoder.init(0);
OK.
try
{
    Vars.mediaSocket.setSendBufferSize(outputBufferSize);
}
catch (SocketException e)
{
    e.printStackTrace();
}
Pointless. Remove. The socket send buffer should be as large as possible. Unless you know that its default size is < outputBufferSize there is no benefit to this. In any case we are getting rid of outputBuffer altogether.
while (!micMute)
{
    int totalRead = 0, dataRead;
    while (totalRead < WAVBUFFERSIZE)
    { // although unlikely to be necessary, buffer the mic input
        dataRead = wavRecorder.read(wavbuffer, totalRead, WAVBUFFERSIZE - totalRead);
        totalRead = totalRead + dataRead;
    }
    int encodeLength = AmrEncoder.encode(AmrEncoder.Mode.MR122.ordinal(), wavbuffer, amrbuffer);
OK.
if (outputCounter == outputBufferSize)
{
    Utils.logcat(Const.LOGD, encTag, "Sending output buffer");
    try
    {
        Vars.mediaSocket.getOutputStream().write(outputbuffer);
        Vars.mediaSocket.getOutputStream().flush();
    }
    catch (IOException i)
    {
        Utils.logcat(Const.LOGE, encTag, "Cannot send amr out the media socket");
        Utils.dumpException(tag, i);
    }
    outputCounter = 0;
}
System.arraycopy(amrbuffer, 0, outputbuffer, outputCounter, encodeLength);
outputCounter = outputCounter + encodeLength;
Utils.logcat(Const.LOGD, encTag, "Output buffer fill: " + outputCounter);
Remove all of the above and substitute:
Vars.mediaSocket.getOutputStream().write(amrbuffer, 0, encodeLength);
This also means you can get rid of 'outputBuffer' as promised.
NB: Don't flush inside loops. As a matter of fact, flushing a socket output stream does nothing, but the general principle still holds.
I want to record some audio from my microphone using the javax.sound API, but the generated file cannot be read by my audio players.
I wrote a test method that starts a thread to record, waits a few seconds, notifies it to stop recording, waits a bit more, and then persists the recorded audio to disk.
Here's the code (excluding exception management).
public void record() {
    VoiceRecorder voiceRecorder = new VoiceRecorder();
    Future<ByteArrayOutputStream> result = executor.submit(voiceRecorder);
    Thread.sleep(3000);
    voiceRecorder.signalStopRecording();
    Thread.sleep(1000);
    ByteArrayOutputStream audio = result.get();
    FileOutputStream stream = new FileOutputStream("./" + filename + ".mp3");
    stream.write(audio.toByteArray());
    stream.close();
}
VoiceRecorder is a class of mine, whose core code is this:
public ByteArrayOutputStream call() {
    AudioFormat standardFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 128, 16, 1, 2, 128, false);
    TargetDataLine microphone = AudioSystem.getTargetDataLine(standardFormat);
    microphone.open(standardFormat);
    int numBytesRead;
    byte[] data = new byte[microphone.getBufferSize() / 5];
    // Begin audio capture.
    microphone.start();
    ByteArrayOutputStream recordedAudioRawData = new ByteArrayOutputStream();
    while (!stopped) {
        // Read the next chunk of data from the TargetDataLine.
        numBytesRead = microphone.read(data, 0, data.length);
        // Save this chunk of data.
        recordedAudioRawData.write(data, 0, numBytesRead);
    }
    return recordedAudioRawData;
}
This code is run by my executor, and the recording does happen; in fact a non-empty file is generated (684 bytes for 3 seconds, 988 bytes for 4 seconds), but it cannot be opened by my players (e.g. VLC).
Where should I look for the issue? Is there any alternative to this approach you would recommend? The next step will be to play back the recorded audio. Thanks.
It looks like you are just writing the raw PCM bytes you have read to the file; this is not a format that most players will know how to deal with.
You need to use something like AudioSystem.write to write the file in a recognized format.
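For example, a minimal sketch that wraps the captured bytes in an AudioInputStream and writes a WAV file (WAV rather than MP3, since javax.sound has no built-in MP3 encoder); audio and standardFormat are the names from your code:
byte[] bytes = audio.toByteArray();
AudioInputStream ais = new AudioInputStream(
        new ByteArrayInputStream(bytes),
        standardFormat,
        bytes.length / standardFormat.getFrameSize()); // length in frames
// AudioSystem.write adds the proper WAV header for you.
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File("./" + filename + ".wav"));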
I am working on sound processing in Java now. Within my project I have to deal with streams, so I have a lot of stuff to do with DataLine and OutputStream or InputStream.
But to me they seem too similar :(
Can someone help me with this question? Thanks in advance!
Here is some of the code I used:
TargetDataLine line;
ByteArrayOutputStream out = new ByteArrayOutputStream();

try {
    line = (TargetDataLine) AudioSystem.getLine(info);
    line.open(format, line.getBufferSize());
} catch (LineUnavailableException ex) {
    shutDown("Unable to open the line: " + ex);
    return;
} catch (SecurityException ex) {
    shutDown(ex.toString());
    return;
} catch (Exception ex) {
    shutDown(ex.toString());
    return;
}

// Compute the buffer size only after the line has been opened.
int frameSizeInBytes = format.getFrameSize();
int bufferLengthInFrames = line.getBufferSize() / 8;
int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;
byte[] data = new byte[bufferLengthInBytes];
int numBytesRead;

line.start();

while (thread != null) {
    if ((numBytesRead = line.read(data, 0, bufferLengthInBytes)) == -1) {
        break;
    }
    out.write(data, 0, numBytesRead);
}
I have read the documentation of the TargetDataLine class, where it says: "read(byte[] b, int off, int len): Reads audio data from the data line's input buffer." But where do we define that buffer?
Also, the line of type TargetDataLine has not been attached to any mixer, so how can we know which mixer it is for?
A DataLine is an interface related to handling sampled sound (a.k.a. PCM data) in Java. I don't really know a lot about that.
An OutputStream is an interface that represents anything bytes can be written to. A simple example of an OutputStream is a FileOutputStream: all bytes written to that stream will be written to the file it was opened for.
An InputStream is the other end: it's an interface that represents anything from which bytes can be read. A simple example of an InputStream is a FileInputStream: it can be used to read the data from a file.
So if you were to read audio data from the hard disk, you'd eventually use a FileInputStream to read the data. If you manipulate it and later want to write the resulting data back to the hard disk, you'd use a FileOutputStream to do the actual writing.
An InputStream represents a stream of bytes, where we can read bytes one by one (or in blocks) until it is empty. An OutputStream is the other direction: we write bytes one by one (or in blocks) until we have nothing more to write.
Streams are used to send or receive unstructured byte data.
DataLine handles audio data, in other words, bytes with a special meaning. It also offers some special methods to control the line (start/stop) and to get the actual format of the audio data and some other characteristics.
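The two worlds also connect directly: a TargetDataLine can be wrapped in an AudioInputStream, which is an InputStream, so stream-based code can read straight from the microphone. A minimal sketch, assuming line is an opened and started TargetDataLine, with process() a hypothetical consumer:
AudioInputStream audioIn = new AudioInputStream(line);
byte[] chunk = new byte[4096];
int n;
while ((n = audioIn.read(chunk)) != -1) {
    process(chunk, n); // hypothetical consumer of the raw audio bytes
}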
I need to pass audio data into a 3rd-party system as a "16-bit integer array" (from the limited documentation I have).
This is what I've tried so far (the system reads it in from the resulting bytes.dat file).
AudioInputStream inputStream = AudioSystem.getAudioInputStream(new File("c:\\all.wav"));
int numBytes = inputStream.available();
byte[] buffer = new byte[numBytes];
inputStream.read(buffer, 0, numBytes);
BufferedWriter fileOut = new BufferedWriter(new FileWriter(new File("c:\\temp\\bytes.dat")));
ByteBuffer bb = ByteBuffer.wrap(buffer);
while (bb.remaining() > 1) {
    short current = bb.getShort();
    fileOut.write(String.valueOf(current));
    fileOut.newLine();
}
This doesn't seem to work: the 3rd-party system doesn't recognise it, and I also can't import the file into Audacity as raw audio.
Is there anything obvious I'm doing wrong, or is there a better way to do it?
Extra info: the WAV file is 16-bit, 44100 Hz, mono.
I've just managed to sort this out.
I had to add this line after creating the ByteBuffer.
bb.order(ByteOrder.LITTLE_ENDIAN);
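For context: WAV sample data is stored little-endian, while a ByteBuffer defaults to big-endian, so without that line every getShort() returns byte-swapped values. The corrected loop from the question looks like this:
ByteBuffer bb = ByteBuffer.wrap(buffer);
bb.order(ByteOrder.LITTLE_ENDIAN); // WAV samples are little-endian
while (bb.remaining() > 1) {
    short current = bb.getShort(); // now yields correct sample values
    fileOut.write(String.valueOf(current));
    fileOut.newLine();
}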
Edit 2:
I rarely use AudioInputStream, but the way you write out the raw data seems rather complicated. A file is just a bunch of subsequent bytes, so you could write your audio byte array with one single FileOutputStream.write() call. The system might use big-endian format whereas the WAV file is stored in little-endian (?); then your audio might play, but extremely quietly, for example.
Edit 3:
Removed the code sample.
Is there a reason you are writing the audio bytes as strings into the file with newlines?
I would think the system expects the audio data in binary format, not in string format.
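If the system does want binary data, here is a minimal sketch (assuming it expects the raw little-endian 16-bit samples exactly as they sit in the WAV file; buffer is the byte array from the question):
// Write the samples in binary, in one call, instead of as text lines.
try (FileOutputStream binOut = new FileOutputStream("c:\\temp\\bytes.dat")) {
    binOut.write(buffer); // the buffer already holds the raw sample bytes
}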
AudioFileFormat audioFileFormat;
try {
    File file = new File("path/to/wav/file");
    audioFileFormat = AudioSystem.getAudioFileFormat(file);
    int intervalMSec = 10; // 20 or 30
    byte[] buffer = new byte[160]; // 320 or 480.
    AudioInputStream audioInputStream = new AudioInputStream(new FileInputStream(file),
            audioFileFormat.getFormat(), (long) audioFileFormat.getFrameLength());
    while (audioInputStream.available() > 0) {
        // Always read into the start of the buffer; advancing the offset
        // past the buffer's length would throw IndexOutOfBoundsException.
        int bytesRead = audioInputStream.read(buffer, 0, buffer.length);
        intervalMSec += 10;
        // 16-bit samples are shorts, not ints, and an IntBuffer view of a
        // wrapped ByteBuffer is not array-backed, so copy them out explicitly.
        ByteBuffer wrap = ByteBuffer.wrap(buffer, 0, bytesRead).order(ByteOrder.LITTLE_ENDIAN);
        short[] samples = new short[bytesRead / 2];
        wrap.asShortBuffer().get(samples);
    }
    audioInputStream.close();
} catch (UnsupportedAudioFileException | IOException e) {
    e.printStackTrace();
}