I am trying to play a buffer of audio using Java on Linux.
I am getting the following exception when attempting to open the line (not when I write the audio to it)...
Exception in thread "main" java.lang.IllegalArgumentException: No line matching interface SourceDataLine supporting format PCM_FLOAT 44100.0 Hz, 16 bit, mono, 2 bytes/frame, is supported.
public boolean open()
{
    try {
        int smpSizeInBits = bytesPerSmp * 8;
        int frameSize = bytesPerSmp * channels; // just an fyi, frameSize does not always == bytesPerSmp * channels for non PCM encodings
        int frameRate = (int)smpRate; // again this might not be the case for non PCM encodings.
        boolean isBigEndian = false;

        AudioFormat af = new AudioFormat(AudioFormat.Encoding.PCM_FLOAT, smpRate, smpSizeInBits, channels, frameSize, frameRate, isBigEndian);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, af);
        int bufferSizeInBytes = bufferSizeInFrames * channels * bytesPerSmp;

        line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(af, bufferSizeInBytes);
        open = true;
    }
    catch (LineUnavailableException e) {
        System.out.println("PcmFloatPlayer: Unable to open, line unavailable.");
    }
    return open;
}
I am wondering if my assumptions about what the PCM_FLOAT encoding is are actually incorrect.
I have some code that reads in a wav file. The wav file is mono, 16-bit, uncompressed format. I then convert the audio to floats in the range of -1.0 to 1.0 for processing.
I assumed the PCM_FLOAT encoding is just raw PCM data that has been converted to float values between -1.0 and 1.0. Is this correct?
I then assumed that the SourceDataLine would convert the float audio to the appropriate format based on my passed format info (mono, 16-bit, 2 bytes/frame). Again, is this assumption incorrect?
Must I convert my float -1.0 to 1.0 audio back to my desired output format, and set the SourceDataLine to PCM_SIGNED (assuming that is my desired format)?
EDIT:
In addition, when I called AudioSystem.getTargetEncodings() with PCM_FLOAT, it returned three encodings. Does that mean that it will accept PCM_FLOAT and be capable of converting to the returned encodings, based on what the underlying audio system supports?
AudioFormat.Encoding[] encodings = AudioSystem.getTargetEncodings(AudioFormat.Encoding.PCM_FLOAT);
for(AudioFormat.Encoding e : encodings)
System.out.println(e);
results in...
PCM_SIGNED
PCM_UNSIGNED
PCM_FLOAT
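Side note: it also looks like you can ask AudioSystem up front whether a given line or conversion is supported before trying to open anything. A rough sketch (the mono 32-bit float format here is just an assumption for illustration, not my actual format):

// Sketch: query support before opening a line (exception handling omitted).
AudioFormat floatFormat = new AudioFormat(AudioFormat.Encoding.PCM_FLOAT,
        44100.0f, 32, 1, 4, 44100.0f, false);
DataLine.Info fltInfo = new DataLine.Info(SourceDataLine.class, floatFormat);
System.out.println("Line supported: " + AudioSystem.isLineSupported(fltInfo));
System.out.println("Conversion to PCM_SIGNED supported: "
        + AudioSystem.isConversionSupported(AudioFormat.Encoding.PCM_SIGNED, floatFormat));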
I don't know that I'll be able to answer your direct questions. But maybe the code I can show you, which I know works (including on Linux), will help you arrive at a workable solution. I have programs that generate audio signals via incoming cues, but also custom-made Synths, and I do all the mixing and effects with PCM floats in the range -1 to 1. To output, I convert the floats to a standard "CD Quality" format that Java supports.
Here is the format I use for the outputting SourceDataLine:
new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);
You'll probably want to make this mono instead of stereo. But I should say, it seems to me that if you are able to read an incoming wav file with a different format, you should be able to play back that same format, assuming you reverse all the steps taken to convert the incoming data to PCM.
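For mono, that format would presumably just drop to one channel and two bytes per frame, something like:

new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 1, 2, 44100, false);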
For the standard "CD Quality" format, to go from signed PCM floats to bytes, there is an intermediate step of inflating to the range of a signed short (-32768 to 32767).
public static byte[] fromBufferToAudioBytes(byte[] audioBytes, float[] buffer)
{
    for (int i = 0, n = buffer.length; i < n; i++)
    {
        // scale the normalized float (-1.0..1.0) up to the signed 16-bit range
        buffer[i] *= 32767;
        // store as little-endian: low byte first, then high byte
        audioBytes[i*2] = (byte) buffer[i];
        audioBytes[i*2 + 1] = (byte)((int)buffer[i] >> 8);
    }
    return audioBytes;
}
This is taken from the AudioCue library that I wrote and posted on github.
I find it reduces headaches to just deal with the one AudioFormat, to make conversions with Audacity to that one format, and not try to make provisions for multiple formats. But that is just a personal preference, and I don't know if that strategy would work for your situation or not.
Hope there is something here that helps!
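For completeness, a rough sketch (not taken from AudioCue; names and buffer handling are placeholders) of feeding the converted bytes to a mono SourceDataLine, assuming the fromBufferToAudioBytes method above and the javax.sound.sampled imports:

// Sketch only: play one block of float samples (-1.0..1.0) through a mono,
// 16-bit, little-endian, signed PCM line.
public static void playFloats(float[] floatBuffer) throws LineUnavailableException {
    AudioFormat fmt = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 1, 2, 44100, false);
    SourceDataLine line = AudioSystem.getSourceDataLine(fmt);
    line.open(fmt);
    line.start();
    byte[] audioBytes = new byte[floatBuffer.length * 2];
    fromBufferToAudioBytes(audioBytes, floatBuffer);
    line.write(audioBytes, 0, audioBytes.length);
    line.drain();
    line.close();
}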
public class Main {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread2();
        t1.start();
        Thread t2 = new Thread3();
        t2.start();
        Thread.sleep(5000);
    }
}
import javax.sound.sampled.*;
import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
public class Thread2 extends Thread implements Runnable {
    @Override
    public void run() {
        playWav("C:/Windows/Media/feel_good_x.wav");
    }

    private static void playWav(String soundFilePath) {
        File sFile = new File(soundFilePath);
        if (!sFile.exists()) {
            String ls = System.lineSeparator();
            System.err.println("The file is not in the directory" +
                    ls + "(" + soundFilePath + ")" + ls);
            return;
        }
        try {
            Clip clip;
            try (AudioInputStream audioInputStream = AudioSystem.
                    getAudioInputStream(sFile.getAbsoluteFile())) {
                clip = AudioSystem.getClip();
                clip.open(audioInputStream);
                clip.setFramePosition(0);
            }
            clip.start();
        }
        catch (UnsupportedAudioFileException | IOException | LineUnavailableException ex) {
            Logger.getLogger("playWav()").log(Level.SEVERE, null, ex);
        }
    }
}
I am creating an object that can play synthesised audio in Java, but I need to be able to set it to the AudioFormat with the highest possible audio bit depth the operating system can play.
(The synth generates 64-bit float audio and can bit-crush it to 32-bit float or PCM, and to 24-bit, 16-bit and 8-bit PCM audio.)
I will need to filter all the operating system's valid AudioFormats and pick the format with the highest bit depth the system can use.
How can I get the appropriate array of all the AudioFormats that the system can play without error?
public class AudioSettings {

    // instance variables
    private int sampleRate;
    private AudioFormat audioFormat;
    private SourceDataLine sourceDataLine;

    public AudioSettings(int sampleRate) {
        this.sampleRate = sampleRate;
        // get highest possible quality bit depth for system
        int highestBitRate = 16;
        AudioFormat currentFormat = new AudioFormat(new Encoding("PCM_SIGNED"), (float) sampleRate, highestBitRate,
                2, highestBitRate / 8 * 2, sampleRate, true);
        for (AudioFormat format : /* What goes here? */) {
            if (format.getSampleSizeInBits() > highestBitRate
                    && format.isBigEndian()
                    && format.getChannels() == 2) {
                currentFormat = format;
                highestBitRate = format.getSampleSizeInBits();
            }
        }
        audioFormat = currentFormat;
    }
}
According to this document from the Java 8 days, Java Sound Technology, Java supports a max of 16-bit encoding and a highest sample rate of 48 kHz.
I don't know if there's been any advancement since then. There must be a specification for Java 17, for example, where the current limits are listed.
As far as querying the system for supported file types, there is a mention in the tutorial Using File and Format Converters, in the last section: Learning What Conversions Are Available.
A related AudioSystem method, getAudioFileTypes(AudioInputStream),
returns the complete list of supported file types for the given
stream, as an array of AudioFileFormat.Type instances.
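A minimal sketch of that call (the file name is a placeholder, exception handling omitted):

// Sketch: list the file types reported as supported for a given stream.
AudioInputStream in = AudioSystem.getAudioInputStream(new File("some.wav"));
for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes(in)) {
    System.out.println(type);
}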
Thanks to @gpasch I found my answer from his link. Although I think you only need to read one element of the Line.Info[] array, because it seems to print out three groups that are exactly the same.
public static void main(String[] args) {
    Line.Info desired = new Line.Info(SourceDataLine.class);
    Line.Info[] infos = AudioSystem.getSourceLineInfo(desired);
    for (Line.Info info : infos) {
        if (info instanceof DataLine.Info) {
            AudioFormat[] forms = ((DataLine.Info) info).getFormats();
            for (AudioFormat format : forms) {
                System.out.println(format);
            }
        }
    }
}
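If it helps, one rough way to combine that enumeration with the selection criteria from your question might be something like this (just a sketch; note that some of the reported formats may have AudioSystem.NOT_SPECIFIED sample rates, so you still have to fill those in yourself):

// Sketch: pick the supported stereo, big-endian format with the largest sample size.
AudioFormat best = null;
for (Line.Info info : AudioSystem.getSourceLineInfo(new Line.Info(SourceDataLine.class))) {
    if (info instanceof DataLine.Info) {
        for (AudioFormat format : ((DataLine.Info) info).getFormats()) {
            if (format.getChannels() == 2 && format.isBigEndian()
                    && (best == null || format.getSampleSizeInBits() > best.getSampleSizeInBits())) {
                best = format;
            }
        }
    }
}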
Good morning folks, I am trying to send audio data from a microphone attached to an ESP32 board over WiFi to my desktop running some Java code. If I play the audio data using Java's AudioSystem library it's a bit staticky but is legible. Switching to the Sphinx-4 library, which converts audio to text, it only sometimes recognizes the words.
This is the first time I've had to mess with raw audio data, so it may not even be possible, since the board can only read up to 12-bit signals, which means that when converting to 16 bits every single 12-bit value maps to 15 16-bit values. It could also be due to the roughly 115 microsecond delay used to downsample to 16 kHz.
How can I smooth out the audio playback enough that it can be easily recognized by the Sphinx-4 library? The current implementation has very small breaks and some noise that I think is throwing it off.
ESP32 Code:
#define BUFFERMAX 8000
#define ONE_SECOND 1000000

int writeBuffer[BUFFERMAX];

// sensorValue, avg, sensorPin, delayMicro and client are globals defined elsewhere
void writeAudio(){
    for(int i = 0; i < BUFFERMAX; i = i + 1){
        // data read in is 12 bits so I mapped the value to 16 bits (2 bytes)
        sensorValue = (map(analogRead(sensorPin), 0, 4096, -32000, 32000));
        // none to minimal sound is around -7000 so try to zero out additional noise with average
        int prevAvg = avg;
        avg = (avg + sensorValue)/2;
        sensorValue = (abs(prevAvg) + sensorValue);
        if(abs(sensorValue) < 1000){ sensorValue = 0; }
        writeBuffer[i] = sensorValue;
        // delay so that 8000 INTs (16000 bytes) takes one second to record
        delayMicroseconds(delayMicro);
    }
    client.write((byte*)writeBuffer, sizeof(writeBuffer));
}
Java Sphinx:
StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
// Start recognition process pruning previously cached data.
recognizer.startRecognition(socket.getInputStream() );
System.out.print("awaiting command...");
SpeechResult result = recognizer.getResult();
System.out.println(result.getHypothesis().toLowerCase());
Java play audio:
private static void init() throws LineUnavailableException {
    // specifying the audio format
    AudioFormat _format = new AudioFormat(16000.F, // Sample Rate
            16,    // Size of SampleBits
            1,     // Number of Channels
            true,  // Is Signed?
            false  // Is Big Endian?
    );
    // creating the DataLine Info for the speaker format
    DataLine.Info speakerInfo = new DataLine.Info(SourceDataLine.class, _format);
    // getting the mixer for the speaker
    _speaker = (SourceDataLine) AudioSystem.getLine(speakerInfo);
    _speaker.open(_format);
}
_streamIn = socket.getInputStream();
_speaker.start();
byte[] data = new byte[16000];
System.out.println("Waiting for data...");
while (_running) {
    long start = new Date().getTime();
    // checking if the data is available to speak
    if (_streamIn.available() <= 0)
        continue; // data not available so continue back to start of loop
    // count of the data bytes read
    int readCount = _streamIn.read(data, 0, data.length);
    if (readCount > 0 && (readCount % 2) == 0) {
        System.out.println(readCount);
        _speaker.write(data, 0, readCount);
        readCount = 0;
    }
    System.out.println("Time: " + (new Date().getTime() - start));
}
I would like to use some features of TarsosDSP on sound data. The incoming data is stereo, but Tarsos only supports mono, so I tried to convert it to mono as follows, but the result still sounds like stereo data interpreted as mono, i.e. the conversion via MultichannelToMono doesn't seem to have any effect, although its implementation looks good upon a quick glance.
@Test
public void testPlayStereoFile() throws IOException, UnsupportedAudioFileException, LineUnavailableException {
    AudioDispatcher dispatcher = AudioDispatcherFactory.fromFile(FILE, 4096, 0);
    dispatcher.addAudioProcessor(new MultichannelToMono(dispatcher.getFormat().getChannels(), false));
    dispatcher.addAudioProcessor(new AudioPlayer(dispatcher.getFormat()));
    dispatcher.run();
}
Is there anything I am doing wrong here? Why does the MultichannelToMono processor not convert the data to mono?
The only way I found that works is to use the Java Sound API to perform this conversion before sending the data to TarsosDSP; it seems TarsosDSP does not convert the frame size correctly.
I found the following snippet at https://www.experts-exchange.com/questions/26925195/java-stereo-to-mono-conversion-unsupported-conversion-error.html which I use to convert to mono before applying more advanced audio transformations with TarsosDSP.
public static AudioInputStream convertToMono(AudioInputStream sourceStream) {
    AudioFormat sourceFormat = sourceStream.getFormat();
    // is already mono?
    if (sourceFormat.getChannels() == 1) {
        return sourceStream;
    }
    AudioFormat targetFormat = new AudioFormat(
            sourceFormat.getEncoding(),
            sourceFormat.getSampleRate(),
            sourceFormat.getSampleSizeInBits(),
            1,
            // this is the important bit, the framesize needs to change as well,
            // for framesize 4, this calculation leads to new framesize 2
            (sourceFormat.getSampleSizeInBits() + 7) / 8,
            sourceFormat.getFrameRate(),
            sourceFormat.isBigEndian());
    return AudioSystem.getAudioInputStream(targetFormat, sourceStream);
}
I am having a hard time trying to port some Java code to C# for my simple project. The Java code makes use of format.isBigEndian and checks if the audio file data is signed or not. My C# project makes use of NAudio for handling audio files.
Here is the Java code
public void LoadAudioStream(AudioInputStream inputStream) {
    AudioFormat format = inputStream.getFormat();
    sampleRate = (int) format.getSampleRate();
    bigEndian = format.isBigEndian();
    AudioFormat.Encoding encoding = format.getEncoding();
    if (encoding.equals(AudioFormat.Encoding.PCM_SIGNED))
        dataIsSigned = true;
    else if (encoding.equals(AudioFormat.Encoding.PCM_UNSIGNED))
        dataIsSigned = false;
}
and here is the C# code that I am working with...
public void LoadAudioStream(WaveFileReader reader)
{
    var format = reader.WaveFormat;
    sampleRate = format.SampleRate;
    //bigEndian = ??
    var encoding = format.Encoding;
    if (encoding.Equals( /*????*/))
    {
        dataIsSigned = true;
    }
    else if (encoding.Equals( /*?????*/))
    {
        dataIsSigned = false;
    }
}
How can I check whether the audio file data is big-endian or not? And lastly, is there a way to check whether the AudioFormat is PCM signed or unsigned?
PCM WAV files use little-endian byte order. The most common bit depth is 16 bit, and 16-bit samples will be signed (i.e. short or Int16 in C#).
I have some problems finding out what I actually read with the AudioInputStream. The program below just prints the byte array I get, but I don't even know whether the bytes are actually the samples, i.e. whether the byte array is the audio wave.
File fileIn;
AudioInputStream audio_in;
byte[] audioBytes;
int numBytesRead;
int numFramesRead;
int numBytes;
int totalFramesRead;
int bytesPerFrame;

try {
    audio_in = AudioSystem.getAudioInputStream(fileIn);
    bytesPerFrame = audio_in.getFormat().getFrameSize();
    if (bytesPerFrame == AudioSystem.NOT_SPECIFIED) {
        bytesPerFrame = 1;
    }
    numBytes = 1024 * bytesPerFrame;
    audioBytes = new byte[numBytes];
    try {
        numBytesRead = 0;
        numFramesRead = 0;
    } catch (Exception ex) {
        System.out.println("Something went completely wrong");
    }
} catch (Exception e) {
    System.out.println("Something went completely wrong");
}
and in some other part, I read some bytes with this:
try {
    if ((numBytesRead = audio_in.read(audioBytes)) != -1) {
        numFramesRead = numBytesRead / bytesPerFrame;
        totalFramesRead += numFramesRead;
    }
} catch (Exception e) {
    System.out.println("Had problems reading new content");
}
So first of all, this code is not from me. This is my first time reading audio files, so I got some help from the interwebs. (Found the link here:
Java - reading, manipulating and writing WAV files
on stackoverflow, who would have known.)
The question is, what do the bytes in audioBytes represent? Since the source is 44 kHz, stereo, there have to be 2 waves hiding in there somewhere, am I right? So how do I filter the important information out of these bytes?
// EDIT
So what I added is this function:
public short[] Get_Sample() {
    if (samplesRead == 1024) {
        Read_Buffer();
        samplesRead = 4;
    } else {
        samplesRead = samplesRead + 4;
    }
    short sample[] = new short[2];
    sample[0] = (short)(audioBytes[samplesRead-4] + 256*audioBytes[samplesRead-3]);
    sample[1] = (short)(audioBytes[samplesRead-2] + 256*audioBytes[samplesRead-1]);
    return sample;
}
where Read_Buffer() reads the next 1024 (or fewer) bytes and loads them into audioBytes. sample[0] is used for the left side, sample[1] for the right side. But I'm still not sure, since the waves I get from this look quite "noisy". (Edit: the WAV actually used little-endian byte order, so I had to change the calculation.)
The AudioInputStream read() method returns the raw audio data. You don't know the layout of that data until you have read the audio format with getFormat(), which returns an AudioFormat. From the AudioFormat you can call getChannels(), getSampleSizeInBits() and more... This is because AudioInputStream is made for a known format.
When you calculate a sample value, there are different possibilities for the signedness and endianness of the data (in the case of 16-bit samples). To make the code more generic, use the AudioFormat object returned by the AudioInputStream to get more info about the data buffer:
getEncoding() : PCM_SIGNED, PCM_UNSIGNED ...
isBigEndian() : true or false
As you already discovered, incorrect sample building may lead to disturbed sound. If you work with various files, it may cause problems in the future. If you won't provide support for some formats, just check what the AudioFormat says and throw an exception (e.g. javax.sound.sampled.UnsupportedAudioFileException). It will save you time.
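To illustrate the point, a sample-assembly helper for 16-bit data that takes the format's signedness and endianness into account might look roughly like this (a sketch, not tested against your files):

// Sketch: decode one 16-bit sample starting at offset, honoring the
// endianness and signedness reported by the AudioFormat.
public static int decode16BitSample(byte[] bytes, int offset, AudioFormat format) {
    int lo, hi;
    if (format.isBigEndian()) {
        hi = bytes[offset];
        lo = bytes[offset + 1] & 0xFF; // the low byte must not be sign-extended
    } else {
        hi = bytes[offset + 1];
        lo = bytes[offset] & 0xFF;
    }
    int sample = (hi << 8) | lo;
    if (format.getEncoding().equals(AudioFormat.Encoding.PCM_UNSIGNED)) {
        sample = (sample & 0xFFFF) - 32768; // recenter unsigned data around zero
    }
    return sample;
}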