I am new to Java, and although I have a good grasp of the syntax and constructs, I am having a difficult time getting processed digital audio samples from a mic.
What I am trying to achieve is very simple. In the long run I want to create a very simple spectrogram, but just to understand/master the audio manipulation process I am starting from scratch.
Here is my problem:
When a microphone detects a single beep or any sound, I want to capture that binary data and simply display it, in its raw format.
Is that really too much to ask from Java?
I have read about analog/digital signals, FFT and MATLAB, and I have searched many links on SO like these:
Is there a way of recording audio of streaming broadcast from a webpage?
Working with audio in Java
OpenAL playback captured audio data c++
and the famous introduction from oracle
http://docs.oracle.com/javase/tutorial/sound/capturing.html
and this is actually a good tutorial http://www.developer.com/java/other/article.php/3380031/Spectrum-Analysis-using-Java-Sampling-Frequency-Folding-Frequency-and-the-FFT-Algorithm.htm
But they all fall short of providing a solution to my problem.
I am not asking for code, although it would be awesome just to read every line and understand the mechanics involved; a simple hint would be nice as well.
And here is some simple code to capture bytes, but only from an existing wav file:
import java.io.FileInputStream;
import java.io.IOException;
public class Boo{
public static void main(String[] arguments){
try {
FileInputStream file = new FileInputStream("beep.wav");
boolean eof = false;
int count = 0;
while(!eof){
int input = file.read();
System.out.print(input + " ");
if(input == -1)
eof = true;
else
count++;
}
file.close();
System.out.println("\nBytes read: " + count);
}catch (IOException e){
System.out.println("Error - " + e.toString());
}
}
}
After Bounty
(for better clarity)
All I am trying to make is just a simple program that reads from the mic and shows the binary data of the sound it captures in real time.
Think of it like a spectrogram: when sound is captured the graph goes up and down depending on the signal level, but in this case there is no need to convert the binary data to an audio graph, only to show the raw data itself. No need to write/read files. Just capture from the mic and show what is read from it.
If the above proves to be difficult (I have searched all over the web and couldn't find anything helpful), you can just give me guides/directions.
thanks
The javax.sound.sampled package should have all you need.
Example:
int duration = 5; // sample for 5 seconds
TargetDataLine line = null;
// find a DataLine that can be read
// (maybe hardcode this if you have multiple microphones)
Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
for (int i = 0; i < mixerInfo.length; i++) {
Mixer mixer = AudioSystem.getMixer(mixerInfo[i]);
Line.Info[] targetLineInfo = mixer.getTargetLineInfo();
if (targetLineInfo.length > 0) {
line = (TargetDataLine) mixer.getLine(targetLineInfo[0]);
break;
}
}
if (line == null)
throw new UnsupportedOperationException("No recording device found");
AudioFormat af = new AudioFormat(11000, 8, 1, true, false);
line.open(af);
line.start();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buf = new byte[(int)af.getSampleRate() * af.getFrameSize()];
long end = System.currentTimeMillis() + 1000 * duration;
int len;
while (System.currentTimeMillis() < end && ((len = line.read(buf, 0, buf.length)) != -1)) {
baos.write(buf, 0, len);
}
line.stop();
line.close();
baos.close();
Afterwards, you can dig the bytes out of your byte array output stream. Or you can of course process them inside the while loop if you prefer.
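If the goal is just to see the raw data, as the question asks, a minimal sketch of dumping it to the console (assuming the capture code above with its 8-bit signed mono format, so every byte is one sample):

byte[] captured = baos.toByteArray();
for (byte sample : captured) {
    System.out.print(sample + " ");  // each value is a signed 8-bit amplitude (-128..127)
}
System.out.println();

You could equally print inside the while loop to get a rough real-time dump instead of waiting for the capture to finish.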
Here is some code copy-pasted from Sphinx4 microphone support. I hope it will be useful.
And a link to Sphinx homepage: http://cmusphinx.sourceforge.net/sphinx4/.
/**
* <p/> A Microphone captures audio data from the system's underlying audio input systems. Converts these audio data
* into Data objects. When the method <code>startRecording()</code> is called, a new thread will be created and used to
* capture audio, and will stop when <code>stopRecording()</code> is called. Calling <code>getData()</code> returns the
* captured audio data as Data objects. </p> <p/> This Microphone will attempt to obtain an audio device with the format
* specified in the configuration. If such a device with that format cannot be obtained, it will try to obtain a device
* with an audio format that has a higher sample rate than the configured sample rate, while the other parameters of the
* format (i.e., sample size, endianness, sign, and channel) remain the same. If, again, no such device can be obtained,
* it flags an error, and a call <code>startRecording</code> returns false. </p>
*/
public class Microphone extends BaseDataProcessor {
/**
* The property for the sample rate of the data.
*/
@S4Integer(defaultValue = 16000)
public static final String PROP_SAMPLE_RATE = "sampleRate";
/**
* The property that specifies whether or not the microphone will release the audio between utterances. On
* certain systems (Linux for one), closing and reopening the audio does not work too well. The default is false for
* Linux systems, true for others.
*/
@S4Boolean(defaultValue = true)
public final static String PROP_CLOSE_BETWEEN_UTTERANCES = "closeBetweenUtterances";
/**
* The property that specifies the number of milliseconds of audio data to read each time from the underlying
* Java Sound audio device.
*/
@S4Integer(defaultValue = 10)
public final static String PROP_MSEC_PER_READ = "msecPerRead";
/**
* The property for the number of bits per value.
*/
@S4Integer(defaultValue = 16)
public static final String PROP_BITS_PER_SAMPLE = "bitsPerSample";
/**
* The property specifying the number of channels.
*/
@S4Integer(defaultValue = 1)
public static final String PROP_CHANNELS = "channels";
/**
* The property specify the endianness of the data.
*/
@S4Boolean(defaultValue = true)
public static final String PROP_BIG_ENDIAN = "bigEndian";
/**
* The property specify whether the data is signed.
*/
@S4Boolean(defaultValue = true)
public static final String PROP_SIGNED = "signed";
/**
* The property that specifies whether to keep the audio data of an utterance around until the next utterance
* is recorded.
*/
@S4Boolean(defaultValue = false)
public final static String PROP_KEEP_LAST_AUDIO = "keepLastAudio";
/**
* The property that specifies how to convert stereo audio to mono. Currently, the possible values are
* "average", which averages the samples from at each channel, or "selectChannel", which chooses audio only from
* that channel. If you choose "selectChannel", you should also specify which channel to use with the
* "selectChannel" property.
*/
#S4String(defaultValue = "average", range = {"average", "selectChannel"})
public final static String PROP_STEREO_TO_MONO = "stereoToMono";
/**
* The property that specifies the channel to use if the audio is stereo
*/
@S4Integer(defaultValue = 0)
public final static String PROP_SELECT_CHANNEL = "selectChannel";
/**
* The property that specifies the mixer to use. The value can be "default," (which means let the
* AudioSystem decide), "last," (which means select the last Mixer supported by the AudioSystem), which appears to
* be what is often used for USB headsets, or an integer value which represents the index of the Mixer.Info that is
* returned by AudioSystem.getMixerInfo(). To get the list of Mixer.Info objects, run the AudioTool application with
* a command line argument of "-dumpMixers".
*
* @see edu.cmu.sphinx.tools.audio.AudioTool
*/
#S4String(defaultValue = "default")
public final static String PROP_SELECT_MIXER = "selectMixer";
private AudioFormat finalFormat;
private AudioInputStream audioStream;
private TargetDataLine audioLine;
private BlockingQueue<Data> audioList;
private Utterance currentUtterance;
private boolean doConversion;
private final int audioBufferSize = 160000;
private volatile boolean recording;
private volatile boolean utteranceEndReached = true;
private RecordingThread recorder;
// Configuration data
private AudioFormat desiredFormat;
private Logger logger;
private boolean closeBetweenUtterances;
private boolean keepDataReference;
private boolean signed;
private boolean bigEndian;
private int frameSizeInBytes;
private int msecPerRead;
private int selectedChannel;
private String selectedMixerIndex;
private String stereoToMono;
private int sampleRate;
/**
* @param sampleRate sample rate of the data
* @param bitsPerSample number of bits per value.
* @param channels number of channels.
* @param bigEndian the endianness of the data
* @param signed whether the data is signed.
* @param closeBetweenUtterances whether or not the microphone will release the audio between utterances. On
* certain systems (Linux for one), closing and reopening the audio does not work too well. The default is false for
* Linux systems, true for others
* @param msecPerRead the number of milliseconds of audio data to read each time from the underlying
* Java Sound audio device.
* @param keepLastAudio whether to keep the audio data of an utterance around until the next utterance
* is recorded.
* @param stereoToMono how to convert stereo audio to mono. Currently, the possible values are
* "average", which averages the samples from each channel, or "selectChannel", which chooses audio only from
* that channel. If you choose "selectChannel", you should also specify which channel to use with the
* "selectChannel" property.
* @param selectedChannel the channel to use if the audio is stereo
* @param selectedMixerIndex the mixer to use. The value can be "default," (which means let the
* AudioSystem decide), "last," (which means select the last Mixer supported by the AudioSystem), which appears to
* be what is often used for USB headsets, or an integer value which represents the index of the Mixer.Info that is
* returned by AudioSystem.getMixerInfo(). To get the list of Mixer.Info objects, run the AudioTool application with
* a command line argument of "-dumpMixers".
*/
public Microphone(int sampleRate, int bitsPerSample, int channels,
boolean bigEndian, boolean signed, boolean closeBetweenUtterances, int msecPerRead, boolean keepLastAudio,
String stereoToMono, int selectedChannel, String selectedMixerIndex) {
initLogger();
this.bigEndian = bigEndian;
this.signed = signed;
this.desiredFormat = new AudioFormat
((float) sampleRate, bitsPerSample, channels, signed, bigEndian);
this.closeBetweenUtterances = closeBetweenUtterances;
this.msecPerRead = msecPerRead;
this.keepDataReference = keepLastAudio;
this.stereoToMono = stereoToMono;
this.selectedChannel = selectedChannel;
this.selectedMixerIndex = selectedMixerIndex;
}
public Microphone() {
}
/*
* (non-Javadoc)
*
* @see edu.cmu.sphinx.util.props.Configurable#newProperties(edu.cmu.sphinx.util.props.PropertySheet)
*/
@Override
public void newProperties(PropertySheet ps) throws PropertyException {
super.newProperties(ps);
logger = ps.getLogger();
sampleRate = ps.getInt(PROP_SAMPLE_RATE);
int sampleSizeInBits = ps.getInt(PROP_BITS_PER_SAMPLE);
int channels = ps.getInt(PROP_CHANNELS);
bigEndian = ps.getBoolean(PROP_BIG_ENDIAN);
signed = ps.getBoolean(PROP_SIGNED);
desiredFormat = new AudioFormat
((float) sampleRate, sampleSizeInBits, channels, signed, bigEndian);
closeBetweenUtterances = ps.getBoolean(PROP_CLOSE_BETWEEN_UTTERANCES);
msecPerRead = ps.getInt(PROP_MSEC_PER_READ);
keepDataReference = ps.getBoolean(PROP_KEEP_LAST_AUDIO);
stereoToMono = ps.getString(PROP_STEREO_TO_MONO);
selectedChannel = ps.getInt(PROP_SELECT_CHANNEL);
selectedMixerIndex = ps.getString(PROP_SELECT_MIXER);
}
/**
* Constructs a Microphone with the given InputStream.
*/
@Override
public void initialize() {
super.initialize();
audioList = new LinkedBlockingQueue<Data>();
DataLine.Info info
= new DataLine.Info(TargetDataLine.class, desiredFormat);
/* If we cannot get an audio line that matches the desired
* characteristics, shoot for one that matches almost
* everything we want, but has a higher sample rate.
*/
if (!AudioSystem.isLineSupported(info)) {
logger.info(desiredFormat + " not supported");
AudioFormat nativeFormat
= DataUtil.getNativeAudioFormat(desiredFormat,
getSelectedMixer());
if (nativeFormat == null) {
logger.severe("couldn't find suitable target audio format");
} else {
finalFormat = nativeFormat;
/* convert from native to the desired format if supported */
doConversion = AudioSystem.isConversionSupported
(desiredFormat, nativeFormat);
if (doConversion) {
logger.info
("Converting from " + finalFormat.getSampleRate()
+ "Hz to " + desiredFormat.getSampleRate() + "Hz");
} else {
logger.info
("Using native format: Cannot convert from " +
finalFormat.getSampleRate() + "Hz to " +
desiredFormat.getSampleRate() + "Hz");
}
}
} else {
logger.info("Desired format: " + desiredFormat + " supported.");
finalFormat = desiredFormat;
}
}
/**
* Gets the Mixer to use. Depends upon selectedMixerIndex being defined.
*
* @see #newProperties
*/
private Mixer getSelectedMixer() {
if (selectedMixerIndex.equals("default")) {
return null;
} else {
Mixer.Info[] mixerInfo = AudioSystem.getMixerInfo();
if (selectedMixerIndex.equals("last")) {
return AudioSystem.getMixer(mixerInfo[mixerInfo.length - 1]);
} else {
int index = Integer.parseInt(selectedMixerIndex);
return AudioSystem.getMixer(mixerInfo[index]);
}
}
}
/**
* Creates the audioLine if necessary and returns it.
*/
private TargetDataLine getAudioLine() {
if (audioLine != null) {
return audioLine;
}
/* Obtain and open the line and stream.
*/
try {
/* The finalFormat was decided in the initialize() method
* and is based upon the capabilities of the underlying
* audio system. The final format will have all the
* desired audio characteristics, but may have a sample
* rate that is higher than desired. The idea here is
* that we'll let the processors in the front end (e.g.,
* the FFT) handle some form of downsampling for us.
*/
logger.info("Final format: " + finalFormat);
DataLine.Info info = new DataLine.Info(TargetDataLine.class,
finalFormat);
/* We either get the audio from the AudioSystem (our
* default choice), or use a specific Mixer if the
* selectedMixerIndex property has been set.
*/
Mixer selectedMixer = getSelectedMixer();
if (selectedMixer == null) {
audioLine = (TargetDataLine) AudioSystem.getLine(info);
} else {
audioLine = (TargetDataLine) selectedMixer.getLine(info);
}
/* Add a line listener that just traces
* the line states.
*/
audioLine.addLineListener(new LineListener() {
@Override
public void update(LineEvent event) {
logger.info("line listener " + event);
}
});
} catch (LineUnavailableException e) {
logger.severe("microphone unavailable " + e.getMessage());
}
return audioLine;
}
/**
* Opens the audio capturing device so that it will be ready for capturing audio. Attempts to create a converter if
* the requested audio format is not directly available.
*
* @return true if the audio capturing device is opened successfully; false otherwise
*/
private boolean open() {
TargetDataLine audioLine = getAudioLine();
if (audioLine != null) {
if (!audioLine.isOpen()) {
logger.info("open");
try {
audioLine.open(finalFormat, audioBufferSize);
} catch (LineUnavailableException e) {
logger.severe("Can't open microphone " + e.getMessage());
return false;
}
audioStream = new AudioInputStream(audioLine);
if (doConversion) {
audioStream = AudioSystem.getAudioInputStream
(desiredFormat, audioStream);
assert (audioStream != null);
}
/* Set the frame size depending on the sample rate.
*/
float sec = ((float) msecPerRead) / 1000.f;
frameSizeInBytes =
(audioStream.getFormat().getSampleSizeInBits() / 8) *
(int) (sec * audioStream.getFormat().getSampleRate());
logger.info("Frame size: " + frameSizeInBytes + " bytes");
}
return true;
} else {
logger.severe("Can't find microphone");
return false;
}
}
/**
* Returns the format of the audio recorded by this Microphone. Note that this might be different from the
* configured format.
*
* @return the current AudioFormat
*/
public AudioFormat getAudioFormat() {
return finalFormat;
}
/**
* Returns the current Utterance.
*
* @return the current Utterance
*/
public Utterance getUtterance() {
return currentUtterance;
}
/**
* Returns true if this Microphone is recording.
*
* @return true if this Microphone is recording, false otherwise
*/
public boolean isRecording() {
return recording;
}
/**
* Starts recording audio. This method will return only when a START event is received, meaning that this Microphone
* has started capturing audio.
*
* @return true if the recording started successfully; false otherwise
*/
public synchronized boolean startRecording() {
if (recording) {
return false;
}
if (!open()) {
return false;
}
utteranceEndReached = false;
if (audioLine.isRunning()) {
logger.severe("Whoops: audio line is running");
}
assert (recorder == null);
recorder = new RecordingThread("Microphone");
recorder.start();
recording = true;
return true;
}
/**
* Stops recording audio. This method does not return until recording has been stopped and all data has been read
* from the audio line.
*/
public synchronized void stopRecording() {
if (audioLine != null) {
if (recorder != null) {
recorder.stopRecording();
recorder = null;
}
recording = false;
}
}
/**
* This Thread records audio, and caches them in an audio buffer.
*/
class RecordingThread extends Thread {
private boolean done;
private volatile boolean started;
private long totalSamplesRead;
private final Object lock = new Object();
/**
* Creates the thread with the given name
*
* @param name the name of the thread
*/
public RecordingThread(String name) {
super(name);
}
/**
* Starts the thread, and waits for recorder to be ready
*/
@Override
public void start() {
started = false;
super.start();
waitForStart();
}
/**
* Stops the thread. This method does not return until recording has actually stopped, and all the data has been
* read from the audio line.
*/
public void stopRecording() {
audioLine.stop();
try {
synchronized (lock) {
while (!done) {
lock.wait();
}
}
} catch (InterruptedException e) {
e.printStackTrace();
}
// flush cannot be called here because the audio-line might have been set to null already by the mic-thread
// audioLine.flush();
}
/**
* Implements the run() method of the Thread class. Records audio, and cache them in the audio buffer.
*/
@Override
public void run() {
totalSamplesRead = 0;
logger.info("started recording");
if (keepDataReference) {
currentUtterance = new Utterance
("Microphone", audioStream.getFormat());
}
audioList.add(new DataStartSignal(sampleRate));
logger.info("DataStartSignal added");
try {
audioLine.start();
while (!done) {
Data data = readData(currentUtterance);
if (data == null) {
done = true;
break;
}
audioList.add(data);
}
audioLine.flush();
if (closeBetweenUtterances) {
/* Closing the audio stream *should* (we think)
* also close the audio line, but it doesn't
* appear to do this on the Mac. In addition,
* once the audio line is closed, re-opening it
* on the Mac causes some issues. The Java sound
* spec is also kind of ambiguous about whether a
* closed line can be re-opened. So...we'll go
* for the conservative route and never attempt
* to re-open a closed line.
*/
audioStream.close();
audioLine.close();
System.err.println("set to null");
audioLine = null;
}
} catch (IOException ioe) {
logger.warning("IO Exception " + ioe.getMessage());
ioe.printStackTrace();
}
long duration = (long)
(((double) totalSamplesRead /
(double) audioStream.getFormat().getSampleRate()) * 1000.0);
audioList.add(new DataEndSignal(duration));
logger.info("DataEndSignal ended");
logger.info("stopped recording");
synchronized (lock) {
lock.notify();
}
}
/**
* Waits for the recorder to start
*/
private synchronized void waitForStart() {
// note that in theory we could use a LineEvent START
// to tell us when the microphone is ready, but we have
// found that some javasound implementations do not always
// issue this event when a line is opened, so this is a
// WORKAROUND.
try {
while (!started) {
wait();
}
} catch (InterruptedException ie) {
logger.warning("wait was interrupted");
}
}
/**
* Reads one frame of audio data, and adds it to the given Utterance.
*
* @param utterance
* @return a Data object containing the audio data
* @throws java.io.IOException
*/
private Data readData(Utterance utterance) throws IOException {
// Read the next chunk of data from the TargetDataLine.
byte[] data = new byte[frameSizeInBytes];
int channels = audioStream.getFormat().getChannels();
long collectTime = System.currentTimeMillis();
long firstSampleNumber = totalSamplesRead / channels;
int numBytesRead = audioStream.read(data, 0, data.length);
// notify the waiters upon start
if (!started) {
synchronized (this) {
started = true;
notifyAll();
}
}
if (logger.isLoggable(Level.FINE)) {
logger.info("Read " + numBytesRead
+ " bytes from audio stream.");
}
if (numBytesRead <= 0) {
return null;
}
int sampleSizeInBytes =
audioStream.getFormat().getSampleSizeInBits() / 8;
totalSamplesRead += (numBytesRead / sampleSizeInBytes);
if (numBytesRead != frameSizeInBytes) {
if (numBytesRead % sampleSizeInBytes != 0) {
throw new Error("Incomplete sample read.");
}
data = Arrays.copyOf(data, numBytesRead);
}
if (keepDataReference) {
utterance.add(data);
}
double[] samples;
if (bigEndian) {
samples = DataUtil.bytesToValues
(data, 0, data.length, sampleSizeInBytes, signed);
} else {
samples = DataUtil.littleEndianBytesToValues
(data, 0, data.length, sampleSizeInBytes, signed);
}
if (channels > 1) {
samples = convertStereoToMono(samples, channels);
}
return (new DoubleData
(samples, (int) audioStream.getFormat().getSampleRate(),
collectTime, firstSampleNumber));
}
}
/**
* Converts stereo audio to mono.
*
* @param samples the audio samples, each double in the array is one sample
* @param channels the number of channels in the stereo audio
*/
private double[] convertStereoToMono(double[] samples, int channels) {
assert (samples.length % channels == 0);
double[] finalSamples = new double[samples.length / channels];
if (stereoToMono.equals("average")) {
for (int i = 0, j = 0; i < samples.length; j++) {
double sum = samples[i++];
for (int c = 1; c < channels; c++) {
sum += samples[i++];
}
finalSamples[j] = sum / channels;
}
} else if (stereoToMono.equals("selectChannel")) {
for (int i = selectedChannel, j = 0; i < samples.length;
i += channels, j++) {
finalSamples[j] = samples[i];
}
} else {
throw new Error("Unsupported stereo to mono conversion: " +
stereoToMono);
}
return finalSamples;
}
/**
* Clears all cached audio data.
*/
public void clear() {
audioList.clear();
}
/**
* Reads and returns the next Data object from this Microphone, return null if there is no more audio data. All
* audio data captured in-between <code>startRecording()</code> and <code>stopRecording()</code> is cached in an
* Utterance object. Calling this method basically returns the next chunk of audio data cached in this Utterance.
*
* @return the next Data or <code>null</code> if none is available
*/
@Override
public Data getData() throws DataProcessingException {
getTimer().start();
Data output = null;
if (!utteranceEndReached) {
try {
output = audioList.take();
} catch (InterruptedException ie) {
throw new DataProcessingException("cannot take Data from audioList", ie);
}
if (output instanceof DataEndSignal) {
utteranceEndReached = true;
}
}
getTimer().stop();
// signalCheck(output);
return output;
}
/**
* Returns true if there is more data in the Microphone.
* This happens either if the a DataEndSignal data was not taken from the buffer,
* or if the buffer in the Microphone is not yet empty.
*
* @return true if there is more data in the Microphone
*/
public boolean hasMoreData() {
return !(utteranceEndReached && audioList.isEmpty());
}
}
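For completeness, here is a rough usage sketch, not taken from the Sphinx4 documentation; the constructor arguments and the three-second pause are illustrative only, and it assumes the rest of the Sphinx4 front end (Data, DoubleData, and friends) is on the classpath, with exception handling omitted:

// capture a few seconds of audio and dump the raw sample values
Microphone mic = new Microphone(16000, 16, 1, false, true, true, 10, false, "average", 0, "default");
mic.initialize();
if (mic.startRecording()) {
    Thread.sleep(3000);            // capture roughly three seconds of audio
    mic.stopRecording();
    while (mic.hasMoreData()) {
        Data data = mic.getData(); // blocks until the next chunk is available
        if (data instanceof DoubleData) {
            for (double sample : ((DoubleData) data).getValues()) {
                System.out.print(sample + " ");   // raw sample values
            }
        }
    }
}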
Have a look at xuggle. They dropped support for the project but it still has all the documentation and the google-group will sometimes give you a good answer.
As for specifically reading audio off hardware, start with these demos and work your way from there.
The hardest part about this is streaming the data from your mic, which is pretty well explained in the article you linked.
So stream it into xuggle using the oracle docs, then manipulate the audio how you see fit.
I didn't try the following myself, I'm just posting them to help you. You may try them and they may help you.
Record streaming audio in java?
http://docs.blackberry.com/en/developers/deliverables/11942/CS_recording_audio_from_a_player_740038_11.jsp
http://ganeshtiwaridotcomdotnp.blogspot.com/2011/12/java-sound-capture-from-microphone.html
http://edenti.deis.unibo.it/utils/Java-tips/Capturing%20Audio%20with%20Java%20Sound%20API.txt
http://docs.oracle.com/javase/tutorial/sound/capturing.html
Related
I am currently trying to make a .wav file that will play SOS in Morse code.
The way I went about this is: I have a byte array that contains one wave of a beep. I then repeated that until I had the desired length.
After that I inserted those bytes into a new array and put bytes containing 00 (in hexadecimal) to separate the beeps.
If I add 1 beep to a WAVE file, it creates the file correctly (i.e. I get a beep of the desired length).
Here is a picture of the waves zoomed in (I opened the file in Audacity):
And here is a picture of the entire wave part:
The problem now is that when I add a second beep, the second one becomes completely distorted:
So this is what the entire file looks like now:
If I add another beep, it will be the correct beep again; if I add yet another beep, it's going to be distorted again, etc.
So basically, every other wave is distorted.
Does anyone know why this happens?
Here is a link to a .txt file I generated containing the audio data of the wave file I created: byteTest19.txt
And here is a link to a .txt file that I generated using fileformat.info, which is a hexadecimal representation of the bytes in the .wav file I generated containing 5 beeps (with two of them, the even beeps, being distorted): test3.txt
You can tell when a new beep starts because it is preceded by a lot of 00's.
As far as I can see, the bytes of the second beep do not differ from the first one, which is why I am asking this question.
If anyone knows why this happens, please help me. If you need more information, don't hesitate to ask. I hope I explained well what I'm doing, if not, that's my bad.
EDIT
Here is my code:
// First I calculate the byte array for a single beep
// This file is just a single wave of the audio (up and down)
// (see below for the fileToAudioByteArray method; in my
// actual code I only take in half of the wave and then I
// invert it, but I didn't want to make this too complicated,
// I'll put the full code below)
final byte[] wave = fileToAudioByteArray(new File("path to my wav file"));
// This is how long that audio fragment is in seconds
final double secondsPerWave = 0.0022195;
// This is the amount of seconds a beep takes up (e.g. the seconds picture)
double secondsPerBeep = 0.25;
final int amountWaveInBeep = (int) Math.ceil((secondsPerBeep/secondsPerWave));
// this is the byte array containing the audio data of
// 1 beep (see below for the repeatArray method)
final byte[] beep = repeatArray(wave, amountWaveInBeep);
// Now for the silence between the beeps
final byte silenceByte = 0x00;
// The amount of seconds a silence byte takes up
final double secondsPerSilenceByte = 0.00002;
// The amount of silence bytes I need to make one second
final int amountOfSilenceBytesForOneSecond = (int) (Math.ceil((1/secondsPerSilenceByte)));
// The space between 2 beeps will be 0.25 * secondsPerBeep
double amountOfBeepsEquivalent = 0.25;
// This is the amount of bytes of silence I need
// between my beeps
final int amntSilenceBytesPerSpaceBetween = (int) Math.ceil(secondsPerBeep * amountOfBeepsEquivalent * amountOfSilenceBytesForOneSecond);
final byte[] spaceBetweenBeeps = new byte[amntSilenceBytesPerSpaceBetween];
for (int i = 0; i < amntSilenceBytesPerSpaceBetween; i++) {
spaceBetweenBeeps[i] = silenceByte;
}
WaveFileBuilder wavBuilder = new WaveFileBuilder(WaveFileBuilder.AUDIOFORMAT_PCM, 1, 44100, 16);
// Adding all the beeps and silence to the WAVE file (test3.wav)
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(nextChar);
File outputFile = new File("path/test3.wav");
wavBuilder.saveFile(outputFile);
These are the 2 methods I used in the beginning:
/**
* Converts a wav file to a byte array containing its audio data
* @param file the wav file you want to convert
* @return the data part of a wav file in byte form
*/
public static byte[] fileToAudioByteArrray(File file) throws UnsupportedAudioFileException, IOException {
AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(file);
AudioFormat audioFormat = audioInputStream.getFormat();
int bytesPerSample = audioFormat.getFrameSize();
if (bytesPerSample == AudioSystem.NOT_SPECIFIED) {
bytesPerSample = -1;
}
long numSamples = audioInputStream.getFrameLength();
int numBytes = (int) (numSamples * bytesPerSample);
byte[] audioBytes = new byte[numBytes];
int numBytesRead;
while((numBytesRead = audioInputStream.read(audioBytes)) != -1);
return audioBytes;
}
/**
* Repeats an array into a new array x times
* @param array the array you want to copy x times
* @param repeat the amount of times you want to copy the array into the new array
* @return an array containing the content of {@code array} {@code repeat} times.
*/
public static byte[] repeatArray(byte[] array, int repeat) {
byte[] result = new byte[array.length * repeat];
for (int i = 0; i < result.length; i++) {
result[i] = array[i % array.length];
}
return result;
}
Now for my WaveFileBuilder class:
/**
* <p> Constructs a WavFileBuilder which can be used to create wav files.</p>
*
* <p>The builder takes care of the subchunks based on the parameters that are given in the constructor.</p>
*
* <h3>Adding audio to the wav file</h3>
* There are 2 methods that can be used to add audio data to the WavFile.
* One is {@link #addBytes(byte[]) addBytes} which lets you directly inject bytes
* into the data section of the wav file.
* The other is {@link #addAudioFile(File) addAudioFile} which lets you add the audio
* data of another wav file to the wav file's audio data.
*
* @param audioFormat The audio format of the wav file {@link #AUDIOFORMAT_PCM PCM} = 1
* @param numChannels The number of channels the wav file will have {@link #NUM_CHANNELS_MONO MONO} = 1,
* {@link #NUM_CHANNELS_STEREO STEREO} = 2
* @param sampleRate The sample rate of the wav file in Hz (e.g. 22050, 44100, ...)
* @param bitsPerSample The amount of bits per sample. If 16 bits, the audio sample will contain 2 bytes per
* channel. (e.g. 8, 16, ...). This is important to take into account when using the
* {@link #addBytes(byte[]) addBytes} method to insert data into the wav file.
*/
public WaveFileBuilder(int audioFormat, int numChannels, int sampleRate, int bitsPerSample) {
this.audioFormat = audioFormat;
this.numChannels = numChannels;
this.sampleRate = sampleRate;
this.bitsPerSample = bitsPerSample;
// Subchunk 1 calculations
this.byteRate = this.sampleRate * this.numChannels * (this.bitsPerSample / 8);
this.blockAlign = this.numChannels * (this.bitsPerSample / 8);
}
/**
* Contains the audio data for the wav file that is being constructed
*/
byte[] audioBytes = null;
// For debug purposes
int counter = 0;
/**
* Adds audio data to the wav file from bytes
* <p>See the "see also" for the structure of the "Data" part of a wav file</p>
* @param audioBytes audio data
* @see Wave PCM Soundfile Format
*/
public void addBytes(byte[] audioBytes) throws IOException {
// This is all debug code that I used to make byteTest19.txt
// which I have linked in my question
String test1;
try {
test1 = (temp.bytesToHex(this.audioBytes, true));
} catch (NullPointerException e) {
test1 = "null";
}
File file = new File("/Users/jonaseveraert/Desktop/Morse Sound Test/debug/byteTest" + counter + ".txt");
file.createNewFile();
counter++;
BufferedWriter writer = new BufferedWriter(new FileWriter(file));
writer.write(test1);
writer.close();
// This is where the actual code starts //
if (this.audioBytes != null)
this.audioBytes = ArrayUtils.addAll(this.audioBytes, audioBytes);
else
this.audioBytes = audioBytes;
// End of code //
// This is for debug again
String test2 = (temp.bytesToHex(this.audioBytes, true));
File file2 = new File("/Users/jonaseveraert/Desktop/Morse Sound Test/debug/byteTest" + counter + ".txt");
file2.createNewFile();
counter++;
BufferedWriter writer2 = new BufferedWriter(new FileWriter(file2));
writer2.write(test2);
writer2.close();
}
/**
* Saves the file to the location of the {@code outputFile}.
* @param outputFile The file that will be outputted (not created yet), contains the path
* @return true if the file was created and written to successfully. Else false.
* @throws IOException If an I/O error occurred
*/
public boolean saveFile(File outputFile) throws IOException {
// subchunk2 calculations
//int numBytesInData = data.length()/2;
int numBytesInData = audioBytes.length;
int numSamples = numBytesInData / (2 * numChannels);
subchunk2Size = numSamples * numChannels * (bitsPerSample / 8);
// chunk calculation
chunkSize = 4 + (8 + subchunk1Size) + (8 + subchunk2Size);
// convert everything to hex string //
// Chunk descriptor
String f_chunkID = asciiStringToHexString(chunkID);
String f_chunkSize = intToLittleEndianHexString(chunkSize, 4);
String f_format = asciiStringToHexString(format);
// fmt subchunck
String f_subchunk1ID = asciiStringToHexString(subchunk1ID);
String f_subchunk1Size = intToLittleEndianHexString(subchunk1Size, 4);
String f_audioformat = intToLittleEndianHexString(audioFormat, 2);
String f_numChannels = intToLittleEndianHexString(numChannels, 2);
String f_sampleRate = intToLittleEndianHexString(sampleRate, 4);
String f_byteRate = intToLittleEndianHexString(byteRate, 4);
String f_blockAlign = intToLittleEndianHexString(blockAlign, 2);
String f_bitsPerSample = intToLittleEndianHexString(bitsPerSample, 2);
// data subchunk
String f_subchunk2ID = asciiStringToHexString(subchunk2ID);
String f_subchunk2Size = intToLittleEndianHexString(subchunk2Size, 4);
// data is stored in audioData
// Combine all hex data into one String (except for the
// audio data, which is passed in as a byte array)
final String AUDIO_BYTE_STREAM_STRING = f_chunkID + f_chunkSize + f_format
+ f_subchunk1ID + f_subchunk1Size + f_audioformat + f_numChannels + f_sampleRate + f_byteRate + f_blockAlign + f_bitsPerSample
+ f_subchunk2ID + f_subchunk2Size;
// Convert the hex data to a byte array
final byte[] BYTES = hexStringToByteArray(AUDIO_BYTE_STREAM_STRING);
// Create & write file
if (outputFile.createNewFile()) {
// Combine byte arrays
// This array now contains the full WAVE file
byte[] audioFileBytes = ArrayUtils.addAll(BYTES, audioBytes);
try (FileOutputStream fos = new FileOutputStream(outputFile)) {
fos.write(audioFileBytes); // Write the bytes into a file
}
catch (IOException e) {
logger.log(Level.SEVERE, "IOException occurred");
logger.log(Level.SEVERE, null, e);
return false;
}
logger.log(Level.INFO, "File created: " + outputFile.getName());
return true;
} else {
//System.out.println("File already exists.");
logger.log(Level.WARNING, "File already exists.");
}
return false;
}
}
// Aiding methods
/**
* Converts a string containing hexadecimal to bytes
* @param s e.g. 00014F
* @return an array of bytes e.g. {00, 01, 4F}
*/
private byte[] hexStringToByteArray(String s) {
int len = s.length();
byte[] bytes = new byte[len / 2];
for (int i = 0; i < len; i+= 2) {
bytes[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4) + Character.digit(s.charAt(i+1), 16));
}
return bytes;
}
/**
* Converts an int to a hexadecimal string in the little-endian format
* @param input an integer number
* @param numberOfBytes The number of bytes the integer is stored in
* @return The integer as a hexadecimal string in the little-endian byte ordering
*/
private String intToLittleEndianHexString(int input, int numberOfBytes) {
String hexBigEndian = Integer.toHexString(input);
StringBuilder hexLittleEndian = new StringBuilder();
int amountOfNumberProcessed = 0;
for (int i = 0; i < hexBigEndian.length()/2f; i++) {
int endIndex = hexBigEndian.length() - (i * 2);
try {
hexLittleEndian.append(hexBigEndian.substring(endIndex-2, endIndex));
} catch (StringIndexOutOfBoundsException e ) {
hexLittleEndian.append(0).append(hexBigEndian.charAt(0));
}
amountOfNumberProcessed++;
}
while (amountOfNumberProcessed != numberOfBytes) {
hexLittleEndian.append("00");
amountOfNumberProcessed++;
}
return hexLittleEndian.toString();
}
/**
* Converts a string containing ascii to its hexadecimal notation
* @param input The string that has to be converted
* @return The string as a hexadecimal notation in the big-endian byte ordering
*/
private String asciiStringToHexString(String input) {
byte[] bytes = input.getBytes(StandardCharsets.US_ASCII);
StringBuilder hex = new StringBuilder();
for (byte b : bytes) {
String hexChar = String.format("%02X", b);
hex.append(hexChar);
}
return hex.toString().trim();
}
And lastly: if you want the full code, replace
final byte[] wave = fileToAudioByteArray(new File("path to my wav file")); in the beginning of my code with:
File morse_half_wave_file = new File("/Users/jonaseveraert/Desktop/Morse Sound Test/morse_audio_fragment.wav");
final byte[] half_wave = temp.fileToAudioByteArrray(morse_half_wave_file);
final byte[] half_wave_inverse = temp.invertByteArray(half_wave);
// Then the wave byte array becomes:
final byte[] wave = ArrayUtils.addAll(half_wave, half_wave_inverse); // This ArrayUtils.addAll comes from the Apache Commons lang3 library
// And this is the invertByteArray method
/**
* Inverts bytes e.g. 000101 becomes 111010
*/
public static byte[] invertByteArray(byte[] bytes) {
if (bytes == null) {
return null;
// TODO: throw empty byte array exception
}
byte[] outputArray = new byte[bytes.length];
for(int i = 0; i < bytes.length; i++) {
outputArray[i] = (byte) ~bytes[i];
}
return outputArray;
}
P.S. Here is the morse_audio_fragment.wav: morse_audio_fragment.wav
Thanks in advance,
Jonas
The problem
Your .wav file is Signed 16 bit Little Endian, Rate 44100 Hz, Mono - which means that each sample in the file is 2 bytes long, and describes a signed amplitude. So you can copy-and-paste chunks of samples without any problems, as long as their lengths are divisible by 2 (your block size). Your silences are likely of odd length, so that the 1st sample after a silence is interpreted as
0x00 0x65 // last byte of silence, 1st byte of actual beep: weird
and all subsequent byte pairs are interpreted wrongly (taking the 2nd byte of each sample together with the 1st byte of the next sample) due to this initial mis-alignment, until the next odd-length silence, when everything suddenly gets re-aligned correctly again; instead of the expected
0x65 0x05 // 1st and 2nd byte of beep: actual expected sample
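To make the mis-alignment concrete, here is a tiny sketch using the hypothetical byte values above, showing how a 16-bit little-endian sample is assembled; shifting the read position by a single byte produces a completely different amplitude:

byte lo = 0x65, hi = 0x05;                                // the two bytes of one correctly aligned sample
short aligned = (short) ((lo & 0xFF) | (hi << 8));        // 0x0565 = 1381
byte lo2 = 0x00, hi2 = 0x65;                              // last silence byte paired with the first beep byte
short misaligned = (short) ((lo2 & 0xFF) | (hi2 << 8));   // 0x6500 = 25856, nothing like the original wave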
How to fix it
Do not allow calls to addBytes that would add a number of bytes that is not an exact multiple of the block size.
public class WaveFileBuilder {
byte[] audioBytes = null;
// ... other attributes, methods, constructor
public void addBytes(byte[] audioBytes) throws IOException {
// ... debug code above, handle empty
// THIS SHOULD CHECK audioBytes IS MULTIPLE OF blockSize
this.audioBytes = ArrayUtils.addAll(this.audioBytes, audioBytes);
// ... debug code below
}
public boolean saveFile(File outputFile) throws IOException {
// ... prepare headers
// concatenate header (BYTES) and contents
byte[] audioFileBytes = ArrayUtils.addAll(BYTES, audioBytes);
// ... write out bytes
try (FileOutputStream fos = new FileOutputStream(outputFile)) {
fos.write(audioFileBytes);
}
// ...
}
}
First, you could have avoided some confusion by using different names for the attribute and the parameter. Second, you are growing an array over and over; this is wasteful, turning code that could run in O(n) into O(n^2), because you are calling it like this:
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(spaceBetweenDigits);
wavBuilder.addBytes(beep);
wavBuilder.addBytes(nextChar);
Instead, I propose the following:
public class WaveFileBuilder {
List<byte[]> chunks = new ArrayList<>();
// ... other attributes, methods, constructor
public void addBytes(byte[] audioBytes) throws IOException {
if ((audioBytes.length % blockAlign) != 0) {
throw new IllegalArgumentException("Trying to add a chunk that does not fit evenly; this would cause un-aligned blocks");
}
chunks.add(audioBytes);
}
public boolean saveFile(File outputFile) throws IOException {
// ... prepare headers
// ... write out bytes
try (FileOutputStream fos = new FileOutputStream(outputFile)) {
for (byte[] chunk : chunks) fos.write(chunk);
}
return true;
}
}
This version uses no concatenation at all, and should be much faster and easier to test. It also requires less memory, because it is not copying all those arrays around to concatenate them to each other.
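One extra practical note: the silence lengths in the question come from time calculations, so they can easily end up odd. A small sketch (assuming blockAlign holds the frame size in bytes, 2 for 16-bit mono as computed in the builder's constructor) that rounds a silence buffer up to a whole number of frames before passing it to addBytes:

int wanted = amntSilenceBytesPerSpaceBetween;
int padded = ((wanted + blockAlign - 1) / blockAlign) * blockAlign;  // round up to a frame boundary
byte[] spaceBetweenBeeps = new byte[padded];                         // a new byte[] is zero-filled, i.e. silence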
I am developing a networking Java application. I want to stream a video over the network (maybe using sockets). I searched the internet but I didn't find any working server and client code to stream video from a server to a client.
Can anyone point me to a streaming server and client, or code a simple program, so that I can understand how streaming is done using Java?
P.S. I found an assignment related to this on the internet, but it has errors and some methods are unimplemented. If you could remove the errors and complete the methods, that would also be helpful:
http://cs.anu.edu.au/student/comp3310/2004/Labs/lab6/lab5.html
See: Any simple (and up to date) Java frameworks for embedding movies within a Swing Application?, just refer to the JavaFX only code sample (you don't need any Swing code).
import javafx.application.Application;
import javafx.scene.*;
import javafx.scene.media.*;
import javafx.stage.Stage;
public class VideoPlayerExample extends Application {
public static void main(String[] args) throws Exception { launch(args); }
@Override public void start(final Stage stage) throws Exception {
final MediaPlayer oracleVid = new MediaPlayer(
new Media("http://download.oracle.com/otndocs/products/javafx/oow2010-2.flv")
);
stage.setScene(new Scene(new Group(new MediaView(oracleVid)), 540, 208));
stage.show();
oracleVid.play();
}
}
So, encode your video to a format understood by JavaFX (e.g. h264 encoded mp4) and place it on a http server and you can load the video data over http from your JavaFX client. Ensure that your client is a certified system configuration for media playback using JavaFX.
That is probably sufficient for what you need.
If you need something a bit more fancy, JavaFX also supports http live streaming, which you can read up on and see if you need (which you probably don't). I don't have instructions on setting up a http live streaming server, nor a link to somewhere on the internet on how to do that (you would have to do your own research on that if you want to go that route).
Also, note, I converted the mjpeg player lab assignment you reference in your question to JavaFX to answer the question: Display RTP MJPEG. It is useful if you want to understand at a low level how such video playback is done. However, I would not recommend using this method for your video playback for a production project - instead just use the built-in JavaFX MediaPlayer.
Here is the basic code: http://xuggle.googlecode.com/svn/trunk/java/xuggle-xuggler/src/com/xuggle/xuggler/demos/DecodeAndPlayAudioAndVideo.java
but I changed it to:
package Pasban;
/**
*
* @modified by Pasban
*/
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import com.xuggle.xuggler.Global;
import com.xuggle.xuggler.IAudioSamples;
import com.xuggle.xuggler.IContainer;
import com.xuggle.xuggler.IPacket;
import com.xuggle.xuggler.IPixelFormat;
import com.xuggle.xuggler.IStream;
import com.xuggle.xuggler.IStreamCoder;
import com.xuggle.xuggler.ICodec;
import com.xuggle.xuggler.IVideoPicture;
import com.xuggle.xuggler.IVideoResampler;
import com.xuggle.xuggler.Utils;
import com.xuggle.xuggler.demos.VideoImage;
import java.awt.Dimension;
/**
* Takes a media container, finds the first video stream,
* decodes that stream, and then plays the audio and video.
*
* This code does a VERY coarse job of matching time-stamps, and thus
* the audio and video will float in and out of slight sync. Getting
* time-stamps syncing-up with audio is very system dependent and left
* as an exercise for the reader.
*
* @author aclarke
*
*/
public class DecodeAndPlayAudioAndVideo {
/**
* The audio line we'll output sound to; it'll be the default audio device on your system if available
*/
private static SourceDataLine mLine;
/**
* The window we'll draw the video on.
*
*/
private static VideoImage mScreen = null;
private static long mSystemVideoClockStartTime;
private static long mFirstVideoTimestampInStream;
/**
* Takes a media container (file) as the first argument, opens it,
* plays audio as quickly as it can, and opens up a Swing window and displays
* video frames with <i>roughly</i> the right timing.
*
* @param args Must contain one string which represents a filename
*/
#SuppressWarnings("deprecation")
public static void main(String[] args) {
String filename = "http://techslides.com/demos/sample-videos/small.mp4";
// Let's make sure that we can actually convert video pixel formats.
if (!IVideoResampler.isSupported(IVideoResampler.Feature.FEATURE_COLORSPACECONVERSION)) {
throw new RuntimeException("you must install the GPL version of Xuggler (with IVideoResampler support) for this demo to work");
}
// Create a Xuggler container object
IContainer container = IContainer.make();
// Open up the container
if (container.open("http://techslides.com/demos/sample-videos/small.mp4", IContainer.Type.READ, null) < 0) {
throw new IllegalArgumentException("could not open file: " + filename);
}
// query how many streams the call to open found
int numStreams = container.getNumStreams();
// and iterate through the streams to find the first audio stream
int videoStreamId = -1;
IStreamCoder videoCoder = null;
int audioStreamId = -1;
IStreamCoder audioCoder = null;
for (int i = 0; i < numStreams; i++) {
// Find the stream object
IStream stream = container.getStream(i);
// Get the pre-configured decoder that can decode this stream;
IStreamCoder coder = stream.getStreamCoder();
if (videoStreamId == -1 && coder.getCodecType() == ICodec.Type.CODEC_TYPE_VIDEO) {
videoStreamId = i;
videoCoder = coder;
} else if (audioStreamId == -1 && coder.getCodecType() == ICodec.Type.CODEC_TYPE_AUDIO) {
audioStreamId = i;
audioCoder = coder;
}
}
if (videoStreamId == -1 && audioStreamId == -1) {
throw new RuntimeException("could not find audio or video stream in container: " + filename);
}
/*
* Check if we have a video stream in this file. If so let's open up our decoder so it can
* do work.
*/
IVideoResampler resampler = null;
if (videoCoder != null) {
if (videoCoder.open() < 0) {
throw new RuntimeException("could not open audio decoder for container: " + filename);
}
if (videoCoder.getPixelType() != IPixelFormat.Type.BGR24) {
// if this stream is not in BGR24, we're going to need to
// convert it. The VideoResampler does that for us.
resampler = IVideoResampler.make(videoCoder.getWidth(), videoCoder.getHeight(), IPixelFormat.Type.BGR24,
videoCoder.getWidth(), videoCoder.getHeight(), videoCoder.getPixelType());
openJavaVideo(videoCoder);
if (resampler == null) {
throw new RuntimeException("could not create color space resampler for: " + filename);
}
}
/*
* And once we have that, we draw a window on screen
*/
}
if (audioCoder != null) {
if (audioCoder.open() < 0) {
throw new RuntimeException("could not open audio decoder for container: " + filename);
}
/*
* And once we have that, we ask the Java Sound System to get itself ready.
*/
try {
openJavaSound(audioCoder);
} catch (LineUnavailableException ex) {
throw new RuntimeException("unable to open sound device on your system when playing back container: " + filename);
}
}
/*
* Now, we start walking through the container looking at each packet.
*/
IPacket packet = IPacket.make();
mFirstVideoTimestampInStream = Global.NO_PTS;
mSystemVideoClockStartTime = 0;
while (container.readNextPacket(packet) >= 0) {
/*
* Now we have a packet, let's see if it belongs to our video stream
*/
if (packet.getStreamIndex() == videoStreamId) {
/*
* We allocate a new picture to get the data out of Xuggler
*/
IVideoPicture picture = IVideoPicture.make(videoCoder.getPixelType(),
videoCoder.getWidth(), videoCoder.getHeight());
/*
* Now, we decode the video, checking for any errors.
*
*/
int bytesDecoded = videoCoder.decodeVideo(picture, packet, 0);
if (bytesDecoded < 0) {
throw new RuntimeException("got error decoding audio in: " + filename);
}
/*
* Some decoders will consume data in a packet, but will not be able to construct
* a full video picture yet. Therefore you should always check if you
* got a complete picture from the decoder
*/
if (picture.isComplete()) {
IVideoPicture newPic = picture;
/*
* If the resampler is not null, that means we didn't get the video in BGR24 format and
* need to convert it into BGR24 format.
*/
if (resampler != null) {
// we must resample
newPic = IVideoPicture.make(resampler.getOutputPixelFormat(), picture.getWidth(), picture.getHeight());
if (resampler.resample(newPic, picture) < 0) {
throw new RuntimeException("could not resample video from: " + filename);
}
}
if (newPic.getPixelType() != IPixelFormat.Type.BGR24) {
throw new RuntimeException("could not decode video as BGR 24 bit data in: " + filename);
}
long delay = millisecondsUntilTimeToDisplay(newPic);
// if there is no audio stream; go ahead and hold up the main thread. We'll end
// up caching fewer video pictures in memory that way.
try {
if (delay > 0) {
Thread.sleep(delay);
}
} catch (InterruptedException e) {
return;
}
// And finally, convert the picture to an image and display it
mScreen.setImage(Utils.videoPictureToImage(newPic));
}
} else if (packet.getStreamIndex() == audioStreamId) {
/*
* We allocate a set of samples with the same number of channels as the
* coder tells us is in this buffer.
*
* We also pass in a buffer size (1024 in our example), although Xuggler
* will probably allocate more space than just the 1024 (it's not important why).
*/
IAudioSamples samples = IAudioSamples.make(1024, audioCoder.getChannels());
/*
* A packet can actually contain multiple sets of samples (or frames of samples
* in audio-decoding speak). So, we may need to call decode audio multiple
* times at different offsets in the packet's data. We capture that here.
*/
int offset = 0;
/*
* Keep going until we've processed all data
*/
while (offset < packet.getSize()) {
int bytesDecoded = audioCoder.decodeAudio(samples, packet, offset);
if (bytesDecoded < 0) {
throw new RuntimeException("got error decoding audio in: " + filename);
}
offset += bytesDecoded;
/*
* Some decoder will consume data in a packet, but will not be able to construct
* a full set of samples yet. Therefore you should always check if you
* got a complete set of samples from the decoder
*/
if (samples.isComplete()) {
// note: this call will block if Java's sound buffers fill up, and we're
// okay with that. That's why we have the video "sleeping" occur
// on another thread.
playJavaSound(samples);
}
}
} else {
/*
* This packet isn't part of our video stream, so we just silently drop it.
*/
do {
} while (false);
}
}
/*
* Technically since we're exiting anyway, these will be cleaned up by
* the garbage collector... but because we're nice people and want
* to be invited places for Christmas, we're going to show how to clean up.
*/
if (videoCoder != null) {
videoCoder.close();
videoCoder = null;
}
if (audioCoder != null) {
audioCoder.close();
audioCoder = null;
}
if (container != null) {
container.close();
container = null;
}
closeJavaSound();
closeJavaVideo();
}
private static long millisecondsUntilTimeToDisplay(IVideoPicture picture) {
/**
* We could just display the images as quickly as we decode them, but it turns
* out we can decode a lot faster than you think.
*
* So instead, the following code does a poor-man's version of trying to
* match up the frame-rate requested for each IVideoPicture with the system
* clock time on your computer.
*
* Remember that all Xuggler IAudioSamples and IVideoPicture objects always
* give timestamps in Microseconds, relative to the first decoded item. If
* instead you used the packet timestamps, they can be in different units depending
* on your IContainer, and IStream and things can get hairy quickly.
*/
long millisecondsToSleep = 0;
if (mFirstVideoTimestampInStream == Global.NO_PTS) {
// This is our first time through
mFirstVideoTimestampInStream = picture.getTimeStamp();
// get the starting clock time so we can hold up frames
// until the right time.
mSystemVideoClockStartTime = System.currentTimeMillis();
millisecondsToSleep = 0;
} else {
long systemClockCurrentTime = System.currentTimeMillis();
long millisecondsClockTimeSinceStartofVideo = systemClockCurrentTime - mSystemVideoClockStartTime;
// compute how long for this frame since the first frame in the stream.
// remember that IVideoPicture and IAudioSamples timestamps are always in MICROSECONDS,
// so we divide by 1000 to get milliseconds.
long millisecondsStreamTimeSinceStartOfVideo = (picture.getTimeStamp() - mFirstVideoTimestampInStream) / 1000;
final long millisecondsTolerance = 50; // and we give ourselfs 50 ms of tolerance
millisecondsToSleep = (millisecondsStreamTimeSinceStartOfVideo
- (millisecondsClockTimeSinceStartofVideo + millisecondsTolerance));
}
return millisecondsToSleep;
}
/**
* Opens a Swing window on screen.
*/
/**
* Forces the swing thread to terminate; I'm sure there is a right
* way to do this in swing, but this works too.
*/
private static void closeJavaVideo() {
System.exit(0);
}
private static void openJavaSound(IStreamCoder aAudioCoder) throws LineUnavailableException {
AudioFormat audioFormat = new AudioFormat(aAudioCoder.getSampleRate(),
(int) IAudioSamples.findSampleBitDepth(aAudioCoder.getSampleFormat()),
aAudioCoder.getChannels(),
true, /* xuggler defaults to signed 16 bit samples */
false);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
mLine = (SourceDataLine) AudioSystem.getLine(info);
/**
* if that succeeded, try opening the line.
*/
mLine.open(audioFormat);
/**
* And if that succeed, start the line.
*/
mLine.start();
}
private static void playJavaSound(IAudioSamples aSamples) {
/**
* We're just going to dump all the samples into the line.
*/
byte[] rawBytes = aSamples.getData().getByteArray(0, aSamples.getSize());
mLine.write(rawBytes, 0, aSamples.getSize());
}
private static void closeJavaSound() {
if (mLine != null) {
/*
* Wait for the line to finish playing
*/
mLine.drain();
/*
* Close the line.
*/
mLine.close();
mLine = null;
}
}
private static void openJavaVideo(IStreamCoder videoCoder) {
mScreen = new VideoImage();
mScreen.setPreferredSize(new Dimension(videoCoder.getWidth(), videoCoder.getHeight()));
mScreen.setLocationRelativeTo(null);
}
}
Things I changed:
private static void openJavaVideo(IStreamCoder videoCoder) {
mScreen = new VideoImage();
mScreen.setPreferredSize(new Dimension(videoCoder.getWidth(), videoCoder.getHeight()));
mScreen.setLocationRelativeTo(null);
}
Moved openJavaVideo method into videoStream detector:
openJavaVideo(videoCoder);
Changed the first part of the main:
public static void main(String[] args) {
String filename = "http://techslides.com/demos/sample-videos/small.mp4";
// Let's make sure that we can actually convert video pixel formats.
if (!IVideoResampler.isSupported(IVideoResampler.Feature.FEATURE_COLORSPACECONVERSION)) {
throw new RuntimeException("you must install the GPL version of Xuggler (with IVideoResampler support) for this demo to work");
}
// Create a Xuggler container object
IContainer container = IContainer.make();
// Open up the container
if (container.open("http://techslides.com/demos/sample-videos/small.mp4", IContainer.Type.READ, null) < 0) {
throw new IllegalArgumentException("could not open file: " + filename);
}
Actually, the important part was:
if (container.open("http://techslides.com/demos/sample-videos/small.mp4", IContainer.Type.READ, null) < 0) {
throw new IllegalArgumentException("could not open file: " + filename);
}
Xuggler is/was one of the best:
Streaming video with Xuggler
Right now I don't have a complete project, but I believe it had an example among its demo files.
Search Google for "xuggle video streaming demo" or similar keywords with Xuggler. It is easy to use and supports most of the known formats, since it wraps FFmpeg.
I came across another idea that may be worth trying.
Create a JavaFX application.
Add a web browser into it (check out WebEngine).
Create a template web page that contains an HTML player, or host a page on your server that accepts a file id and builds a player for that file (similar to YouTube), then auto-play it.
This would be a much better approach if you can pull it off.
There are sample codes for WebEngine and JavaFX. Once you have loaded a page, say YouTube or Vimeo, and played a video there, the sky is the limit :)
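As a rough, untested sketch of that idea: a minimal JavaFX application that embeds a WebView and loads a page containing an auto-playing HTML5 player. The URL is only a placeholder; point it at your own player page.
import javafx.application.Application;
import javafx.scene.Scene;
import javafx.scene.web.WebView;
import javafx.stage.Stage;
public class WebPlayerDemo extends Application {
    @Override
    public void start(Stage stage) {
        WebView webView = new WebView();
        // load a page that embeds an auto-playing HTML5 <video> player
        // (placeholder URL - replace with your own player page)
        webView.getEngine().load("http://example.com/player.html?fileId=123");
        stage.setScene(new Scene(webView, 800, 600));
        stage.setTitle("Embedded web player");
        stage.show();
    }
    public static void main(String[] args) {
        launch(args);
    }
}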
I need to generate a waveform from the audio stream of a video.
Currently I'm using Xuggler and Java to do a few small things, but it seems I'm not able to get a byte array of the video's audio stream from IAudioSamples.
Now I'm searching for an easier way to do it, since Xuggler is really becoming hard to understand. I've searched online and found this:
http://codeidol.com/java/swing/Audio/Build-an-Audio-Waveform-Display/
It should work on .wav files, but when I try the code on a video or an .mp3, the AudioInputStream returns "cannot find an audio input stream".
Can someone tell me a way to get a byte[] array of the audio stream of a video so that I can follow the tutorial and create a waveform?
Also, if you have suggestions or another library that could help me, I would be glad.
Because MP3 is an encoded (compressed) format, you first need to decode it to get raw data (bytes) out of it.
class Mp3FileXuggler {
private boolean DEBUG = true;
private String _sInputFileName;
private IContainer _inputContainer;
private int _iBitRate;
private IPacket _packet;
private int _iAudioStreamId;
private IStreamCoder _audioCoder;
private int _iSampleBufferSize;
private int _iInputSampleRate;
private static SourceDataLine mLine;
private int DECODED_AUDIO_SECOND_SIZE = 176375; /** bytes */
private int _bytesPerPacket;
private byte[] _residualBuffer;
/**
* Constructor; prepares the stream to be read.
* @param sFileName input file name
* @throws UnsuportedSampleRateException
*/
public Mp3FileXuggler(String sFileName) throws UnsuportedSampleRateException{
this._sInputFileName = sFileName;
this._inputContainer = IContainer.make();
this._iSampleBufferSize = 18432;
this._residualBuffer = null;
/** Open container **/
if (this._inputContainer.open(this._sInputFileName, IContainer.Type.READ, null) < 0)
throw new IllegalArgumentException("Could not read the file: " + this._sInputFileName);
/** How many streams does the file actually have */
int iNumStreams = this._inputContainer.getNumStreams();
this._iBitRate = this._inputContainer.getBitRate();
if (DEBUG) System.out.println("Bitrate: " + this._iBitRate);
/** Iterate the streams to find the first audio stream */
this._iAudioStreamId = -1;
this._audioCoder = null;
boolean bFound = false;
int i = 0;
while (i < iNumStreams && bFound == false){
/** Find the stream object */
IStream stream = this._inputContainer.getStream(i);
IStreamCoder coder = stream.getStreamCoder();
/** If the stream is audio, stop looking */
if (coder.getCodecType() == ICodec.Type.CODEC_TYPE_AUDIO){
this._iAudioStreamId = i;
this._audioCoder = coder;
this._iInputSampleRate = coder.getSampleRate();
bFound = true;
}
++i;
}
/** If none was found */
if (this._iAudioStreamId == -1)
throw new RuntimeException("Could not find audio stream in container: " + this._sInputFileName);
/** Otherwise, open audiocoder */
if (this._audioCoder.open(null,null) < 0)
throw new RuntimeException("could not open audio decoder for container: " + this._sInputFileName);
this._packet = IPacket.make();
//openJavaSound(this._audioCoder);
/** Dummy read one packet to avoid problems in some audio files */
this._inputContainer.readNextPacket(this._packet);
/** Supported sample rates */
switch(this._iInputSampleRate){
case 22050:
this._bytesPerPacket = 2304;
break;
case 44100:
this._bytesPerPacket = 4608;
break;
}
}
public byte[] getSamples(){
byte[] rawBytes = null;
/** Go to the correct packet */
while (this._inputContainer.readNextPacket(this._packet) >= 0){
//System.out.println(this._packet.getDuration());
/** Once we have a packet, let's see if it belongs to the audio stream */
if (this._packet.getStreamIndex() == this._iAudioStreamId){
IAudioSamples samples = IAudioSamples.make(this._iSampleBufferSize, this._audioCoder.getChannels());
// System.out.println(">> " + samples.toString());
/** Because a packet can contain multiple set of samples (frames of samples). We may need to call
* decode audio multiple times at different offsets in the packet's data */
int iCurrentOffset = 0;
while(iCurrentOffset < this._packet.getSize()){
int iBytesDecoded = this._audioCoder.decodeAudio(samples, this._packet, iCurrentOffset);
iCurrentOffset += iBytesDecoded;
if (samples.isComplete()){
rawBytes = samples.getData().getByteArray(0, samples.getSize());
//playJavaSound(samples);
}
}
return rawBytes;
}
else{
/** Otherwise drop it */
do{}while(false);
}
}
return rawBytes; /** This will return null at this point */
}
}
Use this class to get the raw data out of an MP3 file and feed it to your spectrum/waveform drawer.
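For example, a usage sketch could look like the following; drawWaveformChunk is a placeholder for whatever drawing code you end up with, and the checked UnsuportedSampleRateException from the constructor still has to be handled:
// keep pulling decoded chunks until the stream runs out and hand
// each one to your waveform renderer
Mp3FileXuggler decoder = new Mp3FileXuggler("song.mp3"); // may throw UnsuportedSampleRateException
byte[] chunk;
while ((chunk = decoder.getSamples()) != null) {
    // e.g. compute a min/max or RMS value per block of bytes and plot it
    drawWaveformChunk(chunk);
}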
I went over Java's tutorial on sounds but somehow it is way too complex for a beginner.
It is here
My aim is this:
Detect all the audio input and output devices
let the user select an audio input device
capture what the user says
output it to the default audio output device
Now how do I go about doing that?
Is there a better tutorial available?
What I have tried:
import javax.sound.sampled.*;
public class SoundTrial {
public static void main(String[] args) {
Mixer.Info[] info = AudioSystem.getMixerInfo();
int i =0;
for(Mixer.Info print : info){
System.out.println("Name: "+ i + " " + print.getName());
i++;
}
}
}
output:
Name: 0 Primary Sound Driver
Name: 1 Speakers and Headphones (IDT High Definition Audio CODEC)
Name: 2 Independent Headphones (IDT High Definition Audio CODEC
Name: 3 SPDIF (Digital Out via HP Dock) (IDT High Definition Audio CODEC)
Name: 4 Primary Sound Capture Driver
Name: 5 Integrated Microphone Array (ID
Name: 6 Microphone (IDT High Definition
Name: 7 Stereo Mix (IDT High Definition
Name: 8 Port Speakers and Headphones (IDT Hi
Name: 9 Port SPDIF (Digital Out via HP Dock)
Name: 10 Port Integrated Microphone Array (ID
Name: 11 Port Microphone (IDT High Definition
Name: 12 Port Stereo Mix (IDT High Definition
Name: 13 Port Independent Headphones (IDT Hi
This code may help you. Note that it has been taken from this link: Audio Video. I found it using a Google search; I'm just posting the code here in case the link becomes outdated.
import javax.media.*;
import javax.media.format.*;
import javax.media.protocol.*;
import java.util.*;
/*******************************************************************************
* A simple application to allow users to capture audio or video through devices
* connected to the PC. Via command-line arguments the user specifies whether
* audio (-a) or video (-v) capture, the duration of the capture (-d) in
* seconds, and the file to write the media to (-f).
*
* The application would be far more useful and versatile if it provided control
* over the formats of the audio and video captured as well as the content type
* of the output.
*
* The class searches for capture devices that support the particular default
* track formats: linear for audio and Cinepak for video. As a fall-back two
* device names are hard-coded into the application as an example of how to
* obtain DeviceInfo when a device's name is known. The user may force the
* application to use these names by using the -k (known devices) flag.
*
* The class is static but employs the earlier Location2Location example to
* perform all the Processor and DataSink related work. Thus the application
* chiefly involves CaptureDevice related operations.
*
* @author Michael (Spike) Barlow
******************************************************************************/
public class SimpleRecorder {
/////////////////////////////////////////////////////////////
// Names for the audio and video capture devices on the
// author's system. These will vary system to system but are
// only used as a fallback.
/////////////////////////////////////////////////////////////
private static final String AUDIO_DEVICE_NAME = "DirectSoundCapture";
private static final String VIDEO_DEVICE_NAME = "vfw:Microsoft WDM Image Capture:0";
///////////////////////////////////////////////////////////
// Default names for the files to write the output to for
// the case where they are not supplied by the user.
//////////////////////////////////////////////////////////
private static final String DEFAULT_AUDIO_NAME = "file://./captured.wav";
private static final String DEFAULT_VIDEO_NAME = "file://./captured.avi";
///////////////////////////////////////////
// Type of capture requested by the user.
//////////////////////////////////////////
private static final String AUDIO = "audio";
private static final String VIDEO = "video";
private static final String BOTH = "audio and video";
////////////////////////////////////////////////////////////////////
// The only audio and video formats that the particular application
// supports. A better program would allow user selection of formats
// but would grow past the small example size.
////////////////////////////////////////////////////////////////////
private static final Format AUDIO_FORMAT = new AudioFormat(
AudioFormat.LINEAR);
private static final Format VIDEO_FORMAT = new VideoFormat(
VideoFormat.CINEPAK);
public static void main(String[] args) {
//////////////////////////////////////////////////////
// Object to handle the processing and sinking of the
// data captured from the device.
//////////////////////////////////////////////////////
Location2Location capture;
/////////////////////////////////////
// Audio and video capture devices.
////////////////////////////////////
CaptureDeviceInfo audioDevice = null;
CaptureDeviceInfo videoDevice = null;
/////////////////////////////////////////////////////////////
// Capture device's "location" plus the name and location of
// the destination.
/////////////////////////////////////////////////////////////
MediaLocator captureLocation = null;
MediaLocator destinationLocation;
String destinationName = null;
////////////////////////////////////////////////////////////
// Formats the Processor (in Location2Location) must match.
////////////////////////////////////////////////////////////
Format[] formats = new Format[1];
///////////////////////////////////////////////
// Content type for an audio or video capture.
//////////////////////////////////////////////
ContentDescriptor audioContainer = new ContentDescriptor(
FileTypeDescriptor.WAVE);
ContentDescriptor videoContainer = new ContentDescriptor(
FileTypeDescriptor.MSVIDEO);
ContentDescriptor container = null;
////////////////////////////////////////////////////////////////////
// Duration of recording (in seconds) and period to wait afterwards
///////////////////////////////////////////////////////////////////
double duration = 10;
int waitFor = 0;
//////////////////////////
// Audio or video capture?
//////////////////////////
String selected = AUDIO;
////////////////////////////////////////////////////////
// All devices that support the format in question.
// A means of "ensuring" the program works on different
// machines with different capture devices.
////////////////////////////////////////////////////////
Vector devices;
//////////////////////////////////////////////////////////
// Whether to search for capture devices that support the
// format or use the devices whose names are already
// known to the application.
//////////////////////////////////////////////////////////
boolean useKnownDevices = false;
/////////////////////////////////////////////////////////
// Process the command-line options as to audio or video,
// duration, and file to save to.
/////////////////////////////////////////////////////////
for (int i = 0; i < args.length; i++) {
// ... command-line option parsing (omitted in the original post) ...
}
/////////////////////////////////////////////////////////////////
// Perform setup for audio capture. Includes finding a suitable
// device, obtaining its MediaLocator and setting the content
// type.
/////////////////////////////////////////////////////////////////
if (selected.equals(AUDIO)) {
devices = CaptureDeviceManager.getDeviceList(AUDIO_FORMAT);
if (devices.size() > 0 && !useKnownDevices) {
audioDevice = (CaptureDeviceInfo) devices.elementAt(0);
} else
audioDevice = CaptureDeviceManager.getDevice(AUDIO_DEVICE_NAME);
if (audioDevice == null) {
System.out.println("Can't find suitable audio device. Exiting");
System.exit(1);
}
captureLocation = audioDevice.getLocator();
formats[0] = AUDIO_FORMAT;
if (destinationName == null)
destinationName = DEFAULT_AUDIO_NAME;
container = audioContainer;
}
/////////////////////////////////////////////////////////////////
// Perform setup for video capture. Includes finding a suitable
// device, obtaining its MediaLocator and setting the content
// type.
////////////////////////////////////////////////////////////////
else if (selected.equals(VIDEO)) {
devices = CaptureDeviceManager.getDeviceList(VIDEO_FORMAT);
if (devices.size() > 0 && !useKnownDevices)
videoDevice = (CaptureDeviceInfo) devices.elementAt(0);
else
videoDevice = CaptureDeviceManager.getDevice(VIDEO_DEVICE_NAME);
if (videoDevice == null) {
System.out.println("Can't find suitable video device. Exiting");
System.exit(1);
}
captureLocation = videoDevice.getLocator();
formats[0] = VIDEO_FORMAT;
if (destinationName == null)
destinationName = DEFAULT_VIDEO_NAME;
container = videoContainer;
} else if (selected.equals(BOTH)) {
captureLocation = null;
formats = new Format[2];
formats[0] = AUDIO_FORMAT;
formats[1] = VIDEO_FORMAT;
container = videoContainer;
if (destinationName == null)
destinationName = DEFAULT_VIDEO_NAME;
}
////////////////////////////////////////////////////////////////////
// Perform all the necessary Processor and DataSink preparation via
// the Location2Location class.
////////////////////////////////////////////////////////////////////
destinationLocation = new MediaLocator(destinationName);
System.out.println("Configuring for capture. Please wait.");
capture = new Location2Location(captureLocation, destinationLocation,
formats, container, 1.0);
/////////////////////////////////////////////////////////////////////////////
// Start the recording and tell the user. Specify the length of the
// recording. Then wait around for up to 4-times the duration of
// recording
// (can take longer to sink/write the data so should wait a bit in case).
/////////////////////////////////////////////////////////////////////////////
System.out.println("Started recording " + duration + " seconds of "
+ selected + " ...");
capture.setStopTime(new Time(duration));
if (waitFor == 0)
waitFor = (int) (4000 * duration);
else
waitFor *= 1000;
int waited = capture.transfer(waitFor);
/////////////////////////////////////////////////////////
// Report on the success (or otherwise) of the recording.
/////////////////////////////////////////////////////////
int state = capture.getState();
if (state == Location2Location.FINISHED)
System.out.println(selected
+ " capture successful in approximately "
+ ((int) ((waited + 500) / 1000))
+ " seconds. Data written to " + destinationName);
else if (state == Location2Location.FAILED)
System.out.println(selected
+ " capture failed after approximately "
+ ((int) ((waited + 500) / 1000)) + " seconds");
else {
System.out.println(selected
+ " capture still ongoing after approximately "
+ ((int) ((waited + 500) / 1000)) + " seconds");
System.out.println("Process likely to have failed");
}
System.exit(0);
}
}
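If you would rather stay with javax.sound.sampled than JMF, a minimal loop-back sketch along these lines should cover capturing from a selected device and playing it back on the default output. The mixer index (5 below) is only an assumption; replace it with whatever entry your enumeration code printed for the microphone.
import javax.sound.sampled.*;
public class LoopBack {
    public static void main(String[] args) throws Exception {
        // 16-bit mono PCM; adjust to taste
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        // assumption: index 5 is the microphone entry from your own device listing
        Mixer mixer = AudioSystem.getMixer(AudioSystem.getMixerInfo()[5]);
        TargetDataLine in = (TargetDataLine) mixer.getLine(
                new DataLine.Info(TargetDataLine.class, format));
        SourceDataLine out = (SourceDataLine) AudioSystem.getLine(
                new DataLine.Info(SourceDataLine.class, format));
        in.open(format);
        out.open(format);
        in.start();
        out.start();
        byte[] buffer = new byte[4096];
        long end = System.currentTimeMillis() + 10_000; // run for about 10 seconds
        while (System.currentTimeMillis() < end) {
            int read = in.read(buffer, 0, buffer.length);
            if (read > 0)
                out.write(buffer, 0, read); // echo the captured audio to the default output
        }
        in.stop();  in.close();
        out.stop(); out.close();
    }
}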
I came across this snippet while going through the tutorial on how to decode a video :
private static long millisecondsUntilTimeToDisplay(IVideoPicture picture)
{
/**
* We could just display the images as quickly as we decode them, but it turns
* out we can decode a lot faster than you think.
*
* So instead, the following code does a poor-man's version of trying to
* match up the frame-rate requested for each IVideoPicture with the system
* clock time on your computer.
*
* Remember that all Xuggler IAudioSamples and IVideoPicture objects always
* give timestamps in Microseconds, relative to the first decoded item. If
* instead you used the packet timestamps, they can be in different units depending
* on your IContainer, and IStream and things can get hairy quickly.
*/
long millisecondsToSleep = 0;
if (mFirstVideoTimestampInStream == Global.NO_PTS)
{
// This is our first time through
mFirstVideoTimestampInStream = picture.getTimeStamp();
// get the starting clock time so we can hold up frames
// until the right time.
mSystemVideoClockStartTime = System.currentTimeMillis();
millisecondsToSleep = 0;
} else {
long systemClockCurrentTime = System.currentTimeMillis();
long millisecondsClockTimeSinceStartofVideo = systemClockCurrentTime - mSystemVideoClockStartTime;
// compute how long for this frame since the first frame in the stream.
// remember that IVideoPicture and IAudioSamples timestamps are always in MICROSECONDS,
// so we divide by 1000 to get milliseconds.
long millisecondsStreamTimeSinceStartOfVideo = (picture.getTimeStamp() - mFirstVideoTimestampInStream)/1000;
final long millisecondsTolerance = 50; // and we give ourselves 50 ms of tolerance
millisecondsToSleep = (millisecondsStreamTimeSinceStartOfVideo -
(millisecondsClockTimeSinceStartofVideo+millisecondsTolerance));
}
return millisecondsToSleep;
}
I have scratched my head a lot but don't understand what this method does. What are we returning? And why are we making the thread sleep after the method returns? (What is the purpose of the method?)
This is the complete code in the link :
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import com.xuggle.xuggler.demos.*;
import com.xuggle.xuggler.Global;
import com.xuggle.xuggler.IAudioSamples;
import com.xuggle.xuggler.IContainer;
import com.xuggle.xuggler.IPacket;
import com.xuggle.xuggler.IPixelFormat;
import com.xuggle.xuggler.IStream;
import com.xuggle.xuggler.IStreamCoder;
import com.xuggle.xuggler.ICodec;
import com.xuggle.xuggler.IVideoPicture;
import com.xuggle.xuggler.IVideoResampler;
import com.xuggle.xuggler.Utils;
public class DecodeAndPlayAudioAndVideo
{
/**
* The audio line we'll output sound to; it'll be the default audio device on your system if available
*/
private static SourceDataLine mLine;
/**
* The window we'll draw the video on.
*
*/
private static VideoImage mScreen = null;
private static long mSystemVideoClockStartTime;
private static long mFirstVideoTimestampInStream;
/**
* Takes a media container (file) as the first argument, opens it,
* plays audio as quickly as it can, and opens up a Swing window and displays
* video frames with <i>roughly</i> the right timing.
*
* @param args Must contain one string which represents a filename
*/
@SuppressWarnings("deprecation")
public static void main(String[] args)
{
if (args.length <= 0)
throw new IllegalArgumentException("must pass in a filename as the first argument");
String filename = args[0];
// Let's make sure that we can actually convert video pixel formats.
if (!IVideoResampler.isSupported(IVideoResampler.Feature.FEATURE_COLORSPACECONVERSION))
throw new RuntimeException("you must install the GPL version of Xuggler (with IVideoResampler support) for this demo to work");
// Create a Xuggler container object
IContainer container = IContainer.make();
// Open up the container
if (container.open(filename, IContainer.Type.READ, null) < 0)
throw new IllegalArgumentException("could not open file: " + filename);
// query how many streams the call to open found
int numStreams = container.getNumStreams();
// and iterate through the streams to find the first audio stream
int videoStreamId = -1;
IStreamCoder videoCoder = null;
int audioStreamId = -1;
IStreamCoder audioCoder = null;
for(int i = 0; i < numStreams; i++)
{
// Find the stream object
IStream stream = container.getStream(i);
// Get the pre-configured decoder that can decode this stream;
IStreamCoder coder = stream.getStreamCoder();
if (videoStreamId == -1 && coder.getCodecType() == ICodec.Type.CODEC_TYPE_VIDEO)
{
videoStreamId = i;
videoCoder = coder;
}
else if (audioStreamId == -1 && coder.getCodecType() == ICodec.Type.CODEC_TYPE_AUDIO)
{
audioStreamId = i;
audioCoder = coder;
}
}
if (videoStreamId == -1 && audioStreamId == -1)
throw new RuntimeException("could not find audio or video stream in container: "+filename);
/*
* Check if we have a video stream in this file. If so let's open up our decoder so it can
* do work.
*/
IVideoResampler resampler = null;
if (videoCoder != null)
{
if(videoCoder.open() < 0)
throw new RuntimeException("could not open audio decoder for container: "+filename);
if (videoCoder.getPixelType() != IPixelFormat.Type.BGR24)
{
// if this stream is not in BGR24, we're going to need to
// convert it. The VideoResampler does that for us.
resampler = IVideoResampler.make(videoCoder.getWidth(), videoCoder.getHeight(), IPixelFormat.Type.BGR24,
videoCoder.getWidth(), videoCoder.getHeight(), videoCoder.getPixelType());
if (resampler == null)
throw new RuntimeException("could not create color space resampler for: " + filename);
}
/*
* And once we have that, we draw a window on screen
*/
openJavaVideo();
}
if (audioCoder != null)
{
if (audioCoder.open() < 0)
throw new RuntimeException("could not open audio decoder for container: "+filename);
/*
* And once we have that, we ask the Java Sound System to get itself ready.
*/
try
{
openJavaSound(audioCoder);
}
catch (LineUnavailableException ex)
{
throw new RuntimeException("unable to open sound device on your system when playing back container: "+filename);
}
}
/*
* Now, we start walking through the container looking at each packet.
*/
IPacket packet = IPacket.make();
mFirstVideoTimestampInStream = Global.NO_PTS;
mSystemVideoClockStartTime = 0;
while(container.readNextPacket(packet) >= 0)
{
/*
* Now we have a packet, let's see if it belongs to our video stream
*/
if (packet.getStreamIndex() == videoStreamId)
{
/*
* We allocate a new picture to get the data out of Xuggler
*/
IVideoPicture picture = IVideoPicture.make(videoCoder.getPixelType(),
videoCoder.getWidth(), videoCoder.getHeight());
/*
* Now, we decode the video, checking for any errors.
*
*/
int bytesDecoded = videoCoder.decodeVideo(picture, packet, 0);
if (bytesDecoded < 0)
throw new RuntimeException("got error decoding audio in: " + filename);
/*
* Some decoders will consume data in a packet, but will not be able to construct
* a full video picture yet. Therefore you should always check if you
* got a complete picture from the decoder
*/
if (picture.isComplete())
{
IVideoPicture newPic = picture;
/*
* If the resampler is not null, that means we didn't get the video in BGR24 format and
* need to convert it into BGR24 format.
*/
if (resampler != null)
{
// we must resample
newPic = IVideoPicture.make(resampler.getOutputPixelFormat(), picture.getWidth(), picture.getHeight());
if (resampler.resample(newPic, picture) < 0)
throw new RuntimeException("could not resample video from: " + filename);
}
if (newPic.getPixelType() != IPixelFormat.Type.BGR24)
throw new RuntimeException("could not decode video as BGR 24 bit data in: " + filename);
long delay = millisecondsUntilTimeToDisplay(newPic);
// if there is no audio stream; go ahead and hold up the main thread. We'll end
// up caching fewer video pictures in memory that way.
try
{
if (delay > 0)
Thread.sleep(delay);
}
catch (InterruptedException e)
{
return;
}
// And finally, convert the picture to an image and display it
mScreen.setImage(Utils.videoPictureToImage(newPic));
}
}
else if (packet.getStreamIndex() == audioStreamId)
{
/*
* We allocate a set of samples with the same number of channels as the
* coder tells us is in this buffer.
*
* We also pass in a buffer size (1024 in our example), although Xuggler
* will probably allocate more space than just the 1024 (it's not important why).
*/
IAudioSamples samples = IAudioSamples.make(1024, audioCoder.getChannels());
/*
* A packet can actually contain multiple sets of samples (or frames of samples
* in audio-decoding speak). So, we may need to call decode audio multiple
* times at different offsets in the packet's data. We capture that here.
*/
int offset = 0;
/*
* Keep going until we've processed all data
*/
while(offset < packet.getSize())
{
int bytesDecoded = audioCoder.decodeAudio(samples, packet, offset);
if (bytesDecoded < 0)
throw new RuntimeException("got error decoding audio in: " + filename);
offset += bytesDecoded;
/*
* Some decoders will consume data in a packet, but will not be able to construct
* a full set of samples yet. Therefore you should always check if you
* got a complete set of samples from the decoder
*/
if (samples.isComplete())
{
// note: this call will block if Java's sound buffers fill up, and we're
// okay with that. That's why we have the video "sleeping" occur
// on another thread.
playJavaSound(samples);
}
}
}
else
{
/*
* This packet isn't part of our video stream, so we just silently drop it.
*/
do {} while(false);
}
}
/*
* Technically since we're exiting anyway, these will be cleaned up by
* the garbage collector... but because we're nice people and want
* to be invited places for Christmas, we're going to show how to clean up.
*/
if (videoCoder != null)
{
videoCoder.close();
videoCoder = null;
}
if (audioCoder != null)
{
audioCoder.close();
audioCoder = null;
}
if (container !=null)
{
container.close();
container = null;
}
closeJavaSound();
closeJavaVideo();
}
What does the following method do?
private static long millisecondsUntilTimeToDisplay(IVideoPicture picture)
{
/**
* We could just display the images as quickly as we decode them, but it turns
* out we can decode a lot faster than you think.
*
* So instead, the following code does a poor-man's version of trying to
* match up the frame-rate requested for each IVideoPicture with the system
* clock time on your computer.
*
* Remember that all Xuggler IAudioSamples and IVideoPicture objects always
* give timestamps in Microseconds, relative to the first decoded item. If
* instead you used the packet timestamps, they can be in different units depending
* on your IContainer, and IStream and things can get hairy quickly.
*/
long millisecondsToSleep = 0;
if (mFirstVideoTimestampInStream == Global.NO_PTS)
{
// This is our first time through
mFirstVideoTimestampInStream = picture.getTimeStamp();
// get the starting clock time so we can hold up frames
// until the right time.
mSystemVideoClockStartTime = System.currentTimeMillis();
millisecondsToSleep = 0;
} else {
long systemClockCurrentTime = System.currentTimeMillis();
long millisecondsClockTimeSinceStartofVideo = systemClockCurrentTime - mSystemVideoClockStartTime;
// compute how long for this frame since the first frame in the stream.
// remember that IVideoPicture and IAudioSamples timestamps are always in MICROSECONDS,
// so we divide by 1000 to get milliseconds.
long millisecondsStreamTimeSinceStartOfVideo = (picture.getTimeStamp() - mFirstVideoTimestampInStream)/1000;
final long millisecondsTolerance = 50; // and we give ourselves 50 ms of tolerance
millisecondsToSleep = (millisecondsStreamTimeSinceStartOfVideo -
(millisecondsClockTimeSinceStartofVideo+millisecondsTolerance));
}
return millisecondsToSleep;
}
/**
* Opens a Swing window on screen.
*/
private static void openJavaVideo()
{
mScreen = new VideoImage();
}
/**
* Forces the swing thread to terminate; I'm sure there is a right
* way to do this in swing, but this works too.
*/
private static void closeJavaVideo()
{
System.exit(0);
}
private static void openJavaSound(IStreamCoder aAudioCoder) throws LineUnavailableException
{
AudioFormat audioFormat = new AudioFormat(aAudioCoder.getSampleRate(),
(int)IAudioSamples.findSampleBitDepth(aAudioCoder.getSampleFormat()),
aAudioCoder.getChannels(),
true, /* xuggler defaults to signed 16 bit samples */
false);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
mLine = (SourceDataLine) AudioSystem.getLine(info);
/**
* if that succeeded, try opening the line.
*/
mLine.open(audioFormat);
/**
* And if that succeed, start the line.
*/
mLine.start();
}
private static void playJavaSound(IAudioSamples aSamples)
{
/**
* We're just going to dump all the samples into the line.
*/
byte[] rawBytes = aSamples.getData().getByteArray(0, aSamples.getSize());
mLine.write(rawBytes, 0, aSamples.getSize());
}
private static void closeJavaSound()
{
if (mLine != null)
{
/*
* Wait for the line to finish playing
*/
mLine.drain();
/*
* Close the line.
*/
mLine.close();
mLine=null;
}
}
}
Rough algorithm in pseudocode:
Is this the first frame?
> Yes: save the frame's timestamp and the current system clock time.
> No: do the following:
See how much wall-clock time has passed since the first frame was displayed.
See how far the current frame's timestamp is ahead of the first frame's timestamp (stream time).
If the stream time is ahead of the wall-clock time, return that many milliseconds to sleep for; otherwise return zero (or a negative value, which the caller ignores).
So, what you then get is the overall algorithm of:
Decode frame
Check if we need to delay the frame (the method in question)
Delay
Display frame
In this way, the program will never display frames faster than the variable frame rate declared by the video. The method in question maintains the state of previous frame times and calculates how long to sleep for.
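Condensed into code, those four steps map onto the demo's main loop roughly like this (resampling, audio handling and error checks stripped out; Thread.sleep's InterruptedException would still need handling):
while (container.readNextPacket(packet) >= 0) {
    if (packet.getStreamIndex() == videoStreamId) {
        IVideoPicture picture = IVideoPicture.make(videoCoder.getPixelType(),
                videoCoder.getWidth(), videoCoder.getHeight());
        videoCoder.decodeVideo(picture, packet, 0);               // 1. decode frame
        if (picture.isComplete()) {
            long delay = millisecondsUntilTimeToDisplay(picture); // 2. check if we need to delay
            if (delay > 0)
                Thread.sleep(delay);                              // 3. delay
            mScreen.setImage(Utils.videoPictureToImage(picture)); // 4. display frame
        }
    }
}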
EDIT: The delay is needed because you can decode frames (much!) faster than the video's frame rate. Let's say you have a fairly slow machine running this program, and it takes 10ms to decode a frame. Let's also say that you have video that has a variable frame rate, but is roughly 10 frames per second (or 100ms per frame). Now if you take this step out of our 'overall algorithm':
Decode frame (10ms)
Display frame (1ms)
Decode frame (10ms)
Display frame (1ms)
If this were happening, you would display a frame roughly every 11 ms (decode plus display), meaning the video would play back at around 90 frames per second instead of the intended 10, which is wrong!
EDIT2: I guess what you're asking is why don't we do this?
Decode frame
Frame Delta = Current Frame Time - Previous Frame Time
Delay (for Delta milliseconds)
Display frame
The problem with this is: what happens if it takes a long time to decode or display a frame? That would make the playback frame rate significantly slower than the frame rate stored in the file.
Instead, this algorithm syncs the first frame to the system time, then does a little bit of extra calculation:
long systemTimeChange = currentSystemTime - firstFrameSystemTime;
long frameTimeChange = currentFrameTime - firstFrameTime;
// Subtract the time elapsed.
long differenceInChanges = frameTimeChange - systemTimeChange;
if(differenceInChanges > 0) {
// It was faster to decode than the frame rate!
Thread.sleep(differenceInChanges);
}
The system time here denotes the wall-clock time at which the particular frame has been decoded, while the frame timestamp reflects where the frame sits in the video's own timeline (and hence its frame rate). So the difference works out to:
millisecondsToSleep = streamTimeSinceStart - (clockTimeSinceStart + tolerance)
The tolerance is there for when decoding or displaying a frame takes a little longer than expected. If the result is positive, decoding has run ahead of the video's frame rate, so we have to wait and not display that frame right away; the saved system clock start time is what keeps the frames in sync and holds each one back until the right moment.
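To put made-up numbers on the same calculation:
// Worked example with made-up numbers:
// this frame is stamped 500 ms after the first frame (stream time),
// but only 300 ms of wall-clock time have passed since the first frame.
long frameTimeChange  = 500; // milliseconds of stream time
long systemTimeChange = 300; // milliseconds of wall-clock time
long tolerance        = 50;  // the demo's fixed tolerance
long sleepFor = frameTimeChange - (systemTimeChange + tolerance); // = 150 ms
// decoding ran ahead of the video's frame rate, so hold this frame for ~150 ms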
It seems to come down to this comment from the code:
* So instead, the following code does a poor-man's version of trying to
* match up the frame-rate requested for each IVideoPicture with the system
* clock time on your computer.
The delay is there to try to match the video's frame rate.