I would like to use some features of TarsosDSP on sound data. The incoming data is stereo, but Tarsos only supports mono, so I tried to convert it to mono as follows. The result, however, still sounds like stereo data interpreted as mono, i.e. the conversion via MultichannelToMono doesn't seem to have any effect, although its implementation looks fine at a quick glance.
@Test
public void testPlayStereoFile() throws IOException, UnsupportedAudioFileException, LineUnavailableException {
    AudioDispatcher dispatcher = AudioDispatcherFactory.fromFile(FILE, 4096, 0);
    dispatcher.addAudioProcessor(new MultichannelToMono(dispatcher.getFormat().getChannels(), false));
    dispatcher.addAudioProcessor(new AudioPlayer(dispatcher.getFormat()));
    dispatcher.run();
}
Is there anything I am doing wrong here? Why does the MultichannelToMono processor not convert the data to mono?
The only way I found that works is to use the Java Sound API to perform this conversion before sending the data to TarsosDSP; it seems TarsosDSP does not adjust the frame size correctly.
I found the following snippet at https://www.experts-exchange.com/questions/26925195/java-stereo-to-mono-conversion-unsupported-conversion-error.html which I use to convert to mono before applying more advanced audio transformations with TarsosDSP.
public static AudioInputStream convertToMono(AudioInputStream sourceStream) {
    AudioFormat sourceFormat = sourceStream.getFormat();
    // is already mono?
    if (sourceFormat.getChannels() == 1) {
        return sourceStream;
    }
    AudioFormat targetFormat = new AudioFormat(
            sourceFormat.getEncoding(),
            sourceFormat.getSampleRate(),
            sourceFormat.getSampleSizeInBits(),
            1,
            // this is the important bit, the framesize needs to change as well,
            // for framesize 4, this calculation leads to new framesize 2
            (sourceFormat.getSampleSizeInBits() + 7) / 8,
            sourceFormat.getFrameRate(),
            sourceFormat.isBigEndian());
    return AudioSystem.getAudioInputStream(targetFormat, sourceStream);
}
I am creating an object that can play synthesised audio in Java, but I need to be able to set its AudioFormat to the one with the highest possible audio bitrate the operating system can play.
(The synth generates 64-bit float audio and can bit-crush it to 32-bit float or to 24-bit, 16-bit and 8-bit PCM audio.)
I will need to filter all of the operating system's valid AudioFormats and pick the format with the highest bitrate the system can use.
How can I get the appropriate array of all the AudioFormats that the system can play without error?
public class AudioSettings {

    // instance variables
    private int sampleRate;
    private AudioFormat audioFormat;
    private SourceDataLine sourceDataLine;

    public AudioSettings(int sampleRate) {
        this.sampleRate = sampleRate;
        // get highest possible quality bitrate for system
        int highestBitRate = 16;
        AudioFormat currentFormat = new AudioFormat(new Encoding("PCM_SIGNED"), (float) sampleRate, highestBitRate,
                2, highestBitRate / 8 * 2, sampleRate, true);
        for (AudioFormat format : /* What goes here? */) {
            if (format.getSampleSizeInBits() > highestBitRate
                    && format.isBigEndian()
                    && format.getChannels() == 2) {
                currentFormat = format;
                highestBitRate = format.getSampleSizeInBits();
            }
        }
        audioFormat = currentFormat;
    }
}
According to this document from the Java 8 days, Java Sound Technology, Java supports a maximum of 16-bit encoding and a highest sample rate of 48 kHz.
I don't know if there's been any advancement since then. There must be a specification for Java 17, for example, where the supported formats are listed.
As far as querying the system for supported file types goes, there is a mention of it in the tutorial Using File and Format Converters, in the last section: Learning What Conversions Are Available.
A related AudioSystem method, getAudioFileTypes(AudioInputStream), returns the complete list of supported file types for the given stream, as an array of AudioFileFormat.Type instances.
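For example, querying the supported file types for a given stream looks like this (a minimal sketch; "some.wav" is just a placeholder file name):
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.File;

public class ListFileTypes {

    public static void main(String[] args) throws Exception {
        // Open any audio file and ask the AudioSystem which file types
        // the stream could be written to.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File("some.wav"))) {
            for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes(in)) {
                System.out.println(type + " (extension: " + type.getExtension() + ")");
            }
        }
    }
}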
Thanks to @gpasch I found my answer from his link, although I think you only need to read one instance of the Line.Info[] array, because it seems to print out three groups that are exactly the same.
public static void main(String[] args) {
    Line.Info desired = new Line.Info(SourceDataLine.class);
    Line.Info[] infos = AudioSystem.getSourceLineInfo(desired);
    for (Line.Info info : infos) {
        if (info instanceof DataLine.Info) {
            AudioFormat[] forms = ((DataLine.Info) info).getFormats();
            for (AudioFormat format : forms) {
                System.out.println(format);
            }
        }
    }
}
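Putting the two posts together, the /* What goes here? */ placeholder in the constructor above can be filled with the formats reported for a SourceDataLine. A small sketch (collecting them into a list first is my own addition, not part of the original code):
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Line;
import javax.sound.sampled.SourceDataLine;
import java.util.ArrayList;
import java.util.List;

public class SupportedFormats {

    // Collects every AudioFormat reported for any SourceDataLine on the system.
    public static List<AudioFormat> sourceLineFormats() {
        List<AudioFormat> result = new ArrayList<>();
        for (Line.Info info : AudioSystem.getSourceLineInfo(new Line.Info(SourceDataLine.class))) {
            if (info instanceof DataLine.Info) {
                for (AudioFormat format : ((DataLine.Info) info).getFormats()) {
                    result.add(format);
                }
            }
        }
        return result;
    }
}
One caveat: many of the reported formats use AudioSystem.NOT_SPECIFIED (-1) for the sample rate and frame rate, so the selection logic may need to account for that.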
I am trying to play a buffer of audio using Java on Linux.
I am getting the following exception when attempting to open the line (not when I write the audio to it)...
Exception in thread "main" java.lang.IllegalArgumentException: No line matching interface SourceDataLine supporting format PCM_FLOAT 44100.0 Hz, 16 bit, mono, 2 bytes/frame, is supported.
public boolean open()
{
    try {
        int smpSizeInBits = bytesPerSmp * 8;
        int frameSize = bytesPerSmp * channels; // just an FYI, frameSize does not always == bytesPerSmp * channels for non-PCM encodings
        int frameRate = (int) smpRate;          // again, this might not be the case for non-PCM encodings
        boolean isBigEndian = false;

        AudioFormat af = new AudioFormat(AudioFormat.Encoding.PCM_FLOAT, smpRate, smpSizeInBits, channels, frameSize, frameRate, isBigEndian);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, af);
        int bufferSizeInBytes = bufferSizeInFrames * channels * bytesPerSmp;

        line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(af, bufferSizeInBytes);
        open = true;
    }
    catch (LineUnavailableException e) {
        System.out.println("PcmFloatPlayer: Unable to open, line unavailable.");
    }
    return open;
}
I am wondering if my assumptions about what PCM_FLOAT encoding is, are actually incorrect.
I have some code that reads in a wav file. The wav file is mono, 16-bit, uncompressed. I then convert the audio to floats in the range of -1.0 to 1.0 for processing.
I assumed the PCM_FLOAT encoding is just raw PCM data that has been converted to float values between -1.0 and 1.0. Is this correct?
I then assumed that the SourceDataLine would convert the float audio to the appropriate format based on my passed format info (mono, 16-bit, 2 bytes/frame). Again, is this assumption incorrect?
Must I convert my float -1.0 to 1.0 audio back to my desired output format, and set the SourceDataLine to PCM_SIGNED (assuming that is my desired format)?
EDIT:
In addition, when I call AudioSystem.getTargetEncodings() with PCM_FLOAT, it returns three encodings. Does that mean that it will accept PCM_FLOAT and be capable of converting to the returned encodings, depending on what the underlying audio system supports?
AudioFormat.Encoding[] encodings = AudioSystem.getTargetEncodings(AudioFormat.Encoding.PCM_FLOAT);
for (AudioFormat.Encoding e : encodings)
    System.out.println(e);
results in...
PCM_SIGNED
PCM_UNSIGNED
PCM_FLOAT
I don't know that I'll be able to answer your direct questions. But maybe the code I can show you, which I know works (including on Linux), will help you arrive at a workable solution. I have programs that generate audio signals via incoming cues, but also custom-made Synths, and I do all the mixing and effects with PCM floats in the range -1 to 1. To output, I convert the floats to a standard "CD Quality" format that Java supports.
Here is the format I use for the outputting SourceDataLine:
new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 2, 4, 44100, false);
You'll probably want to make this mono instead of stereo. But I should say, it seems to me that if you are able to read an incoming wav file with a different format, you should be able to play back that same format, assuming you reverse all the steps taken to convert the incoming data to PCM.
For the standard "CD Quality" format, to go from signed PCM floats to bytes, there is an intermediate step of scaling up to the range of a signed short (-32768 to 32767).
public static byte[] fromBufferToAudioBytes(byte[] audioBytes, float[] buffer)
{
    for (int i = 0, n = buffer.length; i < n; i++)
    {
        buffer[i] *= 32767;                                     // scale -1..1 float to the signed 16-bit range
        audioBytes[i * 2] = (byte) buffer[i];                   // low byte first (little endian)
        audioBytes[i * 2 + 1] = (byte) ((int) buffer[i] >> 8);  // high byte
    }
    return audioBytes;
}
This is taken from the AudioCue library that I wrote and posted on GitHub.
I find it reduces headaches to just deal with the one AudioFormat, to make conversions to that one format with Audacity, and not try to make provisions for multiple formats. But that is just a personal preference, and I don't know whether that strategy would work for your situation or not.
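To connect that conversion back to the question, here is a rough sketch of how I would pair it with a SourceDataLine opened as PCM_SIGNED. The mono format, the buffer handling and the assumption that fromBufferToAudioBytes from the snippet above is in scope are mine, not part of the original code.
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.SourceDataLine;

public class FloatPlayback {

    // Plays a buffer of mono floats in the range -1.0 to 1.0.
    public static void play(float[] monoFloats) throws Exception {
        // Signed 16-bit little-endian PCM, mono, 2 bytes per frame.
        AudioFormat format = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 44100, 16, 1, 2, 44100, false);
        DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);

        SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // Convert the floats to bytes with the method shown above (2 bytes per float sample).
        byte[] bytes = fromBufferToAudioBytes(new byte[monoFloats.length * 2], monoFloats);
        line.write(bytes, 0, bytes.length);

        line.drain();
        line.close();
    }
}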
Hope there is something here that helps!
public class Main {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread2();
        t1.start();

        Thread t2 = new thread3();
        t2.start();

        Thread.sleep(5000);
    }
}
import javax.sound.sampled.*;
import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Thread2 extends Thread implements Runnable {

    @Override
    public void run() {
        playWav("C:/Windows/Media/feel_good_x.wav");
    }

    private static void playWav(String soundFilePath) {
        File sFile = new File(soundFilePath);
        if (!sFile.exists()) {
            String ls = System.lineSeparator();
            System.err.println("not in the directory" +
                    ls + "(" + soundFilePath + ")" + ls);
            return;
        }
        try {
            Clip clip;
            try (AudioInputStream audioInputStream = AudioSystem.
                    getAudioInputStream(sFile.getAbsoluteFile())) {
                clip = AudioSystem.getClip();
                clip.open(audioInputStream);
                clip.setFramePosition(0); // rewind after opening, not before
            }
            clip.start();
        }
        catch (UnsupportedAudioFileException | IOException | LineUnavailableException ex) {
            Logger.getLogger("playWav()").log(Level.SEVERE, null, ex);
        }
    }
}
The situation
I need to show a 200-350 frame animation in my application. The images have roughly 500x300 resolution. If the user wants to share the animation, I have to convert it to video. For the conversion I am using an ffmpeg command.
ffmpeg -y -r 1 -i /sdcard/videokit/pic00%d.jpg -i /sdcard/videokit/in.mp3 -strict experimental -ar 44100 -ac 2 -ab 256k -b 2097152 -ar 22050 -vcodec mpeg4 -b 2097152 -s 320x240 /sdcard/videokit/out.mp4
To convert images to video, ffmpeg wants actual files, not Bitmap or byte[].
Problem
Compressing bitmaps to image files takes too much time. Converting 210 images takes about 1 minute to finish on an average device (HTC One M7). Converting the image files to mp4 takes about 15 seconds on the same device. Altogether the user has to wait about 1.5 minutes.
What I have tried
I changed the compression format from PNG to JPEG (the 1.5 minute result is achieved with JPEG compression at quality=80; with PNG it takes about 2-2.5 minutes) - success.
I tried to find how to pass byte[] or Bitmap to ffmpeg - no success.
QUESTION
Is there any way (library, even native) to make the saving process faster?
Is there any way to pass byte[] or Bitmap objects (i.e. a PNG file decompressed to an Android Bitmap class object) to ffmpeg's video-creating method?
Is there any other working library which will create an mp4 (or any format supported by the main social networks) from byte[] or Bitmap objects in about 30 seconds (for 200 frames)?
You can convert a Bitmap (or byte[]) to YUV format quickly using RenderScript (see https://stackoverflow.com/a/39877029/192373). You can pass these YUV frames to the ffmpeg library (as halfelf suggests), or use the built-in native MediaCodec, which uses dedicated hardware on most devices (but its compression options are less flexible than all-software ffmpeg).
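To make the MediaCodec route a bit more concrete, here is a minimal setup sketch. The resolution, bitrate, frame rate and the "video/avc" codec choice are illustrative values I picked, not taken from the question, and the feed/drain loop plus the MediaMuxer step are only outlined in the trailing comments.
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

import java.io.IOException;

public class EncoderSetup {

    // Creates and starts a hardware H.264 encoder that accepts YUV420 input buffers.
    public static MediaCodec createEncoder(int width, int height) throws IOException {
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
        format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible); // API 21+
        format.setInteger(MediaFormat.KEY_BIT_RATE, 2_000_000); // ~2 Mbit/s
        format.setInteger(MediaFormat.KEY_FRAME_RATE, 25);
        format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);  // one keyframe per second

        MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
        encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        encoder.start();
        return encoder;

        // Next steps (not shown): feed YUV frames with dequeueInputBuffer()/queueInputBuffer(),
        // drain with dequeueOutputBuffer(), and write the encoded buffers to a MediaMuxer
        // to obtain an .mp4 container.
    }
}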
There are two steps that slow us down: compressing the images to PNG/JPG and writing them to disk. Both can be skipped if we code directly against the ffmpeg libs, instead of calling the ffmpeg command. (There are other improvements too, such as GPU encoding and multithreading, but they are much more complicated.)
Some approaches to coding this:
Use only C/C++ with the Android NDK. FFmpeg will happily work there, but I guess it's not an option here.
Build it from scratch with JNI. I don't have much experience here; I only know it can link Java to C/C++ libs.
Use a Java wrapper. Luckily, I found javacpp-presets. (There are others too, but this one is good enough and up to date.)
This library includes a good example ported from dranger's famous ffmpeg tutorial, though it is a demuxing one.
We can try to write a muxing one, following ffmpeg's muxing.c example.
import java.io.*;

import org.bytedeco.javacpp.*;

import static org.bytedeco.javacpp.avcodec.*;
import static org.bytedeco.javacpp.avformat.*;
import static org.bytedeco.javacpp.avutil.*;
import static org.bytedeco.javacpp.swscale.*;

public class Muxer {

    public class OutputStream {
        public AVStream Stream;
        public AVCodecContext Ctx;
        public AVFrame Frame;
        public SwsContext SwsCtx;

        public void setStream(AVStream s) {
            this.Stream = s;
        }

        public AVStream getStream() {
            return this.Stream;
        }

        public void setCodecCtx(AVCodecContext c) {
            this.Ctx = c;
        }

        public AVCodecContext getCodecCtx() {
            return this.Ctx;
        }

        public void setFrame(AVFrame f) {
            this.Frame = f;
        }

        public AVFrame getFrame() {
            return this.Frame;
        }

        public OutputStream() {
            Stream = null;
            Ctx = null;
            Frame = null;
            SwsCtx = null;
        }
    }

    public static void main(String[] args) throws IOException {
        Muxer t = new Muxer();
        OutputStream VideoSt = t.new OutputStream();

        AVOutputFormat Fmt = null;
        AVFormatContext FmtCtx = new AVFormatContext(null);
        AVCodec VideoCodec = null;
        AVDictionary Opt = null;
        SwsContext SwsCtx = null;
        AVPacket Pkt = new AVPacket();
        int GotOutput;
        int InLineSize[] = new int[1];

        String FilePath = "/path/xxx.mp4";

        avformat_alloc_output_context2(FmtCtx, null, null, FilePath);
        Fmt = FmtCtx.oformat();

        AVCodec codec = avcodec_find_encoder_by_name("libx264");
        av_format_set_video_codec(FmtCtx, codec);

        VideoCodec = avcodec_find_encoder(Fmt.video_codec());
        VideoSt.setStream(avformat_new_stream(FmtCtx, null));
        AVStream stream = VideoSt.getStream();
        VideoSt.getStream().id(FmtCtx.nb_streams() - 1);
        VideoSt.setCodecCtx(avcodec_alloc_context3(VideoCodec));

        VideoSt.getCodecCtx().codec_id(Fmt.video_codec());
        VideoSt.getCodecCtx().bit_rate(5120000);
        VideoSt.getCodecCtx().width(1920);
        VideoSt.getCodecCtx().height(1080);
        AVRational fps = new AVRational();
        fps.den(25); fps.num(1);
        VideoSt.getStream().time_base(fps);
        VideoSt.getCodecCtx().time_base(fps);
        VideoSt.getCodecCtx().gop_size(10);
        VideoSt.getCodecCtx().max_b_frames(); // note: with no argument this is only the getter; pass a value (e.g. max_b_frames(1)) to actually set it
        VideoSt.getCodecCtx().pix_fmt(AV_PIX_FMT_YUV420P);

        if ((FmtCtx.oformat().flags() & AVFMT_GLOBALHEADER) != 0)
            VideoSt.getCodecCtx().flags(VideoSt.getCodecCtx().flags() | AV_CODEC_FLAG_GLOBAL_HEADER);

        avcodec_open2(VideoSt.getCodecCtx(), VideoCodec, Opt);

        VideoSt.setFrame(av_frame_alloc());
        VideoSt.getFrame().format(VideoSt.getCodecCtx().pix_fmt());
        VideoSt.getFrame().width(1920);
        VideoSt.getFrame().height(1080);
        av_frame_get_buffer(VideoSt.getFrame(), 32);

        // should be at least Long or even BigInteger,
        // it is an unsigned long in C
        int nextpts = 0;

        av_dump_format(FmtCtx, 0, FilePath, 1);
        avio_open(FmtCtx.pb(), FilePath, AVIO_FLAG_WRITE);

        avformat_write_header(FmtCtx, Opt);

        int[] got_output = { 0 };
        while (still_has_input) { // pseudocode: loop over your input frames
            // convert or directly copy your byte[] into VideoSt.Frame here.
            // The AVFrame structure has two important data fields:
            // AVFrame.data (uint8_t*[]) and AVFrame.linesize (int[]).
            // data holds the pixel values in some format and linesize is the size of each picture line.
            // For example, if the format is RGB, linesize should have 3 valid values equal to `image_width * 3`,
            // and data will point to three arrays containing RGB values.
            // But I guess we'll need swscale() here to convert the pixel format, from RGB to yuv420p (or another yuv family format).

            Pkt = new AVPacket();
            av_init_packet(Pkt);

            VideoSt.getFrame().pts(nextpts++);
            avcodec_encode_video2(VideoSt.getCodecCtx(), Pkt, VideoSt.getFrame(), got_output);

            av_packet_rescale_ts(Pkt, VideoSt.getCodecCtx().time_base(), VideoSt.getStream().time_base());
            Pkt.stream_index(VideoSt.getStream().index());
            av_interleaved_write_frame(FmtCtx, Pkt);
            av_packet_unref(Pkt);
        }

        // get delayed frames
        for (got_output[0] = 1; got_output[0] != 0;) {
            Pkt = new AVPacket();
            av_init_packet(Pkt);

            avcodec_encode_video2(VideoSt.getCodecCtx(), Pkt, null, got_output);
            if (got_output[0] > 0) {
                av_packet_rescale_ts(Pkt, VideoSt.getCodecCtx().time_base(), VideoSt.getStream().time_base());
                Pkt.stream_index(VideoSt.getStream().index());
                av_interleaved_write_frame(FmtCtx, Pkt);
            }
            av_packet_unref(Pkt);
        }

        // finalize the container
        av_write_trailer(FmtCtx);

        // free C structs
        avcodec_free_context(VideoSt.getCodecCtx());
        av_frame_free(VideoSt.getFrame());
        avio_closep(FmtCtx.pb());
        avformat_free_context(FmtCtx);
    }
}
For porting C code, several changes normally need to be made:
Most of the work is replacing every C struct member access (. and ->) with Java getters/setters (see the small sketch after this list).
There are also many C address-of operators (&); just delete them.
Change the C NULL macro and the C++ nullptr pointer to the Java null object.
C code often checks an int as a boolean in if, for and while conditions; in Java you have to compare it with 0.
There may be other API changes as well; as long as you refer to the javacpp-presets docs, it'll be OK.
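As a tiny illustration of the first rule (the C lines in the comments are paraphrased from ffmpeg's muxing.c; the Java side uses the javacpp-presets wrappers already imported in the Muxer example above):
import org.bytedeco.javacpp.avcodec.AVCodecContext;

public class PortingExample {

    // C:    ctx->width = 1920;       (struct member assignment)
    // C:    int w = ctx->width;      (struct member read)
    // Java: the generated wrapper exposes a single method name; calling it with an
    //       argument sets the field, calling it with no argument reads it.
    public static int setAndReadWidth(AVCodecContext ctx) {
        ctx.width(1920);
        return ctx.width();
    }
}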
Note that I omitted all error handling in the Muxer code above; it will be needed in real development/production.
I really don't want to advertise, but using PKZIP and its SDK may be a good solution. PKZIP compresses files by up to 95%, as they say.
The Smartcrypt SDK is available in all major programming languages, including C++, Java, and C#, and can be used to encrypt both structured and unstructured data. Changes to existing applications typically consist of two or three lines of code.
I am having a hard time trying to port some Java code to C# for my simple project. The Java code makes use of format.isBigEndian and checks whether the audio file data is signed or not. My C# project makes use of NAudio for handling audio files.
Here is the Java code
public void LoadAudioStream(AudioInputStream inputStream) {
    AudioFormat format = inputStream.getFormat();
    sampleRate = (int) format.getSampleRate();
    bigEndian = format.isBigEndian();
    AudioFormat.Encoding encoding = format.getEncoding();
    if (encoding.equals(AudioFormat.Encoding.PCM_SIGNED))
        dataIsSigned = true;
    else if (encoding.equals(AudioFormat.Encoding.PCM_UNSIGNED))
        dataIsSigned = false;
}
and here is the C# code that I am working with:
public void LoadAudioStream(WaveFileReader reader)
{
    var format = reader.WaveFormat;
    sampleRate = format.SampleRate;
    //bigEndian = ??
    var encoding = format.Encoding;
    if (encoding.Equals( /*????*/))
    {
        dataIsSigned = true;
    }
    else if (encoding.Equals( /*?????*/))
    {
        dataIsSigned = false;
    }
}
How can I check whether the audio file data is big-endian or not? And lastly, is there a way to check if the AudioFormat is PCM signed or unsigned?
PCM WAV files use little-endian byte order. The most common bit depth is 16 bit, and that will be signed (i.e. short or Int16 in C#).
TargetDataLine is, for me, so far the easiest way to capture microphone input in Java. I want to encode the audio that I capture together with a video of the screen [in a screen recorder application] so that the user can create a tutorial, slide case, etc.
I use Xuggler to encode the video.
They do have a tutorial on encoding audio with video but they take their audio from a file. In my case, the audio is live.
To encode the video I use com.xuggle.mediaTool.IMediaWriter. The IMediaWriter object allows me to add a video stream and has an
encodeAudio(int streamIndex, short[] samples, long timeStamp, TimeUnit timeUnit)
method. I can use that if I can get the samples from the TargetDataLine as short[], but it returns byte[].
So my two questions are:
How can I encode the live audio together with the video?
How do I maintain the proper timing of the audio packets so that they are encoded at the proper time?
References:
1. JavaDoc for TargetDataLine: http://docs.oracle.com/javase/1.4.2/docs/api/javax/sound/sampled/TargetDataLine.html
2. Xuggler Documentation: http://build.xuggle.com/view/Stable/job/xuggler_jdk5_stable/javadoc/java/api/index.html
Update
My code for capturing video
public void run() {
    final IRational FRAME_RATE = IRational.make(frameRate, 1);
    final IMediaWriter writer = ToolFactory.makeWriter(completeFileName);
    writer.addVideoStream(0, 0, FRAME_RATE, recordingArea.width, recordingArea.height);
    long startTime = System.nanoTime();

    while (keepCapturing) {
        image = bot.createScreenCapture(recordingArea);
        PointerInfo pointerInfo = MouseInfo.getPointerInfo();
        Point globalPosition = pointerInfo.getLocation();

        int relativeX = globalPosition.x - recordingArea.x;
        int relativeY = globalPosition.y - recordingArea.y;

        BufferedImage bgr = convertToType(image, BufferedImage.TYPE_3BYTE_BGR);
        if (cursor != null) {
            bgr.getGraphics().drawImage(((ImageIcon) cursor).getImage(), relativeX, relativeY, null);
        }

        try {
            writer.encodeVideo(0, bgr, System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
        } catch (Exception e) {
            writer.close();
            JOptionPane.showMessageDialog(null,
                    "Recording will stop abruptly because " +
                    "an error has occurred", "Error", JOptionPane.ERROR_MESSAGE, null);
        }

        try {
            sleep(sleepTime);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
    writer.close();
}
I answered most of that recently under this question: Xuggler encoding and muxing
Code sample:
writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);

while (... have more data ...)
{
    BufferedImage videoFrame = ...;
    long videoFrameTime = ...; // this is the time to display this frame
    writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);

    short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
    long audioSamplesTime = ...; // this is the time to play back this bit of audio
    writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}
In the case of TargetDataLine, getMicrosecondPosition() will tell you the time you need for audioSamplesTime. This appears to start from the time the TargetDataLine was opened. You need to figure out how to get a video timestamp referenced to the same clock, which depends on the video device and/or how you capture video. The absolute values do not matter as long as they are both using the same clock. You could subtract the initial value (at start of stream) from both your video and your audio times so that the timestamps match, but that is only a somewhat approximate match (probably close enough in practice).
You need to call encodeVideo and encodeAudio in strictly increasing order of time; you may have to buffer some audio and some video to make sure you can do that. More details here.
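To sketch how the byte[] read from the TargetDataLine can become the short[] that encodeAudio() expects, here is a rough outline. The 16-bit little-endian mono format, the 4096-byte buffer and the com.xuggle.mediatool package name are my assumptions, and the timestamping is only the approximate approach described above.
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.TargetDataLine;
import java.util.concurrent.TimeUnit;

import com.xuggle.mediatool.IMediaWriter;

public class MicCapture {

    public static void capture(IMediaWriter writer, int audioStreamIndex) throws Exception {
        // 16-bit signed little-endian mono, matching a stream added with addAudioStream(..., 1, 44100).
        AudioFormat format = new AudioFormat(44100f, 16, 1, true, false);
        TargetDataLine mic = AudioSystem.getTargetDataLine(format);
        mic.open(format);
        mic.start();

        byte[] bytes = new byte[4096];
        while (!Thread.currentThread().isInterrupted()) {
            int read = mic.read(bytes, 0, bytes.length);

            // Repack little-endian byte pairs into shorts for encodeAudio().
            short[] samples = new short[read / 2];
            for (int i = 0; i < samples.length; i++) {
                samples[i] = (short) ((bytes[2 * i] & 0xff) | (bytes[2 * i + 1] << 8));
            }

            // Roughly the timestamp of this chunk, referenced to when the line started.
            long timeStamp = mic.getMicrosecondPosition();
            writer.encodeAudio(audioStreamIndex, samples, timeStamp, TimeUnit.MICROSECONDS);
        }
        mic.close();
    }
}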