How to merge input and output audio to send to another conference participant - Java

I have changed my question message...
I have two streams with audio in Java. What I want is to combine these two audios into one OutputStream.
I've been searching, and it seems that if both streams have the same audio format and use PCM, you just need to perform the following operation on the two byte arrays:
mixAudio[i] = (byte) ((audio1[i] + audio2[i]) >> 1);
However, when I write the result to a file, I get a file without any audio.
Does anyone know how to combine two audio signals when I have them in two streams (not two audio files)?
Thank you in advance.

Decent quality audio consumes two bytes of data per sample per channel, giving the audio curve a bit depth of 16 bits, which means 2^16 distinct values when digitizing the analog audio curve. Knowing this, you cannot do your adding while the data lives as raw bytes. To add together two channels you first need to get your audio out of its bytes and into a two-byte integer; then you need to pluck each of those two bytes back out of that integer one by one and stow them into your output array.
In pseudo code (this puts two consecutive bytes of your audio array, which together represent one sample of your audio curve, into an integer):
assign the value of your most significant byte into a 16-bit integer
left shift this integer by 8 bits, something like myint = myint << 8
bitwise OR into this integer your second byte, which is your least significant byte
Top tip: after you have written code to populate one integer from two bytes, do the reverse, namely convert a multi-byte integer into two bytes in some array (see the sketch below). Bonus points if you plot these integers so you can visualize your raw audio curve.
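A minimal sketch of both directions, assuming signed 16-bit PCM stored big endian (swap the two indices for little endian):
static short bytesToSample(byte[] audio, int i) {
    // combine two consecutive bytes into one signed 16-bit sample
    return (short) ((audio[i] << 8) | (audio[i + 1] & 0xFF));
}
static void sampleToBytes(short sample, byte[] out, int i) {
    // spread one sample back out into two consecutive bytes
    out[i] = (byte) (sample >> 8);       // most significant byte
    out[i + 1] = (byte) (sample & 0xFF); // least significant byte
}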
To perform the above you must know your endianness (are you doing little endian or big endian?), which determines the order of your bytes. Specifically, since we now know each audio sample consumes two bytes (or more, say, for 24-bit audio), the bytes myarray[i] and myarray[i + 1] form one audio sample, but only after knowing your endianness will you know which array element to use first when populating the above myint. If none of this makes sense, please invest the time and effort to research the notion of raw audio in PCM format.
I highly encourage you to do all of the above in your code at least once to appreciate what is happening inside any audio library that may do this for you.
Going back to your question: instead of simply doing
mixAudio[i] = (byte) ((audio1[i] + audio2[i]) >> 1);
you should be doing something like this (untested, especially regarding endianness):
int twoByteAnswer = (((audio1[i] << 8) | (audio1[i + 1] & 0xFF))
                   + ((audio2[i] << 8) | (audio2[i + 1] & 0xFF))) >> 1;
Now you need to spread out your twoByteAnswer into two bytes of the array mixAudio, something like this (also untested):
mixAudio[i] = (byte) (twoByteAnswer >> 8);       // keep only the most significant byte
mixAudio[i + 1] = (byte) (twoByteAnswer & 0xFF); // bitwise AND mask keeps the least significant byte
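Putting it together as one hedged sketch, assuming both arrays hold signed 16-bit big-endian PCM of equal length (note the loop steps by 2, because each sample spans two bytes):
byte[] mixAudio = new byte[audio1.length];
for (int i = 0; i + 1 < audio1.length; i += 2) {
    int sample1 = (audio1[i] << 8) | (audio1[i + 1] & 0xFF);
    int sample2 = (audio2[i] << 8) | (audio2[i + 1] & 0xFF);
    int mixed = (sample1 + sample2) >> 1; // average the samples to avoid clipping
    mixAudio[i] = (byte) (mixed >> 8);
    mixAudio[i + 1] = (byte) (mixed & 0xFF);
}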

Related

How to write Huffman code to a binary file?

I have a sample .txt file that I want to compress using Huffman encoding. My problem is that if one character has a size of one byte and the smallest size you can write is a byte, how do I reduce the size of the sample file?
I converted the sample file into Huffman codes and wrote them to a new, empty .txt file consisting of 0s and 1s as one huge line of characters. Then I took the new file and used the BitSet class in Java to write to a binary file bit by bit: if the character in the new file was 0 or 1, I wrote 0 or 1 respectively to the binary file. This process was very slow and crashed my computer multiple times; I was hoping someone had a more efficient solution. I have written all my code in Java.
Do not write "0" and "1" characters to the file. Write 0 and 1 bits to the file.
You do this by accumulating eight bits into a byte buffer using the shift (<<) and or (|) operators, and then writing that byte to the file. Repeat. At the end you may have fewer than eight bits in the byte buffer. If so, write that byte to the file, which will have the remaining bits filled with zeros.
E.g.:
int buf = 0, count = 0;
// for each bit:
buf |= bit << count++;
if (count == 8) { out.writeByte(buf); buf = count = 0; } // flush a full byte
// after the last bit:
if (count > 0) out.writeByte(buf);
When decoding the Huffman codes, you may run into a problem with those filler zero bits in the last byte: they could be decoded as an extraneous symbol or symbols. To deal with this, the decoder needs to know when to stop, either by sending the number of symbols before the Huffman codes or by adding a symbol for end-of-stream.
One way is to use BitSet to set the bits that represent the code as you compute it. Then you can do either BitSet.toByteArray() or BitSet.toLongArray() and write out the information. Both of these store the bits in little endian encoding.
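A small hedged sketch of the BitSet route; note that toByteArray() packs bits little-endian within each byte and drops trailing zero bits, so you would also store the total bit count:
import java.util.BitSet;

BitSet bits = new BitSet();
int pos = 0;
// append the code 101 (a hypothetical Huffman code for one symbol)
bits.set(pos++); // 1
pos++;           // 0 (bits default to clear)
bits.set(pos++); // 1
byte[] packed = bits.toByteArray(); // write packed plus pos to the file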

Joining two (or more) byte[] of wav-sound. Gives backgroundnoise

I am trying to join byte arrays of WAV sound and it works, except for background noise. Does anyone know an algorithm for adding two byte arrays of sound?
This is what I have tried so far:
for(int i=0;i<bArr1.length;i++)
{
bArrJoined[i]=bArr1[i] + bArr2[i];
}
I also tried dividing by 2 so the numbers would not get too high:
for(int i=0;i<bArr1.length;i++)
{
bArrJoined[i]=(bArr1[i] + bArr2[i]) / 2;
}
Anyone knows how to make this work without the noise?
A number of things could cause artifacts here. Different audio sampling rates or data bit sizes could do it.
Assuming those are non-issues, you should be aware that you can't store the sum of two bytes back in a byte without overflow (256 will become 0, etc.), so convert to int before adding. Clipping will occur if you exceed the max volume, so your divide-by-2 operation is smart and should stop that issue; the divide should also happen on the int versions. Only cast back to byte at the end.
However, if you aren't working with 8-bit audio, then a byte is not your atomic unit. For example, 16-bit audio uses 2 bytes and you would need to convert every two consecutive bytes to an int (with respect to proper endianness) before you perform any mathematical operations on the values. 32-bit audio data occupies 4 consecutive bytes for each single numeric value. Just having an array of bytes does not in itself tell you where the data boundaries are.
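For the 8-bit case, a minimal sketch of that advice (assuming signed 8-bit samples and arrays of equal length):
byte[] bArrJoined = new byte[bArr1.length];
for (int i = 0; i < bArr1.length; i++) {
    int mixed = ((int) bArr1[i] + (int) bArr2[i]) / 2; // do the math as int
    bArrJoined[i] = (byte) mixed; // cast back to byte only at the end
}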

What does interleaved stereo PCM linear Int16 big endian audio look like?

I know that there are a lot of resources online explaining how to deinterleave PCM data. In the course of my current project I have looked at most of them...but I have no background in audio processing and I have had a very hard time finding a detailed explanation of how exactly this common form of audio is stored.
I do understand that my audio will have two channels and thus the samples will be stored in the format [left][right][left][right]...
What I don't understand is what exactly this means. I have also read that each sample is stored in the format [left MSB][left LSB][right MSB][right LSB]. Does this mean that each 16-bit integer actually encodes two 8-bit frames, or is each 16-bit integer its own frame destined for either the left or right channel?
Thank you everyone. Any help is appreciated.
Edit: If you choose to give examples please refer to the following.
Method Context
Specifically, what I have to do is convert an interleaved short[] into two float[] arrays, one each for the left and right channels. I will be implementing this in Java.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < interleavedData.length; i++) {
//THIS IS WHERE I DON'T KNOW WHAT TO DO
}
//return the separated left and right channels
return new float[][]{left, right};
}
My Current Implementation
I have tried playing the audio that results from this. It's very close, close enough that you could understand the words of a song, but is still clearly not the correct method.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < left.length; i++) {
left[i] = (float) interleavedData[2 * i];
right[i] = (float) interleavedData[2 * i + 1];
}
//return the separated left and right channels
return new float[][]{left, right};
}
Format
If anyone would like more information about the format of the audio the following is everything I have.
Format is PCM 2 channel interleaved big endian linear int16
Sample rate is 44100
Number of shorts per short[] buffer is 2048
Number of frames per short[] buffer is 1024
Frames per packet is 1
I do understand that my audio will have two channels and thus the samples will be stored in the format [left][right][left][right]... What I don't understand is what exactly this means.
Interleaved PCM data is stored one sample per channel, in channel order before going on to the next sample. A PCM frame is made up of a group of samples for each channel. If you have stereo audio with left and right channels, then one sample from each together make a frame.
Frame 0: [left sample][right sample]
Frame 1: [left sample][right sample]
Frame 2: [left sample][right sample]
Frame 3: [left sample][right sample]
etc...
Each sample is a measurement and digital quantization of pressure at an instantaneous point in time. That is, if you have 8 bits per sample, you have 256 possible levels of precision that the pressure can be sampled at. Knowing that sound waves are... waves... with peaks and valleys, we are going to want to be able to measure distance from the center. So, we can define center at 127 or so and subtract and add from there (0 to 255, unsigned) or we can treat those 8 bits as signed (same values, just different interpretation of them) and go from -128 to 127.
Using 8 bits per sample with single channel (mono) audio, we use one byte per sample meaning one second of audio sampled at 44.1kHz uses exactly 44,100 bytes of storage.
Now, let's assume 8 bits per sample, but in stereo at 44.1 kHz. Every other byte is going to be for the left channel, and every other one for the right.
LRLRLRLRLRLRLRLRLRLRLR...
Scale it up to 16 bits and you have two bytes per sample (samples marked with brackets [ and ], spaces indicating frame boundaries):
[LL][RR] [LL][RR] [LL][RR] [LL][RR] [LL][RR] [LL][RR]...
I have also read that each sample is stored in the format [left MSB][left LSB][right MSB][right LSB].
Not necessarily. The audio can be stored in any endianness. Little endian is the most common, but that isn't a magic rule. I do think though that all channels go in order always, and front left would be channel 0 in most cases.
Does this mean the each 16 bit integer actually encodes two 8 bit frames, or is each 16 bit integer its own frame destined for either the left or right channel?
Each value (16-bit integer in this case) is destined for a single channel. Never would you have two multi-byte values smashed into each other.
I hope that's helpful. I can't run your code, but given your description, I suspect you have an endian problem and that your samples aren't actually big endian.
Let's start by getting some terminology out of the way:
A channel is a monaural stream of samples. The term does not necessarily imply that the samples are contiguous in the data stream.
A frame is a set of co-incident samples. For stereo audio (e.g. L & R channels) a frame contains two samples.
A packet is 1 or more frames, and is typically the minimum number of frames that can be processed by a system at once. For PCM audio, a packet often contains 1 frame, but for compressed audio it will be larger.
Interleaving is a term typically used for stereo audio, in which the data stream consists of consecutive frames of audio. The stream therefore looks like L1R1L2R2L3R3......LnRn
Both big and little endian audio formats exist, and which one is used depends on the use-case. However, it's generally only an issue when exchanging data between systems; you'll always use native byte order when processing or interfacing with operating system audio components.
You don't say whether you're using a little or big endian system, but I suspect it's probably the former. In which case you need to byte-reverse the samples.
Although not set in stone, when using floating point, samples are usually in the range -1.0 < x < +1.0, so you want to divide the samples by 1 << 15. When 16-bit linear types are used, they are typically signed.
Taking care of byte-swapping and format conversions:
int s = (int) interleavedData[2 * i];
short revS = (short) (((s & 0xff) << 8) | ((s >> 8) & 0xff)); // swap the two bytes
left[i] = ((float) revS) / 32768.0f; // divide by 1 << 15 to reach the -1.0..+1.0 range
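Folded into the asker's method, a hedged sketch (assuming the shorts really did arrive byte-swapped on a little-endian host, as suspected above):
public static float[][] deinterleaveAudioData(short[] interleavedData) {
    float[] left = new float[interleavedData.length / 2];
    float[] right = new float[interleavedData.length / 2];
    for (int i = 0; i < left.length; i++) {
        // swap bytes, then scale each signed 16-bit sample into -1.0..+1.0
        int l = interleavedData[2 * i];
        int r = interleavedData[2 * i + 1];
        left[i] = ((short) (((l & 0xff) << 8) | ((l >> 8) & 0xff))) / 32768.0f;
        right[i] = ((short) (((r & 0xff) << 8) | ((r >> 8) & 0xff))) / 32768.0f;
    }
    return new float[][]{left, right};
}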
Actually you are dealing with an almost typical WAVE file at audio CD quality, that is to say:
2 channels
sampling rate of 44100 Hz
each amplitude sample quantized as a 16-bit signed integer
I said almost because big-endianness is usually used in AIFF files (Mac world), not in WAVE files (PC world). And I don't know without searching how to deal with endianness in Java, so I will leave this part to you.
About how the samples are stored is quite simple:
each sample takes 16 bits (an integer from -32768 to +32767)
if channels are interleaved: (L,1),(R,1),(L,2),(R,2),...,(L,n),(R,n)
if channels are not: (L,1),(L,2),...,(L,n),(R,1),(R,2),...,(R,n)
Then, to feed an audio callback, it is usually required to provide 32-bit floating point samples ranging from -1 to +1, and maybe this is where something is missing in your algorithm. Dividing your integers by 32768 (2^(16-1)) should make it sound as expected (see the sketch below).
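In the asker's loop, that scaling would look something like this (hedged, and ignoring the endianness caveat discussed above):
left[i] = interleavedData[2 * i] / 32768.0f;
right[i] = interleavedData[2 * i + 1] / 32768.0f;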
I ran into a similar issue with de-interleaving the short[] frames that came in through Spotify Android SDK's onAudioDataDelivered().
The documentation for onAudioDelivered was poorly written a year ago. See Github issue. They've updated the docs with a better description and more accurate parameter names:
onAudioDataDelivered(short[] samples, int sampleCount, int sampleRate, int channels)
What can be confusing is that samples.length can be 4096, yet it contains only sampleCount valid samples. If you're receiving stereo audio and sampleCount = 2048, there are only 1024 frames (each frame has two samples) of audio in the samples array!
So you'll need to update your implementation to make sure you're working with sampleCount and not samples.length, as sketched below.
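A minimal sketch of that fix, bounding the loop by sampleCount rather than samples.length (stereo assumed):
int frames = sampleCount / 2; // two samples (left + right) per frame
float[] left = new float[frames];
float[] right = new float[frames];
for (int i = 0; i < frames; i++) {
    left[i] = samples[2 * i] / 32768.0f;
    right[i] = samples[2 * i + 1] / 32768.0f;
}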

Java - byte array arithmetics on bit-level?

I'm using byte arrays (of size 2 or 4) to emulate the effect of short and int data types.
The main idea is to have a data type that supports both char and int types; however, it is really hard for me to emulate arithmetic operations this way, since I must do them at the bit level.
For those who do not follow:
The int representation of 123 is not equal to the byte[] of {0,1,2,3}, since their bit representations differ (123's representation is 00000000000000000000000001111011 and the representation of {0,1,2,3} is 00000000000000010000001000000011 on my system).
So "int of 123" would actually be equivalent to "byte[] of {0,0,0,123}". The problems occur when values stretch over several bytes and I try to subtract or decrement from those byte arrays, since then you have to interact with several different bytes and my math isn't that sharp.
Any pseudo-code or java library suggestions would be welcome.
Unless you really want to know what bits are being carried from one byte to the next, I'd suggest don't do it this way! If it's just plain math, then convert your arrays to real short and int types, do the math, then convert them back again.
If you must do it this way, consider the following:
Imagine you're adding two short variables that are stored in byte arrays.
The first problem you have is that all Java integer types are signed.
The second is that the "carry" from the least-significant-byte into the most-significant-byte is best done using a type that's longer than a byte because otherwise you can't detect the overflow.
i.e. if you add two 8-bit values, the carry will be in bit 8. But a byte only has bits 0..7, so to calculate bit 8 you have to promote your bytes to the next appropriate larger type, do the add operation, then figure out if it resulted in a carry, and then handle that when you add up the MSB. It's just not worth it.
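A hedged sketch of that carry propagation for two 2-byte (short-like) big-endian arrays, using int as the wider type:
// add two 16-bit values stored as {MSB, LSB} byte pairs
static byte[] add16(byte[] a, byte[] b) {
    int sum = (a[1] & 0xFF) + (b[1] & 0xFF);         // least significant bytes
    int carry = sum >> 8;                            // carry out of bit 7
    int msb = (a[0] & 0xFF) + (b[0] & 0xFF) + carry; // most significant bytes
    return new byte[] { (byte) msb, (byte) sum };    // overflow past 16 bits is discarded
}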
BTW, I did actually have to do this sort of bit manipulation many years ago when I wrote an MC6809 CPU emulator. It was necessary to perform multiple operations on the same operands just to be able to figure out the effect on the CPU's various status bits, when those same bits are generated "for free" by a hardware ALU.
For example, my (C++) code to add two 8-bit registers looked like this:
void mc6809::help_adc(Byte& x)
{
Byte m = fetch_operand();
{
Byte t = (x & 0x0f) + (m & 0x0f) + cc.bit.c;
cc.bit.h = btst(t, 4); // Half carry
}
{
Byte t = (x & 0x7f) + (m & 0x7f) + cc.bit.c;
cc.bit.v = btst(t, 7); // Bit 7 carry in
}
{
Word t = x + m + cc.bit.c;
cc.bit.c = btst(t, 8); // Bit 7 carry out
x = t & 0xff;
}
cc.bit.v ^= cc.bit.c;
cc.bit.n = btst(x, 7);
cc.bit.z = !x;
}
which requires that three different additions get done on different variations of the operands just to extract the h, v and c flags.

Does Java read integers in little endian or big endian?

I ask because I am sending a byte stream from a C process to Java. On the C side, the 32-bit integer has the LSB as the first byte and the MSB as the fourth byte.
So my question is: on the Java side, when we read the bytes as they were sent from the C process, what is the endianness?
A follow-up question: If the endian on the Java side is not the same as the one sent, how can I convert between them?
Use the network byte order (big endian), which is the same as Java uses anyway. See man htons for the different translators in C.
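If the data does arrive in the opposite byte order, a hedged one-liner for swapping it with the standard library:
int bigEndianValue = Integer.reverseBytes(littleEndianValue); // swap all four bytes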
I stumbled here via Google and got my answer that Java is big endian.
Reading through the responses I'd like to point out that bytes do indeed have an endian order, although mercifully, if you've only dealt with "mainstream" microprocessors you are unlikely to have ever encountered it, as Intel, Motorola, and Zilog all agreed on the shift direction of their UART chips and that the MSB of a byte would be 2**7 and the LSB would be 2**0 in their CPUs (I used the FORTRAN power notation to emphasize how old this stuff is :) ).
I ran into this issue with some Space Shuttle bit serial downlink data 20+ years ago when we replaced a $10K interface hardware with a Mac computer. There is a NASA Tech brief published about it long ago. I simply used a 256 element look up table with the bits reversed (table[0x01]=0x80 etc.) after each byte was shifted in from the bit stream.
There are no unsigned integers in Java. All integers are signed and in big endian.
On the C side, each integer has the LSB as the first byte and the MSB at the end.
It sounds like you are using LSB as Least significant bit, are you? LSB usually stands for least significant byte.
Endianness is not bit based but byte based.
To convert from unsigned byte to a Java integer:
int i = (int) b & 0xFF;
To convert an unsigned 32-bit little-endian value in a byte[] to a Java long (off the top of my head, not tested):
long l = (long)b[0] & 0xFF;
l += ((long)b[1] & 0xFF) << 8;
l += ((long)b[2] & 0xFF) << 16;
l += ((long)b[3] & 0xFF) << 24;
There's no way this could influence anything in Java, since there's no (direct non-API) way to map some bytes directly into an int in Java.
Every API that does this or something similar defines the behaviour pretty precisely, so you should look up the documentation of that API.
I would read the bytes one by one, and combine them into a long value. That way you control the endianness, and the communication process is transparent.
Imho there is no endianness defined for Java. The endianness is that of the hardware, but Java is high-level and hides the hardware, so you don't have to worry about it.
The only endianness-related feature is how the Java library maps int and long to byte[] (and back). It does so big-endian, which is the most readable and natural:
int i = 0xAABBCCDD;
maps to
byte[] b = {(byte) 0xAA, (byte) 0xBB, (byte) 0xCC, (byte) 0xDD};
If it fits the protocol you use, consider using a DataInputStream, where the behavior is very well defined.
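A hedged sketch of both well-defined routes for reading the C process's little-endian int from a stream (DataInputStream.readInt() is big-endian by definition, so the little-endian case goes through ByteBuffer):
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

static int readLittleEndianInt(InputStream in) throws IOException {
    byte[] buf = new byte[4];
    new DataInputStream(in).readFully(buf);          // read exactly 4 bytes
    return ByteBuffer.wrap(buf)
                     .order(ByteOrder.LITTLE_ENDIAN) // interpret LSB first
                     .getInt();
}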
Java is 'big-endian' as noted above, meaning the MSB of an int comes first when serialized. The sign bit is also in the MSB for all Java integer types.
Reading a 4 byte unsigned integer from a binary file stored by a 'Little-endian' system takes a bit of adaptation in Java. DataInputStream's readInt() expects Big-endian format.
Here's an example that reads a four byte unsigned value (as displayed by HexEdit as 01 00 00 00) into an integer with a value of 1:
// Declare an array of 4 shorts to hold the four unsigned bytes
short[] tempShort = new short[4];
for (int b = 0; b < 4; b++) {
tempShort[b] = (short)dIStream.readUnsignedByte();
}
int curVal = convToInt(tempShort);
// Pass an array of four shorts which convert from LSB first
public int convToInt(short[] sb)
{
int answer = sb[0];
answer += sb[1] << 8;
answer += sb[2] << 16;
answer += sb[3] << 24;
return answer;
}
Java indeed forces big endian: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.11
