I'm facing some problems with WAV files in Java.
WAV format: PCM_SIGNED 44100.0 Hz, 24-bit, stereo, 6 bytes/frame, little-endian.
I extracted the WAV data to a byte array with no problems.
I'm trying to convert the byte array to a double array, but some of the resulting doubles are NaN.
Code:
ByteBuffer byteBuffer = ByteBuffer.wrap(byteArray);
double[] doubles = new double[byteArray.length / 8];
for (int i = 0; i < doubles.length; i++) {
doubles[i] = byteBuffer.getDouble(i * 8);
}
The different cases (16/24/32-bit, mono/stereo) confuse me.
I intend to pass the double[] to a FFT algorithm and get the audio frequencies.
Try this:
public static byte[] toByteArray(double[] doubleArray){
int times = Double.SIZE / Byte.SIZE;
byte[] bytes = new byte[doubleArray.length * times];
for(int i=0;i<doubleArray.length;i++){
ByteBuffer.wrap(bytes, i*times, times).putDouble(doubleArray[i]);
}
return bytes;
}
public static double[] toDoubleArray(byte[] byteArray){
int times = Double.SIZE / Byte.SIZE;
double[] doubles = new double[byteArray.length / times];
for(int i=0;i<doubles.length;i++){
doubles[i] = ByteBuffer.wrap(byteArray, i*times, times).getDouble();
}
return doubles;
}
public static byte[] toByteArray(int[] intArray){
int times = Integer.SIZE / Byte.SIZE;
byte[] bytes = new byte[intArray.length * times];
for(int i=0;i<intArray.length;i++){
ByteBuffer.wrap(bytes, i*times, times).putInt(intArray[i]);
}
return bytes;
}
public static int[] toIntArray(byte[] byteArray){
int times = Integer.SIZE / Byte.SIZE;
int[] ints = new int[byteArray.length / times];
for(int i=0;i<ints.length;i++){
ints[i] = ByteBuffer.wrap(byteArray, i*times, times).getInt();
}
return ints;
}
Your WAV format is 24 bit, but a double uses 64 bit. So the quantities stored in your wav can't be doubles. You have one 24 bit signed integer per frame and channel, which amounts to these 6 bytes mentioned.
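To see where the NaN values in the question come from, it can help to look at what happens when 8 arbitrary bytes are reinterpreted as an IEEE 754 double. This is only a small sketch; the class name NaNDemo is made up for illustration. Any bit pattern whose exponent bits are all ones and whose mantissa is non-zero decodes as NaN, which is easy to hit when PCM data is read 8 bytes at a time.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, only for illustration.
final class NaNDemo {
    // Reinterprets the first 8 bytes as a big-endian IEEE 754 double,
    // which is what ByteBuffer.getDouble() does with the default byte order.
    static double reinterpret(byte[] eightBytes) {
        return ByteBuffer.wrap(eightBytes).getDouble();
    }
}
```

Eight bytes of 0xFF, for example, come back as NaN, although no sample in the file was ever "not a number".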
You could do something like this:
private static double readDouble(ByteBuffer buf) {
int v = (buf.get() & 0xff);
v |= (buf.get() & 0xff) << 8;
v |= buf.get() << 16; // the top byte is not masked, so it keeps its sign
return (double)v;
}
You'd call that method once for the left channel and once for the right. Not sure about the correct order, but I guess left first. The bytes are read from least significant one to most significant one, as little-endian indicates. The lower two bytes are masked with 0xff in order to treat them as unsigned. The most significant byte is treated as signed, since it will contain the sign of the signed 24 bit integer.
If you operate on arrays, you can do it without the ByteBuffer, e.g. like this:
double[] doubles = new double[byteArray.length / 3];
for (int i = 0, j = 0; i != doubles.length; ++i, j += 3) {
doubles[i] = (double)( (byteArray[j ] & 0xff) |
((byteArray[j+1] & 0xff) << 8) |
( byteArray[j+2] << 16));
}
You will get samples for both channels interleaved, so you might want to separate these afterwards.
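Separating the interleaved channels could look roughly like this. It is a sketch, not the only way to do it: the class and method names are invented, and it assumes the usual frame layout in which channel 0 (normally the left one) comes first in every frame.

```java
// Hypothetical helper for splitting interleaved PCM samples by channel.
final class Deinterleave {
    // Input layout: L0, R0, L1, R1, ... (one value per channel per frame).
    // Output: out[channel][frame].
    static double[][] split(double[] interleaved, int channels) {
        int frames = interleaved.length / channels;
        double[][] out = new double[channels][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channels; c++) {
                out[c][f] = interleaved[f * channels + c];
            }
        }
        return out;
    }
}
```

For an FFT you would then typically pass one of the per-channel arrays, or the average of both.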
If you have mono, you won't have two interleaved channels but only one. For 16 bit you can use byteBuffer.getShort(), for 32 bit you can use byteBuffer.getInt(). But 24 bit isn't commonly used for computation, so ByteBuffer doesn't have a method for it. If you have unsigned samples, you'll have to mask all bytes (including the most significant one) and offset the result, but I guess unsigned WAV is rather uncommon.
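Putting the 16, 24 and 32 bit cases together might look like the sketch below. The class and method names are hypothetical, and it only handles signed little-endian PCM. One detail that is easy to miss: a ByteBuffer is big-endian by default, so the byte order has to be set explicitly before calling getShort() or getInt().

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical decoder for signed little-endian PCM samples.
final class PcmDecode {
    static double[] decode(byte[] data, int bitsPerSample) {
        int bytesPerSample = bitsPerSample / 8;
        // ByteBuffer defaults to big-endian, WAV data is little-endian.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        double[] out = new double[data.length / bytesPerSample];
        for (int i = 0; i < out.length; i++) {
            switch (bitsPerSample) {
                case 16:
                    out[i] = buf.getShort();
                    break;
                case 24:
                    int low = buf.get() & 0xff;   // least significant byte, unsigned
                    int mid = buf.get() & 0xff;   // middle byte, unsigned
                    int high = buf.get();         // most significant byte, signed
                    out[i] = low | (mid << 8) | (high << 16);
                    break;
                case 32:
                    out[i] = buf.getInt();
                    break;
                default:
                    throw new IllegalArgumentException("unsupported sample size: " + bitsPerSample);
            }
        }
        return out;
    }
}
```

The result is still interleaved for stereo input, exactly like the array-based loop above.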
For floating-point types, DSP code usually prefers values in a normalized range such as [0, 1] or [0, 1), so you should divide each element by 2^24 - 1. Do it like MvG's answer above, but with some changes:
int t = ((byteArray[j ] & 0xff) << 0) |
((byteArray[j+1] & 0xff) << 8) |
(byteArray[j+2] << 16);
return t / (double) 0xFFFFFF;
But double is really a waste of space and CPU time for data processing. I would recommend converting to a 32-bit int instead, or to float, which has the same precision (24 bits) but a bigger range. In practice, 32-bit int or float is usually the largest type used for a data channel in audio or video processing.
Finally, you can use multithreading and SIMD to accelerate the conversion.
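As a rough sketch of the float variant: the code below converts 24-bit little-endian samples straight to float. Note the assumption, which differs from the divisor above: because the samples are signed, this version scales by 2^23, so the output lies in [-1, 1). The class name and the constant are made up for the example.

```java
// Hypothetical converter: signed 24-bit little-endian PCM to normalized float.
final class Normalize24 {
    // Assumption: scale by 2^23 so that signed samples map to [-1, 1).
    static final float SCALE = 1.0f / (1 << 23);

    static float[] toFloat(byte[] data) {
        float[] out = new float[data.length / 3];
        for (int i = 0, j = 0; i < out.length; i++, j += 3) {
            int v = (data[j] & 0xff)
                  | ((data[j + 1] & 0xff) << 8)
                  | (data[j + 2] << 16);   // sign comes from the top byte
            out[i] = v * SCALE;            // 24-bit ints fit exactly in a float
        }
        return out;
    }
}
```

A float has a 24-bit significand, so every 24-bit sample is represented without rounding.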
Related
I am receiving a stream of bits over Ethernet. I am collecting the bits in a byte[] array in Java (I mention the byte[] because I think it's relevant). The stream is a digitized image of 1280*1024 pixels, where every pixel is represented by 10 bits. Hence, the image size is 1280*1024*10 = 13107200 bits = 1638400 bytes.
Here's a solution. If the 10 bits actually carry 8 meaningful bits with 'nonsense' in the other two, it's better to cut those off with b = b >> 2. If your image is in color, 10 bits per pixel sounds strange, but in that case use all 10 bits.
int[] pix=new int[1280*1024];
for (int i = 0; i < pix.length; i++) {
// read the next ten bits and put them in an int (read() is a placeholder for that step)
int b=read();
pix[i]=0xff000000|b;
}
BufferedImage bim=new BufferedImage(1280, 1024, BufferedImage.TYPE_INT_RGB);
bim.setRGB(0, 0, 1280, 1024, pix, 0, 1280);
try {
ImageIO.write(bim, "jpg", new File(path+".jpg"));
} catch (IOException ex) { ex.printStackTrace(); }
Here is a method that can take a byte array and "split" it into groups of 10 bit. Each group is saved as an int.
static int[] getPixel(byte[] in) {
int bits = 0, bitCount = 0, posOut = 0;
int[] out = new int[(in.length * 8) / 10];
for(int posIn = 0; posIn < in.length; posIn++) {
bits = (bits << 8) | (in[posIn] & 0xFF);
bitCount += 8;
if(bitCount >= 10) {
out[posOut++] = (bits >>> (bitCount - 10)) & 0x3FF;
bitCount -= 10;
}
}
return out;
}
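Once the 10-bit values are unpacked, they still have to be turned into something BufferedImage.setRGB understands. The sketch below assumes the image is grayscale and simply drops the two least significant bits; the class and method names are invented for the example.

```java
// Hypothetical helper: 10-bit gray values to packed ARGB ints.
final class TenBitGray {
    static int[] toGrayRgb(int[] tenBit) {
        int[] rgb = new int[tenBit.length];
        for (int i = 0; i < tenBit.length; i++) {
            int g = (tenBit[i] & 0x3FF) >> 2;                 // keep the top 8 of 10 bits
            rgb[i] = 0xFF000000 | (g << 16) | (g << 8) | g;   // same value for R, G and B
        }
        return rgb;
    }
}
```

The resulting array can be passed to setRGB(0, 0, width, height, rgb, 0, width) as in the earlier answer.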
I am working with Local Binary Patterns (LBP) which produce numbers in the range 0-255.
That means that they can fit in a byte (256 different values may be included into a byte). So that explains why many (if not all) implementation in java I have found uses byte[] to store these values.
The problem is that I am interested in the rank of these values, and when they are converted to byte (from int, for example) they do not keep the rank they had before. Bytes are signed (as are all primitive integer types in Java except char, I think), so the upper 128 values of the range 0-255 (128 and above) become negative numbers. Furthermore, I think the negative ones are inverted in order.
Some examples to be more specific:
(int) 0 = (byte) 0
(int) 20 = (byte) 20
(int) 40 = (byte) 40
(int) 60 = (byte) 60
(int) 80 = (byte) 80
(int) 100 = (byte) 100
(int) 120 = (byte) 120
(int) 140 = (byte) -116
(int) 160 = (byte) -96
(int) 180 = (byte) -76
(int) 200 = (byte) -56
(int) 220 = (byte) -36
(int) 240 = (byte) -16
My question is whether there is a specific way to maintain the order of int values when converted to byte (meaning 240 > 60 should hold true in byte also -16 < 60!) while keeping memory needs minimum (meaning use only 8bits if that many are required). I know I could consider comparing the byte in a more complex way (for example every negative > positive and if both bytes are negative inverse the order) but I think it's not that satisfactory.
Is there any other way to convert to byte besides (byte) i?
You could subtract 128 from the value:
byte x = (byte) (value - 128);
That would be order-preserving, and reversible later by simply adding 128 again. Be careful to make sure you do add 128 later on though... It's as simple as:
int value = x + 128;
So for example, if you wanted to convert between an int[] and byte[] in a reversible way:
public byte[] toByteArray(int[] values) {
byte[] ret = new byte[values.length];
for (int i = 0; i < values.length; i++) {
ret[i] = (byte) (values[i] - 128);
}
return ret;
}
public int[] toIntArray(byte[] values) {
int[] ret = new int[values.length];
for (int i = 0; i < values.length; i++) {
ret[i] = values[i] + 128;
}
return ret;
}
If you wanted to keep the original values though, the byte comparison wouldn't need to be particularly complex:
int unsigned1 = byte1 & 0xff;
int unsigned2 = byte2 & 0xff;
// Now just compare unsigned1 and unsigned2...
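Both ideas can be put side by side in a small sketch (all names here are hypothetical): one comparator that treats the stored bytes as unsigned, and a pair of helpers for the "subtract 128" encoding.

```java
// Hypothetical helpers for order-preserving storage of 0-255 values in bytes.
final class UnsignedBytes {
    // Compares two bytes as if they were unsigned values 0..255.
    static int compareUnsigned(byte a, byte b) {
        return Integer.compare(a & 0xff, b & 0xff);
    }

    // Offset encoding: 0..255 maps to -128..127, so the natural byte order matches.
    static byte toOrdered(int value) {
        return (byte) (value - 128);
    }

    static int fromOrdered(byte stored) {
        return stored + 128;
    }
}
```

With the offset encoding the bytes can be sorted with Arrays.sort directly; with the plain cast you need the unsigned comparison every time you compare.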
How do I convert long to 4 bytes? I am receiving some output from a C program and it uses unsigned long. I need to read this output and convert this to 4 bytes.
However, java uses signed long which is 64 bits. Is there any way to do this conversion?
To read 4 bytes as an unsigned 32-bit value, assuming it is little endian, the simplest thing to do is to use ByteBuffer
byte[] bytes = { 1,2,3,4 };
long l = ByteBuffer.wrap(bytes)
.order(ByteOrder.LITTLE_ENDIAN).getInt() & 0xFFFFFFFFL;
While l is a signed 64-bit value, it will only be between 0 and 2^32-1, which is the range of an unsigned 32-bit value.
You can use the java.nio.ByteBuffer. It can parse the long, and it does the byte ordering for you.
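A possible round trip with ByteBuffer, in both directions, is sketched below. The class and method names are made up; the byte order is passed in explicitly because it has to match whatever the C program writes, which the question does not state.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for unsigned 32-bit values held in a Java long.
final class UInt32 {
    // Writes the low 32 bits of value as 4 bytes in the given byte order.
    static byte[] toBytes(long value, ByteOrder order) {
        if (value < 0 || value > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("not an unsigned 32-bit value: " + value);
        }
        return ByteBuffer.allocate(4).order(order).putInt((int) value).array();
    }

    // Reads 4 bytes and masks the result so it is never negative.
    static long fromBytes(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt() & 0xFFFFFFFFL;
    }
}
```

The cast to int keeps the bit pattern, so values above Integer.MAX_VALUE survive the round trip.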
You can code a loop where you divide the "long" by 256 and take the remainder, which gives you the "Least Significant Byte" ...
(depending on whether you want little-endian or big-endian you can loop forwards or backwards)
long l = (3* 256 * 256 * 256 + 1 * 256 *256 + 4 * 256 + 8);
private byte[] convertLongToByteArray(long l) {
byte[] b = new byte[4];
if(java.nio.ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN){
for (int i=0; i<4; i++) {
b[i] = (byte)(l % 256) ;
l = l / 256;
}
}else{
for (int i=3; i>=0; i--) {
b[i] = (byte)(l % 256) ;
l = l / 256;
}
}
return b;
}
I am trying to convert a HEX-sequence to a String encoded in either ISO-8859-1, UTF-8 or UTF-16BE. That is, I have a String looking like: "0422043504410442", which represents the characters "Тест" (the Russian word for "test") in UTF-16BE.
The code I used to convert between the two formats was:
private static String hex2String(String hex, String encoding) throws UnsupportedEncodingException {
char[] hexArray = hex.toCharArray();
int length = hex.length() / 2;
byte[] rawData = new byte[length];
for(int i=0; i<length; i++){
int high = Character.digit(hexArray[i*2], 16);
int low = Character.digit(hexArray[i*2+1], 16);
int value = (high << 4) | low;
if( value > 127)
value -= 256;
rawData[i] = (byte) value;
}
return new String(rawData, encoding);
}
This seems to work fine for me, but I still have two questions regarding this:
Is there any simpler way (preferably without bit-handling) to do this conversion?
How am I to interpret the line: int value = (high << 4) | low;?
I am familiar with the basics of bit-handling, though not at all with the Java syntax. I believe the first part shift all bits to the left by 4 steps. Though the rest I don't understand and why it would be helpful in this certain situation.
I apologize for any confusion in my question, please let me know if I should clarify anything.
Thank you.
//Abeansits
Is there any simpler way (preferably without bit-handling) to do this conversion?
None that I know of. The only simplification seems to be parsing the whole byte at once rather than digit by digit (e.g. using int value = Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);)
public static byte[] hexToBytes(final String hex) {
final byte[] bytes = new byte[hex.length() / 2];
for (int i = 0; i < bytes.length; i++) {
bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
}
return bytes;
}
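For completeness, here is how the byte conversion and the decoding step might be combined, using the constants from java.nio.charset.StandardCharsets instead of a charset name string. The class name HexText is made up; the sample input is the one from the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: hex string to text in a given charset.
final class HexText {
    static String decode(String hex, Charset charset) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Parse two hex digits at a time; the cast keeps the low 8 bits.
            bytes[i] = (byte) Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
        }
        return new String(bytes, charset);
    }

    // Example input from the question, decoded as UTF-16BE.
    static String example() {
        return decode("0422043504410442", StandardCharsets.UTF_16BE);
    }
}
```

Passing a Charset rather than a name also avoids the checked UnsupportedEncodingException.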
How am I to interpret the line: int value = (high << 4) | low;?
look at this example for your last two digits (42):
int high = 4; // binary 0100
int low = 2; // binary 0010
int value = (high << 4) | low;
int value = (0100 << 4) | 0010; // shift 4 to left
int value = 01000000 | 0010; // bitwise or
int value = 01000010;
int value = 66; // 01000010 == 0x42 == 66
You can replace the << and | in this case with * and +, but I don't recommend it.
The expression
int value = (high << 4) | low;
is equivalent to
int value = high * 16 + low;
The subtraction of 256 to get a value between -128 and 127 is unnecessary. Simply casting, for example, 128 to a byte will produce the correct result. The lowest 8 bits of the int 128 have the same pattern as the byte -128: 0x80.
I'd write it simply as:
rawData[i] = (byte) ((high << 4) | low);
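If you want to convince yourself that the subtraction really is redundant, a quick check over all 256 possible values is enough. This is just a sketch with invented names:

```java
// Hypothetical check: "subtract 256 if above 127" versus a plain narrowing cast.
final class ByteCast {
    static byte viaSubtraction(int value) {
        if (value > 127) {
            value -= 256;
        }
        return (byte) value;
    }

    static byte viaCast(int value) {
        return (byte) value;   // keeps only the lowest 8 bits
    }

    // True if both variants agree for every value a pair of hex digits can produce.
    static boolean agreeForAllByteValues() {
        for (int v = 0; v < 256; v++) {
            if (viaSubtraction(v) != viaCast(v)) {
                return false;
            }
        }
        return true;
    }
}
```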
Is there any simpler way (preferably without bit-handling) to do this conversion?
You can use the Hex class in Apache commons, but internally, it will do the same thing, perhaps with minor differences.
How am I to interpret the line: int value = (high << 4) | low;?
This combines two hex digits, each of which represents 4 bits, into one unsigned 8-bit value stored as an int. The next two lines convert this to a signed Java byte.
I'm getting a slight distortion (sounds like buzzing) in the background when I run the following code. Because of its subtle nature, it makes me believe there is some sort of aliasing going on with the byte casting.
AudioFormat = PCM_SIGNED 44100.0 Hz, 16 bit, stereo, 4 bytes/frame, big-endian
Note: code assumes (for now) that the data is in big endian.
public static void playFreq(AudioFormat audioFormat, double frequency, SourceDataLine sourceDataLine)
{
System.out.println(audioFormat);
double sampleRate = audioFormat.getSampleRate();
int sampleSizeInBytes = audioFormat.getSampleSizeInBits() / 8;
int channels = audioFormat.getChannels();
byte audioBuffer[] = new byte[(int)Math.pow(2.0, 19.0) * channels * sampleSizeInBytes];
for ( int i = 0; i < audioBuffer.length; i+=sampleSizeInBytes*channels )
{
int wave = (int) (127.0 * Math.sin( 2.0 * Math.PI * frequency * i / (sampleRate * sampleSizeInBytes * channels) ) );
//wave = (wave > 0 ? 127 : -127);
if ( channels == 1 )
{
if ( sampleSizeInBytes == 1 )
{
audioBuffer[i] = (byte) (wave);
}
else if ( sampleSizeInBytes == 2 )
{
audioBuffer[i] = (byte) (wave);
audioBuffer[i+1] = (byte)(wave >>> 8);
}
}
else if ( channels == 2 )
{
if ( sampleSizeInBytes == 1 )
{
audioBuffer[i] = (byte) (wave);
audioBuffer[i+1] = (byte) (wave);
}
else if ( sampleSizeInBytes == 2 )
{
audioBuffer[i] = (byte) (wave);
audioBuffer[i+1] = (byte)(wave >>> 8);
audioBuffer[i+2] = (byte) (wave);
audioBuffer[i+3] = (byte)(wave >>> 8);
}
}
}
sourceDataLine.write(audioBuffer, 0, audioBuffer.length);
}
Your comments say that the code assumes big-endian.
Technically you're actually outputting in little-endian, however it doesn't seem to matter because through a lucky quirk your most significant byte is always 0.
EDIT: to explain that further - when your value is at its maximum value of 127, you should be writing (0x00, 0x7f), but the actual output from your code is (0x7f, 0x00) which is 32512. This happens to be near the proper 16 bit maximum value of 32767, but with the bottom 8 bits all zero. It would be better to always use 32767 as the maximum value, and then discard the bottom 8 bits if required.
This means that even though you're outputting 16-bit data, the effective resolution is only 8 bit. This seems to account for the lack of sound quality.
I've made a version of your code that just dumps the raw data to a file, and can't see anything otherwise wrong with the bit shifting itself. There's no unexpected changes of sign or missing bits, but there is a buzz consistent with 8 bit sample quality.
Also, for what it's worth your math will be easier if you calculate the wave equation based on sample counts, and then worry about byte offsets separately:
int samples = 1 << 19; // 2^19 sample frames, matching Math.pow(2.0, 19.0) in the question
byte audioBuffer[] = new byte[samples * channels * sampleSizeInBytes];
for ( int i = 0, j = 0; i < samples; ++i )
{
int wave = (int)(32767.0 * Math.sin(2.0 * Math.PI * frequency * i / sampleRate));
byte msb = (byte)(wave >>> 8);
byte lsb = (byte) wave;
for (int c = 0; c < channels; ++c) {
audioBuffer[j++] = msb;
if (sampleSizeInBytes > 1) {
audioBuffer[j++] = lsb;
}
}
}
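If you'd rather not do the byte splitting by hand, ByteBuffer.putShort can write the 16-bit samples in whichever byte order the AudioFormat reports. The following is only a sketch for the 16-bit case; the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical generator for 16-bit PCM sine data.
final class SineBuffer {
    static byte[] sine16(double frequency, double sampleRate,
                         int samples, int channels, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(samples * channels * 2).order(order);
        for (int i = 0; i < samples; i++) {
            // Full 16-bit amplitude, computed per sample index rather than byte offset.
            short s = (short) Math.round(
                    32767.0 * Math.sin(2.0 * Math.PI * frequency * i / sampleRate));
            for (int c = 0; c < channels; c++) {
                buf.putShort(s);   // same sample on every channel
            }
        }
        return buf.array();
    }
}
```

For the format in the question you would pass ByteOrder.BIG_ENDIAN; audioFormat.isBigEndian() tells you which one to use.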
I assume you are calling this code repeatedly to play a long sound.
Is there a chance that the wave you are generating is not getting to complete a full period before it is written?
If the wave gets "cut-off" before it completes a full period and then the next wave is written to the output, you will certainly hear something strange and I assume that may be what is causing the buzzing.
For example:
/-------\ /-------\ /-------\
-----/ \ -----/ \ -----/ \
\ \ \
\----- \----- \-----
Notice the disconnect between parts of this wave. That might be causing the buzzing.
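One way to avoid such a disconnect, assuming the buffers really are generated one call at a time, is to remember the sample position between calls so that each buffer continues where the previous one stopped. A minimal sketch (the class is hypothetical):

```java
// Hypothetical generator that keeps its position between calls.
final class SineGenerator {
    private final double frequency;
    private final double sampleRate;
    private long sampleIndex = 0;   // total samples produced so far

    SineGenerator(double frequency, double sampleRate) {
        this.frequency = frequency;
        this.sampleRate = sampleRate;
    }

    // Returns the next 'count' samples, continuing the wave from the last call.
    short[] next(int count) {
        short[] out = new short[count];
        for (int i = 0; i < count; i++) {
            double t = sampleIndex / sampleRate;
            out[i] = (short) Math.round(32767.0 * Math.sin(2.0 * Math.PI * frequency * t));
            sampleIndex++;
        }
        return out;
    }
}
```

Two consecutive calls then produce exactly the same samples as one call of twice the length, so there is no jump at the buffer boundary.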