Understanding bytes from wav audio data - java

I am new to handling audio. I am trying to understand how a WAV audio file works. I read the bytes with Java code and then plot the first 1500 samples in an Excel file. This is the image from Audacity:
AudacityImage
And this is the representation in Excel:
ExcelImage
I can see the wave but I don't know what the peaks mixed with the original signal are. Can someone explain this to me please?

You may still need one more important step for the Excel image to be meaningful. If the wav data uses 16-bit, 24-bit, or 32-bit encoding, the bytes need to be assembled into PCM sample values before plotting. 8-bit values (-128 to 127) are not used so often any more for encoding waveforms. Shorts (16-bit, -32768 to 32767) give much better fidelity. Check the format to see the encoding, the byte order (either big-endian or little-endian), and the number of channels (mono or stereo), then assemble your PCM values accordingly, and I bet you will get the desired result.
The tutorial provided by Oracle, Overview of the Sampled Package, goes over basic concepts and tools that will be helpful in the Java context.
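As a rough illustration of the assembly step, here is a minimal sketch. It assumes the data is 16-bit signed little-endian PCM (the common case for WAV, but check your file's format) and that the header has already been skipped, for example by reading through an AudioInputStream. The class and method names are made up for this example.

```java
final class PcmAssembler {
    /** Combines little-endian byte pairs into signed 16-bit samples. */
    static short[] toSamples16LE(byte[] data) {
        short[] samples = new short[data.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = data[2 * i] & 0xFF;  // low byte comes first; mask so it stays unsigned
            int hi = data[2 * i + 1];     // high byte carries the sign
            samples[i] = (short) ((hi << 8) | lo);
        }
        return samples;
    }

    /** Picks one channel out of interleaved frames (L, R, L, R, ... for stereo). */
    static short[] channel(short[] interleaved, int channels, int index) {
        short[] out = new short[interleaved.length / channels];
        for (int i = 0; i < out.length; i++) {
            out[i] = interleaved[i * channels + index];
        }
        return out;
    }
}
```

If you plot the raw bytes instead, the low-order bytes show up as what look like random spikes mixed into the wave, which may be what the Excel chart is showing.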

Related

Generate Image using Text

I visited this website,
https://xcode.darkbyte.ru/
Basically, the website takes text as input and generates an image.
It also takes an image as input and decodes it back to text.
I would really like to know what this is called and how it is done.
I'd like to know the algorithm [preferably in Java].
Please help. Thanks in advance.
There are many ways to encode a text (series of bytes) as an image, but the site you quoted does it in a pretty simple and straightforward way. And you can reverse-engineer it easily:
Up to 3 chars are coded as 1 pixel; 4 chars as 2 pixels -- we learn from this that only R(ed), G(reen) and B(lue) channels for each pixel are used (and not alpha/transparency channel).
We know PNG supports 8 bits per channel, and each ASCII char is 8 bits wide. Let's test whether the first char (first 8 bits) is stored in the red channel.
Let's try "z..". Since z is relatively high in the ASCII table (122) and . is relatively low (46), we expect to get a reddish 1x1 PNG. And we do.
Let's try ".z.". It should be greenish. And it is.
Similarly for ..z we get a bluish pixel.
Now let's see what happens with a non-ASCII input. Try entering: ① (unicode char \u2460). The site html-encodes the string into &#9312; and then encodes that ASCII text into the image as before.
Compression. When entering a larger amount of text, we notice the output is shorter than expected. It means the back-end is running some compression algorithm on the raw input before (or after?) encoding it as an image. By noticing that the resolution of the image, and hence its maximum information content (HxWx3x8 bits), is smaller than the input, we can conclude the compression is done before encoding to the image, and not after (thus not relying on PNG compression). We could go further in detecting which compression algorithm is used by encoding the raw input with the common culprits like Huffman coding, Lempel-Ziv, LZW, DEFLATE, even Brotli, and comparing the output with the bytes from the image pixels. (Note we can't detect it directly by inspecting a magic prefix, chances being the author stripped anything but the raw compressed data.)
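The uncompressed part of the scheme can be sketched in Java as below. This is only an illustration of packing three ASCII chars into the R, G and B channels of each pixel. The single-row layout, the zero padding of the last pixel, and the class name are my own assumptions, and the compression step is left out, so it will not reproduce the site's images for longer inputs.

```java
import java.awt.image.BufferedImage;
import java.nio.charset.StandardCharsets;

final class TextPixels {
    /** Packs ASCII text into a 1-pixel-high image, three chars per pixel (R, G, B). */
    static BufferedImage encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        int pixels = Math.max(1, (bytes.length + 2) / 3);
        BufferedImage img = new BufferedImage(pixels, 1, BufferedImage.TYPE_INT_RGB);
        for (int p = 0; p < pixels; p++) {
            int rgb = 0;
            for (int c = 0; c < 3; c++) {
                int i = 3 * p + c;
                int value = i < bytes.length ? (bytes[i] & 0xFF) : 0; // pad with zero (assumption)
                rgb = (rgb << 8) | value;
            }
            img.setRGB(p, 0, rgb);
        }
        return img;
    }

    /** Reads the channels back in R, G, B order, skipping zero padding. */
    static String decode(BufferedImage img) {
        StringBuilder sb = new StringBuilder();
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                for (int shift = 16; shift >= 0; shift -= 8) {
                    int value = (rgb >> shift) & 0xFF;
                    if (value != 0) sb.append((char) value);
                }
            }
        }
        return sb.toString();
    }
}
```

With this sketch, the input "z.." gives one pixel with R=122, G=46, B=46, which matches the reddish pixel observed above.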

Which files have good compression ratio using textbook's Huffman coding algorithm?

I am testing Huffman coding, and I want to know which types of files (like .txt, .jpg, .mp3, etc.) compress well under Huffman-based compression. I implemented Huffman coding in Java and found that I get about a 40% size reduction for .txt files (ordinary English text) and about 0% to 1% reduction on .jpg, .mp3, and .mp4 files (of course I haven't tested it on large files above 1 MB, because my program is super slow). I understand that Huffman coding works best for files in which some symbols occur much more frequently than others; however, I do not know what kind of symbols are in a video, audio, or image file, hence the question. Since I designed this program myself (it was for a school project, I will not deny it, and I am only asking for a few pointers for my research), I wanted to know where it would work well.
Thanks.
Note: I initially created this project only for .txt files and to my wonder, it was working on all other types of files as well, hence I wanted to test it and thereby I had to ask this question. I found out that for image files, you don't encode the symbols themselves, but rather some RGB values? Correct me if I am wrong.
It's all about the amount of redundancy in the file.
In any file each byte occupies 8 bits, allowing 256 distinct symbols per byte. In a text file, a relatively small number of those symbols are actually used, and the distribution of the symbols is not flat (there are more es than qs). Thus the information "density" is more like 5 bits per byte.
JPEGs, MP3 and MP4 are already compressed and have almost no redundancy. All 256 symbols are used, with about equal frequency, so the information "density" is very close to 8 bits per byte. You cannot compress it further.
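One way to check this on your own files is to measure the order-0 entropy of the bytes, which is the "density" figure used above. The following is a small sketch (the class and method names are just for illustration). A byte-wise Huffman coder cannot get below this many bits per byte on average.

```java
final class ByteEntropy {
    /** Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0). */
    static double bitsPerByte(byte[] data) {
        if (data.length == 0) {
            return 0.0;
        }
        long[] counts = new long[256];
        for (byte b : data) {
            counts[b & 0xFF]++;
        }
        double entropy = 0.0;
        for (long count : counts) {
            if (count == 0) {
                continue; // symbols that never occur contribute nothing
            }
            double p = (double) count / data.length;
            entropy -= p * (Math.log(p) / Math.log(2));
        }
        return entropy;
    }
}
```

You could feed it the result of java.nio.file.Files.readAllBytes for a .txt and a .jpg and compare: a value near 8 means there is almost nothing left for Huffman coding to squeeze out.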

visualize microphone audio in Java

Is there a way to visualize audio in Java in a kind of wave?
How should I start? I have already set up microphone selection and a Thread to read the bytes from the TargetDataLine into a buffer.
But what should I do now?
Any help would be appreciated.
If you are using the Java Sound API, the data that you have read is 8-bit or 16-bit PCM. If it is 8-bit then it is fine; otherwise you may need to adjust for the endianness.
If you are reading 8-bit PCM, each byte is a sample, and the value of that byte is the sound sample. If you are reading 16-bit PCM, then the samples are packed either as hi,lo,hi,lo or lo,hi,lo,hi (where hi and lo are the high and low order bytes) depending on endianness. In that case you should convert each pair to a short value.
For plotting you will need a 3rd party library, such as JFreeChart or jahuwaldt.plot. (I used the latter in a real-time wave visualization program.)
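For the 16-bit case, the conversion and a simple reduction for drawing could look like the sketch below. It assumes 16-bit signed mono PCM. The bigEndian flag would come from AudioFormat.isBigEndian(), and the class name and the peak-per-column reduction are my own choices for illustration, not something the plotting libraries require.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class WaveformData {
    /** Converts the first `length` bytes of a buffer into 16-bit samples. */
    static short[] toShorts(byte[] buffer, int length, boolean bigEndian) {
        ByteBuffer bb = ByteBuffer.wrap(buffer, 0, length - (length % 2))
                .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
        short[] samples = new short[bb.remaining() / 2];
        bb.asShortBuffer().get(samples);
        return samples;
    }

    /** Peak absolute amplitude (0..1) for each column of a plot `columns` wide. */
    static float[] peaks(short[] samples, int columns) {
        float[] out = new float[columns];
        for (int i = 0; i < samples.length; i++) {
            int col = (int) ((long) i * columns / samples.length);
            float amplitude = Math.abs(samples[i] / 32768f);
            if (amplitude > out[col]) {
                out[col] = amplitude;
            }
        }
        return out;
    }
}
```

Each value returned by peaks can then be drawn as a vertical bar or handed to the plotting library as one data point per column.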

Create a wav with hidden binary data in it and read it (Java)

What I want to do is convert a text string into a WAV file using high frequencies (18500 Hz and above): this will be the encoder.
I also want to create an engine that decodes this text string from a WAV recording and supports error control, since I will obviously not be reading the same file but a recording of the sound.
Thanks
An important consideration will be whether or not you want to hide the string into an existing audio file (so it sounds like a normal file, but has an encoded message -- that is called steganography), or whether you will just be creating a file that sounds like gibberish, for the purpose of encoding data only. I'm assuming the latter since you didn't ask to hide a message in an existing file.
So I assume you are not looking for low-level details on writing WAV files (I am sure you can find documentation on how to read and write individual samples to a WAV file). Obviously, the simplest approach would be to simply take each byte of the source string, and store it as a sample in the WAV file (assuming an 8-bit recording. If it's a 16-bit recording, you can store two bytes per sample. If it's a stereo 16-bit recording, you can store four bytes per sample). Then you can just read the WAV file back in and read the samples back as bytes. That's the simple approach but as you say, you want to be able to make a (presumably analog) recording of the sound, and then read it back into a WAV file, and still be able to read the data.
With the approach above, if the analog recording is not exactly perfect (and how could it be), you would lose bytes of the message. This means you need to store the message in such a way that missing bytes, or bytes that have a slight error, are not going to be a problem. How you do this will depend highly upon exactly what sort of "damage" will be happening to the sound file. I would expect two major forms of damage:
"Vertical" damage: A sample (byte) would have a slightly higher or lower value than it originally had.
"Horizontal" damage: Samples may be averaged, stretched or squashed horizontally. From a byte perspective, this means some samples may be repeated, while others may be missing.
To combat this, you need some redundancy in the message. More redundancy means the message will take up more space (be longer), but will be more reliable.
I would recommend thinking about how old (pre-mobile) touch-tone telephone dialing worked: each key generated a unique tone (in the real system, a unique pair of frequencies) and sent it across the wire. The tones are long enough, and far enough apart pitch-wise, that they can be distinguished even given the above forms of damage. So, choose two parameters: a) length and b) frequency-delta. For each byte of data, select a frequency, spacing the 256 byte values frequency-delta Hertz apart. Then, generate a sine wave of that frequency for length milliseconds. This encodes a lot more redundancy than the above one-byte-per-sample approach, since each byte takes up many samples, and if you lose some samples, it doesn't matter.
When you read them back in, read every length milliseconds of audio data and then estimate the frequency of the sine wave. Map this onto the byte value with the nearest frequency.
Obviously, longer values of length and further-apart frequency-delta will make the signal more reliable, but require the sound to be longer and higher-frequency, respectively. So you will have to play around with these values to see what works.
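A minimal sketch of this tone scheme is below. The 44100 Hz sample rate, the base frequency parameter, and the brute-force correlation against all 256 candidate frequencies are my assumptions for illustration. A real decoder would also need synchronization and error correction, and would probably use an FFT or the Goertzel algorithm instead.

```java
final class ToneCodec {
    static final int SAMPLE_RATE = 44100; // assumed sample rate

    /** One sine burst of lengthMs per byte, at baseHz + value * deltaHz. */
    static double[] encode(byte[] data, double baseHz, double deltaHz, int lengthMs) {
        int n = SAMPLE_RATE * lengthMs / 1000;
        double[] out = new double[data.length * n];
        for (int b = 0; b < data.length; b++) {
            double hz = baseHz + (data[b] & 0xFF) * deltaHz;
            for (int i = 0; i < n; i++) {
                out[b * n + i] = Math.sin(2 * Math.PI * hz * i / SAMPLE_RATE);
            }
        }
        return out;
    }

    /** For each burst, picks the byte value whose frequency correlates best. */
    static byte[] decode(double[] signal, double baseHz, double deltaHz, int lengthMs) {
        int n = SAMPLE_RATE * lengthMs / 1000;
        byte[] out = new byte[signal.length / n];
        for (int b = 0; b < out.length; b++) {
            int best = 0;
            double bestPower = -1;
            for (int value = 0; value < 256; value++) {
                double p = power(signal, b * n, n, baseHz + value * deltaHz);
                if (p > bestPower) {
                    bestPower = p;
                    best = value;
                }
            }
            out[b] = (byte) best;
        }
        return out;
    }

    /** Squared magnitude of the correlation with a sinusoid at hz. */
    static double power(double[] s, int from, int n, double hz) {
        double re = 0, im = 0;
        for (int i = 0; i < n; i++) {
            double angle = 2 * Math.PI * hz * i / SAMPLE_RATE;
            re += s[from + i] * Math.cos(angle);
            im += s[from + i] * Math.sin(angle);
        }
        return re * re + im * im;
    }
}
```

Note that at 44100 Hz the band above 18500 Hz ends at 22050 Hz, so 256 distinct tones would have to sit less than 14 Hz apart, which calls for fairly long tones. Encoding fewer bits per tone (say 4 bits, 16 frequencies) is one way to loosen that.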
Some last thoughts, since your title says "hidden" binary data:
If you really want the data to be "hidden", consider encrypting it before encoding it to audio.
If you want to take the steganography approach, you will have to read up on audio steganography (I imagine you can use the above techniques, but you will have to insert them as extremely low-volume signals on top of the existing sound).

How to write a TIFF from a 2D array of floats in Java?

In a Java program I have a 1024 x 1024 array of floats. How can I write a TIFF file corresponding to the image represented by this array?
Clarifications:
I am asking for a code snippet illustrating how to write a TIFF corresponding to the array of floats.
I'm looking for a grayscale image.
I know how to convert the 1024 x 1024 array of floats into any other 1024 x 1024 array of numerical values; e.g. if the method you have in mind requires, say, 1024 x 1024 floats in the range [0, 1.0), no problem, I know how to convert my data so that this constraint holds.
Thanks!
kjo
The problem that you will have is that, while it is possible to have floating point values for pixel data in TIFF, this is not part of the baseline specification. TIFF is a mushy enough spec to allow floating point samples, but not to standardize their semantic meaning. For example, I had a customer who had floating point samples generated by a Java app (using ImageJ, I believe) and expected us to read them correctly. ImageJ had put in a badly serialized hashtable into one of the description strings so I had to give them code that would work for that sample file but probably for no others. Don't be that Java app. And if you're going to use ImageJ to write floating point TIFFs, normalize your data between 0 and 1, because then I can guarantee that at least my tools will read it correctly without depending on semantic meaning.
While the baseline spec says that 16-bit-per-channel samples aren't part of the baseline, they are more likely to be recognized by current TIFF consumers. So you might be happier in the long run writing grayscale with 16-bit samples in the range 0..65535, if you're hell-bent on writing TIFF.
If you think that you're going to write a non-compliant TIFF, just write your own file format and publish the spec and the reading and writing code. If you shoe-horn it into TIFF, you are creating a new format anyway and you will break most TIFF consuming applications as a side-effect. Which is better for the ecosystem?
Remember, when you write a bad TIFF, an angel gets set on fire.
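For the 16-bit grayscale route, a sketch using only the standard library might look like this. It assumes the floats are already normalized to [0, 1] (values outside are clamped) and that you are on Java 9 or later, where ImageIO ships with a TIFF writer. On older runtimes the write call reports that no writer is available unless a plugin such as JAI Image I/O is installed. The class name is made up for the example.

```java
import java.awt.image.BufferedImage;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class GrayTiff {
    /** Maps floats in [0, 1] to 16-bit gray samples in 0..65535. */
    static BufferedImage toGray16(float[][] data) {
        int height = data.length;
        int width = data[0].length;
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_USHORT_GRAY);
        WritableRaster raster = img.getRaster();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                float v = Math.max(0f, Math.min(1f, data[y][x])); // clamp out-of-range input
                raster.setSample(x, y, 0, Math.round(v * 65535f));
            }
        }
        return img;
    }

    /** Writes the image as TIFF; fails loudly if the runtime has no TIFF writer. */
    static void write(float[][] data, File file) throws IOException {
        if (!ImageIO.write(toGray16(data), "tiff", file)) {
            throw new IOException("No TIFF writer available in this runtime");
        }
    }
}
```

For the 1024 x 1024 array in the question, the call would be something like GrayTiff.write(array, new File("out.tif")).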
As far as I understand, JAI (Java Advanced Imaging) can write TIFF files.
The canonical standard for handling TIFF images is the libtiff library, which is written in C.
It is possible to call native C code from Java.
