I visited this website,
https://xcode.darkbyte.ru/
Basically, the website takes text as input and generates an image.
It also takes an image as input and decodes it back to text.
I'd really like to know what this is called and how it is done.
I'd like to know the algorithm [preferably in Java].
Please help, thanks in advance.
There are many ways to encode a text (series of bytes) as an image, but the site you quoted does it in a pretty simple and straightforward way. And you can reverse-engineer it easily:
Up to 3 chars are coded as 1 pixel; 4 chars as 2 pixels -- we learn from this that only the R(ed), G(reen) and B(lue) channels of each pixel are used (and not the alpha/transparency channel).
We know PNG supports 8 bits per channel, and each ASCII char is 8 bits wide. Let's test whether the first char (first 8 bits) is stored in the red channel.
Let's try z.. (z followed by two dots). Since z is relatively high in the ASCII table (122) and . is relatively low (46), we expect to get a reddish 1x1 PNG. And we do.
Let's try .z. next; it should be greenish. And it is.
Similarly, for ..z we get a bluish pixel.
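Putting the per-pixel layout together, here is a minimal sketch in Java of the packing scheme inferred so far, assuming plain ASCII input and no compression; the class and helper names are my own for illustration and are not taken from the site:

import java.awt.image.BufferedImage;
import java.nio.charset.StandardCharsets;

// Minimal sketch: pack up to 3 ASCII chars into one pixel (R, G, B), no compression.
public class TextPixelCodec {

    static BufferedImage encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        int pixels = Math.max(1, (bytes.length + 2) / 3);      // 3 bytes per pixel, rounded up
        BufferedImage img = new BufferedImage(pixels, 1, BufferedImage.TYPE_INT_RGB);
        for (int p = 0; p < pixels; p++) {
            int r = byteAt(bytes, 3 * p);
            int g = byteAt(bytes, 3 * p + 1);
            int b = byteAt(bytes, 3 * p + 2);
            img.setRGB(p, 0, (r << 16) | (g << 8) | b);        // first char -> red, second -> green, third -> blue
        }
        return img;
    }

    static String decode(BufferedImage img) {
        StringBuilder sb = new StringBuilder();
        for (int p = 0; p < img.getWidth(); p++) {
            int rgb = img.getRGB(p, 0);
            appendIfNonZero(sb, (rgb >> 16) & 0xFF);
            appendIfNonZero(sb, (rgb >> 8) & 0xFF);
            appendIfNonZero(sb, rgb & 0xFF);
        }
        return sb.toString();
    }

    private static int byteAt(byte[] b, int i) { return i < b.length ? b[i] & 0xFF : 0; }

    private static void appendIfNonZero(StringBuilder sb, int v) {
        if (v != 0) sb.append((char) v);                       // 0 is used as padding in this sketch
    }
}

Since the real site also compresses the input (see below), this sketch will only reproduce its output for very short, uncompressed strings.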
Now let's see what happens with non-ASCII input. Try entering: ① (Unicode char \u2460). The site HTML-encodes the string into the entity &#9312; and then encodes that ASCII text into the image as before.
Compression. When entering a larger amount of text, we notice the output is shorter than expected. It means the back-end is running some compression algorithm on the raw input before (or after?) encoding it as an image. By noting that the image's maximum information content (H×W×3×8 bits) is smaller than the input, we can conclude the compression is done before encoding to the image, and not after (thus not relying on PNG compression). We could go further and detect which compression algorithm is used by encoding the raw input with the common culprits like Huffman coding, Lempel-Ziv, LZW, DEFLATE, even Brotli, and comparing the output with the bytes from the image pixels. (Note we can't detect it directly by inspecting a magic prefix; chances are the author stripped everything but the raw compressed data.)
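As an illustration of that comparison step, here is a minimal sketch for one candidate (raw DEFLATE via java.util.zip). The method name and the assumption that the pixel bytes may carry trailing padding are mine; in practice you would repeat this for each candidate algorithm and each compression level:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical check: compress the raw input with one candidate (raw DEFLATE here)
// and see whether the result matches the byte stream recovered from the image pixels.
static boolean looksLikeRawDeflate(byte[] rawInput, byte[] pixelBytes) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    Deflater raw = new Deflater(Deflater.DEFAULT_COMPRESSION, true);  // true = no zlib header/trailer
    try (DeflaterOutputStream dos = new DeflaterOutputStream(baos, raw)) {
        dos.write(rawInput);
    }
    byte[] candidate = baos.toByteArray();
    // Ignore any zero padding the image may carry after the payload.
    return Arrays.equals(candidate, Arrays.copyOf(pixelBytes, candidate.length));
}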
Related
I am new to handling audio. I am trying to understand how the audio WAV file works. I read the bytes with Java code and then plot the first 1500 samples in an Excel file. This is the image from Audacity:
[Audacity waveform screenshot]
And this is the representation in Excel:
[Excel chart of the raw bytes]
I can see the wave but I don't know what the peaks mixed with the original signal are. Can someone explain this to me please?
You may yet need to do another important step in order for the Excel image to be meaningful. If the WAV data uses 16-bit, 24-bit or 32-bit encoding, the bytes need to be assembled into PCM values. 8-bit values (+/- 128) are not used so often any more for encoding waveforms; shorts (16-bit, +/- 32767) give much better fidelity. Check the format to see the encoding, the byte order (it may be either big-endian or little-endian) and the number of tracks (mono or stereo), assemble your PCM values accordingly, and I bet you will get the desired result.
The tutorial provided by Oracle, Overview of the Sampled Package, goes over basic concepts and tools that will be helpful in the Java context.
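As an illustration of that assembly step, here is a minimal sketch assuming 16-bit, little-endian, mono data; the method name is mine, and you should check your file's actual format before relying on it:

// Minimal sketch: turn raw WAV body bytes into signed 16-bit PCM samples.
// Assumes 16-bit, little-endian, mono data; check the file's actual format first.
static short[] toPcm16LittleEndian(byte[] audioBytes) {
    short[] samples = new short[audioBytes.length / 2];
    for (int i = 0; i < samples.length; i++) {
        int lo = audioBytes[2 * i] & 0xFF;       // low byte comes first (little-endian)
        int hi = audioBytes[2 * i + 1];          // high byte carries the sign
        samples[i] = (short) ((hi << 8) | lo);
    }
    return samples;
}

Plotting these short values instead of the raw bytes should give a curve that matches the Audacity view.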
This is one of those "pretty sure we found the answer, but hoping we're wrong" questions. We are looking at a steganography problem and it's not pretty.
Situation:
We have a series of images. We want to mark them (watermark) so the watermarks survive a series of conditions. The kicker is, we are using a lossy format, JPG, rather than a lossless one such as PNG. Our watermarks need to survive screenshotting and, furthermore, need to be invisible to the naked eye. Finally, they need to contain at least 32 bytes of data (we expect them to be repeating patterns across an image, of course). Due to the above, we need to hide the information in the pixels themselves. I am trying a Least Significant Bit change, including using large blocks per "bit" (I tried both increments of 16, as these are the JPG compression algorithm's chunk sizes from what we understand, as well as various prime numbers) and reading the average of the resulting block. This sort of leads to requirements:
Must be .jpg
Must survive the jpg compression algorithm
Must survive screenshotting (assume screenshots are saved losslessly)
Problem:
JPG compression, even at 100% "minimum loss", changes the pixel values. E.g., if we draw a huge band across an image setting the red channel to 255 in a block 64 pixels high, more than half of those pixels are not 255 in the compressed image. This means that even using an average of the blocks leaves the LSB effectively random, rather than what we "encoded". Our current prototype can take a random image, compress the message into a bit-encoded string and convert it to an XbyX array, which is then superimposed on the image using the LSB of one of the three color channels. This works and is detectable while it remains a BufferedImage, but once we convert to a JPG the compression destroys the message.
Question:
Is there a way to better control a JPG compression's pixel values? Or are we simply SOOL here and need to drop this avenue, either shifting to PNG output (unlikely) or needing to understand the JPG compression algorithm at length and use it to somehow determine LSB pattern outcomes? Preferably Java, but we are open to alternative languages if any can solve our problem (our current PoC is in Java).
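For reference, here is a minimal reconstruction of the block-averaged LSB scheme described in the question; it is my own sketch (with an assumed block size and blue-channel embedding), not the asker's prototype, and as noted above the averaged LSB does not reliably survive JPG re-encoding:

import java.awt.image.BufferedImage;

// Reconstruction of the block-LSB idea described above: each message bit is written
// into the blue-channel LSB across a whole block, and read back by averaging the block.
// Assumes the image dimensions are multiples of BLOCK.
public class BlockLsb {
    static final int BLOCK = 16; // assumed block size, roughly aligned with JPG's block structure

    static void embedBit(BufferedImage img, int blockX, int blockY, int bit) {
        for (int y = blockY * BLOCK; y < (blockY + 1) * BLOCK; y++) {
            for (int x = blockX * BLOCK; x < (blockX + 1) * BLOCK; x++) {
                int rgb = img.getRGB(x, y);
                int blue = ((rgb & 0xFF) & ~1) | bit;            // force the blue LSB to the message bit
                img.setRGB(x, y, (rgb & 0xFFFFFF00) | blue);
            }
        }
    }

    static int readBit(BufferedImage img, int blockX, int blockY) {
        long sum = 0;
        for (int y = blockY * BLOCK; y < (blockY + 1) * BLOCK; y++) {
            for (int x = blockX * BLOCK; x < (blockX + 1) * BLOCK; x++) {
                sum += img.getRGB(x, y) & 0xFF;
            }
        }
        long avg = sum / (BLOCK * BLOCK);
        return (int) (avg & 1);                                  // LSB of the block average
    }
}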
I've done some research toward writing a GIF (version 87a) encoder, but there are some implementation specifics I can't find, especially about the data blocks.
First, I can only tell how many bytes each block will have when I reach either 255 bytes or the end of the image, right? There's no way to know in advance how many bytes I will need.
Second, as GIF encoding is little-endian, how can I write the resulting integers of the LZW compression in Java, and how should I align them? I looked at ByteBuffer (I'm coding in Java), but as the integers won't necessarily fit in a byte it won't work, right? How should I do it, then? (See the bit-packing sketch after these questions.)
Lastly, between data blocks, should I start a new LZW compression or just continue where I left off in the previous block?
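Regarding the second point, here is a minimal sketch of GIF-style little-endian bit packing; it is my own illustration, not a complete encoder. LZW codes (3 to 12 bits wide in GIF) are appended least-significant-bit first into a byte stream, which is then split into sub-blocks of at most 255 bytes:

import java.io.ByteArrayOutputStream;

// Sketch: pack variable-width LZW codes LSB-first into bytes, as GIF expects.
class LzwBitPacker {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int bitBuffer = 0;   // pending bits, filled from the LSB side
    private int bitCount = 0;    // number of pending bits

    void writeCode(int code, int codeWidth) {
        bitBuffer |= code << bitCount;   // append the new code above the pending bits
        bitCount += codeWidth;
        while (bitCount >= 8) {          // flush full bytes, lowest bits first
            out.write(bitBuffer & 0xFF);
            bitBuffer >>>= 8;
            bitCount -= 8;
        }
    }

    byte[] finish() {
        if (bitCount > 0) out.write(bitBuffer & 0xFF); // flush the final partial byte
        return out.toByteArray();                      // split into <=255-byte sub-blocks afterwards
    }
}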
I am using GZIPOutputStream in my Java program to compress big strings, finally storing them in a database.
I can see that while compressing English text I am achieving a 1/4 to 1/10 compression ratio (depending on the string value). So, for example, if my original English text is 100 KB, then on average the compressed text will be somewhere around 30 KB.
But when I am compressing Unicode characters, the compressed string actually occupies more bytes than the original string. Say, for example, my original Unicode string is 100 KB; then the compressed version comes out to around 200 KB.
Unicode string example: "嗨,这是,短信计数测试持续for.Hi这是短"
Can anyone suggest how I can achieve compression for Unicode text as well, and why the compressed version is actually bigger than the original version?
My compression code in Java:
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPOutputStream;

ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream zos = new GZIPOutputStream(baos);
zos.write(text.getBytes("UTF-8")); // compress the UTF-8 bytes of the string
zos.finish();                      // flush the deflater and write the GZIP trailer
zos.flush();
byte[] udpBuffer = baos.toByteArray();
Java's GZIPOutputStream uses the Deflate compression algorithm to compress data. Deflate is a combination of LZ77 and Huffman coding. According to Unicode's Compression FAQ:
Q: What's wrong with using standard compression algorithms such as Huffman coding or patent-free variants of LZW?
A: SCSU bridges the gap between an 8-bit based LZW and a 16-bit encoded Unicode text, by removing the extra redundancy that is part of the encoding (sequences of every other byte being the same) and not a redundancy in the content. The output of SCSU should be sent to LZW for block compression where that is desired.
To get the same effect with one of the popular general purpose algorithms, like Huffman or any of the variants of Lempel-Ziv compression, it would have to be retargeted to 16-bit, losing effectiveness due to the larger alphabet size. It's relatively easy to work out the math for the Huffman case to show how many extra bits the compressed text would need just because the alphabet was larger. Similar effects exist for LZW. For a detailed discussion of general text compression issues see the book Text Compression by Bell, Cleary and Witten (Prentice Hall 1990).
I was able to find this set of Java classes for SCSU compression on the Unicode website, which may be useful to you. However, I couldn't find a .jar library that you could easily import into your project, though you can probably package these classes into one if you like.
I don't really know Chinese, but as far as I know GZIP compression depends on repeating sequences of text, and those repeating sequences are replaced with "descriptions" (this is a very high-level explanation). This means that if you have the word "library" in 20 places in a string, the algorithm will store the word "library" once on the side and then note that it should appear at places x, y, z... So, you might not have a lot of redundancy in your original string, and therefore you cannot save a lot. Instead, you get more overhead than savings.
I'm not really a compression expert, and I don't know the details, but this is the basic principle of the compression.
P.S.
This question might just be a duplicate of: Why gzip compressed buffer size is greater then uncompressed buffer?
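To see the overhead concretely, here is a small self-contained demo (mine, not from either post) that prints the UTF-8 size and the GZIP size of the question's sample string; for input this short, GZIP's header, trailer and Huffman tables outweigh any savings:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Demo: for a short, low-redundancy string, GZIP's fixed overhead dominates.
public class GzipOverheadDemo {
    public static void main(String[] args) throws IOException {
        String text = "嗨,这是,短信计数测试持续for.Hi这是短";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream zos = new GZIPOutputStream(baos)) {
            zos.write(utf8);
        }

        System.out.println("UTF-8 bytes:   " + utf8.length);
        System.out.println("GZIPped bytes: " + baos.size());  // expect this to be larger here
    }
}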
What I want to do is convert a text string into a WAV file using high frequencies (18,500 Hz and above): this will be the encoder.
Then I want to create an engine to decode the text string from a WAV-format recording, with error control, since obviously I will not read back the same file but a recording of that sound.
Thanks
An important consideration will be whether or not you want to hide the string in an existing audio file (so it sounds like a normal file but carries an encoded message -- that is called steganography), or whether you will just be creating a file that sounds like gibberish, for the purpose of encoding data only. I'm assuming the latter, since you didn't ask to hide a message in an existing file.
So I assume you are not looking for low-level details on writing WAV files (I am sure you can find documentation on how to read and write individual samples to a WAV file). Obviously, the simplest approach would be to take each byte of the source string and store it as a sample in the WAV file (assuming an 8-bit recording; a 16-bit recording could store two bytes per sample, and a stereo 16-bit recording four bytes per sample). Then you can just read the WAV file back in and read the samples back as bytes. That's the simple approach, but as you say, you want to be able to make a (presumably analog) recording of the sound, then read it back into a WAV file and still be able to recover the data.
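For completeness, here is a minimal sketch of that naive byte-per-sample approach using javax.sound.sampled; the file name, sample rate and message are arbitrary, and as discussed next, this will not survive an analog re-recording:

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Naive approach: one message byte per 8-bit sample, written straight into a WAV file.
public class BytesAsSamples {
    public static void main(String[] args) throws IOException {
        byte[] message = "hello, hidden world".getBytes(StandardCharsets.US_ASCII);
        AudioFormat format = new AudioFormat(8000f, 8, 1, false, false); // 8 kHz, 8-bit, mono, unsigned
        AudioInputStream ais = new AudioInputStream(
                new ByteArrayInputStream(message), format, message.length);
        AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File("message.wav"));
    }
}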
With the approach above, if the analog recording is not exactly perfect (and how could it be), you would lose bytes of the message. This means you need to store the message in such a way that missing bytes, or bytes that have a slight error, are not going to be a problem. How you do this will depend highly upon exactly what sort of "damage" will be happening to the sound file. I would expect two major forms of damage:
"Vertical" damage: A sample (byte) would have a slightly higher or lower value than it originally had.
"Horizontal" damage: Samples may be averaged, stretched or squashed horizontally. From a byte perspective, this means some samples may be repeated, while others may be missing.
To combat this, you need some redundancy in the message. More redundancy means the message will take up more space (be longer), but will be more reliable.
I would recommend thinking about how old (pre-mobile) telephone dial tones worked: each key generated a unique tone and sent it across the wire. The tones are long enough, and far enough apart pitch-wise that they can be distinguished even given the above forms of damage. So, choose two parameters: a) length and b) frequency-delta. For each byte of data, select a frequency, spacing the 256 byte values frequency-delta Hertz apart. Then, generate a sine wave for length milliseconds of that frequency. This encodes a lot more redundancy than the above one-byte-per-sample approach, since each byte takes up many samples, and if you lose some samples, it doesn't matter.
When you read them back in, read every length milliseconds of audio data and then estimate the frequency of the sine wave. Map this onto the byte value with the nearest frequency.
Obviously, longer values of length and further-apart frequency-delta will make the signal more reliable, but require the sound to be longer and higher-frequency, respectively. So you will have to play around with these values to see what works.
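Here is a minimal sketch of that byte-to-tone encoding. The base frequency, frequency-delta and length values are placeholders to tune by experiment; decoding would estimate the dominant frequency in each length-sized window (for example with an FFT or the Goertzel algorithm) and map it to the nearest byte value:

// Sketch of the byte-to-tone scheme above: each byte selects a frequency
// (BASE_FREQUENCY + value * FREQUENCY_DELTA) and is rendered as LENGTH_MS of sine wave.
// All three constants are assumed starting values, not prescriptions.
public class ToneEncoder {
    static final float SAMPLE_RATE = 44100f;
    static final double BASE_FREQUENCY = 1000.0;  // Hz, assumed
    static final double FREQUENCY_DELTA = 20.0;   // Hz between adjacent byte values, assumed
    static final int LENGTH_MS = 100;             // duration per byte, assumed

    static byte[] encode(byte[] message) {
        int samplesPerByte = (int) (SAMPLE_RATE * LENGTH_MS / 1000);
        byte[] pcm = new byte[message.length * samplesPerByte];       // 8-bit unsigned PCM
        for (int i = 0; i < message.length; i++) {
            double freq = BASE_FREQUENCY + (message[i] & 0xFF) * FREQUENCY_DELTA;
            for (int s = 0; s < samplesPerByte; s++) {
                double t = s / SAMPLE_RATE;
                double amplitude = Math.sin(2 * Math.PI * freq * t);          // -1..1
                pcm[i * samplesPerByte + s] = (byte) (128 + 127 * amplitude); // center at 128
            }
        }
        return pcm; // write this out as a WAV body, e.g. with the AudioSystem code above
    }
}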
Some last thoughts, since your title says "hidden" binary data:
If you really want the data to be "hidden", consider encrypting it before encoding it to audio.
If you want to take the steganography approach, you will have to read up on audio steganography (I imagine you can use the above techniques, but you will have to insert them as extremely low-volume signals on top of the existing sound).