Jpeg calculating max size - java

I have to say that I don't know much about how file formats work.
My question is: say I have a JPEG file that is 200 px by 200 px, how can one calculate the maximum size that file could be, in bytes or megabytes?
I think the reasoning that led to the question will help someone answer me. I have a Java applet that uploads images people draw with it to my server. I need to know the maximum size this file can conceivably reach. It is always going to be 200x200.
It sounds dumb, but are there colors that take more bytes than others, and if so, what is the most expensive one?

There are many ways to make a 'pathological' JPEG/JFIF file that is unusually large.
At the extreme end of the spectrum there is no limit to the size, since the standard doesn't limit some types of marker appearing more than once - e.g. a JFIF file full of many GB of DRI (define restart interval) markers and then an 8x8 pixel MCU at the end is technically valid.
If we restrict ourselves to 'normal' marker usage then we find an upper limit as follows :
Some background -
JPEG encodes pixels as an MCU (a group of 8x8-pixel DCT blocks), one DCT block for each component (Y, Cb, Cr).
To get the best compression (and smallest size), a 4:2:0 chroma subsampling scheme is used, where 75% of the chroma information is omitted. To get the best quality (and largest size), the file is two-thirds chroma and one-third luminance info.
Huffman bitstream symbols are used to encode DCT components, of which there are up to 65 per DCT block (64 AC + 1 DC).
Huffman symbols can range from 1 to 16 bits and are chosen by the encoder to be as small as possible; however, the choice of symbol length can be specified.
Final encoding of the Huffman bitstream must be done so that markers can be uniquely identified, i.e., any occurrence of a 0xff byte in the output must be replaced by two bytes - 0xff,0x00.
Using all this information we can construct a pathological, but valid, JPEG file which libjpeg (the most common JPEG decoder implementation) is happy to decode.
First, we need the longest possible Huffman symbols. At first thought, defining a maximal-length Huffman symbol (16 bits) of all 1's would use the most space. However, libjpeg refuses to handle a Huffman symbol that is all 1's, even though this doesn't seem to be excluded by the standard - it is still a unique symbol, since its size is already known to be 16 bits, unlike other variable-length symbols - and indeed some decoders can handle it (JPEGSnoop).
So we define a huffman table which sets the last two symbols as follows :
11111111_1111110 -> (0,0) (EOB - end of block value)
11111111_11111110 -> (0,15)
Such a huffman table would appear in a JPEG file as :
0xFF, 0xC4 ; DHT - define huffman table
0x00, 35 ; length
0x00 ; DC 0
1,1,1,1,1,1,1,1,1,1, 1, 1, 1, 1, 1, 1 ; histogram
1,2,3,4,5,6,7,8,9,10,11,12,13,14,0,15 ; symbols
Now to encode a maximal-length DCT block :
1 x DC of 31 bits ( 11111111 11111110 11111111 1111111 )
64 x AC of 31 bits ( 11111111 11111110 11111111 1111111 )
= 2015 bits
Since an MCU will be 3 DCT blocks (one for each component), the MCU size will be
6045 bits.
Most of these bytes will be 0xff, which are replaced by 0xff,0x00 in the output stream, as per the standard, in order to differentiate the bitstream from valid markers.
Perform this mapping and a complete DCT block is represented by 8 repeats of the following byte pattern :
0xff,0x00,0xfe,0xff,0x00,0xff,0x00
0xff,0x00,0xfd,0xff,0x00,0xff,0x00
0xff,0x00,0xfb,0xff,0x00,0xff,0x00
0xff,0x00,0xf7,0xff,0x00,0xff,0x00
0xff,0x00,0xef,0xff,0x00,0xff,0x00
0xff,0x00,0xdf,0xff,0x00,0xff,0x00
0xff,0x00,0xbf,0xff,0x00,0xff,0x00
0xff,0x00,0x7f,0xff,0x00
which totals 8*54 = 432 bytes
Adding all this up, we have :
3 components * (432 bytes per component)
= 1296 bytes per 8x8 pixels
A header of 339 bytes is required for the SOI/DHT/DQT/SOS segments to set up the image properties and Huffman tables, and a 2-byte EOI marker is required to end the image.
Since a 200x200 image would be 25x25 MCU's, we have a final size of :
339 + (25 * 25 * 1296) + 2
= 810341 bytes
which works out as a little over 20.25 bytes per pixel, over 6 times larger than an uncompressed BMP/TGA.
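For reference, here is the same arithmetic as a small Java sketch. The 432-bytes-per-component and 339-byte-header figures are the ones derived above for this particular pathological file, not general constants, and the method name is made up:
// Upper bound for the pathological JPEG described above.
// bytesPerMcu (3 * 432) and the 339-byte header come from this construction.
public static long pathologicalJpegBound(int width, int height) {
    long mcusWide = (width + 7) / 8;     // 8x8-pixel MCUs, no chroma subsampling
    long mcusHigh = (height + 7) / 8;
    long bytesPerMcu = 3 * 432;          // 3 components * 432 bytes per DCT block
    return 339 + mcusWide * mcusHigh * bytesPerMcu + 2;  // header + data + EOI
}
// pathologicalJpegBound(200, 200) == 810341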

As a rule of thumb, no JPEG is going to be larger than an equivalent-size 32-bit bitmap. A 32-bit bitmap will have 4 bytes per pixel, so multiply the dimensions together (200x200 = 40000), then multiply that by 4 bytes (40000x4 = 160000), and you'll have an upper bound in bytes - for your example, 160000 bytes is approximately 156 KB.

The maximum possible size of a JPEG should be somewhere around width * height * 12 bits.
JPEG converts images to a different color space (YCbCr), which with the usual chroma subsampling uses fewer bits (12, to be exact) to represent a single pixel. Realistically speaking, though, the image will be much smaller than that formula suggests.
Even if only the lossless compression stages were used, the file would come in a bit below that size; and since no one does that, your image should end up far below the limit set by the formula.
In short: 60 kB tops, but most likely way less.

The final size in bytes is based on the encoding quality settings used and the number of pixels. In your case, all images should be the same size since you are doing the encoding and your user seems forced to draw on a 200x200 area.
According to Wikipedia, though, the maximum is roughly 9 bits per pixel.
So 200*200*9 = 360000 bits = 45 kB
http://en.wikipedia.org/wiki/JPEG#Effects_of_JPEG_compression

I'm not sure this would be that helpful, but I believe the absolute maximum it could be would be:
width * height * 4 (the size of an int)
You should probably also add maybe a kilobyte for metadata... but I doubt the image would EVER reach that (as that is the whole point of JPEG compression).

Related

How to convert BMP image to 8-bit greyscale

I have to make a program that reads a 24-bit BMP image (without ImageIO or any external library) and makes it an 8-bit greyscale BMP image... I read that I must change the header of the image to make it 8-bit, Source 1 and Source 2. So I read here that the BitCount bytes are at 29 and 30 of the header and tried to change them...
First I read my file and generate the byte vector like this
FileInputStream image= new FileInputStream(path);
byte[] bytesImage = new byte[image.available()];
image.read(bytesImage);
image.close();
Then I get the image header and copy it to a new vector
int width = byteToInt(bytesImage[18], bytesImage[19], bytesImage[20], bytesImage[21]);
int height = byteToInt(bytesImage[22], bytesImage[23], bytesImage[24], bytesImage[25]);
int header = byteToInt(bytesImage[14], bytesImage[15], bytesImage[16], bytesImage[17]) + 14; // Add 14 for the header
vecGrey = Arrays.copyOf(bytesImage, bytesImage.length);
Then I change the header info bytes to make it an 8 bit BMP like this:
byte[] values = intToByte(8);
vecGrey[28] = values[0]; // This is the index for the BitCount byte 1
vecGrey[29] = values[1]; // and this one is the index for the second one.
Okay, now comes the problem: for some reason I can't write a file with the header in vecGrey if I try to write vecGrey with a different header, as shown here:
FileOutputStream aGrey = new FileOutputStream(name+ "-gray.bmp");
aGrey.write(vecGrey);
aGrey.close();
// This is a method that displays the resulting image in a frame...
makeInterface(name + "-gray.bmp");
I know that I must still change values in vecGrey, but this should work, just showing incorrect output (probably a non-greyscale image, or not an image at all). But when I try to read the file that I generate in the makeInterface() method I get a
javax.imageio.IIOException: unable to read the image header
So I assume that the program is unable to read the header correctly, but I don't know why! If I change the BitCount value to 16 it still works, but with 1, 4 or 8 it doesn't, failing with the same error... I didn't upload my whole code because it's in Spanish, but if needed I can translate it and edit it in here.
Thanks!
EDIT1: I'm only using 640x480 24-bit BMP images, so I don't need to check padding.
When changing a BMP from 24-bit to 8-bit you have to change several other things in the header. First of all, the size of the image file changes (bytes 3-6): since you are dealing with an 8-bit image there is one byte per pixel, so the new size should become
headerSize {usually 54} + (numberOfColors*4) {this is for the color table/palette; I recommend setting this at 256} + width*height {the actual amount of pixels}
Next you must indicate where the offset for the pixel data is, which is right after the color table/palette. This value is located in bytes 11-14 and the new value should be:
headerSize + numberOfColors*4
Next you need to modify the BITMAPINFOHEADER, which starts on byte 15. Bytes 15-18 should contain the size of this second header, which is usually 40. If you just want to convert to grayscale you can ignore and leave some bytes unmodified until you reach bytes 29 and 30, where you modify the BitCount (like you already did). Then, as far as I know, bytes 35-38 should contain the new image size we have already calculated. Bytes 47-50 determine the number of colors in your color palette; since you are doing grayscale I'd recommend using 256 colors, and I'll explain why in a bit. Bytes 51-54 contain the number of important colors; set it to 0 to indicate every color is important.
Next you need to add the color table/palette right after the header. The reason I recommend 256 colors is that each palette entry is written like so: [B,G,R,0], where B, G, R are the Blue, Green and Red color values followed by a constant 0. With 256 entries you can make a palette of values where R=G=B, each of which is a shade of gray. So, right after the header you must add this new series of bytes in ascending order:
[0,0,0,0] [1,1,1,0] [2,2,2,0] [3,3,3,0] ... [255,255,255,0]
Note that 256 is the numberOfColors you need to calculate the new size of the image, because it's the number of "entries" in the color palette.
Next you'll want to write your new pixel data after the table/palette. Since you were given an image in 24 bits, you can extract the pixel matrix and obtain the RGB values of each pixel. Just remember that you have a byte array whose values run from -128 to 127; you need to make sure you are getting the unsigned int value, so if the intensity of any channel is < 0, add 256 to it. Then you can apply an equation which gives you an intensity of gray:
Y' = 0.299R' + 0.587G' + 0.114B'
where Y' is the intensity of gray and R', G', B' are the intensities of Red, Green and Blue.
You can round the result of the equation and write it as a byte to the image, doing the same for every pixel in the original image.
When you are done, simply add the two reserved 0s at the end of the file and you should have a brand new 8-bit grayscale version of the 24-bit image.
Hope this helped.
sources: The one you provided and:
https://en.wikipedia.org/wiki/BMP_file_format
https://en.wikipedia.org/wiki/Grayscale
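For what it's worth, here is a rough Java sketch of the conversion described above. It assumes a 54-byte header, a 256-entry palette and a width that is a multiple of 4 so there is no row padding (true for the 640x480 24-bit case in the edit); the method and helper names (toGray8, writeIntLE, readIntLE) are made up:
// Sketch: convert a 24-bit BMP byte[] (no row padding) to an 8-bit grayscale BMP.
public static byte[] toGray8(byte[] bmp24, int width, int height) {
    int header = 54, palette = 256 * 4;
    int newSize = header + palette + width * height;
    byte[] out = new byte[newSize];
    System.arraycopy(bmp24, 0, out, 0, header);          // start from the old header

    writeIntLE(out, 2, newSize);                         // file size (bytes 3-6)
    writeIntLE(out, 10, header + palette);               // pixel data offset (bytes 11-14)
    out[28] = 8; out[29] = 0;                            // BitCount = 8
    writeIntLE(out, 34, width * height);                 // image size (bytes 35-38)
    writeIntLE(out, 46, 256);                            // colors used (bytes 47-50)
    writeIntLE(out, 50, 0);                              // important colors (bytes 51-54)

    for (int i = 0; i < 256; i++) {                      // grayscale palette [B,G,R,0]
        int p = header + i * 4;
        out[p] = out[p + 1] = out[p + 2] = (byte) i;
        out[p + 3] = 0;
    }

    int srcOffset = readIntLE(bmp24, 10);                // where the 24-bit pixels start
    for (int i = 0; i < width * height; i++) {
        int b = bmp24[srcOffset + 3 * i]     & 0xff;     // BMP stores pixels as B,G,R
        int g = bmp24[srcOffset + 3 * i + 1] & 0xff;
        int r = bmp24[srcOffset + 3 * i + 2] & 0xff;
        out[header + palette + i] = (byte) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }
    return out;
}

private static void writeIntLE(byte[] a, int pos, int v) {
    a[pos] = (byte) v; a[pos + 1] = (byte) (v >> 8);
    a[pos + 2] = (byte) (v >> 16); a[pos + 3] = (byte) (v >> 24);
}

private static int readIntLE(byte[] a, int pos) {
    return (a[pos] & 0xff) | ((a[pos + 1] & 0xff) << 8)
         | ((a[pos + 2] & 0xff) << 16) | ((a[pos + 3] & 0xff) << 24);
}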
You should first look at the hex format of both a 24-bit BMP and a grayscale BMP, then go stepwise:
-read the 24-bit BMP header
-read the data after the offset
-write the header of the 8-bit grayscale image
-write the data into the 8-bit grayscale image
Note: you have to convert the RGB values into an 8-bit grayscale value, for example by adding the R, G and B values and dividing them by 3.

Java compress lots of long numbers

I need to compress lots of long numbers. Those long numbers are like database ids. After compression, they will be sent as part of a request. Other than java.util.zip, is there any better alternative to achieve a higher compression rate?
Thanks
It is possible to change the byte length of any number by changing its radix. Computers use bytes for data (radix 256) while humans use base 10, so cleartext numbers are not space efficient: they use only 10 values out of the 256 possible per byte.
Simple java program to demonstrate:
System.out.println(Long.MAX_VALUE);
String sa = Long.toString(Long.MAX_VALUE, Character.MAX_RADIX);
System.out.println(sa);
Outputs:
9223372036854775807 # 19 bytes
1y2p0ij32e8e7 # 13 bytes
Which is a 6-byte reduction (roughly 30% compression** in bytes). As Character.MAX_RADIX equals 36, you can achieve even greater compression by writing a custom toString method.
Of course this works only for the textual representation of numbers. The Long.MAX_VALUE value used in this example is only 8 bytes long in its binary form, so even this roughly 30% reduction in text size is actually about a 60% increase compared to the binary form of the number.
** This method is not really compression. It only exploits the storage inefficiency introduced by writing numbers in human-readable form. Actual compression like zip will always beat this method, although it will make the numbers unreadable by humans. To put it bluntly: you can read aloud numbers in base 10, 16, 36 or even 256; you can't read compressed numbers.
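If the goal is simply the smallest request payload, here is a sketch of the "binary form + zip" route mentioned above, using only java.util.zip. The class and method names are made up, and it assumes the ids arrive as a long[]:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.DeflaterOutputStream;

public class LongPacker {
    // Write each long in its 8-byte binary form, then deflate the buffer.
    // The receiver would wrap the bytes in an InflaterInputStream and read
    // the longs back with a matching ByteBuffer.
    public static byte[] pack(long[] ids) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(ids.length * Long.BYTES);
        for (long id : ids) {
            buf.putLong(id);
        }
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(bos)) {
            dos.write(buf.array());
        }
        return bos.toByteArray();
    }
}
If the ids happen to be sorted, delta-encoding them before deflating usually helps further, but that's an extra assumption about the data.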
You can compress long numbers using Run Length Encoding: https://en.wikipedia.org/wiki/Run-length_encoding

What does interleaved stereo PCM linear Int16 big endian audio look like?

I know that there are a lot of resources online explaining how to deinterleave PCM data. In the course of my current project I have looked at most of them...but I have no background in audio processing and I have had a very hard time finding a detailed explanation of how exactly this common form of audio is stored.
I do understand that my audio will have two channels and thus the samples will be stored in the format [left][right][left][right]...
What I don't understand is what exactly this means. I have also read that each sample is stored in the format [left MSB][left LSB][right MSB][right LSB]. Does this mean that each 16-bit integer actually encodes two 8-bit frames, or is each 16-bit integer its own frame destined for either the left or right channel?
Thank you everyone. Any help is appreciated.
Edit: If you choose to give examples please refer to the following.
Method Context
Specifically what I have to do is convert an interleaved short[] to two float[]'s each representing the left or right channel. I will be implementing this in Java.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < interleavedData.length; i++) {
//THIS IS WHERE I DON'T KNOW WHAT TO DO
}
//return the separated left and right channels
return new float[][]{left, right};
}
My Current Implementation
I have tried playing the audio that results from this. It's very close, close enough that you could understand the words of a song, but is still clearly not the correct method.
public static float[][] deinterleaveAudioData(short[] interleavedData) {
//initialize the channel arrays
float[] left = new float[interleavedData.length / 2];
float[] right = new float[interleavedData.length / 2];
//iterate through the buffer
for (int i = 0; i < left.length; i++) {
left[i] = (float) interleavedData[2 * i];
right[i] = (float) interleavedData[2 * i + 1];
}
//return the separated left and right channels
return new float[][]{left, right};
}
Format
If anyone would like more information about the format of the audio the following is everything I have.
Format is PCM 2 channel interleaved big endian linear int16
Sample rate is 44100
Number of shorts per short[] buffer is 2048
Number of frames per short[] buffer is 1024
Frames per packet is 1
I do understand that my audio will have two channels and thus the samples will be stored in the format [left][right][left][right]... What I don't understand is what exactly this means.
Interleaved PCM data is stored one sample per channel, in channel order before going on to the next sample. A PCM frame is made up of a group of samples for each channel. If you have stereo audio with left and right channels, then one sample from each together make a frame.
Frame 0: [left sample][right sample]
Frame 1: [left sample][right sample]
Frame 2: [left sample][right sample]
Frame 3: [left sample][right sample]
etc...
Each sample is a measurement and digital quantization of pressure at an instantaneous point in time. That is, if you have 8 bits per sample, you have 256 possible levels of precision that the pressure can be sampled at. Knowing that sound waves are... waves... with peaks and valleys, we are going to want to be able to measure distance from the center. So, we can define center at 127 or so and subtract and add from there (0 to 255, unsigned) or we can treat those 8 bits as signed (same values, just different interpretation of them) and go from -128 to 127.
Using 8 bits per sample with single channel (mono) audio, we use one byte per sample meaning one second of audio sampled at 44.1kHz uses exactly 44,100 bytes of storage.
Now, let's assume 8 bits per sample, but in stereo at 44.1 kHz. Every other byte is going to be for the left channel, and every other byte for the right.
LRLRLRLRLRLRLRLRLRLRLR...
Scale it up to 16 bits, and you have two bytes per sample (samples set up with brackets [ and ], spaces indicate frame boundaries)
[LL][RR] [LL][RR] [LL][RR] [LL][RR] [LL][RR] [LL][RR]...
I have also read that each sample is stored in the format [left MSB][left LSB][right MSB][right LSB].
Not necessarily. The audio can be stored in any endianness. Little endian is the most common, but that isn't a magic rule. I do think though that all channels go in order always, and front left would be channel 0 in most cases.
Does this mean the each 16 bit integer actually encodes two 8 bit frames, or is each 16 bit integer its own frame destined for either the left or right channel?
Each value (16-bit integer in this case) is destined for a single channel. Never would you have two multi-byte values smashed into each other.
I hope that's helpful. I can't run your code, but given your description, I suspect you have an endian problem and that your samples aren't actually big endian.
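If you still have the raw byte[] (before it became a short[]), one way to rule out an endian mix-up is to let ByteBuffer do the interpretation explicitly. A sketch, assuming the stream really is big endian as the format description says (the method name is made up):
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Interpret a raw byte[] as big-endian 16-bit samples, regardless of the
// host machine's native byte order.
public static short[] bytesToBigEndianShorts(byte[] raw) {
    short[] samples = new short[raw.length / 2];
    ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN).asShortBuffer().get(samples);
    return samples;
}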
Let's start by getting some terminology out of the way
A channel is a monaural stream of samples. The term does not necessarily imply that the samples are contiguous in the data stream.
A frame is a set of co-incident samples. For stereo audio (e.g. L & R channels) a frame contains two samples.
A packet is 1 or more frames, and is typically the minimum number of frames that can be processed by a system at once. For PCM audio, a packet often contains 1 frame, but for compressed audio it will be larger.
Interleaving is a term typically used for stereo audio, in which the data stream consists of consecutive frames of audio. The stream therefore looks like L1R1L2R2L3R3......LnRn
Both big and little endian audio formats exist, and which one you get depends on the use-case. However, it's generally only an issue when exchanging data between systems - you'll always use native byte order when processing or interfacing with operating system audio components.
You don't say whether you're using a little or big endian system, but I suspect it's probably the former. In which case you need to byte-reverse the samples.
Although not set in stone, when using floating point, samples are usually in the range -1.0 < x < +1.0, so you want to divide the samples by 1<<15. When 16-bit linear types are used, they are typically signed.
Taking care of byte-swapping and format conversions:
int s = (int) interleavedData[2 * i];
short revS = (short) (((s & 0xff) << 8) | ((s >> 8) & 0xff)); // swap the two bytes of the sample
left[i] = ((float) revS) / 32768.0f; // scale into -1.0..1.0 (1 << 15, as above)
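Putting that together with the method signature from the question, here is a sketch of the full deinterleave, assuming the samples really are big endian and your host is little endian, so each one needs byte-swapping:
public static float[][] deinterleaveAudioData(short[] interleavedData) {
    float[] left = new float[interleavedData.length / 2];
    float[] right = new float[interleavedData.length / 2];
    for (int i = 0; i < left.length; i++) {
        // swap the bytes of each sample, then scale into the -1.0..1.0 range
        left[i]  = Short.reverseBytes(interleavedData[2 * i])     / 32768.0f;
        right[i] = Short.reverseBytes(interleavedData[2 * i + 1]) / 32768.0f;
    }
    return new float[][]{left, right};
}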
Actually you are dealing with an almost typical WAVE file at Audio CD quality, that is to say:
2 channels
sampling rate of 44100 Hz
each amplitude sample quantized on a 16-bits signed integer
I said almost because big-endianness is usually used in AIFF files (Mac world), not in WAVE files (PC world). And I don't know without searching how to deal with endianness in Java, so I will leave this part to you.
About how the samples are stored is quite simple:
each sample takes 16-bits (integer from -32768 to +32767)
if channels are interleaved: (L,1),(R,1),(L,2),(R,2),...,(L,n),(R,n)
if channels are not: (L,1),(L,2),...,(L,n),(R,1),(R,2),...,(R,n)
Then to feed an audio callback, it is usually required to provide 32-bit floating point samples ranging from -1 to +1. And maybe this is where something is missing in your algorithm. Dividing your integers by 32768 (2^(16-1)) should make it sound as expected.
I ran into a similar issue with de-interleaving the short[] frames that came in through Spotify Android SDK's onAudioDataDelivered().
The documentation for onAudioDelivered was poorly written a year ago. See Github issue. They've updated the docs with a better description and more accurate parameter names:
onAudioDataDelivered(short[] samples, int sampleCount, int sampleRate, int channels)
What can be confusing is that samples.length can be 4096. However, it contains only sampleCount valid samples. If you're receiving stereo audio and sampleCount = 2048, there are only 1024 frames (each frame has two samples) of audio in the samples array!
So you'll need to update your implementation to make sure you're working with sampleCount and not samples.length.

To retrieve the bit from LSB insertion

I read about LSB insertion online, but the article only explains how to insert bits into the LSB; it doesn't describe how to extract the bits afterwards. This is the article I read about LSB insertion.
I understand the method they use below, but how do you extract the bits?
Here's an algorithm for getting the encrypted message:
Read image.
Iterate over pixels.
Decompose pixel into RGB values (one byte for R, one for G, one for B)
Take the LSB from red. If the LSB is in bit zero, you can AND the red value with a mask of 1 (bits 00000001). So, lsbValue = rvalue & 0x01. Place the lsbValue (it will only be one or zero) in the highest bit.
Get the LSB from green. Place this in the next highest bit.
Get the LSB from blue. Place this in the next bit down.
Read the next pixel and decompose into RGB bytes.
Stuff the LSBs of the color components into bit positions until you've filled a byte. This is the first byte of your encrypted message.
Continue iterating over pixels and their RGB values until you've processed all pixels.
Inspect the bytes you've decrypted. The actual message should be obvious. Anything beyond the encrypted message will just be noise, i.e., the LSBs of the actual image pixels.
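Here is a sketch of that extraction loop in Java. The method name is made up; it assumes the bits were embedded one per color channel in R, G, B order, scanning row by row, and that the message length is known up front:
import java.awt.image.BufferedImage;

// Rebuild the hidden bytes from the LSBs of the R, G and B channels,
// filling each output byte from its highest bit down, as described above.
public static byte[] extractLsbMessage(BufferedImage img, int messageLength) {
    byte[] message = new byte[messageLength];
    int bitIndex = 0;
    outer:
    for (int y = 0; y < img.getHeight(); y++) {
        for (int x = 0; x < img.getWidth(); x++) {
            int rgb = img.getRGB(x, y);
            int[] channels = { (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF };
            for (int c : channels) {
                if (bitIndex >= messageLength * 8) break outer;
                int bit = c & 0x01;                 // the LSB of this channel
                int byteIndex = bitIndex / 8;
                int bitInByte = 7 - (bitIndex % 8); // highest bit first
                message[byteIndex] |= bit << bitInByte;
                bitIndex++;
            }
        }
    }
    return message;
}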

GrayScale (8bit per pixel) Image Pixel Manipulation in Java

I've heard that in gray-scale images with 8-bit color depth, the data is stored in the first 7 bits of each pixel's byte and the last bit is left intact! So we can store some information using the last bit of every pixel - is that true?
If so, how is the data interpreted in individual pixels? I mean, there is no Red, Blue and Green, so what do those bits mean?
And how can I calculate the average value of all pixels of an image?
I prefer to use pure java classes not JAI or other third parties.
Update 1
BufferedImage image = ...; // loading image
image.getRGB(i, j);
The getRGB method always returns an int, which is bigger than one byte! What should I do?
My understanding is that 8-bit colour depth means there are 8 bits per pixel (i.e. one byte) and that Red, Green and Blue all share this value, e.g. greyscale=192 means Red=192, Green=192, Blue=192. There is no 7 bits plus another 1 bit.
AFAIK, you can just use a normal average. However, I would use a long for the sum and make sure each byte is treated as unsigned, i.e. b & 0xff.
EDIT: If the grey scale is say 128 (or 0x80), I would expect the RGB to be 128,128,128 or 0x808080.
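For the averaging part, a small sketch using only BufferedImage. It assumes the image really is grayscale, so any one channel of getRGB() carries the gray level; the method name is made up:
import java.awt.image.BufferedImage;

// Average gray level over the whole image; a long sum avoids overflow
// even for large images.
public static double averageGray(BufferedImage image) {
    long sum = 0;
    for (int y = 0; y < image.getHeight(); y++) {
        for (int x = 0; x < image.getWidth(); x++) {
            sum += image.getRGB(x, y) & 0xFF;   // low byte = blue = gray level
        }
    }
    return (double) sum / ((long) image.getWidth() * image.getHeight());
}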
