I am trying to do some image processing in Java. I use the ImageIO library for reading and writing images. I can read an image's pixel value in two ways, as follows (there might be other methods I do not know about).
Using BufferedImage's getRGB method:
pixel = image.getRGB(x,y);
Using Raster's getSample method:
WritableRaster raster = image.getRaster();
pixel = raster.getSample(x,y,0);
What is the difference in the above two approaches?
1: The first approach will always return a pixel in int ARGB format, in the sRGB color space, regardless of the image's internal representation. This means that unless the image's internal representation is TYPE_INT_ARGB, some conversion has to be done. This is sometimes useful because it's predictable, but just as often it's quite slow; color space conversion, for example, is quite expensive. Also, if the image has more than 8 bits per sample and/or more than 4 samples per pixel, precision loss occurs. This may or may not be acceptable, depending on your use case.
2: The second approach may give you a pixel value, but not in all cases, as it gives you the sample value at (x, y) for band 0 (the first band). For TYPE_INT_ARGB this will be the same as the pixel value. For TYPE_BYTE_INDEXED this will be the index into the look-up table (you need to look it up to get the pixel value). For TYPE_3BYTE_BGR this will give you a single color component only (note that band numbering follows the color model, not the in-memory BGR byte order); you need to combine it with the samples in bands 1 and 2 to get the full pixel value. And so on for other types. For samples that are not internally represented as an int, data type conversion occurs (and in rare cases precision loss). It might work for you, but I've never had much use for the getSample(...) methods.
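For illustration, here is a minimal sketch (not from the original answer) of combining the samples of a 3-band image into a full pixel value, assuming the ColorModel's band order is R, G, B:
// For some x, y within the image; raster as obtained from image.getRaster()
int r = raster.getSample(x, y, 0); // band 0
int g = raster.getSample(x, y, 1); // band 1
int b = raster.getSample(x, y, 2); // band 2
int rgb = (r << 16) | (g << 8) | b; // packed RGB, no alpha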
Instead, I suggest you look into what I believe to be the fastest way to get at pixel data: using the getDataElements method.
Object pixel = null; // initialized on the first invocation of getDataElements, then reused
for (int y = 0; y < raster.getHeight(); y++) {
    for (int x = 0; x < raster.getWidth(); x++) {
        pixel = raster.getDataElements(x, y, pixel);
    }
}
This will give you the "native" values from the data buffer, without any conversion.
You then need special handling for each transfer type you want to support (see the DataBuffer class), and perhaps a common fallback for non-standard types.
This will have the same "problem" as your approach 2 for pixel values vs normalized RGB values, so you might need to convert/look up "manually".
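As a hedged sketch of such transfer-type dispatch (the handling shown is illustrative, not exhaustive):
Object pixel = raster.getDataElements(x, y, null);
switch (raster.getTransferType()) {
    case DataBuffer.TYPE_BYTE:
        byte[] bytePixel = (byte[]) pixel; // bytes are signed in Java; mask with & 0xFF when reading
        break;
    case DataBuffer.TYPE_INT:
        int[] intPixel = (int[]) pixel; // e.g. one packed ARGB value for TYPE_INT_ARGB
        break;
    default:
        int rgb = image.getColorModel().getRGB(pixel); // common fallback: normalize to sRGB
}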
Which approach is better depends, as always, on the use case. You have to look at each case and decide what's more important: ease/simplicity, or the best possible performance (or perhaps the best quality?).
I just found out that there is a BitSet class in Java. There are already arrays and similar data structures. Where can BitSet be used?
As the above answer only explains what a BitSet is, I am providing here an answer describing how I use BitSet and why. At first, I did not know that the BitSet construct exists. I have a QR Code generator in C++, and for flexibility reasons I don't want to use a specific bitmap structure when returning the QR Code to the caller. The QR Code is just black and white and can be represented as a series of bits. The problem was that my JNI C++ code had to return both the byte array that represents these bits and the count of bits. Note that the size of the byte array alone cannot tell you the count of bits. In effect, I was faced with a scenario wherein my JNI C++ code had to return two values:
the byte[] array
the count of bits
My first solution was to return an array of booleans. The contents of this array are the QR Code pixels, and the square root of the array's length is the length of one side. Of course this worked, but it felt wasteful because it is supposed to be a series of bits. My next attempt was to return a Pair<int, byte[]> object, which, after lots of hair pulling, I was not able to make work in C++. Here comes the BitSet construct. By returning a BitSet object, I am conveying the two pieces of information listed above. But there is a minor trick. If the QR Code has 144 pixels in total, because one side is 12, then you have to allocate BitSet(145) and call obj.set(144). That is, we introduce an artificial last bit that we then set, but this last bit is not part of the QR Code pixels. This ensures that BitSet::length() correctly returns the bit count. So in Kotlin:
var pixels: BitSet = getqrpixels(inputdata)
var pixels_len = pixels.length() - 1
var side = sqrt(pixels_len.toFloat()).toInt()
drawSquareBitmap(pixels, side)
And thus, is my unexpected use case of this mysterious BitSet.
Take a look at this:
https://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html
A BitSet is a vector of bits. Each entry in it is either true (1) or false (0). The BitSet class comes with methods that resemble the bitwise operators. It is a little bit more flexible than a normal binary type.
BitSet, unlike a boolean[], is actually a dynamically sized bitmask. Essentially, instead of using booleans to store values, it uses longs, where each of the long's 64 bits is used to store a single bit.
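For example (an illustrative comparison, not from the original post):
import java.util.BitSet;

BitSet bits = new BitSet();             // grows on demand, backed by a long[]
bits.set(1000);                         // costs ~128 bytes of longs, not 1001 booleans
System.out.println(bits.length());      // 1001: index of the highest set bit + 1
System.out.println(bits.cardinality()); // 1: number of set bits

boolean[] flags = new boolean[1001];    // typically one byte per element
flags[1000] = true;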
I've made this method for getting the pixel values of an image, and I'm using it to compare 1 image against 50 other images. However, it takes forever to produce output. Does anyone know of a way I can speed this method up? Would converting the images to grayscale be quicker? If anyone could help with code, that would be great!
public static double[] GetHistogram (BufferedImage img) {
    double[] myHistogram = new double [16777216];
    for (int y = 0; y < img.getHeight(); y += 1)
    {
        for (int x = 0; x < img.getWidth(); x += 1)
        {
            int clr = img.getRGB(x,y);
            Color c = new Color(img.getRGB(x, y));
            int pixelIntValue = (int) c.getBlue() * 65536 + c.getGreen() * 256 + c.getRed();
            myHistogram[pixelIntValue]++;
        }
    }
    return myHistogram;
}
TLDR: use a smaller image and read this paper.
You should try to eliminate any unnecessary function calls as @Piglet mentioned, but you should definitely keep the colors in one histogram instead of separate histograms for R, G, and B. Aside from getting rid of the extra function calls, I think there are four things you can do to speed up your algorithm (both creating and comparing the histograms) and reduce the memory usage (because less page caching means less disk thrashing and more speed).
Use a smaller image
One of the advantages of color histogram indexing is that it is relatively independent of resolution. The color of an object does not change with the size of the image. Obviously, there are limits to this (imagine trying to match objects using a 1×1 image). However, if your images have millions of pixels (like the images from most smartphones these days), you should definitely resize them. These authors found that an image resolution of only 16×11 still produced very good results [see page 17], but even resizing down to ~100×100 pixels should still provide a significant speed-up.
BufferedImage inherits the method getScaledInstance from Image, which you can use to get a smaller image.
double scalingFactor = 0.25; // you need to choose this value to work with your images
int aSmallHeight = (int) (myBigImage.getHeight() * scalingFactor);
int aSmallWidth = (int) (myBigImage.getWidth() * scalingFactor);
Image smallerImage = myBigImage.getScaledInstance(aSmallWidth, aSmallHeight, Image.SCALE_FAST);
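Note that getScaledInstance returns a plain Image, which has no getRGB. A minimal sketch (assuming you still want to use your histogram code above) of rendering it back into a BufferedImage:
BufferedImage small = new BufferedImage(aSmallWidth, aSmallHeight, BufferedImage.TYPE_INT_RGB);
small.getGraphics().drawImage(smallerImage, 0, 0, null); // draws the scaled Image into the new buffer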
Reducing your image size is the single most effective thing you can do to speed up your algorithm. If you do nothing else, at least do this.
Use less information from each color channel
This won't make as much difference for generating your histograms because it will actually require a little more computation, but it will dramatically speed up comparing the histograms. The general idea is called quantization. Basically, if you have red values in the range 0..255, they can be represented as one byte. Within that byte, some bits are more important than others.
Consider this color sample image. I placed a mostly arbitrary shade of red in the top left, and in each of the other corners, I ignored one or more bits in the red channel (indicated by the underscores in the color byte). I intentionally chose a color with lots of one bits in it so that I could show the "worst" case of ignoring a bit. (The "best" case, when we ignore a zero bit, has no effect on the color.)
There's not much difference between the upper right and upper left corners, even though we ignored one bit. The upper left and lower left have a visible but minimal difference, even though we ignored 3 bits. The upper left and lower right corners are very different, even though we ignored only one bit, because it was the most significant bit. By strategically ignoring less significant bits, you can reduce the size of your histogram, which means there's less for the JVM to move around and fewer bins when it comes time to compare them.
Here are some solid numbers. Currently, you have 2^8 × 2^8 × 2^8 = 16777216 bins. If you ignore the 3 least significant bits from each color channel, you will get 2^5 × 2^5 × 2^5 = 32768 bins, which is 1/512 of the number of bins you are currently using. You may need to experiment with your set of images to see what level of quantization still produces acceptable results.
Quantization is very simple to implement: you can just discard the rightmost bits with a bit shift operation.
int numBits = 3;
int quantizedRed = pixelColor.getRed() >> numBits;
int quantizedGreen = pixelColor.getGreen() >> numBits;
int quantizedBlue = pixelColor.getBlue() >> numBits;
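A hedged sketch of how the quantized channels could then be combined into a single histogram bin index (variable names are illustrative):
int bitsPerChannel = 8 - numBits; // 5 bits per channel after dropping 3
int bin = (quantizedRed << (2 * bitsPerChannel))
        | (quantizedGreen << bitsPerChannel)
        | quantizedBlue;
myHistogram[bin]++; // the histogram now needs only 1 << (3 * bitsPerChannel) = 32768 bins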
Use a different color space
While grayscale might be quicker, you should not use grayscale because you lose all of your color information that way. When you're matching objects using color histograms, the actual hue or chromaticity is more important than how light or dark something is. (One reason for this is because the lighting intensity can vary across an image or even between images.) There are other representations of color that you could use that don't require you to use 3 color channels.
For example, L*a*b* (see also this) uses one channel (L) to encode the brightness, and two channels (a, b) to encode color. The a and b channels each range from -100 to 100, so if you create a histogram using only a and b, you would only need 40000 bins. The disadvantage of a histogram of only a and b is that you lose the ability to record black and white pixels. Other color spaces each have their own advantages and disadvantages for your algorithm.
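A rough sketch of binning on a and b only (clamping details glossed over; assumes a and b have already been computed as values in -100..100):
int aIndex = Math.min(199, (int) (a + 100)); // shift -100..100 into 0..199
int bIndex = Math.min(199, (int) (b + 100));
int bin = aIndex * 200 + bIndex; // 200 * 200 = 40000 bins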
It is generally not very difficult to convert between color spaces because there are many existing implementations of color space conversion functions that are freely available on the internet. For example, here is a Java conversion from RGB to L*a*b*.
If you do choose to use a different color space, be careful using quantization as well. You should apply any quantization after you do the color space conversion, and you will need to test different quantization levels because the new color space might be more or less sensitive to quantization than RGB. My preference would be to leave the image in RGB because quantization is already so effective at reducing the number of bins.
Use different data types
I did some investigating, and I noticed that BufferedImage stores the image as a Raster, which uses a SampleModel to describe how pixels are stored in the data buffer. This means there is a lot of overhead just to retrieve the value of one pixel. You will achieve faster results if your image is stored as a byte[] or an int[]. You can get the byte array using
byte[] pixels = ((DataBufferByte) bufferedImage.getRaster().getDataBuffer()).getData();
See the answer to this previous question for more information and some sample code to convert it to a 2D array.
This last thing might not make much difference, but I noticed that you are using double for storing your histogram. You should consider whether int would work instead. In Java, int has a maximum value of > 2 billion, so overflow shouldn't be an issue (unless you are making a histogram of an image with more than 2 billion pixels, in which case, see my first point). An int uses only half as much memory as a double (which is a big deal when you have thousands or millions of histogram bins), and for many math operations they can be faster (though this depends on your hardware).
If you want to read more about color histograms for object matching, go straight to the source and read Swain and Ballard's Color Indexing paper from 1991.
Calculating a histogram with 16777216 classes is quite unusual.
Most histograms are calculated for each channel separately, resulting in a 256-class histogram each for R, G, and B, or just one if you convert the image to grayscale.
I am no expert in Java, and I don't know how cleverly the compilers optimize code.
But you call img.getHeight() for every row and img.getWidth() for every column of your image.
I don't know how often those expressions are actually evaluated, but maybe you can save some processing time if you just use two variables to which you assign the width and height of your image before you start your loops.
You also call img.getRGB(x,y) twice for every pixel. Same story. Maybe it is faster to just do it once. Function calls are usually slower than reading variables from memory.
You should also think about what you are doing here. img.getRGB(x,y) gives you an integer representation for a color.
Then you put that integer into a constructor to make a Color object out of it. Then you use c.getBlue() and so on to get integer values for red, green, and blue out of that Color object, just to put them back together into an integer again?
You could just use the return value of getRGB straight away and at least save 4 function calls, 3 multiplications, 3 summations...
So, again, given that I last programmed Java about 10 years ago, my function would look more like this:
public static double[] GetHistogram (BufferedImage img) {
    double[] myHistogram = new double [16777216];
    int width = img.getWidth();
    int height = img.getHeight();
    for (int y = 0; y < height; y += 1)
    {
        for (int x = 0; x < width; x += 1)
        {
            int clr = img.getRGB(x,y) & 0xFFFFFF; // mask off the alpha bits so the index fits the array
            myHistogram[clr]++;
        }
    }
    return myHistogram;
}
Of course the array type and size won't be correct, and that whole 16777216-class histogram doesn't make sense, but maybe this helps you speed things up a bit.
I'd just use a bit mask to get the red, green and blue values out of that integer and create three histograms.
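A minimal sketch of that idea (loop bounds omitted; the shifts assume the usual 0xAARRGGBB packing returned by getRGB):
int[] redHist = new int[256], greenHist = new int[256], blueHist = new int[256];
int rgb = img.getRGB(x, y);
redHist[(rgb >> 16) & 0xFF]++;  // red byte
greenHist[(rgb >> 8) & 0xFF]++; // green byte
blueHist[rgb & 0xFF]++;         // blue byte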
After I set a pixel of a java.awt.image.BufferedImage to a value using setRGB, a subsequent call to getRGB returns a different value than I set.
Code:
BufferedImage image = new BufferedImage(1, 1, BufferedImage.TYPE_BYTE_GRAY);
int color1 = -16711423; // corresponds to RGB(1, 1, 1)
image.setRGB(0, 0, color1);
int color2 = image.getRGB(0, 0);
System.out.println(color1);
System.out.println(color2);
It produces the following output
-16711423
-16777216
I think it has to do something with gamma correction, but I couldn't find anything about it in the documentation.
Ideally, I want to change this behavior to return the same value as I set. Is that possible?
The BufferedImage.getRGB() method always returns a color (as an int in "packed" format) in the non-linear sRGB color space (ColorSpace.CS_sRGB). It will do so regardless of what color space, bits per pixel, etc. your image has. Thus, conversion and possible precision loss may occur.
From the JavaDoc:
Returns an integer pixel in the default RGB color model (TYPE_INT_ARGB) and default sRGB colorspace. Color conversion takes place if this default model does not match the image ColorModel.
Your TYPE_BYTE_GRAY image internally uses a linear gray color space (ColorSpace.CS_GRAY), which does not map one-to-one with sRGB.
Also, I suggest using hexadecimal notation for (A)RGB colors, it makes the colors and difference much easier to see:
-16711423 == 0xff010101
-16777216 == 0xff000000
So, there is a minor precision loss here, but nothing unexpected.
If you want direct access to the pixel data, look into the Raster, SampleModel and DataBuffer classes (and their respective subclasses).
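For example, a minimal sketch of reading the stored (unconverted) gray sample of the image above:
Raster raster = image.getRaster();
int gray = raster.getSample(0, 0, 0); // the raw gray value as stored, no sRGB conversion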
You set a color specified by an int which stores the RGB components as bytes (each in the range 0..255 inclusive).
But the color model of your image is not RGB but BYTE_GRAY, so you may suffer precision loss. This explains the different colors. Had you used image type TYPE_INT_RGB, you would have ended up with the same color.
Android uses the RectF structure for drawing bitmaps. I am working on my game structure, so, for example, a Sprite will have an x/y coordinate, width, height, speed, and so on. This means that every time in my render loop I have to cast those integers to floats when figuring out the source/target RectFs to use... Alternatively, I can be far more universal and use floats everywhere, so that when it comes time to simulate the physics and render, all of the values are already of the same type, even if it is unnecessary for what the property is (I don't need a float for "x position", but will have to cast it when rendering if not).
If floats are generally 2-10x more inefficient (per http://developer.android.com/guide/practices/performance.html), what is the proper course to take?
TLDR: Should I cast ints to floats on render, or just have all of the contributing variables be floats to begin with, even if floats are inefficient? How inefficient is a typecast?
The best way to do this is to do your calculations with the highest degree of precision required to produce the expected results within the specified tolerance. That means if you need doubles to get consistent, correct results, use them. Figure out where less precision is acceptable, and only do the operations that require it with floats. Remember, your world doesn't have to approximate earth gravity for falling objects; you could simplify things and make gravity 10 instead of 9.81, make pixels correspond to even units, etc. It will also help if you define your constants in the same units and avoid doing unit conversions to do math (as these result in extra operations); it's better to add several final constants for the same quantity in different units (say, cm/s, m/s, and km/h) than to have only one and convert it a thousand times. Oh, and the cost of casting an int to a float isn't very high compared to multiplying two floats, so think about that.
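As a sketch of that last point about constants (the names and values here are purely illustrative, not from the question):
static final float GRAVITY_M_PER_S2 = 10f; // simplified from 9.81
static final float PIXELS_PER_METER = 50f; // world-to-screen scale, chosen per game
static final float GRAVITY_PX_PER_S2 = GRAVITY_M_PER_S2 * PIXELS_PER_METER; // converted once, not per frame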
I'll also note that FPUs are becoming more and more common in modern android phones, and so the issue of using floating point math is becoming a little less important (although not entirely).
The other thing I want to note is Double/Float/Integer vs. double/float/int. The former require new objects to be created and shouldn't be used for math (whenever possible), whereas the latter are primitives, do not result in object creation, and are less costly to use.
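For instance (an illustrative contrast, not from the question):
Float boxedSpeed = 1.5f; // autoboxing allocates a Float object on the heap
float speed = 1.5f;      // primitive: no allocation, cheaper in tight loops

float position = 0f;
for (int frame = 0; frame < 60; frame++) {
    position += speed;   // pure primitive math, no unboxing overhead
}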
I've heard that in grayscale images with 8-bit color depth, the data is stored in the first 7 bits of each pixel's byte, and the last bit is kept intact! So we can store some information using the last bit of every pixel. Is that true?
If so, how would the data be interpreted in individual pixels? I mean, there is no Red, Blue, and Green! So what do those bits mean?
And how can I calculate the average value of all pixels of an image?
I prefer to use pure Java classes, not JAI or other third-party libraries.
Update 1
BufferedImage image = ...; // loading image
image.getRGB(i, j);
The getRGB method always returns an int, which is bigger than one byte!
What should I do?
My understanding is that 8-bit colour depth means there are 8 bits per pixel (i.e. one byte) and that Red, Green, and Blue all share this value, e.g. greyscale=192 means Red=192, Green=192, Blue=192. There is no 7 bits plus another 1 bit.
AFAIK, you can just use a normal average. However, I would use a long for the sum and make sure each byte is treated as unsigned, i.e. `b & 0xff`.
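A minimal sketch of that averaging approach, assuming a TYPE_BYTE_GRAY image whose data buffer is a DataBufferByte:
byte[] data = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
long sum = 0;
for (byte b : data) {
    sum += b & 0xff; // mask so the signed byte is treated as unsigned (0..255)
}
double average = (double) sum / data.length;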
EDIT: If the grey scale value is, say, 128 (or 0x80), I would expect the RGB to be 128,128,128 or 0x808080.