Compare image to a reference image in Java

I need to write tests (in Java) for an image capturing application. To be more precise:
Some image is captured from a scanner.
The application returns a JPEG of this image.
The test shall compare the scanned image with a reference image. This reference image has been captured by the identical application and has been verified visually that it contains the same content.
My first idea was comparing the images pixel by pixel, but because the application applies JPEG compression the result would never be "100 % identical". Besides, the scanner captures the image with slight differences each time.
The goal is not to compare two "random images" for similarity like "Do both images show a car?" but rather "How similar is the captured car to the reference image of this car?".
What other possibilities do you see?

Let me say first that image processing is not my best field, nor am I an expert in any way. However, here is my suggestion: take the absolute difference between corresponding pixels of the two pictures, in row-major order. Then average the differences and check whether the average is within a certain threshold: if so, the images are similar. Another way would be to go pixel by pixel once more, but instead count the number of pixels that differ by at least x, where x is chosen to represent how close the match needs to be. Then see whether the differing pixels make up 5% of all pixels, 10%, and so on. The more differing pixels, the more different the images.
I hope this helps you, and best of luck.
P.S. To judge how far apart two pixels are, treat it like finding the distance between two points, i.e. sqrt[(r2-r1)^2+(g2-g1)^2+(b2-b1)^2].
P.P.S. This all of course assumes that you listened to Hovercraft Full of Eels and have somehow made sure that the images match up in size and position.
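Both measures from this answer can be sketched in plain Java on top of BufferedImage. This is only a minimal sketch under the assumption that the two images are already aligned and have the same size; the class name ImageDiff and its method names are made up for illustration, and any threshold has to be tuned against real scans.

```java
import java.awt.image.BufferedImage;

final class ImageDiff {

    /** Euclidean distance between two packed RGB pixels (0 to about 441.67). */
    static double pixelDistance(int rgb1, int rgb2) {
        int dr = ((rgb1 >> 16) & 0xFF) - ((rgb2 >> 16) & 0xFF);
        int dg = ((rgb1 >> 8) & 0xFF) - ((rgb2 >> 8) & 0xFF);
        int db = (rgb1 & 0xFF) - (rgb2 & 0xFF);
        return Math.sqrt(dr * dr + dg * dg + db * db);
    }

    /** Assumption: alignment and scaling were handled before calling these methods. */
    private static void requireSameSize(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("Images must have the same size");
        }
    }

    /** Mean per-pixel distance over the whole image. */
    static double meanDistance(BufferedImage a, BufferedImage b) {
        requireSameSize(a, b);
        double sum = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                sum += pixelDistance(a.getRGB(x, y), b.getRGB(x, y));
            }
        }
        return sum / ((double) a.getWidth() * a.getHeight());
    }

    /** Fraction (0..1) of pixels whose distance is at least minDistance. */
    static double fractionDiffering(BufferedImage a, BufferedImage b, double minDistance) {
        requireSameSize(a, b);
        long count = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (pixelDistance(a.getRGB(x, y), b.getRGB(x, y)) >= minDistance) {
                    count++;
                }
            }
        }
        return (double) count / ((double) a.getWidth() * a.getHeight());
    }
}
```

A test could then assert something like ImageDiff.fractionDiffering(scan, reference, 30.0) < 0.05, where 30.0 and 0.05 are placeholder values rather than recommendations.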

Related

Improvement Suggestion for Color Measurement Algorithm

I have been working for a while on an image processing project that measures and classifies some types of sugar on the production line by color. Until now, my biggest concern was finding and implementing appropriate mathematical techniques to calculate the distance between two colors (a reference color and the color being analysed) and then turning this value into something more meaningful, such as an industry-standard measure.
From this, I'm trying to figure out how I should reliably extract the average color value from an image, since the frame captured by a video camera may contain noise or dirt in the sugar (most likely almost black dots).
Language: Java with OpenCV library.
Current solution: Before taking the average image value, I'm applying the fastNlMeansDenoisingColored function provided by OpenCV. It removes some white dots, at the cost of some fine detail. I couldn't remove black dots with it (not shown in the following images).
From there, I'm using the org.opencv.core.Core.mean function to compute the mean value of the array elements independently for each channel, so that I have a scalar value to use in my calculations.
I tried some image thresholding filters to get rid of black and white dots and then calculated the mean with a mask; it kind of works too. I also tried to find a weighted average function that returns scalar values, but without success.
I don't know if these pre-processing techniques are robust enough for such an application, since mean values can vary easily. Am I on the right track? Would you suggest a better way to get a reliable value that represents my sugar's color?
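The "threshold, then mean with a mask" idea from the question can be illustrated without OpenCV. The following is only a plain-Java sketch of that idea, not the asker's actual code: the class name MaskedMean, the brightness measure (simple mean of the three channels) and the lo/hi cut-offs are all assumptions made for illustration.

```java
import java.awt.image.BufferedImage;

final class MaskedMean {

    /**
     * Mean R, G and B over all pixels whose brightness lies in [lo, hi].
     * Pixels outside that range (near-black dirt, white specks) are masked out.
     * Returns null if no pixel passes the mask.
     */
    static double[] meanRgb(BufferedImage img, int lo, int hi) {
        long r = 0, g = 0, b = 0, n = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int p = img.getRGB(x, y);
                int pr = (p >> 16) & 0xFF;
                int pg = (p >> 8) & 0xFF;
                int pb = p & 0xFF;
                // Assumption: brightness approximated as the channel mean.
                int brightness = (pr + pg + pb) / 3;
                if (brightness < lo || brightness > hi) {
                    continue; // masked out
                }
                r += pr;
                g += pg;
                b += pb;
                n++;
            }
        }
        if (n == 0) {
            return null;
        }
        return new double[] { (double) r / n, (double) g / n, (double) b / n };
    }
}
```

Because masked pixels do not contribute at all, a few black dots no longer pull the mean down; how well fixed cut-offs work on real sugar images is something only experiments can show.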

Image processing - OpenCV, Identifying digits

I am new to image processing and to OpenCV in particular.
I am working on an OCR project in which I need to identify numbers.
This is my image to process:
Let's say I have already optimized the image. My questions are:
In the image the numbers always appear several times. Let's say I have found the contours; how can I know which one is the best one to process?
How can I know by what angle I need to rotate each contour to make it straight?
In the image the numbers always appear several times. Let's say I have found the contours; how can I know which one is the best one to process?
You always want the biggest numbers, because they are the least warped by perspective. So you always want the numbers in the middle of the image, because they are also in the middle of the ball.
How can I know by what angle I need to rotate each contour to make it straight?
Have a look at rotated rect. I explained how to find the angle in this thread.
Since you always have a perfectly centered ball, you should think about using mapping to "unwarp" your ball (so do a projection like from the globe onto a map). It should be pretty straightforward afterwards to find the numbers on the flat image.
Edit: Since you only have 10 numbers you might also "brute force" the solution with a big enough training set. So just throw all numbers you detect into a classifier and keep the most likely solution.
1) I agree with @Sebastian on the first part. Exploit the fact that in your scenario the numbers are placed on the surface of a ball, so first select the blobs inside a centered region of interest.
2) The contours shown in the image are not rotated (the numbers are). Instead of "rotating" these bounding boxes, which seems to be quite a headache, I'd rather use them combined with rotation invariant keypoints. I'll clarify this:
a) You know where your numbers are, so you don't have to search in the entire image. OK, keep these already selected regions in mind.
b) You can take "straight" samples of the numbers 0-9 and use them as ground truth.
c) You can perform a matching search between each "ground truth" image and each candidate region. Now, forget the scale/rotation: use scale/rotation invariant keypoints! Something like this:
Again, notice that you have already selected the region of interest, so in your case the search will consist of checking the number of matches (number of blue lines) between each of the registered numbers and your candidate. I think it's worth a try! :)
You can find more info on the different keypoints available in OpenCV here.
Hope that it helps!

Find and Crop relevant image area automatically

We are trying to crop the relevant area of an image (photo) with a square aspect ratio (1:1), similar to what Facebook does when creating thumbnails.
In our case, it doesn't really matter whether the crop keeps the original height of the image (or width, when the image is in portrait orientation, h > w) or whether the auto-crop resizes the image as well.
I am thinking of algorithms like comparing objects with background or focus or something like a heat-map, combining colors and/or areas to find the most relevant part. There could be several ideas/methods to find the main part of the image to be used, similar to face detection.
We are looking for a Java (Android)-based solution or anything that can be adopted for Java / Android. Any help or idea would be greatly appreciated! Thank you!
I would do this in two steps, where the initial step is more robust and the second could be based on, for example, entropy. For the first step, you can use SURF, which is relatively common nowadays, and I would expect Java implementations of it to exist. SURF gives a set of key points that it considers important for describing your image. These key points give you a set of (x, y) coordinates that you can use to reduce the initial image to the area that encloses them. Since the key points might be anywhere in your image, you will probably want to discard some of them (i.e., those that are too far from the others: outliers). A very simple way to do this is to take the convex hull of the initial set of key points and then peel this hull multiple times. Each time you "peel" it, you are effectively discarding the points on the current convex hull.
Here is a sample for such first step:
f = Import["http://fohn.net/duck-pictures-facts/mallard-duck.jpg"];
kp = ImageKeypoints[f, MaxFeatures -> 200];
Show[f, Graphics[{PointSize[Medium], Red, Point[kp]}]]
After peeling once the convex hull formed by the key points and trimming the image according to the bounding rectangle of the remaining points:
From the image above, you can decide which sub-region of it to pick based on some other method. One that is apparently common is the one used by Reddit, which successively removes slices of lesser entropy from the image. Quickly searching for it, I found one such implementation at https://github.com/christopherhan/pycrop/blob/master/pycrop.py#L33; it is very simple.
Another, different kind of method that you might wanna try is called Seam-Carving. Also note that, depending on how large the initial image is, it is unlikely that cropping a small piece of it will give anything relevant. In those cases, it is more interesting to first resize the image and then apply the relevant methods.
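Since the question asks for Java, the hull-peeling part of the first step can be sketched as follows. This is only an illustration that assumes the key points are already available as java.awt.Point objects (key point detection itself is not shown); the class name HullPeeling is made up, and the hull is computed with Andrew's monotone chain, which is one of several possible choices.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HullPeeling {

    /** Cross product of (a - o) and (b - o); positive means a counter-clockwise turn. */
    static long cross(Point o, Point a, Point b) {
        return (long) (a.x - o.x) * (b.y - o.y) - (long) (a.y - o.y) * (b.x - o.x);
    }

    /** Convex hull vertices (monotone chain); collinear boundary points are not included. */
    static List<Point> convexHull(List<Point> pts) {
        List<Point> p = new ArrayList<>(new HashSet<>(pts)); // drop duplicates
        p.sort(Comparator.<Point>comparingInt(q -> q.x).thenComparingInt(q -> q.y));
        if (p.size() < 3) {
            return p;
        }
        List<Point> hull = new ArrayList<>();
        for (Point q : p) { // lower hull
            while (hull.size() >= 2
                    && cross(hull.get(hull.size() - 2), hull.get(hull.size() - 1), q) <= 0) {
                hull.remove(hull.size() - 1);
            }
            hull.add(q);
        }
        int lowerSize = hull.size() + 1;
        for (int i = p.size() - 2; i >= 0; i--) { // upper hull
            Point q = p.get(i);
            while (hull.size() >= lowerSize
                    && cross(hull.get(hull.size() - 2), hull.get(hull.size() - 1), q) <= 0) {
                hull.remove(hull.size() - 1);
            }
            hull.add(q);
        }
        hull.remove(hull.size() - 1); // the last point repeats the first
        return hull;
    }

    /** Discards the current hull vertices up to 'layers' times, keeping at least 3 points. */
    static List<Point> peel(List<Point> pts, int layers) {
        List<Point> remaining = new ArrayList<>(pts);
        for (int i = 0; i < layers; i++) {
            Set<Point> hull = new HashSet<>(convexHull(remaining));
            List<Point> next = new ArrayList<>();
            for (Point q : remaining) {
                if (!hull.contains(q)) {
                    next.add(q);
                }
            }
            if (next.size() < 3) {
                break; // stop before nearly everything is discarded
            }
            remaining = next;
        }
        return remaining;
    }

    /** Bounding rectangle of the remaining points, usable as the crop area. */
    static Rectangle boundingBox(List<Point> pts) {
        if (pts.isEmpty()) {
            throw new IllegalArgumentException("No points");
        }
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE, maxY = Integer.MIN_VALUE;
        for (Point q : pts) {
            minX = Math.min(minX, q.x);
            minY = Math.min(minY, q.y);
            maxX = Math.max(maxX, q.x);
            maxY = Math.max(maxY, q.y);
        }
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

The rectangle returned by boundingBox(peel(keyPoints, 1)) could then be passed to BufferedImage.getSubimage; how many layers to peel is a tuning decision.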

Finding the average color of an Image

I am using a BufferedImage to hold a 10 by 10 sample of an image. With this Image I would like to find an approximate average color (as a Color object) that represents this image. Currently I have two ideas on how to implement this feature:
Make a scaled instance of the image into a 1 by 1 size image and find the color of the newly created image as the average color
Use two for loops. The inner loop averages each line pixel by pixel; the outer loop averages the line averages.
I really like the idea of the first solution, however I am not sure how accurate it would be. The second solution would be as accurate as they come, however it seems incredibly tedious. I also believe the getColor command is processor intensive on a large scale such as this (I am performing this averaging roughly at 640 to 1920 times a second), please correct me if I am wrong. Since this method will be very CPU intensive, I would like to use a fairly efficient algorithm.
It depends what you mean by average. If you have half the pixels red and half the pixels blue, would the average be purple? In that case I think you can try adding all the values up and dividing it by how many pixels you have.
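A minimal sketch of "add all the values up and divide", assuming a plain arithmetic mean per channel is what is wanted. The class name AverageColor is hypothetical; the bulk getRGB call reads all pixels into one array so the loop does not call into the image per pixel.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

final class AverageColor {

    /** Arithmetic mean of the red, green and blue channels over all pixels. */
    static Color of(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        // One bulk read of all pixels as packed ARGB ints.
        int[] px = img.getRGB(0, 0, w, h, null, 0, w);
        long r = 0, g = 0, b = 0;
        for (int p : px) {
            r += (p >> 16) & 0xFF;
            g += (p >> 8) & 0xFF;
            b += p & 0xFF;
        }
        int n = px.length;
        // Integer division truncates, which is fine for an approximate average.
        return new Color((int) (r / n), (int) (g / n), (int) (b / n));
    }
}
```

For a 10 by 10 sample this is 100 additions per channel, so the cost per call is small; whether it is fast enough at the stated call rate would need measuring.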
However, I suspect that rather than the average, you want the dominant colour?
In that case one alternative could be to discretise the colours into 'buckets' (say at intervals of 100, or even sparser; in the extreme case just 3: one for red, one for green and one for blue) and create a histogram (a simple array with counts). You would then take the bucket with the highest count.
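The bucket idea could look roughly like this. It is a sketch under one particular assumption about bucketing: each channel is split into bucketsPerChannel equal ranges, giving a three-dimensional histogram stored in a flat array. The class name DominantColor and the choice to report the centre of the winning bucket are illustrative, not taken from the answer.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

final class DominantColor {

    /** Returns the centre colour of the most populated bucket. */
    static Color of(BufferedImage img, int bucketsPerChannel) {
        int n = bucketsPerChannel;
        if (n < 1 || n > 64) {
            throw new IllegalArgumentException("bucketsPerChannel must be in 1..64");
        }
        int[] counts = new int[n * n * n]; // the histogram: a simple array with counts
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int p = img.getRGB(x, y);
                int rb = ((p >> 16) & 0xFF) * n / 256;
                int gb = ((p >> 8) & 0xFF) * n / 256;
                int bb = (p & 0xFF) * n / 256;
                counts[(rb * n + gb) * n + bb]++;
            }
        }
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        int rb = best / (n * n);
        int gb = (best / n) % n;
        int bb = best % n;
        // Centre of bucket k on the 0..255 scale.
        return new Color(centre(rb, n), centre(gb, n), centre(bb, n));
    }

    private static int centre(int bucket, int n) {
        return (2 * bucket + 1) * 128 / n;
    }
}
```

With 4 buckets per channel there are 64 buckets in total; fewer buckets make the result coarser but more stable against noise.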
Be careful with idea 1. Remember that scaling often takes place by sampling. Since you have a very small image, you have already lost a lot of information. Scaling down further will probably just sample a few pixels and not really average all of them. Better check what algorithm your scaling process is using.

Appending to an Image File

I have written a program that takes a 'photo' and for every pixel it chooses to insert an image from a range of other photos. The image chosen is the photo of which the average colour is closest to the original pixel from the photograph.
I have done this by first averaging the RGB values of every pixel in each 'stock' image and then converting the result to CIE LAB so I could calculate how 'close' it is to the pixel in question in terms of human perception of the colour.
I have then compiled an image where each pixel in the original 'photo' image has been replaced with the 'closest' stock image.
It works nicely and the effect is good; however, the stock image size is 300 by 300 pixels, and even with the virtual machine flags "-Xms2048m -Xmx2048m" (which, yes, I know is ridiculous), on a 555px by 540px image I can only use stock images scaled down to 50 px before I get an out-of-memory error.
So basically I am trying to think of solutions. Firstly, I think the image effect itself may be improved by averaging every 4 pixels (2x2 square) of the original image into a single pixel and then replacing this pixel with the image, as this way the small photos will be more visible in the individual print. This should also allow me to draw the stock images at a greater size. Does anyone have any experience in this sort of image manipulation? If so, what tricks have you discovered to produce a nice image?
Ultimately I think the way to reduce the memory errors would be to repeatedly save the image to disk and append the next line of images to the file whilst continually removing the old set of rendered images from memory. How can this be done? Is it similar to appending to a normal file?
Any help in this last matter would be greatly appreciated.
Thanks,
Alex
I suggest looking into the Java Advanced Imaging (JAI) API. You're probably using BufferedImage right now, which does keep everything in memory: source images as well as output images. This is known as "immediate mode" processing. When you call a method to resize the image, it happens immediately. As a result, you're still keeping the stock images in memory.
With JAI, there are two benefits you can take advantage of.
Deferred mode processing.
Tile computation.
Deferred mode means that the output images are not computed right when you call methods on the images. Instead, a call to resize an image creates a small "operator" object that can do the resizing later. This lets you construct chains, trees, or pipelines of operations. So, your work would build a tree of operations like "crop, resize, composite" for each stock image. The nice part is that the operations are just command objects so you aren't consuming all the memory while you build up your commands.
This API is pull-based. It defers computation until some output action pulls pixels from the operators. This quickly helps save time and memory by avoiding needless pixel operations.
For example, suppose you need an output image that is 2048 x 2048 pixels, scaled up from a 512x512 crop out of a source image that's 1600x512 pixels. Obviously, it doesn't make sense to scale up the entire 1600x512 source image, just to throw away 2/3 of the pixels. Instead, the scaling operator will have a "region of interest" (ROI) based on its output dimensions. The scaling operator projects the ROI onto the source image and only computes those pixels.
The commands must eventually get evaluated. This happens in a few situations, mostly relating to output of the final image. So, asking for a BufferedImage to display the output on the screen will force all the commands to evaluate. Similarly, writing the output image to disk will force evaluation.
In some cases, you can keep the second benefit of JAI, which is tile based rendering. Whereas BufferedImage does all its work right away, across all pixels, tile rendering just operates on rectangular sections of the image at a time.
Using the example from before, the 2048x2048 output image will get broken into tiles. Suppose these are 256x256; then the entire image gets broken into 64 tiles. The JAI operator objects know how to work a tile at a time. So, scaling the 512x512 section of the source image really happens 64 times on 64x64 source pixels at a time.
Computing a tile at a time means looping across the tiles, which would seem to take more time. However, two things work in your favor when doing tile computation. First, tiles can be evaluated on multiple threads concurrently. Second, the transient memory usage is much, much lower than immediate mode computation.
All of which is a long-winded explanation for why you want to use JAI for this type of image processing.
A couple of notes and caveats:
You can defeat tile based rendering without realizing it. Anywhere you've got a BufferedImage in the workstream, it cannot act as a tile source or sink.
If you render to disk using the JAI or JAI Image I/O operators for JPEG, then you're in good shape. If you try to use the JDK's built-in image classes, you'll need all the memory. (Basically, avoid mixing the two types of image manipulation. Immediate mode and deferred mode don't mix well.)
All the fancy stuff with ROIs, tiles, and deferred mode is transparent to the program. You just make API calls on the JAI class. You only deal with the machinery if you need more control over things like tile sizes, caching, and concurrency.
Here's a suggestion that might be useful:
Try segregating the two main tasks into individual programs. Your first task is to decide which images go where, and that can be a simple mapping from coordinates to filenames, which can be represented as lines of text:
0,0,image123.jpg
0,1,image542.jpg
.....
After that task is done (and it sounds like you have it well handled), then you can have a separate program handle the compilation.
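The hand-over between the two programs could be as small as the following sketch. It assumes the two leading numbers are a row and a column index, which the example lines do not actually specify; the class name Placement and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class Placement {
    final int row;     // assumption: first number is the row
    final int col;     // assumption: second number is the column
    final String file; // stock image file name

    Placement(int row, int col, String file) {
        this.row = row;
        this.col = col;
        this.file = file;
    }

    /** Formats this placement as one line of the mapping file. */
    String toLine() {
        return row + "," + col + "," + file;
    }

    /** Parses one line of the form row,col,filename. */
    static Placement parse(String line) {
        // Limit 3 so that any commas inside the file name are preserved.
        String[] parts = line.split(",", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected row,col,file but got: " + line);
        }
        return new Placement(
                Integer.parseInt(parts[0].trim()),
                Integer.parseInt(parts[1].trim()),
                parts[2].trim());
    }

    /** Parses all non-blank lines, e.g. the result of Files.readAllLines. */
    static List<Placement> parseAll(List<String> lines) {
        List<Placement> result = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                result.add(parse(line));
            }
        }
        return result;
    }
}
```

The first program only needs to write one such line per pixel, so it never has to hold rendered stock images in memory; the second program reads the lines back and loads one stock image at a time.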
This compilation could be done by appending to an image, but you probably don't want to mess around with file formats yourself. It's better to let your programming environment do it by using a Java Image object of some sort. The biggest one you can fit in memory will be about 2GB pixelwise, leading to a maximum height and width of sqrt(2x10^9). Dividing this number by the number of images you have along the height and width gives the pixels allowed per subimage, and you can then paint them into the appropriate places.
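The estimate above can be worked through in a few lines. This sketch simply follows the answer's arithmetic (a square canvas of side sqrt(2x10^9), divided by the number of subimages in each direction); the class name MosaicBudget is made up, and the actual memory needed also depends on bytes per pixel and JVM limits, which are not modelled here.

```java
final class MosaicBudget {

    /**
     * Largest square subimage side, in pixels, such that tilesAcross by tilesDown
     * subimages fit into a square canvas holding at most maxPixels pixels.
     */
    static int maxTileSide(long maxPixels, int tilesAcross, int tilesDown) {
        int canvasSide = (int) Math.floor(Math.sqrt((double) maxPixels));
        return Math.min(canvasSide / tilesAcross, canvasSide / tilesDown);
    }
}
```

For the 555 by 540 photo from the question, with one stock image per pixel, maxTileSide(2_000_000_000L, 555, 540) gives 80, since sqrt(2x10^9) is roughly 44721 and 44721 / 555 is 80 after truncation.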
Every time you 'append' are you perhaps implicitly creating a new object with one more pixel to replace the old one (ie, a parallel to the classic problem of repeatedly appending to a String instead of using a StringBuilder) ?
If you post the portion of your code that does the storing and appending, someone will probably help you find an efficient way of recoding it.
