I wonder whether extracting similar areas from images is possible. If it is, I want to find the insignia of the company which created it.
There are two images below. The red rectangles in the images mark the areas I am trying to find. The program should find the similar area by comparing the images. I tried to do this using OpenCV, but I couldn't manage it.
First thing that comes to mind:
Convert images to grayscale
Divide image into small areas (patches)
Each patch should be labelled 1 if the entropy of the patch is high and 0 if it is low (to discard patches without letters)
For two images, compare all patches across images based on:
Histogram of the Sobel image (Bhattacharyya distance, which is already normalized)
Correlation (min-max normalization)
Advanced descriptors such as SIFT (L2 normalization)
Min distance wins.
You can narrow down the '1' patches with a text detector (Algorithm to detect presence of text on image).
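A minimal sketch of the patch-entropy and Sobel-histogram steps with OpenCV; the patch size, entropy threshold and histogram bin count are assumptions you would need to tune:

import cv2
import numpy as np

def high_entropy_patches(gray, patch=32, ent_thresh=4.0):
    """Return top-left (x, y) offsets of patches whose grayscale entropy exceeds ent_thresh."""
    keep = []
    for y in range(0, gray.shape[0] - patch + 1, patch):
        for x in range(0, gray.shape[1] - patch + 1, patch):
            p = gray[y:y + patch, x:x + patch]
            hist = cv2.calcHist([p], [0], None, [256], [0, 256]).ravel()
            prob = hist / hist.sum()
            prob = prob[prob > 0]
            if -np.sum(prob * np.log2(prob)) > ent_thresh:
                keep.append((x, y))
    return keep

def sobel_hist(patch):
    """Normalized histogram of Sobel gradient magnitudes, for Bhattacharyya comparison."""
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)
    hist = cv2.calcHist([mag], [0], None, [64], [0, float(mag.max()) + 1e-6])
    return cv2.normalize(hist, hist).ravel()

# distance between two candidate patches; the smallest distance wins
# d = cv2.compareHist(sobel_hist(p1), sobel_hist(p2), cv2.HISTCMP_BHATTACHARYYA)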
Related
I have a system which crawls the net and takes a screenshot of web pages. At the moment I simply hash the image file (stored as a PNG). However, this doesn't work well with pages that have, say, a count of comments on a blog article, or a view count.
So my question is what would be the best way to detect these changes? Which algorithm would work best?
A naive but very easily implemented method would be to wipe out all numeric characters from every page and compare just the remaining character content.
So first we want to detect the areas with the change. A simple, good way is just to take the difference between the two images and then look for all the areas where the difference is greater than zero.
After that, we would look at every group of points, take those points in the original image, and try to detect numbers using some OCR software.
General algorithm (a rough code sketch follows the list):
Diff = Im1 - Im2
Threshold Diff to get the binary image ThIm, i.e. ThIm(x,y) = 1 if Diff(x,y) > 0, otherwise ThIm(x,y) = 0.
Find connected components in ThIm
For each connected component find the bounding box around it.
Crop the original image using the bounding box
Run OCR on the cropped area and check whether it contains numbers
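A sketch of those steps in Python with OpenCV; it assumes the two screenshots have the same size, and using pytesseract for the OCR step is my assumption, not part of the original answer:

import cv2
import numpy as np
import pytesseract  # assumed OCR binding; any OCR engine would do

im1 = cv2.imread('page_old.png', cv2.IMREAD_GRAYSCALE)
im2 = cv2.imread('page_new.png', cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(im1, im2)                              # step 1: difference image
_, th = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY)    # step 2: ThIm

# steps 3-4: connected components and their bounding boxes
n, labels, stats, _ = cv2.connectedComponentsWithStats(th)
for i in range(1, n):                                     # label 0 is the background
    x, y, w, h, area = stats[i]
    crop = im1[y:y + h, x:x + w]                          # step 5: crop the original
    text = pytesseract.image_to_string(crop)              # step 6: OCR the crop
    if any(c.isdigit() for c in text):
        print('numeric change at', (x, y, w, h), repr(text.strip()))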
Is it somehow possible to merge two images captured by searching for identical features in OpenCV in both images?
My images will always contain sheets of paper which are too big to be captured in a single frame, so I need to take two or more frames. The images are captured so that there is some overlapping area, see:
Top picture: http://i.imgur.com/0tvGVKG.jpg
Bottom picture: http://i.imgur.com/nmlO4gL.jpg
IDEA: Restrict features to only move up/down and not to the right, etc. Moreover, the lengths of all vectors between two identical features in both images should be more or less identical, as the area is overlapping. OpenCV also only has to look at the very bottom of picture one and at the very top of picture two for overlapping features. All the other areas are NOT of interest...
Edit: Okay, I finally found some sources about this, with some good links in the answers: how to find overlapping region between images in opencv?
Yes, the linked question you posted is the way to go. Local features might not work very well for your case (a sheet of paper); if they don't, you could do a kind of row scanner where you compare a few rows from one image to rows of the other image until the row comparison has a similarity above some threshold. Basically, you are finding where the images should intersect.
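A rough sketch of that row-scanner idea in Python with NumPy/OpenCV; the band height, the similarity measure (normalized cross-correlation) and the assumption that both frames have the same width are mine:

import cv2
import numpy as np

top = cv2.imread('top.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)
bottom = cv2.imread('bottom.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float32)

band = 20                      # compare bands of 20 rows at a time
probe = bottom[:band]          # topmost band of the lower picture

best_row, best_score = 0, -1.0
for y in range(top.shape[0] - band):
    cand = top[y:y + band]
    a = cand - cand.mean()
    b = probe - probe.mean()
    # normalized cross-correlation between the two bands
    score = float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    if score > best_score:
        best_row, best_score = y, score

print('overlap starts at row', best_row, 'with score', best_score)
stitched = np.vstack([top[:best_row], bottom])   # keep the top image up to the overlap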
For Java image stitching you might check this OpenCV alternative: BoofCV
Or this dedicated stitching library: Hugin
In the worst case you can check how they do the stitching and code a simpler version yourself.
If I have an image of a table of boxes, with some coloured in, is there an image processing library that can help me turn this into an array?
Thanks
You can use a thresholding function to binarize the image into dark/light pixels so dark pixels are 0 and light ones are 1.
Then you would want to remove image artifacts (noise) using dilation and erosion functions (all of these are well defined on Wikipedia).
Finally, if you know where the boxes are, you can just read the value at the center of each box to determine the array value, or use an area near the center and take the prevailing value (i.e. more 0s means a filled-in square, more 1s means an empty square).
If you are scanning these boxes and there is a lot of variation in the position of the boxes, you will have to perform some level of image registration using known points, or fiducials.
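A minimal sketch of that pipeline with OpenCV; the grid dimensions, threshold, kernel size and sampling window are assumptions you would tune for your scans:

import cv2
import numpy as np

img = cv2.imread('boxes.png', cv2.IMREAD_GRAYSCALE)

# binarize: light pixels -> 1, dark pixels -> 0
_, binary = cv2.threshold(img, 128, 1, cv2.THRESH_BINARY)

# clean up noise with an opening (erosion followed by dilation)
kernel = np.ones((3, 3), np.uint8)
binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

rows, cols = 5, 5                                # assumed grid layout
h, w = binary.shape
cell_h, cell_w = h // rows, w // cols

grid = np.zeros((rows, cols), dtype=int)
for r in range(rows):
    for c in range(cols):
        # sample a small window around the cell centre and take the prevailing value
        cy, cx = r * cell_h + cell_h // 2, c * cell_w + cell_w // 2
        window = binary[cy - 3:cy + 4, cx - 3:cx + 4]
        grid[r, c] = 1 if window.mean() < 0.5 else 0   # 1 = filled in (mostly dark)
print(grid)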
As far as what tools to use, I'd recommend first trying this manually with a tool like ImageJ, which has a UI and can also be used programmatically since it is written entirely in Java.
Other good libraries for this include OpenCV and the Java Advanced Imaging API.
Your results will definitely vary depending on the input images and how consistently lit and positioned they are.
The best way to see how it will do for your data is to apply these processing steps manually, to find out what your threshold value should be and how much dilating/eroding you need to get consistent results.
What is the best way to identify an image's type? rwong's answer on this question suggests that Google segments images into the following groups:
Photo - continuous-tone
Clip art - smooth shading
Line drawing - bitonal
What is the best strategy for classifying an image into one of those groups? I'm currently using Java but any general approaches are welcome.
Thanks!
Update:
I tried the unique colour counting method that tyjkenn mentioned in a comment and it seems to work for about 90% of the cases I've tried. In particular, black-and-white photos are hard to detect correctly using the unique colour count alone.
Getting the image histogram and counting the peaks alone doesn't seem like a viable option. For example, this image only has two peaks:
Here are two more images I've checked out:
Rather simple but effective approaches to differentiate between drawings and photos. Use them in combination to achieve the best accuracy:
1) Mime type or file extension
PNGs are typically clip art or drawings, while JPEGs are mostly photos.
2) Transparency
If the image has an alpha channel, it's most likely a drawing. If an alpha channel exists, you can additionally iterate over all pixels to check whether transparency is actually used. Here is some Python example code:
from PIL import Image

img = Image.open('test.png')
transparency = False
if img.mode in ('RGBA', 'RGBa', 'LA') or (img.mode == 'P' and 'transparency' in img.info):
    if img.mode != 'RGBA':
        img = img.convert('RGBA')
    transparency = any(px[3] < 220 for px in img.getdata())
print('Transparency:', transparency)
3) Color distribution
Clip art often has regions with identical colors. If a few colors make up a significant part of the image, it is more likely a drawing than a photo. This code outputs the percentage of the image area covered by the ten most used colors (Python example):
from PIL import Image

img = Image.open('test.jpg')
img.thumbnail((200, 200), Image.LANCZOS)   # downscale to speed up the color count
w, h = img.size
colors = sorted(img.convert('RGB').getcolors(w * h), key=lambda x: x[0], reverse=True)
print(100.0 * sum(count for count, _ in colors[:10]) / (w * h))
You need to adapt and optimize those values. Are ten colors enough for your data? What percentage works best for you? Find out by testing a larger number of sample images. 30% or more typically indicates clip art. Not for sky photos or the like, though. Therefore, we need another method - the next one.
4) Sharp edge detection via FFT
Sharp edges result in high frequencies in a Fourier spectrum. And typically such features are more often found in drawings (another Python snippet):
from PIL import Image
import numpy as np

img = Image.open('test.jpg').convert('L')
values = np.abs(np.fft.fft2(np.asarray(img))).flatten()
high_values = values[values > 10000]
high_values_ratio = 100.0 * len(high_values) / len(values)
print(high_values_ratio)
This code gives you the percentage of frequency components whose magnitude is above 10,000. Again: optimize such numbers according to your sample images.
Combine and optimize these methods for your image set. Let me know if you can improve this - or just edit this answer, please. I'd like to improve it myself :-)
This problem can be solved by image classification and that's probably Google's solution to the problem. Basically, what you have to do is (i) get a set of images labeled into 3 categories: photo, clip-art and line drawing; (ii) extract features from these images; (iii) use the image's features and label to train a classifier.
Feature Extraction:
In this step you have to extract visual information that may be useful for the classifier to discriminate between the 3 categories of images:
A very basic yet useful visual feature is the image histogram and its variants. For example, the gray level histogram of a photo is probably smoother than a histogram of a clipart, where you have regions that may be all of the same color value.
Another feature one can use is to convert the image to the frequency domain (e.g. using the FFT or DCT) and measure the energy of the high-frequency components. Because line drawings will probably have sharp color transitions, their high-frequency components will tend to accumulate more energy.
There are also a number of other feature extraction algorithms that may be used.
Training a Classifier:
After the feature extraction phase, we will have for each image a vector of numeric values (let's call it the image feature vector) together with its label. That's suitable input for training a classifier. As for the classifier, one may consider Neural Networks, SVMs and others.
Classification:
Now that we have a trained classifier, to classify an image (i.e. detect an image's category) we simply extract its features, input them to the classifier, and it will return its predicted category.
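A rough sketch of that pipeline using scikit-learn; the choice of library, the two features (grayscale histogram plus a high-frequency energy measure) and the file names are assumptions, not part of the original answer:

import numpy as np
from PIL import Image
from sklearn.svm import SVC

def features(path):
    """Grayscale histogram plus the share of spectral energy in high-magnitude coefficients."""
    img = np.asarray(Image.open(path).convert('L'), dtype=np.float32)
    hist, _ = np.histogram(img, bins=64, range=(0, 255), density=True)
    spectrum = np.abs(np.fft.fft2(img))
    high_energy = spectrum[spectrum > 10000].sum() / spectrum.sum()
    return np.append(hist, high_energy)

# labels: 0 = photo, 1 = clip art, 2 = line drawing (hypothetical training files)
train_paths = ['photo1.jpg', 'clip1.png', 'line1.png']
train_labels = [0, 1, 2]
X = np.array([features(p) for p in train_paths])
clf = SVC().fit(X, train_labels)

print(clf.predict([features('unknown.png')]))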
Histograms would be a first way to do this.
Convert the color image to grayscale and calculate the histogram.
A very bimodal histogram with 2 sharp peaks, one in black (or dark) and one in white (or light), probably with much more white, is a good indication of a line drawing.
If you have just a few more peaks then it is likely a clip-art type image.
Otherwise it's a photo.
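A minimal sketch of that peak-counting rule; the histogram smoothing window, the 1% peak threshold and the clip-art cutoff are assumptions to tune:

import numpy as np
from PIL import Image

img = np.asarray(Image.open('test.png').convert('L'))
hist, _ = np.histogram(img, bins=256, range=(0, 255))
hist = np.convolve(hist, np.ones(5) / 5, mode='same')   # smooth the histogram a little

# a "peak" is a bin larger than both neighbours and holding over 1% of all pixels
thresh = 0.01 * img.size
peaks = [i for i in range(1, 255)
         if hist[i] > hist[i - 1] and hist[i] > hist[i + 1] and hist[i] > thresh]

if len(peaks) == 2:
    print('probably a line drawing')
elif len(peaks) <= 8:
    print('probably clip art')
else:
    print('probably a photo')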
In addition to color histograms, also consider edge information and the consistency of line widths throughout the image.
Photo - natural edges will have a variety of edge strengths, and it's less likely that there will be many parallel edges.
Clip art - A watershed algorithm could help identify large, connected regions of consistent brightness. In clip art and synthetic images designed for high visibility there are more likely to be perfectly straight lines and parallel lines. A histogram of edge strengths is likely to have a few very strong peaks.
Line drawing - synthetic lines are likely to have very consistent width. The Stroke Width Transform could help you identify strokes. (One of the basic principles is to find edge gradients that "point at" each other.) A histogram of edge strengths may have only one strong peak.
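A small sketch of the "variety of edge strengths" idea; the Canny thresholds and the use of a relative spread measure are assumptions:

import cv2
import numpy as np

img = cv2.imread('test.png', cv2.IMREAD_GRAYSCALE)
gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
mag = cv2.magnitude(gx, gy)

edges = cv2.Canny(img, 100, 200) > 0     # keep only edge pixels
strengths = mag[edges]

# photos tend to show a wide spread of edge strengths,
# synthetic drawings a narrow one (a few sharp histogram peaks)
spread = strengths.std() / (strengths.mean() + 1e-9)
print('relative spread of edge strengths:', spread)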
Given:
two images of the same subject matter;
the images have the same resolution, colour depth, and file format;
the images differ in size and rotation; and
two lists of (x, y) co-ordinates that correlate the images.
I would like to know:
How do you transform the larger image so that it visually aligns to the second image?
(Optional.) What are the minimum number of points needed to get an accurate transformation?
(Optional.) How far apart do the points need to be to get an accurate transformation?
The transformation would need to rotate, scale, and possibly shear the larger image. Essentially, I want to create (or find) a program that does the following:
Input two images (e.g., TIFFs).
Click several anchor points on the small image.
Click the several corresponding anchor points on the large image.
Transform the large image such that it maps to the small image by aligning the anchor points.
This would help align pictures of the same stellar object. (For example, a hand-drawn picture from 1855 mapped to a photograph taken by Hubble in 2000.)
Many thanks in advance for any algorithms (preferably Java or similar pseudo-code), ideas or links to related open-source software packages.
This is called Image Registration.
Mathworks discusses this, MATLAB has this ability, and more information is in the Elastix manual.
Consider:
Open source Matlab equivalents
IRTK
IRAF
Hugin
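If you want to roll your own from the clicked anchor points, here is a hedged sketch with OpenCV (the file names and point coordinates are placeholders). An affine transform (rotation, scale, shear and translation) can be estimated from as few as three point pairs; more points, spread well apart, give a more stable least-squares fit:

import cv2
import numpy as np

large = cv2.imread('large.tif')
small = cv2.imread('small.tif')

# corresponding anchor points clicked in each image (placeholder coordinates)
pts_large = np.float32([[120, 80], [640, 95], [400, 500], [150, 430]])
pts_small = np.float32([[30, 20], [160, 25], [100, 125], [38, 108]])

# least-squares affine fit covering rotation, scale, shear and translation
M, inliers = cv2.estimateAffine2D(pts_large, pts_small)

# warp the large image into the small image's coordinate frame
aligned = cv2.warpAffine(large, M, (small.shape[1], small.shape[0]))
cv2.imwrite('aligned.tif', aligned)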
You can use the javax.imageio or Java Advanced Imaging APIs for rotating, shearing and scaling the images once you have worked out what you want to do with them.
For a C++ implementation (without GUI), try the old KLT (Kanade-Lucas-Tomasi) tracker.
http://www.ces.clemson.edu/~stb/klt/