I have about 5000 images with watermarks on them and 5000 identical images with no watermarks. The file names of the two sets are not correlated to each other in any way. I'm looking for an API, preferably in Java, that I can use to pair each watermarked image with its non-watermarked counterpart.
You can use the OpenCV library; it has Java bindings. See http://docs.opencv.org/doc/tutorials/introduction/desktop_java/java_dev_intro.html
Regarding image comparison, there is another useful answer here: Checking images for similarity with OpenCV
I think this is more about performance than about the image comparison itself, and the answer is written with that in mind; if you need help with the comparison itself, leave a comment.
1. Create a simplified histogram for each image.
Say 8 bins per channel, with each bin count quantized to 4 bits. That gives 3*8*4 = 96 bits per image (32 bits per channel).
2. Sort the images.
Treat the histogram as a single number and sort group A by it; ascending or descending does not matter.
3. Match the A and B group images.
Corresponding images should now have similar histograms, so take each image from the unsorted group B (watermarked), binary-search for the closest matches in group A (original), and then compare only those candidates with more robust methods, instead of all 5000.
4. Flag each group A image once it is matched.
That way you can skip already-matched images in step 3 and gain more speed.
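A minimal Java sketch of steps 1 and 2, assuming 8-bit RGB images loaded as BufferedImage (class and field names are illustrative):

```java
import java.awt.image.BufferedImage;
import java.util.Comparator;

public class HistogramSignature {

    /** Step 1: 8 bins per channel, each bin count quantized to 4 bits -> 3 * 32 = 96 bits. */
    public static long[] signature(BufferedImage img) {
        int[][] bins = new int[3][8];
        int w = img.getWidth(), h = img.getHeight();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                bins[0][((rgb >> 16) & 0xFF) >> 5]++; // red:   intensity 0..255 -> bin 0..7
                bins[1][((rgb >> 8) & 0xFF) >> 5]++;  // green
                bins[2][(rgb & 0xFF) >> 5]++;         // blue
            }
        }
        long total = (long) w * h;
        long[] sig = new long[3];                     // one 32-bit word per channel
        for (int c = 0; c < 3; c++) {
            for (int b = 0; b < 8; b++) {
                long q = bins[c][b] * 15L / total;    // squash the count into 4 bits (0..15)
                sig[c] = (sig[c] << 4) | q;
            }
        }
        return sig;
    }

    /** Step 2: treat the signature as one number and sort by it (direction is irrelevant). */
    public static final Comparator<long[]> SIG_ORDER =
            Comparator.<long[]>comparingLong(s -> s[0])
                      .thenComparingLong(s -> s[1])
                      .thenComparingLong(s -> s[2]);
}
```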
[Notes]
There are other ways to improve this, such as using perceptual hash algorithms.
Hi everyone, I'm trying to implement a face recognition system for a video surveillance application.
In this context the test images are low quality, the illumination changes from one image to another, and, moreover, the detected subjects are not always in the same pose.
As a first recognizer I used FisherFaces and, with 49 test images, I obtained an accuracy of 35/49, without considering the distances of each classified subject (I just considered the labels). Trying to get better accuracy, I attempted to preprocess both the training images and the test images; the preprocessing I chose is described in the book “Mastering OpenCV with Practical Computer Vision Projects”. The steps are:
detection of the eyes, in order to align and rotate the face;
separate histogram equalization, to standardize the lighting in the image;
filtering, to reduce the effect of pixel noise, since histogram equalization increases it;
finally, applying an elliptical mask to the face, to remove details that are not significant for recognition.
Well, with this preprocessing I obtained worse results than before (4/49 subjects properly classified). So I thought of using another classifier, the LBPH recognizer, to improve the accuracy, since the two algorithms use different features and different ways of classifying a face; used together, they might increase the accuracy.
So my question is about ways to combine these two algorithms; does anyone know how to merge the two outputs to obtain better accuracy? My idea is this: if FisherFaces and LBPH give the same result (the same label), there is no problem; if they disagree, take the vector of labels and the vector of distances from each algorithm, and for each subject sum the corresponding distances; the label of the test image is then the one with the smallest total distance.
This is just my idea; there may be better ways to fuse the outputs of both algorithms, especially since I would have to change the code of the predict function in OpenCV's face module, which returns a single int rather than a vector of ints.
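For what it's worth, a rough sketch of that score-level fusion, assuming you have already obtained a per-label distance map from each recognizer (e.g. by modifying predict as described); the maps and names here are hypothetical, and the distances are min-max normalized first so the two scales are comparable:

```java
import java.util.HashMap;
import java.util.Map;

public class ScoreFusion {

    /** Min-max normalize distances to [0, 1] so the two recognizers' scales are comparable. */
    static Map<Integer, Double> normalize(Map<Integer, Double> dist) {
        double min = dist.values().stream().min(Double::compare).orElse(0.0);
        double max = dist.values().stream().max(Double::compare).orElse(1.0);
        Map<Integer, Double> out = new HashMap<>();
        for (Map.Entry<Integer, Double> e : dist.entrySet())
            out.put(e.getKey(), max > min ? (e.getValue() - min) / (max - min) : 0.0);
        return out;
    }

    /** Sum the normalized per-label distances; the fused label is the one with the smallest total. */
    static int fuse(Map<Integer, Double> fisherDistances,  // hypothetical label -> distance maps,
                    Map<Integer, Double> lbphDistances) {  // e.g. from a modified predict()
        Map<Integer, Double> f = normalize(fisherDistances);
        Map<Integer, Double> l = normalize(lbphDistances);
        int best = -1;
        double bestScore = Double.MAX_VALUE;
        for (Integer label : f.keySet()) {
            double score = f.get(label) + l.getOrDefault(label, 1.0); // missing label = worst distance
            if (score < bestScore) { bestScore = score; best = label; }
        }
        return best;
    }
}
```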
I need to search a huge image database to find possible duplicates using pHash, assuming those image records already have hash codes generated with pHash.
Now I have to compare a new image, creating its hash with pHash, against the existing records. But as I understand it, the hash comparison is NOT as straightforward as
hash1 - hash2 < threshold
It looks like I need to pass both hash codes into a pHash API to do the matching. So I would have to retrieve all the hash codes from the DB in batches and compare them one by one using the pHash API.
But this does not look like the best approach if I have a queue of about 1000 images to compare against millions of already existing images.
I need to know the following:
Is my understanding/approach of using pHash to compare against an existing image DB correct?
Is there a better approach to handle this (without using CBIR libraries like LIRE)?
I have heard of an algorithm called dHash, which can also be used for image comparison via hash codes. Are there any Java libraries for it, and can it be used together with pHash to optimize this large-scale, repeated image processing task?
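(For reference, dHash is simple enough to sketch directly: the commonly described recipe shrinks the image to 9x8 grayscale and sets one bit per comparison of horizontally adjacent pixels; the comparison between hashes is then a plain Hamming distance, which is also how 64-bit pHash-style hashes are typically compared. Class and method names below are illustrative.)

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class DHash {

    /** Classic dHash: shrink to 9x8 grayscale, set a bit when a pixel is brighter than its right neighbour. */
    public static long dhash(BufferedImage src) {
        BufferedImage small = new BufferedImage(9, 8, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = small.createGraphics();
        g.drawImage(src, 0, 0, 9, 8, null);       // scale down; fancier resampling also works
        g.dispose();
        long hash = 0L;
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                int left  = small.getRaster().getSample(x, y, 0);
                int right = small.getRaster().getSample(x + 1, y, 0);
                hash = (hash << 1) | (left > right ? 1 : 0);
            }
        }
        return hash;
    }

    /** Similarity of two 64-bit hashes is their Hamming distance. */
    public static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```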
Thanks in advance.
I think some part of this question is discussed on the pHash support forum.
You will need to use the mvptree storage mechanism:
http://lists.phash.org/htdig.cgi/phash-support-phash.org/2011-May/000122.html
and
http://lists.phash.org/htdig.cgi/phash-support-phash.org/2010-October/000103.html
Depending on your definition of "huge", a good solution here is to implement a BK-tree (see this human-readable description).
I'm working on a similar project, and I implemented a BK-tree in Cython. It's fairly performant: searching with a Hamming distance of 2 takes less than 50 ms on a 12-million-item dataset and touches ~0.01-0.02% of the tree nodes.
Larger searches (edit distance of 8) take longer (~500 ms) and touch about 5% of the tree nodes.
This is with a 64-bit hash size.
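For illustration, a minimal Java analogue of such a BK-tree over 64-bit hashes with the Hamming metric could look like the sketch below (this is not the answerer's Cython implementation; names are illustrative):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** BK-tree over 64-bit hashes, using the Hamming distance as the metric. */
public class BKTree {
    private final long value;
    private final Map<Integer, BKTree> children = new HashMap<>();

    public BKTree(long value) { this.value = value; }

    public void add(long hash) {
        BKTree node = this;
        while (true) {
            int d = Long.bitCount(node.value ^ hash);
            if (d == 0) return;                   // already present
            BKTree child = node.children.get(d);
            if (child == null) { node.children.put(d, new BKTree(hash)); return; }
            node = child;                         // descend along the edge labelled d
        }
    }

    /** All stored hashes within maxDist of the query; prunes subtrees via the triangle inequality. */
    public List<Long> search(long query, int maxDist) {
        List<Long> hits = new ArrayList<>();
        collect(query, maxDist, hits);
        return hits;
    }

    private void collect(long query, int maxDist, List<Long> hits) {
        int d = Long.bitCount(value ^ query);
        if (d <= maxDist) hits.add(value);
        // only children whose edge label lies in [d - maxDist, d + maxDist] can contain matches
        for (int i = Math.max(0, d - maxDist); i <= d + maxDist; i++) {
            BKTree child = children.get(i);
            if (child != null) child.collect(query, maxDist, hits);
        }
    }
}
```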
I have an image that I want to transform to the frequency domain using an FFT. There seems to be a lack of Java libraries for this, but I have found two: one is JTransforms, and the other is less well known and doesn't have a name.
With the less well-known one, the 2D arrays could only have lengths that are powers of two, but it had simple methods like FastFourierTransform.fastFT(real, imaginary, true);, with real being a 2D array of doubles holding every pixel value and imaginary being a 2D array of the same size filled with zeroes; the boolean selects a forward or inverse transform. This made sense to me and it worked, except that the power-of-two requirement ruined every transform I did (I initially padded the image with black to the closest power of two). What I am struggling with is working out how to use the equivalent methods in JTransforms, and I would appreciate any guidance. I will state what I am currently doing.
I believe the relevant class is DoubleFFT_2D; its constructor takes a number of rows and columns, which I assume to be the height and width of my image. Because my image has no imaginary parts, I think I can use doubleFFT.realForwardFull(real);, which treats the imaginary parts as zero, and pass it the real 2D array of pixels. Unfortunately this doesn't work at all. The JavaDoc states the input array must be of size rows*2*columns, with only the first rows*columns elements filled with real data, but I don't see how this relates to my image or what I would have to do to meet this requirement.
Sorry about the lengthy and poor explanation; if any additional information is needed, I would be happy to provide it.
JTransforms Library and Docs can be found here: https://sites.google.com/site/piotrwendykier/software/jtransforms
It's too bad the documentation for JTransforms isn't available online other than as a zipped download. It's very complete and helpful; you should check it out!
To answer your question: DoubleFFT_2D.realForwardFull(double[][] a) takes an array of real numbers (your pixels). However, the result of the FFT has two output values for each input value: the real and the imaginary part of each frequency bin. This is why your input array needs to be twice as big as the actual image array, with half of it empty / filled with zeroes.
Note that all the FFT functions use a not only for input but also for output; this means any image data in there will be lost, so you might want to copy it to a different / larger array anyway!
The easy and obvious fix for your scenario is to use DoubleFFT_2D.realForward(double[][] a) instead. This one only calculates the positive half of the spectrum, because with real-valued input the negative half is symmetric to it.
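To make that concrete, a short sketch of both call styles; the class and methods are JTransforms' own (DoubleFFT_2D, realForward, realForwardFull), but the import path differs between releases and the sizes and variable names here are illustrative:

```java
// import edu.emory.mathcs.jtransforms.fft.DoubleFFT_2D;  // older releases
import org.jtransforms.fft.DoubleFFT_2D;                  // JTransforms 3.x

public class FftExample {
    public static void main(String[] args) {
        int rows = 256, cols = 256;                // image height and width; no power-of-two needed

        // Packed in-place transform: the array stays rows x cols,
        // and afterwards holds the packed positive half of the spectrum.
        double[][] data = new double[rows][cols];
        // ... fill data[y][x] with pixel intensities ...
        DoubleFFT_2D fft = new DoubleFFT_2D(rows, cols);
        fft.realForward(data);

        // Full complex transform: allocate rows x 2*cols and fill only
        // the first cols entries of each row with real data.
        double[][] full = new double[rows][2 * cols];
        // ... copy pixel intensities into full[y][0..cols-1] ...
        fft.realForwardFull(full);                 // full[y][2*x] = Re, full[y][2*x + 1] = Im
    }
}
```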
Also, check out the RealFFTUtils_2D class in JTransforms, which will make it a lot easier for you to retrieve your results from the array afterwards :)
I'm thinking of writing a simple research paper manager.
The idea is to have a repository containing, for each paper, its metadata:
paper_id -> [title, authors, journal, comments...]
Since it would be nice to be able to import a friend's paper dump, I'm thinking about how to generate the paper_id of a paper: IMHO it should be derived from the text of the PDF, to guarantee that two different collections share ids only for the same papers.
At the moment, I extract the text of the first page using the iText library (removing any annotations), and I compute a simhash footprint from the text.
The main problem is that the text is sometimes slightly different (yes, it happens! for example this and this), so I would like to be tolerant.
With simhash I can compute how similar two documents are, so if a footprint is not in the repo, I'll have to iterate over the collection looking for 'near' footprints.
I'm not convinced by this method; could you suggest a better way to produce a signature (short, numerical or alphanumerical) for this kind of document?
UPDATE: I had this idea: divide the first page into 8 (more or less) non-overlapping squares covering the whole page, then take the text in each square and generate a simhash signature for it. That gives an 8x64 = 512-bit signature, and I can consider two papers the same if the sum of the differences between their simhash signatures is under a certain threshold.
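For illustration, a minimal 64-bit simhash over whitespace tokens might look like the sketch below; the token hash and class name are illustrative, and the 8-square scheme would simply call simhash once per square and keep a long[8]:

```java
public class SimHash {

    /** 64-bit FNV-1a hash of a token (illustrative; any decent 64-bit hash works). */
    static long fnv1a64(String s) {
        long h = 0xcbf29ce484222325L;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Classic simhash: per bit position, vote +1/-1 over all token hashes, keep the sign. */
    public static long simhash(String text) {
        int[] votes = new int[64];
        for (String token : text.toLowerCase().split("\\s+")) {
            long h = fnv1a64(token);
            for (int b = 0; b < 64; b++)
                votes[b] += ((h >>> b) & 1) == 1 ? 1 : -1;
        }
        long sig = 0L;
        for (int b = 0; b < 64; b++)
            if (votes[b] > 0) sig |= 1L << b;
        return sig;
    }

    /** Difference between two signatures = Hamming distance; sum this over the 8 page squares. */
    public static int distance(long a, long b) {
        return Long.bitCount(a ^ b);
    }
}
```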
If you actually have a function that takes two texts and returns a measure of their similarity, you do not have to iterate over the entire repository.
Given an article that is not in the repository, you can compare it only against articles of approximately the same length. For example, given an article of 1000 characters, compare it to articles of between 950 and 1050 characters. For this you need a data structure that maps length ranges to articles, and you will have to fine-tune the size of the range: too large, and each range holds too many items; too small, and you risk missing a match.
Of course this fails on some edge cases. For example, if the second of two documents is simply the first copy-pasted twice, you would probably want them to be considered equal, but they will never even be compared because their lengths are too far apart. There are ways to deal with that too, but you probably 'Ain't gonna need it'.
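A sketch of such a length-range index, using a TreeMap so the length window becomes a single subMap query (the class name and tolerance parameter are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Index articles by text length; candidate lookups only scan a small length window. */
public class LengthIndex {
    private final TreeMap<Integer, List<String>> byLength = new TreeMap<>();

    public void add(String articleId, int textLength) {
        byLength.computeIfAbsent(textLength, k -> new ArrayList<>()).add(articleId);
    }

    /** Articles whose length is within tolerance (e.g. 0.05 for +/-5%) of the query length. */
    public List<String> candidates(int textLength, double tolerance) {
        int lo = (int) (textLength * (1 - tolerance));
        int hi = (int) (textLength * (1 + tolerance));
        List<String> out = new ArrayList<>();
        for (List<String> bucket : byLength.subMap(lo, true, hi, true).values())
            out.addAll(bucket);
        return out;
    }
}
```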
In particular, I want to generate a tolerance interval, for which I need the value of z_x for a given x on the standard normal distribution.
Does the Java standard library have anything like this, or should I roll my own?
EDIT: Specifically, I'm looking to do something akin to linear regression on a set of images. I have two images, and I want to see what the degree of correlation is between their pixels. I suppose this might fall under computer vision as well.
Simply calculate the Pearson correlation coefficient between the two images.
You will get 3 coefficients, because the R, G, B channels need to be analyzed separately.
Or you can calculate a single coefficient just for the intensity levels of the images, or you could calculate the correlation between the hue values of the images after converting them to the HSV or HSL color space.
Do whichever fits your needs :-)
EDIT: The correlation coefficient may be maximized only after scaling and/or rotating one of the images. This may or may not be a problem, depending on your needs.
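For illustration, a minimal sketch of the single-coefficient (intensity) variant, assuming two equally sized BufferedImages; running it per channel gives the three-coefficient version:

```java
import java.awt.image.BufferedImage;

public class PixelCorrelation {

    /** Pearson correlation between the intensity values of two equally sized images. */
    public static double pearson(BufferedImage a, BufferedImage b) {
        int w = a.getWidth(), h = a.getHeight();
        if (w != b.getWidth() || h != b.getHeight())
            throw new IllegalArgumentException("images must have the same dimensions");
        double n = (double) w * h;
        double sumX = 0, sumY = 0, sumXX = 0, sumYY = 0, sumXY = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double px = intensity(a.getRGB(x, y));
                double py = intensity(b.getRGB(x, y));
                sumX += px; sumY += py;
                sumXX += px * px; sumYY += py * py; sumXY += px * py;
            }
        }
        double cov  = sumXY - sumX * sumY / n;
        double varX = sumXX - sumX * sumX / n;
        double varY = sumYY - sumY * sumY / n;
        return cov / Math.sqrt(varX * varY);
    }

    /** Plain average of R, G, B; swap in a single channel for the per-channel variant. */
    private static double intensity(int rgb) {
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3.0;
    }
}
```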
You can use the complete statistical power of R using rJava/JRI. This includes correlations between pixels and so on.
Another option is to look at ImageJ, which contains libraries for many image manipulations, mathematics and statistics. It's an application all right, but the library is usable in development as well, and it comes with an extensive developer's manual. On a side note, ImageJ can be combined with R as well.
ImageJ lets you use appropriate methods for computing image similarity measures, based on Fourier transforms or other approaches. More info can be found in Digital Image Processing with Java and ImageJ. See also this paper.
Another one is Commons-Math, which also contains the basic statistical tools.
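As an example of the latter, the z_x values from the question can be obtained from Commons-Math's NormalDistribution class (the commons-math3 API is shown; package names differ in other versions):

```java
import org.apache.commons.math3.distribution.NormalDistribution;

public class ZValues {
    public static void main(String[] args) {
        NormalDistribution stdNormal = new NormalDistribution(0, 1);

        // z such that P(Z <= z) = 0.975, i.e. the 97.5th percentile (~1.96)
        double z975 = stdNormal.inverseCumulativeProbability(0.975);

        // the other direction: P(Z <= 1.96) ~ 0.975
        double p = stdNormal.cumulativeProbability(1.96);

        System.out.printf("z_{0.975} = %.4f, P(Z <= 1.96) = %.4f%n", z975, p);
    }
}
```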
See also the answers on this question and this question.
It seems you want to compare two images to see how similar they are. In this case, the first two things to try are SSD (sum of squared differences) and normalized correlation (closely related to what 0x69 suggests, the Pearson correlation) between the two images.
You can also try normalized correlation over small corresponding windows in the two images and add up the results over several (or all) windows.
These two are very simple methods that you can write in a few minutes.
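For instance, a minimal SSD over average-of-RGB intensities, assuming two equally sized BufferedImages (names are illustrative):

```java
import java.awt.image.BufferedImage;

public class SumSquaredDiff {

    /** Sum of squared differences over grayscale intensities; 0 means identical images. */
    public static double ssd(BufferedImage a, BufferedImage b) {
        int w = a.getWidth(), h = a.getHeight();   // assumes equal dimensions
        double sum = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double d = gray(a.getRGB(x, y)) - gray(b.getRGB(x, y));
                sum += d * d;
            }
        }
        return sum;
    }

    private static double gray(int rgb) {
        return (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3.0;
    }
}
```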
I'm not sure, however, what this has to do with hypothesis testing or linear regression; you might want to edit your question to clarify that part.