Euclidean algorithm for image comparison - Java

I am going to develop an application for image comparison in Java. For this I have chosen the Euclidean algorithm. The application involves two images:
1. The actual image
2. A part of the actual image
The algorithm should compare the part with the actual image. If the part exists in the actual image, it should return a value indicating a successful match.
Can anyone give me the algorithmic steps? Code in Java would be appreciated.

Here is a relatively simple idea, with some parts left out intentionally, since the question smells like homework.
public static boolean contains(Image large, Image small) {
final int largeWidth = large.getWidth(), largeHeight = large.getHeight();
final int smallWidth = small.getWidth(), smallHeight = small.getHeight();
if (smallWidth > largeWidth || smallHeight > largeHeight) {
return false;
}
// Use <= so the last valid offset is also tested (e.g. when both images have the same size).
for (int x = 0; x <= largeWidth - smallWidth; x++) {
for (int y = 0; y <= largeHeight - smallHeight; y++) {
if (subImageEquals(large, x, y, small)) {
return true;
}
}
}
return false;
}
private static boolean subImageEquals(Image large, int x, int y, Image small) {
// TODO: checks whether all pixels starting at (x, y) match
// those of the small image.
}
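One way the omitted pixel check could be filled in is sketched below. This is only an illustration under assumptions that are not in the answer: it uses java.awt.image.BufferedImage instead of the abstract Image type, it requires an exact pixel-for-pixel match, and the class name SubImageSearch is made up for the example.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper class; the name and the use of BufferedImage are assumptions.
final class SubImageSearch {

    /** Returns true if 'small' occurs pixel-for-pixel somewhere inside 'large'. */
    static boolean contains(BufferedImage large, BufferedImage small) {
        int maxX = large.getWidth() - small.getWidth();
        int maxY = large.getHeight() - small.getHeight();
        // If 'small' is bigger than 'large', maxX or maxY is negative and no offset is tried.
        for (int x = 0; x <= maxX; x++) {
            for (int y = 0; y <= maxY; y++) {
                if (subImageEquals(large, x, y, small)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Checks whether all pixels of 'large' starting at (x, y) match those of 'small'. */
    static boolean subImageEquals(BufferedImage large, int x, int y, BufferedImage small) {
        for (int dx = 0; dx < small.getWidth(); dx++) {
            for (int dy = 0; dy < small.getHeight(); dy++) {
                if (large.getRGB(x + dx, y + dy) != small.getRGB(dx, dy)) {
                    return false; // first mismatch rules out this offset
                }
            }
        }
        return true;
    }
}
```

Because the comparison is exact, a part that was rescaled or recompressed would not be found; a tolerance per pixel would be needed for that case.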

Related

BufferedImage slows down performance

I'm working on a game, nothing serious, just for fun.
I wrote a class 'ImageBuilder' to help creating some images.
Everything works fine, except one thing.
I initialize a variable like this:
// other stuff
m_tile = new ImageBuilder(TILE_SIZE, TILE_SIZE, BufferedImage.TYPE_INT_RGB).paint(0xff069dee).paintBorder(0xff4c4a4a, 1).build();
// other stuff
Then, in the rendering method, I have:
for (int x = 0; x < 16; x++) {
for (int y = 0; y < 16; y++) {
g.drawImage(m_tile, x * (TILE_SIZE + m_padding.x) + m_margin.x, y * (TILE_SIZE + m_padding.y) + m_margin.y, null);
}
}
Note: m_padding and m_margin are just two Vector2i instances.
This draws a simple 16x16 table on the screen using that image, but the game is almost frozen: I can't get more than about 10 FPS.
I tried creating the image without that class, like this (TILE_SIZE = 32):
m_tile = new BufferedImage(TILE_SIZE, TILE_SIZE, BufferedImage.TYPE_INT_RGB);
for (int x = 0; x < TILE_SIZE; x++) {
for (int y = 0; y < TILE_SIZE; y++) {
if (x == 0 || y == 0 || x + 1 == TILE_SIZE || y + 1 == TILE_SIZE)
m_tile.setRGB(x, y, 0x4c4a4a);
else
m_tile.setRGB(x, y, 0x069dee);
}
}
This time, I get 60 FPS.
I can't figure out what the difference is. I have created images with 'ImageBuilder' before and everything was fine, but not this time.
ImageBuilder class:
// Constructor
public ImageBuilder(int width, int height, int imageType) {
this.m_width = width;
this.m_height = height;
this.m_image = new BufferedImage(m_width, m_height, imageType);
this.m_pixels = ((DataBufferInt) m_image.getRaster().getDataBuffer()).getData();
this.m_image_type = imageType;
}
public ImageBuilder paint(int color) {
for (int i = 0; i < m_pixels.length; i++) m_pixels[i] = color;
return this;
}
public ImageBuilder paintBorder(int color, int stroke) {
for (int x = 0; x < m_width; x++) {
for (int y = 0; y < m_height; y++) {
if (x < stroke || y < stroke || x + stroke >= m_width || y + stroke >= m_height) {
m_pixels[x + y * m_width] = color;
}
}
}
return this;
}
public BufferedImage build() {
return m_image;
}
There are other methods, but I don't call them, so I don't think it is necessary to show them.
What am I doing wrong?
My guess is that the problem is your ImageBuilder accessing the backing data array of the data buffer:
this.m_pixels = ((DataBufferInt) m_image.getRaster().getDataBuffer()).getData();
Doing so may (will) ruin the chances of this image being hardware accelerated. This is documented behaviour, from the getData() API doc:
Note that calling this method may cause this DataBuffer object to be incompatible with performance optimizations used by some implementations (such as caching an associated image in video memory).
You could probably get around this easily by using a temporary image in your builder and returning, from the build() method, a copy of the temp image that has not been "tampered" with.
For best performance, always using a compatible image (as in createCompatibleImage(), mentioned by @VGR in the comments) is a good idea too. This should ensure you have the fastest possible hardware blits.
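A minimal sketch of the "return an untouched copy" idea follows. The class and method names (UntouchedCopy, copyOf) are hypothetical, and whether the copy actually ends up accelerated depends on the JVM and platform. The sketch creates a plain BufferedImage of the same type; in a windowed application the copy could instead be obtained from GraphicsConfiguration.createCompatibleImage(), which is not used here so that the example also runs without a display.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper; names are illustrative only.
final class UntouchedCopy {

    /**
     * Draws 'temp' into a fresh image whose backing array has never been
     * handed out through DataBuffer.getData().
     */
    static BufferedImage copyOf(BufferedImage temp) {
        // TYPE_CUSTOM cannot be passed to the constructor, so fall back to ARGB.
        int type = temp.getType() == BufferedImage.TYPE_CUSTOM
                ? BufferedImage.TYPE_INT_ARGB
                : temp.getType();
        BufferedImage copy = new BufferedImage(temp.getWidth(), temp.getHeight(), type);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(temp, 0, 0, null);
        } finally {
            g.dispose(); // always release the graphics context
        }
        return copy;
    }
}
```

In the builder from the question, build() could then return UntouchedCopy.copyOf(m_image) instead of m_image itself.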

Problem with Scan-Line Polygon Filling algorithm in java

(please don't mark this question as not clear, I spent a lot of time posting it ;) )
Okay, I am trying to make a simple 2D Java game engine as a learning project, and one of its features is rendering a filled polygon.
I am creating this algorithm myself, and I really can't figure out what I am doing wrong.
My thought process is something like this:
Loop through every line, get the number of points in that line, then get the X location of every point in that line.
Then loop through the line again, this time checking whether the current x lies between a pair of points in the points array; if so, draw it.
Disclaimer: the Polygon class is another type of mesh, and its draw method returns an int array with lines drawn through each vertex.
Disclaimer 2: I've tried other people's solutions but none really helped me and none really explained it properly (which is not the point in a learning project).
The draw methods are called one per frame.
FilledPolygon:
@Override
public int[] draw() {
int[] pixels = new Polygon(verts).draw();
int[] filled = new int[width * height];
for (int y = 0; y < height; y++) {
int count = 0;
for (int x = 0; x < width; x++) {
if (pixels[x + y * width] == 0xffffffff) {
count++;
}
}
int[] points = new int[count];
int current = 0;
for (int x = 0; x < width; x++) {
if (pixels[x + y * width] == 0xffffffff) {
points[current] = x;
current++;
}
}
if (count >= 2) {
int num = count;
if (count % 2 != 0)
num--;
for (int i = 0; i < num; i += 2) {
for (int x = points[i]; x < points[i+1]; x++) {
filled[x + y * width] = 0xffffffff;
}
}
}
}
return filled;
}
The Polygon class simply uses Bresenham's line algorithm and has nothing to do with the problem.
The game class:
@Override
public void load() {
obj = new EngineObject();
obj.addComponent(new MeshRenderer(new FilledPolygon(new int[][] {
{0,0},
{60, 0},
{0, 60},
{80, 50}
})));
((MeshRenderer)(obj.getComponent(MeshRenderer.class))).color = CYAN;
obj.transform.position.Y = 100;
}
The expected result is to get this shape filled up.(it was created using the polygon mesh):
The actual result of using the FilledPolygon mesh:
Your code seems to have several problems, and I will not focus on them.
Your approach based on drawing the outline then filling the "inside" runs cannot work in the general case because the outlines join at the vertices and intersections, and the alternation outside-edge-inside-edge-outside is broken, in an unrecoverable way (you can't know which segment to fill by just looking at a row).
You'd better use a standard polygon filling algorithm. You will find many descriptions on the Web.
For a simple but somewhat inefficient solution, work as follows:
process all lines between the minimum and maximum ordinates; let Y be the current ordinate;
loop on the edges;
assign every vertex a positive or negative sign if y ≥ Y or y < Y (mind the asymmetry !);
whenever the endpoints of an edge have a different sign, compute the intersection between the edge and the line;
you will get an even number of intersections; sort them horizontally;
draw between every other point.
You can get a more efficient solution by keeping a trace of which edges cross the current line, in a so-called "active list". Check the algorithms known as "scanline fill".
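The simple per-scanline procedure listed above can be sketched as follows. This is an illustration, not the answerer's code: the class and method names are made up, the vertices are assumed to be given in order around the outline with the last one implicitly connected to the first, and the output is a row-major width*height buffer like the one in the question.

```java
import java.util.Arrays;

// Hypothetical sketch of the simple (non-incremental) scanline fill.
final class ScanlineFill {

    /** Fills the polygon described by verts ({x, y} pairs) with 'color'. */
    static int[] fill(int[][] verts, int width, int height, int color) {
        int[] filled = new int[width * height];
        int n = verts.length;
        int minY = Integer.MAX_VALUE, maxY = Integer.MIN_VALUE;
        for (int[] v : verts) {
            minY = Math.min(minY, v[1]);
            maxY = Math.max(maxY, v[1]);
        }
        minY = Math.max(minY, 0);          // clip to the buffer
        maxY = Math.min(maxY, height - 1);
        double[] xs = new double[n];       // at most one crossing per edge
        for (int y = minY; y <= maxY; y++) {
            int count = 0;
            for (int i = 0; i < n; i++) {
                int[] a = verts[i], b = verts[(i + 1) % n];
                // Asymmetric sign test: "positive" means vertex y >= scanline y.
                boolean aPos = a[1] >= y, bPos = b[1] >= y;
                if (aPos != bPos) {
                    double t = (double) (y - a[1]) / (b[1] - a[1]);
                    xs[count++] = a[0] + t * (b[0] - a[0]);
                }
            }
            Arrays.sort(xs, 0, count);     // sort the crossings horizontally
            for (int i = 0; i + 1 < count; i += 2) {
                int from = Math.max(0, (int) Math.ceil(xs[i]));
                int to = Math.min(width - 1, (int) Math.floor(xs[i + 1]));
                for (int x = from; x <= to; x++) {
                    filled[x + y * width] = color;
                }
            }
        }
        return filled;
    }
}
```

Because of the asymmetric test, the row at the minimum ordinate produces no crossings and stays empty, which keeps adjacent polygons from overlapping. For a self-intersecting outline this procedure fills according to the even-odd rule.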
Note that you imply that pixels[] has the same width*height size as filled[]. Based on the mangled output, I would say that they are just not the same.
Otherwise, if you just want to fill a scanline (assuming everything is convex), that code is overcomplicated. Simply look for the endpoints and loop between them:
public int[] draw() {
int[] pixels = new Polygon(verts).draw();
int[] filled = new int[width * height];
for (int y = 0; y < height; y++) {
int left = -1;
for (int x = 0; x < width; x++) {
if (pixels[x + y * width] == 0xffffffff) {
left = x;
break;
}
}
if (left >= 0) {
int right = left;
for (int x = width - 1; x > left; x--) {
if (pixels[x + y * width] == 0xffffffff) {
right = x;
break;
}
}
for (int x = left; x <= right; x++) {
filled[x + y * width] = 0xffffffff;
}
}
}
return filled;
}
However this kind of approach relies on having the entire polygon in the view, which may not always be the case in real life.

2d Collision Detection - Trying To Get All The Non Transparent Pixels Of Two Sprites

I am in the process of building a 2d game and I am trying to implement pixel level/perfect collision detection.
My problem is that I am trying to get all the non-transparent pixels of my sprites using the BufferedImage class's getRGB() method, but I can only call this method on a BufferedImage.
I was hoping you could point me in the right direction as to what I am trying to do. Below are the methods of my game class I am working in:
This Method Is Supposed To Get All The Non Transparent Pixels In My Sprite
public HashSet<String> getMask(Sprite character){
HashSet <String> mask = new HashSet<String>();
int pixel;
int alpha;
for(int i = 0; i < character.getWidth(); i++){
for(int j = 0; j < character.getHeight(); j++){ // increment j, not i
pixel = character.getRGB(i,j);
alpha = (pixel >> 24) & 0xff;
if(alpha != 0){
mask.add((character.getX() + i) + "," + (character.getY() - j));
}
}
}
return mask;
}
Method To Check The Collisions
public boolean checkCollision(Sprite a, Sprite b){
// This method detects to see if the images overlap at all. If they do, collision is possible
int ax1 = a.getX();
int ay1 = a.getY();
int ax2 = ax1 + a.getWidth();
int ay2 = ay1 + a.getHeight();
int bx1 = b.getX();
int by1 = b.getY();
int bx2 = bx1 + b.getWidth();
int by2 = by1 + b.getHeight();
if(by2 < ay1 || ay2 < by1 || bx2 < ax1 || ax2 < bx1){
return false; // Collision is impossible.
}
else {// Collision is possible.
// get the masks for both images
HashSet<String> maskPlayer1 = getMask(shark);
HashSet<String> maskPlayer2 = getMask(torpedo);
maskPlayer1.retainAll(maskPlayer2); // Check to see if any pixels in maskPlayer2 are the same as those in maskPlayer1
if(maskPlayer1.size() > 0){ // if so, than there exists at least one pixel that is the same in both images, thus
System.out.println("Collision" + count);// collision has occurred.
count++;
return true;
}
}
return false;
}
In the getMask() method above you can see that I am saying: character.getRGB() however because character is of type Sprite I am getting an error as I can only use getRGB() with a buffered image.
So as far as I am aware the getRGB() is happy to get the current pixels that the buffered image is moving over in the game but not happy to get the current pixels for a Sprite. I could be misunderstanding how this method works?
So I am wondering if there is any way around this error or if not, would you be able to point me in the right direction
Thanks everyone
Sprite is some class that extends Rectangle or has such properties.
You can add a BufferedImage member to it: whenever you get the RGB from the character you get it from that BufferedImage
int getRGB(int i, int j) {
return myBufferedImage.getRGB(i, j);
}
where you have in Sprite
class Sprite {
BufferedImage myBufferedImage;
public int getRGB(int i, int j) {
......... as shown above
}
....
}
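Put together, the suggestion could look like the sketch below. Everything beyond the BufferedImage member and getRGB() is an assumption made for the example: the constructor, the field names, the use of packed long keys instead of "x,y" strings, and a screen coordinate system in which y grows downward (so the mask uses y + j).

```java
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

// Hypothetical Sprite holding its own BufferedImage, as suggested above.
final class Sprite {
    private final BufferedImage image;
    private final int x, y; // top-left position on screen

    Sprite(BufferedImage image, int x, int y) {
        this.image = image;
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }
    int getWidth() { return image.getWidth(); }
    int getHeight() { return image.getHeight(); }

    /** Delegates to the backing image, so callers can ask the sprite directly. */
    int getRGB(int i, int j) {
        return image.getRGB(i, j);
    }

    /** Screen positions of all pixels whose alpha is non-zero. */
    Set<Long> getMask() {
        Set<Long> mask = new HashSet<>();
        for (int i = 0; i < getWidth(); i++) {
            for (int j = 0; j < getHeight(); j++) {
                int alpha = getRGB(i, j) >>> 24;
                if (alpha != 0) {
                    mask.add(key(x + i, y + j));
                }
            }
        }
        return mask;
    }

    /** True if at least one non-transparent pixel of a and b shares a screen position. */
    static boolean pixelsOverlap(Sprite a, Sprite b) {
        Set<Long> common = a.getMask();
        common.retainAll(b.getMask());
        return !common.isEmpty();
    }

    // Packs two ints into one long so each position is a cheap, unique set key.
    private static long key(int px, int py) {
        return ((long) px << 32) | (py & 0xffffffffL);
    }
}
```

The image must have an alpha channel (for example TYPE_INT_ARGB) for the transparency test to be meaningful; for an opaque image type every pixel reports alpha 255.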

Text extraction and segmentation open CV

I've never used OpenCV before, but I'm trying to write my neural network system to recognize text and I need some tool for text extraction/ segmentation.
How can I use Java OpenCV to preprocess and segment an image containing text? I don't need to recognize the text; I just need to get each letter in a separate image.
Something like this :
Try this code. No OpenCV needed.
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;
import org.neuroph.imgrec.ImageUtilities;
public class CharExtractor {
private int cropTopY = 0;//up locked coordinate
private int cropBottomY = 0;//down locked coordinate
private int cropLeftX = 0;//left locked coordinate
private int cropRightX = 0;//right locked coordinate
private BufferedImage imageWithChars = null;
private boolean endOfImage;//end of picture
private boolean endOfRow;//end of current reading row
/**
* Creates a new char extractor with the specified text image
* @param imageWithChars - image with text
*/
public CharExtractor(BufferedImage imageWithChars) {
this.imageWithChars = imageWithChars;
}
public void setImageWithChars(BufferedImage imageWithChars) {
this.imageWithChars = imageWithChars;
}
/**
* This method scans image pixels until it finds the first black pixel (TODO: use foreground color which is black by default).
* When it finds a black pixel, it sets cropTopY and returns true. If it reaches the end of the image without finding a black pixel,
* it sets the endOfImage flag and returns false.
* @return - returns true when a black pixel is found and cropTopY value is changed, and false if cropTopY value is not changed
*/
private boolean findCropTopY() {
for (int y = cropBottomY; y < imageWithChars.getHeight(); y++) { // why cropBottomY? - for multiple lines of text, continue from the cropBottomY of the previous line; for the first line it's zero
for (int x = cropLeftX; x < imageWithChars.getWidth(); x++) { // scan starting from the previous left crop position - or should it be right???
if (imageWithChars.getRGB(x, y) == -16777216) { // if it's a black pixel (also consider condition close to black or not white or different from background)
this.cropTopY = y; // save the current y coordinate
return true; // and return true
}
}
}
endOfImage = true; //sets this flag if no black pixels are found
return false; // and return false
}
/**
* This method scans image pixels until it finds the first row of white pixels. (TODO: background color which is white by default).
* When it finds a line with all white pixels, it sets cropBottomY and returns true
* @return - returns true when cropBottomY value is set, false otherwise
*/
private boolean findCropBottomY() {
for (int y = cropTopY + 1; y < imageWithChars.getHeight(); y++) { // scan image from top to bottom
int whitePixCounter = 0; //counter of white pixels in a row
for (int x = cropLeftX; x < imageWithChars.getWidth(); x++) { // scan all pixels to right starting from left crop position
if (imageWithChars.getRGB(x, y) == -1) { // if its white pixel
whitePixCounter++; // increase counter
}
}
if (whitePixCounter == imageWithChars.getWidth()-1) { // if we have reached end of line counting white pixels (x pos)
cropBottomY = y;// that means that we've found a white line, so set the current y coordinate
return true; // as cropBottomY and finish with true
}
if (y == imageWithChars.getHeight() - 1) { // if we have reached end of image
cropBottomY = y; // set crop bottom
endOfImage = true; // set corresponding endOfImage flag
return true; // and return true
}
}
return false; // this should never happen, however its possible if image has non white bg
}
private boolean findCropLeftX() {
int whitePixCounter = 0; // white pixel counter between the letters
for (int x = cropRightX; x < imageWithChars.getWidth(); x++) { // start from previous right crop position (previous letter), and scan following pixels to the right
for (int y = cropTopY; y <= cropBottomY; y++) { // vertical pixel scan at current x coordinate
if (imageWithChars.getRGB(x, y) == -16777216) { // when we find black pixel
cropLeftX = x; // set cropLeftX
return true; // and return true
}
}
// BUG?: this condition looks strange.... we might not need whitePixCounter at all, it might be used for 'I' letter
whitePixCounter++; // if its not black pixel assume that its white pixel
if (whitePixCounter == 3) { // why 3 pixels? it's hard coded for some case and does not work in general...!!!
whitePixCounter = 0; // why is it reset to zero? this has no purpose at all...
}
}
endOfRow = true; // if we have reached end of row and we have not found black pixels, set the endOfRow flag
return false; // and return false
}
/**
* This method scans image pixels to the right until it finds the next column in which all pixels between cropTopY and cropBottomY are white.
* @return - returns true when cropRightX value is changed and false when cropRightX value is not changed
*/
private boolean findCropRightX() {
for (int x = cropLeftX + 1; x < imageWithChars.getWidth(); x++) { // start from current cropLeftX position and scan pixels to the right
int whitePixCounter = 0;
for (int y = cropTopY; y <= cropBottomY; y++) { // vertical pixel scan at current x coordinate
if (imageWithChars.getRGB(x, y) == -1) { // if we have white pixel at current (x, y)
whitePixCounter++; // increase whitePixCounter
}
}
// this is for space!
int heightPixels = cropBottomY - cropTopY; // calculate crop height
if (whitePixCounter == heightPixels+1) { // if white pixel count is equal to crop height+1 then this is white vertical line, means end of current char/ (+1 is for case when there is only 1 pixel; a 'W' bug fix)
cropRightX = x; // so set cropRightX
return true; // and return true
}
// why do we need this when we already have a condition in the for loop? - for the last letter in the row.
if (x == imageWithChars.getWidth() - 1) { // if we have reached end of row with x position
cropRightX = x; // set cropRightX
endOfRow = true; // set endOfRow flag
return true; // and return true
}
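}
// Reached only when the loop body never ran (cropLeftX is already the last column).
endOfRow = true; // stop scanning this row so the caller's loop terminates
return false; // the method must return a value on every path
}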
public List<BufferedImage> extractCharImagesToRecognize() {
List<BufferedImage> trimedImages = new ArrayList<BufferedImage>();
int i = 0;
while (endOfImage == false) {
endOfRow = false;
boolean foundTop = findCropTopY();
boolean foundBottom = false;
if (foundTop == true) {
foundBottom = findCropBottomY();
if (foundBottom == true) {
while (endOfRow == false) {
boolean foundLeft = false;
boolean foundRight = false;
foundLeft = findCropLeftX();
if (foundLeft == true) {
foundRight = findCropRightX();
if (foundRight == true) {
BufferedImage image = ImageUtilities.trimImage(ImageUtilities.cropImage(imageWithChars, cropLeftX, cropTopY, cropRightX, cropBottomY));
trimedImages.add(image);
i++;
}
}
}
cropLeftX = 0;
cropRightX = 0;
}
}
}
cropTopY = 0;
cropBottomY = 0;
endOfImage = false;
return trimedImages;
}
public static void main(String[] args) throws Exception {
File f=new File("./written.png");
BufferedImage img=ImageIO.read(f);
CharExtractor ch=new CharExtractor(img);
List<BufferedImage> list=ch.extractCharImagesToRecognize();
for(int i=0;i<list.size();i++)
{
File outputfile = new File("./char_" +i+ ".png");
ImageIO.write(list.get(i),"png", outputfile);
}
}
}
What you are trying to do is a general scene text localization problem, and it's pretty difficult. Check out this article for inspiration - http://www.maseltov.eu/wp-content/uploads/2014/02/CTU-03_Real-Time-Scene-Text-Localization-and-Recognition.pdf
What you could do is:
write a program which extracts MSER objects from an image
extract features from every patch determined by each individual MSER (which features is outlined in the article)
train your classifier (in your case a neural network I guess?) so that it is able to distinguish between character and non-character regions
write a program which uses your classifier to extract MSERs and classify them using the trained NN.
The MSER algorithm is implemented in OpenCV, so that is a plus. There are also neural network classifiers there, but since I only used SVM, I can not comment on that too much. I should say that we had to solve this problem as well and it is perfectly possible to do so using OpenCV. Just don't expect to get everything on a silver platter - there is a lot of work involved; especially when choosing and extracting the blob features.
I am not familiar with neural networks, but if you just want to find letters in an image while accounting for scale and rotation, I can recommend this project: http://www.codeproject.com/Articles/196168/Contour-Analysis-for-Image-Recognition-in-C
It is written in C#, but you could port it to java or at least get a good amount of insights on this topic.

Getting a NullPointerException at seemingly random intervals, not sure why

I'm running an example from a Kinect library for Processing (http://www.shiffman.net/2010/11/14/kinect-and-processing/) and sometimes get a NullPointerException pointing to this line:
int rawDepth = depth[offset];
The depth array is created in this line:
int[] depth = kinect.getRawDepth();
I'm not exactly sure what a NullPointerException is, and much googling hasn't really helped. It seems odd to me that the code compiles 70% of the time and returns the error unpredictably. Could the hardware itself be affecting it?
Here's the whole example if it helps:
// Daniel Shiffman
// Kinect Point Cloud example
// http://www.shiffman.net
// https://github.com/shiffman/libfreenect/tree/master/wrappers/java/processing
import org.openkinect.*;
import org.openkinect.processing.*;
// Kinect Library object
Kinect kinect;
float a = 0;
// Size of kinect image
int w = 640;
int h = 480;
// We'll use a lookup table so that we don't have to repeat the math over and over
float[] depthLookUp = new float[2048];
void setup() {
size(800,600,P3D);
kinect = new Kinect(this);
kinect.start();
kinect.enableDepth(true);
// We don't need the grayscale image in this example
// so this makes it more efficient
kinect.processDepthImage(false);
// Lookup table for all possible depth values (0 - 2047)
for (int i = 0; i < depthLookUp.length; i++) {
depthLookUp[i] = rawDepthToMeters(i);
}
}
void draw() {
background(0);
fill(255);
textMode(SCREEN);
text("Kinect FR: " + (int)kinect.getDepthFPS() + "\nProcessing FR: " + (int)frameRate,10,16);
// Get the raw depth as array of integers
int[] depth = kinect.getRawDepth();
// We're just going to calculate and draw every 4th pixel (equivalent of 160x120)
int skip = 4;
// Translate and rotate
translate(width/2,height/2,-50);
rotateY(a);
for(int x=0; x<w; x+=skip) {
for(int y=0; y<h; y+=skip) {
int offset = x+y*w;
// Convert kinect data to world xyz coordinate
int rawDepth = depth[offset];
PVector v = depthToWorld(x,y,rawDepth);
stroke(255);
pushMatrix();
// Scale up by 200
float factor = 200;
translate(v.x*factor,v.y*factor,factor-v.z*factor);
// Draw a point
point(0,0);
popMatrix();
}
}
// Rotate
a += 0.015f;
}
// These functions come from: http://graphics.stanford.edu/~mdfisher/Kinect.html
float rawDepthToMeters(int depthValue) {
if (depthValue < 2047) {
return (float)(1.0 / ((double)(depthValue) * -0.0030711016 + 3.3309495161));
}
return 0.0f;
}
PVector depthToWorld(int x, int y, int depthValue) {
final double fx_d = 1.0 / 5.9421434211923247e+02;
final double fy_d = 1.0 / 5.9104053696870778e+02;
final double cx_d = 3.3930780975300314e+02;
final double cy_d = 2.4273913761751615e+02;
PVector result = new PVector();
double depth = depthLookUp[depthValue];//rawDepthToMeters(depthValue);
result.x = (float)((x - cx_d) * depth * fx_d);
result.y = (float)((y - cy_d) * depth * fy_d);
result.z = (float)(depth);
return result;
}
void stop() {
kinect.quit();
super.stop();
}
And here are the errors:
processing.app.debug.RunnerException: NullPointerException
at processing.app.Sketch.placeException(Sketch.java:1543)
at processing.app.debug.Runner.findException(Runner.java:583)
at processing.app.debug.Runner.reportException(Runner.java:558)
at processing.app.debug.Runner.exception(Runner.java:498)
at processing.app.debug.EventThread.exceptionEvent(EventThread.java:367)
at processing.app.debug.EventThread.handleEvent(EventThread.java:255)
at processing.app.debug.EventThread.run(EventThread.java:89)
Exception in thread "Animation Thread" java.lang.NullPointerException
at org.openkinect.processing.Kinect.enableDepth(Kinect.java:70)
at PointCloud.setup(PointCloud.java:48)
at processing.core.PApplet.handleDraw(PApplet.java:1583)
at processing.core.PApplet.run(PApplet.java:1503)
at java.lang.Thread.run(Thread.java:637)
You are getting a NullPointerException since the value of the depth array is null. You can see from the source code of the Kinect class, there is a chance of a null value being returned by the getRawDepth() method. It is likely that there is no image being displayed at the time.
The code can be found at:
https://github.com/shiffman/libfreenect/blob/master/wrappers/java/processing/KinectProcessing/src/org/openkinect/processing/Kinect.java
Your code should check if the depth array is null before trying to process it. For example...
int[] depth = kinect.getRawDepth();
if (depth == null) {
// do something here where you handle there being no image
} else {
// We're just going to calculate and draw every 4th pixel (equivalent of 160x120)
int skip = 4;
// Translate and rotate
translate(width/2,height/2,-50);
rotateY(a);
for(int x=0; x<w; x+=skip) {
for(int y=0; y<h; y+=skip) {
int offset = x+y*w;
// Convert kinect data to world xyz coordinate
int rawDepth = depth[offset];
PVector v = depthToWorld(x,y,rawDepth);
stroke(255);
pushMatrix();
// Scale up by 200
float factor = 200;
translate(v.x*factor,v.y*factor,factor-v.z*factor);
// Draw a point
point(0,0);
popMatrix();
}
}
// Rotate
a += 0.015f;
}
I would suggest using a Java Debugger so that you can see the state of the variables at the time the exception is thrown. Some people also like to use log statements to output the values of the variables at different points in the application.
You can then trace the problem back to a point where one of the values is not populated with a non-null value.
The null pointer is happening when offset > kinect.getRawDepth();
You have a lot of code here, I'm not going to look at it all. Why can you assume that offset is < kinect.getRawDepth()?
Edit:
On second thought, @Asaph's comment is probably right.
Null Pointer exception happens when depth[offset] does not exist or has not been allocated. Check when depth[offset] is undefined and that is the cause of the nullpointer exception.
Check when kinect.getRawDepth(); is greater than offset.
