Find known sub image in larger image - java

Does anyone know of an algorithm (or search terms / descriptions) to locate a known image within a larger image?
e.g.
I have an image of a single desktop window containing various buttons and areas (target). I also have code to capture a screenshot of the current desktop. I would like an algorithm that will help me find the target image within the larger desktop image (i.e. the exact x and y coordinates at which the window is located). The target image may be located anywhere in the larger image and may not be 100% exactly the same (very similar but not exact, possibly because of OS display differences).
Does anyone know of such an algorithm or class of algorithms?
I have found various image segmentation and computer vision algorithms but they seem geared to "fuzzy" classification of regions and not locating a specific image within another.
** My goal is to create a framework that, given some seed target images, can "look" at the desktop, find the target area and "watch" it for changes. **

Have a look at the paper I wrote: http://werner.yellowcouch.org/Papers/subimg/index.html. It's highly detailed and appears to be the only article discussing how to apply Fourier transformation to the problem of subimage finding.
In short, if you want to use the Fourier transform, you can apply the following formula: the correlation between image A and image B when image A is shifted by dx,dy is given by the matrix C = ifft(fft(A) x conjugate(fft(B))). The position in C that has the highest value has the highest correlation, and that position reflects dx,dy.
This result works well for subimages that are relatively large. For smaller images, some more work is necessary, as explained in the article. Nevertheless, such Fourier transforms are quite fast. They require around 3*sx*sy*log2(sx*sy) + 3*sx*sy operations.
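To make the frequency-domain step concrete, here is a minimal Java sketch of the pointwise conjugate multiplication and the peak search. The forward and inverse 2-D FFTs are assumed to come from a library such as JTransforms, and the interleaved re/im array layout used here is an assumption of this sketch, not something prescribed by the paper.
// Pointwise multiply fft(A) by the complex conjugate of fft(B).
// Both arrays are interleaved [re0, im0, re1, im1, ...] and of equal length.
static double[] conjugateProduct(double[] fftA, double[] fftB) {
    double[] out = new double[fftA.length];
    for (int i = 0; i < fftA.length; i += 2) {
        double a = fftA[i], b = fftA[i + 1]; // A = a + b*i
        double c = fftB[i], d = fftB[i + 1]; // B = c + d*i, so conj(B) = c - d*i
        out[i] = a * c + b * d;       // real part of A * conj(B)
        out[i + 1] = b * c - a * d;   // imaginary part of A * conj(B)
    }
    return out;
}
// After the inverse FFT, corr holds one real correlation value per pixel;
// the index of the maximum value gives the shift (dx, dy).
static int[] peakPosition(double[] corr, int width, int height) {
    int best = 0;
    for (int i = 1; i < width * height; i++)
        if (corr[i] > corr[best]) best = i;
    return new int[] { best % width, best / width };
}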

You said your image may not be exactly the same, but then say you don't want "fuzzy" algorithms. I'm not sure those are compatible. In general, though, I think you want to look at image registration algorithms. There's an open source C++ package called ITK that might provide some hints. Also ImageJ is a popular open source Java package. Both of these have at least some registration capabilities available if you poke around.

Here's the skeleton of code you'd want to use:
// look for all (x,y) positions where target appears in desktop
static List<Point> findMatches(BufferedImage desktop, BufferedImage target, double threshold) {
    List<Point> locs = new ArrayList<>();
    for (int y = 0; y <= desktop.getHeight() - target.getHeight(); y++) {
        for (int x = 0; x <= desktop.getWidth() - target.getWidth(); x++) {
            if (imageDistance(desktop, x, y, target) < threshold) {
                locs.add(new Point(x, y));
            }
        }
    }
    return locs;
}
// computes the root mean squared error between a rectangular window in
// bigImg and target.
static double imageDistance(BufferedImage bigImg, int bx, int by, BufferedImage target) {
    double sumDist2 = 0.0;
    for (int y = 0; y < target.getHeight(); y++) {
        for (int x = 0; x < target.getWidth(); x++) {
            int p1 = target.getRGB(x, y);
            int p2 = bigImg.getRGB(bx + x, by + y);
            // assume RGB images: compare the three 8-bit channels separately
            for (int shift = 0; shift <= 16; shift += 8) {
                double dist = ((p1 >> shift) & 0xff) - ((p2 >> shift) & 0xff);
                sumDist2 += dist * dist;
            }
        }
    }
    return Math.sqrt(sumDist2 / (3.0 * target.getWidth() * target.getHeight()));
}
You could consider other image distances (see a similar question). For your application, the RMS error is probably a good choice.
There are probably various Java libraries that compute this distance for you efficiently.
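For the screen-capture use case from the question, here is a minimal sketch of how the search could be driven with java.awt.Robot (the file name and the threshold value are placeholders of mine, not part of the original answer):
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;
import javax.imageio.ImageIO;
// Capture the current desktop and search it for the target window.
static List<Point> findTargetOnScreen() throws Exception {
    Robot robot = new Robot();
    Rectangle screenBounds = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
    BufferedImage desktop = robot.createScreenCapture(screenBounds);
    BufferedImage target = ImageIO.read(new File("target-window.png")); // placeholder path
    return findMatches(desktop, target, 20.0); // threshold must be tuned experimentally
}
Calling findTargetOnScreen() periodically and comparing the matched region between captures gives the "watch it for changes" behaviour described in the question.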

You could use unique visual elements of this target area to determine its position. These unique visual elements are like a "signature". Examples: unique icons, images and symbols. This approach works independently of the window resolution if you have unique elements in the corners. For fixed-size windows, just one element is sufficient to find all window coordinates.
Below I illustrate the idea with a simple example using the Marvin Framework.
Unique elements:
Program output:
Original Image: window.png
Source code:
import static marvin.MarvinPluginCollection.*;
import java.awt.Color;
import marvin.image.MarvinImage;
import marvin.io.MarvinImageIO;
import marvin.math.MarvinSegment;
public class FindSubimageWindow {
public FindSubimageWindow(){
MarvinImage window = MarvinImageIO.loadImage("./res/window.png");
MarvinImage eclipse = MarvinImageIO.loadImage("./res/eclipse_icon.png");
MarvinImage progress = MarvinImageIO.loadImage("./res/progress_icon.png");
MarvinSegment seg1, seg2;
seg1 = findSubimage(eclipse, window, 0, 0);
drawRect(window, seg1.x1, seg1.y1, seg1.x2-seg1.x1, seg1.y2-seg1.y1);
seg2 = findSubimage(progress, window, 0, 0);
drawRect(window, seg2.x1, seg2.y1, seg2.x2-seg2.x1, seg2.y2-seg2.y1);
drawRect(window, seg1.x1-10, seg1.y1-10, (seg2.x2-seg1.x1)+25, (seg2.y2-seg1.y1)+20);
MarvinImageIO.saveImage(window, "./res/window_out.png");
}
private void drawRect(MarvinImage image, int x, int y, int width, int height){
x-=4; y-=4; width+=8; height+=8;
image.drawRect(x, y, width, height, Color.red);
image.drawRect(x+1, y+1, width-2, height-2, Color.red);
image.drawRect(x+2, y+2, width-4, height-4, Color.red);
}
public static void main(String[] args) {
new FindSubimageWindow();
}
}

I considered the solution of Werner Van Belle (since all other approaches are incredibly slow - not practical at all):
An Adaptive Filter for the Correct Localization of Subimages: FFT
based Subimage Localization Requires Image Normalization to work
properly
I wrote the code in C#, where I need my solution, but I am getting highly inaccurate results. Does it really not work well, contrary to Van Belle's statement, or did I do something wrong? I used https://github.com/tszalay/FFTWSharp as a C# wrapper for FFTW.
Here is my translated code: (original in C++ at http://werner.yellowcouch.org/Papers/subimg/index.html)
using System.Diagnostics;
using System;
using System.Runtime.InteropServices;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using FFTWSharp;
using unsigned1 = System.Byte;
using signed2 = System.Int16;
using signed8 = System.Int64;
public class Subimage
{
/**
* This program finds a subimage in a larger image. It expects two arguments.
* The first is the image in which it must look. The second image is the
* image that is to be found. The program relies on a number of different
* steps to perform the calculation.
*
* It will first normalize the input images in order to improve the
* crosscorrelation matching. Once the best crosscorrelation is found
* a SAD matcher is applied in a grid over the larger image.
*
* The following two articles explain the details:
*
* Werner Van Belle; An Adaptive Filter for the Correct Localization
* of Subimages: FFT based Subimage Localization Requires Image
* Normalization to work properly; 11 pages; October 2007.
* http://werner.yellowcouch.org/Papers/subimg/
*
* Werner Van Belle; Correlation between the inproduct and the sum
* of absolute differences is -0.8485 for uniform sampled signals on
* [-1:1]; November 2006
*/
unsafe public static Point FindSubimage_fftw(string[] args)
{
if (args == null || args.Length != 2)
{
Console.Write("Usage: subimg\n" + "\n" + " subimg is an image matcher. It requires two arguments. The first\n" + " image should be the larger of the two. The program will search\n" + " for the best position to superimpose the needle image over the\n" + " haystack image. The output of the the program are the X and Y\n" + " coordinates.\n\n" + " http://werner.yellowouch.org/Papers/subimg/\n");
return new Point();
}
/**
* The larger image will be called A. The smaller image will be called B.
*
* The code below relies heavily upon fftw. The indices necessary for the
* fast r2c and c2r transforms are at best confusing. Especially regarding
* the number of rows and columns (watch out for Asx vs Asx2!).
*
* After obtaining all the crosscorrelations we will scan through the image
* to find the best sad match. As such we make a backup of the original data
* in advance
*
*/
int Asx = 0, Asy = 0;
signed2[] A = read_image(args[0], ref Asx, ref Asy);
int Asx2 = Asx / 2 + 1;
int Bsx = 0, Bsy = 0;
signed2[] B = read_image(args[1], ref Bsx, ref Bsy);
unsigned1[] Asad = new unsigned1[Asx * Asy];
unsigned1[] Bsad = new unsigned1[Bsx * Bsy];
for (int i = 0; i < Bsx * Bsy; i++)
{
Bsad[i] = (unsigned1)B[i];
Asad[i] = (unsigned1)A[i];
}
for (int i = Bsx * Bsy; i < Asx * Asy; i++)
Asad[i] = (unsigned1)A[i];
/**
* Normalization and windowing of the images.
*
* The window size (wx,wy) is half the size of the smaller subimage. This
* is useful to have as much good information from the subimg.
*/
int wx = Bsx / 2;
int wy = Bsy / 2;
normalize(ref B, Bsx, Bsy, wx, wy);
normalize(ref A, Asx, Asy, wx, wy);
/**
* Preparation of the fourier transforms.
* Aa is the amplitude of image A. Af is the frequency of image A
* Similar for B. crosscors will hold the crosscorrelations.
*/
IntPtr Aa = fftw.malloc(sizeof(double) * Asx * Asy);
IntPtr Af = fftw.malloc(sizeof(double) * 2 * Asx2 * Asy);
IntPtr Ba = fftw.malloc(sizeof(double) * Asx * Asy);
IntPtr Bf = fftw.malloc(sizeof(double) * 2 * Asx2 * Asy);
/**
* The forward transform of A goes from Aa to Af
* The forward transform of B goes from Ba to Bf
* In Bf we will also calculate the inproduct of Af and Bf
* The backward transform then goes from Bf to Aa again. That
* variable is aliased as crosscors;
*/
//#original: fftw_plan_dft_r2c_2d
//IntPtr forwardA = fftwf.dft(2, new int[] { Asy, Asx }, Aa, Af, fftw_direction.Forward, fftw_flags.Estimate);//equal results
IntPtr forwardA = fftwf.dft_r2c_2d(Asy, Asx, Aa, Af, fftw_flags.Estimate);
//#original: fftw_plan_dft_r2c_2d
//IntPtr forwardB = fftwf.dft(2, new int[] { Asy, Asx }, Ba, Bf, fftw_direction.Forward, fftw_flags.Estimate);//equal results
IntPtr forwardB = fftwf.dft_r2c_2d(Asy, Asx, Ba, Bf, fftw_flags.Estimate);
double* crosscorrs = (double*)Aa;
//#original: fftw_plan_dft_c2r_2d
//IntPtr backward = fftwf.dft(2, new int[] { Asy, Asx }, Bf, Aa, fftw_direction.Backward, fftw_flags.Estimate);//equal results
IntPtr backward = fftwf.dft_c2r_2d(Asy, Asx, Bf, Aa, fftw_flags.Estimate);
/**
* The two forward transforms of A and B. Before we do so we copy the normalized
* data into the double array. For B we also pad the data with 0
*/
for (int row = 0; row < Asy; row++)
for (int col = 0; col < Asx; col++)
((double*)Aa)[col + Asx * row] = A[col + Asx * row];
fftw.execute(forwardA);
for (int j = 0; j < Asx * Asy; j++)
((double*)Ba)[j] = 0;
for (int row = 0; row < Bsy; row++)
for (int col = 0; col < Bsx; col++)
((double*)Ba)[col + Asx * row] = B[col + Bsx * row];
fftw.execute(forwardB);
/**
* The inproduct of the two frequency domains and calculation
* of the crosscorrelations
*/
double norm = Asx * Asy;
for (int j = 0; j < Asx2 * Asy; j++)
{
double a = ((double*)Af)[j * 2];//#Af[j][0];
double b = ((double*)Af)[j * 2 + 1];//#Af[j][1];
double c = ((double*)Bf)[j * 2];//#Bf[j][0];
double d = ((double*)Bf)[j * 2 + 1];//#-Bf[j][1];
double e = a * c - b * d;
double f = a * d + b * c;
((double*)Bf)[j * 2] = (double)(e / norm);//#Bf[j][0] = (fftw_real)(e / norm);
((double*)Bf)[j * 2 + 1] = (double)(f / norm);//Bf[j][1] = (fftw_real)(f / norm);
}
fftw.execute(backward);
/**
* We now have a correlation map. We can spend one more pass
* over the entire image to actually find the best matching images
* as defined by the SAD.
* We calculate this by gridding the entire image according to the
* size of the subimage. In each cell we want to know what the best
* match is.
*/
int sa = 1 + Asx / Bsx;
int sb = 1 + Asy / Bsy;
int sadx = 0;
int sady = 0;
signed8 minsad = Bsx * Bsy * 256L;
for (int a = 0; a < sa; a++)
{
int xl = a * Bsx;
int xr = xl + Bsx;
if (xr > Asx) continue;
for (int b = 0; b < sb; b++)
{
int yl = b * Bsy;
int yr = yl + Bsy;
if (yr > Asy) continue;
// find the maximum correlation in this cell
int cormxat = xl + yl * Asx;
double cormx = crosscorrs[cormxat];
for (int x = xl; x < xr; x++)
for (int y = yl; y < yr; y++)
{
int j = x + y * Asx;
if (crosscorrs[j] > cormx)
cormx = crosscorrs[cormxat = j];
}
int corx = cormxat % Asx;
int cory = cormxat / Asx;
// We don't want subimages that fall off the larger image
if (corx + Bsx > Asx) continue;
if (cory + Bsy > Asy) continue;
signed8 sad = 0;
for (int x = 0; sad < minsad && x < Bsx; x++)
for (int y = 0; y < Bsy; y++)
{
int j = (x + corx) + (y + cory) * Asx;
int i = x + y * Bsx;
sad += Math.Abs((int)Bsad[i] - (int)Asad[j]);
}
if (sad < minsad)
{
minsad = sad;
sadx = corx;
sady = cory;
// printf("* ");
}
// printf("Grid (%d,%d) (%d,%d) Sip=%g Sad=%lld\n",a,b,corx,cory,cormx,sad);
}
}
//Console.Write("{0:D}\t{1:D}\n", sadx, sady);
/**
* Aa, Ba, Af and Bf were allocated in this function
* crosscorrs was an alias for Aa and does not require deletion.
*/
fftw.free(Aa);
fftw.free(Ba);
fftw.free(Af);
fftw.free(Bf);
return new Point(sadx, sady);
}
private static void normalize(ref signed2[] img, int sx, int sy, int wx, int wy)
{
/**
* Calculate the mean background. We will subtract this
* from img to make sure that it has a mean of zero
* over a window size of wx x wy. Afterwards we calculate
* the square of the difference, which will then be used
* to normalize the local variance of img.
*/
signed2[] mean = boxaverage(img, sx, sy, wx, wy);
signed2[] sqr = new signed2[sx * sy];
for (int j = 0; j < sx * sy; j++)
{
img[j] -= mean[j];
signed2 v = img[j];
sqr[j] = (signed2)(v * v);
}
signed2[] var = boxaverage(sqr, sx, sy, wx, wy);
/**
* The normalization process. Currently still
* calculated as doubles. Could probably be fixed
* to integers too. The only problem is the sqrt
*/
for (int j = 0; j < sx * sy; j++)
{
double v = Math.Sqrt(Math.Abs((double)var[j]));//#double v = sqrt(fabs(var[j])); <- ambigous
Debug.Assert(!double.IsInfinity(v) && v >= 0);
if (v < 0.0001) v = 0.0001;
img[j] = (signed2)(img[j] * (32 / v));
if (img[j] > 127) img[j] = 127;
if (img[j] < -127) img[j] = -127;
}
/**
* As a last step in the normalization we
* window the sub image around the borders
* to become 0
*/
window(ref img, sx, sy, wx, wy);
}
private static signed2[] boxaverage(signed2[] input, int sx, int sy, int wx, int wy)
{
signed2[] horizontalmean = new signed2[sx * sy];
Debug.Assert(horizontalmean != null);
int wx2 = wx / 2;
int wy2 = wy / 2;
int from = (sy - 1) * sx;
int to = (sy - 1) * sx;
int initcount = wx - wx2;
if (sx < initcount) initcount = sx;
int xli = -wx2;
int xri = wx - wx2;
for (; from >= 0; from -= sx, to -= sx)
{
signed8 sum = 0;
int count = initcount;
for (int c = 0; c < count; c++)
sum += (signed8)input[c + from];
horizontalmean[to] = (signed2)(sum / count);
int xl = xli, x = 1, xr = xri;
/**
* The area where the window is slightly outside the
* left boundary. Beware: the right boundary could be
* outside on the other side already
*/
for (; x < sx; x++, xl++, xr++)
{
if (xl >= 0) break;
if (xr < sx)
{
sum += (signed8)input[xr + from];
count++;
}
horizontalmean[x + to] = (signed2)(sum / count);
}
/**
* both bounds of the sliding window
* are fully inside the images
*/
for (; xr < sx; x++, xl++, xr++)
{
sum -= (signed8)input[xl + from];
sum += (signed8)input[xr + from];
horizontalmean[x + to] = (signed2)(sum / count);
}
/**
* the right bound is falling off the page
*/
for (; x < sx; x++, xl++)
{
sum -= (signed8)input[xl + from];
count--;
horizontalmean[x + to] = (signed2)(sum / count);
}
}
/**
* The same process as above but for the vertical dimension now
*/
int ssy = (sy - 1) * sx + 1;
from = sx - 1;
signed2[] verticalmean = new signed2[sx * sy];
Debug.Assert(verticalmean != null);
to = sx - 1;
initcount = wy - wy2;
if (sy < initcount) initcount = sy;
int initstopat = initcount * sx;
int yli = -wy2 * sx;
int yri = (wy - wy2) * sx;
for (; from >= 0; from--, to--)
{
signed8 sum = 0;
int count = initcount;
for (int d = 0; d < initstopat; d += sx)
sum += (signed8)horizontalmean[d + from];
verticalmean[to] = (signed2)(sum / count);
int yl = yli, y = 1, yr = yri;
for (; y < ssy; y += sx, yl += sx, yr += sx)
{
if (yl >= 0) break;
if (yr < ssy)
{
sum += (signed8)horizontalmean[yr + from];
count++;
}
verticalmean[y + to] = (signed2)(sum / count);
}
for (; yr < ssy; y += sx, yl += sx, yr += sx)
{
sum -= (signed8)horizontalmean[yl + from];
sum += (signed8)horizontalmean[yr + from];
verticalmean[y + to] = (signed2)(sum / count);
}
for (; y < ssy; y += sx, yl += sx)
{
sum -= (signed8)horizontalmean[yl + from];
count--;
verticalmean[y + to] = (signed2)(sum / count);
}
}
return verticalmean;
}
private static void window(ref signed2[] img, int sx, int sy, int wx, int wy)
{
int wx2 = wx / 2;
int sxsy = sx * sy;
int sx1 = sx - 1;
for (int x = 0; x < wx2; x++)
for (int y = 0; y < sxsy; y += sx)
{
img[x + y] = (signed2)(img[x + y] * x / wx2);
img[sx1 - x + y] = (signed2)(img[sx1 - x + y] * x / wx2);
}
int wy2 = wy / 2;
int syb = (sy - 1) * sx;
int syt = 0;
for (int y = 0; y < wy2; y++)
{
for (int x = 0; x < sx; x++)
{
/**
* here we need to recalculate the stuff (*y/wy2)
* to preserve the discrete nature of integers.
*/
img[x + syt] = (signed2)(img[x + syt] * y / wy2);
img[x + syb] = (signed2)(img[x + syb] * y / wy2);
}
/**
* The next row for the top rows
* The previous row for the bottom rows
*/
syt += sx;
syb -= sx;
}
}
private static signed2[] read_image(string filename, ref int sx, ref int sy)
{
Bitmap image = new Bitmap(filename);
sx = image.Width;
sy = image.Height;
signed2[] GreyImage = new signed2[sx * sy];
BitmapData bitmapData1 = image.LockBits(new Rectangle(0, 0, image.Width, image.Height), ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
unsafe
{
byte* imagePointer = (byte*)bitmapData1.Scan0;
for (int y = 0; y < bitmapData1.Height; y++)
{
for (int x = 0; x < bitmapData1.Width; x++)
{
GreyImage[x + y * sx] = (signed2)((imagePointer[0] + imagePointer[1] + imagePointer[2]) / 3.0);
//4 bytes per pixel
imagePointer += 4;
}//end for x
//4 bytes per pixel
imagePointer += bitmapData1.Stride - (bitmapData1.Width * 4);
}//end for y
}//end unsafe
image.UnlockBits(bitmapData1);
return GreyImage;
}
}

You don't need fuzzy as in "neural network" because (as I understand it) you don't have rotation, tilt or similar distortions. If OS display differences are the only modifications, the differences should be minimal.
So Werner Van Belle's paper is nice but not really necessary, and MrFooz's code works - but is terribly inefficient (O(width * height * pattern_width * pattern_height)!).
The best algorithm I can think of is the Boyer-Moore algorithm for string searching, modified to allow 2 dimensional searches.
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
Instead of one displacement you will have to store a pair of displacements dx and dy for each color. When checking a pixel you move only in the x direction (x = x + dx) and keep only the minimum of the dy values (DY = min(DY, dy)) to set the new y value after a whole line has been tested (i.e. x > width).
Creating a table for all possible colors is probably prohibitive due to the immense number of possible colors, so either use a map to store the rules (and default to the pattern dimensions if a color is not in the map) or create tables for each color channel separately and set dx = max(dx(red), dx(green), dx(blue)) - which is only an approximation but removes the overhead of a map.
In the preprocessing of the bad-character rule, you can account for small deviations of colors by spreading rules from all colors to their "neighbouring" colors (however you wish to define neighbouring).
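To illustrate the map-based bad-character rule, here is a minimal Java sketch (my assumptions: the pattern is a 2-D array of packed RGB ints, and only the horizontal displacement dx is shown; dy would be built the same way over rows):
import java.util.HashMap;
import java.util.Map;
// For each color that occurs in the pattern, store how far the search window may
// be shifted right when that color is seen at the window's right edge. The shift
// is the distance from the color's rightmost occurrence to the pattern's right edge.
static Map<Integer, Integer> buildHorizontalSkips(int[][] pattern) {
    int h = pattern.length, w = pattern[0].length;
    Map<Integer, Integer> dx = new HashMap<>();
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            dx.merge(pattern[y][x], w - 1 - x, Math::min);
        }
    }
    return dx;
}
// Lookup during the scan: a color that never occurs in the pattern
// allows a jump of a whole pattern width (the default mentioned above).
static int skipFor(Map<Integer, Integer> dx, int color, int patternWidth) {
    return Math.max(1, dx.getOrDefault(color, patternWidth));
}
A window-edge color that never occurs in the pattern lets the scan jump a full pattern width, which is where the speed-up over the naive scan comes from.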

Related

Java Perlin Noise height map generation lacks desired randomness

I am trying to generate a height map using Perlin Noise, but am having trouble with generating truly unique maps. That is, each one is a minor variation of all the others. Two examples are below:
And here is my code (most was just copied and pasted from Ken Perlin's implementation, though adapted for 2D):
public class HeightMap {
private ArrayList<Point> map = new ArrayList<>();
private double elevationMax, elevationMin;
private final int[] P = new int[512], PERMUTATION = { 151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180
};
public HeightMap() {
this.map = null;
this.elevationMax = 0.0;
this.elevationMin = 0.0;
}
public HeightMap(HeightMap map) {
this.map = map.getPoints();
this.elevationMax = map.getElevationMax();
this.elevationMin = map.getElevationMin();
}
/**
* Generates a Height Map that is, along an imaginary z-axis, centered around the median elevation, given the following parameters:
* @param mapWidth the width [x] of the map
* @param mapHeight the height [y] of the map
* @param tileWidth the width [x] of each tile, or Point
* @param tileHeight the height [y] of each tile, or Point
* @param elevationMax the maximum elevation [z] of the map
* @param elevationMin the minimum elevation [z] of the map
*/
public HeightMap(int mapWidth, int mapHeight, int tileWidth, int tileHeight, double elevationMax, double elevationMin) {
this.elevationMax = elevationMax;
this.elevationMin = elevationMin;
for (int i=0; i < 256 ; i++) {
P[256+i] = P[i] = PERMUTATION[i];
}
int numTilesX = mapWidth / tileWidth;
int numTilesY = mapHeight / tileHeight;
Random r = new Random();
for (int t = 0; t < numTilesX * numTilesY; t++) {
double x = t % numTilesX;
double y = (t - x) / numTilesX;
r = new Random();
x += r.nextDouble();
y += r.nextDouble();
this.map.add(new Point(x, y, lerp(noise(x, y, 13), (elevationMin + elevationMax) / 2, elevationMax), tileWidth, tileHeight));
}
}
/**
* Ken Perlin's Improved Noise Java Implementation (https://mrl.cs.nyu.edu/~perlin/noise/)
* Adapted for 2D
* @param x the x-coordinate on the map
* @param y the y-coordinate on the map
* @param stretch the factor by which adjacent points are smoothed
* @return a value between -1.0 and 1.0 to represent the height of the terrain at (x, y)
*/
private double noise(double x, double y, double stretch) {
x /= stretch;
y /= stretch;
int X = (int)Math.floor(x) & 255, Y = (int)Math.floor(y) & 255;
x -= Math.floor(x);
y -= Math.floor(y);
double u = fade(x),
v = fade(y);
int AA = P[P[X ] + Y ],
AB = P[P[X ] + Y + 1],
BA = P[P[X + 1] + Y ],
BB = P[P[X + 1] + Y + 1];
return lerp(v, lerp(u, grad(P[AA], x, y), grad(P[BA], x - 1, y)), lerp(u, grad(P[AB], x, y - 1), grad(P[BB], x - 1, y - 1)));
}
private double fade(double t) {
return t * t * t * (t * (t * 6 - 15) + 10);
}
private double lerp(double t, double a, double b) {
return a + t * (b - a);
}
//Riven's Optimization (http://riven8192.blogspot.com/2010/08/calculate-perlinnoise-twice-as-fast.html)
private double grad(int hash, double x, double y) {
switch(hash & 0xF)
{
case 0x0:
case 0x8:
return x + y;
case 0x1:
case 0x9:
return -x + y;
case 0x2:
case 0xA:
return x - y;
case 0x3:
case 0xB:
return -x - y;
case 0x4:
case 0xC:
return y + x;
case 0x5:
case 0xD:
return -y + x;
case 0x6:
case 0xE:
return y - x;
case 0x7:
case 0xF:
return -y - x;
default: return 0; // never happens
}
}
}
Is this problem inherent in Perlin Noise because the 'height' is calculated from nearly the same (x, y) coordinate each time? Is there a way to implement the noise function so that it doesn't depend on the (x, y) coordinate of each point but still looks like terrain? Any help is greatly appreciated.
With some help from a friend of mine, I resolved the problem. Because I was using the same PERMUTATION array each generation cycle, the noise calculation was using the same base values each time. To fix this, I made a method permute() that filled PERMUTATION with the numbers 0 to 255 in a random, non-repeating order. I changed the instantiation of PERMUTATION to just be a new int[].
private final int[] P = new int[512], PERMUTATION = new int[256];
...
public void permute() {
for (int i = 0; i < PERMUTATION.length; i++) {
PERMUTATION[i] = i;
}
Random r = new Random();
int rIndex, rIndexVal;
for (int i = 0; i < PERMUTATION.length; i++) {
rIndex = r.nextInt(PERMUTATION.length);
rIndexVal = PERMUTATION[rIndex];
PERMUTATION[rIndex] = PERMUTATION[i];
PERMUTATION[i] = rIndexVal;
}
}
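For completeness, a sketch of how the constructor could then wire this in (the exact call site is my assumption; the point is only that permute() must run before PERMUTATION is copied into P):
public HeightMap(int mapWidth, int mapHeight, int tileWidth, int tileHeight, double elevationMax, double elevationMin) {
    this.elevationMax = elevationMax;
    this.elevationMin = elevationMin;
    permute(); // reshuffle PERMUTATION for every new map
    for (int i = 0; i < 256; i++) {
        P[256 + i] = P[i] = PERMUTATION[i];
    }
    // ... the tile/noise generation loop stays unchanged ...
}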

I try to rotate without a library but it makes black points in the picture

I am trying to rotate an image without the standard method, by making a color array and manipulating it, but when I invoke the rotation I get black points (see the picture).
Here is my code, colScaled is the picture I am trying to convert to an array:
public void arrays() {
colScaled = zoom2();
int j = 0;
int i = 0;
angel = Integer.parseInt(this.mn.jTextField1.getText());
float degree = (float) Math.toRadians(angel);
float cos = (float) Math.cos(degree);
float sin = (float) Math.sin(degree);
int W = Math.round(colScaled[0].length * Math.abs(sin) + colScaled.length * Math.abs(cos));
int H = Math.round(colScaled[0].length * Math.abs(cos) + colScaled.length * Math.abs(sin));
int x;
int y;
int xn = (int) W / 2;
int yn = (int) H / 2;
int hw = (int) colScaled.length / 2;
int hh = (int) colScaled[0].length / 2;
BufferedImage image = new BufferedImage(W + 1, H + 1, im.getType());
for (i = 0; i < colScaled.length; i++) {
for (j = 0; j < colScaled[0].length; j++) {
x = Math.round((i - hw) * cos - (j - hh) * sin + xn);
y = Math.round((i - hw) * sin + (j - hh) * cos + yn);
image.setRGB(x, y, colScaled[i][j]);
}
}
ImageIcon ico = new ImageIcon(image);
this.mn.jLabel1.setIcon(ico);
}
Notice this block in your code:
for (i = 0; i < colScaled.length; i++) {
for (j = 0; j < colScaled[0].length; j++) {
x = Math.round((i - hw) * cos - (j - hh) * sin + xn);
y = Math.round((i - hw) * sin + (j - hh) * cos + yn);
image.setRGB(x, y, colScaled[i][j]);
}
}
The x and y computed in the loop are pixel coordinates in the destination image, derived from the source coordinates (i, j) in colScaled.
The objective of this code is to fill all pixels in the destination image (image).
In your loop, there is no guarantee that every pixel in the destination image will be filled, even inside the rectangular zone.
The image above depicts the problem.
See? It is possible that the red pixel in the destination image is never written.
The correct solution is to iterate over the pixels of the destination image and, for each one, find the corresponding pixel in the source image (inverse mapping), as sketched below.
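Here is a minimal sketch of that inverse mapping, reusing the variables from the code above (nearest-neighbour rounding is my choice; destination pixels whose source position falls outside colScaled simply keep the image's default colour):
// Iterate over every destination pixel and rotate it back into the source,
// so every destination pixel receives a value and no holes appear.
for (int yDst = 0; yDst < image.getHeight(); yDst++) {
    for (int xDst = 0; xDst < image.getWidth(); xDst++) {
        // inverse rotation: undo the (cos, sin) rotation about the two centres
        int i = Math.round((xDst - xn) * cos + (yDst - yn) * sin + hw);
        int j = Math.round(-(xDst - xn) * sin + (yDst - yn) * cos + hh);
        if (i >= 0 && i < colScaled.length && j >= 0 && j < colScaled[0].length) {
            image.setRGB(xDst, yDst, colScaled[i][j]);
        }
    }
}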
Edit: After posting, I just saw Spektre's comment.
I agree, it seems to be a duplicate question. The phrase "pixel array" made me think it was not.

Coloring heightmap faces instead of vertices

I'm trying to create a heightmap colored by face, instead of vertex. For example, this is what I currently have:
But this is what I want:
I read that I have to split each vertex into multiple vertices, then index each separately for the triangles. I also know that Blender has a function like this for its models (split vertices, or something similar?), but I'm not sure what kind of algorithm I would follow for this. This would be the last resort, because multiplying the number of vertices in the mesh for no reason other than color doesn't seem efficient.
I also discovered something called flatshading (using the flat qualifier on the pixel color in the shaders), but it seems to only draw squares instead of triangles. Is there a way to make it shade triangles?
For reference, this is my current heightmap generation code:
public class HeightMap extends GameModel {
private static final float START_X = -0.5f;
private static final float START_Z = -0.5f;
private static final float REFLECTANCE = .1f;
public HeightMap(float minY, float maxY, float persistence, int width, int height, float spikeness) {
super(createMesh(minY, maxY, persistence, width, height, spikeness), REFLECTANCE);
}
protected static Mesh createMesh(final float minY, final float maxY, final float persistence, final int width,
final int height, float spikeness) {
SimplexNoise noise = new SimplexNoise(128, persistence, 2);// Utils.getRandom().nextInt());
float xStep = Math.abs(START_X * 2) / (width - 1);
float zStep = Math.abs(START_Z * 2) / (height - 1);
List<Float> positions = new ArrayList<>();
List<Integer> indices = new ArrayList<>();
for (int z = 0; z < height; z++) {
for (int x = 0; x < width; x++) {
// scale from [-1, 1] to [minY, maxY]
float heightY = (float) ((noise.getNoise(x * xStep * spikeness, z * zStep * spikeness) + 1f) / 2
* (maxY - minY) + minY);
positions.add(START_X + x * xStep);
positions.add(heightY);
positions.add(START_Z + z * zStep);
// Create indices
if (x < width - 1 && z < height - 1) {
int leftTop = z * width + x;
int leftBottom = (z + 1) * width + x;
int rightBottom = (z + 1) * width + x + 1;
int rightTop = z * width + x + 1;
indices.add(leftTop);
indices.add(leftBottom);
indices.add(rightTop);
indices.add(rightTop);
indices.add(leftBottom);
indices.add(rightBottom);
}
}
}
float[] verticesArr = Utils.listToArray(positions);
Color c = new Color(147, 105, 59);
float[] colorArr = new float[positions.size()];
for (int i = 0; i < colorArr.length; i += 3) {
float brightness = (Utils.getRandom().nextFloat() - 0.5f) * 0.5f;
colorArr[i] = (float) c.getRed() / 255f + brightness;
colorArr[i + 1] = (float) c.getGreen() / 255f + brightness;
colorArr[i + 2] = (float) c.getBlue() / 255f + brightness;
}
int[] indicesArr = indices.stream().mapToInt((i) -> i).toArray();
float[] normalArr = calcNormals(verticesArr, width, height);
return new Mesh(verticesArr, colorArr, normalArr, indicesArr);
}
private static float[] calcNormals(float[] posArr, int width, int height) {
Vector3f v0 = new Vector3f();
Vector3f v1 = new Vector3f();
Vector3f v2 = new Vector3f();
Vector3f v3 = new Vector3f();
Vector3f v4 = new Vector3f();
Vector3f v12 = new Vector3f();
Vector3f v23 = new Vector3f();
Vector3f v34 = new Vector3f();
Vector3f v41 = new Vector3f();
List<Float> normals = new ArrayList<>();
Vector3f normal = new Vector3f();
for (int row = 0; row < height; row++) {
for (int col = 0; col < width; col++) {
if (row > 0 && row < height - 1 && col > 0 && col < width - 1) {
int i0 = row * width * 3 + col * 3;
v0.x = posArr[i0];
v0.y = posArr[i0 + 1];
v0.z = posArr[i0 + 2];
int i1 = row * width * 3 + (col - 1) * 3;
v1.x = posArr[i1];
v1.y = posArr[i1 + 1];
v1.z = posArr[i1 + 2];
v1 = v1.sub(v0);
int i2 = (row + 1) * width * 3 + col * 3;
v2.x = posArr[i2];
v2.y = posArr[i2 + 1];
v2.z = posArr[i2 + 2];
v2 = v2.sub(v0);
int i3 = (row) * width * 3 + (col + 1) * 3;
v3.x = posArr[i3];
v3.y = posArr[i3 + 1];
v3.z = posArr[i3 + 2];
v3 = v3.sub(v0);
int i4 = (row - 1) * width * 3 + col * 3;
v4.x = posArr[i4];
v4.y = posArr[i4 + 1];
v4.z = posArr[i4 + 2];
v4 = v4.sub(v0);
v1.cross(v2, v12);
v12.normalize();
v2.cross(v3, v23);
v23.normalize();
v3.cross(v4, v34);
v34.normalize();
v4.cross(v1, v41);
v41.normalize();
normal = v12.add(v23).add(v34).add(v41);
normal.normalize();
} else {
normal.x = 0;
normal.y = 1;
normal.z = 0;
}
normal.normalize();
normals.add(normal.x);
normals.add(normal.y);
normals.add(normal.z);
}
}
return Utils.listToArray(normals);
}
}
Edit
I've tried doing a couple of things. I tried rearranging the indices with flat shading, but that didn't give me the look I wanted. I tried using a uniform vec3 array of colors and indexing it with gl_VertexID or gl_InstanceID (I'm not entirely sure of the difference), but I couldn't get the arrays to compile.
Here is the github repo, by the way.
flat qualified fragment shader inputs will receive the same value for the same primitive. In your case, a triangle.
Of course, a triangle is composed of 3 vertices. And if the vertex shaders output 3 different values, how does the fragment shader know which value to get?
This comes down to what is called the "provoking vertex." When you render, you specify a particular primitive to use in your glDraw* call (GL_TRIANGLE_STRIP, GL_TRIANGLES, etc). These primitive types will generate a number of base primitives (ie: single triangle), based on how many vertices you provided.
When a base primitive is generated, one of the vertices in that base primitive is said to be the "provoking vertex". It is that vertex's data that is used for all flat parameters.
The reason you're seeing what you are seeing is because the two adjacent triangles just happen to be using the same provoking vertex. Your mesh is smooth, so two adjacent triangles share 2 vertices. Your mesh generation just so happens to be generating a mesh such that the provoking vertex for each triangle is shared between them. Which means that the two triangles will get the same flat value.
You will need to adjust your index list or otherwise alter your mesh generation so that this doesn't happen. Or you can just divide your mesh into individual triangles; that's probably much easier.
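As a side note (my addition, not something the answer above prescribes): in desktop OpenGL 3.2+ the provoking-vertex convention itself can be switched, which changes which corner of each triangle supplies the flat value. With LWJGL bindings that looks roughly like:
import org.lwjgl.opengl.GL32;
// Call once while the GL context is current: flat-qualified varyings will then
// take their value from the first vertex of each triangle instead of the last.
GL32.glProvokingVertex(GL32.GL_FIRST_VERTEX_CONVENTION);
This does not by itself split the mesh, but it can be a useful knob when rearranging the index list so that adjacent triangles no longer pick the same provoking vertex.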
As a final resort, I just duplicated the vertices, and it seems to work. I haven't been able to profile it to see if it makes a big performance drop. I'd be open to any other suggestions!
for (int z = 0; z < height; z++) {
for (int x = 0; x < width; x++) {
// scale from [-1, 1] to [minY, maxY]
float heightY = (float) ((noise.getNoise(x * xStep * spikeness, z * zStep * spikeness) + 1f) / 2
* (maxY - minY) + minY);
positions.add(START_X + x * xStep);
positions.add(heightY);
positions.add(START_Z + z * zStep);
positions.add(START_X + x * xStep);
positions.add(heightY);
positions.add(START_Z + z * zStep);
}
}
for (int z = 0; z < height - 1; z++) {
for (int x = 0; x < width - 1; x++) {
int leftTop = z * width + x;
int leftBottom = (z + 1) * width + x;
int rightBottom = (z + 1) * width + x + 1;
int rightTop = z * width + x + 1;
indices.add(2 * leftTop);
indices.add(2 * leftBottom);
indices.add(2 * rightTop);
indices.add(2 * rightTop + 1);
indices.add(2 * leftBottom + 1);
indices.add(2 * rightBottom + 1);
}
}

MATLAB vs. FIJI statistical region merging algorithm speed: why is FIJI faster?

I have assembled an algorithm in MATLAB and as part of that am using statistical region merging. I have versions of this in MATLAB and Fiji (ImageJ), both included below. Both versions are based on the Nock and Nielsen model and seem similar (although I'm not too familiar with Java/Fiji). When I run this on a 1500x500 2D grayscale image in MATLAB it takes around 3 seconds to process versus less than a second in Fiji. I've tried trimming down the MATLAB code as much as I can but am still running into this speed issue. What's causing this timing difference? Thanks
MATLAB:
% Statistical Region Merging
%
% Nock, Richard and Nielsen, Frank 2004. Statistical Region Merging. IEEE Trans. Pattern Anal. Mach. Intell. 26, 11 (Nov. 2004), 1452-1458.
% DOI= http://dx.doi.org/10.1109/TPAMI.2004.110
%Segmentation parameter Q; Q small: few segments, Q large: many segments
function [maps,images]=srm3(image,Qlevels)
% Smoothing the image, comment this line if you work on clean or synthetic images
h=fspecial('gaussian',[3 3],1);
image=imfilter(image,h,'symmetric');
smallest_region_allowed=10;
size_image=size(image);
n_pixels=size_image(1)*size_image(2);
% Compute image gradient
[Ix,Iy]=srm_imgGrad(image(:,:,:));
Ix=max(abs(Ix),[],3);
Iy=max(abs(Iy),[],3);
normgradient=sqrt(Ix.^2+Iy.^2);
Ix(:,end)=[];
Iy(end,:)=[];
[~,index]=sort(abs([Iy(:);Ix(:)]));
n_levels=numel(Qlevels);
maps=cell(n_levels,1);
images=cell(n_levels,1);
im_final=zeros(size_image);
Q=256;
map=reshape(1:n_pixels,size_image(1:2));
% gaps=zeros(size(map)); % For future release
treerank=zeros(size_image(1:2));
size_segments=ones(size_image(1:2));
image_seg=image;
%Building pairs
n_pairs=numel(index);
idx2=reshape(map(:,1:end-1),[],1);
idx1=reshape(map(1:end-1,:),[],1);
pairs1=[ idx1;idx2 ];
pairs2=[ idx1+1;idx2+size_image(1) ];
for i=1:n_pairs
C1=pairs1(index(i));
C2=pairs2(index(i));
%Union-Find structure, here are the finds, average complexity O(1)
while (map(C1)~=C1 ); C1=map(C1); end
while (map(C2)~=C2 ); C2=map(C2); end
% Compute the predicate, region merging test
g=256;
logdelta=2*log(6*n_pixels);
dR=(image_seg(C1)-image_seg(C2))^2;
logreg1 = min(g,size_segments(C1))*log(1.0+size_segments(C1));
logreg2 = min(g,size_segments(C2))*log(1.0+size_segments(C2));
dev1=((g*g)/(2.0*Q*size_segments(C1)))*(logreg1 + logdelta);
dev2=((g*g)/(2.0*Q*size_segments(C2)))*(logreg2 + logdelta);
dev=dev1+dev2;
predicat=( (dR<dev) );
if (((C1~=C2)&&predicat) || xor(size_segments(C1)<=smallest_region_allowed, size_segments(C2)<=smallest_region_allowed))
% Find the new root for both regions
if treerank(C1) > treerank(C2)
map(C2) = C1; reg=C1;
elseif treerank(C1) < treerank(C2)
map(C1) = C2; reg=C2;
elseif C1 ~= C2
map(C2) = C1; reg=C1;
treerank(C1) = treerank(C1) + 1;
end
if C1~=C2
% Merge region
nreg=size_segments(C1)+size_segments(C2);
size_segments(C1);
size_segments(C2);
image_seg(reg)=(size_segments(C1)*image_seg(C1)+size_segments(C2)*image_seg(C2))/nreg;
size_segments(reg)=nreg;
end
end
end
while 1
map_ = map(map) ;
if isequal(map_,map) ; break ; end
map = map_ ;
end
im_final(:,:,1)=image_seg(map+(1-1)*n_pixels);
images{1}=im_final;
Fiji:
/*
* Statistical Region Merging.
* %%
* Copyright (C) 2009 - 2013 Johannes Schindelin.
* %%
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* The views and conclusions contained in the software and documentation are
* those of the authors and should not be interpreted as representing official
* policies, either expressed or implied, of any organization.
* #L%
*/
import ij.IJ;
import ij.ImagePlus;
import ij.ImageStack;
import ij.gui.GenericDialog;
import ij.plugin.filter.PlugInFilter;
import ij.process.ByteProcessor;
import ij.process.FloatProcessor;
import ij.process.ImageProcessor;
import ij.process.ShortProcessor;
import java.util.Arrays;
/*
* The Statistical Region Merging algorithm is described in
*
* R. Nock, F. Nielsen: Statistical Region Merging.
* IEEE Trans. Pattern Anal. Mach. Intell. 26(11): 1452-1458 (2004)
*/
public class SRM_ implements PlugInFilter {
ImagePlus image;
public int setup(String arg, ImagePlus image) {
this.image = image;
return DOES_8G | NO_CHANGES;
}
public void run(ImageProcessor ip) {
boolean isStack = image.getStackSize() > 1;
GenericDialog gd = new GenericDialog("SRM");
gd.addNumericField("Q", Q, 2);
gd.addCheckbox("showAverages", true);
if (isStack)
gd.addCheckbox("3d", true);
gd.showDialog();
if (gd.wasCanceled())
return;
Q = (float)gd.getNextNumber();
boolean showAverages = gd.getNextBoolean();
boolean do3D = isStack ? gd.getNextBoolean() : false;
if (do3D)
srm3D(showAverages).show();
else
srm2D(ip, showAverages).show();
}
final float g = 256; // number of different intensity values
protected float Q = 25; //25; // complexity of the assumed distributions
protected float delta;
/*
* The predicate: is the difference of the averages of the two
* regions R and R' smaller than
*
* g sqrt(1/2Q (1/|R| + 1/|R'|) ln 2/delta)
*
* Instead of calculating the square root all the time, we calculate
* the factor g^2 / 2Q ln 2/delta, and compare
*
* (<R> - <R'>)^2 < factor (1/|R| + 1/|R'|)
*
* instead.
*/
protected float factor, logDelta;
/*
* For performance reasons, these are held in w * h arrays
*/
float[] average;
int[] count;
int[] regionIndex; // if < 0, it is -1 - actual_regionIndex
/*
* The statistical region merging wants to merge regions in a specific
* order: for all neighboring pixel pairs, in ascending order of
* intensity differences.
*
* In that order, it is tested if the regions these two pixels belong
* to (by construction, the regions must be distinct) should be merged.
*
* For efficiency, we do it by bucket sorting, because there are only
* g + 1 possible differences.
*
* After sorting, for each difference the pixel pair with the largest
* index is stored in neighborBuckets[difference], and every
* nextNeighbor[index] points to the pixel pair with the same
* difference and the next smaller index (or -1 if there is none).
*
* The pixel pairs are identified by
*
* 2 * (x + (w - 1) * y) + direction
*
* where direction = 0 means "right neighbor", and direction = 1 means
* " lower neighbor". (We do not need "left" or "up", as the order
* within the pair is not important.)
*
* In n dimensions, it must be n * pixel_count, and "direction"
* specifies the Cartesian unit vector (axis) determining the neighbor.
*/
int[] nextNeighbor, neighborBucket;
protected ImagePlus srm3D(boolean showAverages) {
int w = image.getWidth(), h = image.getHeight();
int d = image.getStackSize();
delta = 1f / (6 * w * h * d);
/*
* This would be the non-relaxed formula:
*
* factor = g * g / 2 / Q * (float)Math.log(2 / delta);
*
* The paper claims that this is more prone to oversegmenting.
*/
factor = g * g / 2 / Q;
logDelta = 2f * (float)Math.log(6 * w * h * d);
IJ.showStatus("Initializing regions");
initializeRegions3D(w, h, d);
IJ.showStatus("Initializing neighbors");
initializeNeighbors3D(w, h, d);
IJ.showStatus("Merging neighbors");
mergeAllNeighbors3D(w, h);
IJ.showStatus("Making stack");
ImageStack stack = new ImageStack(w, h);
if (showAverages)
for (int k = 0; k < d; k++) {
int off = k * w * h;
float[] p = new float[w * h];
for (int i = 0; i < w * h; i++)
p[i] = average[getRegionIndex(i + off)];
stack.addSlice(null, new FloatProcessor(w, h,
p, null));
}
else {
int regionCount = consolidateRegions();
if (regionCount > 1<<16)
IJ.showMessage("Found " + regionCount
+ " regions, which does not fit"
+ " in 16-bit.");
for (int k = 0; k < d; k++) {
ImageProcessor ip;
int off = k * w * h;
if (regionCount > 1<<8) {
short[] p = new short[w * h];
for (int i = 0; i < p.length; i++)
p[i] = (short)regionIndex[i
+ off];
ip = new ShortProcessor(w, h, p, null);
}
else {
byte[] p = new byte[w * h];
for (int i = 0; i < p.length; i++)
p[i] = (byte)regionIndex[i
+ off];
ip = new ByteProcessor(w, h, p, null);
}
stack.addSlice(null, ip);
}
}
IJ.showStatus("");
String title = image.getTitle() + " (SRM3D Q=" + Q + ")";
return new ImagePlus(title, stack);
}
protected ImagePlus srm2D(ImageProcessor ip, boolean showAverages) {
int w = ip.getWidth(), h = ip.getHeight();
delta = 1f / (6 * w * h);
/*
* This would be the non-relaxed formula:
*
* factor = g * g / 2 / Q * (float)Math.log(2 / delta);
*
* The paper claims that this is more prone to oversegmenting.
*/
factor = g * g / 2 / Q;
logDelta = 2f * (float)Math.log(6 * w * h);
byte[] pixel = (byte[])ip.getPixels();
initializeRegions2D(pixel, ip.getWidth(), ip.getHeight());
initializeNeighbors2D(pixel, w, h);
mergeAllNeighbors2D(w);
if (showAverages) {
for (int i = 0; i < average.length; i++)
average[i] = average[getRegionIndex(i)];
ip = new FloatProcessor(w, h, average, null);
}
else {
int regionCount = consolidateRegions();
if (regionCount > 1<<8) {
if (regionCount > 1<<16)
IJ.showMessage("Found " + regionCount
+ " regions, which does not fit"
+ " in 16-bit.");
short[] pixel16 = new short[w * h];
for (int i = 0; i < pixel16.length; i++)
pixel16[i] = (short)regionIndex[i];
ip = new ShortProcessor(w, h, pixel16, null);
}
else {
pixel = new byte[w * h];
for (int i = 0; i < pixel.length; i++)
pixel[i] = (byte)regionIndex[i];
ip = new ByteProcessor(w, h, pixel, null);
}
}
String title = image.getTitle() + " (SRM Q=" + Q + ")";
return new ImagePlus(title, ip);
}
void initializeRegions2D(byte[] pixel, int w, int h) {
average = new float[w * h];
count = new int[w * h];
regionIndex = new int[w * h];
for (int i = 0; i < average.length; i++) {
average[i] = pixel[i] & 0xff;
count[i] = 1;
regionIndex[i] = i;
}
}
void initializeRegions3D(int w, int h, int d) {
average = new float[w * h * d];
count = new int[w * h * d];
regionIndex = new int[w * h * d];
for (int j = 0; j < d; j++) {
byte[] pixel =
(byte[])image.getStack().getProcessor(j
+ 1).getPixels();
int offset = j * w * h;
for (int i = 0; i < w * h; i++) {
average[offset + i] = pixel[i] & 0xff;
count[offset + i] = 1;
regionIndex[offset + i] = offset + i;
}
}
}
protected void addNeighborPair(int neighborIndex,
byte[] pixel, int i1, int i2) {
int difference = Math.abs((pixel[i1] & 0xff)
- (pixel[i2] & 0xff));
nextNeighbor[neighborIndex] = neighborBucket[difference];
neighborBucket[difference] = neighborIndex;
}
void initializeNeighbors2D(byte[] pixel, int w, int h) {
nextNeighbor = new int[2 * w * h];
// bucket sort
neighborBucket = new int[256];
Arrays.fill(neighborBucket, -1);
for (int j = h - 1; j >= 0; j--)
for (int i = w - 1; i >= 0; i--) {
int index = i + w * j;
int neighborIndex = 2 * index;
// vertical
if (j < h - 1)
addNeighborPair(neighborIndex + 1,
pixel, index, index + w);
// horizontal
if (i < w - 1)
addNeighborPair(neighborIndex,
pixel, index, index + 1);
}
}
protected void addNeighborPair(int neighborIndex,
byte[] pixel, byte[] nextPixel, int i) {
int difference = Math.abs((pixel[i] & 0xff)
- (nextPixel[i] & 0xff));
nextNeighbor[neighborIndex] = neighborBucket[difference];
neighborBucket[difference] = neighborIndex;
}
void initializeNeighbors3D(int w, int h, int d) {
nextNeighbor = new int[3 * w * h * d];
// bucket sort
neighborBucket = new int[256];
Arrays.fill(neighborBucket, -1);
byte[] nextPixel = null;
for (int k = d - 1; k >= 0; k--) {
byte[] pixel =
(byte[])image.getStack().getProcessor(k
+ 1).getPixels();
for (int j = h - 1; j >= 0; j--)
for (int i = w - 1; i >= 0; i--) {
int index = i + w * j;
int neighborIndex =
3 * (index + k * w * h);
// depth
if (nextPixel != null)
addNeighborPair(neighborIndex
+ 2, pixel,
nextPixel, index);
// vertical
if (j < h - 1)
addNeighborPair(neighborIndex
+ 1, pixel,
index, index + w);
// horizontal
if (i < w - 1)
addNeighborPair(neighborIndex,
pixel,
index, index + 1);
}
nextPixel = pixel;
}
}
// recursively find out the region index for this pixel
int getRegionIndex(int i) {
i = regionIndex[i];
while (i < 0)
i = regionIndex[-1 - i];
return i;
}
// should regions i1 and i2 be merged?
boolean predicate(int i1, int i2) {
float difference = average[i1] - average[i2];
/*
* This would be the non-relaxed predicate mentioned in the
* paper.
*
* return difference * difference <
factor * (1f / count[i1] + 1f / count[i2]);
*
*/
float log1 = (float)Math.log(1 + count[i1])
* (g < count[i1] ? g : count[i1]);
float log2 = (float)Math.log(1 + count[i2])
* (g < count[i2] ? g : count[i2]);
return difference * difference <
.1f * factor * ((log1 + logDelta) / count[i1]
+ ((log2 + logDelta) / count[i2]));
}
void mergeAllNeighbors2D(int w) {
for (int i = 0; i < neighborBucket.length; i++) {
int neighborIndex = neighborBucket[i];
while (neighborIndex >= 0) {
int i1 = neighborIndex / 2;
int i2 = i1
+ (0 == (neighborIndex & 1) ? 1 : w);
i1 = getRegionIndex(i1);
i2 = getRegionIndex(i2);
if (predicate(i1, i2))
mergeRegions(i1, i2);
neighborIndex = nextNeighbor[neighborIndex];
}
}
}
void mergeAllNeighbors3D(int w, int h) {
for (int i = 0; i < neighborBucket.length; i++) {
int neighborIndex = neighborBucket[i];
IJ.showProgress(i, neighborBucket.length);
while (neighborIndex >= 0) {
int i1 = neighborIndex / 3;
int i2 = i1
+ (0 == (neighborIndex % 3) ? 1 :
(1 == (neighborIndex % 3) ? w :
w * h));
i1 = getRegionIndex(i1);
i2 = getRegionIndex(i2);
if (i1 != i2 && predicate(i1, i2))
mergeRegions(i1, i2);
neighborIndex = nextNeighbor[neighborIndex];
}
}
IJ.showProgress(neighborBucket.length, neighborBucket.length);
}
void mergeRegions(int i1, int i2) {
if (i1 == i2)
return;
int mergedCount = count[i1] + count[i2];
float mergedAverage = (average[i1] * count[i1]
+ average[i2] * count[i2]) / mergedCount;
// merge larger index into smaller index
if (i1 > i2) {
average[i2] = mergedAverage;
count[i2] = mergedCount;
regionIndex[i1] = -1 - i2;
}
else {
average[i1] = mergedAverage;
count[i1] = mergedCount;
regionIndex[i2] = -1 - i1;
}
}
int consolidateRegions() {
/*
* By construction, a negative regionIndex will always point
* to a smaller regionIndex.
*
* So we can get away by iterating from small to large and
* replacing the positive ones with running numbers, and the
* negative ones by the ones they are pointing to (that are
* now guaranteed to contain a non-negative index).
*/
int count = 0;
for (int i = 0; i < regionIndex.length; i++)
if (regionIndex[i] < 0)
regionIndex[i] =
regionIndex[-1 - regionIndex[i]];
else
regionIndex[i] = count++;
return count;
}
}

canny edge detector in java

Hi, I am working on a project in which I need to implement an edge detector. I need to do it in VHDL; however, I am a little better at Java, so I am looking at getting working code in Java first and then transferring it over. I found the code below but I can't get it working; I keep getting an error in main on this line: detector.setSourceImage(frame); the error says frame cannot be resolved to a variable. I understand why I'm getting the error but I am not sure how to fix it, because I don't know how to get the picture in. I am just looking for a quick fix to make this work so I can get started on the VHDL part. Thanks for any help you can give.
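For reference, a minimal sketch of a main method that supplies the missing image (the file names are placeholders of mine; setSourceImage, process and getEdgesImage are the methods declared in the class below):
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
// Place this in the same package as CannyEdgeDetector (or import that class).
public class CannyMain {
    public static void main(String[] args) throws Exception {
        BufferedImage frame = ImageIO.read(new File("input.png")); // the picture "frame" was meant to hold
        CannyEdgeDetector detector = new CannyEdgeDetector();
        detector.setSourceImage(frame);
        detector.process();
        BufferedImage edges = detector.getEdgesImage();
        ImageIO.write(edges, "png", new File("edges.png"));
    }
}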
package CannyEdgeDetector;
import java.awt.image.BufferedImage;
public class CannyEdgeDetector {
// statics
private final static float GAUSSIAN_CUT_OFF = 0.005f;
private final static float MAGNITUDE_SCALE = 100F;
private final static float MAGNITUDE_LIMIT = 1000F;
private final static int MAGNITUDE_MAX = (int) (MAGNITUDE_SCALE * MAGNITUDE_LIMIT);
// fields
private int height;
private int width;
private int picsize;
private int[] data;
private int[] magnitude;
private BufferedImage sourceImage;
private BufferedImage edgesImage;
private float gaussianKernelRadius;
private float lowThreshold;
private float highThreshold;
private int gaussianKernelWidth;
private boolean contrastNormalized;
private float[] xConv;
private float[] yConv;
private float[] xGradient;
private float[] yGradient;
// constructors
/**
* Constructs a new detector with default parameters.
*/
public CannyEdgeDetector() {
lowThreshold = 2.5f;
highThreshold = 7.5f;
gaussianKernelRadius = 2f;
gaussianKernelWidth = 16;
contrastNormalized = false;
}
// accessors
/**
* The image that provides the luminance data used by this detector to
* generate edges.
*
* @return the source image, or null
*/
public BufferedImage getSourceImage() {
return sourceImage;
}
/**
* Specifies the image that will provide the luminance data in which edges
* will be detected. A source image must be set before the process method
* is called.
*
* @param image a source of luminance data
*/
public void setSourceImage(BufferedImage image) {
sourceImage = image;
}
/**
* Obtains an image containing the edges detected during the last call to
* the process method. The buffered image is an opaque image of type
* BufferedImage.TYPE_INT_ARGB in which edge pixels are white and all other
* pixels are black.
*
* @return an image containing the detected edges, or null if the process
* method has not yet been called.
*/
public BufferedImage getEdgesImage() {
return edgesImage;
}
/**
* Sets the edges image. Calling this method will not change the operation
* of the edge detector in any way. It is intended to provide a means by
* which the memory referenced by the detector object may be reduced.
*
* @param edgesImage expected (though not required) to be null
*/
public void setEdgesImage(BufferedImage edgesImage) {
this.edgesImage = edgesImage;
}
/**
* The low threshold for hysteresis. The default value is 2.5.
*
* @return the low hysteresis threshold
*/
public float getLowThreshold() {
return lowThreshold;
}
/**
* Sets the low threshold for hysteresis. Suitable values for this parameter
* must be determined experimentally for each application. It is nonsensical
* (though not prohibited) for this value to exceed the high threshold value.
*
* @param threshold a low hysteresis threshold
*/
public void setLowThreshold(float threshold) {
if (threshold < 0) throw new IllegalArgumentException();
lowThreshold = threshold;
}
/**
* The high threshold for hysteresis. The default value is 7.5.
*
* @return the high hysteresis threshold
*/
public float getHighThreshold() {
return highThreshold;
}
/**
* Sets the high threshold for hysteresis. Suitable values for this
* parameter must be determined experimentally for each application. It is
* nonsensical (though not prohibited) for this value to be less than the
* low threshold value.
*
* @param threshold a high hysteresis threshold
*/
public void setHighThreshold(float threshold) {
if (threshold < 0) throw new IllegalArgumentException();
highThreshold = threshold;
}
/**
* The number of pixels across which the Gaussian kernel is applied.
* The default value is 16.
*
* @return the radius of the convolution operation in pixels
*/
public int getGaussianKernelWidth() {
return gaussianKernelWidth;
}
/**
* The number of pixels across which the Gaussian kernel is applied.
* This implementation will reduce the radius if the contribution of pixel
* values is deemed negligible, so this is actually a maximum radius.
*
* @param gaussianKernelWidth a radius for the convolution operation in
* pixels, at least 2.
*/
public void setGaussianKernelWidth(int gaussianKernelWidth) {
if (gaussianKernelWidth < 2) throw new IllegalArgumentException();
this.gaussianKernelWidth = gaussianKernelWidth;
}
/**
* The radius of the Gaussian convolution kernel used to smooth the source
* image prior to gradient calculation. The default value is 2.
*
* @return the Gaussian kernel radius in pixels
*/
public float getGaussianKernelRadius() {
return gaussianKernelRadius;
}
/**
* Sets the radius of the Gaussian convolution kernel used to smooth the
* source image prior to gradient calculation.
*
* @param gaussianKernelRadius a Gaussian kernel radius in pixels, must exceed 0.1f.
*/
public void setGaussianKernelRadius(float gaussianKernelRadius) {
if (gaussianKernelRadius < 0.1f) throw new IllegalArgumentException();
this.gaussianKernelRadius = gaussianKernelRadius;
}
/**
* Whether the luminance data extracted from the source image is normalized
* by linearizing its histogram prior to edge extraction. The default value
* is false.
*
* @return whether the contrast is normalized
*/
public boolean isContrastNormalized() {
return contrastNormalized;
}
/**
* Sets whether the contrast is normalized
* @param contrastNormalized true if the contrast should be normalized,
* false otherwise
*/
public void setContrastNormalized(boolean contrastNormalized) {
this.contrastNormalized = contrastNormalized;
}
// methods
public void process() {
width = sourceImage.getWidth();
height = sourceImage.getHeight();
picsize = width * height;
initArrays();
readLuminance();
if (contrastNormalized) normalizeContrast();
computeGradients(gaussianKernelRadius, gaussianKernelWidth);
int low = Math.round(lowThreshold * MAGNITUDE_SCALE);
int high = Math.round( highThreshold * MAGNITUDE_SCALE);
performHysteresis(low, high);
thresholdEdges();
writeEdges(data);
}
// private utility methods
private void initArrays() {
if (data == null || picsize != data.length) {
data = new int[picsize];
magnitude = new int[picsize];
xConv = new float[picsize];
yConv = new float[picsize];
xGradient = new float[picsize];
yGradient = new float[picsize];
}
}
//NOTE: The elements of the method below (specifically the technique for
//non-maximal suppression and the technique for gradient computation)
//are derived from an implementation posted in the following forum (with the
//clear intent of others using the code):
// http://forum.java.sun.com/thread.jspa?threadID=546211&start=45&tstart=0
//My code effectively mimics the algorithm exhibited above.
//Since I don't know the provenance of the code that was posted it is a
//possibility (though I think a very remote one) that this code violates
//someone's intellectual property rights. If this concerns you feel free to
//contact me for an alternative, though less efficient, implementation.
private void computeGradients(float kernelRadius, int kernelWidth) {
//generate the gaussian convolution masks
float kernel[] = new float[kernelWidth];
float diffKernel[] = new float[kernelWidth];
int kwidth;
for (kwidth = 0; kwidth < kernelWidth; kwidth++) {
float g1 = gaussian(kwidth, kernelRadius);
if (g1 <= GAUSSIAN_CUT_OFF && kwidth >= 2) break;
float g2 = gaussian(kwidth - 0.5f, kernelRadius);
float g3 = gaussian(kwidth + 0.5f, kernelRadius);
kernel[kwidth] = (g1 + g2 + g3) / 3f / (2f * (float) Math.PI * kernelRadius * kernelRadius);
diffKernel[kwidth] = g3 - g2;
}
int initX = kwidth - 1;
int maxX = width - (kwidth - 1);
int initY = width * (kwidth - 1);
int maxY = width * (height - (kwidth - 1));
//perform convolution in x and y directions
for (int x = initX; x < maxX; x++) {
for (int y = initY; y < maxY; y += width) {
int index = x + y;
float sumX = data[index] * kernel[0];
float sumY = sumX;
int xOffset = 1;
int yOffset = width;
for(; xOffset < kwidth ;) {
sumY += kernel[xOffset] * (data[index - yOffset] + data[index + yOffset]);
sumX += kernel[xOffset] * (data[index - xOffset] + data[index + xOffset]);
yOffset += width;
xOffset++;
}
yConv[index] = sumY;
xConv[index] = sumX;
}
}
for (int x = initX; x < maxX; x++) {
for (int y = initY; y < maxY; y += width) {
float sum = 0f;
int index = x + y;
for (int i = 1; i < kwidth; i++)
sum += diffKernel[i] * (yConv[index - i] - yConv[index + i]);
xGradient[index] = sum;
}
}
for (int x = kwidth; x < width - kwidth; x++) {
for (int y = initY; y < maxY; y += width) {
float sum = 0.0f;
int index = x + y;
int yOffset = width;
for (int i = 1; i < kwidth; i++) {
sum += diffKernel[i] * (xConv[index - yOffset] - xConv[index + yOffset]);
yOffset += width;
}
yGradient[index] = sum;
}
}
initX = kwidth;
maxX = width - kwidth;
initY = width * kwidth;
maxY = width * (height - kwidth);
for (int x = initX; x < maxX; x++) {
for (int y = initY; y < maxY; y += width) {
int index = x + y;
int indexN = index - width;
int indexS = index + width;
int indexW = index - 1;
int indexE = index + 1;
int indexNW = indexN - 1;
int indexNE = indexN + 1;
int indexSW = indexS - 1;
int indexSE = indexS + 1;
float xGrad = xGradient[index];
float yGrad = yGradient[index];
float gradMag = hypot(xGrad, yGrad);
//perform non-maximal suppression
float nMag = hypot(xGradient[indexN], yGradient[indexN]);
float sMag = hypot(xGradient[indexS], yGradient[indexS]);
float wMag = hypot(xGradient[indexW], yGradient[indexW]);
float eMag = hypot(xGradient[indexE], yGradient[indexE]);
float neMag = hypot(xGradient[indexNE], yGradient[indexNE]);
float seMag = hypot(xGradient[indexSE], yGradient[indexSE]);
float swMag = hypot(xGradient[indexSW], yGradient[indexSW]);
float nwMag = hypot(xGradient[indexNW], yGradient[indexNW]);
float tmp;
/*
* An explanation of what's happening here, for those who want
* to understand the source: This performs the "non-maximal
* suppression" phase of the Canny edge detection in which we
* need to compare the gradient magnitude to that in the
* direction of the gradient; only if the value is a local
* maximum do we consider the point as an edge candidate.
*
* We need to break the comparison into a number of different
* cases depending on the gradient direction so that the
* appropriate values can be used. To avoid computing the
* gradient direction, we use two simple comparisons: first we
* check that the partial derivatives have the same sign (1)
* and then we check which is larger (2). As a consequence, we
* have reduced the problem to one of four identical cases that
* each test the central gradient magnitude against the values at
* two points with 'identical support'; what this means is that
* the geometry required to accurately interpolate the magnitude
* of the gradient function at those points is identical
* (up to right-angled rotation/reflection).
*
* When comparing the central gradient to the two interpolated
* values, we avoid performing any divisions by multiplying both
* sides of each inequality by the greater of the two partial
* derivatives. The common comparand is stored in a temporary
* variable (3) and reused in the mirror case (4).
*
*/
if (xGrad * yGrad <= (float) 0 /*(1)*/
? Math.abs(xGrad) >= Math.abs(yGrad) /*(2)*/
? (tmp = Math.abs(xGrad * gradMag)) >= Math.abs(yGrad * neMag - (xGrad + yGrad) * eMag) /*(3)*/
&& tmp > Math.abs(yGrad * swMag - (xGrad + yGrad) * wMag) /*(4)*/
: (tmp = Math.abs(yGrad * gradMag)) >= Math.abs(xGrad * neMag - (yGrad + xGrad) * nMag) /*(3)*/
&& tmp > Math.abs(xGrad * swMag - (yGrad + xGrad) * sMag) /*(4)*/
: Math.abs(xGrad) >= Math.abs(yGrad) /*(2)*/
? (tmp = Math.abs(xGrad * gradMag)) >= Math.abs(yGrad * seMag + (xGrad - yGrad) * eMag) /*(3)*/
&& tmp > Math.abs(yGrad * nwMag + (xGrad - yGrad) * wMag) /*(4)*/
: (tmp = Math.abs(yGrad * gradMag)) >= Math.abs(xGrad * seMag + (yGrad - xGrad) * sMag) /*(3)*/
&& tmp > Math.abs(xGrad * nwMag + (yGrad - xGrad) * nMag) /*(4)*/
) {
magnitude[index] = gradMag >= MAGNITUDE_LIMIT ? MAGNITUDE_MAX : (int) (MAGNITUDE_SCALE * gradMag);
//NOTE: The orientation of the edge is not employed by this
//implementation. It is a simple matter to compute it at
//this point as: Math.atan2(yGrad, xGrad);
} else {
magnitude[index] = 0;
}
}
}
}
//NOTE: It is quite feasible to replace the implementation of this method
//with one which only loosely approximates the hypot function. I've tested
//simple approximations such as Math.abs(x) + Math.abs(y) and they work fine.
private float hypot(float x, float y) {
return (float) Math.hypot(x, y);
}
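//NOTE: As an illustration of the note above, such an approximation might
//look like the (hypothetical, commented-out) method below; whether the loss
//of accuracy is acceptable depends on your images.
//private float hypotApprox(float x, float y) {
//    return Math.abs(x) + Math.abs(y);
//}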
private float gaussian(float x, float sigma) {
return (float) Math.exp(-(x * x) / (2f * sigma * sigma));
}
private void performHysteresis(int low, int high) {
//NOTE: this implementation reuses the data array to store both
//luminance data from the image, and edge intensity from the processing.
//This is done for memory efficiency, other implementations may wish
//to separate these functions.
Arrays.fill(data, 0);
int offset = 0;
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
if (data[offset] == 0 && magnitude[offset] >= high) {
follow(x, y, offset, low);
}
offset++;
}
}
}
private void follow(int x1, int y1, int i1, int threshold) {
int x0 = x1 == 0 ? x1 : x1 - 1;
int x2 = x1 == width - 1 ? x1 : x1 + 1;
int y0 = y1 == 0 ? y1 : y1 - 1;
int y2 = y1 == height -1 ? y1 : y1 + 1;
data[i1] = magnitude[i1];
for (int x = x0; x <= x2; x++) {
for (int y = y0; y <= y2; y++) {
int i2 = x + y * width;
if ((y != y1 || x != x1)
&& data[i2] == 0
&& magnitude[i2] >= threshold) {
follow(x, y, i2, threshold);
return;
}
}
}
}
private void thresholdEdges() {
for (int i = 0; i < picsize; i++) {
data[i] = data[i] > 0 ? -1 : 0xff000000;
}
}
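//NOTE: The weights used in the luminance method below are the standard
//ITU-R BT.601 luma coefficients.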
private int luminance(float r, float g, float b) {
return Math.round(0.299f * r + 0.587f * g + 0.114f * b);
}
private void readLuminance() {
int type = sourceImage.getType();
if (type == BufferedImage.TYPE_INT_RGB || type == BufferedImage.TYPE_INT_ARGB) {
int[] pixels = (int[]) sourceImage.getData().getDataElements(0, 0, width, height, null);
for (int i = 0; i < picsize; i++) {
int p = pixels[i];
int r = (p & 0xff0000) >> 16;
int g = (p & 0xff00) >> 8;
int b = p & 0xff;
data[i] = luminance(r, g, b);
}
} else if (type == BufferedImage.TYPE_BYTE_GRAY) {
byte[] pixels = (byte[]) sourceImage.getData().getDataElements(0, 0, width, height, null);
for (int i = 0; i < picsize; i++) {
data[i] = (pixels[i] & 0xff);
}
} else if (type == BufferedImage.TYPE_USHORT_GRAY) {
short[] pixels = (short[]) sourceImage.getData().getDataElements(0, 0, width, height, null);
for (int i = 0; i < picsize; i++) {
data[i] = (pixels[i] & 0xffff) / 256;
}
} else if (type == BufferedImage.TYPE_3BYTE_BGR) {
byte[] pixels = (byte[]) sourceImage.getData().getDataElements(0, 0, width, height, null);
int offset = 0;
for (int i = 0; i < picsize; i++) {
int b = pixels[offset++] & 0xff;
int g = pixels[offset++] & 0xff;
int r = pixels[offset++] & 0xff;
data[i] = luminance(r, g, b);
}
} else {
throw new IllegalArgumentException("Unsupported image type: " + type);
}
}
private void normalizeContrast() {
int[] histogram = new int[256];
for (int i = 0; i < data.length; i++) {
histogram[data[i]]++;
}
int[] remap = new int[256];
int sum = 0;
int j = 0;
for (int i = 0; i < histogram.length; i++) {
sum += histogram[i];
int target = sum*255/picsize;
for (int k = j+1; k <=target; k++) {
remap[k] = i;
}
j = target;
}
for (int i = 0; i < data.length; i++) {
data[i] = remap[data[i]];
}
}
private void writeEdges(int pixels[]) {
//NOTE: There is currently no mechanism for obtaining the edge data
//in any other format other than an INT_ARGB type BufferedImage.
//This may be easily remedied by providing alternative accessors.
if (edgesImage == null) {
edgesImage = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
}
edgesImage.getWritableTile(0, 0).setDataElements(0, 0, width, height, pixels);
}
// example usage: read a test image from a file and run the detector on it
public static void main(String[] args) throws java.io.IOException {
//create the detector
CannyEdgeDetector detector = new CannyEdgeDetector();
//adjust its parameters as desired
detector.setLowThreshold(0.5f);
detector.setHighThreshold(1f);
//apply it to an image, e.g. a saved screenshot whose path is given as the first argument
BufferedImage frame = javax.imageio.ImageIO.read(new java.io.File(args[0]));
detector.setSourceImage(frame);
detector.process();
BufferedImage edges = detector.getEdgesImage();
}
}
Why don't you just read a test image from a file? That way, you can verify that it's working properly before transferring.
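Along those lines, here is a minimal sketch of driving the detector from a live desktop capture instead of a file. It assumes the CannyEdgeDetector class posted above plus the standard java.awt.Robot API; the DesktopEdgeDemo class name, the threshold values, and the edges.png output path are placeholders to adapt:

import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.image.BufferedImage;

public class DesktopEdgeDemo {
    public static void main(String[] args) throws Exception {
        // grab the whole primary screen as a BufferedImage
        Rectangle screen = new Rectangle(Toolkit.getDefaultToolkit().getScreenSize());
        BufferedImage desktop = new Robot().createScreenCapture(screen);

        // run the Canny detector from the listing above over the capture
        CannyEdgeDetector detector = new CannyEdgeDetector();
        detector.setLowThreshold(0.5f);
        detector.setHighThreshold(1f);
        detector.setSourceImage(desktop);
        detector.process();

        // write the edge map to disk so the result can be checked by eye
        javax.imageio.ImageIO.write(detector.getEdgesImage(), "png", new java.io.File("edges.png"));
    }
}

Writing the edge map out like this makes it easy to judge whether the thresholds are suitable before wiring the detector into the "watching" framework described in the question.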
