I'm using OpenCV to get a bird's-eye view of the captured frames. This is done by providing a chessboard pattern on the plane which will form the bird's-eye view.
Although it seems like the camera is already almost directly above this plane, it needs to be perfect in order to determine the relationship between pixels and centimeters.
In the next phase the captured frames are warped. It gives the expected result:
However, by performing this transformation, data outside the chessboard pattern is lost. What I need is to rotate the image instead of warping a known quadrangle.
Question: How do I rotate an image by the camera angle so that it's top-down?
Some code to illustrate what I'm currently doing:
Size chessboardSize = new Size(12, 8); // Size of the chessboard
Size captureSize = new Size(1920, 1080); // Size of the captured frames
Size viewSize = new Size((chessboardSize.width / chessboardSize.height) * captureSize.height, captureSize.height); // Size of the view
MatOfPoint2f imageCorners; // Contains the imageCorners obtained in an earlier stage
Mat H; // Homography
The code which finds the corners:
Mat grayImage = new Mat();
//Imgproc.resize(source, temp, new Size(source.width(), source.height()));
Imgproc.cvtColor(source, grayImage, Imgproc.COLOR_BGR2GRAY);
Imgproc.threshold(grayImage, grayImage, 0.0, 255.0, Imgproc.THRESH_OTSU);
imageCorners = new MatOfPoint2f();
Imgproc.GaussianBlur(grayImage, grayImage, new Size(5, 5), 5);
boolean found = Calib3d.findChessboardCorners(grayImage, chessboardSize, imageCorners, Calib3d.CALIB_CB_NORMALIZE_IMAGE + Calib3d.CALIB_CB_ADAPTIVE_THRESH + Calib3d.CALIB_CB_FILTER_QUADS);
if (found) {
determineHomography();
}
The code which determines the homography:
Point[] data = imageCorners.toArray();
if (data.length < chessboardSize.area()) {
return;
}
Point[] roi = new Point[] {
data[0 * (int)chessboardSize.width - 0], // Top left
data[1 * (int)chessboardSize.width - 1], // Top right
data[((int)chessboardSize.height - 1) * (int)chessboardSize.width - 0], // Bottom left
data[((int)chessboardSize.height - 0) * (int)chessboardSize.width - 1], // Bottom right
};
Point[] roo = new Point[] {
new Point(0, 0),
new Point(viewSize.width, 0),
new Point(0, viewSize.height),
new Point(viewSize.width, viewSize.height)
};
MatOfPoint2f objectPoints = new MatOfPoint2f(), imagePoints = new MatOfPoint2f();
objectPoints.fromArray(roo);
imagePoints.fromArray(roi);
H = Imgproc.getPerspectiveTransform(imagePoints, objectPoints); // assign the field H declared above (a local "Mat H" would shadow it)
Finally, the captured frames are being warped:
Imgproc.warpPerspective(capture, view, H, viewSize);
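For completeness, here is an untested sketch of a workaround I have considered for keeping the data outside the chessboard: translate the homography so that the whole warped frame fits into the output (this also uses Core and CvType; Hfull and fullSize are just illustrative names):
MatOfPoint2f frameCorners = new MatOfPoint2f(
    new Point(0, 0),
    new Point(captureSize.width, 0),
    new Point(0, captureSize.height),
    new Point(captureSize.width, captureSize.height));
MatOfPoint2f warpedCorners = new MatOfPoint2f();
Core.perspectiveTransform(frameCorners, warpedCorners, H);
// bounding box of the warped frame corners
double minX = Double.MAX_VALUE, minY = Double.MAX_VALUE;
double maxX = -Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
for (Point p : warpedCorners.toArray()) {
    minX = Math.min(minX, p.x); maxX = Math.max(maxX, p.x);
    minY = Math.min(minY, p.y); maxY = Math.max(maxY, p.y);
}
// translation that moves the bounding box to the origin
Mat T = Mat.eye(3, 3, CvType.CV_64F);
T.put(0, 2, -minX);
T.put(1, 2, -minY);
Mat Hfull = new Mat();
Core.gemm(T, H, 1, new Mat(), 0, Hfull); // Hfull = T * H
Size fullSize = new Size(maxX - minX, maxY - minY);
Imgproc.warpPerspective(capture, view, Hfull, fullSize);
The chessboard region then lands at an offset of (-minX, -minY) inside the larger view, and the pixels-per-centimeter scale inside it stays the same as before.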
[Edit2] updated progress
There may be more than just rotation present, so I would try this instead:
preprocess image
You can apply many filters to remove noise from the image and/or normalize the illumination conditions (it looks like your posted image does not need that). Then simply binarize the image to simplify the further steps. See related:
OpenCV for OCR: How to compute thresholding levels for gray image OCR
detect square corner points
and store their coordinates in some array with their topology
double pnt[col][row][2];
where (col,row) is the chessboard index and [2] stores (x,y). You could use int, but double/float will avoid unnecessary conversions and rounding during the fitting ...
The corners can be detected (unless skew/rotation is near 45 degrees) by scanning the diagonal neighbor pixels like this:
One diagonal should be in one color and the other in a different one. This pattern will detect a cluster of points around each crossing, so find such nearby points and compute their average.
If you scan the whole image, the outer for loop will also sort the point list along one axis, so no further sorting is needed. After averaging, sort/order the points into the grid topology (for example by the direction between the 2 closest points).
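For illustration, a minimal Java version of that diagonal test could look like this (assuming img is the binarized image as a 0/255 int array and 2 <= x < width-2, 2 <= y < height-2; it mirrors the conditions used in the C++ code further below):
static boolean isCrossing(int[][] img, int x, int y) {
    // both diagonals must agree with themselves but disagree with each other
    return img[y - 2][x + 2] == img[y + 2][x - 2]
        && img[y - 1][x + 1] == img[y + 1][x - 1]
        && img[y - 1][x + 1] != img[y + 1][x + 1]
        && img[y - 1][x - 1] == img[y + 1][x + 1]
        && img[y - 2][x - 2] == img[y + 2][x + 2];
}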
Topology
To make it robust I used a rotated & skewed image, so the topology detection is a bit tricky. After a while of elaborating I came up with this:
find point p0 near middle of image
That should ensure that there are neighbors for that point.
find closest point p to it
But ignore diagonal points (|x/y| -> 1 +/- scale of squares). From this point compute the first basis vector; let's call it u for now.
find the closest point to it that is not in the u direction
In the same manner as #2, but this time also ignore points in the +/-u direction (|(u.v)|/(|u|.|v|) -> 1 +/- skew/rotation). From this point compute the second basis vector; let's call it v for now.
normalize u,v
I chose u to point in the +x direction and v in the +y direction. So the basis vector with the bigger |x| value should be u and the one with the bigger |y| value should be v. Test and swap them if needed, and negate one if its sign is wrong. Now we have the basis vectors for the middle of the screen (further away they might change).
compute topology
Set the point p0 as (u=0,v=0) as a start point. Now loop through all yet-unmatched points p. For each one, compute the predicted positions of its neighbors by adding/subtracting the basis vectors to/from its position. Then find the closest point to each predicted location; if one is found, it should be a neighbor, so set its (u,v) coordinate to +/-1 of the original point p. Now update the basis vectors for these points, and loop the whole thing until no new match is found. The result should be that most of the points have computed (u,v) coordinates, which is what we need.
After this you can find min(u),min(v) and shift them to (0,0) so the indexes are not negative, if needed.
fit a polynomial for the corner points
for example something like:
pnt[i][j][0]=fx(i,j)
pnt[i][j][1]=fy(i,j)
where fx,fy are polynomial functions. You can try any fitting process. I tried a cubic polynomial fit using approximation search, but the result was not as good as native bi-cubic interpolation (possibly because of the non-uniform distortion of the test image), so I switched to bi-cubic interpolation instead of fitting. It is simpler, but it makes computing the inverse very difficult; that can be avoided at the cost of speed. If you need to compute the inverse anyway, see
Reverse complex 2D lookup table
I am using a simple cubic interpolation like this:
d1=0.5*(pp[2]-pp[0]);
d2=0.5*(pp[3]-pp[1]);
a0=pp[1];
a1=d1;
a2=(3.0*(pp[2]-pp[1]))-(2.0*d1)-d2;
a3=d1+d2+(2.0*(-pp[2]+pp[1]));
coordinate = a0+(a1*t)+(a2*t*t)+(a3*t*t*t);
where pp[0..3] are 4 consecutive known control points (our grid crossings), a0..a3 are the computed polynomial coefficients, and coordinate is the point on the curve at parameter t. This can be expanded to any number of dimensions.
The properties of this curve are simple: it is continuous, starting at pp[1] and ending at pp[2] while t = <0.0,1.0>. Continuity with neighboring segments is ensured because adjacent segments share control points and derivative estimates, as with all such cubic splines.
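If you want this on the Java side, a direct port of the segment above could look like this (pp[0..3] are 4 consecutive control values and t runs over <0.0,1.0>):
static double cubicInterp(double[] pp, double t) {
    double d1 = 0.5 * (pp[2] - pp[0]); // derivative estimate at pp[1]
    double d2 = 0.5 * (pp[3] - pp[1]); // derivative estimate at pp[2]
    double a0 = pp[1];
    double a1 = d1;
    double a2 = 3.0 * (pp[2] - pp[1]) - 2.0 * d1 - d2;
    double a3 = d1 + d2 + 2.0 * (pp[1] - pp[2]);
    return a0 + t * (a1 + t * (a2 + t * a3)); // Horner form of a0+a1*t+a2*t^2+a3*t^3
}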
remap pixels
Simply use i,j as floating-point values with a step of around 75% of the pixel size to avoid gaps. Then just loop through all the positions (i,j), compute (x,y), and copy the pixel from the source image at (x,y) to (i*sz,j*sz)+/-offset, where sz is the wanted grid size in pixels.
Here is the C++ code:
//---------------------------------------------------------------------------
picture pic0,pic1; // pic0 - original input image,pic1 output
//---------------------------------------------------------------------------
struct _pnt
{
int x,y,n;
int ux,uy,vx,vy;
_pnt(){};
_pnt(_pnt& a){ *this=a; };
~_pnt(){};
_pnt* operator = (const _pnt *a) { x=a->x; y=a->y; return this; };
//_pnt* operator = (const _pnt &a) { ...copy... return this; };
};
//---------------------------------------------------------------------------
void vision()
{
pic1=pic0; // copy input image pic0 to pic1
pic1.enhance_range(); // maximize dynamic range of all channels
pic1.treshold_AND(0,127,255,0); // binarize (remove gray shades)
pic1&=0x00FFFFFF; // clear alpha channel for exact color matching
pic1.save("out_binarised.png");
int i0,i,j,k,l,x,y,u,v,ux,uy,ul,vx,vy,vl;
int qi[4],ql[4],e,us,vs,**uv;
_pnt *p,*q,p0;
List<_pnt> pnt;
// detect square crossings point clouds into pnt[]
pnt.allocate(512); pnt.num=0;
p0.ux=0; p0.uy=0; p0.vx=0; p0.vy=0;
for (p0.n=1,p0.y=2;p0.y<pic1.ys-2;p0.y++) // sorted by y axis, each point has usage n=1
for ( p0.x=2;p0.x<pic1.xs-2;p0.x++)
if (pic1.p[p0.y-2][p0.x+2].dd==pic1.p[p0.y+2][p0.x-2].dd)
if (pic1.p[p0.y-1][p0.x+1].dd==pic1.p[p0.y+1][p0.x-1].dd)
if (pic1.p[p0.y-1][p0.x+1].dd!=pic1.p[p0.y+1][p0.x+1].dd)
if (pic1.p[p0.y-1][p0.x-1].dd==pic1.p[p0.y+1][p0.x+1].dd)
if (pic1.p[p0.y-2][p0.x-2].dd==pic1.p[p0.y+2][p0.x+2].dd)
pnt.add(p0);
// merge close points (deleted point has n=0)
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (p->n) // skip deleted points
for (p0=*p,j=i+1,q=p+1;j<pnt.num;j++,q++) // scan all remaining points
if (q->n) // skip deleted points
{
if (q->y>p0.y+4) continue; // scan only up to y distance <=4 (point clouds are not bigger than that)
x=p0.x-q->x; x*=x; // compute distance^2
y=p0.y-q->y; y*=y; x+=y;
if (x>25) continue; // skip too distant points
p->x+=q->x; // add coordinates (average)
p->y+=q->y;
p->n++; // increase usage
q->n=0; // mark current point as deleted
}
// divide the average coordinates and delete marked points
for (p=pnt.dat,i=0,j=0;i<pnt.num;i++,p++)
if (p->n) // skip deleted points
{
p->x/=p->n;
p->y/=p->n;
p->n=1;
pnt.dat[j]=*p; j++;
} pnt.num=j;
// n is now encoded (u,v) so set it as unmatched (u,v) first
#define uv2n(u,v) ((((v+32768)&65535)<<16)|((u+32768)&65535))
#define n2uv(n) { u=n&65535; u-=32768; v=(n>>16)&65535; v-=32768; }
for (p=pnt.dat,i=0;i<pnt.num;i++,p++) p->n=0;
// p0,i0 find point near middle of image
x=pic1.xs>>2;
y=pic1.ys>>2;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if ((p->x>=x)&&(p->x<=x+x+x)
&&(p->y>=y)&&(p->y<=y+y+y)) break;
p0=*p; i0=i;
// q,j find closest point to p0
vl=pic1.xs+pic1.ys; k=0;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (i!=i0)
{
x=p->x-p0.x;
y=p->y-p0.y;
l=sqrt((x*x)+(y*y));
if (abs(abs(x)-abs(y))*5<l) continue; // ignore diagonals
if (l<=vl) { k=i; vl=l; } // remember smallest distance
}
q=pnt.dat+k; j=k;
ux=q->x-p0.x;
uy=q->y-p0.y;
ul=sqrt((ux*ux)+(uy*uy));
// q,k find closest point to p0 not in u direction
vl=pic1.xs+pic1.ys; k=0;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (i!=i0)
{
x=p->x-p0.x;
y=p->y-p0.y;
l=sqrt((x*x)+(y*y));
if (abs(abs(x)-abs(y))*5<l) continue; // ignore diagonals
if (abs((100*ux*y)/((x*uy)+1))>75) continue;// ignore directions parallel to u
if (l<=vl) { k=i; vl=l; } // remember smallest distance
}
q=pnt.dat+k;
vx=q->x-p0.x;
vy=q->y-p0.y;
vl=sqrt((vx*vx)+(vy*vy));
// normalize directions u -> +x, v -> +y
if (abs(ux)<abs(vx))
{
x=j ; j =k ; k =x;
x=ux; ux=vx; vx=x;
x=uy; uy=vy; vy=x;
x=ul; ul=vl; vl=x;
}
if (abs(vy)<abs(uy))
{
x=ux; ux=vx; vx=x;
x=uy; uy=vy; vy=x;
x=ul; ul=vl; vl=x;
}
x=1; y=1;
if (ux<0) { ux=-ux; uy=-uy; x=-x; }
if (vy<0) { vx=-vx; vy=-vy; y=-y; }
// set (u,v) encoded in n for already found points
p0.n=uv2n(0,0); // middle point
p0.ux=ux; p0.uy=uy;
p0.vx=vx; p0.vy=vy;
pnt.dat[i0]=p0;
p=pnt.dat+j; // p0 +/- u basis vector
p->n=uv2n(x,0);
p->ux=ux; p->uy=uy;
p->vx=vx; p->vy=vy;
p=pnt.dat+k; // p0 +/- v basis vector
p->n=uv2n(0,y);
p->ux=ux; p->uy=uy;
p->vx=vx; p->vy=vy;
// qi[k],ql[k] find closest point to p0
#define find_neighbor \
for (ql[k]=0x7FFFFFFF,qi[k]=-1,q=pnt.dat,j=0;j<pnt.num;j++,q++) \
{ \
x=q->x-p0.x; \
y=q->y-p0.y; \
l=(x*x)+(y*y); \
if (ql[k]>=l) { ql[k]=l; qi[k]=j; } \
}
// process all matched points
for (e=1;e;)
for (e=0,p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (p->n)
{
// prepare variables
ul=(p->ux*p->ux)+(p->uy*p->uy);
vl=(p->vx*p->vx)+(p->vy*p->vy);
// find neighbors near predicted position p0
k=0; p0.x=p->x-p->ux; p0.y=p->y-p->uy; find_neighbor; if (ql[k]<<1>ul) qi[k]=-1; // u-1,v
k++; p0.x=p->x+p->ux; p0.y=p->y+p->uy; find_neighbor; if (ql[k]<<1>ul) qi[k]=-1; // u+1,v
k++; p0.x=p->x-p->vx; p0.y=p->y-p->vy; find_neighbor; if (ql[k]<<1>vl) qi[k]=-1; // u,v-1
k++; p0.x=p->x+p->vx; p0.y=p->y+p->vy; find_neighbor; if (ql[k]<<1>vl) qi[k]=-1; // u,v+1
// update local u,v basis vectors for found points (and remember them)
n2uv(p->n); ux=p->ux; uy=p->uy; vx=p->vx; vy=p->vy;
k=0; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->n) { e=1; q->n=uv2n(u-1,v); q->ux=-(q->x-p->x); q->uy=-(q->y-p->y); } ux=q->ux; uy=q->uy; }
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->n) { e=1; q->n=uv2n(u+1,v); q->ux=+(q->x-p->x); q->uy=+(q->y-p->y); } ux=q->ux; uy=q->uy; }
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->n) { e=1; q->n=uv2n(u,v-1); q->vx=-(q->x-p->x); q->vy=-(q->y-p->y); } vx=q->vx; vy=q->vy; }
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->n) { e=1; q->n=uv2n(u,v+1); q->vx=+(q->x-p->x); q->vy=+(q->y-p->y); } vx=q->vx; vy=q->vy; }
// copy remembered local u,v basis vectors to points where those are missing
k=0; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->vy) { q->vx=vx; q->vy=vy; }}
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->vy) { q->vx=vx; q->vy=vy; }}
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->ux) { q->ux=ux; q->uy=uy; }}
k++; if (qi[k]>=0) { q=pnt.dat+qi[k]; if (!q->ux) { q->ux=ux; q->uy=uy; }}
}
// find min,max (u,v)
ux=0; uy=0; vx=0; vy=0;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (p->n)
{
n2uv(p->n);
if (ux>u) ux=u;
if (vx>v) vx=v;
if (uy<u) uy=u;
if (vy<v) vy=v;
}
// normalize (u,v)+enlarge and create topology table
us=uy-ux+1;
vs=vy-vx+1;
uv=new int*[us];
for (u=0;u<us;u++) uv[u]=new int[vs];
for (u=0;u<us;u++)
for (v=0;v<vs;v++)
uv[u][v]=-1;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
if (p->n)
{
n2uv(p->n);
u-=ux; v-=vx;
p->n=uv2n(u,v);
uv[u][v]=i;
}
// bi-cubic interpolation
double a0,a1,a2,a3,d1,d2,pp[4],qx[4],qy[4],t,fu,fv,fx,fy;
// compute cubic curve coefficients a0..a3 from 1D points pp[0..3]
#define cubic_init { d1=0.5*(pp[2]-pp[0]); d2=0.5*(pp[3]-pp[1]); a0=pp[1]; a1=d1; a2=(3.0*(pp[2]-pp[1]))-(2.0*d1)-d2; a3=d1+d2+(2.0*(-pp[2]+pp[1])); }
// compute cubic curve coordinates = f(t)
#define cubic_xy (a0+(a1*t)+(a2*t*t)+(a3*t*t*t));
// safe access to grid point (u,v); copies it to p0
// points outside the grid are extrapolated from the nearest edge point
#define point_uv(u,v) \
{ \
if ((u>=0)&&(u<us)&&(v>=0)&&(v<vs)) p0=pnt.dat[uv[u][v]]; \
else{ \
int uu=u,vv=v; \
if (uu<0) uu=0; \
if (uu>=us) uu=us-1; \
if (vv<0) vv=0; \
if (vv>=vs) vv=vs-1; \
p0=pnt.dat[uv[uu][vv]]; \
uu=u-uu; vv=v-vv; \
p0.x+=(uu*p0.ux)+(vv*p0.vx); \
p0.y+=(uu*p0.uy)+(vv*p0.vy); \
} \
}
//----------------------------------------
//--- Debug draws: -----------------------
//----------------------------------------
// debug recolor white to gray to emphasize debug render
pic1.recolor(0x00FFFFFF,0x00404040);
// debug draw basis vectors
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
{
pic1.bmp->Canvas->Pen->Color=clRed;
pic1.bmp->Canvas->Pen->Width=1;
pic1.bmp->Canvas->MoveTo(p->x,p->y);
pic1.bmp->Canvas->LineTo(p->x+p->ux,p->y+p->uy);
pic1.bmp->Canvas->Pen->Color=clBlue;
pic1.bmp->Canvas->MoveTo(p->x,p->y);
pic1.bmp->Canvas->LineTo(p->x+p->vx,p->y+p->vy);
pic1.bmp->Canvas->Pen->Width=1;
}
// debug draw crossings
AnsiString s;
pic1.bmp->Canvas->Font->Height=12;
pic1.bmp->Canvas->Brush->Style=bsClear;
for (p=pnt.dat,i=0;i<pnt.num;i++,p++)
{
n2uv(p->n);
if (p->n)
{
pic1.bmp->Canvas->Font->Color=clWhite;
s=AnsiString().sprintf("%i,%i",u,v);
}
else{
pic1.bmp->Canvas->Font->Color=clGray;
s=i;
}
x=p->x-(pic1.bmp->Canvas->TextWidth(s)>>1);
y=p->y-(pic1.bmp->Canvas->TextHeight(s)>>1);
pic1.bmp->Canvas->TextOutA(x,y,s);
}
pic1.bmp->Canvas->Brush->Style=bsSolid;
pic1.save("out_topology.png");
// debug draw of bi-cubic interpolation fit/coverage with half square step
pic1=pic0;
pic1.treshold_AND(0,200,0x40,0); // binarize (remove gray shades)
pic1.bmp->Canvas->Pen->Color=clAqua;
pic1.bmp->Canvas->Brush->Color=clBlue;
for (fu=-1;fu<double(us)+0.01;fu+=0.5)
for (fv=-1;fv<double(vs)+0.01;fv+=0.5)
{
u=floor(fu);
v=floor(fv);
// 4x cubic curve in v direction
t=fv-double(v);
for (i=0;i<4;i++)
{
point_uv(u-1+i,v-1); pp[0]=p0.x;
point_uv(u-1+i,v+0); pp[1]=p0.x;
point_uv(u-1+i,v+1); pp[2]=p0.x;
point_uv(u-1+i,v+2); pp[3]=p0.x;
cubic_init; qx[i]=cubic_xy;
point_uv(u-1+i,v-1); pp[0]=p0.y;
point_uv(u-1+i,v+0); pp[1]=p0.y;
point_uv(u-1+i,v+1); pp[2]=p0.y;
point_uv(u-1+i,v+2); pp[3]=p0.y;
cubic_init; qy[i]=cubic_xy;
}
// 1x cubic curve in u direction on the resulting 4 points
t=fu-double(u);
for (i=0;i<4;i++) pp[i]=qx[i]; cubic_init; fx=cubic_xy;
for (i=0;i<4;i++) pp[i]=qy[i]; cubic_init; fy=cubic_xy;
t=1.0;
pic1.bmp->Canvas->Ellipse(fx-t,fy-t,fx+t,fy+t);
}
pic1.save("out_fit.png");
// linearizing of original image
DWORD col;
double grid_size=32.0; // linear grid square size in pixels
double grid_step=0.01; // u,v step <= 1 pixel
pic1.resize((us+1)*grid_size,(vs+1)*grid_size); // resize target image
pic1.clear(0); // clear target image
for (fu=-1;fu<double(us)+0.01;fu+=grid_step) // copy/transform source image to target
for (fv=-1;fv<double(vs)+0.01;fv+=grid_step)
{
u=floor(fu);
v=floor(fv);
// 4x cubic curve in v direction
t=fv-double(v);
for (i=0;i<4;i++)
{
point_uv(u-1+i,v-1); pp[0]=p0.x;
point_uv(u-1+i,v+0); pp[1]=p0.x;
point_uv(u-1+i,v+1); pp[2]=p0.x;
point_uv(u-1+i,v+2); pp[3]=p0.x;
cubic_init; qx[i]=cubic_xy;
point_uv(u-1+i,v-1); pp[0]=p0.y;
point_uv(u-1+i,v+0); pp[1]=p0.y;
point_uv(u-1+i,v+1); pp[2]=p0.y;
point_uv(u-1+i,v+2); pp[3]=p0.y;
cubic_init; qy[i]=cubic_xy;
}
// 1x cubic curve in u direction on the resulting 4 points
t=fu-double(u);
for (i=0;i<4;i++) pp[i]=qx[i]; cubic_init; fx=cubic_xy; x=fx;
for (i=0;i<4;i++) pp[i]=qy[i]; cubic_init; fy=cubic_xy; y=fy;
// here (x,y) contains the source image coordinates corresponding to grid (fu,fv) so copy that pixel to col
col=0; if ((x>=0)&&(x<pic0.xs)&&(y>=0)&&(y<pic0.ys)) col=pic0.p[y][x].dd;
// compute linear image coordinates (x,y) by scaling (fu,fv)
fx=(fu+1.0)*grid_size; x=fx;
fy=(fv+1.0)*grid_size; y=fy;
// copy col to it
if ((x>=0)&&(x<pic1.xs)&&(y>=0)&&(y<pic1.ys)) pic1.p[y][x].dd=col;
}
pic1.save("out_linear.png");
// release memory and cleanup macros
for (u=0;u<us;u++) delete[] uv[u]; delete[] uv;
#undef uv2n
#undef n2uv
#undef find_neighbor
#undef cubic_init
#undef cubic_xy
#undef point_uv
}
//---------------------------------------------------------------------------
Sorry, I know it's a lot of code, but at least I commented it as much as I could. The code is not optimized, for the sake of simplicity and understandability; the final image linearization can be written a lot faster. Also, I chose the grid_size and grid_step in that part of the code manually. They should instead be computed from the image and known physical properties.
I use my own picture class for images, so some of its members are:
xs,ys size of image in pixels
p[y][x].dd is pixel at (x,y) position as 32 bit integer type
clear(color) - clears entire image
resize(xs,ys) - resizes image to new resolution
bmp - VCL encapsulated GDI Bitmap with Canvas access
I also use my own dynamic list template, so:
List<double> xxx; is the same as double xxx[];
xxx.add(5); adds 5 to end of the list
xxx[7] access array element (safe)
xxx.dat[7] access array element (unsafe but fast direct access)
xxx.num is the actual used size of the array
xxx.reset() clears the array and set xxx.num=0
xxx.allocate(100) preallocate space for 100 items
Here are the intermediate output images. To make the approach more robust I changed the input image to a more distorted one:
To make it visually more pleasing I recolored the white to gray. Red lines are the local u basis vectors and blue lines are the local v basis vectors. White 2D vector numbers are the topology (u,v) coordinates, and gray scalar numbers are the crossing's index in pnt[] for points not yet matched to the topology.
[Notes]
This approach will not work for rotations near 45 degrees. For such cases you need to change the crossing detection from a cross to a plus pattern, and the topology conditions and equations change a bit, not to mention the u,v direction selection.
EDIT: I found out that all the pixels were upside down because of the difference between screen and world coordinates, so that is no longer a problem.
EDIT: After following a suggestion from #TheVee (using absolute values), my image got much better, but I'm still seeing issues with color.
I am having a little trouble with ray-tracing triangles. This is a follow-up to my previous question on the same topic. The answers to that question made me realize that I needed to take a different approach. The new approach I took worked much better, but I'm seeing a couple of issues with my raytracer now:
There is one triangle that never renders in color (it is always black, even though its color is supposed to be yellow).
Here is what I am expecting to see:
But here is what I am actually seeing:
Regarding the first problem: even if I remove all of the other objects (including the blue triangle), the yellow triangle is always rendered black, so I don't believe it is an issue with the shadow rays that I am sending out. I suspect that it has to do with the angle of the triangle/plane relative to the camera.
Here is my process for ray-tracing triangles, which is based on the process described on this website.
Determine if the ray intersects the plane.
If it does, determine if the ray intersects inside of the triangle (using parametric coordinates).
Here is the code for determining if the ray hits the plane:
private Vector getPlaneIntersectionVector(Ray ray)
{
double epsilon = 0.00000001;
Vector w0 = ray.getOrigin().subtract(getB());
double numerator = -(getPlaneNormal().dotProduct(w0));
double denominator = getPlaneNormal().dotProduct(ray.getDirection());
//ray is parallel to triangle plane
if (Math.abs(denominator) < epsilon)
{
//ray lies in triangle plane
if (numerator == 0)
{
return null;
}
//ray is disjoint from plane
else
{
return null;
}
}
double intersectionDistance = numerator / denominator;
//intersectionDistance < 0 means the "intersection" is behind the ray (pointing away from plane), so not a real intersection
return (intersectionDistance >= 0) ? ray.getLocationWithMagnitude(intersectionDistance) : null;
}
And once I have determined that the ray intersects the plane, here is the code to determine if the ray is inside the triangle:
private boolean isIntersectionVectorInsideTriangle(Vector planeIntersectionVector)
{
//Get edges of triangle
Vector u = getU();
Vector v = getV();
//Pre-compute unique five dot-products
double uu = u.dotProduct(u);
double uv = u.dotProduct(v);
double vv = v.dotProduct(v);
Vector w = planeIntersectionVector.subtract(getB());
double wu = w.dotProduct(u);
double wv = w.dotProduct(v);
double denominator = (uv * uv) - (uu * vv);
//get and test parametric coordinates
double s = ((uv * wv) - (vv * wu)) / denominator;
if (s < 0 || s > 1)
{
return false;
}
double t = ((uv * wu) - (uu * wv)) / denominator;
if (t < 0 || (s + t) > 1)
{
return false;
}
return true;
}
I think that I am having some issue with my coloring. I think that it has to do with the normals of the various triangles. Here is the equation I am considering when building my lighting model for spheres and triangles:
Now, here is the code that does this:
public Color calculateIlluminationModel(Vector normal, boolean isInShadow, Scene scene, Ray ray, Vector intersectionPoint)
{
//c = cr * ca + cr * cl * max(0, n . l) + cl * cp * max(0, e . r)^p
Vector lightSourceColor = getColorVector(scene.getLightColor()); //cl
Vector diffuseReflectanceColor = getColorVector(getMaterialColor()); //cr
Vector ambientColor = getColorVector(scene.getAmbientLightColor()); //ca
Vector specularHighlightColor = getColorVector(getSpecularHighlight()); //cp
Vector directionToLight = scene.getDirectionToLight().normalize(); //l
double angleBetweenLightAndNormal = directionToLight.dotProduct(normal);
Vector reflectionVector = normal.multiply(2).multiply(angleBetweenLightAndNormal).subtract(directionToLight).normalize(); //r
double visibilityTerm = isInShadow ? 0 : 1;
Vector ambientTerm = diffuseReflectanceColor.multiply(ambientColor);
double lambertianComponent = Math.max(0, angleBetweenLightAndNormal);
Vector diffuseTerm = diffuseReflectanceColor.multiply(lightSourceColor).multiply(lambertianComponent).multiply(visibilityTerm);
double angleBetweenEyeAndReflection = scene.getLookFrom().dotProduct(reflectionVector);
angleBetweenEyeAndReflection = Math.max(0, angleBetweenEyeAndReflection);
double phongComponent = Math.pow(angleBetweenEyeAndReflection, getPhongConstant());
Vector phongTerm = lightSourceColor.multiply(specularHighlightColor).multiply(phongComponent).multiply(visibilityTerm);
return getVectorColor(ambientTerm.add(diffuseTerm).add(phongTerm));
}
I am seeing that the dot product between the normal and the light source is -1 for the yellow triangle, and about -0.707 for the blue triangle, so I'm not sure if the normal pointing the wrong way is the problem. Regardless, when I made sure the angle between the light and the normal was positive (Math.abs(directionToLight.dotProduct(normal));), it caused the opposite problem:
I suspect that it will be a small typo/bug, but I need another pair of eyes to spot what I couldn't.
Note: My triangles have vertices (a,b,c), and the edges (u,v) are computed using a-b and c-b respectively (those are also used for calculating the plane/triangle normal). A Vector is made up of an (x,y,z) point, and a Ray is made up of an origin Vector and a normalized direction Vector.
Here is how I am calculating normals for all triangles:
private Vector getPlaneNormal()
{
Vector v1 = getU();
Vector v2 = getV();
return v1.crossProduct(v2).normalize();
}
Please let me know if I left out anything that you think is important for solving these issues.
EDIT: After help from #TheVee, this is what I have at the end:
There are still problems with z-buffering and with Phong highlights on the triangles, but the problem I was trying to solve here was fixed.
It is a common problem in ray tracing of scenes that include planar objects that we hit them from the wrong side. The formulas containing the dot product are presented with an inherent assumption that light is incident on the object from the direction towards which the outward-facing normal points. This can be true only for half the possible orientations of your triangle, and you've been unlucky enough to orient it with its normal facing away from the light.
Technically speaking, in a physical world your triangle would not have zero volume. It's composed of some layer of material which is just thin. On either side it has a proper normal that points outside. Assigning a single normal is a simplification that's fair to take because the two only differ in sign.
However, if we make a simplification, we need to account for it. Having what is technically an inward-facing normal in our formulas gives negative dot products, a case they are not made for. It's as if light were coming from the inside of the object, or hit a surface that could not possibly be in its way. That's why they give an erroneous result. The negative value will subtract light from other sources, and depending on the magnitude and implementation may result in darkening, full black, or numerical underflow.
But because we know the correct normal is either the one we're using or its negative, we can fix both cases at once by taking a preventive absolute value wherever a positive dot product is implicitly assumed (in your code, that's angleBetweenLightAndNormal). Some libraries like OpenGL do that for you, and on top of that use the additional information (the sign) to choose between two different materials (front and back) that you may provide if desired. Alternatively, they can be set not to draw the back faces at all, because in solid objects they will be overdrawn by front faces anyway (known as face culling), saving about half of the numerical work.
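Concretely, a minimal sketch of that fix in the question's own terms (assuming the Vector.multiply(double) used above) is to flip the normal toward the light before evaluating the terms:
// inside calculateIlluminationModel, before computing the lighting terms:
double angleBetweenLightAndNormal = directionToLight.dotProduct(normal);
if (angleBetweenLightAndNormal < 0) {
    // lit from the back side: flip the thin surface's normal so the
    // dot products see the front side
    normal = normal.multiply(-1);
    angleBetweenLightAndNormal = -angleBetweenLightAndNormal;
}
Flipping the normal (rather than only taking the absolute value) keeps the reflection vector consistent for the specular term as well.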
Basically, I have a binary image that contains an (irregular) object. I applied the contours and moments functions to find the center of mass and detect the object in this image.
What I want to do now is to generate lines (at different angles) passing through the center of mass to the edge of the contour, and find which line is the longest.
Any help regarding this matter would be appreciated.
Assuming the lines are drawn from the center of mass to the perimeter of the mass, instead of using test angles just use the contour points themselves and perform a distance calculation on each of them. See below for an example.
(The example code is in C++ and the question tag is java, I will get burned one day for this.)
Mat GrayImage; // input grayscale image, set to something
Mat ContourImage;
Mat DrawImage = Mat::zeros(GrayImage.size(), CV_8UC3);
int thresh = 90;
// get a threshold image using Canny edge detector
Canny(GrayImage, ContourImage, thresh, thresh * 2, 3);
vector< vector<Point> > contours;
vector<Vec4i> hierarchy;
findContours(ContourImage, contours, hierarchy, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);// retrieves external contours, CHANGES THRESHOLD IMAGE
vector<Point2f> centerofMass(contours.size());
vector<Moments> mu(contours.size());
// for every contour in the image
for (int i = 0; i < contours.size(); i++)
{
// draw the contour
drawContours(DrawImage, contours, i, Scalar(200, 54, 120), 2, 8, hierarchy, 0, Point());
//calculate center of mass
mu[i] = moments(contours[i],false);
centerofMass[i] = Point2f(mu[i].m10 / mu[i].m00, mu[i].m01 / mu[i].m00);
double biggestDistance = 0;
Point2f farthest_Perimeter_Point;
// for each point in the current contour
for (int j = 0; j < contours[i].size(); j++)
{
// get a point that makes up the contour
Point2f perimeterofMass(contours[i][j].x, contours[i][j].y);
//calculate the distance
double dist = cv::norm(centerofMass[i] - perimeterofMass);
// determine farthest point
if (dist > biggestDistance)
{
biggestDistance = dist;
farthest_Perimeter_Point = perimeterofMass;
}
}
// now we have farthest point as farthest_Perimeter_Point for the current contour at [i]
//draw the line because why not;
line(DrawImage, centerofMass[i], farthest_Perimeter_Point, Scalar(145, 123, 201));
}
imshow("grayimage", GrayImage);
imshow("thresholdimage", ContourImage);
imshow("drawimage", DrawImage);
The other assumption would be that the lines are drawn from one starting point on the mass's perimeter, through the center, to the other side of the mass. First, start with one point on the perimeter and form the line equation in point-slope form using that point and the center point. Second, find where this line intersects the other side; now you can calculate the distance. Third, determine the largest distance among these lines. A sketch of this follows.
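A hedged O(n^2) sketch of that second approach in Java (pts, cx, cy and tol are illustrative names; tol is a small collinearity tolerance such as 0.05):
double longest = 0;
int bestA = -1, bestB = -1;
for (int a = 0; a < pts.length; a++) {
    double ax = pts[a].x - cx, ay = pts[a].y - cy;
    for (int b = a + 1; b < pts.length; b++) {
        double bx = pts[b].x - cx, by = pts[b].y - cy;
        if (ax * bx + ay * by >= 0) continue; // must lie on opposite sides of the center
        double la = Math.hypot(ax, ay), lb = Math.hypot(bx, by);
        double sin = Math.abs(ax * by - ay * bx) / (la * lb); // sine of the angle between the two radii
        if (sin > tol) continue; // not collinear enough with the center
        if (la + lb > longest) { longest = la + lb; bestA = a; bestB = b; }
    }
}
// pts[bestA] and pts[bestB] now span the longest chord through the center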
References:
Related OpenCV question
Related OpenCV example
I have a set of rectangles and I would like to "reduce" the set so I have the fewest number of rectangles to describe the same area as the original set. If possible, I would like it to also be fast, but I am more concerned with getting the number of rectangles as low as possible. I have an approach now which works most of the time.
Currently, I start at the top-left most rectangle and see if I can expand it out right and down while keeping it a rectangle. I do that until it can't expand anymore, remove and split all intersecting rectangles, and add the expanded rectangle back in the list. Then I start the process again with the next top-left most rectangle, and so on. But in some cases, it doesn't work. For example:
With this set of three rectangles, the correct solution would end up with two rectangles, like this:
However, in this case, my algorithm starts by processing the blue rectangle. It expands downwards and splits the yellow rectangle (correctly). But then, when the remainder of the yellow rectangle is processed, instead of expanding downwards it expands right first and takes back the portion that was previously split off. Then the last rectangle is processed and it can't expand right or down, so the original set of rectangles is left. I could tweak the algorithm to expand down first and then right. That would fix this case, but it would cause the same problem in a similar scenario that was flipped.
Edit: Just to clarify, the original set of rectangles do not overlap and do not have to be connected. And if a subset of rectangles are connected, the polygon which completely covers them can have holes in it.
Despite the title of your question, I think you’re actually looking for the minimum dissection into rectangles of a rectilinear polygon. (Jason’s links are about minimum covers by rectangles, which is quite a different problem.)
David Eppstein discusses this problem in section 3 of his 2010 survey article Graph-Theoretic Solutions to Computational Geometry Problems, and he gives a nice summary in this answer on mathoverflow.net:
The idea is to find the maximum number of disjoint axis-parallel diagonals that have two concave vertices as endpoints, split along those, and then form one more split for each remaining concave vertex. To find the maximum number of disjoint axis-parallel diagonals, form the intersection graph of the diagonals; this graph is bipartite so its maximum independent set can be found in polynomial time by graph matching techniques.
Here’s my gloss on this admirably terse description, using figure 2 from Eppstein’s article. Suppose we have a rectilinear polygon, possibly with holes.
When the polygon is dissected into rectangles, each of the concave vertices must be met by at least one edge of the dissection. So we get the minimum dissection if as many of these edges as possible do double-duty, that is, they connect two of the concave vertices.
So let’s draw the axis-parallel diagonals between two concave vertices that are contained entirely within the polygon. (‘Axis-parallel’ means ‘horizontal or vertical’ here, and a diagonal of a polygon is a line connecting two non-adjacent vertices.) We want to use as many of these lines as possible in the dissection as long as they don’t intersect.
(If there are no axis-parallel diagonals, the dissection is trivial—just make a cut from each concave vertex. Or if there are no intersections between the axis-parallel diagonals then we use them all, plus a cut from each remaining concave vertex. Otherwise, read on.)
The intersection graph of a set of line segments has a node for every line segment, and an edge joins two nodes if the lines cross. Here’s the intersection graph for the axis-parallel diagonals:
It’s bipartite with the vertical diagonals in one part, and the horizontal diagonals in the other part. Now, we want to pick as many of the diagonals as possible as long as they don’t intersect. This corresponds to finding the maximum independent set in the intersection graph.
Finding the maximum independent set in a general graph is an NP-hard problem, but in the special case of a bipartite graph, König’s theorem shows that it’s equivalent to the problem of finding a maximum matching, which can be solved in polynomial time, for example by the Hopcroft–Karp algorithm. A given graph can have several maximum matchings, but any of them will do, as they all have the same size. In the example, all the maximum matchings have three pairs of vertices, for example {(2, 4), (6, 3), (7, 8)}:
(Other maximum matchings in this graph include {(1, 3), (2, 5), (7, 8)}; {(2, 4), (3, 6), (5, 7)}; and {(1, 3), (2, 4), (7, 8)}.)
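For reference, because these intersection graphs are small, a simple augmenting-path algorithm (Kuhn's algorithm) is enough to find a maximum matching in practice; here is a Java sketch, where adj[u] lists the horizontal diagonals crossed by vertical diagonal u (illustrative names, not from the article):
// returns the size of a maximum matching; matchOfRight[v] ends up holding
// the left vertex matched to right vertex v, or -1 if v is unmatched
static int maxMatching(int[][] adj, int nRight) {
    int[] matchOfRight = new int[nRight];
    java.util.Arrays.fill(matchOfRight, -1);
    int size = 0;
    for (int u = 0; u < adj.length; u++)
        if (tryAugment(u, adj, matchOfRight, new boolean[nRight])) size++;
    return size;
}

static boolean tryAugment(int u, int[][] adj, int[] matchOfRight, boolean[] seen) {
    for (int v : adj[u]) {
        if (seen[v]) continue;
        seen[v] = true;
        // take v if it is free, or try to re-route its current partner
        if (matchOfRight[v] == -1 || tryAugment(matchOfRight[v], adj, matchOfRight, seen)) {
            matchOfRight[v] = u;
            return true;
        }
    }
    return false;
}
Hopcroft–Karp, mentioned above, does the same job with a better worst-case bound.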
To get from a maximum matching to the corresponding minimum vertex cover, apply the proof of König’s theorem. In the matching shown above, the left set is L = {1, 2, 6, 7}, the right set is R = {3, 4, 5, 8}, and the set of unmatched vertices in L is U = {1}. There is only one alternating path starting in U, namely 1–3–6, so the set of vertices in alternating paths is Z = {1, 3, 6} and the minimum vertex cover is thus K = (L \ Z) ∪ (R ∩ Z) = {2, 3, 7}, shown in red below, with the maximum independent set in green:
Translating this back into the dissection problem, this means that we can use five axis-parallel diagonals in the dissection:
Finally, make a cut from each remaining concave vertex to complete the dissection:
Today I found an O(N^5) solution to this problem, and I will share it here.
For the first step, you need a way to get the sum of any rectangle in the matrix with O(1) complexity. It's pretty easy to do; see the sketch below.
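For completeness, the standard trick is a 2D prefix-sum table, which is exactly what sumed/subrec do in the code below; a small Java sketch:
int[][] S = new int[N][M]; // S[i][j] = sum of tab[0..i][0..j]
for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
        S[i][j] = tab[i][j]
                + (i > 0 ? S[i - 1][j] : 0)
                + (j > 0 ? S[i][j - 1] : 0)
                - (i > 0 && j > 0 ? S[i - 1][j - 1] : 0);
// sum of tab[x1..x2][y1..y2] by inclusion-exclusion, O(1) per query
// (terms with a negative index count as 0):
// S[x2][y2] - S[x1-1][y2] - S[x2][y1-1] + S[x1-1][y1-1]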
Now for the second step, you need to know dynamic programming. The idea is to memoize the answer for each rectangle and break the rectangle into smaller pieces. If the rectangle is empty, you can return 0; and if it's completely filled, return 1.
There are N^4 states to describe a rectangle, times the O(N) work for each state... so you get an O(N^5) algorithm.
Here's my code. I think it will help.
The input is simple: N, M (the size of the matrix).
After that, the following N lines each hold M characters, 1s and 0s.
Example:
4 9
010000010
111010111
101111101
000101000
#include <bits/stdc++.h>
#define MAX 51
int tab[MAX][MAX];
int N,M;
int sumed[MAX][MAX];
int t(int x,int y) {
if(x<0||y<0)return 0;
return sumed[x][y];
}
int subrec(int x1,int y1,int x2,int y2) {
return t(x2,y2)-t(x2,y1-1)-t(x1-1,y2)+t(x1-1,y1-1);
}
int resp[MAX][MAX][MAX][MAX];
bool exist[MAX][MAX][MAX][MAX];
int dp(int x1,int y1,int x2,int y2) {
if(exist[x1][y1][x2][y2])return resp[x1][y1][x2][y2];
exist[x1][y1][x2][y2]=true;
int soma = subrec(x1,y1,x2,y2);
int area = (x2-x1+1)*(y2-y1+1);
if(soma==area){return resp[x1][y1][x2][y2]=1;}
if(!soma) {return 0;}
int best = 1000000;
for(int i = x1;i!=x2;++i) {
best = std::min(best,dp(x1,y1,i,y2)+dp(i+1,y1,x2,y2));
}
for(int i = y1;i!=y2;++i) {
best = std::min(best,dp(x1,y1,x2,i)+dp(x1,i+1,x2,y2));
}
return resp[x1][y1][x2][y2]=best;
}
void backtracking(int x1,int y1,int x2,int y2) {
int soma = subrec(x1,y1,x2,y2);
int area = (x2-x1+1)*(y2-y1+1);
if(soma==area){std::cout<<x1+1<<" "<<y1+1<<" "<<x2+1<<" "<<y2+1<<"\n";return;}
if(!soma) {return;}
int best = 1000000;
int obj = resp[x1][y1][x2][y2];
for(int i = x1;i!=x2;++i) {
int ans = dp(x1,y1,i,y2)+dp(i+1,y1,x2,y2);
if(ans==obj){
backtracking(x1,y1,i,y2);
backtracking(i+1,y1,x2,y2);
return;
}
}
for(int i = y1;i!=y2;++i) {
int ans = dp(x1,y1,x2,i)+dp(x1,i+1,x2,y2);
if(ans==obj){
backtracking(x1,y1,x2,i);
backtracking(x1,i+1,x2,y2);
return;
}
}
}
int main()
{
std::cin >> N >> M;
for(int i = 0; i != N;++i) {
std::string s;
std::cin >> s;
for(int j = 0; j != M;++j) {
if(s[j]=='1')tab[i][j]++;
}
}
for(int i = 0; i != N;++i) {
int val = 0;
for(int j = 0; j != M;++j) {
val += tab[i][j];
sumed[i][j]=val;
if(i)sumed[i][j]+=sumed[i-1][j];
}
}
std::cout << dp(0,0,N-1,M-1) << std::endl;
backtracking(0,0,N-1,M-1);
}
Imagine you have a number of lines (each one represented by two points). You also have a rectangle of a specific size, and you know the coordinates of its upper left corner. Now you have to identify which of these lines intersect the rectangle, and for all those that do, find the regions the lines create inside the rectangle and calculate the areas of those regions.
Here is a simple algorithm which can be improved with deeper thought:
Use a line clipping algorithm to clip the lines to the rectangle.
Line clipping
Use a flood fill algorithm for getting the different regions & areas (see the sketch after this list).
Flood Fill
Use a convex hull on each region to get the vertices of the regions.
Graham Scan for convex hull
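A minimal Java sketch of the flood-fill step on a discrete grid (an assumption for illustration: grid[y][x] == 0 means an unclaimed interior cell, nonzero means a drawn line or an already labeled region):
static int floodFillArea(int[][] grid, int startX, int startY, int label) {
    int h = grid.length, w = grid[0].length, area = 0;
    java.util.ArrayDeque<int[]> stack = new java.util.ArrayDeque<>();
    stack.push(new int[] { startX, startY });
    while (!stack.isEmpty()) {
        int[] p = stack.pop();
        int x = p[0], y = p[1];
        if (x < 0 || y < 0 || x >= w || y >= h || grid[y][x] != 0) continue;
        grid[y][x] = label; // claim the cell for this region
        area++;
        stack.push(new int[] { x + 1, y });
        stack.push(new int[] { x - 1, y });
        stack.push(new int[] { x, y + 1 });
        stack.push(new int[] { x, y - 1 });
    }
    return area;
}
Calling this once per unclaimed cell labels every region and returns its area in cells.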
Edit:
If flood fill needs to be avoided, or the coordinate system is not discrete, then use the following:
Find all intersection points inside or on the rectangle made by the lines.
Construct a graph from the intersections such that there is an undirected edge between two intersections if they both lie on some common line within the rectangle, with the distance between them as the edge weight. Only construct edges between the closest pairs on a given line; this can be done by sorting all the intersections along each line and adding an edge between each consecutive pair in the sorted sequence.
Use the following to get all the polygons:
Find_polygon(vertex u,int iter,vertex[] path) {
if(!visited[u]) {
visited[u] = true;
path[iter] = u;
if(iter==1) {
source = u;
for all edge(u,v)
Find_polygon(v,iter+1,path);
}
else {
for all edge(u,v) {
if(slope(u,v)!=slope(path[iter-1],u)) {
Find_polygon(v,iter+1,path);
}
}
}
}
else { //loop
index = findIndex(u,path); // can use array for O(1)
polygons.add(path[index to iteration])
}
}
polygons = [];
for all vertices v in graph :
    Find_polygon(v, 1, new path); // iter starts at 1 with a fresh path for each start vertex
Given a function Intersect(Polygon, Line) -> List<Polygon> that intersects a convex polygon with a line and returns a list of polygons (containing only the original polygon if the line does not intersect it, or the two resulting polygons if the line does divide the original one), you can do something like the following to get all the resulting polygons inside the rectangle:
List<Polygon> Divide(Rectangle rect, List<Line> lines)
{
// initialize result list with given rectangle as polygon
List<Polygon> polys;
polys.add(Polygon(rect));
for (Line line: lines)
{
List<Polygon> polysNew;
for (Polygon poly: polys)
polysNew.addAll(Intersect(poly, line));
polys = polysNew;
}
return polys;
}
For calculating the area of the polygons see e.g. here.
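For a simple polygon given by its vertices in order, the standard shoelace formula does the job; a short sketch:
static double polygonArea(double[] xs, double[] ys) {
    double sum = 0;
    int n = xs.length;
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n; // next vertex, wrapping around
        sum += xs[i] * ys[j] - xs[j] * ys[i];
    }
    return Math.abs(sum) / 2.0;
}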
Hello, I am fairly new to programming, and I am trying, in Java, to create a function that recursively creates triangles from a larger triangle's midpoints between corners, where the new triangles' points are deviated from their normal position in the y-value. See the pictures below for a visualization.
The first picture shows the progression of the recursive algorithm without any deviation (orders 0, 1, 2) and the second picture shows it with deviation (orders 0, 1).
I have managed to produce a working piece of code that creates just what I want for the first couple of orders, but when we reach order 2 and above I run into the problem that the smaller triangles don't use the same midpoints, and it therefore looks like the picture below.
So I need help with a way to store and look up the correct midpoints for each of the triangles. I have been thinking of implementing a new class that computes and stores the midpoints, but as I said, I need help with this.
Below is my current code
The Point class stores an x and y value for a point.
lineBetween creates a line between the two selected points.
void fractalLine(TurtleGraphics turtle, int order, Point ett, Point tva, Point tre, int dev) {
if(order == 0){
lineBetween(ett,tva,turtle);
lineBetween(tva,tre,turtle);
lineBetween(tre,ett,turtle);
} else {
double deltaX = tva.getX() - ett.getX();
double deltaY = tva.getY() - ett.getY();
double deltaXtre = tre.getX() - ett.getX();
double deltaYtre = tre.getY() - ett.getY();
double deltaXtva = tva.getX() - tre.getX();
double deltaYtva = tva.getY() - tre.getY();
Point one;
Point two;
Point three;
double xt = ((deltaX/2))+ett.getX();
double yt = ((deltaY/2))+ett.getY() +RandomUtilities.randFunc(dev);
one = new Point(xt,yt);
xt = (deltaXtre/2)+ett.getX();
yt = (deltaYtre/2)+ett.getY() +RandomUtilities.randFunc(dev);
two = new Point(xt,yt);
xt = ((deltaXtva/2))+tre.getX();
yt = ((deltaYtva/2))+tre.getY() +RandomUtilities.randFunc(dev);
three = new Point(xt,yt);
fractalLine(turtle,order-1,one,tva,three,dev/2);
fractalLine(turtle,order-1,ett,one,two,dev/2);
fractalLine(turtle,order-1,two,three,tre,dev/2);
fractalLine(turtle,order-1,one,two,three,dev/2);
}
}
Thanks in Advance
Victor
You can define a triangle by 3 points (vertices). So the vertices a, b, and c will form a triangle, and the combinations ab, ac and bc will be the edges. The algorithm goes:
First, start with the three vertices a, b and c.
Get the midpoints of the 3 edges, p1, p2 and p3, and form the 4 sets of vertices for the 4 smaller triangles, i.e. (a,p1,p2), (b,p1,p3), (c,p2,p3) and (p1,p2,p3).
Recursively find the sub-triangles of the 4 triangles until the depth is reached.
So as a rough guide, the code goes
findTriangles(Vertexes[] triangle, int currentDepth) {
//Depth is reached.
if(currentDepth == depth) {
store(triangle);
return;
}
Vertexes[] first = getFirstTriangle(triangle);
Vertexes[] second = getSecondTriangle(triangle);
Vertexes[] third = getThirdTriangle(triangle);
Vertexes[] fourth = getFourthTriangle(triangle);
findTriangles(first, currentDepth+1);
findTriangles(second, currentDepth+1);
findTriangles(third, currentDepth+1);
findTriangles(fourth, currentDepth+1);
}
You have to store the relevant triangles in a Data structure.
You compute the midpoint of each edge again and again in the different paths of your recursion. As long as you do not perturb them randomly, you get the same midpoint in every path, so there is no problem.
But of course, if you modify the midpoints randomly, you'll end up with two different midpoints in two different paths of the recursion.
You could modify your algorithm so that you not only pass the 3 corners of the triangle along, but also the modified midpoint of each edge. Or you keep the midpoints in a separate list or map, compute each one only once, and look it up otherwise.
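A hedged sketch of that lookup idea, reusing the question's Point and RandomUtilities names (and assuming java.util's Map/HashMap): cache the displaced midpoint per undirected edge, so every triangle sharing the edge gets the same point.
Map<String, Point> midpointCache = new HashMap<>();

Point displacedMidpoint(Point a, Point b, int dev) {
    // order the endpoints so (a,b) and (b,a) produce the same key
    boolean aFirst = a.getX() < b.getX()
            || (a.getX() == b.getX() && a.getY() <= b.getY());
    Point p = aFirst ? a : b, q = aFirst ? b : a;
    String key = p.getX() + "," + p.getY() + "|" + q.getX() + "," + q.getY();
    Point mid = midpointCache.get(key);
    if (mid == null) {
        mid = new Point((a.getX() + b.getX()) / 2,
                        (a.getY() + b.getY()) / 2 + RandomUtilities.randFunc(dev));
        midpointCache.put(key, mid);
    }
    return mid;
}
fractalLine would then call displacedMidpoint for one, two and three instead of computing them inline; since a shared edge is always split at the same recursion depth, both triangles pass the same dev and get the same cached point.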