Basically, I want to achieve this, and so far, I've written the following Java code...
// Display the camera frame
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
// The object's width and height are set to 0
objectWidth = objectHeight = 0;
// frame is captured as a coloured image
frame = inputFrame.rgba();
/** Since the Canny algorithm only works on greyscale images and the captured image is
* coloured, we transform the captured cam image into a greyscale one
*/
Imgproc.cvtColor(frame, grey, Imgproc.COLOR_RGB2GRAY);
// Calculating borders of image using the Canny algorithm
Imgproc.Canny(grey, canny, 180, 210);
/** To avoid background noise (given by the camera) that makes the system too sensitive
* to small variations, the image is blurred slightly. Blurring is one of the
* required steps before most image transformations because it eliminates small details
* that are of no use. Blur is a low-pass filter.
*/
Imgproc.GaussianBlur(canny, canny, new Size(5, 5), 5);
// Calculate the contours
Imgproc.findContours(canny, contours, new Mat(), Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
/** The contours come in different sequences:
* 1 sequence for each connected component.
* Assuming only 1 object is in view, any additional connected
* component will be considered part of the details of the object.
*
* For this reason, we put all contours together in a single sequence.
* If there is at least 1 contour, I can continue processing
*/
for (MatOfPoint mat : contours) {
// Retrieve and store all contours in one giant map
mat.copyTo(allContours);
}
MatOfPoint2f allCon = new MatOfPoint2f(allContours.toArray());
// Calculating the minimal rectangle to contain the contours
RotatedRect box = Imgproc.minAreaRect(allCon);
// Getting the vertices of the rectangle
Point[] vertices = initialiseWithDefaultPointInstances(4);
box.points(vertices);
// Now that the vertices are available, temporal smoothing can be performed.
for (int i = 0; i < 4; i++) {
// Smooth coordinate x of the vertex
vertices[i].x = alpha * lastVertices[i].x + (1.0 - alpha) * vertices[i].x;
// Smooth coordinate y of the vertex
vertices[i].y = alpha * lastVertices[i].y + (1.0 - alpha) * vertices[i].y;
// Assign the present smoothed values as lastVertices for the next smooth
lastVertices[i] = vertices[i];
}
/** With the vertices, the object size is calculated.
* The object size is calculated through Pythagoras' theorem, which gives
* the distance between 2 points in a two-dimensional space.
*
* For a rectangle, considering any vertex V, its two sides (width and height) can
* be calculated as the distance of V from the previous vertex and the distance
* of V from the next vertex. This is why I calculate the distance between
* vertices[0]/vertices[3] and vertices[0]/vertices[1]
*/
objectWidth = (int) (conversionFactor * Math.sqrt((vertices[0].x - vertices[3].x) * (vertices[0].x - vertices[3].x) + (vertices[0].y - vertices[3].y) * (vertices[0].y - vertices[3].y)));
objectHeight = (int) (conversionFactor * Math.sqrt((vertices[0].x - vertices[1].x) * (vertices[0].x - vertices[1].x) + (vertices[0].y - vertices[1].y) * (vertices[0].y - vertices[1].y)));
/** Draw the rectangle containing the contours. The line method draws a line from one
* point to the next and accepts only integer coordinates; this is why 2
* temporary Points are created and why the Math.round method is used.
*/
Point pt1 = new Point();
Point pt2 = new Point();
for (int i = 0; i < 4; i++) {
pt1.x = Math.round(vertices[i].x);
pt1.y = Math.round(vertices[i].y);
pt2.x = Math.round(vertices[(i + 1) % 4].x);
pt2.y = Math.round(vertices[(i + 1) % 4].y);
Imgproc.line(frame, pt1, pt2, red, 3);
}
//If the width and height are non-zero, then print the object size on-screen
if (objectWidth != 0 && objectHeight != 0) {
String text;
text = String.format("%d x %d", objectWidth, objectHeight);
widthValue.setText(text);
}
// This function must return
return frame;
}
// Initialising an array of points
public static Point[] initialiseWithDefaultPointInstances(int length) {
Point[] array = new Point[length];
for (int i = 0; i < length; i++) {
array[i] = new Point();
}
return array;
}
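The temporal smoothing in the loop above is an exponential moving average: each new vertex coordinate is blended with the previous frame's value. A minimal, OpenCV-free sketch of the same update rule (the alpha value of 0.5 is an assumption; the question never shows it):

```java
public class SmoothingDemo {
    // Exponential moving average: blend the previous value with the new measurement
    static double smooth(double last, double current, double alpha) {
        return alpha * last + (1.0 - alpha) * current;
    }

    public static void main(String[] args) {
        double alpha = 0.5;      // assumed smoothing factor
        double last = 100.0;     // last frame's vertex x coordinate
        double current = 110.0;  // this frame's noisy measurement
        // 0.5 * 100 + 0.5 * 110 = 105.0 -> the 10-pixel jump is halved
        System.out.println(smooth(last, current, alpha));
    }
}
```

Larger alpha values weight the history more heavily, trading jitter for lag.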
What I want to achieve is drawing a rectangle on-screen that contains the object's contours (edges). If anyone knows the answer to my question, please feel free to comment below; I have been stuck on this for a couple of hours.
Here's the code referenced in the comments on How to draw a rectangle containing an object in Android (Java, OpenCV):
public Mat onCameraFrame(CameraBridgeViewBase.CvCameraViewFrame inputFrame) {
// The object's width and height are set to 0
List<Integer> objectWidth = new ArrayList<>();
List<Integer> objectHeight = new ArrayList<>();
// frame is captured as a coloured image
Mat frame = inputFrame.rgba();
Mat gray = new Mat();
Mat canny = new Mat();
List<MatOfPoint> contours = new ArrayList<>();
/** Since the Canny algorithm only works on greyscale images and the captured image is
* coloured, we transform the captured cam image into a greyscale one
*/
Imgproc.cvtColor(frame, gray, Imgproc.COLOR_RGB2GRAY);
// Calculating borders of image using the Canny algorithm
Imgproc.Canny(gray, canny, 180, 210);
/** To avoid background noise (given by the camera) that makes the system too sensitive
* to small variations, the image is blurred slightly. Blurring is one of the
* required steps before most image transformations because it eliminates small details
* that are of no use. Blur is a low-pass filter.
*/
Imgproc.GaussianBlur(canny, canny, new Size(5, 5), 5);
// Calculate the contours
Imgproc.findContours(canny, contours, new Mat(), Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
/** The contours come in different sequences:
* 1 sequence for each connected component.
* Assuming only 1 object is in view, any additional connected
* component will be considered part of the details of the object.
*
* For this reason, we put all contours together in a single sequence.
* If there is at least 1 contour, I can continue processing
*/
if(contours.size() > 0){
// Calculating the minimal rectangle to contain the contours
List<RotatedRect> boxes = new ArrayList<>();
for(MatOfPoint contour : contours){
RotatedRect box = Imgproc.minAreaRect(new MatOfPoint2f(contour.toArray()));
boxes.add(box);
}
// Getting the vertices of the rectangle
List<Point[]> vertices = initialiseWithDefaultPointInstances(boxes.size(), 4);
for(int i=0; i<boxes.size(); i++){
boxes.get(i).points(vertices.get(i));
}
/*
double alpha = 0.5;
// Now the vertices are in possession, temporal smoothing can be performed.
for(int i = 0; i<vertices.size(); i++){
for (int j = 0; j < 4; j++) {
// Smooth coordinate x of the vertex
vertices.get(i)[j].x = alpha * lastVertices.get(i)[j].x + (1.0 - alpha) * vertices.get(i)[j].x;
// Smooth coordinate y of the vertex
vertices.get(i)[j].y = alpha * lastVertices.get(i)[j].y + (1.0 - alpha) * vertices.get(i)[j].y;
// Assign the present smoothed values as lastVertices for the next smooth
lastVertices.get(i)[j] = vertices.get(i)[j];
}
}*/
/** With the vertices, the object size is calculated.
* The object size is calculated through Pythagoras' theorem, which gives
* the distance between 2 points in a two-dimensional space.
*
* For a rectangle, considering any vertex V, its two sides (width and height) can
* be calculated as the distance of V from the previous vertex and the distance
* of V from the next vertex. This is why I calculate the distance between
* vertices[0]/vertices[3] and vertices[0]/vertices[1]
*/
double conversionFactor = 1.0;
for(Point[] points : vertices){
int width = (int) (conversionFactor * Math.sqrt((points[0].x - points[3].x) * (points[0].x - points[3].x) + (points[0].y - points[3].y) * (points[0].y - points[3].y)));
int height = (int) (conversionFactor * Math.sqrt((points[0].x - points[1].x) * (points[0].x - points[1].x) + (points[0].y - points[1].y) * (points[0].y - points[1].y)));
objectWidth.add(width);
objectHeight.add(height);
}
/** Draw the rectangle containing the contours. The line method draws a line from one
* point to the next and accepts only integer coordinates; this is why 2
* temporary Points are created and why the Math.round method is used.
*/
Scalar red = new Scalar(255, 0, 0, 255);
for (int i=0; i<vertices.size(); i++){
Point pt1 = new Point();
Point pt2 = new Point();
for (int j = 0; j < 4; j++) {
pt1.x = Math.round(vertices.get(i)[j].x);
pt1.y = Math.round(vertices.get(i)[j].y);
pt2.x = Math.round(vertices.get(i)[(j + 1) % 4].x);
pt2.y = Math.round(vertices.get(i)[(j + 1) % 4].y);
Imgproc.line(frame, pt1, pt2, red, 3);
}
if (objectWidth.get(i) != 0 && objectHeight.get(i) != 0){
Imgproc.putText(frame, "width: " + objectWidth.get(i) + ", height: " + objectHeight.get(i), new Point(Math.round(vertices.get(i)[1].x), Math.round(vertices.get(i)[1].y)), 1, 1, red);
}
}
}
// This function must return
return frame;
}
// Initialising an array of points
public static List<Point[]> initialiseWithDefaultPointInstances(int n_Contours, int n_Points) {
List<Point[]> pointsList = new ArrayList<>();
for(int i=0; i<n_Contours; i++){
Point[] array = new Point[n_Points];
for (int j = 0; j < n_Points; j++) {
array[j] = new Point();
}
pointsList.add(array);
}
return pointsList;
}
Related
I'm trying to draw functions using Java Swing and AWT. The problem is that not all of the 300 points of the graph are always drawn. When I loop over the first points of the graph in debug mode, there is a much greater chance the graph is drawn completely. I use the following code to create a JFrame and set the graphics object to the class member g.
jFrame = new JFrame();
jFrame.setSize(WIDTH, HEIGHT);
jFrame.setVisible(true);
g = jFrame.getContentPane().getGraphics();
Then I call this method for every function I want to draw.
private void drawGraph(IGraph graph, Bounds bounds, Ratios ratios) {
//contains visual information about the graph
GraphVisuals visuals = graph.getVisuals();
g.setColor(visuals.color);
//the previous point is remembered, to be able to draw a line from one point to the next
int previousXi = 0;
int previousYi = 0;
//a loop over every point of the graph. The graph object contains two arrays: the x values and the y values
for (int i = 0; i < graph.getSize(); ++i) {
//calculate the x value using the ratio between the graph's size on the x-axis and the window size and the starting point on the x-axis
int xi = (int) (ratios.xRatio * (graph.getX(i) - bounds.xMin) + 0.5);
//analogous for the y axis
int yi = HEIGHT - (int) (ratios.yRatio * (graph.getY(i) - bounds.yMin) + 0.5);
//draw
if (visuals.hasBullets) {
g.fillOval(xi, yi, visuals.bulletSize, visuals.bulletSize);
}
if (visuals.hasLine) {
if (i != 0) {
g.drawLine(previousXi, previousYi, xi, yi);
}
}
previousXi = xi;
previousYi = yi;
}
}
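The coordinate mapping inside the drawGraph loop can be isolated and checked on its own. A minimal sketch of the same two formulas (the ratio and window-height values below are invented for illustration; they mirror but are not taken from the question):

```java
public class CoordinateMapping {
    // Map a data-space x value to a pixel column, as in the drawGraph loop
    static int toPixelX(double x, double xMin, double xRatio) {
        return (int) (xRatio * (x - xMin) + 0.5); // +0.5 rounds to the nearest pixel
    }

    // Map a data-space y value to a pixel row; the y axis is flipped
    // because screen coordinates grow downward
    static int toPixelY(double y, double yMin, double yRatio, int windowHeight) {
        return windowHeight - (int) (yRatio * (y - yMin) + 0.5);
    }

    public static void main(String[] args) {
        int height = 400; // assumed window height
        // Data range [0, 10] mapped onto 400 pixels -> ratio of 40 px per unit
        System.out.println(toPixelX(5.0, 0.0, 40.0));         // middle of the x range
        System.out.println(toPixelY(0.0, 0.0, 40.0, height)); // bottom of the window
    }
}
```

Note that yMin maps to row `windowHeight` exactly, which is one pixel below the visible area; clamping or subtracting 1 would keep the lowest point on screen.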
I have an image in Graphics2D that I need to rotate and then obtain the new co-ordinates of the image corners and the dimensions of the new bounding box.
I was originally trying to work with the image itself but I think it would be easier to work with a rectangle (or polygon) to give myself more flexibility. I was originally performing the rotation on the image simply with AffineTransform.rotate(). However, it would be cleaner if there was a way to translate each corner point individually, that would give me the values of A1, B1, C1 & D1. Is there a way in Graphics2D to rotate the individual corners?
I have found several questions relating to the bounding box dimensions of a rotated rectangle but I can't seem to get any of them to work in Java with Graphics2D.
You'll simply have to rotate the image corners yourself. The package java.awt.geom provides the classes Point2D and AffineTransform to do that by applying a rotation transform to individual points. The width and height of the rotated bounding box can be computed as the difference between the maximum and minimum rotated x and y coordinates, with the minimum x and y coordinates as offset.
The following program implements this algorithm and displays the results for several rotations from 0° to 360° in 30° steps:
package stackoverflow;
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
/**
* Demonstration of an implementation to rotate rectangles.
* @author Franz D.
*/
public class ImageRotate
{
/**
* Rotates a rectangle with offset (0,0).
* @param originalWidth original rectangle width
* @param originalHeight original rectangle height
* @param angleRadians rotation angle in radians
* @param rotatedCorners output buffer for the four rotated corners
* @return the bounding box of the rotated rectangle
* @throws NullPointerException if {@code rotatedCorners == null}.
* @throws ArrayIndexOutOfBoundsException if {@code rotatedCorners.length < 4}.
*/
public static Rectangle2D rotateRectangle(int originalWidth, int originalHeight,
double angleRadians,
Point2D[] rotatedCorners) {
// create original corner points
Point2D a0 = new Point2D.Double(0, 0);
Point2D b0 = new Point2D.Double(originalWidth, 0);
Point2D c0 = new Point2D.Double(0, originalHeight);
Point2D d0 = new Point2D.Double(originalWidth, originalHeight);
Point2D[] originalCorners = { a0, b0, c0, d0 };
// create affine rotation transform
AffineTransform transform = AffineTransform.getRotateInstance(angleRadians);
// transform original corners to rotated corners
transform.transform(originalCorners, 0, rotatedCorners, 0, originalCorners.length);
// determine rotated width and height as difference between maximum and
// minimum rotated coordinates
double minRotatedX = Double.POSITIVE_INFINITY;
double maxRotatedX = Double.NEGATIVE_INFINITY;
double minRotatedY = Double.POSITIVE_INFINITY;
double maxRotatedY = Double.NEGATIVE_INFINITY;
for (Point2D rotatedCorner: rotatedCorners) {
minRotatedX = Math.min(minRotatedX, rotatedCorner.getX());
maxRotatedX = Math.max(maxRotatedX, rotatedCorner.getX());
minRotatedY = Math.min(minRotatedY, rotatedCorner.getY());
maxRotatedY = Math.max(maxRotatedY, rotatedCorner.getY());
}
// the bounding box is the rectangle with minimum rotated X and Y as offset
double rotatedWidth = maxRotatedX - minRotatedX;
double rotatedHeight = maxRotatedY - minRotatedY;
Rectangle2D rotatedBounds = new Rectangle2D.Double(
minRotatedX, minRotatedY,
rotatedWidth, rotatedHeight);
return rotatedBounds;
}
/**
* Simple test for {@link #rotateRectangle(int, int, double, java.awt.geom.Point2D[])}.
* @param args ignored
*/
public static void main(String[] args) {
// setup original width
int originalWidth = 500;
int originalHeight = 400;
// create buffer for rotated corners
Point2D[] rotatedCorners = new Point2D[4];
// rotate rectangle from 0° to 360° in 30° steps
for (int angleDegrees = 0; angleDegrees < 360; angleDegrees += 30) {
// convert angle to radians
double angleRadians = Math.toRadians(angleDegrees);
// rotate rectangle
Rectangle2D rotatedBounds = rotateRectangle(
originalWidth, originalHeight,
angleRadians,
rotatedCorners);
// dump results
System.out.println("--- Rotate " + originalWidth + "x" + originalHeight + " by " + angleDegrees + "° ---");
System.out.println("Bounds: " + rotatedBounds);
for (Point2D rotatedCorner: rotatedCorners) {
System.out.println("Corner " + rotatedCorner);
}
}
}
}
If your image is not placed at offset (0, 0), you can simply modify the method to take the offset as an input parameter and add the offset coordinates to the original points.
Also, this method rotates the image (or rectangle) about the origin (0, 0). If you want other rotation centers, AffineTransform provides an overloaded variant of getRotateInstance() which allows you to specify the rotation center (called the "anchor" in the API documentation).
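As a small illustration of that anchor variant, here is a hedged sketch rotating one corner of the 500x400 rectangle from the example above by 90° about the rectangle's centre, using only java.awt.geom:

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class AnchorRotateDemo {
    public static void main(String[] args) {
        // Rotate 90 degrees about the centre (250, 200) of a 500x400 rectangle
        AffineTransform t = AffineTransform.getRotateInstance(
                Math.toRadians(90), 250, 200); // anchor = rectangle centre
        Point2D corner = new Point2D.Double(0, 0); // top-left corner
        Point2D rotated = t.transform(corner, null);
        // In screen coordinates (y grows downward) the corner lands near (450, -50)
        System.out.println(rotated);
    }
}
```

The anchor form is equivalent to translating the point so the anchor sits at the origin, rotating, and translating back.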
I'm trying to automate a process where someone manually converts a code to a digital one.
Then I started reading about OCR, so I installed Tesseract OCR and tried it on some images. It doesn't detect anything close to the code.
I figured, after reading some questions on Stack Overflow, that the images need some preprocessing, like deskewing the image to a horizontal one, which can be done with OpenCV, for example.
Now my questions are:
What kind of preprocessing or other methods should be used in a case like the above image?
Secondly, can I rely on the output? Will it always work in cases like the above image?
I hope someone can help me!
I have decided to capture the whole card instead of the code only. By capturing the whole card it is possible to transform it to a plain perspective and then I could easily get the "code" region.
Also, I learned a lot of things, especially regarding speed. This function is slow on high-resolution images: it can take up to 10 seconds at a size of 3264 x 1836.
What I did to speed things up was resize the input matrix by a factor of 1/4, which makes it 4^2 times faster with a minimal loss of precision. The next step is scaling the quadrangle we found back to the original size, so that we can transform the quadrangle to a plain perspective using the original source.
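The scale-back step can be sketched without OpenCV: each corner found on the quarter-size image is simply multiplied by the resize factor (4 here) to address the original frame. The Point class below is a stand-in for OpenCV's org.opencv.core.Point, and the corner values are invented for illustration:

```java
public class ScaleBackDemo {
    // Minimal stand-in for org.opencv.core.Point
    static class Point {
        double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Scale corner coordinates found on a downscaled image back to full size
    static Point[] scaleBack(Point[] corners, double factor) {
        Point[] out = new Point[corners.length];
        for (int i = 0; i < corners.length; i++) {
            out[i] = new Point(corners[i].x * factor, corners[i].y * factor);
        }
        return out;
    }

    public static void main(String[] args) {
        // Quadrangle detected on an image resized by 1/4
        Point[] small = { new Point(10, 20), new Point(200, 25),
                          new Point(205, 140), new Point(12, 135) };
        Point[] full = scaleBack(small, 4.0); // back to full-resolution coordinates
        System.out.println(full[0].x + ", " + full[0].y);
    }
}
```

The scaled quadrangle can then be fed to the perspective transform on the original, full-resolution Mat.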
The code I created for detecting the largest area is heavily based on code I found on stackoverflow. Unfortunately they didn't work as expected for me, so I combined more code snippets and modified a lot.
This is what I got:
private static double angle(Point p1, Point p2, Point p0 ) {
double dx1 = p1.x - p0.x;
double dy1 = p1.y - p0.y;
double dx2 = p2.x - p0.x;
double dy2 = p2.y - p0.y;
return (dx1 * dx2 + dy1 * dy2) / Math.sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
}
private static MatOfPoint find(Mat src) throws Exception {
Mat blurred = src.clone();
Imgproc.medianBlur(src, blurred, 9);
Mat gray0 = new Mat(blurred.size(), CvType.CV_8U), gray = new Mat();
List<MatOfPoint> contours = new ArrayList<>();
List<Mat> blurredChannel = new ArrayList<>();
blurredChannel.add(blurred);
List<Mat> gray0Channel = new ArrayList<>();
gray0Channel.add(gray0);
MatOfPoint2f approxCurve;
double maxArea = 0;
int maxId = -1;
for (int c = 0; c < 3; c++) {
int ch[] = {c, 0};
Core.mixChannels(blurredChannel, gray0Channel, new MatOfInt(ch));
int thresholdLevel = 1;
for (int t = 0; t < thresholdLevel; t++) {
if (t == 0) {
Imgproc.Canny(gray0, gray, 10, 20, 3, true); // true ?
Imgproc.dilate(gray, gray, new Mat(), new Point(-1, -1), 1); // 1 ?
} else {
Imgproc.adaptiveThreshold(gray0, gray, thresholdLevel, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, (src.width() + src.height()) / 200, t);
}
Imgproc.findContours(gray, contours, new Mat(), Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);
for (MatOfPoint contour : contours) {
MatOfPoint2f temp = new MatOfPoint2f(contour.toArray());
double area = Imgproc.contourArea(contour);
approxCurve = new MatOfPoint2f();
Imgproc.approxPolyDP(temp, approxCurve, Imgproc.arcLength(temp, true) * 0.02, true);
if (approxCurve.total() == 4 && area >= maxArea) {
double maxCosine = 0;
List<Point> curves = approxCurve.toList();
for (int j = 2; j < 5; j++)
{
double cosine = Math.abs(angle(curves.get(j % 4), curves.get(j - 2), curves.get(j - 1)));
maxCosine = Math.max(maxCosine, cosine);
}
if (maxCosine < 0.3) {
maxArea = area;
maxId = contours.indexOf(contour);
//contours.set(maxId, getHull(contour));
}
}
}
}
}
if (maxId >= 0) {
return contours.get(maxId);
//Imgproc.drawContours(src, contours, maxId, new Scalar(255, 0, 0, .8), 8);
}
return null;
}
You can call it like so:
MatOfPoint contour = find(src);
See this answer for quadrangle detection from a contour and transforming it to a plain perspective:
Java OpenCV deskewing a contour
For a project we were given a game engine on top of which to create a game. As part of this, we have to implement pixel-level collision detection after a possible collision has been found via a bounding box detection method. I have implemented both, but my pixel-level test fails for small objects (bullets in this case). I have checked whether it works for slow bullets, but that fails too.
For my pixel level implementation I create bitmasks for each texture using the available IntBuffer (a ByteBuffer is available too?). The IntBuffer is in RGBA format and its size is width*height. I placed this in a 2D array and replaced all non-zero numbers with 1s to create the mask. After a collision of bounding boxes, I find the rectangle represented by the overlap (using .createIntersection) and then check the masks of both sprites within this intersection for a pixel that is non-zero in both, using a bitwise AND.
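The mask-building step described above can be sketched in plain Java: given width*height RGBA pixels as ints, any non-zero value becomes 1 in a 2D mask. The row-major layout is an assumption; the engine's IntBuffer may be ordered differently:

```java
public class BitmaskDemo {
    // Build a 2D bitmask from row-major RGBA pixel data: non-zero -> 1
    static byte[][] buildMask(int[] rgba, int width, int height) {
        byte[][] mask = new byte[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                mask[y][x] = (byte) (rgba[y * width + x] != 0 ? 1 : 0);
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        // 2x2 image: one opaque green pixel, three fully transparent ones
        int[] pixels = { 0xFF00FF00, 0, 0, 0 };
        byte[][] mask = buildMask(pixels, 2, 2);
        System.out.println(mask[0][0] + " " + mask[1][1]);
    }
}
```

A refinement would be to test only the alpha byte rather than the whole int, so opaque-black pixels still count as solid.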
Here is my code for the pixel level test:
/**
* Pixel level test
*
* @param rect the rectangle representing the intersection of the bounding
* boxes
* @param index1 the index at which the first object's texture is stored
* @param index2 the index at which the second object's texture is stored
*/
public static boolean isBitCollision(Rectangle2D rect, int index1, int index2)
{
int height = (int) rect.getHeight();
int width = (int) rect.getWidth();
long mask1 = 0;
long mask2 = 0;
for (int i = 0; i < width; i++)
{
for (int j = 0; j < height; j++)
{
mask1 = mask1 + bitmaskArr[index1].bitmask[i][j];//add up the current column of "pixels"
mask2 = mask2 + bitmaskArr[index2].bitmask[i][j];
if (((mask1) & (mask2)) != 0)//bitwise and, if both are nonzero there is a collision
{
return true;
}
mask1 = 0;
mask2 = 0;
}
}
return false;
}
I've been struggling with this for days and any help will be greatly appreciated.
I managed to solve my own issue and now it works properly. For anyone interested: what I did was find the rectangle created by the overlap of the two bounding boxes of the two sprites. I then dropped each object to the origin, along with, relative to it, the rectangle of intersection. It should be noted that I dropped each object to a "separate" origin - i.e. I effectively had two rectangles of intersection afterwards, one for each object. The co-ordinates of each rectangle of intersection, now within the bounds of both objects' bitmask 2D arrays, were used to check the correct regions for overlap of both objects:
I loop bottom to top and left to right through the bitmask, as the image data is provided upside down - apparently this is the norm for image data.
/**
* My Pixel level test - 2D
*
* @param rect the rectangle representing the intersection of the bounding
* boxes
* @param index1 the index at which the first object's texture is stored
* @param index2 the index at which the second object's texture is stored
* @param p1 the position of object 1
* @param p2 the position of object 2
* @return true if there is a collision at a pixel level, false if not
*/
//public static boolean isPixelCollision(Rectangle2D rect, Point2D.Float p1, Bitmask bm1, Point2D.Float p2, Bitmask bm2)
public static boolean isPixelCollision(Rectangle2D rect, Point2D.Float p1, int index1, Point2D.Float p2, int index2)
{
int height = (int) rect.getHeight();
int width = (int) rect.getWidth();
byte mask1 = 0;
byte mask2 = 0;
//drop both objects to the origin and drop a rectangle of intersection for each along with them
//this allows for us to have the co-ords of the rect on intersection within them at number that are inbounds.
Point2D.Float origP1 = new Point2D.Float((float) Math.abs(rect.getX() - p1.x), (float) Math.abs(rect.getY() - p1.y));//rect for object one
Point2D.Float origP2 = new Point2D.Float((float) Math.abs(rect.getX() - p2.x), (float) Math.abs(rect.getY() - p2.y));//rect for object two
//to avoid casting with every iteration
int start1y = (int) origP1.y;
int start1x = (int) origP1.x;
int start2y = (int) origP2.y;
int start2x = (int) origP2.x;
//we need to loop within the rect of intersection
//goind bottom up and left to right
for (int i = height - 1; i > 0; i--)
{
for (int j = 0; j < width; j++)
{
mask1 = bitmaskArr[index1].bitmask[start1y + i][start1x + j];
mask2 = bitmaskArr[index2].bitmask[start2y + i][start2x + j];
if ((mask1 & mask2) > 0)
{
return true;
}
}
}
//no collsion was found
return false;
}
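A minimal, engine-free sketch of the same overlap test on two tiny hand-made masks (array layout and mask contents are invented for illustration; the real code indexes into bitmaskArr instead):

```java
public class PixelOverlapDemo {
    // Scan an overlap region of two masks for a pixel that is solid in both
    static boolean overlaps(byte[][] a, int ax, int ay,
                            byte[][] b, int bx, int by,
                            int width, int height) {
        for (int i = 0; i < height; i++) {
            for (int j = 0; j < width; j++) {
                if ((a[ay + i][ax + j] & b[by + i][bx + j]) > 0) {
                    return true; // solid in both masks at the same spot
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[][] bullet = { {1} };            // 1x1 solid bullet mask
        byte[][] wall = { {0, 0}, {0, 1} };   // solid only at row 1, column 1
        // 1x1 intersection aligned with the wall's solid cell -> collision
        System.out.println(overlaps(bullet, 0, 0, wall, 1, 1, 1, 1));
        // Same intersection aligned with an empty cell -> no collision
        System.out.println(overlaps(bullet, 0, 0, wall, 0, 0, 1, 1));
    }
}
```

Because both masks hold only 0 or 1 per cell, the per-pixel `&` here is safe; it is the summing-then-ANDing in the original question that went wrong.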
The problem could be with this part of the code:
if (((mask1) & (mask2)) != 0)//bitwise and, if both are nonzero there is a collision
{
return true;
}
You seem to be using a bitwise AND to check whether both values are non-zero - but it does not work in such a manner.
For instance, the value of the following expression:
(3 & 4) == 0
is true.
This is because when you do a bitwise operation you need to think of the numbers as their bit representation and do the operation bit-by-bit.
So:
3 = 0000 0011
& 4 = 0000 0100
---------------
0 = 0000 0000
This is because of how the 1 bits align with one another. For a bit in a bitwise AND result to have the value 1, the bits at the same position in both numbers need to be 1.
Another example would be:
3 = 0000 0011
& 2 = 0000 0010
---------------
2 = 0000 0010
So in your case a better check would be:
if ( mask1>0 && mask2>0 )//logical and, if both are nonzero there is a collision
{
return true;
}
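The difference between the two checks can be verified directly in plain Java:

```java
public class BitwiseVsLogical {
    public static void main(String[] args) {
        // Bitwise AND compares individual bits: 011 & 100 share no set bits
        System.out.println((3 & 4) == 0); // true, even though both values are non-zero
        System.out.println(3 & 2);        // 011 & 010 == 010, i.e. 2
        // Logical test of the values themselves behaves as intended
        System.out.println(3 > 0 && 4 > 0); // true: both values are non-zero
    }
}
```

Note the parentheses around `3 & 4`: in Java, `==` binds tighter than `&`, so the unparenthesized form would not even compile here.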
I need a little help on an image analysis algorithm in Java. I basically have images like this:
So, as you might have guessed, I need to count the lines.
What approach do you think would be best?
Thanks,
Smaug
A simple segmentation algorithm can help you out. Here's how the algorithm works:
1. Scan pixels from left to right and record the position of the first black (whatever the colour of your line is) pixel.
2. Carry on this process until you get one whole scan in which you don't find a black pixel. Record this position as well. We are just interested in the Y positions here. Now, using these Y positions, segment the image horizontally.
3. Now we do the same process, but this time scanning from top to bottom (one column at a time) in the segment we just created. This time we are interested in X positions.
4. So in the end we get every line's extents, or you could say a bounding box for every line.
5. The total count of these bounding boxes is the number of lines.
You can do many optimizations in the algorithm according to your needs.
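A hedged sketch of the first (horizontal) pass of that algorithm on a boolean mask, where true marks a line pixel; the mask layout and helper names are assumptions, not from the answer:

```java
public class RowSegmentationDemo {
    // Find [start, end) row ranges that contain at least one solid pixel
    static java.util.List<int[]> segmentRows(boolean[][] mask) {
        java.util.List<int[]> segments = new java.util.ArrayList<>();
        int start = -1;
        for (int y = 0; y < mask.length; y++) {
            boolean rowHasPixel = false;
            for (boolean px : mask[y]) {
                if (px) { rowHasPixel = true; break; }
            }
            if (rowHasPixel && start < 0) {
                start = y;                           // first row of a new segment
            } else if (!rowHasPixel && start >= 0) {
                segments.add(new int[]{ start, y }); // a blank row ends the segment
                start = -1;
            }
        }
        if (start >= 0) segments.add(new int[]{ start, mask.length });
        return segments;
    }

    public static void main(String[] args) {
        boolean[][] mask = {
            { false, false },
            { true,  false },  // segment 1
            { false, false },
            { false, true  },  // segment 2
        };
        System.out.println(segmentRows(mask).size()); // two horizontal segments
    }
}
```

The vertical pass is the same idea transposed, run within each segment found here.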
package ac.essex.ooechs.imaging.commons.edge.hough;
import java.awt.image.BufferedImage;
import java.awt.*;
import java.util.Vector;
import java.io.File;
/**
* <p/>
* Java Implementation of the Hough Transform.<br />
* Used for finding straight lines in an image.<br />
* by Olly Oechsle
* </p>
* <p/>
* Note: This class is based on original code from:<br />
* http://homepages.inf.ed.ac.uk/rbf/HIPR2/hough.htm
* </p>
* <p/>
* If you represent a line as:<br />
* x cos(theta) + y sin (theta) = r
* </p>
* <p/>
* ... and you know values of x and y, you can calculate all the values of r by going through
* all the possible values of theta. If you plot the values of r on a graph for every value of
* theta you get a sinusoidal curve. This is the Hough transformation.
* </p>
* <p/>
* The Hough transform works by looking at a number of such x,y coordinates, which are usually
* found by some kind of edge detection. Each of these coordinates is transformed into
* an r, theta curve. This curve is discretised, so we actually only look at a certain discrete
* number of theta values. "Accumulator" cells in a hough array along this curve are incremented
* for each X and Y coordinate.
* </p>
* <p/>
* The accumulator space is plotted rectangularly with theta on one axis and r on the other.
* Each point in the array represents an (r, theta) value which can be used to represent a line
* using the formula above.
* </p>
* <p/>
* Once all the points have been added, the accumulator array should be full of curves. The algorithm
* then searches for local peaks in the array. The higher the peak, the more values of x and y crossed
* along that curve, so high peaks give good indications of a line.
* </p>
*
* @author Olly Oechsle, University of Essex
*/
public class HoughTransform extends Thread {
public static void main(String[] args) throws Exception {
String filename = "/home/ooechs/Desktop/vase.png";
// load the file using Java's imageIO library
BufferedImage image = javax.imageio.ImageIO.read(new File(filename));
// create a hough transform object with the right dimensions
HoughTransform h = new HoughTransform(image.getWidth(), image.getHeight());
// add the points from the image (or call the addPoint method separately if your points are not in an image
h.addPoints(image);
// get the lines out
Vector<HoughLine> lines = h.getLines(30);
// draw the lines back onto the image
for (int j = 0; j < lines.size(); j++) {
HoughLine line = lines.elementAt(j);
line.draw(image, Color.RED.getRGB());
}
}
// The size of the neighbourhood in which to search for other local maxima
final int neighbourhoodSize = 4;
// How many discrete values of theta shall we check?
final int maxTheta = 180;
// Using maxTheta, work out the step
final double thetaStep = Math.PI / maxTheta;
// the width and height of the image
protected int width, height;
// the hough array
protected int[][] houghArray;
// the coordinates of the centre of the image
protected float centerX, centerY;
// the height of the hough array
protected int houghHeight;
// double the hough height (allows for negative numbers)
protected int doubleHeight;
// the number of points that have been added
protected int numPoints;
// cache of values of sin and cos for different theta values. Has a significant performance improvement.
private double[] sinCache;
private double[] cosCache;
/**
* Initialises the hough transform. The dimensions of the input image are needed
* in order to initialise the hough array.
*
* @param width The width of the input image
* @param height The height of the input image
*/
public HoughTransform(int width, int height) {
this.width = width;
this.height = height;
initialise();
}
/**
* Initialises the hough array. Called by the constructor so you don't need to call it
* yourself, however you can use it to reset the transform if you want to plug in another
* image (although that image must have the same width and height)
*/
public void initialise() {
// Calculate the maximum height the hough array needs to have
houghHeight = (int) (Math.sqrt(2) * Math.max(height, width)) / 2;
// Double the height of the hough array to cope with negative r values
doubleHeight = 2 * houghHeight;
// Create the hough array
houghArray = new int[maxTheta][doubleHeight];
// Find edge points and vote in array
centerX = width / 2;
centerY = height / 2;
// Count how many points there are
numPoints = 0;
// cache the values of sin and cos for faster processing
sinCache = new double[maxTheta];
cosCache = sinCache.clone();
for (int t = 0; t < maxTheta; t++) {
double realTheta = t * thetaStep;
sinCache[t] = Math.sin(realTheta);
cosCache[t] = Math.cos(realTheta);
}
}
/**
* Adds points from an image. The image is assumed to be greyscale black and white, so all pixels that are
* not black are counted as edges. The image should have the same dimensions as the one passed to the constructor.
*/
public void addPoints(BufferedImage image) {
// Now find edge points and update the hough array
for (int x = 0; x < image.getWidth(); x++) {
for (int y = 0; y < image.getHeight(); y++) {
// Find non-black pixels
if ((image.getRGB(x, y) & 0x000000ff) != 0) {
addPoint(x, y);
}
}
}
}
/**
* Adds a single point to the hough transform. You can use this method directly
* if your data isn't represented as a buffered image.
*/
public void addPoint(int x, int y) {
// Go through each value of theta
for (int t = 0; t < maxTheta; t++) {
//Work out the r values for each theta step
int r = (int) (((x - centerX) * cosCache[t]) + ((y - centerY) * sinCache[t]));
// this copes with negative values of r
r += houghHeight;
if (r < 0 || r >= doubleHeight) continue;
// Increment the hough array
houghArray[t][r]++;
}
numPoints++;
}
/**
* Once points have been added in some way, this method extracts the lines and returns them as a Vector
* of HoughLine objects, which can be used to draw lines on the image.
*
* @param threshold The threshold above which lines are determined from the hough array
*/
public Vector<HoughLine> getLines(int threshold) {
// Initialise the vector of lines that we'll return
Vector<HoughLine> lines = new Vector<HoughLine>(20);
// Only proceed if the hough array is not empty
if (numPoints == 0) return lines;
// Search for local peaks above threshold to draw
for (int t = 0; t < maxTheta; t++) {
loop:
for (int r = neighbourhoodSize; r < doubleHeight - neighbourhoodSize; r++) {
// Only consider points above threshold
if (houghArray[t][r] > threshold) {
int peak = houghArray[t][r];
// Check that this peak is indeed the local maxima
for (int dx = -neighbourhoodSize; dx <= neighbourhoodSize; dx++) {
for (int dy = -neighbourhoodSize; dy <= neighbourhoodSize; dy++) {
int dt = t + dx;
int dr = r + dy;
if (dt < 0) dt = dt + maxTheta;
else if (dt >= maxTheta) dt = dt - maxTheta;
if (houghArray[dt][dr] > peak) {
// found a bigger point nearby, skip
continue loop;
}
}
}
// calculate the true value of theta
double theta = t * thetaStep;
// add the line to the vector
lines.add(new HoughLine(theta, r));
}
}
}
return lines;
}
/**
* Gets the highest value in the hough array
*/
public int getHighestValue() {
int max = 0;
for (int t = 0; t < maxTheta; t++) {
for (int r = 0; r < doubleHeight; r++) {
if (houghArray[t][r] > max) {
max = houghArray[t][r];
}
}
}
return max;
}
/**
* Gets the hough array as an image, in case you want to have a look at it.
*/
public BufferedImage getHoughArrayImage() {
int max = getHighestValue();
BufferedImage image = new BufferedImage(maxTheta, doubleHeight, BufferedImage.TYPE_INT_ARGB);
for (int t = 0; t < maxTheta; t++) {
for (int r = 0; r < doubleHeight; r++) {
double value = 255 * ((double) houghArray[t][r]) / max;
int v = 255 - (int) value;
int c = new Color(v, v, v).getRGB();
image.setRGB(t, r, c);
}
}
return image;
}
}
Source: http://vase.essex.ac.uk/software/HoughTransform/HoughTransform.java.html
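The HoughLine class that getLines() returns isn't shown in the excerpt above. A minimal sketch of what it might look like, assuming the same centred (theta, r) parameterisation used by addPoint(); the houghHeight formula below is an assumption meant to mirror the accumulator setup (half the diagonal of the image), so adjust it to match your constructor:

```java
import java.awt.image.BufferedImage;

// Hypothetical minimal HoughLine: converts a (theta, r) accumulator entry
// back into image space and draws it. Assumes r was stored relative to the
// image centre and offset by houghHeight, as in addPoint() above.
public class HoughLine {
    protected double theta;
    protected double r;

    public HoughLine(double theta, double r) {
        this.theta = theta;
        this.r = r;
    }

    // Draws this line on the image in the given ARGB colour
    public void draw(BufferedImage image, int color) {
        int height = image.getHeight();
        int width = image.getWidth();
        // Assumed to match the constructor: houghHeight is half the
        // accumulator height (half the image diagonal)
        int houghHeight = (int) (Math.sqrt(2) * Math.max(height, width)) / 2;
        float centerX = width / 2;
        float centerY = height / 2;
        double tsin = Math.sin(theta);
        double tcos = Math.cos(theta);
        if (theta < Math.PI * 0.25 || theta > Math.PI * 0.75) {
            // Mostly vertical line: step over y, solve for x
            for (int y = 0; y < height; y++) {
                int x = (int) ((((r - houghHeight) - ((y - centerY) * tsin)) / tcos) + centerX);
                if (x >= 0 && x < width) {
                    image.setRGB(x, y, color);
                }
            }
        } else {
            // Mostly horizontal line: step over x, solve for y
            for (int x = 0; x < width; x++) {
                int y = (int) ((((r - houghHeight) - ((x - centerX) * tcos)) / tsin) + centerY);
                if (y >= 0 && y < height) {
                    image.setRGB(x, y, color);
                }
            }
        }
    }
}
```

For example, on a 100x100 image houghHeight comes out as 70, so a HoughLine with theta = 0 and r = 80 draws the vertical line x = 60.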
I've implemented a simple solution (which could be improved) using the Marvin Framework that finds the start and end points of each vertical line and prints the total number of lines found.
Approach:
Binarize the image using a given threshold.
For each pixel, if it is black (solid), try to find a vertical line
Save the (x, y) of the start and end points
Does the line have at least a minimum length? Then it's an acceptable line!
Print the start point in red and the end point in green.
The output image is shown below:
The program's output:
Vertical line found at: (74,9,70,33)
Vertical line found at: (113,9,109,31)
Vertical line found at: (80,10,76,32)
Vertical line found at: (137,11,133,33)
Vertical line found at: (163,11,159,33)
Vertical line found at: (184,11,180,33)
Vertical line found at: (203,11,199,33)
Vertical line found at: (228,11,224,33)
Vertical line found at: (248,11,244,33)
Vertical line found at: (52,12,50,33)
Vertical line found at: (145,13,141,35)
Vertical line found at: (173,13,169,35)
Vertical line found at: (211,13,207,35)
Vertical line found at: (94,14,90,36)
Vertical line found at: (238,14,236,35)
Vertical line found at: (130,16,128,37)
Vertical line found at: (195,16,193,37)
Vertical lines total: 17
Finally, the source code:
import java.awt.Color;
import java.awt.Point;
import marvin.image.MarvinImage;
import marvin.io.MarvinImageIO;
import marvin.plugin.MarvinImagePlugin;
import marvin.util.MarvinPluginLoader;
public class VerticalLineCounter {
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
public VerticalLineCounter(){
// Binarize
MarvinImage image = MarvinImageIO.loadImage("./res/lines.jpg");
MarvinImage binImage = image.clone();
threshold.setAttribute("threshold", 127);
threshold.process(image, binImage);
// Find lines and save an output image
MarvinImage imageOut = findVerticalLines(binImage, image);
MarvinImageIO.saveImage(imageOut, "./res/lines_out.png");
}
private MarvinImage findVerticalLines(MarvinImage binImage, MarvinImage originalImage){
MarvinImage imageOut = originalImage.clone();
boolean[][] processedPixels = new boolean[binImage.getWidth()][binImage.getHeight()];
int color;
Point endPoint;
int totalLines=0;
for(int y=0; y<binImage.getHeight(); y++){
for(int x=0; x<binImage.getWidth(); x++){
if(!processedPixels[x][y]){
color = binImage.getIntColor(x, y);
// Black?
if(color == 0xFF000000){
endPoint = getEndOfLine(x,y,binImage,processedPixels);
// Line length threshold
if(endPoint.x - x > 5 || endPoint.y - y > 5){
imageOut.fillRect(x-2, y-2, 5, 5, Color.red);
imageOut.fillRect(endPoint.x-2, endPoint.y-2, 5, 5, Color.green);
totalLines++;
System.out.println("Vertical line found at: ("+x+","+y+","+endPoint.x+","+endPoint.y+")");
}
}
}
processedPixels[x][y] = true;
}
}
System.out.println("Vertical lines total: "+totalLines);
return imageOut;
}
private Point getEndOfLine(int x, int y, MarvinImage image, boolean[][] processedPixels){
int xC=x;
int cY=y;
while(true){
// Stop if we run past the bottom of the image
if(cY >= image.getHeight()){
return new Point(xC, cY-1);
}
// Mark a 7-pixel-wide band as processed, guarding against the side borders
for(int dx=-3; dx<=3; dx++){
if(xC+dx >= 0 && xC+dx < image.getWidth()){
processedPixels[xC+dx][cY] = true;
}
}
if(getSafeIntColor(xC,cY,image) == 0xFF000000){
// current pixel is still black: keep going straight down
}
else if(getSafeIntColor(xC-1,cY,image) == 0xFF000000){
xC = xC-2;
}
else if(getSafeIntColor(xC-2,cY,image) == 0xFF000000){
xC = xC-3;
}
else if(getSafeIntColor(xC+1,cY,image) == 0xFF000000){
xC = xC+2;
}
else if(getSafeIntColor(xC+2,cY,image) == 0xFF000000){
xC = xC+3;
}
else{
return new Point(xC, cY);
}
cY++;
}
}
private int getSafeIntColor(int x, int y, MarvinImage image){
if(x >= 0 && x < image.getWidth() && y >= 0 && y < image.getHeight()){
return image.getIntColor(x, y);
}
return -1;
}
public static void main(String args[]){
new VerticalLineCounter();
System.exit(0);
}
}
It depends on how much they look like that.
Bring the image to 1-bit (black and white) in a way that preserves the lines and brings the background to pure white
Perhaps do simple cleanup like speck removal (remove any small black components).
Then,
Find a black pixel
Use flood-fill algorithms to find its extent
See if the shape meets the criteria for being a line (lineCount++ if so)
remove it
Repeat this until there are no black pixels
A lot depends on how well you do #3. Some ideas:
Use Hough just on this section to check that you have one line, and that it is vertical(ish)
(after #1) rotate it to the vertical and check its width/height ratio
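The find/flood-fill/check/remove loop above can be sketched as follows. This is a hypothetical illustration, not a tested implementation: it works on a boolean grid (true = black pixel), uses an explicit stack for the flood fill, and a simple tall-and-thin bounding-box test stands in for the width/height-ratio check. The method name and thresholds are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LineComponentCounter {

    // Counts connected black components whose bounding box is at least
    // minHeight tall and at most maxWidth wide ("vertical line" criterion)
    public static int countVerticalLines(boolean[][] black, int minHeight, int maxWidth) {
        int h = black.length, w = black[0].length;
        boolean[][] visited = new boolean[h][w];
        int lineCount = 0;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (black[y][x] && !visited[y][x]) {
                    // Flood-fill (4-connected) to find the component's bounding box
                    int minX = x, maxX = x, minY = y, maxY = y;
                    Deque<int[]> stack = new ArrayDeque<>();
                    stack.push(new int[]{y, x});
                    visited[y][x] = true;
                    while (!stack.isEmpty()) {
                        int[] p = stack.pop();
                        minX = Math.min(minX, p[1]); maxX = Math.max(maxX, p[1]);
                        minY = Math.min(minY, p[0]); maxY = Math.max(maxY, p[0]);
                        int[][] nbrs = {{p[0]-1,p[1]}, {p[0]+1,p[1]}, {p[0],p[1]-1}, {p[0],p[1]+1}};
                        for (int[] n : nbrs) {
                            if (n[0] >= 0 && n[0] < h && n[1] >= 0 && n[1] < w
                                    && black[n[0]][n[1]] && !visited[n[0]][n[1]]) {
                                visited[n[0]][n[1]] = true;
                                stack.push(n);
                            }
                        }
                    }
                    // Tall and thin bounding box => count as a vertical line
                    int boxW = maxX - minX + 1, boxH = maxY - minY + 1;
                    if (boxH >= minHeight && boxW <= maxWidth) {
                        lineCount++;
                    }
                    // "Remove it" is implicit: visited pixels are never revisited
                }
            }
        }
        return lineCount;
    }
}
```

Because the visited mask doubles as removal, each component is examined exactly once, which also takes care of the "repeat until no black pixels" step.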