UPDATE
You can find all the images I have for testing on my GitHub here:
GitHub repository with sources
There are also two videos on which the detection should work as well.
ORIGINAL QUESTION
I tried to use OpenCV 4.x.x to find the edges of a blackboard (image below), but somehow I cannot succeed. My code at the moment looks like this (Android with OpenCV and a live camera feed), where imgMat is a Mat from the camera feed:
Mat gray = new Mat();
Imgproc.cvtColor(imgMat, gray, Imgproc.COLOR_RGB2BGR);

Mat blurred = new Mat();
Imgproc.blur(gray, blurred, new org.opencv.core.Size(3, 3));

Mat canny = new Mat();
Imgproc.Canny(blurred, canny, 80, 230);

Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new org.opencv.core.Size(2, 2));
Mat dilated = new Mat();
Imgproc.morphologyEx(canny, dilated, Imgproc.MORPH_DILATE, kernel, new Point(0, 0), 10);

Mat rectImage = new Mat();
Imgproc.morphologyEx(dilated, rectImage, Imgproc.MORPH_CLOSE, kernel, new Point(0, 0), 5);

Mat endproduct = new Mat();
Imgproc.Canny(rectImage, endproduct, 120, 230);

List<MatOfPoint> contours = new ArrayList<>();
Mat hierarchy = new Mat();
Imgproc.findContours(endproduct, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

double maxArea = 0;
boolean hasContour = false;
MatOfPoint2f biggestContour = new MatOfPoint2f();
Iterator<MatOfPoint> each = contours.iterator();
while (each.hasNext()) {
    MatOfPoint wrapper = each.next();
    double area = Imgproc.contourArea(wrapper);
    if (area > maxArea) {
        maxArea = area;
        biggestContour = new MatOfPoint2f(wrapper.toArray());
        hasContour = true;
    }
}

if (hasContour) {
    Mat output = imgMat.clone();
    MatOfPoint2f approx = new MatOfPoint2f();
    MatOfPoint poly = new MatOfPoint();

    Imgproc.approxPolyDP(biggestContour, approx, Imgproc.arcLength(biggestContour, true) * .02, true);
    approx.convertTo(poly, CvType.CV_32S);

    Rect rect = Imgproc.boundingRect(poly);
}
Somehow I am not able to get it working, although the same code (written in Python) worked on my computer with a video. I take the output from the rectangle and display it on my mobile screen, where it flickers around a lot and does not work properly.
These are the images I tried the Python program on, and they worked:
What am I doing wrong? I am not able to consistently detect the edges of the blackboard.
Additional information about the blackboard:
always rectangular
may have different lighting
the text should be ignored, only the main board should be detected
the outer blackboard should be ignored as well
only the contour for the main board should be shown/returned
Thanks for any advice or code!
I used HSV because that's the easiest way to detect specific colors. I used an abundance test to automatically select the color threshold (so this will work for green or blue boards). However, this test will fail on white or black boards, since white and black count as all colors according to hue. Instead, in HSV, white and black are easiest to detect as very low saturation (white) or very low value (black).
I did a 3-way check for each and selected the mask that had the most pixels in it (I assume that the boards are the majority of the image). I'm not sure how this will work on other images since we only have one here, so this may or may not work for other boards.
I used approxPolyDP to cut down on the number of points in the contour until I had 4 points and used that to draw the shape.
import cv2
import numpy as np

# get unique colors (to speed up search) and return the most abundant mask
def getAbundantColor(channel, margin):
    # get uniques
    unique_colors, counts = np.unique(channel, return_counts=True)

    # check for the most abundant color
    most = None
    biggest_count = -1
    for col in unique_colors:
        # count number of white pixels
        mask = cv2.inRange(channel, int(col - margin), int(col + margin))
        count = np.count_nonzero(mask)

        # if bigger, set new "most"
        if count > biggest_count:
            biggest_count = count
            most = mask
    return most, biggest_count

# load image
img = cv2.imread("blackboard.jpg")

# it's huge, scale down so that we can see the whole thing
h, w = img.shape[:2]
scale = 0.25
h = int(scale * h)
w = int(scale * w)
img = cv2.resize(img, (w, h))

# hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

# median blur to get rid of most of the text
h = cv2.medianBlur(h, 5)
s = cv2.medianBlur(s, 5)
v = cv2.medianBlur(v, 5)

# get most abundant color
color_margin = 30
hmask, hcount = getAbundantColor(h, color_margin)

# detect white and black separately
light_margin = 30

# white
wmask = cv2.inRange(s, 0, light_margin)
wcount = np.count_nonzero(wmask)

# black
bmask = cv2.inRange(v, 0, light_margin)
bcount = np.count_nonzero(bmask)

# check which is biggest
sorter = [[hcount, hmask], [wcount, wmask], [bcount, bmask]]
sorter.sort()
mask = sorter[-1][1]

# dilate and erode to close holes
kernel = np.ones((3, 3), np.uint8)
mask = cv2.dilate(mask, kernel, iterations=2)
mask = cv2.erode(mask, kernel, iterations=4)
mask = cv2.dilate(mask, kernel, iterations=2)

# get contours (OpenCV 3.4; in OpenCV 2.* or 4.* findContours returns (contours, _))
_, contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# for each contour, approximate a simpler shape until we have 4 points
simplified = []
for con in contours:
    # go until we have 4 points
    num_points = 999999
    step_size = 0.01
    percent = step_size
    while num_points >= 4:
        # get number of points
        epsilon = percent * cv2.arcLength(con, True)
        approx = cv2.approxPolyDP(con, epsilon, True)
        num_points = len(approx)

        # increment
        percent += step_size

    # step back and get the points
    # there could be more than 4 points if our step size misses it
    percent -= step_size * 2
    epsilon = percent * cv2.arcLength(con, True)
    approx = cv2.approxPolyDP(con, epsilon, True)
    simplified.append(approx)

cv2.drawContours(img, simplified, -1, (0, 0, 200), 2)

# print out the number of points
for points in simplified:
    print("Num Points: " + str(len(points)))

# show image
cv2.imshow("Image", img)
cv2.imshow("Hue", h)
cv2.imshow("Mask", mask)
cv2.waitKey(0)
Edit: In order to accommodate the uncertainty in the board's color and appearance, I rely on the assumption that the board itself will be the majority of the picture. The lines involving the sorter are looking for the most abundant color in the image. If the white wall behind the board takes up more space in the image, then that's the color that will get selected for the mask.
There are other ways to try and select just the board, but it's really difficult to come up with a catch-all solution; one alternative is sketched below. The rest of the code should do its job the same if you can come up with some way of masking the board. If you're willing to budge on the unknown color assumption and provide the original pictures of the failing cases, then I can probably come up with an appropriate mask.
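For what it's worth, here is one such alternative as a minimal, untested sketch: if you are willing to assume that the board covers the center of the frame, you can sample the hue around the image center and build the mask from that instead of from global abundance. The filename, patch size, and margin below are placeholders:

import cv2
import numpy as np

# minimal sketch, assuming the board covers the center of the frame;
# "blackboard.jpg", the patch size, and the margin are placeholders
img = cv2.imread("blackboard.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
hue = cv2.medianBlur(hsv[:, :, 0], 5)

# take the median hue of a small patch around the image center
cy, cx = hue.shape[0] // 2, hue.shape[1] // 2
center_hue = int(np.median(hue[cy - 20:cy + 20, cx - 20:cx + 20]))

# build the mask from the sampled hue instead of the most abundant one;
# the rest of the contour pipeline stays the same
color_margin = 30
mask = cv2.inRange(hue, max(center_hue - color_margin, 0), min(center_hue + color_margin, 180))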
I am developing an application to detect a lesion area; for this I am using GrabCut to detect the ROI and remove the background from the image. However, in some images it is not working well: it ends up not identifying the borders of the region of interest well. Watershed can identify the edges better for this type of work, but I am having difficulties making the transition from GrabCut to watershed. Before processing with GrabCut, the user uses a touch event to mark a rectangle around the area of interest (the wound area) to facilitate the work of the algorithm, as in the image below.
However, using other wound images, the segmentation is not good, showing flaws in the ROI detection.
Image using grabcut in app
Image using watershed in desktop
This is the code:
private fun extractForegroundFromBackground(coordinates: Coordinates, currentPhotoPath: String): String {
    // TODO: Provide complex object that has both path and extension
    val width = bitmap?.getWidth()!!
    val height = bitmap?.getHeight()!!

    val rgba = Mat()
    val gray_mat = Mat()
    val threeChannel = Mat()
    Utils.bitmapToMat(bitmap, gray_mat)
    cvtColor(gray_mat, rgba, COLOR_RGBA2RGB)
    cvtColor(rgba, threeChannel, COLOR_RGB2GRAY)
    threshold(threeChannel, threeChannel, 100.0, 255.0, THRESH_OTSU)

    val rect = Rect(coordinates.first, coordinates.second)
    val fg = Mat(rect.size(), CvType.CV_8U)
    erode(threeChannel, fg, Mat(), Point(-1.0, -1.0), 10)
    val bg = Mat(rect.size(), CvType.CV_8U)
    dilate(threeChannel, bg, Mat(), Point(-1.0, -1.0), 5)
    threshold(bg, bg, 1.0, 128.0, THRESH_BINARY_INV)

    val markers = Mat(rgba.size(), CvType.CV_8U, Scalar(0.0))
    Core.add(fg, bg, markers)

    val marker_tempo = Mat()
    markers.convertTo(marker_tempo, CvType.CV_32S)
    watershed(rgba, marker_tempo)
    marker_tempo.convertTo(markers, CvType.CV_8U)

    val imgBmpExit = Bitmap.createBitmap(width, height, Bitmap.Config.RGB_565)
    Utils.matToBitmap(markers, imgBmpExit)
    image.setImageBitmap(imgBmpExit)

    // Run the grab cut algorithm with a rectangle (for subsequent iterations with touch-up strokes,
    // flag should be Imgproc.GC_INIT_WITH_MASK)
    //Imgproc.grabCut(srcImage, firstMask, rect, bg, fg, iterations, Imgproc.GC_INIT_WITH_RECT)

    // Create a matrix of 0s and 1s, indicating whether individual pixels are equal
    // or different between "firstMask" and "source" objects
    // Result is stored back to "firstMask"
    //Core.compare(mark, source, mark, Core.CMP_EQ)

    // Create a matrix to represent the foreground, filled with white color
    val foreground = Mat(srcImage.size(), CvType.CV_8UC3, Scalar(255.0, 255.0, 255.0))

    // Copy the foreground matrix to the first mask
    srcImage.copyTo(foreground, mark)

    // Create a red color
    val color = Scalar(255.0, 0.0, 0.0, 255.0)
    // Draw a rectangle using the coordinates of the bounding box that surrounds the foreground
    rectangle(srcImage, coordinates.first, coordinates.second, color)

    // Create a new matrix to represent the background, filled with black color
    val background = Mat(srcImage.size(), CvType.CV_8UC3, Scalar(0.0, 0.0, 0.0))
    val mask = Mat(foreground.size(), CvType.CV_8UC1, Scalar(255.0, 255.0, 255.0))

    // Convert the foreground's color space from BGR to gray scale
    cvtColor(foreground, mask, Imgproc.COLOR_BGR2GRAY)

    // Separate out regions of the mask by comparing the pixel intensity with respect to a threshold value
    threshold(mask, mask, 254.0, 255.0, Imgproc.THRESH_BINARY_INV)

    // Create a matrix to hold the final image
    val dst = Mat()
    // copy the background matrix onto the matrix that represents the final result
    background.copyTo(dst)

    val vals = Mat(1, 1, CvType.CV_8UC3, Scalar(0.0))
    // Replace all 0 values in the background matrix given the foreground mask
    background.setTo(vals, mask)

    // Add the sum of the background and foreground matrices by applying the mask
    Core.add(background, foreground, dst, mask)

    // Save the final image to storage
    Imgcodecs.imwrite(currentPhotoPath + "_tmp.png", dst)

    // Clean up used resources
    firstMask.release()
    source.release()
    //bg.release()
    //fg.release()
    vals.release()
    dst.release()

    return currentPhotoPath
}
Output:
How do I update the code to use watershed instead of grabcut?
A description of how to apply the watershed algorithm in OpenCV is here, although it is in Python. The documentation also contains some potentially useful examples. Since you already have a binary image, all that's left is to apply the Euclidean Distance Transform (EDT) and the watershed function. So instead of Imgproc.grabCut(srcImage, firstMask, rect, bg, fg, iterations, Imgproc.GC_INIT_WITH_RECT), you would have:
Mat dist = new Mat();
Imgproc.distanceTransform(srcImage, dist, Imgproc.DIST_L2, Imgproc.DIST_MASK_3); // use L2 for Euclidean distance
// watershed expects an 8-bit 3-channel image, so normalize the EDT result and convert it first
Core.normalize(dist, dist, 0.0, 255.0, Core.NORM_MINMAX);
Mat dist8u = new Mat();
dist.convertTo(dist8u, CvType.CV_8U);
Mat dist3c = new Mat();
Imgproc.cvtColor(dist8u, dist3c, Imgproc.COLOR_GRAY2BGR);
Mat markers = Mat.zeros(dist.size(), CvType.CV_32S);
Imgproc.watershed(dist3c, markers); // apply watershed to the resultant image from the EDT
Mat mark = Mat.zeros(markers.size(), CvType.CV_8U);
markers.convertTo(mark, CvType.CV_8UC1);
Imgproc.threshold(mark, firstMask, 0, 255, Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU); // threshold results to get a binary image
The thresholding step is described here. Also, optionally, before you apply Imgproc.watershed, you may want to apply some morphological operations (i.e., dilation and erosion) to the result of the EDT:
Imgproc.dilate(dist, dist, Mat.ones(3, 3, CvType.CV_8U));
If you're not familiar with morphological operations when it comes to processing binary images, the OpenCV documentation contains some good, quick examples.
Hope this helps!
I was able to localize the content of the following image:
This is the current Java code:
Mat image = Imgcodecs.imread("test.png");
Mat gray = new Mat();
Imgproc.cvtColor(image, gray, Imgproc.COLOR_BGR2GRAY);
Core.absdiff(gray, new Scalar(255), gray);
Imgproc.threshold(gray, gray, 5, 255, Imgproc.THRESH_TOZERO);
Mat kernel1 = Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, new Size(11, 11));
Mat kernel2 = Mat.ones(3, 3, CvType.CV_8U);
Mat erosion = new Mat();
Imgproc.erode(gray, erosion, kernel2, new Point(-1, -1), 1);
Mat dilation = new Mat();
Imgproc.dilate(erosion, dilation, kernel1, new Point(-1, -1), 7);
final List<MatOfPoint> contours = new ArrayList<>();
final Mat hierarchy = new Mat();
Imgproc.findContours(dilation, contours, hierarchy,
        Imgproc.RETR_TREE, Imgproc.CHAIN_APPROX_SIMPLE);
for (MatOfPoint contour : contours) {
    RotatedRect rect = Imgproc.minAreaRect(new MatOfPoint2f(contour.toArray()));
    Mat box = new Mat();
    Imgproc.boxPoints(rect, box);
    Imgproc.drawContours(image, contours, -1, new Scalar(0, 0, 255));
}
This is the resulting image:
As you can see, along with the useful content there are still a few scanning artifacts, outlined with the red contours.
Is it possible to remove these scanning artifacts in some general way (one that will work not only for this picture) without damaging the content?
Also, how do I properly rotate the content inside this image (not the image itself) based on the contours?
This problem can be treated as a Text Detection situation.
We can use some static image analysis:
Convert to Grey Scale
Apply Blurring/Smoothing
Threshold Image
Apply Morphological Dilation
Find Connected Components
Filter out components of small area
--
Gaussian Blur
Thresholding
Inverted Colors
Dilation
Detected Areas (after filtering) UPDATED
--
System.load("opencv_java320.dll");
Mat dst = new Mat();
Mat src = Imgcodecs.imread("path/to/your/image.png");
// Converting to Grey Scale
Imgproc.cvtColor(src, dst, Imgproc.COLOR_RGB2GRAY, 0);
// Blurring/Smoothing
Imgproc.GaussianBlur(dst, src, new Size(15.0,15.0),0.0,0.0);
// Thresholding / Binarization
Imgproc.threshold(src, dst, 150,255,Imgproc.THRESH_BINARY);
Mat painted = new Mat(); // UPDATED
src.copyTo(painted); // UPDATED
// Invert colors (helps with dilation)
Core.bitwise_not(dst,src);
// Image Dilation
Mat structuringElement = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(55.0,55.0));
Imgproc.dilate(src, dst, structuringElement);
// Detect Text Areas
List<Rect> textBlocks = findTextBlocks(dst);
// Paint detected text areas
paintTextBlocks(textBlocks, painted);
static List<Rect> findTextBlocks(Mat dilated)
{
    Mat labels = new Mat();
    Mat stats = new Mat();
    Mat centroids = new Mat();
    // Find connected components
    int numberOfLabels = Imgproc.connectedComponentsWithStats(dilated, labels, stats, centroids, 8, CvType.CV_16U);
    List<Rect> textBlocks = new ArrayList<>();
    // adjust this threshold as you desire
    double sizeThreshold = 0.01;
    // Label 0 is considered to be the background label, so we skip it
    for (int i = 1; i < numberOfLabels; i++)
    {
        // stats columns [0-4]: [left, top, width, height, area]
        Rect textBlock = new Rect(new Point(stats.get(i, 0)[0], stats.get(i, 1)[0]),
                new Size(stats.get(i, 2)[0], stats.get(i, 3)[0]));
        // stats.get(i,4)[0] is the area of the connected component / filtering out small areas
        if (Double.compare(stats.get(i, 4)[0], dilated.height() * dilated.width() * sizeThreshold) > 0) {
            textBlocks.add(textBlock);
        }
    }
    return textBlocks;
}

static void paintTextBlocks(List<Rect> textBlocks, Mat original)
{
    for (Rect r : textBlocks)
    {
        Imgproc.rectangle(original, new Point(r.x, r.y), new Point(r.x + r.width, r.y + r.height),
                new Scalar(100.0), 2);
    }
}
You can tune/adjust the following:
1) The 3rd parameter of the Imgproc.threshold method. Looking at the code, it means that any pixel with a value higher than 150 will be replaced with 255 (white). Hence, increasing this number will result in fewer black/text pixels, while decreasing it will result in more black areas, e.g. artifacts.
2) The size of the dilation structuring element (rectangle). Width and height should be the same, and both odd numbers. Smaller dimensions of the structuring element mean weaker dilation and smaller connected components; larger dimensions mean wider dilation with bigger connected components.
3) sizeThreshold in the findTextBlocks() method. This variable controls the strength of the filtering of the connected components based on their size/area. A very small threshold will keep small areas, e.g. artifacts, while a very big threshold will keep only very big detected areas.
I have the following image:
I would like to detect the red rectangle using cv::inRange method and HSV color space.
int H_MIN = 0;
int H_MAX = 10;
int S_MIN = 70;
int S_MAX = 255;
int V_MIN = 50;
int V_MAX = 255;
cv::cvtColor( input, imageHSV, cv::COLOR_BGR2HSV );
cv::inRange( imageHSV, cv::Scalar( H_MIN, S_MIN, V_MIN ), cv::Scalar( H_MAX, S_MAX, V_MAX ), imgThreshold0 );
I already created dynamic trackbars in order to change the values for HSV, but I can't get the desired result.
Any suggestion for best values (and maybe filters) to use?
In OpenCV's HSV space, hue runs from 0 to 180, and red wraps around that boundary. So you need the H values to be both in [0, 10] and [170, 180].
Try this:
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    Mat3b bgr = imread("path_to_image");
    Mat3b hsv;
    cvtColor(bgr, hsv, COLOR_BGR2HSV);

    Mat1b mask1, mask2;
    inRange(hsv, Scalar(0, 70, 50), Scalar(10, 255, 255), mask1);
    inRange(hsv, Scalar(170, 70, 50), Scalar(180, 255, 255), mask2);

    Mat1b mask = mask1 | mask2;
    imshow("Mask", mask);
    waitKey();

    return 0;
}
Your previous result:
Result adding range [170, 180]:
Another interesting approach which needs to check a single range only is:
invert the BGR image
convert to HSV
look for cyan color
This idea has been proposed by fmw42 and kindly pointed out by Mark Setchell. Thank you very much for that.
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    Mat3b bgr = imread("path_to_image");
    Mat3b bgr_inv = ~bgr;
    Mat3b hsv_inv;
    cvtColor(bgr_inv, hsv_inv, COLOR_BGR2HSV);

    Mat1b mask;
    inRange(hsv_inv, Scalar(90 - 10, 70, 50), Scalar(90 + 10, 255, 255), mask); // Cyan is 90

    imshow("Mask", mask);
    waitKey();

    return 0;
}
While working with dominant colors such as red, blue, green, and yellow, analyzing the two color channels of the LAB color space keeps things simple. All you need to do is apply a suitable threshold on either of the two color channels.
1. Detecting Red color
Background:
The LAB color space represents:
the brightness value in the image in the primary channel (L-channel)
while colors are expressed in the two remaining channels:
the color variations between red and green are expressed in the secondary channel (A-channel)
the color variations between yellow and blue are expressed in the third channel (B-channel)
Code:
import cv2
img = cv2.imread('red.png')
# convert to LAB color space
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
# Perform Otsu threshold on the A-channel
th = cv2.threshold(lab[:,:,1], 127, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
Result:
I have placed the LAB-converted image and the thresholded image beside each other.
2. Detecting Blue color
Now let's see how to detect the blue color.
Sample image:
Since I am working with blue color:
Analyze the B-channel (since it expresses blue color better)
Perform inverse threshold to make the blue region appear white
(Note how the code below changes compared to the code above.)
Code:
import cv2
img = cv2.imread('blue.jpg')
# convert to LAB color space
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
# Perform inverse Otsu threshold on the B-channel
th = cv2.threshold(lab[:,:,2], 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Result:
Again, stacking the LAB and final image:
Conclusion:
Similar processing can be performed on green and yellow colors (a rough sketch follows below)
Moreover, segmenting a range of one of these dominant colors is also much simpler.
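As an untested sketch of that claim (the filenames are placeholders): green shows up as low values in the A-channel and yellow as high values in the B-channel, so the same Otsu thresholding applies with the channel and the threshold direction swapped:

import cv2

# green: low A-channel values, so use an inverse threshold on lab[:,:,1]
img = cv2.imread('green.png')  # placeholder filename
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
green_th = cv2.threshold(lab[:,:,1], 127, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# yellow: high B-channel values, so a plain threshold on lab[:,:,2]
img = cv2.imread('yellow.png')  # placeholder filename
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
yellow_th = cv2.threshold(lab[:,:,2], 127, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]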
I would like to create a Mat with a grid on a transparent background that I can lay on top of other Mats. I am struggling with the transparency part and with laying it on top.
Mat image = imread("pic.jpg");
Mat grid = new Mat(image.size(), CV_8UC4, new Scalar(0, 0, 0, 0));
for (//times)
// draw grid with: line(grid, ... )
grid.copyTo(image);
First of all, the grid Mat is not transparent at all; it is black. Isn't a Scalar constructed like this?
new Scalar(Blue, Green, Red, Alpha)
Also, how do I overlay an image with another one? This just overwrites it.
Here is a sample program written in C++, but it should be very analogous in Java:
cv::Mat input = cv::imread("../inputData/Lenna.png");
cv::Mat inputBGRA;
cv::cvtColor(input, inputBGRA, CV_BGR2BGRA);
cv::Mat gridSolid = cv::Mat(input.size(), inputBGRA.type(), cv::Scalar(0,0,0,0));
cv::Mat gridMask = cv::Mat(input.size(), CV_8UC1, cv::Scalar(0));
cv::Mat gridAlpha = cv::Mat(input.size(), inputBGRA.type(), cv::Scalar(0,0,0,0));
cv::line(gridSolid, cv::Point(0,0), cv::Point(512,512), cv::Scalar(0,255,0,255), 10);
cv::line(gridSolid, cv::Point(0,512), cv::Point(512,0), cv::Scalar(0,255,0,255), 10);
cv::line(gridMask, cv::Point(0,0), cv::Point(512,512), cv::Scalar(255), 10); // single channel
cv::line(gridMask, cv::Point(0,512), cv::Point(512,0), cv::Scalar(255), 10); // single channel
// copy and use the mask. copying eliminates the original values where the mask is set
cv::Mat outputCopy = inputBGRA.clone();
gridSolid.copyTo(outputCopy,gridMask);
// here set the scalar alpha value to less than 255
// both lines use different alpha values
cv::line(gridAlpha, cv::Point(0,0), cv::Point(512,512), cv::Scalar(0,255,0,120), 10);
cv::line(gridAlpha, cv::Point(0,512), cv::Point(512,0), cv::Scalar(0,255,0,180), 10);
cv::Mat outputWeightSum = inputBGRA.clone();
//cv::addWeighted(inputBGRA, 0.5, gridAlpha, 0.5, 0, outputWeightSum);
// manually add weighted sum PER ALPHA VALUE:
for (int y = 0; y < outputWeightSum.rows; ++y)
    for (int x = 0; x < outputWeightSum.cols; ++x)
    {
        // the bigger the alpha value, the less of the original image is kept at that pixel
        cv::Vec4b imgPix = outputWeightSum.at<cv::Vec4b>(y, x);
        cv::Vec4b gridPix = gridAlpha.at<cv::Vec4b>(y, x);

        // use alpha channel for blending
        float blendpart = (float)gridPix[3] / (float)255;

        // set pixel value to blended value
        outputWeightSum.at<cv::Vec4b>(y, x) = blendpart * gridPix + (1.0f - blendpart) * imgPix;
    }
In fact, you don't need the alpha channel in this example, but if you have more complex "grids" with differing alpha values, it might be nice.
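For what it's worth, the same per-pixel alpha blend can also be written in a vectorized way. Here is a minimal, untested NumPy sketch; the filenames are placeholders, and the grid image is assumed to be BGRA:

import cv2
import numpy as np

# minimal sketch of the same per-pixel alpha blend, vectorized with NumPy
img = cv2.imread("Lenna.png").astype(np.float32)                        # BGR background
grid = cv2.imread("grid.png", cv2.IMREAD_UNCHANGED).astype(np.float32)  # BGRA overlay

# the bigger the alpha value, the less of the original image is kept at that pixel
alpha = grid[:, :, 3:4] / 255.0
blended = (alpha * grid[:, :, :3] + (1.0 - alpha) * img).astype(np.uint8)

cv2.imshow("Blended", blended)
cv2.waitKey(0)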
I get these results:
method: copy:
method: blend with alpha channel: