How to extract width and height of contour in javacv? - java

I am developing a project on component identification using the javacv package (OpenCV). I use a method that returns a set of rectangles on the image as a CvSeq. What I need to know is how to do the following:
How can I get each rectangle from the method's output (from the CvSeq)?
How can I access the length and width of each rectangle?
This is the method that returns the rectangles:
public static CvSeq findSquares(final IplImage src, CvMemStorage storage)
{
    CvSeq squares = new CvContour();
    squares = cvCreateSeq(0, sizeof(CvContour.class), sizeof(CvSeq.class), storage);

    IplImage pyr = null, timg = null, gray = null, tgray;
    timg = cvCloneImage(src);

    CvSize sz = cvSize(src.width() & -2, src.height() & -2);
    tgray = cvCreateImage(sz, src.depth(), 1);
    gray = cvCreateImage(sz, src.depth(), 1);
    pyr = cvCreateImage(cvSize(sz.width()/2, sz.height()/2), src.depth(), src.nChannels());

    // down-scale and upscale the image to filter out the noise
    cvPyrDown(timg, pyr, CV_GAUSSIAN_5x5);
    cvPyrUp(pyr, timg, CV_GAUSSIAN_5x5);
    cvSaveImage("ha.jpg", timg);

    CvSeq contours = new CvContour();
    // request closing of the application when the image window is closed
    // show image on window
    // find squares in every color plane of the image
    for (int c = 0; c < 3; c++)
    {
        IplImage channels[] = {cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1)};
        channels[c] = cvCreateImage(sz, 8, 1);
        if (src.nChannels() > 1) {
            cvSplit(timg, channels[0], channels[1], channels[2], null);
        } else {
            tgray = cvCloneImage(timg);
        }
        tgray = channels[c];

        // try several threshold levels
        for (int l = 0; l < N; l++)
        {
            // hack: use Canny instead of zero threshold level.
            // Canny helps to catch squares with gradient shading
            if (l == 0)
            {
                // apply Canny. Take the upper threshold from slider
                // and set the lower to 0 (which forces edges merging)
                cvCanny(tgray, gray, 0, thresh, 5);
                // dilate canny output to remove potential
                // holes between edge segments
                cvDilate(gray, gray, null, 1);
            }
            else
            {
                // apply threshold if l != 0:
                cvThreshold(tgray, gray, (l+1)*255/N, 255, CV_THRESH_BINARY);
            }

            // find contours and store them all as a list
            cvFindContours(gray, storage, contours, sizeof(CvContour.class), CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

            CvSeq approx;
            // test each contour
            while (contours != null && !contours.isNull()) {
                if (contours.elem_size() > 0) {
                    approx = cvApproxPoly(contours, Loader.sizeof(CvContour.class), storage, CV_POLY_APPROX_DP, cvContourPerimeter(contours)*0.02, 0);
                    if (approx.total() == 4
                            && Math.abs(cvContourArea(approx, CV_WHOLE_SEQ, 0)) > 1000
                            && cvCheckContourConvexity(approx) != 0) {
                        double maxCosine = 0;
                        for (int j = 2; j < 5; j++)
                        {
                            // find the maximum cosine of the angle between joint edges
                            double cosine = Math.abs(angle(new CvPoint(cvGetSeqElem(approx, j%4)), new CvPoint(cvGetSeqElem(approx, j-2)), new CvPoint(cvGetSeqElem(approx, j-1))));
                            maxCosine = Math.max(maxCosine, cosine);
                        }
                        if (maxCosine < 0.2) {
                            cvSeqPush(squares, approx);
                        }
                    }
                }
                contours = contours.h_next();
            }
            contours = new CvContour();
        }
    }
    return squares;
}
This is the sample original image that I used.
And this is the image that I got after drawing lines around the matching rectangles.
Actually, in the above images I'm trying to remove those large rectangles and just need to identify the other rectangles, so I need some code example to understand how to achieve the above goals. Please be kind enough to share your experience with me. Thanks!

OpenCV finds contours of white objects on a black background. In your case it is the reverse: the objects are black. That way, even the image border is treated as an object. To avoid that, just invert the image so that the background is black.
Below I have demonstrated it (using OpenCV-Python):
import numpy as np
import cv2
im = cv2.imread('sofsqr.png')
img = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(img,127,255,1)
Remember, instead of using a separate function for inverting, I did it in the threshold call: just set the threshold type to BINARY_INV (i.e. '1').
Now you have an image as below:
Now we find the contours. Then for each contour, we approximate it and check whether it is a rectangle by looking at the length of the approximated contour, which should be four for a rectangle.
If drawn, you get this:
At the same time, we also find the bounding rect of each contour. The bounding rect has the form [initial point x, initial point y, width of rect, height of rect].
So you get the width and height.
Below is the code:
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:
    approx = cv2.approxPolyDP(cnt,cv2.arcLength(cnt,True)*0.02,True)
    if len(approx)==4:
        cv2.drawContours(im,[approx],0,(0,0,255),2)
        x,y,w,h = cv2.boundingRect(cnt)
EDIT:
After some comments, I understood that the real aim of the question is to avoid the large rectangles and select only the smaller ones.
It can be done using the bounding rect values we obtained, i.e. select only those rectangles whose width, height or area is below a threshold value. As an example, in this image I required the area to be less than 10000 (a rough estimate). If it is less than 10000, the rectangle is selected and drawn in red; otherwise it is a false candidate, drawn in blue (just for visualization).
for cnt in contours:
    approx = cv2.approxPolyDP(cnt,cv2.arcLength(cnt,True)*0.02,True)
    if len(approx)==4:
        x,y,w,h = cv2.boundingRect(approx)
        if w*h < 10000:
            cv2.drawContours(im,[approx],0,(0,0,255),-1)
        else:
            cv2.drawContours(im,[approx],0,(255,0,0),-1)
Below is the output I got:
How do you get that threshold value?
It completely depends on you and your application. You can also find it by trial and error (I did so).
Hope that solves your problem. All the functions are standard OpenCV functions, so I think you won't have any problem converting this to JavaCV.
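For reference, a rough JavaCV translation of the same bounding-rect filtering might look like this (an untested sketch; it assumes the squares CvSeq returned by the findSquares() method in the question, and the 10000 area threshold is the same rough estimate as above):
// Sketch only: iterate the CvSeq returned by findSquares() and keep the smaller rectangles.
CvMemStorage storage = CvMemStorage.create();
CvSeq squares = findSquares(src, storage);

for (int i = 0; i < squares.total(); i++) {
    // each element pushed with cvSeqPush(squares, approx) is itself a CvSeq of points
    CvSeq poly = new CvSeq(cvGetSeqElem(squares, i));
    CvRect rect = cvBoundingRect(poly, 0);
    int width = rect.width();
    int height = rect.height();
    if (width * height < 10000) { // rough area threshold, tune for your images
        System.out.println("Small rectangle: " + width + " x " + height);
    }
}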

Just noticed that there is a bug in the code provided in the question:
IplImage channels[] = {cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1)};
channels[c] = cvCreateImage(sz, 8, 1);
if (src.nChannels() > 1) {
    cvSplit(timg, channels[0], channels[1], channels[2], null);
} else {
    tgray = cvCloneImage(timg);
}
tgray = channels[c];
This means that if there is only a single channel, tgray will end up being an empty image.
It should read:
IplImage channels[] = {cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1)};
channels[c] = cvCreateImage(sz, 8, 1);
if (src.nChannels() > 1) {
    cvSplit(timg, channels[0], channels[1], channels[2], null);
    tgray = channels[c];
} else {
    tgray = cvCloneImage(timg);
}

Related

Android semantic segmentation post-processing is too slow

I'd really appreciate it if anyone could advise on a task I've been working on without success for the last week.
I have a semantic segmentation model (MobileNetV3 + Lightweight ASPP). Short info: input is 1024x1024, output is the same size with 2 classes (bg and vehicle), so my output shape is (1, 1048576, 2). I'm not a mobile dev or a Java-world guy, so I used a few complete Android examples for image segmentation to test it:
the one from Google: https://github.com/tensorflow/examples/tree/master/lite/examples/image_segmentation
and another open-source one: https://github.com/pillarpond/image-segmenter-android
I successfully converted it to tflite format, and its inference time on a OnePlus 7 with GPU enabled and 10 threads is between 105-140ms for that size. But here I run into a problem: the overall execution time in these two Android examples, or any you can find for semantic segmentation, is about 1050-1300ms (which is less than 1 FPS). The slowest part of this pipeline is the image post-processing (~900-1150ms). You can see that part in the Deeplab#segment method. Since I have only 1 class besides bg, I don't have that third loop, but everything else is untouched and still very slow. The output size is not small compared to other common mobile sizes like 128/226/512, but still, I don't think it should take so much time to process a 1024x1024 matrix and draw rectangles on a canvas on a modern smartphone.
I tried different solutions, like splitting the matrix manipulations into threads, or creating all of these objects like RectF and Recognition once up front and just filling their attributes with new data inside the nested loops, but I didn't succeed with either of them. On the desktop side I easily handle it with numpy and OpenCV, and I'm not even close to understanding how I can do the same in Android, or whether it would even be efficient.
Here's the code I use in Python:
import cv2
import numpy as np
import tensorflow as tf

CLASS_COLORS = [(0, 0, 0), (255, 255, 255)]  # black for bg and white for mask

def get_image_array(image_input, width, height):
    img = cv2.imread(image_input, 1)
    img = cv2.resize(img, (width, height))
    img = img.astype(np.float32)
    img[:, :, 0] -= 128.0
    img[:, :, 1] -= 128.0
    img[:, :, 2] -= 128.0
    img = img[:, :, ::-1]
    return img

def get_segmentation_array(seg_arr, n_classes):
    output_height = seg_arr.shape[0]
    output_width = seg_arr.shape[1]
    seg_img = np.zeros((output_height, output_width, 3))
    for c in range(n_classes):
        seg_arr_c = seg_arr[:, :] == c
        seg_img[:, :, 0] += ((seg_arr_c)*(CLASS_COLORS[c][0])).astype('uint8')
        seg_img[:, :, 1] += ((seg_arr_c)*(CLASS_COLORS[c][1])).astype('uint8')
        seg_img[:, :, 2] += ((seg_arr_c)*(CLASS_COLORS[c][2])).astype('uint8')
    return seg_img

interpreter = tf.lite.Interpreter(model_path="my_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

img_arr = get_image_array("input.png", 1024, 1024)
interpreter.set_tensor(input_details[0]['index'], np.array([img_arr]))
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
output = output.reshape((1024, 1024, 2)).argmax(axis=2)

seg_img = get_segmentation_array(output, 2)
cv2.imwrite("output.png", seg_img)
Maybe there's something more powerful than the current solution for post-processing.
I would really appreciate any help with this. I'm sure there's something that can improve the post-processing and reduce its time to ~100ms, so I would have ~5 FPS overall.
New update: thanks to Farmaker, I used a piece of code found in his repo from the comment above, and now the pipeline looks like this:
int channels = 3;
int n_classes = 2;
int float_byte_size = 4;
int width = model.inputWidth;
int height = model.inputHeight;

int[] intValues = new int[width * height];
ByteBuffer inputBuffer = ByteBuffer.allocateDirect(width * height * channels * float_byte_size).order(ByteOrder.nativeOrder());
ByteBuffer outputBuffer = ByteBuffer.allocateDirect(width * height * n_classes * float_byte_size).order(ByteOrder.nativeOrder());

Bitmap input = textureView.getBitmap(width, height);
input.getPixels(intValues, 0, width, 0, 0, height, height);

inputBuffer.rewind();
outputBuffer.rewind();
for (final int value : intValues) {
    inputBuffer.putFloat(((value >> 16 & 0xff) - 128.0f) / 1.0f);
    inputBuffer.putFloat(((value >> 8 & 0xff) - 128.0f) / 1.0f);
    inputBuffer.putFloat(((value & 0xff) - 128.0f) / 1.0f);
}

tfLite.run(inputBuffer, outputBuffer);

final Bitmap output = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
outputBuffer.flip();

int[] pixels = new int[width * height];
for (int i = 0; i < width * height; i++) {
    float max = outputBuffer.getFloat();
    float val = outputBuffer.getFloat();
    int id = val > max ? 1 : 0;
    pixels[i] = id == 0 ? 0x00000000 : 0x990000ff;
}
output.setPixels(pixels, 0, width, 0, 0, width, height);

resultView.setImageBitmap(resizeBitmap(output, resultView.getWidth(), resultView.getHeight()));

public static Bitmap resizeBitmap(Bitmap bm, int newWidth, int newHeight) {
    int width = bm.getWidth();
    int height = bm.getHeight();
    float scaleWidth = ((float) newWidth) / width;
    float scaleHeight = ((float) newHeight) / height;
    // CREATE A MATRIX FOR THE MANIPULATION
    Matrix matrix = new Matrix();
    // RESIZE THE BIT MAP
    matrix.postScale(scaleWidth, scaleHeight);
    // "RECREATE" THE NEW BITMAP
    Bitmap resizedBitmap = Bitmap.createBitmap(bm, 0, 0, width, height, matrix, false);
    bm.recycle();
    return resizedBitmap;
}
Right now the post-processing time is ~70-130ms, with the 95th percentile around 90ms. Alongside ~60ms of image pre-processing, ~140ms inference time and around 30-40ms for other stuff, with GPU enabled and 10 threads, that gives an overall execution time of around 330ms, which is 3 FPS! And this is for a large model at 1024x1024.
At this point, I'm more than satisfied and want to try different configurations for my model, including MobileNetV3 small as a backbone.

OpenCV - Closing contours (Java)

I'm currently trying to close the contours on the right of this picture:
Sample.
The reason for the open contour lies in kabeja, a library that converts DXF files to images. It seems that on some images it doesn't convert the last pixel column (or row), and that's why the sample picture is open.
I had the idea to use Core.copyMakeBorder() in OpenCV to add some space to the picture. After that I tried to use Imgproc.approxPolyDP() to close the contour, but this doesn't work. I tried it with different epsilon values: Pics. EDIT: Can't post more than 2 links.
The reason for that is maybe that the contour surrounds the line; it never closes the contour where I want it to.
I tried another method using Imgproc.convexHull(), which delivers this one: ConvexHull.
This could be useful for me, but I have no idea how to take out the part of the convex hull I need and merge it with the contour to close it.
I hope that someone has an idea.
Here is my method for Imgproc.approxPolyDP():
public static ArrayList<MatOfPoint> makeComplete(Mat mat) {
    System.out.println("makeComplete: START");
    Mat dst = new Mat();
    Core.copyMakeBorder(mat, dst, 10, 10, 10, 10, Core.BORDER_CONSTANT);
    ArrayList<MatOfPoint> cnts = Tools.getContours(dst);
    ArrayList<MatOfPoint2f> opened = new ArrayList<>();
    // convert to MatOfPoint2f to use approxPolyDP
    for (MatOfPoint m : cnts) {
        MatOfPoint2f temp = new MatOfPoint2f(m.toArray());
        opened.add(temp);
        System.out.println("First loop runs");
    }
    ArrayList<MatOfPoint> closed = new ArrayList<>();
    for (MatOfPoint2f conts : opened) {
        MatOfPoint2f temp = new MatOfPoint2f();
        Imgproc.approxPolyDP(conts, temp, 3, true);
        MatOfPoint closedTemp = new MatOfPoint(temp.toArray());
        closed.add(closedTemp);
        System.out.println("Second loop runs");
    }
    System.out.println("makeComplete: END");
    return closed;
}
And here is the code for Imgproc.convexHull():
public static ArrayList<MatOfPoint> getConvexHull(Mat mat) {
    Mat dst = new Mat();
    Core.copyMakeBorder(mat, dst, 10, 10, 10, 10, Core.BORDER_CONSTANT);
    ArrayList<MatOfPoint> cnts = Tools.getContours(dst);
    ArrayList<MatOfPoint> out = new ArrayList<MatOfPoint>();
    MatOfPoint mopIn = cnts.get(0);
    MatOfInt hull = new MatOfInt();
    Imgproc.convexHull(mopIn, hull, false);
    MatOfPoint mopOut = new MatOfPoint();
    mopOut.create((int) hull.size().height, 1, CvType.CV_32SC2);
    for (int i = 0; i < hull.size().height; i++) {
        int index = (int) hull.get(i, 0)[0];
        double[] point = new double[]{
                mopIn.get(index, 0)[0], mopIn.get(index, 0)[1]
        };
        mopOut.put(i, 0, point);
    }
    out.add(mopOut);
    return out;
}
Best regards,
Brk
Assuming the assumption is correct, i.e. that the last row isn't converted and is missing (it is similar for a column), then try the following. Assume x goes from left to right and y from top to bottom. We add one row of empty (white?) pixels at the bottom of the image and then go from left to right. Below is pseudocode:
// EMPTY - value of background, e.g. white for the sample image
PixelType curPixel = EMPTY;
int y = height - 1; // last row, the one we added
for (int x = 0; x < width; ++x)
{
    // img(y, x)   - current pixel, is "empty"
    // img(y-1, x) - pixel above the current
    if (img(y-1, x) != img(y, x))
    {
        // pixel above isn't empty, so we make the current pixel non-empty
        img(y, x) = img(y-1, x);
        // if we were drawing, then stop, otherwise - start
        if (curPixel == EMPTY)
            curPixel = img(y-1, x);
        else
            curPixel = EMPTY;
    }
    else
    {
        img(y, x) = curPixel;
    }
}
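A rough Java/OpenCV translation of that pseudocode (my own untested sketch, assuming an 8-bit single-channel Mat whose background is white, i.e. 255, as in the sample image):
// Sketch: append one background row with copyMakeBorder, then scan it from left
// to right, copying down contour pixels and filling the gaps between them.
Mat bordered = new Mat();
Core.copyMakeBorder(src, bordered, 0, 1, 0, 0, Core.BORDER_CONSTANT, new Scalar(255));

double empty = 255;            // background value (white)
double curPixel = empty;       // are we currently "drawing"?
int y = bordered.rows() - 1;   // the row we just added

for (int x = 0; x < bordered.cols(); x++) {
    double above = bordered.get(y - 1, x)[0];
    double here = bordered.get(y, x)[0];
    if (above != here) {
        // pixel above isn't empty, so make the current pixel non-empty
        bordered.put(y, x, above);
        // if we were drawing, stop; otherwise start
        curPixel = (curPixel == empty) ? above : empty;
    } else {
        bordered.put(y, x, curPixel);
    }
}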

How to create BufferedImage for 32 bits per sample, 3 samples image data

I am trying to create a BufferedImage from some image data which is a byte array. The image is RGB format with 3 samples per pixel - R, G, and B - and 32 bits per sample (for each sample, not all 3 samples together).
Now I want to create a BufferedImage from this byte array. This is what I have done:
ColorModel cm = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_sRGB), new int[] {32, 32, 32}, false, false, Transparency.OPAQUE, DataBuffer.TYPE_INT);
Object tempArray = ArrayUtils.toNBits(bitsPerSample, pixels, samplesPerPixel*imageWidth, endian == IOUtils.BIG_ENDIAN);
WritableRaster raster = cm.createCompatibleWritableRaster(imageWidth, imageHeight);
raster.setDataElements(0, 0, imageWidth, imageHeight, tempArray);
BufferedImage bi = new BufferedImage(cm, raster, false, null);
The above code works with 24-bits-per-sample RGB images but not with 32 bits per sample. The generated image is garbage, shown on the right of the image below. It is supposed to look like the left side of the image.
Note: the only image reader on my machine which can read this image is ImageMagick. All the others produce results similar to the garbage one on the right of the following image.
ArrayUtils.toNBits() just translates the byte array to an int array with the correct endianness. I'm sure this part is correct, as I have cross-checked it with other methods that generate the same int array.
I guess the problem might arise from the fact that I am using the full 32 bits of an int to represent each colour sample, which would produce negative values. It looks like I need a long data type, but there is no DataBuffer type for long.
Instances of ComponentColorModel created with transfer types
DataBuffer.TYPE_BYTE, DataBuffer.TYPE_USHORT, and DataBuffer.TYPE_INT
have pixel sample values which are treated as unsigned integral
values.
The above quote is from the Java documentation for ComponentColorModel. It means the 32-bit samples do get treated as unsigned integral values, so the problem could be somewhere else.
Has anybody met a similar problem and got a workaround, or have I done something wrong here?
Update 2: The "real" problem lies in the fact that when a 32-bit sample is used, the algorithm in ComponentColorModel shifts 1 to the left 0 times (1<<0), since the shift distance for an int is always kept within 0~31 inclusive, so 1<<32 becomes 1<<0. This is not the expected value. To solve this problem (actually shift left 32 times), the only thing that needs to be done is to change the 1 from an int to a long, i.e. 1L, as shown in the fix below.
Update: from the answer by HaraldK and the comments, we finally agreed that the problem comes from Java's ComponentColorModel, which does not handle 32-bit samples correctly. The fix proposed by HaraldK works for my case too. The following is my version:
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;

public class Int32ComponentColorModel extends ComponentColorModel {

    public Int32ComponentColorModel(ColorSpace cs, boolean alpha) {
        super(cs, alpha, false, alpha ? Transparency.TRANSLUCENT : Transparency.OPAQUE, DataBuffer.TYPE_INT);
    }

    @Override
    public float[] getNormalizedComponents(Object pixel, float[] normComponents, int normOffset) {
        int numComponents = getNumComponents();
        if (normComponents == null || normComponents.length < numComponents + normOffset) {
            normComponents = new float[numComponents + normOffset];
        }
        switch (transferType) {
            case DataBuffer.TYPE_INT:
                int[] ipixel = (int[]) pixel;
                for (int c = 0, nc = normOffset; c < numComponents; c++, nc++) {
                    normComponents[nc] = ipixel[c] / ((float) ((1L << getComponentSize(c)) - 1));
                }
                break;
            default: // I don't think we can ever come this far. Just in case!!!
                throw new UnsupportedOperationException("This method has not been implemented for transferType " + transferType);
        }
        return normComponents;
    }
}
Update:
This seems to be a known bug: ComponentColorModel.getNormalizedComponents() does not handle 32-bit TYPE_INT, reported 10 (TEN!) years ago, against Java 5.
The upside: Java is now partly open-sourced. We can now propose a patch, and with some luck it will be evaluated for Java 9 or so... :-P
The bug report proposes the following workaround:
Subclass ComponentColorModel and override getNormalizedComponents() to properly handle 32 bit per sample TYPE_INT data by dividing the incoming pixel value by 'Math.pow(2, 32) - 1' when dealing with this data, rather than using the erroneous bit shift. (Using a floating point value is ok, since getNormalizedComponents() converts everything to floating point anyway).
My fix is a little different, but the basic idea is the same (feel free to optimize as you see fit :-)):
private static class TypeIntComponentColorModel extends ComponentColorModel {
    public TypeIntComponentColorModel(final ColorSpace cs, final boolean alpha) {
        super(cs, alpha, false, alpha ? TRANSLUCENT : OPAQUE, DataBuffer.TYPE_INT);
    }

    @Override
    public float[] getNormalizedComponents(Object pixel, float[] normComponents, int normOffset) {
        int numComponents = getNumComponents();
        if (normComponents == null) {
            normComponents = new float[numComponents + normOffset];
        }
        switch (transferType) {
            case DataBuffer.TYPE_INT:
                int[] ipixel = (int[]) pixel;
                for (int c = 0, nc = normOffset; c < numComponents; c++, nc++) {
                    normComponents[nc] = ((float) (ipixel[c] & 0xffffffffl)) / ((float) ((1l << getComponentSize(c)) - 1));
                }
                break;
            default:
                throw new UnsupportedOperationException("This method has not been implemented for transferType " + transferType);
        }
        return normComponents;
    }
}
Consider the code below. If run as is, for me it displays a mostly black image, with the upper right quarter white, overlaid with a black circle. If I change the data type to TYPE_USHORT (uncomment the transferType line), it displays half/half white and a linear gradient from black to white, with an orange circle in the middle (as it should).
Using ColorConvertOp to convert to a standard type seems to make no difference.
public class Int32Image {
    public static void main(String[] args) {
        // Define dimensions and layout of the image
        int w = 300;
        int h = 200;
        int transferType = DataBuffer.TYPE_INT;
        // int transferType = DataBuffer.TYPE_USHORT;

        ColorModel colorModel = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_sRGB), false, false, Transparency.OPAQUE, transferType);
        WritableRaster raster = colorModel.createCompatibleWritableRaster(w, h);
        BufferedImage image = new BufferedImage(colorModel, raster, false, null);

        // Start with linear gradient
        if (raster.getTransferType() == DataBuffer.TYPE_INT) {
            DataBufferInt buffer = (DataBufferInt) raster.getDataBuffer();
            int[] data = buffer.getData();
            for (int y = 0; y < h; y++) {
                int value = (int) (y * 0xffffffffL / h);
                for (int x = 0; x < w; x++) {
                    int offset = y * w * 3 + x * 3;
                    data[offset] = value;
                    data[offset + 1] = value;
                    data[offset + 2] = value;
                }
            }
        }
        else if (raster.getTransferType() == DataBuffer.TYPE_USHORT) {
            DataBufferUShort buffer = (DataBufferUShort) raster.getDataBuffer();
            short[] data = buffer.getData();
            for (int y = 0; y < h; y++) {
                short value = (short) (y * 0xffffL / h);
                for (int x = 0; x < w; x++) {
                    int offset = y * w * 3 + x * 3;
                    data[offset] = value;
                    data[offset + 1] = value;
                    data[offset + 2] = value;
                }
            }
        }

        // Paint something (in color)
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, w / 2, h);
        g.setColor(Color.ORANGE);
        g.fillOval(100, 50, w - 200, h - 100);
        g.dispose();

        System.out.println("image = " + image);

        // image = new ColorConvertOp(null).filter(image, new BufferedImage(image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_ARGB));

        JFrame frame = new JFrame();
        frame.add(new JLabel(new ImageIcon(image)));
        frame.pack();
        frame.setLocationRelativeTo(null);
        frame.setVisible(true);
    }
}
To me, this seems to suggest that there's something wrong with the ColorModel using transferType TYPE_INT. But I'd be happy to be wrong. ;-)
Another thing you could try is to scale the values down to 16 bit, use a TYPE_USHORT raster and color model, and see if that makes a difference; a rough sketch of what I mean is below. I bet it will, but I'm too lazy to try. ;-)
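Something along these lines is what I mean (a quick untested sketch; samples32, imageWidth and imageHeight stand for the sample data and dimensions from the question):
// Sketch: keep only the high 16 bits of each 32-bit sample and build a TYPE_USHORT image.
short[] samples16 = new short[samples32.length];
for (int i = 0; i < samples32.length; i++) {
    samples16[i] = (short) ((samples32[i] & 0xFFFFFFFFL) >>> 16); // unsigned 32 -> 16 bit
}

ColorModel cm = new ComponentColorModel(
        ColorSpace.getInstance(ColorSpace.CS_sRGB),
        new int[] {16, 16, 16}, false, false,
        Transparency.OPAQUE, DataBuffer.TYPE_USHORT);
WritableRaster raster = cm.createCompatibleWritableRaster(imageWidth, imageHeight);
raster.setDataElements(0, 0, imageWidth, imageHeight, samples16);
BufferedImage image = new BufferedImage(cm, raster, false, null);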

Java OpenCV + Tesseract OCR "code" recognition

I'm trying to automate a process where someone manually converts a code to a digital one.
Then I started reading about OCR. So I installed Tesseract OCR and tried it on some images. It doesn't even detect something close to the code.
I figured out, after reading some questions on Stack Overflow, that the images need some preprocessing, like deskewing the image to a horizontal one, which can be done with OpenCV, for example.
Now my questions are:
What kind of preprocessing or other methods should be used in a case like the above image?
Secondly, can I rely on the output? Will it always work in cases like the above image?
I hope someone can help me!
I have decided to capture the whole card instead of only the code. By capturing the whole card, it is possible to transform it to a plain perspective, and then I can easily get the "code" region.
I also learned a lot of things, especially regarding speed. This function is slow on high-resolution images: it can take up to 10 seconds at a size of 3264 x 1836.
What I did to speed things up is resize the input matrix by a factor of 1/4. This makes it 4^2 times faster and gave me a minimal loss of precision. The next step is scaling the quadrangle we found back to the normal size, so that we can transform the quadrangle to a plain perspective using the original source (a rough sketch of this step follows the find() method below).
The code I created for detecting the largest area is heavily based on code I found on Stack Overflow. Unfortunately it didn't work as expected for me, so I combined several code snippets and modified a lot.
This is what I got:
private static double angle(Point p1, Point p2, Point p0) {
    double dx1 = p1.x - p0.x;
    double dy1 = p1.y - p0.y;
    double dx2 = p2.x - p0.x;
    double dy2 = p2.y - p0.y;
    return (dx1 * dx2 + dy1 * dy2) / Math.sqrt((dx1 * dx1 + dy1 * dy1) * (dx2 * dx2 + dy2 * dy2) + 1e-10);
}

private static MatOfPoint find(Mat src) throws Exception {
    Mat blurred = src.clone();
    Imgproc.medianBlur(src, blurred, 9);

    Mat gray0 = new Mat(blurred.size(), CvType.CV_8U), gray = new Mat();
    List<MatOfPoint> contours = new ArrayList<>();

    List<Mat> blurredChannel = new ArrayList<>();
    blurredChannel.add(blurred);
    List<Mat> gray0Channel = new ArrayList<>();
    gray0Channel.add(gray0);

    MatOfPoint2f approxCurve;

    double maxArea = 0;
    int maxId = -1;

    for (int c = 0; c < 3; c++) {
        int ch[] = {c, 0};
        Core.mixChannels(blurredChannel, gray0Channel, new MatOfInt(ch));

        int thresholdLevel = 1;
        for (int t = 0; t < thresholdLevel; t++) {
            if (t == 0) {
                Imgproc.Canny(gray0, gray, 10, 20, 3, true); // true ?
                Imgproc.dilate(gray, gray, new Mat(), new Point(-1, -1), 1); // 1 ?
            } else {
                Imgproc.adaptiveThreshold(gray0, gray, thresholdLevel, Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C, Imgproc.THRESH_BINARY, (src.width() + src.height()) / 200, t);
            }

            Imgproc.findContours(gray, contours, new Mat(), Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

            for (MatOfPoint contour : contours) {
                MatOfPoint2f temp = new MatOfPoint2f(contour.toArray());
                double area = Imgproc.contourArea(contour);
                approxCurve = new MatOfPoint2f();
                Imgproc.approxPolyDP(temp, approxCurve, Imgproc.arcLength(temp, true) * 0.02, true);

                if (approxCurve.total() == 4 && area >= maxArea) {
                    double maxCosine = 0;
                    List<Point> curves = approxCurve.toList();
                    for (int j = 2; j < 5; j++) {
                        double cosine = Math.abs(angle(curves.get(j % 4), curves.get(j - 2), curves.get(j - 1)));
                        maxCosine = Math.max(maxCosine, cosine);
                    }
                    if (maxCosine < 0.3) {
                        maxArea = area;
                        maxId = contours.indexOf(contour);
                        //contours.set(maxId, getHull(contour));
                    }
                }
            }
        }
    }

    if (maxId >= 0) {
        return contours.get(maxId);
        //Imgproc.drawContours(src, contours, maxId, new Scalar(255, 0, 0, .8), 8);
    }

    return null;
}
You can call it like this:
MatOfPoint contour = find(src);
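To get the speed-up described above, I call it on a downscaled copy and scale the resulting quadrangle back up afterwards. Roughly like this (an untested sketch, using the 1/4 factor mentioned earlier):
// Sketch: run find() on a downscaled copy, then scale the quadrangle back up
// so the perspective transform can be done on the original image.
double scale = 0.25; // the 1/4 factor mentioned above

Mat small = new Mat();
Imgproc.resize(src, small, new Size(), scale, scale, Imgproc.INTER_AREA);

MatOfPoint contour = find(small);
if (contour != null) {
    Point[] points = contour.toArray();
    for (Point p : points) { // scale the corner points back to the original resolution
        p.x /= scale;
        p.y /= scale;
    }
    MatOfPoint fullSizeQuad = new MatOfPoint(points);
    // ... continue with the perspective transform on 'src' using fullSizeQuad
}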
See this answer for quadrangle detection from a contour and transforming it to a plain perspective:
Java OpenCV deskewing a contour
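For completeness, the perspective step itself could look roughly like this (an untested sketch; it assumes the scaled-back quadrangle from the sketch above, with its four corners already ordered top-left, top-right, bottom-right, bottom-left, and a made-up target size for the card):
// Sketch: warp the detected quadrangle to a flat, axis-aligned card image.
Point[] corners = fullSizeQuad.toArray(); // ordered corners from the step above
int outWidth = 800, outHeight = 500;      // made-up target size

MatOfPoint2f srcQuad = new MatOfPoint2f(corners);
MatOfPoint2f dstQuad = new MatOfPoint2f(
        new Point(0, 0),
        new Point(outWidth - 1, 0),
        new Point(outWidth - 1, outHeight - 1),
        new Point(0, outHeight - 1));

Mat transform = Imgproc.getPerspectiveTransform(srcQuad, dstQuad);
Mat card = new Mat();
Imgproc.warpPerspective(src, card, transform, new Size(outWidth, outHeight));
// 'card' can then be cropped to the code region and passed to Tesseract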

Why doesn't the cvFindContours() method detect contours correctly in javacv?

I went through many questions on Stack Overflow and was able to develop a small program that detects squares and rectangles correctly. This is my sample code:
public static CvSeq findSquares(final IplImage src, CvMemStorage storage) {
    CvSeq squares = new CvContour();
    squares = cvCreateSeq(0, sizeof(CvContour.class), sizeof(CvSeq.class), storage);

    IplImage pyr = null, timg = null, gray = null, tgray;
    timg = cvCloneImage(src);

    CvSize sz = cvSize(src.width(), src.height());
    tgray = cvCreateImage(sz, src.depth(), 1);
    gray = cvCreateImage(sz, src.depth(), 1);
    // cvCvtColor(gray, src, 1);
    pyr = cvCreateImage(cvSize(sz.width() / 2, sz.height() / 2), src.depth(), src.nChannels());

    // down-scale and upscale the image to filter out the noise
    // cvPyrDown(timg, pyr, CV_GAUSSIAN_5x5);
    // cvPyrUp(pyr, timg, CV_GAUSSIAN_5x5);
    // cvSaveImage("ha.jpg",timg);

    CvSeq contours = new CvContour();
    // request closing of the application when the image window is closed
    // show image on window
    // find squares in every color plane of the image
    for (int c = 0; c < 3; c++) {
        IplImage channels[] = { cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1), cvCreateImage(sz, 8, 1) };
        channels[c] = cvCreateImage(sz, 8, 1);
        if (src.nChannels() > 1) {
            cvSplit(timg, channels[0], channels[1], channels[2], null);
        } else {
            tgray = cvCloneImage(timg);
        }
        tgray = channels[c];

        // try several threshold levels
        for (int l = 0; l < N; l++) {
            // hack: use Canny instead of zero threshold level.
            // Canny helps to catch squares with gradient shading
            if (l == 0) {
                // apply Canny. Take the upper threshold from slider
                // and set the lower to 0 (which forces edges merging)
                cvCanny(tgray, gray, 0, thresh, 5);
                // dilate canny output to remove potential
                // holes between edge segments
                cvDilate(gray, gray, null, 1);
            } else {
                // apply threshold if l != 0:
                cvThreshold(tgray, gray, (l + 1) * 255 / N, 255, CV_THRESH_BINARY);
            }

            // find contours and store them all as a list
            cvFindContours(gray, storage, contours, sizeof(CvContour.class), CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

            CvSeq approx;
            // test each contour
            while (contours != null && !contours.isNull()) {
                if (contours.elem_size() > 0) {
                    approx = cvApproxPoly(contours, Loader.sizeof(CvContour.class), storage, CV_POLY_APPROX_DP, cvContourPerimeter(contours) * 0.02, 0);
                    if (approx.total() == 4 && Math.abs(cvContourArea(approx, CV_WHOLE_SEQ, 0)) > 1000 && cvCheckContourConvexity(approx) != 0) {
                        double maxCosine = 0;
                        for (int j = 2; j < 5; j++) {
                            // find the maximum cosine of the angle between joint edges
                            double cosine = Math.abs(angle(
                                    new CvPoint(cvGetSeqElem(approx, j % 4)),
                                    new CvPoint(cvGetSeqElem(approx, j - 2)),
                                    new CvPoint(cvGetSeqElem(approx, j - 1))));
                            maxCosine = Math.max(maxCosine, cosine);
                        }
                        if (maxCosine < 0.2) {
                            CvRect x = cvBoundingRect(approx, l);
                            if ((x.width() * x.height()) < 50000) {
                                System.out.println("Width : " + x.width() + " Height : " + x.height());
                                cvSeqPush(squares, approx);
                            }
                        }
                    }
                }
                contours = contours.h_next();
            }
            contours = new CvContour();
        }
    }
    return squares;
}
I used this image to detect the rectangles and squares:
I need to identify the following output
and
But when I run the above code, it detects only the following rectangles, and I don't know the reason for that.
This is the output that I got.
Please be kind enough to explain the problem in the above code and give some suggestions for detecting these squares and rectangles.
Given a mask image (a binary image, like your second figure), cvFindContours() gives you the contours (several lists of points).
Look at this link: http://dasl.mem.drexel.edu/~noahKuntz/openCVTut7.html
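As a rough illustration of that (my own sketch, not taken from the tutorial), you could build such a mask first and then list the contours like this in JavaCV:
// Sketch: build a binary mask (white objects on black background) and list its contours.
IplImage gray = cvCreateImage(cvGetSize(src), IPL_DEPTH_8U, 1);
cvCvtColor(src, gray, CV_BGR2GRAY);

IplImage mask = cvCreateImage(cvGetSize(src), IPL_DEPTH_8U, 1);
cvThreshold(gray, mask, 128, 255, CV_THRESH_BINARY_INV); // invert: dark shapes become white

CvMemStorage storage = CvMemStorage.create();
CvSeq contour = new CvContour();
cvFindContours(mask, storage, contour, sizeof(CvContour.class), CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

for (CvSeq c = contour; c != null && !c.isNull(); c = c.h_next()) {
    CvRect box = cvBoundingRect(c, 0);
    System.out.println("contour: " + box.width() + " x " + box.height());
}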
