HOG parameters in OpenCV Java Version

Hi, I am developing for Android and I want to use my cellphone camera to do some image processing.
I am using the OpenCV 2.4.9 Java package to extract HOG features, but I am confused about the output vector.
My image size is 480x640. I set the window to 48x64, the block size to 24x32, the cell size to 12x16, and use 8 bins per cell. So for each window I should get 128-dimensional data describing it. After running the following code:
MatOfFloat keyPoints = new MatOfFloat();
Hog.compute(imagePatch, keyPoints);
keyPoints is an array of length 172800 (I think it is 1350x128). I think there should be a parameter that sets the window stride and so controls the number of windows. The library also has another function that takes a window stride:
public void compute(Mat img, MatOfFloat descriptors, Size winStride, Size padding, MatOfPoint locations)
but I don't know the meaning of the parameters. Could anyone help me figure this out?
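The numbers in the question can be checked with plain arithmetic. The sketch below is only a sanity check and rests on three assumptions that are not stated in the question: the frame is 640 pixels wide and 480 high (landscape), the block stride equals the block size, and the window stride falls back to the cell size when none is passed. They were chosen because together they reproduce the 128 and 1350 reported above. The class and method names are made up for illustration.

```java
class HogSizeCheck {
    /** Number of positions a box of size inner can take inside outer when moved by stride. */
    static int steps(int outer, int inner, int stride) {
        return (outer - inner) / stride + 1;
    }

    /** Floats per window: blocks per window * cells per block * bins per cell. */
    static int descriptorSize(int winW, int winH, int blockW, int blockH,
                              int blockStrideW, int blockStrideH,
                              int cellW, int cellH, int bins) {
        int blocksPerWindow = steps(winW, blockW, blockStrideW) * steps(winH, blockH, blockStrideH);
        int cellsPerBlock = (blockW / cellW) * (blockH / cellH);
        return blocksPerWindow * cellsPerBlock * bins;
    }

    /** Number of sliding windows in an unpadded image for a given window stride. */
    static int windowCount(int imgW, int imgH, int winW, int winH, int strideW, int strideH) {
        return steps(imgW, winW, strideW) * steps(imgH, winH, strideH);
    }

    public static void main(String[] args) {
        // Assumed: block stride == block size (24x32), window stride == cell size (12x16).
        int perWindow = descriptorSize(48, 64, 24, 32, 24, 32, 12, 16, 8);
        int windows = windowCount(640, 480, 48, 64, 12, 16);
        System.out.println(perWindow + " floats per window, " + windows + " windows, "
                + perWindow * windows + " floats in total");
    }
}
```

Under the same assumptions, a window stride equal to the window size (48x64) would leave 13 x 7 = 91 windows.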

void compute(Mat img, MatOfFloat descriptors, Size winStride, Size padding, MatOfPoint locations)
Mat img
input image to test
MatOfFloat descriptors
output vector of descriptors, one for each window in the sliding window search. In C++ it is a vector treated as an array, that is, all descriptors are stored one after another in one long array. You need to know the descriptor size to get back each descriptor: Hog.getDescriptorSize().
Size winStride
size.width = the step in the x direction between consecutive windows in the sliding window search;
size.height = the step in the y direction between consecutive windows in the sliding window search;
The overlap between neighbouring windows is the window size minus this step. So if you set it to 1,1 it will check a window at every pixel position. However this will be slow, so you can set it to the cell size for a good trade-off.
Size padding
This adds a border around the image, so that the detector can find things near the edges. Without it, the first point of detection will be half the window size into the image, so a good choice is to set it to the window size, half the window size, or some multiple of the cell size.
MatOfPoint locations
This is a list of locations that you can pre-specify, for instance if you only want descriptors for certain locations. Leave it empty to do a full search.
Example
disclaimer: may not be proper java, but should give you an idea what the parameters do...
Extract some Hogs
MatOfFloat descriptors = new MatOfFloat(); // an empty vector of descriptors
Size winStride = new Size(24, 32); // half of the 48x64 window, i.e. 50% overlap in the sliding window
Size padding = new Size(0, 0); // no padding around the image
MatOfPoint locations = new MatOfPoint(); // an empty vector of locations, so perform a full search
Hog.compute(img, descriptors, winStride, padding, locations);
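Getting the individual descriptors back out of the long output array, as described for the descriptors parameter, could look roughly like this. It is a sketch on a plain float[] (which is what descriptors.toArray() returns); the class name and the hard-coded 128 are placeholders, and in real code the size would come from getDescriptorSize().

```java
import java.util.Arrays;

class DescriptorSplitter {
    /** Splits the flat output into one row per window; descriptorSize is the length of one window's descriptor. */
    static float[][] split(float[] flat, int descriptorSize) {
        if (descriptorSize <= 0 || flat.length % descriptorSize != 0) {
            throw new IllegalArgumentException("flat length must be a multiple of descriptorSize");
        }
        int windows = flat.length / descriptorSize;
        float[][] perWindow = new float[windows][];
        for (int i = 0; i < windows; i++) {
            // Window i occupies [i * descriptorSize, (i + 1) * descriptorSize) in the flat array.
            perWindow[i] = Arrays.copyOfRange(flat, i * descriptorSize, (i + 1) * descriptorSize);
        }
        return perWindow;
    }

    public static void main(String[] args) {
        float[] flat = new float[172800]; // stand-in for descriptors.toArray()
        float[][] perWindow = split(flat, 128);
        System.out.println(perWindow.length + " windows of " + perWindow[0].length + " values");
    }
}
```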

Related

What does virtual display in Android do with Image pixels when smaller display dimensions are set than grabbed one?

I don't understand what happens to the pixels in an Android virtual display when the output dimensions are smaller than the input ones.
For example, when the input is the size of my display, 1920x960, and I set the output to 1920/3 by 960/3, what happens to the image pixels:
the pixel density is increased, or
it takes only a smaller, centered part of the screen with dimensions 640x320, or
something else?
Additionally, is there a way to grab only the center part of the screen, as in the picture below?

By digging in the AOSP, watching, looking at a thread and experimenting,
I have come to my own (possibly overly) simplified conclusion.
Android calls the native Java SDK, which produces the information needed to render a bitmap/pixels in a Java program.
If the results are the same, there is no need to pass/copy them to the GPU again.
If the results are not the same, they are passed/copied to the GPU to be "invalidated" and then re-rendered.
Now to your question.
By looking at the Bitmap class and at this thread, it occurred to me that resizing depends on the scaling ratio passed to the Matrix class.
If resized, it will expensively create a new Bitmap with either a rather poor higher pixel density or a not-so-smooth lower pixel density.
If the pixel density is increased (smaller dimensions, your case) it will look squashed and, if need be, the colors are averaged with the nearest neighbouring pixels ("kind of" like how JPEG works).
After resizing it still stays at its origin (the top-left corner of the rendered object), which is defined by its X and Y coordinates.
For your second question, about screen grabbing, you can take a look at this and then programmatically crop the image by doing something like this:
//...
Bitmap.createBitmap(screenshot_bitmap, x, y, width, height); // x, y = top-left corner of the crop; width, height = its size
//...
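Bitmap.createBitmap(source, x, y, width, height) takes the top-left corner of the region plus its size, so grabbing the centre of the screen comes down to computing that corner. A minimal sketch in plain Java, with a made-up helper name, using the 1920x960 and 640x320 figures from the question:

```java
class CenterCrop {
    /** Returns {x, y, width, height} of a region of the given size centred in the source. */
    static int[] centered(int srcW, int srcH, int cropW, int cropH) {
        if (cropW > srcW || cropH > srcH) {
            throw new IllegalArgumentException("crop must fit inside the source");
        }
        int x = (srcW - cropW) / 2; // equal margin on the left and right
        int y = (srcH - cropH) / 2; // equal margin on the top and bottom
        return new int[] {x, y, cropW, cropH};
    }

    public static void main(String[] args) {
        int[] r = centered(1920, 960, 640, 320);
        System.out.println("x=" + r[0] + " y=" + r[1] + " width=" + r[2] + " height=" + r[3]);
    }
}
```

The four values can then be passed straight to createBitmap as x, y, width and height.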

How to "shrink" canvas

I am currently writing an Agar.io-style game in Java. Most parts of the game work well. The only problem is that when the blob reaches a certain size, it becomes too big for the current screen size to handle. I need a way to automatically shrink the game canvas (the window dimensions stay the same) so that the big blob appears smaller and the smaller food blobs appear even smaller than before.
My original approach was to reduce each blob's width and height to make them smaller; that's when I noticed this approach not only reduces the blobs' size but also increases the distance between blobs. I dropped it.
I need some suggestions on how to take the blobs' x, y positions and width, height into account to "shrink" the game canvas.
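One common way to get this effect, offered here only as a sketch, is to keep all blobs in world coordinates and apply a single zoom factor to both positions and sizes when drawing, so distances shrink together with the blobs. The Camera class, its fields, and the fitPlayer rule below are all hypothetical names and choices, not something from the question.

```java
class Camera {
    private final double screenW;
    private final double screenH;
    double centerX;    // world x shown at the middle of the screen
    double centerY;    // world y shown at the middle of the screen
    double zoom = 1.0; // 1.0 = no scaling, 0.5 = everything drawn at half size

    Camera(double screenW, double screenH) {
        this.screenW = screenW;
        this.screenH = screenH;
    }

    // Positions are scaled around the camera centre, so gaps between blobs shrink too.
    double toScreenX(double worldX) { return (worldX - centerX) * zoom + screenW / 2.0; }
    double toScreenY(double worldY) { return (worldY - centerY) * zoom + screenH / 2.0; }

    // Widths and heights use the same factor as positions.
    double toScreenSize(double worldSize) { return worldSize * zoom; }

    /** Hypothetical rule: keep the player's diameter at or below a fraction of the screen height. */
    void fitPlayer(double playerDiameter, double maxFraction) {
        double limit = screenH * maxFraction;
        zoom = playerDiameter > limit ? limit / playerDiameter : 1.0;
    }
}
```

With Java2D, the same transform can also be applied once per frame through Graphics2D.translate and Graphics2D.scale instead of converting every coordinate by hand.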

Image preprocessing with OpenCV before doing character recognition (tesseract)

I'm trying to develop a simple PC application for license plate recognition (Java + OpenCV + Tess4j). The images aren't really good (in future they will be better). I want to preprocess the image for tesseract, and I'm stuck on detection of the license plate (rectangle detection).
My steps:
1) Source Image
Mat img = Imgcodecs.imread("sample_photo.jpg");
Imgcodecs.imwrite("preprocess/True_Image.png", img);
2) Gray Scale
Mat imgGray = new Mat();
Imgproc.cvtColor(img, imgGray, Imgproc.COLOR_BGR2GRAY);
Imgcodecs.imwrite("preprocess/Gray.png", imgGray);
3) Gaussian Blur
Mat imgGaussianBlur = new Mat();
Imgproc.GaussianBlur(imgGray,imgGaussianBlur,new Size(3, 3),0);
Imgcodecs.imwrite("preprocess/gaussian_blur.png", imgGaussianBlur);
4) Adaptive Threshold
Mat imgAdaptiveThreshold = new Mat();
Imgproc.adaptiveThreshold(imgGaussianBlur, imgAdaptiveThreshold, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 99, 4);
Imgcodecs.imwrite("preprocess/adaptive_threshold.png", imgAdaptiveThreshold);
Here there should be a 5th step, which is detection of the plate region (probably even without deskewing for now).
I cropped the needed region from the image (after the 4th step) with Paint, and got:
Then I did OCR (via tesseract, tess4j):
File imageFile = new File("preprocess/adaptive_threshold_AFTER_PAINT.png");
ITesseract instance = new Tesseract();
instance.setLanguage("eng");
instance.setTessVariable("tessedit_char_whitelist", "acekopxyABCEHKMOPTXY0123456789");
String result = instance.doOCR(imageFile);
System.out.println(result);
and got a (good enough?) result: "Y841ox EH" (almost correct)
How can I detect and crop the plate region after the 4th step? Do I have to make any changes (improvements) to steps 1-4? I would like to see an example implemented with Java + OpenCV (not JavaCV).
Thanks in advance.
EDIT (thanks to @Abdul Fatir's answer)
Well, here is a working (for me at least) code sample (Netbeans+Java+OpenCV+Tess4j) for those who are interested in this question. The code is not the best, but I made it just for studying.
http://pastebin.com/H46wuXWn (do not forget to put the tessdata folder into your project folder)

Here's how I suggest you should do this task.
1. Convert to Grayscale.
2. Gaussian Blur with 3x3 or 5x5 filter.
3. Apply Sobel Filter to find vertical edges.
Sobel(gray, dst, -1, 1, 0)
4. Threshold the resultant image to get a binary image.
5. Apply a morphological close operation using suitable structuring element.
6. Find contours of the resulting image.
7. Find minAreaRect of each contour. Select rectangles based on aspect ratio and minimum and maximum area.
8. For each selected contour, find edge density. Set a threshold for edge density and choose the rectangles breaching that threshold as possible plate regions.
9. Few rectangles will remain after this. You can filter them based on orientation or any criteria you deem suitable.
10. Clip these detected rectangular portions from the image after adaptiveThreshold and apply OCR.
a) Result after Step 5
b) Result after Step 7. Green ones are all the minAreaRects and the Red ones are those which satisfy the following criteria: Aspect Ratio range (2,12) & Area range (300,10000)
c) Result after Step 9. Selected rectangle. Criteria: Edge Density > 0.5
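The selection in step 7 can be sketched as a small predicate using the ranges quoted in b). This is plain Java on a rectangle's width and height; the class name is made up, and taking the longer side over the shorter one is an assumption made here because a rotated rectangle may report its sides in either order.

```java
class PlateCandidateFilter {
    /** True if the rectangle has aspect ratio in (2, 12) and area in (300, 10000). */
    static boolean isCandidate(double width, double height) {
        if (width <= 0 || height <= 0) {
            return false;
        }
        // Assumption: orientation-independent ratio, long side divided by short side.
        double aspect = Math.max(width, height) / Math.min(width, height);
        double area = width * height;
        return aspect > 2 && aspect < 12 && area > 300 && area < 10000;
    }

    public static void main(String[] args) {
        System.out.println(isCandidate(120, 30)); // plate-like: aspect 4, area 3600
        System.out.println(isCandidate(40, 40));  // square: aspect 1
    }
}
```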
EDIT
For edge-density, what I did in the above examples is the following.
1. Apply Canny Edge detector directly to input image. Let the cannyED image be Ic.
2. Multiply results of Sobel filter and Ic. Basically, take an AND of Sobel and Canny images.
3. Gaussian Blur the resultant image with a large filter. I used 21x21.
4. Threshold the resulting image using OTSU's method. You'll get a binary image.
5. For each red rectangle, rotate the portion inside this rectangle (in the binary image) to make it upright. Loop through the pixels of the rectangle and count white pixels. (How to rotate?)
6. Edge Density = No. of White Pixels in the Rectangle/Total no. of Pixels in the rectangle
7. Choose a threshold for edge density.
NOTE: Instead of going through steps 1 to 3, you can also use the binary image from step 5 for calculating edge density.
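The edge-density formula above, white pixels divided by total pixels, can be written out as follows. This is a sketch on a plain boolean grid rather than an OpenCV Mat, and it assumes the rectangle has already been rotated upright; the class and parameter names are illustrative only.

```java
class EdgeDensity {
    /**
     * binary[row][col] is true for white pixels. The rectangle starts at column x, row y
     * and is assumed to be upright and fully inside the image.
     */
    static double density(boolean[][] binary, int x, int y, int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("empty rectangle");
        }
        int white = 0;
        for (int row = y; row < y + height; row++) {
            for (int col = x; col < x + width; col++) {
                if (binary[row][col]) {
                    white++;
                }
            }
        }
        return (double) white / (width * height);
    }
}
```

With OpenCV itself, the count for an upright sub-matrix should be obtainable from Core.countNonZero, which avoids the explicit loop.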

Actually, OpenCV has a pre-trained model specifically for Russian license plates: haarcascade_russian_plate_number
There is also an open source ANPR project for Russian license plates: plate_recognition. It does not use tesseract, but it has a quite good pre-trained neural network.

Find all connected components (the white areas) and determine their outlines.
Filter them based on size (as a fraction of the image), width-to-height ratio and white/black ratio to retrieve candidate plates.
Undo the transformation of the rectangle.
Remove the bolts.
Pass the image to the OCR engine.

JOGL image rendering

The end goal is to be able to render images of arbitrary sizes in JOGL and do it fast on basic graphic cards.
My initial attempt was to achieve this using textures. However, I ran into problems on some graphics cards, (more precisely, virtual machine graphics cards).
Some images exceed GL_MAX_TEXTURE_SIZE, and some cards do not support textures whose size is not a power of two (gl.isNPOTTextureAvailable()).
I then followed several samples (1, 2) which used glDrawPixels to render the image directly.
gl.glBlendFunc (GL.GL_SRC_ALPHA, GL.GL_ONE_MINUS_SRC_ALPHA);
gl.glEnable (GL.GL_BLEND);
gl.glColor3f (0.0f, 0.0f, 0.0f);
gl.glRasterPos2i (10, 300);
gl.glDrawPixels (dukeWidth, dukeHeight,
gl.GL_RGBA, gl.GL_UNSIGNED_BYTE,
dukeRGBA);
This works fine, except when the raster position moves outside the viewport. When part of the image (bottom left corner) goes outside the viewport, the whole image is not displayed.
[1] https://today.java.net/pub/a/today/2003/09/11/jogl2d.html
[2] http://www.java-tips.org/other-api-tips/jogl/drawing-pixels-and-showing-the-effect-of-gldrawpixels-glcopypixels-and-glpix.htm
I have managed to solve the disappearing-image problem by replacing glRasterPos2i with glWindowPos2d, but this led to another problem: glWindowPos2d is only supported from OpenGL 1.4 and my virtual machines only support 1.1.
What is wrong with my approach?
Should I be handling images whose size is not a power of two by padding the textures?
Should I split large images into many textures (like a quilt) so that the maximum texture size is not exceeded? I am worried about performance in this case.
I tried Mesa3D to obtain a higher OpenGL version, but cannot make it compile for Windows. Are there any other software renderers you would recommend? (waiting on Swiftshader support)
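For reference, the arithmetic behind the two texture ideas in the question (padding to a power of two, and splitting into a quilt of tiles) is small. The sketch below is plain Java with made-up names; the 2048 limit in main is just an example value, and in practice it would be whatever the driver reports for GL_MAX_TEXTURE_SIZE.

```java
class TextureTiling {
    /** Smallest power of two that is greater than or equal to n. */
    static int nextPowerOfTwo(int n) {
        if (n <= 0 || n > (1 << 30)) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    /** Number of tiles needed along one axis so that no tile exceeds maxTextureSize. */
    static int tilesAlong(int imageSize, int maxTextureSize) {
        return (imageSize + maxTextureSize - 1) / maxTextureSize; // ceiling division
    }

    public static void main(String[] args) {
        int max = 2048; // example limit only
        int cols = tilesAlong(5000, max);
        int rows = tilesAlong(3000, max);
        System.out.println(cols + " x " + rows + " tiles for a 5000x3000 image");
        System.out.println("300 pads to " + nextPowerOfTwo(300));
    }
}
```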

I have managed to figure out an answer to my own question.
http://www.opengl.org/wiki/Raster_Position_And_Clipping
How do I draw glBitmap() or glDrawPixels() primitives that have an initial glRasterPos() outside the window's left or bottom edge?
When the raster position is set outside the window, it's often outside the view volume and subsequently marked as invalid. Rendering the glBitmap and glDrawPixels primitives won't occur with an invalid raster position. Because glBitmap/glDrawPixels produce pixels up and to the right of the raster position, it appears impossible to render this type of primitive clipped by the left and/or bottom edges of the window.
However, here's an often-used trick: Set the raster position to a valid value inside the view volume. Then make the following call:
glBitmap (0, 0, 0, 0, xMove, yMove, NULL);
This tells OpenGL to render a no-op bitmap, but move the current raster position by (xMove,yMove). Your application will supply (xMove,yMove) values that place the raster position outside the view volume. Follow this call with the glBitmap() or glDrawPixels() to do the rendering you desire.

How to resize a non-square image to square thumbnail (by adding white space)?

I'd like to create a square thumbnail of an image using Java. I've already managed to resize images through a couple of ways. However I'd like to create a real square image, also from a non-square image.
Example: the source has a size of 200x400 (width/height)
the target size is 100x100
The algorithm would then need to resize the image to 50x100 and add 25x100 pixels of whitespace each on the left and on the right.
Can anyone help me with this?

Just create a 100x100 background; add the scaled image to it. Use Math.max(width, height) to determine the scale factor. Then, plot the scaled image over the background, use calculations (offset x, offset y) to put it in the proper position.
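A minimal sketch of that approach with java.awt, assuming a white background and bilinear scaling (both are choices made here, as are the class and method names):

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class SquareThumbnail {
    /** Scales source to fit inside a side x side square and centres it on a white background. */
    static BufferedImage toSquare(BufferedImage source, int side) {
        int w = source.getWidth();
        int h = source.getHeight();
        // The longer edge determines the scale factor, so the whole image fits.
        double scale = (double) side / Math.max(w, h);
        int scaledW = Math.max(1, (int) Math.round(w * scale));
        int scaledH = Math.max(1, (int) Math.round(h * scale));
        int offsetX = (side - scaledW) / 2;
        int offsetY = (side - scaledH) / 2;

        BufferedImage result = new BufferedImage(side, side, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, side, side);
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, offsetX, offsetY, scaledW, scaledH, null);
        } finally {
            g.dispose();
        }
        return result;
    }
}
```

For the 200x400 example, the image is drawn at 50x100 with an x offset of 25, which leaves 25 pixels of white on each side.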
