I have been working on a Face Recognition system to log attendance using OpenCV version 3.3.1 in Java.
I'm using haarcascade_frontalface_alt to detect faces. But I have seen a lot of false positives.
To avoid this, I followed the below technique.
I will first run haarcascade_frontalface_alt to detect a face region.
Then I will run haarcascade_eye_tree_eyeglasses to check if the face region has eyes.
If yes, I will split the image into 2 halves, vertically and I will use haarcascade_eye to detect eyes in each half to avoid any false positives.
This technique is not working in all cases and I have encountered a few cases where haarcascade_eye_tree_eyeglasses is returning more than 2 eyes & haarcascade_eye is returning more than 1 eye in each half.
Also, in order to eliminate noise(ears, hair and neck areas, etc) from the image, I'm cropping the image by a ROI specified as left eye position to right eye position. I also applied histogram equalization on the final image.
The only things I need to get rid of are false positives and multiple eye detections.
Please help me in this regard and also let me know if there is any better approach.
Here is the sample image with multiple eye detection.
Related
I am using Java with OpenCV Library to detect Face,Eyes and Mouth using Laptop Camera.
What I have done so far:
Capture Video Frames using VideoCapture object.
Detect Face using Haar-Cascades.
Divide the Face region into Top Region and Bottom Region.
Search for Eyes inside Top region.
Search for Mouth inside Bottom region.
Problem I am facing:
At first Video is running normally and suddenly it becomes slower.
Main Questions:
Do Higher Cameras' Resolutions work better for Haar-Cascades?
Do I have to capture Video Frames in a certain scale? for example (100px X100px)?
Do Haar-Cascades work better in Gray-scale Images?
Does different lighting conditions make difference?
What does the method detectMultiScale(params) exactly do?
If I want to go for further analysis for Eye Blinking, Eye Closure Duration, Mouth Yawning, Head Nodding and Head Orientation to Detect Fatigue (Drowsiness) By Using Support Vector Machine, any advices?
Your help is appreciated!
The following article, would give you an overview of the things going under the hood, I would highly recommend to read the article.
Do Higher Cameras' Resolutions work better for Haar-Cascades?
Not necessarily, the cascade.detectMultiScale has params to adjust for various input width, height scenarios, like minSize and maxSize, These are optional params However, But you can tweak these to get robust predictions if you have control over the input image size. If you set the minSize to smaller value and ignore maxSize then it will work for smaller and high res images as well, but the performance would suffer. Also if you imagine now, How come there is no differnce between High-res and low-res images then you should consider that the cascade.detectMultiScale internally scales the images to lower resolutions for performance boost, that is why defining the maxSize and minSize is important to avoid any unnecessary iterations.
Do I have to capture Video Frames in a certain scale? for example
(100px X100px)
This mainly depends upon the params you pass to the cascade.detectMultiScale. Personally I guess that 100 x 100 would be too small for smaller face detection in the frame as some features would be completely lost while resizing the frame to smaller dimensions, and the cascade.detectMultiScale is highly dependent upon the gradients or features in the input image.
But if the input frame only has face as a major part, and there are no other smaller faces dangling behind then you may use 100 X 100. I have tested some sample faces of size 100 x 100 and it worked pretty well. And if this is not the case then 300 - 400 px width should work good. However you would need to tune the params in order to achieve accuracy.
Do Haar-Cascades work better in Gray-scale Images?
They work only in gray-scale images.
In the article, if you read the first part, you will come to know that it face detection is comprised of detecting many binary patterns in the image, This basically comes from the ViolaJones, paper which is the basic of this algorithm.
Does different lighting conditions make difference?
May be in some cases, largely Haar-features are lighting invariant.
If you are considering different lighting conditions as taking images under green or red light, then it may not affect the detection, The haar-features (since dependent on gray-scale) are independent of the RGB color of input image. The detection mainly depends upon the gradients/features in the input image. So as far as there are enough gradient differences in the input image such as eye-brow has lower intensity than fore-head, etc. it will work fine.
But consider a case when input image has back-light or very low ambient light, In that case it may be possible that some prominent features are not found, which may result in face not detected.
What does the method detectMultiScale(params) exactly do?
I guess, if you have read the article, by this time, then you must be knowing it well.
If I want to go for further analysis for Eye Blinking, Eye Closure
Duration, Mouth Yawning, Head Nodding and Head Orientation to Detect
Fatigue (Drowsiness) By Using Support Vector Machine, any advices?
No, I won't suggest you to perform these type of gesture detection with SVM, as it would be extremely slow to run 10 different cascades to conclude current facial state, However I would recommend you to use some Facial Landmark Detection Framework, such as Dlib, You may search for some other frameworks as well, because the model size of dlib is nearly 100MB and it may not suit your needs i f you want to port it to mobile device. So the key is ** Facial Landmark Detection **, once you get the full face labelled, you can draw conclusions like if the mouth if open or the eyes are blinking, and it works in Real-time, so your video processing won't suffer much.
I've got a ridiculously insane Linear Algebra professor at uni who asked us this last Friday to develop a programme in Java that loads a monochrome picture and then applies an edge-detecting filter on it.
The problem is nobody in my class has got the slightest clue how to do it and I have only a week to get it done.
As I'm still trying to get my head round it and start it from scratch, does anybody have anything ready to send me so I can study it and save my semester?
Any efforts will be much appreciated.
Here's a very basic approach you might go with:
1) What is an edge in a monochrome image? One could say that it is a steep intensity gradient. If you go from black to white that is an edge, and vice versa.
2) A very simple filter operation that builds on this idea is the Sobel operator. Read up on it here: Wikipedia.
3) You'll stumble across 2 terms that may be unfamiliar to you: Kernel and Convolution. A kernel is basically a window moved over each pixel, performing an operation on the pixel's environment. In case of the Sobel 3x3 kernel, you assign a new value to the filtered image based on the pixel's direct neighbours. The convolution operation can be thought of as - among other things - an operation that moves the kernel across every pixel in the image (note: This is a gross oversimplification to get you started and technically incorrect. It should, however, give you the right idea)
4) Now the simplest way of applying a Sobel kernel to a BufferedImage is by using the ConvolveOp class. It is a prebuilt java class that takes a kernel, applies it to a given image and returns the filtered image. However, if this is for class, you might want to implement this yourself.
I have asked this question in this too. But since the topic was a different one, maybe it was not noticed. I got the eigenface algorithm for face recognition working using opencv in java. I wanted to increase the accuracy of the code as its a well known fact that eigenface relies greatly on the light intensity.
What I have Right Now
I get perfect results if I give a check for a image clicked at the same place where the pictures in my database have been clicked, but the results get weird as I give in images clicked in different places.
I figured out that the reason was that my images differ in the light intensity.
Hence , my question is
Is there any way to set a standard to the images saved in the database or the ones that are coming fresh into the system for a recognition check so that I can improve on the accuracy of the face-recognition system that I have currently?
Any kind of positive solution to the problem would be really helpful.
Identifying the lighting intensity and pose is the important factor of face recognition. Try to do histogram comparison with training and testing image (http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_comparison/histogram_comparison.html). This parameter helps to avoid the worst lighting situation. And pre processing is one of the successful key factor of Face recognition. Gamma Correction and DOG filtering may reduce the lighting problems.
You can also elliptical filter out only the face,removing the noise created by hair,neck etc.
The OpenCV cookbook provides an excellent and simple tutorial on this.
Below are the following options which may help you boost your accuracy
1] Image Normalization:
Make your image pixel values from 0 to 1 so that to reduce the effect of lighting conditions
2] Image Alignment (This is a very important step to achieve good performance):
Align all the train images and test images so that eyes, nose, mouth of all the faces in all the images have almost the same co-ordinates
Check this post on face alignment (Highly recommended) : https://www.pyimagesearch.com/2017/05/22/face-alignment-with-opencv-and-python/
3] Data augmentation trick:
You can add filters to you faces that will have an effect of the same face in different lighting conditions
So from one face you can make several images in different lighting conditions
4] Removing Noise:
Before performing step 3 apply Gaussian blur to all the images
I'm developing Side Scroll 2D Game, using AndEngine
I'm using their SVG extension (I'm using vector graphic)
But I discovered strange and ugly effect, while moving my player (while camera is chasing player exactly, means while camera is changing its position)
Images of my sprites looks just different, they are like blurred or there is effect like those images would be moving (not changing their possition, just jittery effect, really hard to explain and call this effect properly) Hopefully this image may explain it a bit:
Its more or less, how does it look in the game, where:
a) "FIRST" image is showing square, while player is moving (CAMERA isn't) image looks as it should
b) "SECOND" the same image, but with this strange effect "which looks like image moving/blurring during camera moving [chasing player])
Friend of mine told me that it might be hardware problem:
"the blurring that you notice is actually a hardware problem. Some phones "smooth" the content on the screen to give a nicer feel to applications. I don't know if it's the screen or the graphics processor, but it doesn't occur on my wife's Samsung Captivate. It happens on my Atrix and Xoom though. It's really noticable on the Xoom due to the large screen size."
But seems there is way to prevent it, since I have tested many similar games, where camera is chasing player, and I could not notice such effect.
Is there a way to turn this off in code?
I'm grateful for previous answers, unfortunately, still problem exist.
Till now, I have tried:
casting (int) on setCenter method which is being executed on updateChaseEntity
testing game using PNG images, instead of SVG extension and vector graphic
different TextureOptions
hardwareAcceleration
If someone have different idea, what may cause this strange effect, I would be really grateful for help - thank you.
Some devices (Xperia Play) bleed everywhere when trying to draw things that are moving quickly. For example a red icon on the application list leaves a blur behind it. You could try hardwareAcceleration in the manifest (on and off) to see if it makes a difference.
You'd probably get the same effect if you weren't using svg
When your player's just going to the right and camera begins chasing him, all other sprites except player are moving to the left. Try to print the absolute coordinates of your "blurring" sprite (or some of its anchor points) to the log. The X-coord of sprite should be decreasing linearly. If you notice it's increasing some times, it could be a reason of blur.
Hope this will help.
It sounds like it's due to the camera moving in real increments, making the SVG components rest on non-integer bounds, and the SVG renderer making anti-aliasing come into effect to demonstrate this. Try moving the camera in integer increments by casting camera values to int.
I'm not familiar with this engine but I wonder why would you use vector graphics for pixelated art style. I'll be surprised if your character in the screenshot is really a vector art... maybe it's texture imported in SVG? I attempted back in the day to use flash a few times and I was making the same mistake... I'm not saying it's not possible but it's not intended to create pixel art with flash or any other vector software. There is a reason why most flash games have similar look.
Best way to debug it, is try a different looking sprite.
Maybe it is just the slow response time of your device display.
I'm also an Andengine developer, and never seen such behavior.
Sometimes you fix jittering using FixedStepEngine, it might help.
If you can post your code maybe we can better help you.
Here’s my task which I want to solve with as little effort as possible (preferrably with QT & C++ or Java): I want to use webcam video input to detect if there’s a (or more) crate(s) in front of the camera lens or not. The scene can change from "clear" to "there is a crate in front of the lens" and back while the cam feeds its video signal to my application. For prototype testing/ learning I have 2-3 images of the “empty” scene, and 2-3 images with one or more crates.
Do you know straightforward idea how to tackle this task? I found OpenCV, but isn't this framework too bulky for this simple task? I'm new to the field of computer vision. Is this generally a hard task or is it simple and robust to detect if there's an obstacle in front of the cam in live feeds? Your expert opinion is deeply appreciated!
Here's an approach I've heard of, which may yield some success:
Perform edge detection on your image to translate it into a black and white image, whereby edges are shown as black pixels.
Now create a histogram to record the frequency of black pixels in each vertical column of pixels in the image. The theory here is that a high frequency value in the histogram in or around one bucket is indicative of a vertical edge, which could be the edge of a crate.
You could also consider a second histogram to measure pixels on each row of the image.
Obviously this is a fairly simple approach and is highly dependent on "simple" input; i.e. plain boxes with "hard" edges against a blank background (preferable a background that contrasts heavily with the box).
You dont need a full-blown computer-vision library to detect if there is a crate or no crate in front of the camera. You can just take a snapshot and make a color-histogram (simple). To capture the snapshot take a look here:
http://msdn.microsoft.com/en-us/library/dd742882%28VS.85%29.aspx
Lots of variables here including any possible changes in ambient lighting and any other activity in the field of view. Look at implementing a Canny edge detector (which OpenCV has and also Intel Performance Primitives have as well) to look for the outline of the shape of interest. If you then kinda know where the box will be, you can perhaps sum pixels in the region of interest. If the box can appear anywhere in the field of view, this is more challenging.
This is not something you should start in Java. When I had this kind of problems I would start with Matlab (OpenCV library) or something similar, see if the solution would work there and then port it to Java.
To answer your question I did something similar by XOR-ing the 'reference' image (no crate in your case) with the current image then either work on the histogram (clustered pixels at right means large difference) or just sum the visible pixels and compare them with a threshold. XOR is not really precise but it is fast.
My point is, it took me 2hrs to install Scilab and the toolkits and write a proof of concept. It would have taken me two days in Java and if the first solution didn't work each additional algorithm (already done in Mat-/Scilab) another few hours. IMHO you are approaching the problem from the wrong angle.
If really Java/C++ are just some simple tools that don't matter then drop them and use Scilab or some other Matlab clone - prototyping and fine tuning would be much faster.
There are 2 parts involved in object detection. One is feature extraction, the other is similarity calculation. Some obvious features of the crate are geometry, edge, texture, etc...
So you can find some algorithms to extract these features from your crate image. Then comparing these features with your training sample images.