PAM Face Authentication Musings: Eye Tracking Using Grid Search


http://code.google.com/p/pam-face-authentication/

By Rohan Anil, BITS Pilani Goa Campus

First, Haar face detection is run on the image, and then Haar eye detection is run on the output of the face detection. The theory behind Haar detection is very interesting, so I will give you a brief idea. The method was devised by Paul Viola and Michael Jones. In the initial stage, thousands of true and false images are collected. True images are face images; false images are background, which could be anything. You can find a lot of papers on how to generate these background images. Haar features are then calculated for the true and false images. In layman's terms, features are a set of real values that describe the “object” in question, which here is a face. Once the features are calculated, we have to find which of them are good and assign weights to them. These weights are calculated using AdaBoost, developed by Freund and Schapire. It is a very simple iterative algorithm that finds the weights for the features by trying to minimize the number of misclassifications at every stage.

Finally, when you perform detection, a search is done over the image. For every window in the search, the features are calculated and the classifier output (usually the output of a distance formula) is computed; depending on whether it is less than a particular threshold, the window is marked as face or not face. A similar process is run for eye detection on the detected face. For this project I select only the biggest face detected in the image, since it is an authentication project. After the eye is detected, the center of mass of the eye window is calculated using (255 - intensity) as the mass. You will be surprised how well this concept from Phy 101 leads to good localization.
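The center-of-mass step can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function name eye_center_of_mass and the list-of-rows window format are assumptions made for the example, as is the fallback for an all-white window.

```python
def eye_center_of_mass(window):
    """Return the (x, y) centroid of a grayscale window.

    `window` is a list of rows, each a list of 8-bit intensities (0-255).
    The mass of a pixel is (255 - intensity), so dark pixels weigh the most.
    """
    total_mass = 0
    moment_x = 0
    moment_y = 0
    for y, row in enumerate(window):
        for x, intensity in enumerate(row):
            mass = 255 - intensity  # dark pixels (pupil, iris) are "heavy"
            total_mass += mass
            moment_x += mass * x
            moment_y += mass * y
    if total_mass == 0:
        # Assumption: an all-white window has no mass, so fall back to
        # the geometric center of the window.
        return ((len(window[0]) - 1) / 2.0, (len(window) - 1) / 2.0)
    return (moment_x / total_mass, moment_y / total_mass)


# Toy 3x4 window with a single dark pixel at column 2, row 1.
toy_window = [
    [255, 255, 255, 255],
    [255, 255, 0, 255],
    [255, 255, 255, 255],
]
center = eye_center_of_mass(toy_window)
```

In the toy window the only pixel with any mass is the dark one, so the centroid lands exactly on it.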



Tracker model initialization is done after this step. The integral and variance projections are calculated in the X and Y directions. The integral projection is the sum of the pixel values along each row or column, and the variance projection is the variance of those pixel values. Please note that the intensity values are calculated as (255 - intensity), but the figures in the article are shown with the actual intensity. The anchor point is taken from the previous step: it is the localized eye coordinates, given as a parameter to the initialization step. This process is repeated until the eye detection step fails, at which point the tracking algorithm is used to localize the eye coordinates in the frame. Before I explain how it is done, I have to introduce a concept from Math 101: matching two 2D curves.
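As a rough sketch of the projections described above (not the project's implementation), the following Python computes them for a small window. The function name, the dictionary layout, the convention that the X projection has one value per column, and the use of the population variance are all assumptions of this example.

```python
def projections(window):
    """Return integral and variance projections along X and Y.

    Each pixel is first inverted to (255 - intensity). The X projections
    hold one value per column and the Y projections one value per row
    (an assumed convention for this sketch).
    """
    inverted = [[255 - p for p in row] for row in window]
    columns = [list(col) for col in zip(*inverted)]

    def variance(values):
        # Population variance: mean squared deviation from the mean.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    return {
        "integral_x": [sum(col) for col in columns],
        "integral_y": [sum(row) for row in inverted],
        "variance_x": [variance(col) for col in columns],
        "variance_y": [variance(row) for row in inverted],
    }


# Toy 2x2 window: a bright left column and a dark right column.
model = projections([[255, 0], [255, 0]])
```

Here the dark right column carries all of the inverted intensity, so it dominates the X integral projection, while each row mixes one bright and one dark pixel and therefore has a large variance.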


y = F(x) is the integral projection of the model [1]

y' = G(x) is the integral projection in the new frame [2]

We now use two parameters, a translation parameter and a scale parameter, and define the distance formula between [1] and [2]:

distance = pow((ScaleFactor*G(x) + TranslateFactor - F(x)), 2)

Our objective is to find the ScaleFactor and TranslateFactor that give the minimum distance, and hence good localization.
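To make the distance concrete, here is a small Python sketch. The formula above is written for a single x; summing the squared differences over all sample positions is an assumption of this example, and match_distance is a hypothetical name rather than a function from the project.

```python
def match_distance(model, frame, scale, translate):
    """Sum over x of (scale * G(x) + translate - F(x)) ** 2.

    `model` holds the samples of F and `frame` the samples of G,
    taken at the same x positions.
    """
    if len(model) != len(frame):
        raise ValueError("projections must have the same length")
    return sum((scale * g + translate - f) ** 2 for f, g in zip(model, frame))


# Toy projections where F(x) = 2 * G(x) + 1 holds exactly.
model_projection = [3.0, 5.0, 7.0]
frame_projection = [1.0, 2.0, 3.0]
perfect = match_distance(model_projection, frame_projection, 2.0, 1.0)
unmatched = match_distance(model_projection, frame_projection, 1.0, 0.0)
```

With the right scale and translate factors the distance drops to zero, while the untransformed frame projection leaves a distance of 4 + 9 + 16 = 29.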

An iterative grid search is used, with the scale in (a1, a2) and the translation in (t1, t2). The algorithm evaluates all the points on the grid, then halves the step size, and repeats the same process for K iterations. The initial step sizes are (a2-a1)/N and (t2-t1)/N, where N = 5 for a 5x5 grid. The code currently uses a scale factor between 0.9 and 1.1 and a translate factor between -4 and 4. I have not explained every part of the algorithm here: it also uses the previous angle, width, height, and some other parameters, which makes it a bit more complex to explain.
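A minimal Python sketch of such an iterative grid search follows. It is not the project's code. The article does not say where each refined grid is placed, so this example assumes the grid is re-centered on the best point found so far and clamped to the allowed ranges; the names grid_search and cost and the default of 4 iterations are hypothetical.

```python
def grid_search(cost, scale_range=(0.9, 1.1), translate_range=(-4.0, 4.0),
                n=5, iterations=4):
    """Coarse-to-fine search for the (scale, translate) pair minimizing cost."""
    scale_step = (scale_range[1] - scale_range[0]) / n
    translate_step = (translate_range[1] - translate_range[0]) / n
    # Start from the middle of both ranges.
    best_scale = (scale_range[0] + scale_range[1]) / 2.0
    best_translate = (translate_range[0] + translate_range[1]) / 2.0
    best_cost = cost(best_scale, best_translate)
    half = n // 2
    for _ in range(iterations):
        # Assumption: each n-by-n grid is centered on the current best point.
        center_scale, center_translate = best_scale, best_translate
        for i in range(-half, half + 1):
            for j in range(-half, half + 1):
                s = center_scale + i * scale_step
                t = center_translate + j * translate_step
                # Keep candidates inside the allowed ranges.
                s = min(max(s, scale_range[0]), scale_range[1])
                t = min(max(t, translate_range[0]), translate_range[1])
                c = cost(s, t)
                if c < best_cost:
                    best_scale, best_translate, best_cost = s, t, c
        # Halve the step size before the next, finer pass.
        scale_step /= 2.0
        translate_step /= 2.0
    return best_scale, best_translate, best_cost


# Toy projections: the model is roughly 1.05 * frame + 2.
frame = [10.0, 20.0, 30.0, 40.0]
model = [12.5, 23.0, 33.5, 44.0]


def projection_cost(scale, translate):
    return sum((scale * g + translate - f) ** 2 for f, g in zip(model, frame))


scale, translate, distance = grid_search(projection_cost)
```

Each pass costs n * n evaluations of the distance, so K passes cost K * n * n evaluations in total, far fewer than a single grid fine enough to reach the same final step size.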

Here is the YouTube video.

I am able to get around 10 fps on my laptop, which has a 32-bit dual-core 1.7 GHz Pentium. I am currently exploring a different algorithm that would give more than 30 fps, but I haven't released any code yet. The above algorithm can be improved a little by reusing the integral image from face detection for eye detection and tracker initialization. OpenCV 1.1.0 supports the CV_HAAR_FIND_BIGGEST_OBJECT flag, which could be used in face detection, but because OpenCV 1.0 is more widely used and packaged, I preferred not to switch.
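For readers who have not met the integral image (summed-area table), here is a small pure-Python sketch of the idea. It is an illustration only and does not use or reproduce OpenCV's implementation; the function names and the extra zero row and column are choices made for this example.

```python
def integral_image(image):
    """Return a summed-area table with an extra zero row and column.

    table[y][x] holds the sum of all pixels above and to the left of
    (x, y), that is, rows 0..y-1 and columns 0..x-1 of `image`.
    """
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return table[y + h][x + w] - table[y][x + w] - table[y + h][x] + table[y][x]


toy_image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(toy_image)
# Column sums (an integral projection along X) need one lookup per column.
column_sums = [rect_sum(table, x, 0, 1, len(toy_image)) for x in range(3)]
```

Once the table is built, any rectangle sum takes four lookups, which is why a table already computed for face detection could in principle be reused to obtain the projections for the tracker.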

If you liked the article, don't forget to comment and digg it!


5 comments:

Ashish said...

youtube \m/

Thirumal said...

Hi,
Discussions in the OpenCV Yahoo group have revealed that OpenCV Haar face detection is patented. Does your face detection implementation rely on any SURF-like patented code?

Regards,

Thirumal

Rohan said...

Hmm... the OpenCV implementation of Haar face detection is patented?

Anyway, the tracking code I wrote is under GPLv3 and it's something I came up with (basically an exhaustive (grid) search of the scale and translate parameters). You are free to use it as you like under GPLv3.

Nash said...
This comment has been removed by the author.
Nash said...

In your video the eyes are tracked even when the head orientation changes (head roll). However the face detection based on Haar fails during such frames. How is that taken care of?