In our daily lives, we come across many situations where we use facial recognition, such as unlocking a smartphone or marking biometric attendance.
But what exactly is Face Recognition?
Face Recognition is formally defined as the identification or verification of a person's identity in a picture or in real time.
How does Face Recognition work?
One simple way we might think about this problem in the context of machine learning is to train a neural network on pictures of a person, so that any time it sees a face it can tell whether that person exists in the database or not.
This method was in fact used a decade or so ago, but it had many drawbacks, such as:
It was not feasible to collect a large enough number of pictures of each person. Even if you somehow managed to gather thousands and thousands of images of a person, the database would have to be enormous for the network to work with high accuracy.
Another huge drawback was the training cost. Adding or removing a single person in the database required training the whole network again, which proved to be very inconvenient.
So, what’s the solution?
In 2015, researchers at Google published the FaceNet paper, which trained a deep neural network in the Siamese style with a triplet loss, and it changed the approach to face recognition completely. The Siamese Network idea itself is older and dates back to the 1990s.
Once trained, this network needed as few as 2 pictures of a person to verify the person’s identity with high accuracy.
What is a Siamese Network and how does it work?
A Siamese network is one of the simplest models to understand. It passes every image through the same network with the same weights, uses the Triplet Loss Function to optimize the embeddings of the images, and learns to cluster two images of the same person together and away from images of another person.
During training, a Siamese network with triplet loss needs 3 images at a time: two images of one person and one image of another person. The two images of the same person are called the Anchor and the Positive, and the image of the other person is called the Negative.
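As a rough illustration of how such triplets could be assembled from a labelled photo collection, here is a minimal sketch. The function name `sample_triplet`, the dictionary layout and the file names are all assumptions made for this example, and it picks triplets at random rather than using any particular mining strategy.

```python
import random

def sample_triplet(photos_by_person, rng):
    """Pick (anchor, positive, negative) from a {person: [photo, ...]} mapping."""
    # The anchor's person needs at least two photos to supply a positive.
    eligible = [p for p, photos in photos_by_person.items() if len(photos) >= 2]
    person = rng.choice(eligible)
    anchor, positive = rng.sample(photos_by_person[person], 2)
    # The negative comes from any other person in the collection.
    other = rng.choice([p for p in photos_by_person if p != person])
    negative = rng.choice(photos_by_person[other])
    return anchor, positive, negative

# Hypothetical file names, used only for illustration.
photos = {
    "alice": ["alice_1.jpg", "alice_2.jpg", "alice_3.jpg"],
    "bob": ["bob_1.jpg", "bob_2.jpg"],
    "carol": ["carol_1.jpg"],
}
anchor, positive, negative = sample_triplet(photos, random.Random(0))
print(anchor, positive, negative)
```

Whatever the random draw, the anchor and positive always come from the same person and the negative from a different one.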
The network passes each of these images through a few convolutional and pooling layers to reduce its size while highlighting the important features.
It then flattens the result into a one-dimensional array of numbers and maps that into an embedding (an array of numbers of a fixed size such as 64, 128 or 256).
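To make the pooling, flattening and embedding steps concrete, here is a toy sketch in plain Python. It leaves out the convolutional layers entirely, and the 4x4 "image", the weight matrix and the final scaling to unit length are assumptions chosen for the example, not values taken from any real model.

```python
import math

def max_pool_2x2(image):
    """Downsample a 2-D grid by keeping the largest value in each 2x2 block."""
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, len(image[0]) - 1, 2)]
        for r in range(0, len(image) - 1, 2)
    ]

def embed(image, weights):
    """Pool, flatten, project with a weight matrix, then scale to unit length."""
    pooled = max_pool_2x2(image)
    flat = [v for row in pooled for v in row]
    projected = [sum(w * x for w, x in zip(row, flat)) for row in weights]
    norm = math.sqrt(sum(v * v for v in projected)) or 1.0
    return [v / norm for v in projected]

# A made-up 4x4 grayscale "image" and a made-up 2x4 weight matrix.
image = [
    [0.1, 0.3, 0.2, 0.0],
    [0.5, 0.2, 0.1, 0.4],
    [0.0, 0.1, 0.9, 0.3],
    [0.2, 0.6, 0.4, 0.7],
]
weights = [
    [0.5, -0.2, 0.1, 0.3],
    [-0.4, 0.6, 0.2, 0.1],
]
embedding = embed(image, weights)
print(embedding)  # a 2-number embedding for this toy image
```

In a real network the weights are learned during training and the embedding has many more dimensions, but the shape of the computation is the same: a large image goes in and a short array of numbers comes out.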
Now, the network applies the Triplet Loss Function.
The Triplet Loss Function is built on the distance between two points in n dimensions: it compares the distance from the anchor to the positive with the distance from the anchor to the negative. The model optimizes its weights so that the distance between the anchor image and the positive image is as small as possible and the distance between the anchor and the negative image is as large as possible.
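One common way to write this idea down is max(0, d(anchor, positive) - d(anchor, negative) + margin), where d is the squared Euclidean distance and the margin is a small gap the network must keep between the two distances. The sketch below assumes that form; the margin of 0.2 and the 2-dimensional embeddings are illustrative values only.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero when the negative is farther than the positive by at least the margin."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

anchor = [0.0, 1.0]
positive = [0.1, 0.9]
easy_negative = [1.0, 0.0]   # far from the anchor
hard_negative = [0.2, 0.9]   # almost as close as the positive

loss_easy = triplet_loss(anchor, positive, easy_negative)
loss_hard = triplet_loss(anchor, positive, hard_negative)
print(loss_easy, loss_hard)
```

The easy negative is already far enough away, so it contributes no loss, while the hard negative sits too close to the anchor and produces a positive loss that training then tries to reduce.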
The green vector represents the position of the anchor image in space, blue represents the positive image and red represents the negative image.
As training goes on, the model slowly optimizes its weights to bring the positive nearer to the anchor and to move the negative away.
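The pull and push can be seen in a toy example. The sketch below is a simplification: it applies gradient descent directly to three 2-dimensional points, whereas a real network updates its weights and the embeddings move as a consequence. The starting points, learning rate and margin are arbitrary assumptions.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def loss(a, p, n, margin=1.0):
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)

def step(a, p, n, lr=0.1, margin=1.0):
    """One gradient-descent step on the three points themselves."""
    if loss(a, p, n, margin) == 0.0:
        return a, p, n  # the triplet already satisfies the margin
    # Gradients of ||a - p||^2 - ||a - n||^2 + margin.
    grad_a = [2 * (ni - pi) for pi, ni in zip(p, n)]
    grad_p = [2 * (pi - ai) for ai, pi in zip(a, p)]
    grad_n = [2 * (ai - ni) for ai, ni in zip(a, n)]
    move = lambda x, g: [xi - lr * gi for xi, gi in zip(x, g)]
    return move(a, grad_a), move(p, grad_p), move(n, grad_n)

# The negative starts out closer to the anchor than the positive does.
a, p, n = [0.0, 0.0], [1.0, 1.0], [0.5, 0.5]
initial_loss = loss(a, p, n)
for _ in range(20):
    a, p, n = step(a, p, n)
final_loss = loss(a, p, n)
print(initial_loss, final_loss)
print(sq_dist(a, p), sq_dist(a, n))
```

After a few steps the positive has moved toward the anchor, the negative has moved away, and the updates stop once the margin is satisfied.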
A Siamese Network with Triplet Loss requires less storage, as only 2 pictures of a person are enough to recognize the person.
It is very easy to add members to the database or remove them, as there is no need to train the whole model again.
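Here is a minimal sketch of why no retraining is needed: the database only stores one embedding per person, and recognition is a nearest-neighbour lookup with a distance threshold. The class name `FaceDatabase`, the threshold of 0.6 and the 2-dimensional embeddings are assumptions for illustration; in practice the embeddings would come from the trained network.

```python
import math

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

class FaceDatabase:
    """Stores one embedding per person; the network itself is never touched."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.embeddings = {}

    def add(self, name, embedding):
        self.embeddings[name] = embedding

    def remove(self, name):
        self.embeddings.pop(name, None)

    def identify(self, embedding):
        """Return the closest enrolled name, or None if nobody is close enough."""
        if not self.embeddings:
            return None
        name, stored = min(
            self.embeddings.items(), key=lambda kv: distance(kv[1], embedding)
        )
        return name if distance(stored, embedding) <= self.threshold else None

db = FaceDatabase()
db.add("alice", [0.0, 1.0])
db.add("bob", [1.0, 0.0])

probe = [0.1, 0.95]               # embedding of a new photo
before = db.identify(probe)       # closest match is "alice"
db.remove("alice")
after = db.identify(probe)        # "bob" is too far away, so no match
print(before, after)
```

Adding or removing a person is just a dictionary update, which is the contrast with the older approach where every change meant training the network again.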
Due to these advantages, Siamese Networks are widely used in Face Recognition applications all over the world.
And now you have a basic idea of how face recognition on your phone works.
About the author
Sanchay Yadav is an undergraduate pursuing a B.Tech at SRMIST. Deep Learning and Blockchain Technologies are two topics he takes great interest in. He also loves anime, K-pop and debating over unnecessary things.
If you found this interesting, we would really like to hear from you.
Ping us on Instagram or at email@example.com.
If you want articles on any other topics, DM us on Instagram.
Thanks for reading!!