Facial Recognition Functionality #1989
Closed
LackesLab
started this conversation in Show and tell
Replies: 2 comments
-
Pretty good summary. But I wonder if it uses only the information of this single image. Faces can change, for example with make-up or lighting. And what about kids? My kids look totally different at 1, 2 and 4 years old. Will we be able to add additional faces to a person?
-
Thanks for kicking this off! Facial recognition has been implemented in #2180 and further enhancements are being discussed in #2472.
-
Hey there, with this thread I want to give some insights into the current development status. Discussion takes place on the Discord.
At the current stage, not all functionality is in a final state. I am open to suggestions and improvements.
How does facial recognition work?
In the first stage, a face detection neural network scans an incoming image for faces. All detected faces are then cut out and resized to a standard size (depending on the model, e.g. 128 px x 128 px).
The cropped faces are then aligned so that the eyes and nose sit at the same image coordinates in every image.
These “faces” are then fed into a neural network that generates so-called embeddings. These embeddings have certain characteristics so that different faces are mapped to different points in a 512-dimensional space.
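To make the pipeline concrete, here is a minimal sketch of the detect-crop-align-embed flow. The `detect_faces`, `align_face`, and `embed_face` functions are hypothetical placeholders for the actual neural networks; only the 128 px crop size and the 512-dimensional embedding come from the description above.

```python
import numpy as np
from PIL import Image

CROP_SIZE = (128, 128)  # standard crop size mentioned above (model-dependent)

def detect_faces(image: Image.Image) -> list[tuple[int, int, int, int]]:
    """Hypothetical face detector: returns (left, top, right, bottom) boxes."""
    raise NotImplementedError("stand-in for the face detection network")

def align_face(crop: Image.Image) -> Image.Image:
    """Hypothetical alignment step: warps the crop so that the eyes and
    nose land on the same image coordinates in every crop."""
    raise NotImplementedError("stand-in for the landmark-based alignment")

def embed_face(face: Image.Image) -> np.ndarray:
    """Hypothetical embedding network: maps an aligned crop to a 512-d vector."""
    raise NotImplementedError("stand-in for the embedding network")

def extract_embeddings(image: Image.Image) -> list[np.ndarray]:
    """Run the full detect -> crop -> align -> embed pipeline on one image."""
    embeddings = []
    for box in detect_faces(image):
        crop = image.crop(box).resize(CROP_SIZE)  # cut out and resize the face
        aligned = align_face(crop)                # normalize eye/nose positions
        embeddings.append(embed_face(aligned))    # 512-d embedding
    return embeddings
```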
The distance between them is determined using cosine distance.
If a new image is now sent to the facial recognition service, the extracted face embeddings are compared to all currently stored embeddings. If the distance between two embeddings is below a certain threshold, it is considered a match.
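The matching step itself can be sketched with plain NumPy. The threshold value below is an arbitrary placeholder, not the value the service actually uses:

```python
import numpy as np

MATCH_THRESHOLD = 0.5  # placeholder; the real threshold is model-specific

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance = 1 - cosine similarity of two embeddings."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_match(new_embedding: np.ndarray,
               stored: dict[str, np.ndarray]) -> str | None:
    """Compare a new face embedding against all stored embeddings and
    return the closest person below the threshold, or None for no match."""
    best_person, best_dist = None, MATCH_THRESHOLD
    for person, embedding in stored.items():
        dist = cosine_distance(new_embedding, embedding)
        if dist < best_dist:
            best_person, best_dist = person, dist
    return best_person
```

Keeping the store as a simple person-to-embedding mapping mirrors the one-embedding-per-person approach described below.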
Current dependencies
python:3.10
What does the facial recognition service do?
In the first version, the API offers the following endpoints:
Why Re-Initialize?
Storing only one embedding per person is currently the easiest way to perform facial recognition. However, an image is not guaranteed to be taken under optimal conditions (lighting, head angle, etc.).
Example:
Mike is a friend of mine and I have a lot of photos of him. When I queue all images for the first time, the algorithm detects Mike at a football game. The image was taken from the side, so we can only see half of his face. But because it is the first appearance of Mike, this image is taken as the reference for ALL following images. That's not optimal.
As a workaround, an endpoint is exposed where embeddings/faces/persons can be re-initialized using a “perfect” photo.
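A minimal sketch of what such an endpoint could look like, assuming a FastAPI service and an in-memory store (the route name, payload shape, and store are all illustrative assumptions, not the real API):

```python
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

# In-memory store: person id -> reference embedding (illustrative only)
embedding_store: dict[str, list[float]] = {}

class ReinitRequest(BaseModel):
    person_id: str
    embedding: list[float]  # embedding extracted from the "perfect" photo

@app.post("/persons/reinitialize")  # hypothetical route name
def reinitialize_person(req: ReinitRequest):
    """Replace a person's stored reference embedding with one computed
    from a hand-picked, well-lit frontal photo."""
    if len(req.embedding) != 512:
        raise HTTPException(status_code=400, detail="expected a 512-d embedding")
    embedding_store[req.person_id] = req.embedding
    return {"person_id": req.person_id, "status": "reinitialized"}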
Improve Rescan by storing information (TBD)
My idea is that, for future rescans of the data collection, I want to store meta information for each scanned image. This meta information contains the positions of the found faces as well as the resulting embeddings. This is important because of the possible re-initialization of an embedding.
By storing the generated embeddings, I can simply recalculate the distances for the updated persons and provide the information back to the main application, which stores the “image tagging information”.
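As a sketch, the stored meta information could be modeled like this (the dataclass layout and field names are illustrative assumptions):

```python
from dataclasses import dataclass

import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

@dataclass
class DetectedFace:
    box: tuple[int, int, int, int]  # position of the face in the image
    embedding: np.ndarray           # 512-d embedding computed at scan time

@dataclass
class ImageScanResult:
    image_id: str
    faces: list[DetectedFace]

def rescan(results: list[ImageScanResult],
           new_reference: np.ndarray,
           threshold: float = 0.5) -> list[str]:
    """After a person's reference embedding was re-initialized, re-check the
    stored embeddings against it without re-running the neural networks."""
    matched_images = []
    for result in results:
        if any(cosine_distance(f.embedding, new_reference) < threshold
               for f in result.faces):
            matched_images.append(result.image_id)
    return matched_images
```

Because the embeddings are persisted alongside the face positions, a rescan only recomputes distances; the expensive detection and embedding networks never have to run again.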