This repository uses tfgo to perform face recognition on an image read from file. After a lot of effort, I came to the conclusion that anyone who wants to load a PyTorch or JAX deep learning model in Go had better think twice before committing a good amount of effort to it. Instead, it is better to first convert the model to TensorFlow and then work with tfgo.
In this repo, the input image is first processed, and then its embedding is compared against the ones already computed from our dataset. To compute and save embeddings from an arbitrary dataset, one can use the QMagFace repo. Once the embeddings are ready, this repo uses Go to do the face recognition. If the similarity score between embeddings falls below a specific threshold, the face is considered unknown. Otherwise, the proper label is printed.
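Reading that score as a cosine similarity, a minimal sketch of the matching rule could look like the following. The function names, the threshold value, and the toy embeddings are illustrative, not this repo's actual API:

```go
package main

import (
	"fmt"
	"math"
)

// cosine returns the cosine similarity between two embeddings.
func cosine(a, b []float32) float32 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return float32(dot / (math.Sqrt(na)*math.Sqrt(nb) + 1e-12))
}

// matchFace labels an embedding with its best-scoring registered identity,
// or "unknown" when no score reaches the threshold.
func matchFace(emb []float32, registered map[string][]float32, threshold float32) string {
	label, best := "unknown", threshold
	for name, reg := range registered {
		if s := cosine(emb, reg); s >= best {
			label, best = name, s
		}
	}
	return label
}

func main() {
	registered := map[string][]float32{"alice": {1, 0}, "bob": {0, 1}}
	fmt.Println(matchFace([]float32{0.9, 0.1}, registered, 0.7)) // prints "alice"
}
```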
This project was tested with Go 1.17 on Ubuntu 20.04. Except for `tfgo`, the latest versions of the other packages were used. For `gocv`, the installed OpenCV version is 4.7. For `tfgo`, I installed this version instead of the official one.
Just run the following command in your project to install this package:
```sh
go get github.com/modanesh/[email protected]
```
There are many ways to convert a non-TF model to a TF one. For that purpose, I used ONNX as an intermediary to convert the QMagFace model from PyTorch to TF. Use the `model_converter.py` script to convert the PyTorch model to ONNX first, and then the ONNX model to TF. Some of the code in `model_converter.py` is taken from the official QMagFace implementation.
For this project, you may download the MTCNN and MagFace TensorFlow models from the following URLs:

| Model | URL |
|---|---|
| MTCNN | Google Drive |
| MagFace | Google Drive |
To run the model using tfgo, you need to know the names of the input and output layers. The `saved_model_cli` command is useful for extracting this information. A model exported with `tf.saved_model.save()` automatically comes with the "serve" tag, because the SavedModel file format is designed for serving. This tag contains the various exported functions, and among these the "serving_default" `signature_def` is always present. This signature def works exactly like a TF 1.x graph: get the input tensor and the output tensor, and use them as the placeholder to feed and the output to fetch, respectively.

The best tool to inspect a SavedModel is `saved_model_cli`, which comes with the TensorFlow Python package. For example:
```sh
saved_model_cli show --all --dir output/keras
```
gives, among other things, this info:
```
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: serving_default_inputs_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['logits'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
```
Knowing the input and output layers' names, `serving_default_inputs_input:0` and `StatefulPartitionedCall:0`, is essential to run the model in tfgo.
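With those names, a minimal sketch of loading and running this SavedModel through tfgo could look like the following. It assumes the upstream tfgo API; the fork installed above is expected to expose the same interface:

```go
package main

import (
	"fmt"

	tf "github.com/galeone/tensorflow/tensorflow/go"
	tg "github.com/galeone/tfgo"
)

func main() {
	// Load the SavedModel with the "serve" tag reported by saved_model_cli.
	model := tg.LoadModel("output/keras", []string{"serve"}, nil)

	// Zero-valued input with the shape (-1, 28, 28, 1) from the signature def.
	input, err := tf.NewTensor([1][28][28][1]float32{})
	if err != nil {
		panic(err)
	}

	// Feed the input placeholder and fetch the output, using the layer names
	// from the signature def ("name:0" becomes operation name plus index 0).
	results := model.Exec([]tf.Output{
		model.Op("StatefulPartitionedCall", 0),
	}, map[tf.Output]*tf.Tensor{
		model.Op("serving_default_inputs_input", 0): input,
	})

	fmt.Println(results[0].Value()) // logits with shape (1, 10)
}
```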
This project uses MTCNN for face detection and QMagFace for face recognition. For MTCNN, three stages (PNet, RNet, ONet) are used, in a fashion similar to FaceNet. Each stage is done in its corresponding function (see the sketch after this list):
- First stage (PNet): `totalBoxes := firstStage(scales, img, pnetModel)`
- Second stage (RNet): `squaredBoxes := secondStage(totalBoxes, width, height, img, rnetModel)`
- Third stage (ONet): `thirdPickedBoxes, pickedPoints := thirdStage(squaredBoxes, width, height, img, onetModel)`
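A rough sketch of how these stages chain together. The stage functions are this repo's; the model paths, the `20` minimum face size, and the `computeScalePyramid` helper are hypothetical stand-ins:

```go
// Read the input image with gocv and load the three MTCNN SavedModels.
img := gocv.IMRead("IMAGE.jpg", gocv.IMReadColor)
width, height := img.Cols(), img.Rows()

pnetModel := tg.LoadModel("path/to/MTCNN_MODELS_DIR/pnet", []string{"serve"}, nil)
rnetModel := tg.LoadModel("path/to/MTCNN_MODELS_DIR/rnet", []string{"serve"}, nil)
onetModel := tg.LoadModel("path/to/MTCNN_MODELS_DIR/onet", []string{"serve"}, nil)

// Hypothetical helper: build the image pyramid scales for PNet
// from a minimum detectable face size.
scales := computeScalePyramid(20, width, height)

totalBoxes := firstStage(scales, img, pnetModel)                       // candidate boxes
squaredBoxes := secondStage(totalBoxes, width, height, img, rnetModel) // refined, squared boxes
thirdPickedBoxes, pickedPoints := thirdStage(squaredBoxes, width, height, img, onetModel) // final boxes + landmarks
```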
You may download the models from the available Google Drive URLs.
After the face detection stage comes face alignment. The function performing face alignment is `pImgs := alignFace(thirdPickedBoxes, pickedPoints, img)`, which imitates the steps from here.
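The linked reference aligns faces by warping the detected landmarks onto a canonical template. As a generic, hedged sketch of that idea with gocv (the five template points and the 112x112 output size follow the common ArcFace convention and are assumptions, as is the `landmarks` variable; this is not necessarily the repo's exact code):

```go
// Estimate a similarity transform from the five detected landmarks to a
// canonical 112x112 template, then warp the image onto that template.
ref := gocv.NewPoint2fVectorFromPoints([]gocv.Point2f{
	{X: 38.2946, Y: 51.6963}, // left eye
	{X: 73.5318, Y: 51.5014}, // right eye
	{X: 56.0252, Y: 71.7366}, // nose tip
	{X: 41.5493, Y: 92.3655}, // left mouth corner
	{X: 70.7299, Y: 92.2041}, // right mouth corner
})
defer ref.Close()

src := gocv.NewPoint2fVectorFromPoints(landmarks) // five points from ONet (assumed)
defer src.Close()

m := gocv.EstimateAffinePartial2D(src, ref) // 2x3 similarity transform
defer m.Close()

aligned := gocv.NewMat()
gocv.WarpAffine(img, &aligned, m, image.Pt(112, 112))
```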
Finally, once the face is detected and aligned, the recognition phase can start. It happens at this line: `recognizeFace(pImgs, qmfModel, regEmbeddings, bSize, regFiles)`.
Use the command below to run the code:
```sh
go run main.go IMAGE.jpg path/to/REGISTERED_IMAGES path/to/EMBEDDINGS.npy path/to/MTCNN_MODELS_DIR path/to/MAGFACE_MODEL_DIR
```
where:

- `IMAGE.jpg`: path to the given image
- `path/to/REGISTERED_IMAGES`: directory containing the registered images
- `path/to/EMBEDDINGS.npy`: the embeddings extracted from the registered images using the Python QMagFace implementation
- `path/to/MTCNN_MODELS_DIR`: directory containing the TensorFlow models for MTCNN
- `path/to/MAGFACE_MODEL_DIR`: directory containing the TensorFlow model for MagFace
The main challenge thus far has been the conversion between `gocv.Mat`, `tf.Tensor`, `gonum`, and Go's native slices. The conversion is required because some matrix transformations are only available in `gocv` and some in `tfgo`. Also, the input to a `tfgo` model must be of type `tf.Tensor`, so the image read by `gocv` inevitably needs to be converted into a tensor. Moreover, some matrix operations are not available in any of these packages, so I had to implement them myself from scratch using Go's native slices. As a result, conversions between these types are frequent throughout the code.
For example, besides doing some scaling, the function `adjustInput()` also converts a `gocv.Mat` to a Go `[][][][]float32`. And at the line `inputBufTensor, _ := tf.NewTensor(inputBuf)`, a `[][][][]float32` slice is converted to a `*tf.Tensor`.
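As a hedged sketch of that round trip, assuming a BGR `uint8` Mat; the `(x - 127.5) * 0.0078125` normalization is the usual MTCNN preprocessing, and the constants are illustrative rather than this repo's exact values:

```go
import (
	tf "github.com/galeone/tensorflow/tensorflow/go"
	"gocv.io/x/gocv"
)

// matToTensor converts a BGR uint8 gocv.Mat into a batched [][][][]float32
// and then into a *tf.Tensor of shape (1, rows, cols, channels).
func matToTensor(img gocv.Mat) (*tf.Tensor, error) {
	rows, cols, chans := img.Rows(), img.Cols(), img.Channels()
	buf := make([][][][]float32, 1)
	buf[0] = make([][][]float32, rows)
	for y := 0; y < rows; y++ {
		buf[0][y] = make([][]float32, cols)
		for x := 0; x < cols; x++ {
			px := make([]float32, chans)
			vec := img.GetVecbAt(y, x) // raw uint8 BGR pixel
			for c := 0; c < chans; c++ {
				// MTCNN-style normalization; illustrative constants.
				px[c] = (float32(vec[c]) - 127.5) * 0.0078125
			}
			buf[0][y][x] = px
		}
	}
	return tf.NewTensor(buf)
}
```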
In contrast, these type conversions are done quite easily and quickly in Python.
- Check why the recognition model takes so long for a forward pass. In Python, it takes about 0.5 milliseconds, while in Go it takes about 5500 milliseconds. For the first run in Go, the session instantiation takes a long time; subsequent runs are quite fast. Take a look at this issue. The `fakeRun()` function exists for that purpose (see the warm-up sketch below).
- Upload the models
- Create a Go package
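As a hedged sketch of what such a warm-up can look like (not the repo's exact `fakeRun()`; the layer names follow the serving-default convention shown earlier, and the (1, 112, 112, 3) input shape is an assumption for the recognition model):

```go
// warmUp runs one dummy forward pass so the first real inference does not
// pay the session start-up cost. Layer names and input shape are assumptions.
func warmUp(model *tg.Model) error {
	dummy, err := tf.NewTensor([1][112][112][3]float32{})
	if err != nil {
		return err
	}
	model.Exec([]tf.Output{
		model.Op("StatefulPartitionedCall", 0),
	}, map[tf.Output]*tf.Tensor{
		model.Op("serving_default_inputs_input", 0): dummy,
	})
	return nil
}
```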