General:
- How are design files and related scripts organized?
- How many resources do Efinix TinyML designs consume?
- Why does compiling the provided example designs using the Efinity RISC-V Embedded Software IDE fail?
- How to compile AI inference software app for optimized speed performance?
- Where are AI training and quantization scripts located?
- How to make use of outputs generated from model zoo training and quantization flow for inference purposes?
- How to run inference with or without Efinix TinyML accelerator?
- How to perform profiling of an AI model running on RISC-V?
- How to boot a complete TinyML design from flash?
- How to modify Efinix Vision TinyML demo designs to use the Google Coral Camera instead of the Raspberry Pi Camera v2?
Create Your Own TinyML Solution:
- How to run static input inference on a different test image with provided example quantized models?
- How to add user-defined accelerator?
- How to customize Efinix TinyML accelerator for different resource-performance trade-offs?
- How to train and quantize a different AI model for running on Efinix TinyML platform?
- How to run inference with a different quantized AI model using Efinix TinyML platform?
- How to implement a TinyML solution using Efinix TinyML platform?
The directory structure of Efinix TinyML repo is depicted below:
├── docs
├── model_zoo
│   ├── deep_autoencoder_anomaly_detection
│   ├── ds_cnn_keyword_spotting
│   ├── mediapipe_face_landmark_detection
│   ├── mobilenetv1_person_detection
│   ├── resnet_image_classification
│   └── yolo_person_detection
├── quick_start
├── tinyml_hello_world
│   ├── Ti60F225_tinyml_hello_world
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   └── Ti180M484_tinyml_hello_world
│       ├── embedded_sw
│       ├── ip
│       ├── replace_files
│       └── source
├── tinyml_vision
│   ├── Ti60F225_mediapipe_face_landmark_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   ├── Ti60F225_mobilenetv1_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   ├── Ti60F225_yolo_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   ├── Ti180M484_mediapipe_face_landmark_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   ├── Ti180M484_mobilenetv1_person_detect_demo
│   │   ├── embedded_sw
│   │   ├── ip
│   │   ├── replace_files
│   │   └── source
│   └── Ti180M484_yolo_person_detect_demo
│       ├── embedded_sw
│       ├── ip
│       ├── replace_files
│       └── source
└── tools
    └── tinyml_generator
For the TinyML Hello World design, the project structure is depicted below:
├── tinyml_hello_world
│   ├── <device>_tinyml_hello_world
│   │   ├── embedded_sw
│   │   │   └── SapphireSoc
│   │   │       └── software
│   │   │           └── standalone
│   │   │               ├── common
│   │   │               ├── tinyml_fl
│   │   │               ├── tinyml_imgc
│   │   │               ├── tinyml_kws
│   │   │               ├── tinyml_pdti8
│   │   │               ├── tinyml_ypd
│   │   │               └── tinyml_ad
│   │   ├── ip
│   │   ├── replace_files
│   │   │   ├── bootloader_4MB
│   │   │   └── user_def_accelerator
│   │   └── source
│   │       ├── axi
│   │       ├── common
│   │       ├── hw_accel
│   │       └── tinyml
For the TinyML Vision design, the project structure is depicted below:
├── tinyml_vision
│   ├── <device>_<architecture>_<application>_demo
│   │   ├── embedded_sw
│   │   │   └── SapphireSoc
│   │   │       └── software
│   │   │           └── standalone
│   │   │               ├── common
│   │   │               └── evsoc_tinyml_<application_alias>
│   │   ├── ip
│   │   ├── replace_files
│   │   │   └── bootloader_4MB
│   │   └── source
│   │       ├── axi
│   │       ├── cam
│   │       ├── common
│   │       ├── display
│   │       ├── hw_accel
│   │       └── tinyml
Note: Source files for the Efinix soft IPs are to be generated using the IP Manager in the Efinity® IDE; the IP settings files are provided in the ip directory of the respective project folder.
Resource utilization tables compiled for the Efinix Titanium® Ti60F225 device using Efinity® IDE v2022.2 are as follows.
Resource utilization for TinyML Hello World design:
Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
---|---|---|---|---|---|---|
TinyML Hello World (Total) | 53888 | 27838 | 8869 | 33359 | 186 | 74 |
RISC-V SoC | - | 6712 | 690 | 5565 | 48 | 4 |
DMA Controller | - | 4431 | 772 | 5591 | 45 | 0 |
HyperRAM Controller Core | - | 1153 | 305 | 2096 | 22 | 0 |
Hardware Accelerator* (Dummy) | - | 369 | 273 | 162 | 4 | 2 |
Efinix TinyML Accelerator | - | 14485 | 6817 | 18760 | 67 | 68 |
Resource utilization for Edge Vision TinyML MobileNetV1 Person Detection Demo design:
Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
---|---|---|---|---|---|---|
Person Detection Demo (Total) | 56387 | 27341 | 8993 | 35971 | 207 | 54 |
RISC-V SoC | - | 6481 | 697 | 5307 | 43 | 4 |
DMA Controller | - | 4339 | 520 | 5967 | 36 | 0 |
HyperRAM Controller Core | - | 1153 | 305 | 2099 | 22 | 0 |
CSI-2 RX Controller Core | - | 844 | 194 | 2053 | 15 | 0 |
DSI TX Controller Core | - | 1736 | 409 | 3491 | 19 | 0 |
Camera | - | 778 | 919 | 663 | 11 | 0 |
Display | - | 341 | 174 | 363 | 8 | 0 |
Hardware Accelerator* | - | 334 | 273 | 123 | 4 | 2 |
Efinix TinyML Accelerator | - | 10483 | 5483 | 14652 | 45 | 48 |
Resource utilization tables compiled for the Efinix Titanium® Ti180M484 device using Efinity® IDE v2022.2 are as follows.
Resource utilization for TinyML Hello World design:
Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
---|---|---|---|---|---|---|
TinyML Hello World (Total) | 130214 | 64441 | 21350 | 79716 | 492 | 186 |
RISC-V SoC | - | 11463 | 699 | 7239 | 87 | 4 |
DMA Controller | - | 9422 | 832 | 14637 | 223 | 0 |
Hardware Accelerator* (Dummy) | - | 352 | 294 | 125 | 4 | 2 |
Efinix TinyML Accelerator | - | 40895 | 19516 | 54163 | 178 | 180 |
Resource utilization for Edge Vision TinyML MobileNetV1 Person Detection Demo design:
Building Block | XLR | FF | ADD | LUT | MEM (M10K) | DSP |
---|---|---|---|---|---|---|
Person Detection Demo (Total) | 123412 | 60303 | 21529 | 74382 | 545 | 166 |
RISC-V SoC | - | 11788 | 769 | 7464 | 87 | 4 |
DMA Controller | - | 10358 | 921 | 15602 | 240 | 0 |
CSI-2 RX Controller Core | - | 611 | 204 | 1602 | 17 | 0 |
Camera | - | 744 | 946 | 662 | 11 | 0 |
Display | - | 762 | 226 | 603 | 46 | 0 |
Hardware Accelerator* | - | 352 | 294 | 136 | 4 | 2 |
Efinix TinyML Accelerator | - | 33333 | 18160 | 45143 | 140 | 160 |
* The hardware accelerator consists of pre-processing blocks for inference. For the MobileNetV1 Person Detection Demo design, the pre-processing blocks are image downscaling, RGB-to-grayscale conversion, and grayscale pixel packing. Refer to defines.v of the respective design for its TinyML accelerator configuration.
Note: Resource values may vary from compile to compile due to PnR and updates in RTL. The presented tables serve as a reference only.
The user is required to generate the Sapphire RISC-V SoC IP using the IP Manager in the Efinity® IDE; compilation of the provided example designs fails if this step is skipped. The software-related contents of the RISC-V SoC IP are generated in the embedded_sw folder.
In the Efinity RISC-V Embedded Software IDE, set the environment variables below so that C/C++ compilation uses the O3 flag, which optimizes for speed performance. Go to Window -> Preferences -> C/C++ -> Build -> Environment and set:
- BENCH to yes
- DEBUG to no
- DEBUG_OG to no
AI model training and quantization scripts are located in the model_zoo directory. Refer to the model_zoo directory for more details regarding the AI models, training, and quantization.
How to make use of outputs generated from model zoo training and quantization flow for inference purposes?
Two output files are generated by the training and post-training quantization flow: <architecture>_<application>_model_data.h and <architecture>_<application>_model_data.cc. The generated output files contain the model data of the quantized model. In the provided example/demo designs, they are placed in the <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/model folder.
The model data header is included in main.cc in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory. The model data is assigned to the TFLite interpreter through the command below:
model = tflite::GetModel(<architecture>_<application>_model_data);
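For illustration, a minimal sketch of how the generated model data is typically hooked up in main.cc, assuming a hypothetical generated symbol ds_cnn_kws_model_data (actual symbols follow the <architecture>_<application>_model_data naming of the generated files; error_reporter is assumed to be declared earlier):

    #include "model/ds_cnn_kws_model_data.h" // hypothetical generated header
    const tflite::Model* model = tflite::GetModel(ds_cnn_kws_model_data);
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        // Schema mismatch between the converted model and the TFLite Micro library
        TF_LITE_REPORT_ERROR(error_reporter, "Model schema version mismatch");
    }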
By default, the provided example/demo designs have the Efinix TinyML accelerator enabled; this is set in define.cc in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/model directory. Note that the define.cc file is generated using the Efinix TinyML Generator.
To run AI inference using a pure software approach, use the Efinix TinyML Generator to disable the Efinix TinyML accelerator accordingly. Alternatively, set all the *_mode variables in define.cc to 0, as sketched below.
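For illustration only, a hedged sketch of such *_mode settings in a generated define.cc (the variable names here are assumptions; the actual names depend on the layers selected in the Efinix TinyML Generator):

    // Hypothetical excerpt of define.cc: 0 = pure software, non-zero = accelerated.
    int conv_mode = 0;           // convolution layers run in software
    int depthwise_conv_mode = 0; // depthwise convolution layers run in software
    int add_mode = 0;            // element-wise add runs in software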
To perform profiling, i.e., to determine the execution time of a quantized AI model running on RISC-V, make the following modification in main.cc of the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory to enable the profiler:
//error_reporter, nullptr); //Without profiler
error_reporter, &prof); //With profiler
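The two lines above are the tail of the MicroInterpreter construction in main.cc. For context, a hedged sketch of the surrounding code (the exact argument list may differ with the TFLite Micro version in use; prof is assumed to be a tflite::MicroProfiler declared earlier):

    static tflite::MicroProfiler prof; // profiler instance
    static tflite::MicroInterpreter static_interpreter(
        model, resolver, tensor_arena, kTensorArenaSize,
        error_reporter, &prof); // pass nullptr instead of &prof to disable profiling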
Build and run the particular software app of interest; the profiling result will be printed on the UART terminal.
A complete TinyML design consists of hardware/RTL (FPGA bitstream) and software/firmware (software binary). The FPGA bitstream is generated by Efinity® IDE compilation, whereas the software binary is generated by Efinity RISC-V Embedded Software IDE compilation. By default, a RISC-V bootloader copies a 124 KB user binary from flash to the main memory for execution upon boot-up.
As an AI-related application binary is typically larger than 124 KB, the bootloader must be updated to copy a larger software binary. A bootloader that copies up to 4 MB of software binary is provided in the <proj_directory>/replace_files/bootloader_4MB folder. Copy and replace the corresponding files, i.e., EfxSapphireSoc.v_toplevel_system_ramA_logic_ram_symbol*.bin, in the ip/SapphireSoc directory. Then, compile the Efinity project using the Efinity® IDE to generate the FPGA bitstream.
Refer to the Copy a User Binary to Flash (Efinity Programmer) section of the EVSoC User Guide for steps to combine the FPGA bitstream and user application binary using the Efinity Programmer, as well as to boot the design from flash.
How to modify Efinix Vision TinyML demo designs to use the Google Coral Camera instead of the Raspberry Pi Camera v2?
To get started, users may refer to the Google Coral designs (<device>_coral_<display>) in the EVSoC GitHub repo.
In summary, the required changes to use the Google Coral Camera in the Efinix Vision TinyML demo designs are as follows:
- To connect a Google Coral Camera to an Efinix development kit, a Google Coral Camera connector daughter card is required.
  - For the Titanium Ti60 F225 Development Board, connect the Google Coral Camera connector daughter card to the P2 header.
  - For the Titanium Ti180 M484 Development Board, connect the Google Coral Camera connector daughter card to the P1 header.
- Using the Efinity Interface Designer:
  - Update the GPIO settings for io_cam_scl, io_cam_sda, and o_cam_rstn accordingly. For the Ti180 design, create a new GPIO output block for o_cam_rstn.
  - Update the MIPI DPHY RX settings accordingly.
- Replace the RTL source file for the camera module, cam_picam_v2.v, with cam_coral.v from the EVSoC Google Coral design. Update the Efinity design file list accordingly.
- Update the top-level RTL source file edge_vision_soc.v accordingly:
  - Replace the line:
    cam_picam_v2 # (
    with:
    cam_coral # (
  - For the Ti180 design:
    - Add an output port in the I/O declaration:
      output o_cam_rstn,
    - Add the signal assignment:
      assign o_cam_rstn = i_arstn;
- Update the embedded_sw folder to use the software driver and settings for the Google Coral Camera:
  - Copy the Google Coral Camera driver files CoralCam.c and CoralCam.h from the EVSoC Google Coral design to <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/platform/vision.
  - Refer to common.h in the EVSoC Google Coral design for adding CORALCAM_I2C_ADDRESS and the i2c_reg_config_t variable to <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/platform/vision/common.h.
  - Update <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/main.cc accordingly:
    - Replace the line:
      #include "PiCamDriver.h"
      with:
      #include "CoralCam.h"
    - Replace the line:
      PiCam_init();
      with:
      CoralCam_init();
    - Replace the line:
      Set_RGBGain(1,5,3,4);
      with:
      Set_RGBGain(1,3,3,3);
In the provided TinyML Hello World example designs, the test image input data for static inference is defined in a header file placed in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/model folder. For example, quant_airplane.h and quant_bird.h contain the airplane and bird test images, respectively, for the ResNet image classification model.
The test image data header is included in main.cc in the corresponding <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src directory. The image data is assigned to the TFLite interpreter input through the commands below:
for (unsigned int i = 0; i < quant_airplane_dat_len; ++i)
    model_input->data.int8[i] = quant_airplane_dat[i];
Users may run inference on different test input data by creating a header file that contains the corresponding input data. For inference with image input, the input data is typically the grayscale or RGB pixel data of the test image. The input colour format, total data size, data type, etc., are determined during the AI model training/quantization stage. It is important to ensure the provided test data fulfils the input requirements of the quantized AI model used for inference.
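For illustration, a hedged sketch of such a custom test-input header, following the naming convention of quant_airplane.h (the name, size, and values here are hypothetical; the length and data type must match the quantized model's input tensor):

    // quant_custom.h -- hypothetical 32x32 RGB int8 test image (32*32*3 = 3072 bytes)
    const unsigned int quant_custom_dat_len = 3072;
    const signed char quant_custom_dat[] = {
        -128, -100, 52, 17, /* ... remaining quantized pixel values ... */
    };

The assignment loop shown earlier would then iterate over quant_custom_dat_len and copy quant_custom_dat into model_input->data.int8.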
The RISC-V custom instruction interface includes a 10-bit function ID signal, allowing up to 1024 custom instructions to be implemented. As coded in the tinyml_top module (<proj_directory>/source/tinyml/tinyml_top.v), function IDs with MSB 0 (up to 512 custom instructions) are reserved for the Efinix TinyML accelerator, whereas the remaining function IDs can be used to implement user-defined accelerators as the application requires.
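As an illustration of the software side, a hedged sketch of invoking a user-defined custom instruction from C/C++ via the GNU .insn directive (the CUSTOM_0 opcode and the funct7/funct3 split of the 10-bit function ID are assumptions; match them to the SoC's custom-instruction plugin configuration and the decoding in tinyml_top.v):

    // Hypothetical: function ID 0x200 (MSB 1, user-defined range) encoded as
    // funct7 = 0x40 and funct3 = 0x0 of an R-type CUSTOM_0 instruction.
    static inline int user_accel_op(int a, int b)
    {
        int result;
        asm volatile(".insn r CUSTOM_0, 0x0, 0x40, %0, %1, %2"
                     : "=r"(result)
                     : "r"(a), "r"(b));
        return result;
    }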
To demonstrate how to add a user-defined accelerator, a minimum/maximum (min/max) Lite accelerator example is provided in tinyml_hello_world/<proj_directory>/replace_files/user_def_accelerator.
- Copy the files in the hardware folder to <proj_directory>/source/tinyml.
- Copy the files in the software folder to <proj_directory>/embedded_sw/SapphireSoc/software/standalone/<application_name>/src/tensorflow/lite/kernels/internal/reference.
- Compile the hardware using Efinity® IDE, build the software using Efinity RISC-V Embedded Software IDE, and run the application.
A GUI-based Efinix TinyML Generator is provided to facilitate customization of the Efinix TinyML Accelerator.
The Efinix TinyML Accelerator supports two modes, customizable per layer type:
- Lite mode - a lightweight accelerator that consumes fewer resources.
- Standard mode - a high-performance accelerator that consumes more resources.
Refer to the Efinix Model Zoo for examples of how to make use of the training and quantization scripts based on different training frameworks and datasets. The training and quantization examples are provided as Jupyter Notebooks, which run on Google Colab. To make use of the produced quantized model data for inference purposes, refer to this FAQ.
If the user has their own pre-trained network (floating-point model), the training stage can be skipped. The user may proceed with model quantization and perform the conversion from the .tflite quantized model to the corresponding .h and .cc files for inference purposes.
Refer to this FAQ for training and quantization of a different AI model. To test the quantized model, it is recommended to first run inference of the targeted model using the TinyML Hello World design, which takes static input data. In addition, it is recommended to run inference in pure software mode, i.e., with the TinyML accelerator disabled (refer to this FAQ), as this helps isolate potential setting/design issues to either software (TFLite Micro library and inference setup) or hardware (TinyML accelerator).
With the TinyML accelerator disabled (pure software inference), some adjustments may be required to run a different AI model, since the overall model size, layers/operations, input/output format, normalization, etc., may vary between AI models. The following are some tips for making the necessary adjustments:
- Refer to this FAQ on how to include quantized model for inference purposes.
- Refer to this FAQ on how to include a different test input data.
- If the Allocate Tensor Failed error message appears on the UART terminal during inference execution, adjust the tensor arena size in main.cc (see the sketch after this list).
- If the Insufficient memory region size allocated error message appears when building the project in the Efinity RISC-V Embedded Software IDE, adjust the Application Region Size parameter of the Sapphire SoC IP using the Efinity® IDE IP Manager accordingly. It is important to ensure the adjusted Application Region Size does not exceed the external memory RAM size.
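For the tensor arena adjustment mentioned above, a minimal sketch of the relevant lines in main.cc (the constant name and size shown are illustrative; grow the arena until AllocateTensors() succeeds, within the available memory):

    // Enlarge the arena that TFLite Micro uses for tensors and scratch buffers.
    constexpr int kTensorArenaSize = 200 * 1024; // illustrative size
    static uint8_t tensor_arena[kTensorArenaSize];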
After inference runs successfully with the targeted AI model (with the expected inference score/output) in pure software mode, the Efinix TinyML accelerator can be enabled for hardware speed-up. Refer to the Efinix TinyML Generator for enabling/customizing the Efinix TinyML accelerator for the targeted model.
To implement a TinyML solution for a vision application, users may make use of the presented Efinix Edge Vision TinyML framework. For more details about the flexible, domain-specific Edge Vision SoC framework, visit the Edge Vision SoC webpage. Furthermore, users may refer to the provided demo designs on the Edge Vision TinyML framework for the interfacing and integration of a working vision AI system with camera and display.