An anime image tag detector based on a modified ML-Decoder. The model was trained on a cleaned version of danbooru2021.
- Uses a new TResNet-D structure as the backbone to enhance the learning of low-level features.
- Replaces the ReLU in the backbone with FReLU.
- Uses learnable queries for the transformer decoder.
https://huggingface.co/7eu7d7/ML-Danbooru
Download the model and run the command below:
python demo.py --data <path to image or directory> --model_name tresnet_d --num_of_groups 32 --ckpt <path to ckpt> --thr 0.7 --image_size 640
To preserve the image's aspect ratio, add `--keep_ratio True`:
python demo.py --data <path to image or directory> --model_name tresnet_d --num_of_groups 32 --ckpt <path to ckpt> --thr 0.7 --image_size 640 --keep_ratio True
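How `demo.py` implements `--keep_ratio` is not shown here. As a rough sketch of what aspect-preserving resizing usually means, the helper below (the name `keep_ratio_size` is hypothetical, and scaling the longer side to `--image_size` is an assumption) computes the target size for an image:

```python
def keep_ratio_size(width, height, image_size):
    """Scale (width, height) so the longer side equals image_size.

    Assumption: the longer side is matched to image_size; the actual
    demo script may use a different rule (e.g. the shorter side).
    """
    scale = image_size / max(width, height)
    # Round to whole pixels and never let a side collapse to zero.
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    return new_w, new_h


if __name__ == "__main__":
    print(keep_ratio_size(1280, 960, 640))
```

Without such a step, a non-square image resized straight to a square input would be stretched, which can distort the features the tagger relies on.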
To run the `caformer_m36` model instead, use `demo_ca.py`:
python demo_ca.py --data <path to image or directory> --model_name caformer_m36 --ckpt <path to ckpt> --thr 0.7 --image_size 448
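The `--thr 0.7` flag sets the score threshold for reporting a tag. A minimal sketch of that idea follows, assuming the model emits one logit per tag, that a sigmoid turns each logit into an independent probability, and that a tag is kept when its probability is strictly above the threshold. The function `select_tags` and the sample tag names are hypothetical and not taken from the demo scripts.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_tags(logits, tag_names, thr=0.7):
    """Return (tag, probability) pairs whose probability exceeds thr,
    sorted from most to least confident."""
    scored = [(name, sigmoid(logit)) for name, logit in zip(tag_names, logits)]
    kept = [(name, prob) for name, prob in scored if prob > thr]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    names = ["1girl", "solo", "hat", "smile"]   # illustrative tag names
    logits = [2.0, 0.0, -1.0, 3.0]              # made-up model outputs
    for name, prob in select_tags(logits, names, thr=0.7):
        print(f"{name}: {prob:.3f}")
```

Lowering the threshold returns more tags at the cost of more false positives; raising it does the opposite.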