PyTorch code for our paper "Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grained Image Quality Assessment"
Kai Liu, Ziqing Zhang, Wenbo Li, Renjing Pei, Fenglong Song, Xiaohong Liu, Linghe Kong, and Yulun Zhang
"Dog-IQA: Standard-guided Zero-shot MLLM for Mix-grained Image Quality Assessment", arXiv, 2024
[arXiv] [supplementary material] [visual results]
- 2024-10-03: Add pipeline figure and results.
- 2024-10-01: This repo is released! 🎉🎉🎉
Abstract: Image quality assessment (IQA) serves as the gold standard for evaluating model performance in nearly all computer vision fields. However, existing IQA methods still suffer from poor out-of-distribution generalization and expensive training costs. To address these problems, we propose Dog-IQA, a standard-guided zero-shot mix-grained IQA method, which is training-free and utilizes the exceptional prior knowledge of multimodal large language models (MLLMs). To obtain accurate IQA scores, i.e., scores consistent with human judgment, we design an MLLM-based inference pipeline that imitates human experts. Specifically, Dog-IQA applies two techniques. First, it scores objectively against specific standards that exploit MLLMs' behavior patterns and minimize the influence of subjective factors. Second, it takes both local semantic objects and the whole image as input and aggregates their scores, thereby leveraging local and global information. Our proposed Dog-IQA achieves state-of-the-art (SOTA) performance among training-free methods, and competitive performance compared with training-based methods in cross-dataset scenarios. Our code and models will be available at https://github.com/Kai-Liu001/Dog-IQA.
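For intuition, below is a minimal sketch of the mix-grained scoring idea described in the abstract. The helpers `segment_objects` and `mllm_score`, the area-weighted mean, and the mixing weight `alpha` are hypothetical placeholders for illustration only; the actual prompts, segmentation, and aggregation used in Dog-IQA may differ.

```python
# Minimal sketch of mix-grained quality scoring.
# `segment_objects` and `mllm_score` are hypothetical placeholders,
# not the actual Dog-IQA API.
from typing import List, Tuple
from PIL import Image


def segment_objects(image: Image.Image) -> List[Tuple[Image.Image, float]]:
    """Hypothetical: return (object crop, relative area) pairs,
    e.g., produced by a segmentation model such as SAM2."""
    raise NotImplementedError


def mllm_score(image: Image.Image) -> float:
    """Hypothetical: ask an MLLM (e.g., mPLUG-Owl3) to rate quality
    against a fixed discrete standard, returning a score such as 1-5."""
    raise NotImplementedError


def mix_grained_score(image: Image.Image, alpha: float = 0.5) -> float:
    """Aggregate local object scores with the global image score.
    The area weighting and `alpha` are assumptions for illustration."""
    global_score = mllm_score(image)
    objects = segment_objects(image)
    if not objects:
        return global_score
    total_area = sum(area for _, area in objects)
    local_score = sum(mllm_score(crop) * area for crop, area in objects) / total_area
    return alpha * global_score + (1 - alpha) * local_score
```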
The radar plot in Figure 1 of the main paper shows that our proposed Dog-IQA outperforms all previous training-free IQA methods on all five datasets in terms of both Spearman Rank Correlation Coefficient (SRCC) and Pearson Linear Correlation Coefficient (PLCC).
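As a reminder of how these metrics are computed, the snippet below evaluates SRCC and PLCC between predicted scores and ground-truth mean opinion scores (MOS) with `scipy.stats`; the arrays are placeholder values, not results from our experiments.

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

# Placeholder predictions and ground-truth mean opinion scores (MOS).
pred_scores = np.array([3.2, 4.1, 2.5, 4.8, 3.9])
mos_labels = np.array([3.0, 4.3, 2.2, 4.9, 3.7])

srcc, _ = spearmanr(pred_scores, mos_labels)  # rank (monotonic) correlation
plcc, _ = pearsonr(pred_scores, mos_labels)   # linear correlation
print(f"SRCC: {srcc:.4f}, PLCC: {plcc:.4f}")
```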
- Release code
- Datasets
- Evaluation
- Results
- Citation
- Acknowledgements
We achieve SOTA performance on various datasets compared with training-free approaches.
Comparisons with Training-free methods. (click to expand)
- Quantitative comparisons in Table 2 of the main paper
Comparisons with Training-based methods. (click to expand)
- Quantitative comparisons in Table 3 of the main paper
Visual Results. (click to expand)
- Visual results in Figure 5 of the main paper.
- Visual results in Figure 2 of the supplementary material.
If you find the code helpful in your research or work, please cite the following paper(s).
TBD.
Thanks to mPLUG-Owl3 and SAM2 for their outstanding models.