This repository contains the code for the GitHub Pages site of the publication Mobile Video Diffusion. The project page is hosted at https://qualcomm-ai-research.github.io/mobile-video-diffusion/.
This paper introduces the first mobile-optimized video diffusion model. By optimizing the spatio-temporal UNet from Stable Video Diffusion (SVD), we reduce memory and computational requirements. We achieve this by lowering the resolution to 512x256 px, incorporating multi-scale temporal representations, and introducing two novel pruning schemas to reduce the number of channels and temporal blocks in the UNet. Furthermore, we employ adversarial finetuning to reduce denoising to a single step. Our model, MobileVD, is 523x more efficient (1817.2 vs. 4.34 TFLOPs) with a slight quality drop (FVD 149 vs. 171), generating latents for a 14x512x256 px clip in 1.7 seconds on a Xiaomi-14 Pro.
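The MobileVD weights themselves are not distributed through this repository, but the SVD baseline the paper starts from is public. As a rough, illustrative sketch only (it assumes the `stabilityai/stable-video-diffusion-img2vid` checkpoint and Hugging Face's `StableVideoDiffusionPipeline`; `conditioning_frame.png` is a placeholder path), the snippet below runs the unmodified baseline at the paper's 512x256 resolution and 14-frame clip length. MobileVD's pruning, multi-scale temporal representations, and single-step adversarial distillation are not reproduced here:

```python
# Illustrative only: this is the *unmodified* SVD baseline run at the
# paper's clip shape (14 frames, 512x256 px), not MobileVD itself.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",  # public baseline checkpoint
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# Placeholder conditioning image, resized to the reduced resolution.
image = load_image("conditioning_frame.png").resize((512, 256))

# Baseline SVD needs ~25 denoising steps; MobileVD's adversarial
# finetuning reduces this to a single step (not available here).
frames = pipe(
    image,
    height=256,
    width=512,
    num_frames=14,
    num_inference_steps=25,
    decode_chunk_size=8,  # decode frames in chunks to limit VRAM use
).frames[0]

export_to_video(frames, "clip.mp4", fps=7)
```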