float8 matmul for inference + torchao fp8 training #28

Open
a-r-r-o-w opened this issue Oct 12, 2024 · 1 comment

@a-r-r-o-w
Owner

Torch has support for float8 matmul kernels, and they appear to be faster than bf16 on Ada and newer architectures. TorchAO supports fp8 training. This has been explored in a few recent optimization examples for Flux and other large models to achieve real-time image generation. I think we could explore this for CogVideoX training and see how it pans out (a rough sketch of the matmul path is included below).

Relevant links:

Since this might take some time to profile properly, it is low priority, but it is definitely worth exploring given that some other training libraries/UIs are looking into this too.
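
For reference, a minimal (untested) sketch of what the fp8 matmul path could look like. The per-tensor max-abs scaling here is just one possible scheme, and `torch._scaled_mm` is a private API whose signature has changed across PyTorch releases, so treat this purely as illustrative:

```python
import torch

# Illustrative fp8 matmul via torch._scaled_mm (private API, Ada/Hopper only).
def fp8_linear(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Simple per-tensor scales so the values fit the narrow e4m3 range.
    x_scale = x.abs().max() / torch.finfo(torch.float8_e4m3fn).max
    w_scale = w.abs().max() / torch.finfo(torch.float8_e4m3fn).max

    x_fp8 = (x / x_scale).to(torch.float8_e4m3fn)
    # _scaled_mm expects the second operand in column-major layout,
    # so cast the (out, in) weight first and then transpose the view.
    w_fp8_t = (w / w_scale).to(torch.float8_e4m3fn).t()

    # scale_a/scale_b are the dequantization multipliers for each operand.
    return torch._scaled_mm(
        x_fp8, w_fp8_t,
        scale_a=x_scale.float(),
        scale_b=w_scale.float(),
        out_dtype=torch.bfloat16,
    )

x = torch.randn(16, 4096, device="cuda", dtype=torch.bfloat16)
w = torch.randn(2048, 4096, device="cuda", dtype=torch.bfloat16)  # (out, in) like nn.Linear
y = fp8_linear(x, w)  # bf16 output computed with fp8 tensor cores
```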

@sayakpaul @zRzRzRzRzRzRzR

@sayakpaul
Collaborator

Float8 training could be interesting, but it also restricts the architectures we can do this on (only Ada and Hopper). When I did https://gist.github.com/sayakpaul/f0358dd4f4bcedf14211eba5704df25a, only Hopper was supported.

Nonetheless, a 4090 would benefit from this!
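
For the training side, a hedged sketch of what wiring torchao in could look like. The `convert_to_float8_training` entry point is from `torchao.float8`; the `nn.Sequential` model here is just a placeholder standing in for the CogVideoX transformer, and the compute-capability guard is my assumption about how to gate the feature:

```python
import torch
import torch.nn as nn
from torchao.float8 import convert_to_float8_training

# fp8 tensor cores need Ada (SM 8.9) or Hopper (SM 9.0+).
major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (8, 9), "fp8 training needs an Ada/Hopper GPU"

# Placeholder model; in practice this would be the CogVideoX transformer.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.GELU(),
    nn.Linear(4096, 4096),
).cuda().to(torch.bfloat16)

# Swap eligible nn.Linear layers so their matmuls run in fp8;
# gradients and optimizer state stay in higher precision.
convert_to_float8_training(model)

# torch.compile is recommended so the scaling/casting ops get fused.
model = torch.compile(model)

x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)
model(x).sum().backward()
```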
