forked from loganriggs/sae-rm

Using SAEs to interpret Reward Models (RMs)

sun-wendy/sae-rm

Languages

  • Jupyter Notebook 97.0%
  • Python 3.0%