Training script: DinoV2-with-registers backbone, transformer decoder, and FiLM layers from classifier-free guidance
☘️ Shoot an email to [email protected] if you'd like to tackle this issue, and I'll help as often as I can. I can provide A100 access once the script is ready.
Starter Code
Example Doing Identical task but with MaxViT
Resources
Highly-Recommended Guide to Follow
Transformer Head Code
DinoV2 Source Code
Text Guidance with Film
RT1: Robotics Transformers paper
Tokenize Actions (x, y, z, roll, pitch, yaw, grasp)
Transform pattern: (b f a) -> (b f a bins), with bins=255 (b = batch, f = frames, a = action dimensions)
This is simple per-dimension classification, not sequence-to-sequence modeling
Apply a min-max scaler
Apply k-bins discretization
Apply FiLM layers from classifier-free-guidance
Inference pattern: (b f c h w), str -> (b f a bins)
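The action tokenization steps above (min-max scaling, then k-bins) can be sketched as follows. This is a minimal, dependency-free illustration, not the intended implementation: it assumes uniform-width bins and per-dimension scaling fitted on the dataset, and the helper names (`fit_min_max`, `tokenize`, `detokenize`, `one_hot`) are hypothetical. Only `bins=255` and the 7-dimensional action layout come from the issue text.

```python
from typing import List, Sequence, Tuple

NUM_BINS = 255  # taken from the issue text ("bins=255")


def fit_min_max(actions: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension minimum and maximum over a dataset of action vectors."""
    columns = list(zip(*actions))
    return [min(c) for c in columns], [max(c) for c in columns]


def tokenize(action: Sequence[float], lows: Sequence[float], highs: Sequence[float],
             num_bins: int = NUM_BINS) -> List[int]:
    """Min-max scale each dimension to [0, 1], then map it to a uniform bin index."""
    tokens = []
    for value, lo, hi in zip(action, lows, highs):
        span = hi - lo
        scaled = 0.0 if span == 0 else (value - lo) / span
        scaled = min(max(scaled, 0.0), 1.0)  # clamp values outside the fitted range
        tokens.append(min(int(scaled * num_bins), num_bins - 1))
    return tokens


def detokenize(tokens: Sequence[int], lows: Sequence[float], highs: Sequence[float],
               num_bins: int = NUM_BINS) -> List[float]:
    """Map bin indices back to continuous values using bin centers."""
    return [lo + (t + 0.5) / num_bins * (hi - lo)
            for t, lo, hi in zip(tokens, lows, highs)]


def one_hot(tokens: Sequence[int], num_bins: int = NUM_BINS) -> List[List[int]]:
    """(a) -> (a bins): one classification target per action dimension."""
    return [[1 if i == t else 0 for i in range(num_bins)] for t in tokens]


# Toy dataset of (x, y, z, roll, pitch, yaw, grasp) actions; values are made up.
demo_actions = [
    [0.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.5, 3.0, 1.0, 2.0, 1.0],
    [0.5, 0.0, 1.0, 1.5, 0.5, 1.0, 1.0],
]
lows, highs = fit_min_max(demo_actions)
demo_tokens = tokenize(demo_actions[2], lows, highs)
demo_targets = one_hot(demo_tokens)  # 7 rows of 255 entries each
```

Applying `tokenize` to every frame of every batch element gives integer targets of shape (b f a); the model then predicts a distribution over `bins` for each of them, which is why the task reduces to classification. Quantile-based binning would be an equally plausible reading of "kbins" and would change only the bin edges.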
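The FiLM conditioning step listed above can be illustrated with the generic formulation, in which a text embedding produces a per-channel scale (gamma) and shift (beta) that modulate the visual features. The sketch below is an assumption-laden, pure-Python illustration: it uses the textbook form `gamma * x + beta`, which may differ in detail from the linked classifier-free-guidance code, and the names `linear` and `film` are hypothetical. A real training script would express the same thing as a PyTorch module.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Sequence[float], weight: Matrix, bias: Sequence[float]) -> Vector:
    """y = W x + b for a single vector x; weight has shape (out, in)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def film(tokens: Sequence[Sequence[float]], text_embedding: Sequence[float],
         w_gamma: Matrix, b_gamma: Sequence[float],
         w_beta: Matrix, b_beta: Sequence[float]) -> Matrix:
    """Feature-wise linear modulation of visual tokens by a text embedding.

    tokens:         (n, d) visual features, e.g. backbone patch tokens
    text_embedding: (t,)   pooled embedding of the instruction string
    The same gamma and beta (each of size d) are applied to every token.
    """
    gamma = linear(text_embedding, w_gamma, b_gamma)
    beta = linear(text_embedding, w_beta, b_beta)
    return [[g * x + b for x, g, b in zip(token, gamma, beta)] for token in tokens]


# Toy example: two visual tokens with d=2, text embedding with t=2.
visual_tokens = [[1.0, 1.0], [2.0, 3.0]]
text = [1.0, 2.0]

# Identity-style initialization (gamma = 1, beta = 0) leaves features unchanged,
# a common way to avoid disturbing a pretrained backbone at the start of training.
zeros = [[0.0, 0.0], [0.0, 0.0]]
unchanged = film(visual_tokens, text, zeros, [1.0, 1.0], zeros, [0.0, 0.0])

# A text-dependent gamma and a constant beta.
identity = [[1.0, 0.0], [0.0, 1.0]]
modulated = film(visual_tokens, text, identity, [0.0, 0.0], zeros, [0.5, -0.5])
```

With these toy weights, gamma equals the text embedding itself, so each feature channel is scaled by a different amount depending on the instruction.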
Details
Use the following losses:
Follow-On Work