Time IDs in SDXL training #9459
christopher5106 started this conversation in General
Replies: 3 comments
-
cc: @sayakpaul
-
This matters to me because I'm training a classifier on these latents for classifier guidance, but I can only train it on crops (I don't have the full data). So I need to be sure that when I apply the classifier to the whole image, the time ids are the same as the ones used for a crop taken at the same place in the image.
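To make the comparison concrete, here is a minimal sketch of how the six time ids could be assembled for a crop and for the uncropped image. `make_time_ids` is a hypothetical helper written for illustration, not a diffusers function, and the ordering simply follows the one listed in the original post.

```python
def make_time_ids(original_size, crop_top_left, target_size):
    # Concatenate three (height, width)-style pairs into one flat list of six integers.
    # Ordering assumed from the original post:
    # original height, original width, crop top, crop left, target height, target width.
    return list(original_size) + list(crop_top_left) + list(target_size)


# Example values taken from the original post (crop starting 10 px from the left).
crop_ids = make_time_ids((1024, 1034), (0, 10), (1024, 1024))

# The same image conditioned as uncropped: crop offsets set to (0, 0).
full_ids = make_time_ids((1024, 1034), (0, 0), (1024, 1024))

# Positions where the two conditionings differ.
differing = [i for i, (a, b) in enumerate(zip(crop_ids, full_ids)) if a != b]
print(crop_ids, full_ids, differing)
```

In this sketch only the crop coordinate entries differ between the two lists, so whether the conditioning matches depends on which crop offsets are passed at inference time.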
-
These are the micro-conditions introduced in the SDXL paper. You can read more about them in Section 2.2 of the paper: https://arxiv.org/abs/2307.01952.
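For a concrete picture of what happens to the six integers, here is a minimal pure-Python sketch of a sinusoidal embedding of the kind `Timesteps` computes. It assumes a 256-dimensional embedding per integer, a maximum period of 10000, and cosine terms placed before sine terms; these settings are assumptions for illustration and should be checked against the model config. `sinusoidal_embedding` and `embed_time_ids` are hypothetical helper names, not library functions.

```python
import math


def sinusoidal_embedding(value, dim=256, max_period=10000.0):
    # Frequencies decay geometrically from 1 down towards 1 / max_period.
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    angles = [value * f for f in freqs]
    # Assumed layout: cosine half first, then sine half.
    return [math.cos(a) for a in angles] + [math.sin(a) for a in angles]


def embed_time_ids(time_ids, dim=256):
    # Embed each integer independently, then concatenate into one flat vector.
    out = []
    for v in time_ids:
        out.extend(sinusoidal_embedding(float(v), dim))
    return out


time_ids = [1024, 1034, 0, 10, 1024, 1024]  # example from the original post
vec = embed_time_ids(time_ids)
print(len(vec))  # 6 integers x 256 dims = 1536
```

Under these assumptions each integer becomes a 256-dimensional vector of cosines and sines at geometrically spaced frequencies, which is likely the "exponential" pattern mentioned in the question. The six vectors are concatenated into one 1536-dimensional vector. As far as I understand the SDXL UNet code, that vector is then concatenated with the pooled text embedding and passed through a learned projection before being added to the timestep embedding.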
-
I was trying to understand the time ids in the SDXL training script. Each example gets an array such as [1024, 1034, 0, 10, 1024, 1024], which is fed directly into the forward pass of Timesteps. I'm wondering what happens there: how does each of these integers (original_height, original_width, crop_coord_top, crop_coord_left, target_resolution, target_resolution) get a kind of exponential embedding, and what is the expected output? There is something I'm missing. Thanks for your help!