Slow convergence on training on toy dataset that you provided #685

Open
iLori-Jiang opened this issue Jun 17, 2024 · 2 comments

iLori-Jiang commented Jun 17, 2024

First of all, thanks for your amazing work! @lllyasviel

However, when I followed your steps to train the model to fill the circle using your toy dataset, the model converged very slowly.

Below are the sampling results after 3,900 steps with a batch size of 4 (all parameters are unchanged from your tutorial_train.py).

G.T. [reconstruction_gs-003900_e-000000_b-003900]: (image)

Output [samples_cfg_scale_9 00_gs-003900_e-000000_b-003900]: (image)

The model seems to be able to understand the color, but cannot understand the position of the circle.

After continuing the training, these are the sampling results at 11,875 steps:

G.T. [reconstruction_gs-011875_e-000001_b-003000]: (image)

Output [samples_cfg_scale_9 00_gs-011875_e-000001_b-003000]: (image)

The model finally learns the position of the circle, but no longer seems to understand the color.

Do you have any insight into this problem, or any suggestions for solving it? Thank you in advance!
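
For anyone comparing their own runs against these samples: the image names above follow a `<kind>_gs-<n>_e-<n>_b-<n>` pattern. Below is a minimal sketch for pulling the numbers out of such a name. The reading of the fields (`gs` as global step, `e` as epoch, `b` as batch index within the epoch) is an assumption based on the names alone, and `parse_sample_name` is a hypothetical helper, not part of the repository.

```python
import re

# Assumed field meanings (not confirmed in this thread):
#   gs = global step, e = epoch, b = batch index within the epoch.
NAME_PATTERN = re.compile(
    r"^(?P<kind>.+)_gs-(?P<gs>\d+)_e-(?P<e>\d+)_b-(?P<b>\d+)$"
)


def parse_sample_name(name):
    """Split a logged sample name into its kind and step counters."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    return {
        "kind": match.group("kind"),
        "global_step": int(match.group("gs")),
        "epoch": int(match.group("e")),
        "batch_idx": int(match.group("b")),
    }


if __name__ == "__main__":
    # The two ground-truth names quoted in this post.
    for name in (
        "reconstruction_gs-003900_e-000000_b-003900",
        "reconstruction_gs-011875_e-000001_b-003000",
    ):
        print(parse_sample_name(name))
```

Sorting logged images by the parsed `global_step` makes it easier to line up the samples from two runs step by step.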

@Neltherion

I'm also facing this problem. Did further training improve anything? Or did you have to change the code somehow? 🤔

crapthings commented Nov 4, 2024

@iLori-Jiang

Wait until 25,000 steps.

I've tested this dataset, as well as a 50k (5w) dataset and a 30k (3w) dataset.
They all work after more than 18,000 steps.

This example uses the same circle dataset; just run the script and wait:
https://github.com/huggingface/diffusers/tree/main/examples/controlnet
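
To compare the step counts mentioned in this thread, here is a rough sketch that converts steps into training samples seen. It assumes a batch size of 4 for every figure (only the original post states its batch size) and no gradient accumulation; `samples_seen` is a hypothetical helper used only for illustration.

```python
def samples_seen(steps, batch_size):
    """Number of training samples processed after `steps` optimizer steps.

    Assumes one batch per step, i.e. no gradient accumulation.
    """
    return steps * batch_size


# Assumption: batch size 4 throughout, as in the original post.
BATCH_SIZE = 4

if __name__ == "__main__":
    # Step counts quoted in this thread.
    for steps in (3900, 11875, 18000, 25000):
        total = samples_seen(steps, BATCH_SIZE)
        print(f"{steps:>6} steps x batch {BATCH_SIZE} = {total:>7} samples")
```

Under these assumptions, the 3,900-step result in the original post corresponds to well under a quarter of the samples seen at the 18,000-step mark reported here.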
