Training omage with higher resolution

Hi authors,

I wonder if you have trained your pipeline at a higher resolution (e.g., 1024) yet. If so, how did you optimize the training of the DiT?

In my case, I naively increased the patch size, but it does not seem to work as well as expected.
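
To make the trade-off concrete, here is a rough sketch of the arithmetic I have in mind. It assumes a standard ViT-style patchify on a square input, where the token count is (resolution / patch_size)^2 and self-attention cost grows with the square of that count. The function name `num_tokens` and the patch sizes below are only my illustration, not taken from your code.

```python
def num_tokens(resolution: int, patch_size: int) -> int:
    """Token count for a square input under a plain ViT-style patchify (assumed)."""
    if resolution % patch_size != 0:
        raise ValueError("resolution must be divisible by patch_size")
    side = resolution // patch_size
    return side * side


if __name__ == "__main__":
    # Illustrative patch sizes only; not the values used in the repo.
    for patch_size in (2, 4, 8, 16):
        n = num_tokens(1024, patch_size)
        # Self-attention compares every token with every other token.
        print(f"patch={patch_size:2d}  tokens={n:7d}  attention pairs={n * n}")
```

A larger patch size keeps the sequence short, but each token then has to encode many more pixels, which may be why the quality drops in my runs.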

I'm looking forward to your answer.