Enhance positional encoding adjustment in SparseCtrl loading with exp… #83

Open
wants to merge 1 commit into
base: develop

@nazgut nazgut commented Mar 16, 2024

Enhanced the model loading process for SparseCtrl models by introducing the expected_seq_len parameter to dynamically adjust positional encoding (PE) dimensions. This improvement ensures the compatibility of positional encodings with models expecting different sequence lengths, enhancing the flexibility and usability of model loading, especially when dealing with models trained with a variety of configurations.

Changes include:

  • Addition of expected_seq_len to SparseSettings, allowing for customizable sequence length settings.
  • Implementation of dynamic adjustment for pos_encoder.pe parameters within the adjust_positional_encoding_parameters function, ensuring that the PE tensors match the expected sequence length.
  • Adaptation of SparseCtrlLoaderAdvanced and SparseCtrlMergedLoaderAdvanced classes to utilize the expected_seq_len setting, providing a seamless integration into the model loading workflow.
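
For readers who want a concrete picture of what resizing a `pos_encoder.pe` table to `expected_seq_len` can look like, here is a minimal, framework-free sketch. It is not the code from this PR: the function names `resize_positional_encoding` and `resize_pe_entries` are hypothetical, the table is modelled as a plain list of rows rather than a tensor, and linear interpolation along the sequence axis is an assumed strategy (the PR may truncate, pad, or interpolate differently).

```python
from typing import Dict, List

Table = List[List[float]]  # [seq_len][dim], a stand-in for a PE tensor


def resize_positional_encoding(pe: Table, expected_seq_len: int) -> Table:
    """Linearly resample a positional-encoding table to expected_seq_len rows.

    Assumption: interpolation along the sequence axis is acceptable for the
    model; this is an illustrative choice, not the PR's confirmed behaviour.
    """
    if expected_seq_len < 1:
        raise ValueError("expected_seq_len must be at least 1")
    src_len = len(pe)
    if src_len == 0:
        raise ValueError("positional encoding table is empty")
    if src_len == expected_seq_len:
        return [row[:] for row in pe]  # nothing to adjust, return a copy
    if src_len == 1 or expected_seq_len == 1:
        return [pe[0][:] for _ in range(expected_seq_len)]

    # Map output row i onto a fractional source position so that the first
    # and last rows of the source table are preserved exactly.
    scale = (src_len - 1) / (expected_seq_len - 1)
    out: Table = []
    for i in range(expected_seq_len):
        pos = i * scale
        lo = min(int(pos), src_len - 2)
        frac = pos - lo
        out.append([(1 - frac) * a + frac * b for a, b in zip(pe[lo], pe[lo + 1])])
    return out


def resize_pe_entries(state: Dict[str, Table], expected_seq_len: int) -> Dict[str, Table]:
    """Resize every entry whose key ends in 'pos_encoder.pe'; copy the rest."""
    return {
        key: resize_positional_encoding(value, expected_seq_len)
        if key.endswith("pos_encoder.pe")
        else value
        for key, value in state.items()
    }
```

With a real checkpoint the same idea would be applied to tensors in the state dict before loading, so that the PE shapes already match what the model expects.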

This upgrade addresses potential type mismatches and enhances the model's adaptability to different sequence lengths, streamlining the process for users and maintaining robustness across diverse model configurations.

@Kosinkadink
Owner

Hey, thanks for the PR! SparseCtrl is something I've wanted to revisit for a while, so I should have time to review this sometime in the next two to three weeks.

PE adjustment is something I was thinking of exposing as one of the options for dealing with context lengths that exceed the built-in 32, but in addition, I also want to enable the use of View Options, and to add an option to make the input images 'context aware'. The context awareness would automatically place one (or more) of the condhint inputs into context windows that have no condhint images when the manual indexes/spread methods don't have an explicit image to use.
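
To illustrate the 'context aware' idea, here is a rough sketch of one way the fallback could work. Everything in it is an assumption made for illustration: the function name `fill_empty_windows` is hypothetical, context windows are modelled as non-empty lists of frame indexes, and picking the hint closest to the window's center is just one possible rule.

```python
from typing import Dict, List


def fill_empty_windows(windows: List[List[int]], hint_indexes: List[int]) -> Dict[int, List[int]]:
    """Return, for each window number, the hint frame indexes it should use.

    Windows that already contain a hint keep their own hints. Windows with
    none borrow the single hint nearest to the window center (assumed rule).
    Each window is assumed to be a non-empty, ordered list of frame indexes.
    """
    if not hint_indexes:
        raise ValueError("at least one condhint index is required")

    assignment: Dict[int, List[int]] = {}
    for number, frames in enumerate(windows):
        own = [h for h in hint_indexes if h in frames]
        if own:
            assignment[number] = own
            continue
        center = (frames[0] + frames[-1]) / 2
        nearest = min(hint_indexes, key=lambda h: abs(h - center))
        assignment[number] = [nearest]
    return assignment


# Three windows of four frames each, with hints only on frames 0 and 10.
windows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
assignment = fill_empty_windows(windows, [0, 10])
```

In this example the middle window has no hint of its own, so it borrows the hint from frame 10, which is closer to its center than frame 0.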

For the purposes of PE adjustment and View Options support, I plan on copying a bunch of changes I made in AnimateDiff-Evolved over to SparseCtrl to support it (and to make my life easier by keeping the motion module code as similar as I can between AnimateDiff-Evolved and Advanced-ControlNet), so I'm not sure if I would accept this PR when I do review it, but I do want to acknowledge that something akin to the changes you propose to work with context_lengths greater than 32 will be added one way or another!
