Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The current instantiation does not trigger the initialization of submodules #2273

Open
dz1iang opened this issue Jan 16, 2025 · 4 comments
Open
Assignees
Labels
discussion Start a discussion triaged This issue has been assigned an owner and appropriate label

Comments

@dz1iang
Copy link

dz1iang commented Jan 16, 2025

Currently, the instantiation method used in Torchtune's configuration does not trigger the initialization of its submodules. The instantiate function from hydra.utils does handle this, but it differs from the current approach in Torchtune. Is there an elegant way to trigger the initialization of submodules?

Below is a simple example of how hydra.utils successfully triggers the initialization of submodules.

  • config
main_component:
  _target_: __main__.MainComponent
  sub_component:
    _target_: __main__.SubComponent
    param_a: 42
    param_b: "example"
  • test code
from omegaconf import OmegaConf
from hydra.utils import instantiate

class SubComponent:
    def __init__(self, param_a, param_b):
        self.param_a = param_a
        self.param_b = param_b
        #breakpoint()
        self.param_c = 1

    def __repr__(self):
        return f"SubComponent(param_a={self.param_a}, param_b={self.param_b})"

class MainComponent:
    def __init__(self, sub_component):
        self.sub_component = sub_component
        self.param_c = self.sub_component.param_c

    def __repr__(self):
        return f"MainComponent(sub_component={self.sub_component})"

nested_config_path = "nested_config.yaml"
config = OmegaConf.load(nested_config_path)

main_component = instantiate(config.main_component)
print(main_component)
print(main_component.param_c)
@RdoubleA
Copy link
Contributor

RdoubleA commented Jan 16, 2025

I love this question! We deliberately decided against allowing submodule initialization akin to Hydra for a few reasons:

  1. At the time during our launch, it was not needed and would've required significant effort to enable from scratch
  2. It greatly increases the space of user errors that we thought wasn't worth enabling. In other words, it becomes "meta-programming" via the config and could lead to great difficulty debugging.
  3. A nice byproduct is that it forces us to not nest classes/components too deep when designing new components. This has definitely become less the case as we onboard new features, especially for multimodal components

You can see the original PR discussion for some context: #406

If it starts becoming imperative to enable submodule initialization for better UX on more nested components for complex features, then I'm open to revisiting this discussion. But so far we've been able to work around this.

I'd be curious to understand what you are trying to achieve with recursive instantiation. One way we typically work around this is having "builder" functions that takes all the flat params that can be easily specified in the config with only one instantiation, and the builder constructs all the components needed for you. This is exactly how many of our component builders work to instantiate any model.

@joecummings joecummings added discussion Start a discussion triaged This issue has been assigned an owner and appropriate label labels Jan 16, 2025
@dz1iang
Copy link
Author

dz1iang commented Jan 17, 2025

I love this question! We deliberately decided against allowing submodule initialization akin to Hydra for a few reasons:

  1. At the time during our launch, it was not needed and would've required significant effort to enable from scratch
  2. It greatly increases the space of user errors that we thought wasn't worth enabling. In other words, it becomes "meta-programming" via the config and could lead to great difficulty debugging.
  3. A nice byproduct is that it forces us to not nest classes/components too deep when designing new components. This has definitely become less the case as we onboard new features, especially for multimodal components

You can see the original PR discussion for some context: #406

If it starts becoming imperative to enable submodule initialization for better UX on more nested components for complex features, then I'm open to revisiting this discussion. But so far we've been able to work around this.

I'd be curious to understand what you are trying to achieve with recursive instantiation. One way we typically work around this is having "builder" functions that takes all the flat params that can be easily specified in the config with only one instantiation, and the builder constructs all the components needed for you. This is exactly how many of our component builders work to instantiate any model.

There are two perspectives:

The first is the need for a quick migration, moving from another algorithm or framework to Torchtune, with support for submodule initialization. The required code changes are minimal, and optimizations for code and performance can be considered later.

The second is that, in the future, users can define the model structure through the configuration, without the need to write code every time.

@dz1iang
Copy link
Author

dz1iang commented Jan 17, 2025

and it can work like this:

  • config
main_component:
  _component_: __main__.main_component
  sub_component:
    _component_: __main__.SubComponent
    param_a: 42
    param_b: "example"
  • code
from omegaconf import OmegaConf
from torch import nn
from torchtune.config import instantiate

class SubComponent:
    def __init__(self, param_a, param_b):
        self.param_a = param_a
        self.param_b = param_b
        #breakpoint()
        self.param_c = 1

    def __repr__(self):
        return f"SubComponent(param_a={self.param_a}, param_b={self.param_b})"

class MainComponent:
    def __init__(self, sub_component):
        self.sub_component = sub_component
        self.param_c = self.sub_component.param_c

    def __repr__(self):
        return f"MainComponent(sub_component={self.sub_component})"

def main_component(
    sub_component: SubComponent
    ) -> nn.Module:
    sub_component = SubComponent(
        param_a=sub_component["param_a"],
        param_b=sub_component["param_b"],
    )
    main_component = MainComponent(
        sub_component=sub_component
    )
    return main_component

config_path = "config.yaml"
config = OmegaConf.load(config_path)

main_component = instantiate(config.main_component)
print(main_component)
print(main_component.param_c)

@RdoubleA
Copy link
Contributor

RdoubleA commented Jan 17, 2025

The first is the need for a quick migration, moving from another algorithm or framework to Torchtune, with support for submodule initialization.

Not sure I fully understand. Are you trying to migrate from another framework that uses recursive instantiation or Hydra?

The second is that, in the future, users can define the model structure through the configuration, without the need to write code every time.

Right, I can see why this is compelling. And while it would be convenient for config power users to define a full architecture in config, it brings you away from the actual python code itself. And I would also say that builder functions can achieve the same thing, without causing bloated configs:

def my_model(
    param_a,
    param_b,
    sub_param_a,
    sub_param_b,
):
    child_module = Submodule(sub_param_a, sub_param_b)
    parent_module = Module(child_module, param_a, param_b)

Then, in the config:

model:
  _component_: path.to.my_model
  param_a: xxx
  param_b: xxx
  sub_param_a: xxx
  sub_param_b: xxx

Basically, in your example main_component would take param_an and param_b directly instead of Subcomponent.

This is just the opinionated design stance we decided to take. Open to discussing any difficulties you may be having doing this for your use case, if it's not working out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion Start a discussion triaged This issue has been assigned an owner and appropriate label
Projects
None yet
Development

No branches or pull requests

3 participants