-
Notifications
You must be signed in to change notification settings - Fork 282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Profiling #787
base: main
Are you sure you want to change the base?
Profiling #787
Conversation
…ger. Still several TBD Objects but this implementation will capture a torch profile.
…learly separated as well as enable easier extensions.
…a crash that I haven't resolved yet.
…ler, for now, and the most (only?) useful annotation tool
…ode slightly. Ready for draft PR
Some minor updtes to the tools themselves to accomodate instance clearing and refreshing.
/multi-gpu-ci |
FYI - @ktangsali this PR has no additional multi gpu tests, actually! Just single device here. |
/blossom-ci |
@ktangsali can you trigger the CI on this branch? I checked that it passes locally and I thought my authorization went through, but it did not yet. |
/blossom-ci |
Remove nvtx wrapper
/blossom-ci |
/blossom-ci |
…native layer norm.
/blossom-ci |
1 similar comment
/blossom-ci |
…the test, I think, about TELayerNorm
/blossom-ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, it looks good. Many of my comments address stylistic changes more than functional changes.
@property | ||
def enabled(self) -> bool: | ||
"""Get whether the profiler is enabled. | ||
|
||
Returns: | ||
bool: True if profiler is enabled, False otherwise | ||
""" | ||
return self._enabled | ||
|
||
@enabled.setter | ||
def enabled(self, value: bool) -> None: | ||
"""Set whether the profiler is enabled. | ||
|
||
Args: | ||
value (bool): True to enable profiler, False to disable | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("enabled must be a boolean value") | ||
self._enabled = value | ||
|
||
@property | ||
def finalized(self) -> bool: | ||
"""Get whether the profiler has been finalized. | ||
|
||
Returns: | ||
bool: True if profiler is finalized, False otherwise | ||
""" | ||
return self._finalized | ||
|
||
@finalized.setter | ||
def finalized(self, value: bool) -> None: | ||
"""Set whether the profiler has been finalized. | ||
|
||
Args: | ||
value (bool): True to mark as finalized, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("finalized must be a boolean value") | ||
self._finalized = value | ||
|
||
@property | ||
def initialized(self) -> bool: | ||
"""Get whether the profiler has been initialized. | ||
|
||
Returns: | ||
bool: True if profiler is initialized, False otherwise | ||
""" | ||
return self._initialized | ||
|
||
@initialized.setter | ||
def initialized(self, value: bool) -> None: | ||
"""Set whether the profiler has been initialized. | ||
|
||
Args: | ||
value (bool): True to mark as initialized, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("initialized must be a boolean value") | ||
self._initialized = value | ||
|
||
@property | ||
def is_decorator(self): | ||
""" | ||
Flag to declare if this profiling instance supports function decoration | ||
""" | ||
return self._is_decorator | ||
|
||
@is_decorator.setter | ||
def is_decorator(self, value: bool) -> None: | ||
"""Set whether the profiler supports function decoration. | ||
|
||
Args: | ||
value (bool): True to support function decoration, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("is_decorator must be a boolean value") | ||
|
||
@property | ||
def is_context(self): | ||
""" | ||
Flag to declare if this profiling instance supports context-based profiling | ||
""" | ||
return self._is_context | ||
|
||
@is_context.setter | ||
def is_context(self, value: bool) -> None: | ||
"""Set whether the profiler supports context-based profiling. | ||
|
||
Args: | ||
value (bool): True to support context-based profiling, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("is_context must be a boolean value") | ||
self._is_context = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are properties (each with their own setters and getters) needed? I would find it simpler to leave them as attributes and document each one in the class docstring.
def enable(self) -> None: | ||
"""Enable the profiler. | ||
|
||
Sets the internal enabled flag to True to activate profiling. | ||
""" | ||
self._enabled = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is an internal flag, should we prefix it with an underscore (i.e., _enable
)? Also, similar to the above comment, do we need a helper method for this—especially since it is an internal flag?
if cls not in cls._instances: | ||
with cls._lock: | ||
# Double-checked locking pattern | ||
if cls not in cls._instances: | ||
cls._instances[cls] = super().__call__(*args, **kwargs) | ||
return cls._instances[cls] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the GIL will effectively make this thread-safe, I wouldn't rely on this—especially with the possibility of a free-threaded mode in the future. I would wrap the entire section with the lock.
if cls in cls._instances: | ||
del cls._instances[cls] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this is mainly for testing purposes, we should hold the lock for this method.
return instance | ||
|
||
else: | ||
raise Exception(f"ProfilerRegistry has no profiler under the key {key}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use a KeyError
rather than a generic exception.
if key in cls._instances: | ||
# Find instance of matching type in list | ||
for instance in cls._instances: | ||
if instance == key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing classes should use is
(see E721).
# writing outputs, etc. Only want to trigger this once | ||
_finalized: bool | ||
|
||
exit_stack = ExitStack() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intended that exit_stack
be shared with all instances? This would mean that on any context exit would exit all profiler contexts.
_is_decorator: Whether this profiler supports decorator usage | ||
""" | ||
|
||
__metaclass__ = _Profiler_Singleton |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the keyword argument for the class definition like LineProfileWrapper
?
if not self.enabled: | ||
return | ||
|
||
# Avoid finalizing if we never initialized: | ||
if not self.initialized: | ||
return | ||
|
||
# Prevent double finalization: | ||
if self.finalized: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably merge this into one if condition for brevity.
@property | ||
def enabled(self) -> bool: | ||
"""Get whether the profiler is enabled. | ||
|
||
Returns: | ||
bool: True if profiler is enabled, False otherwise | ||
""" | ||
return self._enabled | ||
|
||
@enabled.setter | ||
def enabled(self, value: bool) -> None: | ||
"""Set whether the profiler is enabled. | ||
|
||
Args: | ||
value (bool): True to enable profiler, False to disable | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("enabled must be a boolean value") | ||
self._enabled = value | ||
|
||
@property | ||
def finalized(self) -> bool: | ||
"""Get whether the profiler has been finalized. | ||
|
||
Returns: | ||
bool: True if profiler is finalized, False otherwise | ||
""" | ||
return self._finalized | ||
|
||
@finalized.setter | ||
def finalized(self, value: bool) -> None: | ||
"""Set whether the profiler has been finalized. | ||
|
||
Args: | ||
value (bool): True to mark as finalized, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("finalized must be a boolean value") | ||
self._finalized = value | ||
|
||
@property | ||
def initialized(self) -> bool: | ||
"""Get whether the profiler has been initialized. | ||
|
||
Returns: | ||
bool: True if profiler is initialized, False otherwise | ||
""" | ||
return self._initialized | ||
|
||
@initialized.setter | ||
def initialized(self, value: bool) -> None: | ||
"""Set whether the profiler has been initialized. | ||
|
||
Args: | ||
value (bool): True to mark as initialized, False otherwise | ||
""" | ||
if not isinstance(value, bool): | ||
raise TypeError("initialized must be a boolean value") | ||
self._initialized = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the lifecycle of a profiler always enabled
→ initialized
→ finalized
? Perhaps it would be simpler to have a single state variable of type Literal["enabled", "initialized", "finalized"]
such that we cannot accidentally have an erroneous state.
Reminder to change the target to |
Modulus Pull Request
Description
This pull request adds a utility to modulus,
modulus.utils.profiling
, which serves as a single point of entry for a few common profiling tools. This PR will also add tutorials on using these hooks, running and analyzing profilers, and showing how to extend the hooks to add custom user-specific tools. A high level summary of the additions:modulus.utils.profiling.core.ProfileRegistry
is singleton factory to let users conveniently access profiler instances from their code, wherever the call from.modulus.utils.profiling.core.ModulusProfilerWrapper
is a base class for profiler tools. Several common tools inherit this (like the torch profiler) and users can inherit and extend it.modulus.utils.profiling.interface.Profiler
is a singleton, one-stop-shop for access the configured profiling utilities. Let's users apply profiling hooks and annotations, but also lets users modify default profiler configurations when necessary.The profiler interface works like this: Users can instrument their code anywhere, anytime:
annotate
is always a pointer tonvtx.annotate
if it's installed, otherwise it's a null op.profile
is a rename ofProfiler().__call__
, which is a (delayed) function decorator. The delay is because model code may often be imported and evaluated by the interpetter before the Profiler is fully initialized. To prevent the profiler from failing in these cases, the function signatures it wraps are captured and the decoration is replayed right after the profiler is initialized.Checklist
Dependencies