-
Notifications
You must be signed in to change notification settings - Fork 45
Finalize the toolsAPI helper methods and classes #1893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]> Fixes NVIDIA#1887 Simplifies the interface in order to use the Tools-API more efficiently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request refactors the API v1 report handler system to simplify and modernize the interface for using the Tools-API. The main change removes the legacy APIHelpers
builder pattern in favor of direct class-based instantiation for core report handlers like QualCore
and ProfCore
, resulting in cleaner and more maintainable code.
Key changes:
- Removed
APIHelpers
builder class and replaced with direct instantiation of core handlers - Introduced new
CombinedCSVBuilder
class with metaclass to replaceCombinedDFBuilder
- Updated all imports and usages throughout the codebase to use the new handler classes directly
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
test_result_handler_steps.py | Updated test imports and handler instantiation to use new API |
qualx/util.py | Replaced APIHelpers with direct QualCore/QualWrapper usage |
qualx/qualx_main.py | Updated imports and CombinedDFBuilder usage to new CombinedCSVBuilder |
qualx/preprocess.py | Changed ProfWrapper import from APIHelpers to direct import |
qualification_stats_report.py | Simplified CSV builder usage with new handler methods |
result_handler.py | Added get_raw_metrics_path method to base handler |
qual_handler.py | Removed duplicate get_raw_metrics_path method |
builder.py | Major refactor: removed APIHelpers, added new builder classes and metaclasses |
init.py | Updated exports to include new handler classes |
rapids_tool.py | Updated core_handler property to return APIResHandler type |
prediction.py | Updated qual_handler property to use direct QualCore instantiation |
qualification_stats.py | Updated QualCore import and instantiation |
qualification_core.py | Updated type annotations for new QualCore class |
qualification.py | Replaced APIHelpers usage with direct handler methods |
profiling_core.py | Updated type annotations for new ProfCore class |
profiling.py | Updated TXT report usage to use new handler methods |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One spelling nitpick, otherwise LGTM. API looks much cleaner!
) as c_builder: | ||
with self.core_handler.csv_combiner( | ||
'clusterInfoJSONReport' | ||
).supress_failure() as c_builder: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "suppress_failure"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Thanks @leewyang
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @amahussein for this neat implementation !
Just some doubts regarding an edge case and the usage of Meta functions.
Makes sense from a code segregation point of view. Highlighting it through comments.
""" | ||
A generic internal class for building API result handlers. | ||
Metaclass for CombinedCSVBuilder to register subclasses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Metaclass here extracts the ToolResultHandler in case a generic APIResultHandler is passed.
I don't see it registering any subclasses. The comment could be updated
*args, | ||
**kwargs | ||
) -> 'CombinedCSVBuilder': | ||
if isinstance(handlers, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic assumes that the input is either a list of ToolsResultHandlerT or a list of ApiHandler[ToolsResultHandlerT] hence trying to extract the handler when it sees a list.
The CombinedCsvBuilder natively accepts ToolsResultHandlerT or a list[ToolsResultHandlerT] in which case I am assuming a dev can resort to passing list[QualCoreHandler, ProfCoreHandler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the docstrings with more details about the usage of the case of the list 'APIResHandler[ToolResultHandlerT]']
.
I am assuming a dev can resort to passing list[QualCoreHandler, ProfCoreHandler.
This is correct if he is calling the init of the object itself (not the metaClass__call__
)
handlers_arg = [r.handler for r in handlers] | ||
else: | ||
handlers_arg = handlers | ||
instance: 'CombinedCSVBuilder' = super(CombinedCSVBuilderMeta, cls).__call__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some questions in my head, since the Meta function adds handler extraction, what's the benefit of adding a layer of complexity by the use of Meta class vs keeping this functions simple and doing this in either the init or call of the CombinedCsvBuilder itself.
I get the part of how this keeps preprocessing separate from object initialization. Just feel a bit of an overkill for a preprocessing logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason in doing this is to provide an initializer wrapper that hides the other fields in each subclass.
Then, the metaClass can use the class registry in checking the instances and handlem them on the fly.
I believe that part of the confusion here is because I haven't put the "class registry" yet. This is okay because the class registry should not affect how the user is using the API.
|
||
Parameters: | ||
table (str): The table label to load from each handler. | ||
handlers (ToolResultHandlerT or List[ToolResultHandlerT]): One or more result handlers to aggregate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a comment here in reference to the comment about passing a list of ToolsResultHandlerT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I regenerated the docstings
detail_reason = 'a None DataFrame' if final_res.data is None else 'an empty DataFrame' | ||
raise ValueError( | ||
f'Loading report {self.table} on dataset returned {detail_reason}.') | ||
# If we reach this point, then there is no raise on exceptions, just proceed with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed comments explaining the flow of build
""" | ||
REPORT_LABEL: ClassVar[str] = '' | ||
_out_path: Optional[Union[str, BoundedCspPath]] = None | ||
_report_id: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit confused by the usage of REPORT_LABEL and _report_id here.
The LABEL is used to decide to build the Handler or not which internally does the same thing of storing the LABEL value as the _report_id.
I see the usage of ClassVar here assuming this helps in case of using multiple handlers ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_report_id
is a field captured in the object itself. While REPORT_LABEL
is the classVar.
This redundancy is intended to allow separate representation between internal representation and what the users are facing.
# call the metaclass constructor to create the instance | ||
instance = type(cls).__call__(cls, out_path=out_path) | ||
# set the report ID and build the handler | ||
instance = instance.report(report_id).build() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to what we are doing in the Meta Functions call. Hence the confusion regarding the usage of REPORT_LABEL and report_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is a special case to unlock the unit-tests.
- users are typically going to use the metaClass to instantiate their objects. This more convenient so they don't have to deal with the labels.
- In unit tests, to make it easier for scenarios, the report_id is based as an argument. Therefore, I created this backdoor to pick the right object. Another approach was to loop on all the classes to figure out which one has the valid label. However, it will cause ambiguity because the string argument passed to the constructor in both cases is a string (out_path Vs, report_label)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayedbilalbari for the feedback.
# call the metaclass constructor to create the instance | ||
instance = type(cls).__call__(cls, out_path=out_path) | ||
# set the report ID and build the handler | ||
instance = instance.report(report_id).build() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is a special case to unlock the unit-tests.
- users are typically going to use the metaClass to instantiate their objects. This more convenient so they don't have to deal with the labels.
- In unit tests, to make it easier for scenarios, the report_id is based as an argument. Therefore, I created this backdoor to pick the right object. Another approach was to loop on all the classes to figure out which one has the valid label. However, it will cause ambiguity because the string argument passed to the constructor in both cases is a string (out_path Vs, report_label)
""" | ||
REPORT_LABEL: ClassVar[str] = '' | ||
_out_path: Optional[Union[str, BoundedCspPath]] = None | ||
_report_id: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_report_id
is a field captured in the object itself. While REPORT_LABEL
is the classVar.
This redundancy is intended to allow separate representation between internal representation and what the users are facing.
|
||
Parameters: | ||
table (str): The table label to load from each handler. | ||
handlers (ToolResultHandlerT or List[ToolResultHandlerT]): One or more result handlers to aggregate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I regenerated the docstings
handlers_arg = [r.handler for r in handlers] | ||
else: | ||
handlers_arg = handlers | ||
instance: 'CombinedCSVBuilder' = super(CombinedCSVBuilderMeta, cls).__call__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason in doing this is to provide an initializer wrapper that hides the other fields in each subclass.
Then, the metaClass can use the class registry in checking the instances and handlem them on the fly.
I believe that part of the confusion here is because I haven't put the "class registry" yet. This is okay because the class registry should not affect how the user is using the API.
*args, | ||
**kwargs | ||
) -> 'CombinedCSVBuilder': | ||
if isinstance(handlers, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the docstrings with more details about the usage of the case of the list 'APIResHandler[ToolResultHandlerT]']
.
I am assuming a dev can resort to passing list[QualCoreHandler, ProfCoreHandler.
This is correct if he is calling the init of the object itself (not the metaClass__call__
)
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Signed-off-by: Ahmed Hussein (amahussein) [email protected]
Fixes #1887
Simplifies the interface in order to use the Tools-API more efficiently.
This pull request refactors the API v1 report handler system in the RAPIDS PyTools codebase to simplify and modernize the way core report handlers are created and used. The main change is the removal of the legacy
APIHelpers
builder pattern in favor of direct class-based instantiation for report handlers (e.g.,QualCore
,ProfCore
). This results in cleaner, more readable code and a more consistent interface for handling reports. Additionally, theCombinedDFBuilder
is replaced with a newCombinedCSVBuilder
class and metaclass for combining CSV reports.Key changes include:
API Handler Refactor and Simplification:
APIHelpers
builder class and associated indirection. Now, core report handlers likeQualCore
andProfCore
are instantiated directly, leading to more straightforward and maintainable code. [1] [2]APIHelpers
. [1] [2] [3] [4] [5] [6] [7]Core Handler Property Changes:
core_handler
property inRapidsJarTool
and related classes to return the newAPIResHandler
-based handler types, and updated instantiation logic accordingly. [1] [2] [3] [4]Report and Combiner Usage Updates:
APIHelpers.CombinedDFBuilder
with the newCombinedCSVBuilder
class and updated the logic to use the new combiner interface. [1] [2] [3] [4] [5]self.core_handler.csv(...)
andself.core_handler.txt(...)
instead of the old report classes. [1] [2] [3]Cleanup and Documentation:
APIResultHandler
and related builder logic, and updated documentation and examples to reflect the new usage patterns. [1] [2]These changes make the codebase easier to maintain, reduce boilerplate, and provide a more Pythonic interface for working with core report handlers and CSV combiners.