@sayedbilalbari sayedbilalbari commented Aug 29, 2025

Fixes #1894

Context ->

  • Currently there is no way in tools to configure the log file for a user_tools run
  • This is by design: each tools run writes a new file named with the global uuid, so the file identifies the matching run
  • However, in some dev use cases a user might want to tail a single file
  • Likewise, when tools is used as a library, a way to configure the log file location would come in handy

Changes ->

  • The code changes update tools to respect any pre-existing log file location set through an env variable - RAPIDS_USER_TOOLS_LOG_FILE
  • Adds validation to make sure that the file path -
    • Is local
    • Is not a CSP file path (not supported)
    • Points to a file (no directory or parent path is passed)
    • Is a fully qualified file path
    • If the parent directory does not exist, it will be created
  • Updates tools to share a consistent run_id, which is propagated by means of ENV variables
  • Log format is updated to include a run_id field, which makes it easy to find the logs related to a particular run
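The validation rules listed above could be sketched roughly as follows. This is an illustrative sketch only, not the implementation in this PR: the function name `check_local_log_file` is hypothetical, and treating any path containing `://` as a remote/CSP path is a simplifying assumption.

```python
import os
import tempfile
from pathlib import Path


def check_local_log_file(log_file_path: str) -> str:
    """Hypothetical validator mirroring the rules in the PR description."""
    if not log_file_path or not log_file_path.strip():
        raise ValueError('log file path is empty')
    # Assumption: anything with a URI scheme (s3://, gs://, abfss://, ...) is remote.
    if '://' in log_file_path:
        raise ValueError(f'only local log files are supported: {log_file_path}')
    path = Path(log_file_path).expanduser()
    # The path has to be fully qualified.
    if not path.is_absolute():
        raise ValueError(f'log file path must be absolute: {log_file_path}')
    # The path has to name a file, not a directory.
    if log_file_path.endswith(('/', os.sep)) or path.is_dir():
        raise ValueError(f'log file path points to a directory: {log_file_path}')
    # Create the parent directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp_dir:
        print(check_local_log_file(os.path.join(tmp_dir, 'logs', 'dev.log')))
```

Keeping the check to a handful of cheap path tests limits the risk of failing before any logger is configured.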

Testing ->

  • The changes have been tested locally to make sure that the code respects the LOG_FILE env variable and that logs are appended to the existing file

Log Format changes ->

The updated console logs -

15:07:48 INFO rapids.tools.qualification.stats [run_id=qual_20250903150709_Ebc0EC1F]: Reading CSV files...
15:07:48 INFO rapids.tools.qualification.stats [run_id=qual_20250903150709_Ebc0EC1F]: Using QualCoreHandler to read data...
15:07:48 INFO rapids.tools.qualification.stats [run_id=qual_20250903150709_Ebc0EC1F]: Reading data using QualCoreHandler completed.

The updated file logs -

2025-09-03 15:07:48 INFO rapids.tools.qualification [run_id=qual_20250903150709_Ebc0EC1F]: Generating GPU Estimated Speedup: as /Users/sbari/IdeaProjects/scratch-folder/issue-1702/output-directory/qual_20250903150709_Ebc0EC1F/qualification_summary.csv
2025-09-03 15:07:48 INFO rapids.tools.qualification [run_id=qual_20250903150709_Ebc0EC1F]: ======= [Collecting-Results]: Finished =======
2025-09-03 15:07:48 INFO rapids.tools.qualification [run_id=qual_20250903150709_Ebc0EC1F]: ******* [Archiving Tool Output]: Starting *******
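A format like the console lines above can be produced with the standard `logging` module by attaching a filter that stamps every record with the run id. The sketch below shows one way to do it and is not taken from the PR; `RunIdFilter` and `build_logger` are hypothetical names, and the in-memory stream only keeps the example self-contained.

```python
import io
import logging


class RunIdFilter(logging.Filter):
    """Hypothetical filter that adds a run_id attribute to every record."""

    def __init__(self, run_id: str):
        super().__init__()
        self.run_id = run_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.run_id = self.run_id
        return True  # never drop the record, only annotate it


def build_logger(name: str, run_id: str, stream) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s %(levelname)s %(name)s [run_id=%(run_id)s]: %(message)s',
        datefmt='%H:%M:%S'))
    # Handler-level filters run before formatting, so %(run_id)s is always set.
    handler.addFilter(RunIdFilter(run_id))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger('rapids.tools.qualification.stats',
                   'qual_20250903150709_Ebc0EC1F', buffer)
log.info('Reading CSV files...')
print(buffer.getvalue(), end='')
```

Attaching the filter to the handler rather than to a single logger means records from child loggers that reach the same handler get the same run id.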

Caveats ->

  • By default, console logs are only ERROR/WARNING and file logs are DEBUG
  • A major part of any tools execution is the segregation of each individual tools run (profiler, qual, profiler_1, qual_2)
  • Using this feature requires understanding that all subsequent tools logs are redirected to this file for as long as the variable is set
  • So the onus is on the dev to use this in an informed way - perhaps by choosing the file_name in a way that allows segregation between individual runs.
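When several runs append to the same file, the `run_id` field is what separates them again. Below is a small sketch of pulling one run's lines out of a shared log, assuming every line carries the `[run_id=...]` tag shown in the format above. The helper name and the second run id (`prof_EXAMPLE`) are made up for illustration, and continuation lines without a tag, such as tracebacks, would be skipped by this approach.

```python
import re
from typing import Iterable, Iterator

# Matches the tag used in the updated log format, e.g. [run_id=qual_..._Ebc0EC1F]
RUN_ID_TAG = re.compile(r'\[run_id=(?P<run_id>[^\]]+)\]')


def lines_for_run(lines: Iterable[str], run_id: str) -> Iterator[str]:
    """Yield only the log lines tagged with the given run id."""
    for line in lines:
        match = RUN_ID_TAG.search(line)
        if match and match.group('run_id') == run_id:
            yield line.rstrip('\n')


shared_log = [
    '2025-09-03 15:07:48 INFO rapids.tools.qualification '
    '[run_id=qual_20250903150709_Ebc0EC1F]: ======= [Collecting-Results]: Finished =======\n',
    # Hypothetical second run appended to the same file.
    '2025-09-03 15:09:02 INFO rapids.tools.profiling '
    '[run_id=prof_EXAMPLE]: some other message\n',
]

for entry in lines_for_run(shared_log, 'qual_20250903150709_Ebc0EC1F'):
    print(entry)
```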

Signed-off-by: Sayed Bilal Bari <[email protected]>
@Copilot Copilot AI review requested due to automatic review settings August 29, 2025 23:56
@github-actions github-actions bot added the user_tools Scope the wrapper module running CSP, QualX, and reports (python) label Aug 29, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR modifies the tools initialization to respect a pre-existing RAPIDS_USER_TOOLS_LOG_FILE environment variable instead of always overriding it. This allows external libraries or users to specify their own log file location.

Key changes:

  • Added logic to check for existing LOG_FILE environment variable before setting a default
  • Differentiated messaging between externally configured and default log file locations
  • Preserved existing functionality while adding flexibility for external configuration


Collaborator

@amahussein amahussein left a comment

Thanks @sayedbilalbari !
Can we have more details in the description? The purpose of the change is not clear, given that users are setting the file name (not the directory name).
What is the use case for users naming the log file themselves? The idea is to link the log_file and the directory name so that we know which run produced which logs.
Otherwise, how can someone link the two?

# LOG_FILE already set by external caller, respect it
log_file = existing_log_file
log_dir = str(Path(log_file).parent)
log_message_prefix = 'Location (External)'
Collaborator

not very useful message

Collaborator Author

Thanks, updated to remove this

@amahussein
Collaborator

What I am trying to understand here is what is the context that we give users two options:

  1. set the directory: then the tools will name the file based on the uuid of the run.
  2. set the log_file: which is a full path

The usage is unclear and confusing, and there is no documentation on what to expect from the two variables or how they can overlap with each other.
If we give users the option to set the log directory, then we cannot have another option to set the full path of the log file. It should only be one of them. That's why the original code did not read the log-file env variable before setting it.

@sayedbilalbari
Collaborator Author

Thanks @amahussein for the feedback. Some clarification points.
The existing code -

log_dir = f'{tools_home_dir}/logs'
log_file = f'{log_dir}/{short_name}_{uuid}.log'
Utils.set_rapids_tools_env('LOG_FILE', log_file)

The env variable is the fully qualified path of the log file.
The expected behavior is to allow users to set the fully qualified path (with name) of the log file.
The change adds that functionality, where -

  • If the env variable is set, it is expected to be the fully qualified path of the log file, and we use it as is.
  • As part of the logic, there is code to extract the directory of the log file and create it if it does not already exist.
  • There is no logic to allow the user to set the log directory and the log file name separately (perhaps the existing code, where the home directory is set as ~/.spark_rapids_tools, created the confusion; this change does not interfere with that).
  • Building on this, perhaps there could be extra validation to ensure that the user sets the complete log file path as the env variable.
    Updated code for reference -
existing_log_file = Utils.get_rapids_tools_env('LOG_FILE')
if existing_log_file:
    # LOG_FILE already set by external caller, respect it
    log_file = existing_log_file
    log_dir = str(Path(log_file).parent)
    log_message_prefix = 'Location (External)'
    usage_message = 'Using externally configured log file location.\n'
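The precedence described in this comment (use the env variable if it is already set, otherwise fall back to the generated `{short_name}_{uuid}.log` name) can be illustrated in isolation. The sketch below is standalone and is not the tools code: it reads a plain mapping instead of going through `Utils.get_rapids_tools_env`, and `resolve_log_file` is a hypothetical helper.

```python
from typing import Mapping, Tuple

ENV_LOG_FILE = 'RAPIDS_USER_TOOLS_LOG_FILE'


def resolve_log_file(env: Mapping[str, str], tools_home_dir: str,
                     short_name: str, uuid: str) -> Tuple[str, bool]:
    """Return (log_file, externally_set). Hypothetical helper for illustration."""
    existing_log_file = env.get(ENV_LOG_FILE)
    if existing_log_file:
        # Already set by an external caller: respect it as is.
        return existing_log_file, True
    # Default behavior: one log file per run, named after the run uuid.
    log_dir = f'{tools_home_dir}/logs'
    return f'{log_dir}/{short_name}_{uuid}.log', False


# Default case: nothing set, so the name is derived from the run.
print(resolve_log_file({}, '/home/user/.spark_rapids_tools', 'qual', 'abc123'))
# Override case: the caller pinned a file to tail across runs.
print(resolve_log_file({ENV_LOG_FILE: '/tmp/tools-dev.log'},
                       '/home/user/.spark_rapids_tools', 'qual', 'abc123'))
```

Passing the environment in as a mapping keeps the decision easy to unit test; in real use `os.environ` could be passed directly.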

@sayedbilalbari
Collaborator Author

Also the other pre-existing logic to set the HOME directory,

home_dir = Utils.get_sys_env_var('HOME', '/tmp')
tools_home_dir = FSUtil.build_path(home_dir, '.spark_rapids_tools')
Utils.set_rapids_tools_env('HOME', tools_home_dir)

it never checks for any pre-existing env variables. So whatever the user sets cannot override the HOME directory, and there is no way to configure the directory of the log files.

Signed-off-by: Sayed Bilal Bari <[email protected]>
@amahussein
Collaborator

I am still missing the point.
The issue/PR description reads as if the tools are not "correctly taking the user env variable", which sounds more like a bug.
The tools were designed not to allow the user to set the log file. That was on purpose, because the log file should have its own identity that maps it to the tools run that generated it. In other words, the user cannot name the log file before actually running the tool cmd.

  • If we are making a new feature, then we need a more complete description in the issue/PR, rather than re-fitting functionality to do things it was not designed to do. The description should also address questions like:
    • how can someone tell the log file of a specific run?
    • is it the user's responsibility to make sure that the parent directory exists?
  • If there is a bug where the log file generated by the tools does not have the correct uuid, then it was introduced recently, and we need to fix it accordingly.

@sayedbilalbari
Collaborator Author

Thanks @amahussein. Very valid points.

  • This is not a bug in the existing tools configuration.
  • Tools correctly resolves the log file name to the {short_name}_{uuid} format.
  • This is a feature request: if the ENV var is set, tools redirects logs to the file it names.
  • The feature is dev-oriented: instead of always resolving to a new log file, when the variable is set the log file is used in append-only mode, new logs are added to it, and the user can tail it.
  • Old logs in the file never get overwritten.
  • The assumption is that a person using this feature knows what they are doing.
  • I will add more description to the issue and the PR, and more validations to make sure that the user cannot set invalid file paths as the log file.
  • As for figuring out the run information from the log file: if we add meta information to the log file name, that defeats the purpose of setting the file via the env variable.
  • If we resort to passing the log folder and the log file name separately, that can be very confusing.
  • Another solution is to add the run uuid to the log format so that the file logs are searchable for a particular user_tools run. But if tools is being used as a library in another project that has its own global run id, this creates a mismatch between that global ID and the internal tools UUID, which the user might not have context on.

Hence the cleanest way, without too much confusion, was to leave it up to the user to decide the file_name.
I will update the documentation with a detailed usage guide.

Signed-off-by: Sayed Bilal Bari <[email protected]>
Collaborator

@amahussein amahussein left a comment

Let's take other devs' feedback.
I am going to yield to other devs to vote on this.

@@ -101,6 +101,52 @@ def stringify_path(fpath) -> str:
return os.path.abspath(expanded_path)


def validate_local_log_file_path(log_file_path: str) -> str:
Collaborator

Not sure if this is necessary. It seems like overkill to validate all those cases for the log file.
The point is that the log file should be very straightforward, so that it can log any issues during initialization. Too much complexity around the log file might break things before the log file is even set, which defeats the purpose.
Anyway, I am fine with keeping it that way.

print(Utils.gen_report_sec_header('Application Logs'))
print(f'Location: {log_file}')
print(f'Location : {log_file}')
Collaborator

extra white space before colon

Collaborator Author

Updated, thanks !

@sayedbilalbari
Collaborator Author

Thanks @amahussein. I have updated the code to use a common RUN_ID across each run, moving the meta information previously contained in the log file name () to the loggers.
This solves the problem of tying the logs to the run that produced them.

Development

Successfully merging this pull request may close these issues.

[FEA] Update tools to respects pre-existing RAPIDS_USER_TOOLS_LOG_FILE
2 participants